Building Encoderfiles with Docker
We provide a Docker image to build Encoderfiles without installing any dependencies on your system.
Use it when you don't want to manage a local toolchain, or when you prefer running builds in an isolated environment, such as in CI or on ephemeral workers.
You can pull the image from our registry:
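```bash
docker pull ghcr.io/mozilla-ai/encoderfile:latest
```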
Note on Architecture
Images are published for both x86_64 and arm64. If you're on a more exotic architecture, you'll need to build the encoderfile CLI from source — see our guide on Building from Source for more details.
Mounting Assets
The Docker container needs access to the following elements to build an Encoderfile:
- Config file: your `encoderfile.yml`, passed via the `-f` flag
- Model assets: the ONNX file, tokenizer, and `config.json` referenced by `encoderfile.yml`
- Output directory: where the `.encoderfile` binary will be written
All paths in your config must exist inside the container. Mount your project directory to `/opt/encoderfile` (the default working directory) so encoderfile can find everything and write the output back to your host machine.
Minimal Example
Assuming your directory looks like this:
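```
.
├── encoderfile.yml
└── model
    ├── model.onnx
    ├── tokenizer.json
    └── config.json
```

(The asset filenames above are illustrative; what matters is that the `path` in `encoderfile.yml` points at the directory containing your ONNX model, tokenizer, and `config.json`.)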
And your build config (`encoderfile.yml`) looks like this:

```yaml
encoderfile:
  name: my-embedding-model
  path: ./model
  model_type: embedding
  output_path: ./my-embedding-model.encoderfile
  transform: |
    --- Applies L2 normalization across the embedding dimension.
    --- Each token embedding is scaled to unit length independently.
    ---
    --- Args:
    ---   arr (Tensor): A tensor of shape [batch_size, n_tokens, hidden_dim].
    ---   Normalization is applied along the third axis (hidden_dim).
    ---
    --- Returns:
    ---   Tensor: The input tensor with L2-normalized embeddings.
    ---@param arr Tensor
    ---@return Tensor
    function Postprocess(arr)
      return arr:lp_normalize(2, 3)
    end
```
Run the following:
```bash
docker run \
  -it \
  -v "$(pwd):/opt/encoderfile" \
  ghcr.io/mozilla-ai/encoderfile:latest \
  build -f encoderfile.yml
```
What happens:
- Your current directory is mounted into the container at `/opt/encoderfile`.
- Inside the container, Encoderfile sees `encoderfile.yml` and any model paths exactly as they appear in your project.
- The resulting `.encoderfile` binary is written back into your project directory.
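If your model assets live somewhere else on the host, you can add a second mount, as long as the in-container path still matches what `encoderfile.yml` expects. A sketch, assuming a hypothetical host path for the assets:

```bash
# Config lives in the current directory; model assets live elsewhere on the host.
# /data/models/my-model is a hypothetical path; adjust it to your setup.
docker run -it \
  -v "$(pwd):/opt/encoderfile" \
  -v "/data/models/my-model:/opt/encoderfile/model" \
  ghcr.io/mozilla-ai/encoderfile:latest \
  build -f encoderfile.yml
```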
Troubleshooting
“File not found: model.onnx”
The path in your `encoderfile.yml` doesn't match where the file appears inside the container. Most of the time this is a missing `-v "$(pwd):/opt/encoderfile"` mount or a mismatched working directory.
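To check what the container actually sees, you can list the mounted directory from inside the image. This is a sketch that assumes the image ships a shell utility like `ls` (very minimal images may not) and that you override the default entrypoint:

```bash
# List the mounted project directory as the container sees it.
# Assumes `ls` exists inside the image; if not, inspect the mount from the host instead.
docker run --rm \
  -v "$(pwd):/opt/encoderfile" \
  --entrypoint ls \
  ghcr.io/mozilla-ai/encoderfile:latest \
  -R /opt/encoderfile
```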
“cargo not found”
You're not using the correct image. Make sure you are using `ghcr.io/mozilla-ai/encoderfile:latest`.
Paths behave differently on Windows
Use absolute paths or WSL. Docker-for-Windows path translation varies by shell.
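For example, from PowerShell you can mount an absolute Windows path directly (the project location below is hypothetical):

```powershell
# C:\projects\my-model is a placeholder; substitute your actual project directory.
docker run -it -v "C:\projects\my-model:/opt/encoderfile" ghcr.io/mozilla-ai/encoderfile:latest build -f encoderfile.yml
```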