Source installation

Developing on llamafile requires a modern version of the GNU make command (called gmake on some systems), sha256sum (otherwise cc will be used to build it), wget (or curl), and unzip. Prebuilt binaries of all of these are available at https://cosmo.zip/pub/cosmos/bin/. Windows users also need the cosmos bash shell.
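A quick way to verify the prerequisites before building is to probe for each tool on your PATH. This is only a sketch; on some systems the tool is named gmake rather than make, so adjust the list accordingly:

```shell
# Check for the build prerequisites listed above; tool names may
# differ per system (e.g. gmake instead of make, curl instead of wget).
missing=0
for tool in make sha256sum wget unzip; do
  if command -v "$tool" >/dev/null 2>&1; then
    echo "found: $tool"
  else
    echo "missing: $tool"
    missing=$((missing + 1))
  fi
done
echo "tools missing: $missing"
```

Any tool reported missing can be downloaded from https://cosmo.zip/pub/cosmos/bin/.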

Dependency Setup

Some dependencies are managed as git submodules with llamafile-specific patches. Before building, you need to initialize and configure these dependencies:

make setup

The patches modify code in the git submodules. These modifications remain as local changes in the submodule working directories.

make setup also downloads the Cosmopolitan C compiler for you, saving it under the .cosmocc directory.

Building

.cosmocc/4.0.2/bin/make -j8
sudo .cosmocc/4.0.2/bin/make install PREFIX=/usr/local

Build outputs will appear in the ./o directory, e.g.:

  • o/llama.cpp/server/llama-server: the original llama.cpp inference server, compiled with cosmocc
  • o/llamafile/llamafile: the llamafile executable, which runs as both a TUI and a server (with the --server flag)
  • o/third_party/zipalign/zipalign: the zipalign tool, used to bundle the llamafile executable, model weights, and default arguments into llamafiles
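To illustrate the bundling step, here is a hedged sketch of how zipalign is typically used: default arguments go in a .args file, one per line, with "..." letting extra run-time arguments pass through. The file names and model name below are examples, not outputs of this build:

```shell
# Write the default arguments, one per line ("..." = pass-through
# for extra run-time arguments). mymodel.gguf is a placeholder name.
cat > .args <<'EOF'
-m
mymodel.gguf
...
EOF
cat .args
# Then, with the build outputs above in place:
#   cp o/llamafile/llamafile mymodel.llamafile
#   o/third_party/zipalign/zipalign -j0 mymodel.llamafile mymodel.gguf .args
```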

NOTE: Calling make should automatically run cosmocc's make when required. If that does not happen for any reason, you can run the one provided by cosmocc directly: .cosmocc/4.0.2/bin/make.
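A small sketch of that fallback logic, assuming the same 4.0.2 version directory used in the build commands above:

```shell
# Prefer the cosmocc-provided make when it exists, otherwise fall
# back to the system make (or gmake on BSD-like systems).
COSMO_MAKE=.cosmocc/4.0.2/bin/make
if [ -x "$COSMO_MAKE" ]; then
  MAKE="$COSMO_MAKE"
else
  MAKE=make
fi
echo "using: $MAKE"
```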

Testing

Optionally, you can verify the build with:

make check

This runs our unit tests to ensure everything is built correctly.

Integration tests in tests/integration exercise llamafile against real models. See the README in that directory for instructions on running them.

Running llamafile

After the build, you can run llamafile as:

./o/llamafile/llamafile --model <gguf_model>

or just the llama.cpp server as:

./o/llamafile/llamafile --model <gguf_model> --server

or the llamafile CLI command as:

./o/llamafile/llamafile --model <gguf_model> --cli -p "Hello world"

Documentation

A manual page for each of the llamafile programs is installed when you run sudo make install. Most commands also display the same information when passed the --help flag.