Skip to content

This document provides instructions to replicate our Windows DLL builds.

Requirements:

In the MSYS shell

As our Makefile makes massive use of unix shell applications, it's much easier to just replicate that environment. We'll need it just to setup the repo (with make setup), which initializes all the submodules and applies our patches to the original code. In theory after the setup you should also be able to run make -j8 to build llamafile with cosmocc on Windows (but honestly, why?)

  • install the required tools (vim is not really required)

    pacman -S git patch unzip wget make vim
    

  • create a build workspace

    mkdir /c/Users/Your_Username/workspace
    cd /c/Users/Your_Username/workspace
    

  • clone the repo

    git clone https://github.com/mozilla-ai/llamafile
    

  • setup

    cd llamafile
    make setup
    

In the Windows terminal

After the repo is set up, you can build the cuda / rocm / vulkan DLLs as follows.

  • from powershell, open a Visual Studio 2022 Developer Command Prompt

     cmd /k "`"C:\Program Files (x86)\Microsoft Visual Studio\2022\BuildTools\VC\Auxiliary\Build\vcvarsall.bat`" x64"
    

  • cd to the llamafile dir and start CUDA parallel build (this will run for a while...)

    cd c:\Users\Your_Username\Workspace\llamafile
    llamafile\cuda_parallel.bat
    

images/win_cuda_build.png

  • run ROCm parallel build as follows:

    llamafile\rocm_parallel.bat
    

  • run Vulkan build as follows (parallel is not needed, this is usually much faster than the other two):

    llamafile\vulkan.bat
    

At the end of this process, you should have the following libraries available in your llamafile directory (note that sizes might differ):

03/31/2026  02:15 PM       717,095,936 ggml-cuda.dll
03/31/2026  02:44 PM       502,854,656 ggml-rocm.dll
03/31/2026  02:46 PM        31,482,880 ggml-vulkan.dll

To run llamafile with these libraries, add them in your home directory or bundle them in your llamafile (see Creating a llamafile).