Example llamafiles

We provide example llamafiles for a variety of models, so you can easily try out llamafile with different kinds of LLMs.

Model	Size	License	llamafile	other quants
LLaMA 3.2 1B Instruct	1.11 GB	LLaMA 3.2	Llama-3.2-1B-Instruct.Q6_K.llamafile	See HF repo
LLaMA 3.2 3B Instruct	2.62 GB	LLaMA 3.2	Llama-3.2-3B-Instruct.Q6_K.llamafile	See HF repo
LLaMA 3.1 8B Instruct	5.23 GB	LLaMA 3.1	Llama-3.1-8B-Instruct.Q4_K_M.llamafile	See HF repo
Gemma 3 1B Instruct	1.32 GB	Gemma 3	gemma-3-1b-it.Q6_K.llamafile	See HF repo
Gemma 3 4B Instruct	3.50 GB	Gemma 3	gemma-3-4b-it.Q6_K.llamafile	See HF repo
Gemma 3 12B Instruct	7.61 GB	Gemma 3	gemma-3-12b-it.Q4_K_M.llamafile	See HF repo
QwQ 32B	7.61 GB	Apache 2.0	Qwen_QwQ-32B-Q4_K_M.llamafile	See HF repo
R1 Distill Qwen 14B	9.30 GB	MIT	DeepSeek-R1-Distill-Qwen-14B-Q4_K_M	See HF repo
R1 Distill Llama 8B	5.23 GB	MIT	DeepSeek-R1-Distill-Llama-8B-Q4_K_M	See HF repo
LLaVA 1.5	3.97 GB	LLaMA 2	llava-v1.5-7b-q4.llamafile	See HF repo
Mistral-7B-Instruct v0.3	4.42 GB	Apache 2.0	mistral-7b-instruct-v0.3.Q4_0.llamafile	See HF repo
Granite 3.2 8B Instruct	5.25 GB	Apache 2.0	granite-3.2-8b-instruct-Q4_K_M.llamafile	See HF repo
Phi-3-mini-4k-instruct	7.67 GB	Apache 2.0	Phi-3-mini-4k-instruct.F16.llamafile	See HF repo
Mixtral-8x7B-Instruct	30.03 GB	Apache 2.0	mixtral-8x7b-instruct-v0.1.Q5_K_M.llamafile	See HF repo
OLMo-7B	5.68 GB	Apache 2.0	OLMo-7B-0424.Q6_K.llamafile	See HF repo
Text Embedding Models
E5-Mistral-7B-Instruct	5.16 GB	MIT	e5-mistral-7b-instruct-Q5_K_M.llamafile	See HF repo
mxbai-embed-large-v1	0.7 GB	Apache 2.0	mxbai-embed-large-v1-f16.llamafile	See HF repo

Here is an example for the Mistral command-line llamafile:

./mistral-7b-instruct-v0.2.Q5_K_M.llamafile --temp 0.7 -p '[INST]Write a story about llamas[/INST]'

And here is an example for the WizardCoder-Python command-line llamafile:

./wizardcoder-python-13b.llamafile --temp 0 -e -r '```\n' -p '```c\nvoid *memcpy_sse2(char *dst, const char *src, size_t size) {\n'

And here's an example for the LLaVA command-line llamafile:

./llava-v1.5-7b-q4.llamafile --temp 0.2 --image lemurs.jpg -e -p '### User: What do you see?\n### Assistant:'

As before, macOS, Linux, and BSD users will need to use the "chmod" command to grant execution permissions to the file before running these llamafiles for the first time.
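The permission step can be sketched as a small shell helper, assuming the llamafile has already been downloaded into the current directory. The name make_executable is hypothetical and not part of llamafile; it only wraps the standard chmod command.

```shell
# Hypothetical helper (not part of llamafile): grant execute permission
# to a downloaded llamafile if it does not already have it.
make_executable() {
  if [ ! -f "$1" ]; then
    echo "not found: $1" >&2
    return 1
  fi
  # chmod +x adds the execute bit; this is only needed once per file.
  [ -x "$1" ] || chmod +x "$1"
}

# Usage (uncomment once the file is downloaded):
# make_executable llava-v1.5-7b-q4.llamafile
# ./llava-v1.5-7b-q4.llamafile --temp 0.2 --image lemurs.jpg -e -p '### User: What do you see?\n### Assistant:'
```

Running chmod +x directly on the file works just as well; the helper only adds a check that the file exists.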

Unfortunately, Windows users cannot run many of these example llamafiles because Windows has a maximum executable file size of 4GB, and the larger examples exceed that size. (The LLaVA llamafile works on Windows because it is 30MB shy of the size limit.) But don't lose heart: llamafile allows you to use external weights; this is described later in this document.
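To check whether a downloaded llamafile is small enough for Windows, a rough sketch like the following may help. It assumes the limit is 4 GiB (4294967296 bytes), which is one reading of the "4GB" figure above. The function name fits_windows_limit and the variable WINDOWS_EXE_LIMIT are hypothetical, not part of llamafile.

```shell
# Assumption: "4GB" is taken to mean 4 GiB; adjust if your reading differs.
WINDOWS_EXE_LIMIT=4294967296

# Hypothetical helper: succeeds if FILE is smaller than the limit.
# usage: fits_windows_limit FILE [LIMIT_BYTES]
fits_windows_limit() {
  [ -f "$1" ] || return 2
  # wc -c reports the size in bytes; tr strips padding some systems add.
  fwl_size=$(wc -c < "$1" | tr -d ' ')
  [ "$fwl_size" -lt "${2:-$WINDOWS_EXE_LIMIT}" ]
}

# Usage (uncomment once the file is downloaded):
# if fits_windows_limit llava-v1.5-7b-q4.llamafile; then
#   echo "small enough to run as a Windows executable"
# else
#   echo "too large: use external weights instead"
# fi
```

The optional second argument exists only so the check can be tried against a small file without downloading several gigabytes.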

Having trouble? See the Troubleshooting page.

A note about models

The example llamafiles provided above should not be interpreted as endorsements or recommendations of specific models, licenses, or data sets on the part of Mozilla.