Suggested Models
Lumigator supports any model uploaded to the Hugging Face Hub and trained for summarization, provided the model is compatible with the required library versions (e.g., Transformers) and runtime dependencies (e.g., vLLM). Practical factors such as compute availability and system configurations may also impact the successful use of a model. To get started, we have extensively tested a few models and created an endpoint to easily retrieve them.
In this guide, we assume that you have already installed Lumigator locally, and have a running instance. To get a list of suggested models, you can use the following command:
user@host:~/lumigator$ curl -s http://localhost:8000/api/v1/models/summarization | jq
{
  "total": 7,
  "items": [
    {
      "name": "facebook/bart-large-cnn",
      "uri": "hf://facebook/bart-large-cnn",
      "description": "BART is a large-sized model fine-tuned on the CNN Daily Mail dataset.",
      "info": {
        "parameter_count": "406M",
        "tensor_type": "F32",
        "model_size": "1.63GB"
      },
      "tasks": [
        {
          "summarization": {
            ...
from lumigator_sdk.lumigator import LumigatorClient
# The default port for Lumigator is 8000
lm_client = LumigatorClient("localhost:8000")
lm_client.models.get_suggested_models("summarization")
Note
The default port for Lumigator is 8000. If you are running Lumigator on a different port, replace 8000 with the correct port number.
The output will show a list of suggested models we have tested. The uri field is the one you should use when creating a new evaluation job. The response also includes other useful information, such as the model size and the default parameters used for evaluation. These fields are not applicable to every model, but they are included for the ones we have tested.
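As a sketch of how you might consume this response programmatically, the snippet below parses an abbreviated copy of the sample payload shown above and collects the uri field of each suggested model (field names as in the example output; the payload here is truncated for illustration):

```python
import json

# Abbreviated sample of the /api/v1/models/summarization response shown above.
response_text = """
{
  "total": 7,
  "items": [
    {
      "name": "facebook/bart-large-cnn",
      "uri": "hf://facebook/bart-large-cnn",
      "info": {"parameter_count": "406M", "model_size": "1.63GB"}
    }
  ]
}
"""

response = json.loads(response_text)

# Collect the "uri" field of each suggested model; this is the value
# you would pass when creating a new evaluation job.
uris = [item["uri"] for item in response["items"]]
print(uris)  # ['hf://facebook/bart-large-cnn']
```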
Note
These are the models we recommend users try for specific tasks, meaning they are suggested for evaluation on user-provided datasets. Lumigator may also use additional models indirectly during evaluation, through the third-party libraries it uses. For instance, calculating the BERTScore metric requires the roberta-large model by default. While this model is not listed among the recommended ones, it is still used under the hood.
Model Types and Parameters
The following table shows the models we have tested and their respective types. The HuggingFace column shows whether the model is available on the Hugging Face Hub, API indicates availability via an external API, and llamafile shows whether it is distributed as a llamafile.
Note
Please note that we do not, at present, launch a llamafile for you; Lumigator assumes you have already launched it.
| Model Type | Model | HuggingFace | API | llamafile |
|---|---|---|---|---|
| seq2seq | facebook/bart-large-cnn | X | | |
| seq2seq | Falconsai/text_summarization | X | | |
| causal | gpt-4o-mini, gpt-4o | | X | |
| causal | open-mistral-7b | | X | |
| causal | Mistral-7B-Instruct | X | | X |
BART Large CNN
The facebook/bart-large-cnn model is pre-trained on English-language text and fine-tuned on the CNN Daily Mail dataset. It was introduced in the paper BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension by Lewis et al. and first released here.
The model has 406M parameters (FP32), and the model size is 1.63GB. The default parameters used for evaluation are:
| Parameter Name | Description | Value |
|---|---|---|
| max_length | Maximum length of the summary | 142 |
| min_length | Minimum length of the summary | 56 |
| length_penalty | Length penalty to apply during beam search | 2.0 |
| early_stopping | Controls the stopping condition for beam-based methods | true |
| no_repeat_ngram_size | All n-grams of that size can only occur once | 3 |
| num_beams | Number of beams for beam search | 4 |
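These defaults correspond to standard Hugging Face text-generation parameters. As a minimal sketch (the parameter names are the usual Transformers generation kwargs, assumed here rather than taken verbatim from Lumigator's own configuration), they can be collected into a dictionary and passed as generation keyword arguments:

```python
# Default evaluation parameters for facebook/bart-large-cnn, using the
# standard Hugging Face generation parameter names (assumed mapping).
bart_large_cnn_defaults = {
    "max_length": 142,          # maximum length of the summary
    "min_length": 56,           # minimum length of the summary
    "length_penalty": 2.0,      # length penalty applied during beam search
    "early_stopping": True,     # stopping condition for beam-based methods
    "no_repeat_ngram_size": 3,  # n-grams of this size can occur only once
    "num_beams": 4,             # number of beams for beam search
}

# Example usage with the Transformers summarization pipeline
# (requires downloading the model, so it is left commented out):
# from transformers import pipeline
# summarizer = pipeline("summarization", model="facebook/bart-large-cnn")
# summary = summarizer(text, **bart_large_cnn_defaults)
```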
Falconsai Text Summarization
The Falconsai/text_summarization model is a variant of the T5 transformer model, designed for the task of text summarization. It is adapted and fine-tuned to generate concise and coherent summaries of input text.
The model has 60.5M parameters (FP32), and the model size is 242MB. The default parameters used for evaluation are:
| Parameter Name | Description | Value |
|---|---|---|
| max_length | Maximum length of the summary | 200 |
| min_length | Minimum length of the summary | 30 |
| length_penalty | Length penalty to apply during beam search | 2.0 |
| early_stopping | Controls the stopping condition for beam-based methods | true |
| no_repeat_ngram_size | All n-grams of that size can only occur once | 3 |
| num_beams | Number of beams for beam search | 4 |
Mistral 7B Instruct
The [mistralai/Mistral-7B-Instruct-v0.3](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.3) Large Language Model (LLM) is an instruct fine-tuned version of Mistral-7B-v0.3.
The model has 7.25B parameters (BF16), and the model size is 14.5GB. There are no summarization-specific parameters for this model.
GPT-4o Mini and GPT-4o
The GPT-4o Mini and GPT-4o models are causal language models developed by OpenAI.
There are no summarization-specific parameters for these models.
Open Mistral 7B
The Open Mistral 7B model is a causal language model developed by Mistral AI. It is the smallest model in the Mistral AI family of models.
There are no summarization-specific parameters for this model.
Mistral 7B Instruct Llamafile
The mistralai/Mistral-7B-Instruct-v0.2 model is a causal language model developed by Mistral AI, packaged as a llamafile. A llamafile is an executable LLM that you can run on your own computer. It contains the weights for a given open LLM, as well as everything needed to actually run that model on your computer. There's nothing to install or configure.
There are no summarization-specific parameters for this model.
Reference models
Before you jump into evaluating datasets, you should consider the importance of quality ground truth and annotations.
Ground truth is the actual, expected output or correct answer for a given task (such as summarization), serving as a reference against which to compare the model's predictions. Typically, a human with enough expertise in the task annotates or labels a dataset with those references for each sample (for example, an acceptable summary of the input text).
To evaluate a model as reliably as possible, we encourage using human-provided ground truth to compare against. Failing that, Lumigator enables the user to perform automatic annotation with a well-tested model (BART for the summarization task).
You can do this through the API, using one of Lumigator's jobs: /jobs/annotate.
user@host:~/lumigator$
curl -X 'POST' \
'http://localhost:8000/api/v1/jobs/annotate/' \
-H 'accept: application/json' \
-H 'Content-Type: application/json' \
-d '{
"name": "test_annotate",
"description": "annotate",
"dataset": "3fa85f64-5717-4562-b3fc-2c963f66afa6",
"max_samples": -1,
"task": "summarization"
}'
Under the hood, this will launch an inference job with the reference model for summarization (BART, described above).
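The same request can be issued from Python. The sketch below mirrors the curl example above, building the request payload and posting it with the third-party requests library (assumed installed); the dataset UUID is the same placeholder used in the curl example and must be replaced with a real dataset ID:

```python
import json

# Payload mirroring the curl example; the dataset UUID is a placeholder.
payload = {
    "name": "test_annotate",
    "description": "annotate",
    "dataset": "3fa85f64-5717-4562-b3fc-2c963f66afa6",
    "max_samples": -1,  # -1 annotates every sample in the dataset
    "task": "summarization",
}

def submit_annotation_job(base_url="http://localhost:8000"):
    """Send the annotate request (requires a running Lumigator instance)."""
    import requests  # third-party dependency, assumed available
    resp = requests.post(f"{base_url}/api/v1/jobs/annotate/", json=payload)
    resp.raise_for_status()
    return resp.json()

print(json.dumps(payload, indent=2))
```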