# Translation Use Case

This guide walks you through running a translation experiment in Lumigator with two models: a many-to-many sequence-to-sequence model from the Hugging Face Hub and `gpt-4o-mini` from OpenAI. See the list of [suggested models](../get-started/suggested-models.md#model-types-and-parameters) for more details on models suited to the translation use case.

## What You'll Need

1. A running instance of [Lumigator](../get-started/quickstart.md).
1. A dataset for the translation use case. You can use the [sample English-Spanish dataset](../../../lumigator/sample_data/translation/sample_translation_en_es.csv) or prepare your own. Refer to [this guide](./prepare-evaluation-dataset.md) for more details.
1. (Optional) An `OPENAI_API_KEY` if you would like to evaluate one of the OpenAI models. Please refer to the [UI instructions](../get-started/ui-guide.md#settings) for setting up the API keys.

## Procedure

To run a translation experiment, you can use either the UI or the API/SDK. If using the UI, we recommend following the steps in the [UI guide](../get-started/ui-guide.md). At a high level, this involves uploading the dataset and then creating an experiment, where you select Translation as the use case, specify the source and target languages, and choose one or more of the available models. Once the experiment status is `SUCCEEDED`, you can view the results in tabular format.

Alternatively, you can use the API/SDK to run the experiment. The following steps outline the process:

### Upload the Dataset
The dataset upload process is the same as outlined in the [quick start guide](../get-started/quickstart.md#upload-a-dataset). Just update `DATASET_PATH` to point to your translation dataset in CSV format.

::::{tab-set}

:::{tab-item} cURL
:sync: tab1
```console
user@host:~/lumigator$ export DATASET_PATH=lumigator/sample_data/translation/sample_translation_en_es.csv
user@host:~/lumigator$ curl -s http://localhost:8000/api/v1/datasets/ \
  -H 'Accept: application/json' \
  -H 'Content-Type: multipart/form-data' \
  -F 'dataset=@'$DATASET_PATH';type=text/csv' \
  -F 'format=job' | jq
```
:::

:::{tab-item} Python SDK
:sync: tab2
```python
from lumigator_sdk.lumigator import LumigatorClient
from lumigator_schemas.datasets import DatasetFormat

dataset_path = 'lumigator/sample_data/translation/sample_translation_en_es.csv'
client = LumigatorClient('localhost:8000')

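# Open the CSV in binary mode and close it once the upload finishes
with open(dataset_path, 'rb') as dataset_file:
    response = client.datasets.create_dataset(dataset_file, DatasetFormat.JOB)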
```
:::

::::
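
Before uploading your own dataset, it can help to check that the CSV header contains the columns you expect. The following is a minimal sketch using only the Python standard library. The column names `examples` and `ground_truth` are an assumption made for illustration, and `missing_columns` is a hypothetical helper rather than part of the Lumigator SDK; consult the dataset preparation guide for the actual requirements.

```python
import csv
import io

# Assumed column names for illustration; check the dataset preparation guide.
REQUIRED_COLUMNS = ("examples", "ground_truth")


def missing_columns(stream, required=REQUIRED_COLUMNS):
    """Return the required column names that are absent from the CSV header."""
    header = next(csv.reader(stream), [])
    present = {name.strip() for name in header}
    return [name for name in required if name not in present]


# Demo on an in-memory CSV; for a real file, pass open(dataset_path, newline="").
sample = io.StringIO("examples,ground_truth\nHello world,Hola mundo\n")
print(missing_columns(sample))  # prints []
```

An empty list means every expected column is present; otherwise the list names the columns to add before uploading.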

### Create an Experiment
Next, let's create an experiment. The main point to note is the `task_definition` field, a dictionary that specifies the task as `translation` along with the `source_language` and `target_language`.

::::{tab-set}

:::{tab-item} cURL
:sync: tab1

Set the following variables:
```console
user@host:~/lumigator$ export EXP_NAME="English to Spanish Translation" \
EXP_DESC="Evaluate which model best translates English to Spanish" \
EXP_DATASET="$(curl -s http://localhost:8000/api/v1/datasets/ | jq -r '.items | .[0].id')" \
TASK_DEFINITION='{"task": "translation", "source_language": "English", "target_language": "Spanish"}'
```

Define the JSON string:
```console
user@host:~/lumigator$ export JSON_STRING=$(jq -n \
--arg name "$EXP_NAME" \
--arg desc "$EXP_DESC" \
--arg dataset_id "$EXP_DATASET" \
--argjson task_definition "$TASK_DEFINITION" \
'{name: $name, description: $desc, dataset: $dataset_id, task_definition: $task_definition}')
```

Create the experiment:
```console
user@host:~/lumigator$ curl -s http://localhost:8000/api/v1/experiments/ \
-H 'Accept: application/json' \
-H 'Content-Type: application/json' \
-d "$JSON_STRING" | jq
{
  "id": "48",
  "name": "English to Spanish Translation",
  "description": "Evaluate which model best translates English to Spanish",
  "created_at": "2025-03-17T14:01:18.783000",
  "task_definition": {
    "task": "translation",
    "source_language": "English",
    "target_language": "Spanish"
  },
  "dataset": "4fbfc81d-938c-4703-beaf-af404fa5285f",
  "updated_at": null,
  "workflows": null
}
```

:::

:::{tab-item} Python SDK
:sync: tab2
```python
from lumigator_schemas.experiments import ExperimentCreate

# Reuse the ID returned by the dataset upload step above
dataset_id = response.id
task_definition = {
    "task": "translation",
    "source_language": "English",
    "target_language": "Spanish"
}
request = ExperimentCreate(
    name="English to Spanish Translation",
    description="Evaluate which model best translates English to Spanish",
    dataset=dataset_id,
    task_definition=task_definition
)
experiment_response = client.experiments.create_experiment(request)
experiment_id = experiment_response.id
print(f"Experiment created and has ID: {experiment_id}")
```
:::

::::
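
If you plan to evaluate several language pairs, it may be convenient to build the request body programmatically. The sketch below mirrors the `jq` step from the cURL tab using plain dictionaries. The helpers `translation_task` and `experiment_payload` are hypothetical names introduced for illustration; only the field names come from the examples above.

```python
import json


def translation_task(source_language, target_language):
    """Build the task_definition dictionary for a translation experiment."""
    return {
        "task": "translation",
        "source_language": source_language,
        "target_language": target_language,
    }


def experiment_payload(name, description, dataset_id, task_definition):
    """Assemble the experiment creation body with the fields used above."""
    return {
        "name": name,
        "description": description,
        "dataset": dataset_id,
        "task_definition": task_definition,
    }


payload = experiment_payload(
    name="English to Spanish Translation",
    description="Evaluate which model best translates English to Spanish",
    dataset_id="4fbfc81d-938c-4703-beaf-af404fa5285f",  # sample ID from the response above
    task_definition=translation_task("English", "Spanish"),
)
print(json.dumps(payload, indent=2))
```

The printed JSON has the same shape as the `JSON_STRING` variable built in the cURL tab.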

### Trigger the Workflows
Next, let's trigger workflows to evaluate two models: `gpt-4o-mini` from OpenAI and [`facebook/m2m100_418M`](https://huggingface.co/facebook/m2m100_418M) from the Hugging Face Model Hub. You can repeat this process for as many models as you would like to evaluate in the experiment. The workflow creation request also specifies the metrics to compute: [BLEU](https://github.com/huggingface/evaluate/tree/main/metrics/bleu) and [METEOR](https://github.com/huggingface/evaluate/tree/main/metrics/meteor), which are word-overlap metrics, and [COMET](https://unbabel.github.io/COMET/html/index.html), which is a neural translation metric.

Set up the following environment variables in a file called `common_variables.sh`:
```console
user@host:~/lumigator$ cat <<EOF > common_variables.sh
#!/bin/bash
# Common API configuration for workflows
export WORKFLOW_DATASET="\$(curl -s http://localhost:8000/api/v1/datasets/ | jq -r '.items | .[0].id')"
export EXPERIMENT_ID="\$(curl -s http://localhost:8000/api/v1/experiments/ | jq -r '.items | .[0].id')"
export METRICS='["bleu", "meteor", "comet"]'
EOF
```

And then source the file:
```console
user@host:~/lumigator$ source common_variables.sh
```


::::{tab-set}

:::{tab-item} cURL (OpenAI)
:sync: openai-curl

Set the following variables:
```console
user@host:~/lumigator$ export WORKFLOW_NAME="OpenAI Translation" \
WORKFLOW_DESC="Translate English to Spanish with OpenAI"
```
Define the JSON string for the OpenAI model:
```console
user@host:~/lumigator$ export JSON_STRING=$(jq -n \
--arg name "$WORKFLOW_NAME" \
--arg model "gpt-4o-mini" \
--arg provider "openai" \
--arg secret_key_name "openai_api_key" \
--arg desc "$WORKFLOW_DESC" \
--arg exp_id "$EXPERIMENT_ID" \
--argjson batch_size 5 \
--argjson task_definition "$TASK_DEFINITION" \
--argjson metrics "$METRICS" \
'{name: $name, description: $desc, model: $model, provider: $provider, secret_key_name: $secret_key_name, batch_size: $batch_size, experiment_id: $exp_id, task_definition: $task_definition, metrics: $metrics}')
```

Trigger the workflow:
```console
user@host:~/lumigator$ curl -s http://localhost:8000/api/v1/workflows/ \
-H 'Accept: application/json' \
-H 'Content-Type: application/json' \
-d "$JSON_STRING" | jq
{
  "id": "6e757bc0334645749d57023ed0a509df",
  "experiment_id": "48",
  "model": "gpt-4o-mini",
  "name": "OpenAI Translation",
  "description": "Translate English to Spanish with OpenAI",
  "system_prompt": "translate English to Spanish: ",
  "status": "created",
  "created_at": "2025-03-17T15:46:50.775000",
  "updated_at": null
}
```

:::

:::{tab-item} SDK (OpenAI)
:sync: openai-python
```python
from lumigator_schemas.workflows import WorkflowCreateRequest

batch_size = 5
metrics = ["bleu", "meteor", "comet"]
request = WorkflowCreateRequest(
    name="OpenAI Translation",
    description="Translate English to Spanish with OpenAI",
    model="gpt-4o-mini",
    provider="openai",
    secret_key_name="openai_api_key",
    experiment_id=experiment_id,
    task_definition=task_definition,
    batch_size=batch_size,
    metrics=metrics
)
client.workflows.create_workflow(request).model_dump()
```
:::

:::{tab-item} cURL (Hugging Face)
:sync: hf-curl

Set the following variables:
```console
user@host:~/lumigator$ export WORKFLOW_NAME="Hugging Face Translation" \
WORKFLOW_DESC="Translate English to Spanish with M2M100"
```

Define the JSON string for the Hugging Face model:
```console
user@host:~/lumigator$ export JSON_STRING=$(jq -n \
--arg name "$WORKFLOW_NAME" \
--arg model "facebook/m2m100_418M" \
--arg provider "hf" \
--arg desc "$WORKFLOW_DESC" \
--arg dataset_id "$WORKFLOW_DATASET" \
--arg exp_id "$EXPERIMENT_ID" \
--argjson batch_size 5 \
--argjson task_definition "$TASK_DEFINITION" \
--argjson metrics "$METRICS" \
'{name: $name, description: $desc, model: $model, provider: $provider, experiment_id: $exp_id, batch_size: $batch_size, dataset: $dataset_id, task_definition: $task_definition, metrics: $metrics}')
```

Trigger the workflow:
```console
user@host:~/lumigator$ curl -s http://localhost:8000/api/v1/workflows/ \
-H 'Accept: application/json' \
-H 'Content-Type: application/json' \
-d "$JSON_STRING" | jq
{
  "id": "169c3169b7d549598b8b094c0dd9c806",
  "experiment_id": "48",
  "model": "facebook/m2m100_418M",
  "name": "Hugging Face Translation",
  "description": "Translate English to Spanish with M2M100",
  "system_prompt": "translate English to Spanish: ",
  "status": "created",
  "created_at": "2025-03-17T16:37:04.211000",
  "updated_at": null
}
```
:::

:::{tab-item} SDK (Hugging Face)
:sync: hf-python
```python
from lumigator_schemas.workflows import WorkflowCreateRequest

batch_size = 5
metrics = ["bleu", "meteor", "comet"]
request = WorkflowCreateRequest(
    name="Hugging Face Translation",
    description="Translate English to Spanish with M2M100",
    model="facebook/m2m100_418M",
    provider="hf",
    experiment_id=experiment_id,
    task_definition=task_definition,
    batch_size=batch_size,
    metrics=metrics
)
client.workflows.create_workflow(request).model_dump()
```
:::

::::
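
Since the two workflow requests differ only in the model-specific fields, repeating the process for more models can be expressed as a loop. The following is a minimal sketch that builds one request body per model without sending anything. The helper `workflow_payloads` and the `MODELS` list are hypothetical names introduced for illustration; the field names and values are taken from the requests above.

```python
import json

TASK_DEFINITION = {
    "task": "translation",
    "source_language": "English",
    "target_language": "Spanish",
}
METRICS = ["bleu", "meteor", "comet"]

# Model-specific fields, copied from the two workflows above
MODELS = [
    {
        "name": "OpenAI Translation",
        "description": "Translate English to Spanish with OpenAI",
        "model": "gpt-4o-mini",
        "provider": "openai",
        "secret_key_name": "openai_api_key",
    },
    {
        "name": "Hugging Face Translation",
        "description": "Translate English to Spanish with M2M100",
        "model": "facebook/m2m100_418M",
        "provider": "hf",
    },
]


def workflow_payloads(experiment_id, models, batch_size=5):
    """Combine the shared settings with each model's specific fields."""
    return [
        {
            **spec,
            "experiment_id": experiment_id,
            "task_definition": dict(TASK_DEFINITION),
            "metrics": list(METRICS),
            "batch_size": batch_size,
        }
        for spec in models
    ]


for payload in workflow_payloads("48", MODELS):
    print(json.dumps(payload))
```

Each resulting dictionary carries the same fields as the requests shown in the tabs, so it should be usable as the JSON body for the workflows endpoint or as keyword arguments to `WorkflowCreateRequest`.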

## Verify
After the workflows have been triggered, you may need to wait a few minutes for the jobs to complete. You can check the status on the Experiments page in the UI. Once the jobs are complete, retrieve the experiment details with the following commands to compare results and review performance. All evaluation details can also be [viewed in the UI](../get-started/ui-guide.md#view-results).

::::{tab-set}

:::{tab-item} cURL
:sync: tab1

Set the following variables:
```console
user@host:~/lumigator$ export EXPERIMENT_ID="$(curl -s http://localhost:8000/api/v1/experiments/ | jq -r '.items | .[0].id')"
```

Get the experiment and check the `metrics` field for both workflows:

```console
user@host:~/lumigator$ curl -s http://localhost:8000/api/v1/experiments/$EXPERIMENT_ID | jq
{
  "id": "48",
  "name": "English to Spanish Translation",
  "description": "Evaluate which model best translates English to Spanish",
  "created_at": "2025-03-17T14:01:18.783000",
  "task_definition": {
    "task": "translation",
    "source_language": "English",
    "target_language": "Spanish"
  },
  "dataset": "4fbfc81d-938c-4703-beaf-af404fa5285f",
  "updated_at": "2025-03-17T14:01:18.783000",
  "workflows": [
    {
      "id": "169c3169b7d549598b8b094c0dd9c806",
      "experiment_id": "48",
      "model": "facebook/m2m100_418M",
      "name": "Hugging Face Translation",
      "description": "Translate English to Spanish with M2M100",
      "system_prompt": "translate English to Spanish: ",
      "status": "succeeded",
      "created_at": "2025-03-17T16:37:04.211000",
      "updated_at": null,
      "jobs": [
        {
          "id": "baa73e2a-3c81-4643-8797-513d31825922",
          "metrics": [
            {
              "name": "meteor_meteor_mean",
              "value": 0.811
            },
            {
              "name": "bleu_bleu_mean",
              "value": 0.472
            },
            {
              "name": "comet_mean_score",
              "value": 1.158
            }
          ],
          ...
```
:::

:::{tab-item} Python SDK
:sync: tab2
```python
experiment_details = client.experiments.get_experiment(experiment_id)
print(experiment_details.model_dump_json())
```
:::

::::
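
To compare models side by side, the nested response can be flattened into one set of scores per model. The sketch below assumes the response has the shape shown in the cURL output above (`workflows`, then `jobs`, then `metrics`); `metrics_by_model` and `best_model` are hypothetical helpers, not SDK functions. The sample dictionary is a trimmed copy of that output, which shows only the Hugging Face workflow.

```python
def metrics_by_model(experiment):
    """Map each workflow's model name to a {metric_name: value} dictionary."""
    table = {}
    for workflow in experiment.get("workflows") or []:
        scores = table.setdefault(workflow["model"], {})
        for job in workflow.get("jobs") or []:
            # Some jobs may carry no metrics, so treat missing or null as empty
            for metric in job.get("metrics") or []:
                scores[metric["name"]] = metric["value"]
    return table


def best_model(table, metric_name):
    """Return the model with the highest value for a metric, or None."""
    scored = {m: s[metric_name] for m, s in table.items() if metric_name in s}
    return max(scored, key=scored.get) if scored else None


# Trimmed copy of the sample response above
experiment = {
    "id": "48",
    "workflows": [
        {
            "model": "facebook/m2m100_418M",
            "status": "succeeded",
            "jobs": [
                {
                    "id": "baa73e2a-3c81-4643-8797-513d31825922",
                    "metrics": [
                        {"name": "meteor_meteor_mean", "value": 0.811},
                        {"name": "bleu_bleu_mean", "value": 0.472},
                        {"name": "comet_mean_score", "value": 1.158},
                    ],
                }
            ],
        }
    ],
}

for model, scores in metrics_by_model(experiment).items():
    print(model, scores)
```

With the SDK, the input dictionary can be obtained from `experiment_details.model_dump()`. For BLEU, METEOR, and COMET, higher values indicate better translations, which is why `best_model` picks the maximum.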


## Next Steps

You have successfully run a translation evaluation experiment using Lumigator with a sample dataset. You can now test other models on your own datasets or on translation datasets from the [Hugging Face Hub](https://huggingface.co/datasets?task_categories=task_categories:translation&sort=downloads).