Datasets API
A dataset is a collection of data points (or samples) used for training, testing, or evaluating machine learning models.
The API provides endpoints to upload a dataset, list all datasets, retrieve the details of a specific dataset, delete a dataset, and download a dataset. Uploaded datasets are converted to HuggingFace format, and the CSV file recreated during upload may use different delimiters from the original. Each endpoint returns a status code that indicates whether the request succeeded or failed.
Endpoints
- POST /api/v1/datasets/
Upload Dataset
Uploads the dataset for use in Lumigator.
An uploaded dataset is parsed into HuggingFace format files and stored alongside a recreated version of the input dataset.
NOTE: The recreated CSV file may not use the same delimiters as the input, because it follows the format that HuggingFace uses when it generates CSV files.
- Status Codes:
201 Created – Dataset successfully uploaded
413 Request Entity Too Large – Dataset exceeds the maximum size (50MB)
422 Unprocessable Entity – Invalid CSV file
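The 413 and 422 responses above suggest two checks a client could run locally before uploading. The following is a minimal sketch, not part of Lumigator: the helper name `preflight_check` is hypothetical, "50MB" is assumed to mean 50 * 1024 * 1024 bytes, and the CSV checks are only a guess at what the server treats as an invalid CSV file.

```python
import csv
import io

# Assumption: "50MB" is read as 50 * 1024 * 1024 bytes; the server's cutoff may differ.
MAX_DATASET_BYTES = 50 * 1024 * 1024


def preflight_check(data: bytes, max_bytes: int = MAX_DATASET_BYTES) -> list[str]:
    """Return reasons the upload might be rejected (empty list if none were found)."""
    # Size check mirrors the documented 413 response.
    if len(data) > max_bytes:
        return ["file exceeds the size limit (expect 413)"]
    # Parse check is a local approximation of the documented 422 response.
    try:
        text = data.decode("utf-8")
        rows = list(csv.reader(io.StringIO(text, newline="")))
    except (UnicodeDecodeError, csv.Error):
        return ["file is not readable as UTF-8 CSV (may yield 422)"]
    # Assumption: a dataset without a header row is not usable.
    if not rows or not any(rows[0]):
        return ["file has no header row (may yield 422)"]
    return []
```

An empty list means only that these local checks passed; the server remains the authority on whether the file is accepted.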
- GET /api/v1/datasets/
List Datasets
- Query Parameters:
skip (integer)
limit (integer)
- Status Codes:
200 OK – Successful Response
422 Unprocessable Entity – Validation Error
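The `skip` and `limit` query parameters can be combined to page through the list. The sketch below only builds request URLs. It assumes the conventional meaning (skip the first `skip` items, return at most `limit`), a hypothetical base URL of `http://localhost:8000`, and a page size chosen purely for illustration; the server's defaults are not stated here.

```python
from urllib.parse import urlencode

# Assumption: hypothetical base URL; replace with the address of your deployment.
BASE_URL = "http://localhost:8000"


def list_datasets_url(skip: int = 0, limit: int = 50, base_url: str = BASE_URL) -> str:
    """Build the URL for one page of GET /api/v1/datasets/."""
    if skip < 0 or limit < 1:
        raise ValueError("skip must be >= 0 and limit must be >= 1")
    query = urlencode({"skip": skip, "limit": limit})
    return f"{base_url}/api/v1/datasets/?{query}"


def page_urls(total: int, page_size: int, base_url: str = BASE_URL) -> list[str]:
    """Build one URL per page needed to cover `total` datasets."""
    return [
        list_datasets_url(skip=offset, limit=page_size, base_url=base_url)
        for offset in range(0, total, page_size)
    ]
```

For example, `page_urls(25, 10)` produces three URLs with `skip` values of 0, 10, and 20.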
- GET /api/v1/datasets/{dataset_id}
Get Dataset
- Parameters:
dataset_id (string)
- Status Codes:
200 OK – Successful Response
422 Unprocessable Entity – Validation Error
- DELETE /api/v1/datasets/{dataset_id}
Delete Dataset
- Parameters:
dataset_id (string)
- Status Codes:
204 No Content – Successful Response
422 Unprocessable Entity – Validation Error
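The get and delete endpoints share the same path and differ only in HTTP method and success code (200 versus 204). A minimal sketch using Python's standard library follows. The base URL and the helper names `dataset_request` and `is_success` are hypothetical, and the request is only constructed, not sent.

```python
import urllib.request
from urllib.parse import quote

# Assumption: hypothetical base URL; replace with the address of your deployment.
BASE_URL = "http://localhost:8000"

# Success codes as listed for each method on /api/v1/datasets/{dataset_id}.
SUCCESS_STATUS = {"GET": 200, "DELETE": 204}


def dataset_request(dataset_id: str, method: str = "GET",
                    base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (but do not send) a GET or DELETE request for a single dataset."""
    if method not in SUCCESS_STATUS:
        raise ValueError(f"unsupported method: {method}")
    # Percent-encode the ID so it is safe to place in the URL path.
    url = f"{base_url}/api/v1/datasets/{quote(dataset_id, safe='')}"
    return urllib.request.Request(url, method=method)


def is_success(method: str, status: int) -> bool:
    """Check a response status against the documented success code."""
    return SUCCESS_STATUS.get(method) == status
```

Passing the returned object to `urllib.request.urlopen` would send the request. A 204 response to DELETE carries no body, so there is nothing to parse on success.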
- GET /api/v1/datasets/{dataset_id}/download
Get Dataset Download
Returns a collection of pre-signed URLs which can be used to download the dataset.
- Parameters:
dataset_id (string)
- Query Parameters:
extension (string or null) – When specified, only URLs for files with a matching file extension are returned, e.g. csv. Wildcards are not accepted. By default, URLs for all files are returned.
- Status Codes:
200 OK – Successful Response
422 Unprocessable Entity – Validation Error
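The optional `extension` filter can be illustrated with a small URL builder. This is a sketch under stated assumptions: the base URL and the helper name `download_url` are hypothetical, and the client-side wildcard check is only a convenience, since how the server responds to wildcards is not specified here. The shape of the response body is not shown because it is not documented in this section.

```python
from typing import Optional
from urllib.parse import quote, urlencode

# Assumption: hypothetical base URL; replace with the address of your deployment.
BASE_URL = "http://localhost:8000"


def download_url(dataset_id: str, extension: Optional[str] = None,
                 base_url: str = BASE_URL) -> str:
    """Build the URL for GET /api/v1/datasets/{dataset_id}/download."""
    if extension is not None and any(ch in extension for ch in "*?"):
        # Wildcards are not accepted, so fail early on the client side.
        raise ValueError("extension must not contain wildcards")
    url = f"{base_url}/api/v1/datasets/{quote(dataset_id, safe='')}/download"
    if extension is not None:
        # Only URLs for files with this extension are requested.
        url += "?" + urlencode({"extension": extension})
    return url
```

Omitting `extension` requests URLs for all files, which matches the default behavior described above.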