Datasets
Create and manage evaluation datasets and dataset runs from the dashboard, SDK, or API.
A dataset is a collection of items used to evaluate LLM pipelines. Each item has an input and, optionally, an expectedOutput, context, and metadata.
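As a rough illustration of the item shape described above, the sketch below models an item as a TypeScript interface and separates items that carry a reference answer from those that do not. The `DatasetItem` interface, the field types, and the `partitionByExpectedOutput` helper are all hypothetical names chosen for this example; they are not taken from the SDK, whose actual types may differ.

```typescript
// Hypothetical shape for illustration only; not the SDK's actual type definitions.
interface DatasetItem {
  input: unknown;                      // the input given to the pipeline under test
  expectedOutput?: unknown;            // optional reference output
  context?: unknown;                   // optional context (type assumed)
  metadata?: Record<string, unknown>;  // optional metadata (key/value form assumed)
}

// Splits items into those that have an expectedOutput and those that do not.
function partitionByExpectedOutput(items: DatasetItem[]): {
  withExpected: DatasetItem[];
  withoutExpected: DatasetItem[];
} {
  const withExpected: DatasetItem[] = [];
  const withoutExpected: DatasetItem[] = [];
  for (const item of items) {
    if (item.expectedOutput !== undefined) {
      withExpected.push(item);
    } else {
      withoutExpected.push(item);
    }
  }
  return { withExpected, withoutExpected };
}

// Example items: the first has a reference answer, the second only context and metadata.
const exampleItems: DatasetItem[] = [
  { input: "What is the capital of France?", expectedOutput: "Paris" },
  {
    input: "Summarize the attached note.",
    context: "Note: ship Friday.",
    metadata: { source: "manual" },
  },
];

const partitioned = partitionByExpectedOutput(exampleItems);
```

Because expectedOutput is optional, a split like this can be useful before running evaluations that compare against a reference, since items without one would need a different kind of check.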
You can manage datasets from:
- Dashboard UI: open Datasets in the left sidebar to create datasets, add items (one at a time or by CSV import), run evaluations, and export results.
- SDK: manage datasets programmatically with the TypeScript, Python, or Java SDK.
- REST API: see the Datasets API reference.