BrowserStack AI Evals

Dataset Items

Add items to datasets from the dashboard or SDK — single items, batch upload, or CSV import.

From the Dashboard

Open any dataset and select the Items tab to view and manage items.

Items Table

Each row shows one dataset item with these columns:

Column               Description
Item ID              Unique item identifier (click to open item detail)
Source               Link to the originating trace, observation, or session (if the item was created from a trace)
Status               ACTIVE or ARCHIVED
Input                JSON viewer (expandable)
Expected Output      JSON viewer (expandable)
Expected Tool Calls  JSON viewer (expandable, hidden by default)
Context              JSON viewer (hidden by default)
Metadata             JSON viewer (hidden by default)
Added by             Automation rule name or user who created the item

Use the column visibility toggle to show or hide columns.

Add a Single Item

Click New item in the top-right of the Items tab.

Fill in the form:

  • Dataset (required) — pre-selected to the current dataset. You can add the same item to multiple datasets.
  • Input (required) — JSON object, array, or double-quoted string
  • Expected Output (optional) — JSON
  • Expected Tool Calls (optional) — JSON
  • Metadata (optional) — JSON key-value pairs

Click Add to Dataset to add the item.
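
The Input rule above (a JSON object, array, or double-quoted string) can be checked before pasting a value into the form. The following is a minimal sketch, not the dashboard's own validation: the helper name `is_valid_input_field` is hypothetical, and the sketch assumes the form rejects other JSON values such as bare numbers, which the text above implies but does not state outright.

```python
import json


def is_valid_input_field(text: str) -> bool:
    """Return True if text parses as a JSON object, array, or string."""
    try:
        value = json.loads(text)
    except json.JSONDecodeError:
        # Unquoted text such as: What is 2 + 2?  is not valid JSON.
        return False
    # Assumption: only objects, arrays, and strings are accepted.
    return isinstance(value, (dict, list, str))


print(is_valid_input_field('{"question": "What is 2 + 2?"}'))  # True
print(is_valid_input_field('"What is 2 + 2?"'))                # True
print(is_valid_input_field("What is 2 + 2?"))                  # False
```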

Upload CSV

Click Upload CSV in the top-right of the Items tab.

Drag and drop a CSV file or click to browse. Maximum file size is 100 MB.

Preview the parsed data. The CSV columns are mapped to item fields: input, expectedOutput, context, metadata.

Confirm to import. Items are added to the dataset in bulk.
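
A local pre-check can catch an oversized file or a missing column before you reach the preview step. This is a sketch under stated assumptions: `precheck_csv` is a hypothetical helper, "100 MB" is taken to mean 100 × 1024 × 1024 bytes (the exact limit is not specified here), and the dashboard's handling of columns outside the four mapped fields is not documented, so the sketch only reports them.

```python
import csv
import os

# Assumption: the 100 MB limit is interpreted as 100 MiB.
MAX_BYTES = 100 * 1024 * 1024
# The four item fields the dashboard maps CSV columns to.
MAPPED_COLUMNS = {"input", "expectedOutput", "context", "metadata"}


def precheck_csv(path: str) -> list[str]:
    """Return a list of problems; an empty list means no problems were found."""
    problems = []
    if os.path.getsize(path) > MAX_BYTES:
        problems.append("file is larger than 100 MB")
    # Read only the header row.
    with open(path, newline="", encoding="utf-8") as f:
        header = next(csv.reader(f), [])
    if "input" not in header:
        problems.append("missing required 'input' column")
    unmapped = [col for col in header if col not in MAPPED_COLUMNS]
    if unmapped:
        problems.append(f"columns not mapped to item fields: {unmapped}")
    return problems
```

Calling `precheck_csv("dataset.csv")` returns an empty list when the file is within the size limit and its header uses only the mapped fields, including `input`.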

Item Actions

Click the actions menu on any item row for:

  • Archive / Unarchive — toggle the item's status between ACTIVE and ARCHIVED
  • Delete — permanently remove the item

Bulk Actions

Select multiple items using the checkboxes, then use the Actions dropdown:

  • Compare — compare selected items side by side (requires 2+ items)
  • Export Selected — export selected items

Use the Batch Export button in the toolbar to export all visible items as CSV or JSON.


From the SDK

Create Items (Batch)

Add multiple items to a dataset at once. Each item has an input, optional expectedOutput, optional context, and optional metadata.

TypeScript:

await datasets.createItems({
  datasetName: 'qa-golden-set',
  items: [
    {
      input: { question: 'What is the capital of France?' },
      expectedOutput: { answer: 'Paris' },
      metadata: { difficulty: 'easy' },
    },
    {
      input: { question: 'What is 2 + 2?' },
      expectedOutput: { answer: '4' },
    },
    {
      input: { question: 'Who wrote Hamlet?' },
      expectedOutput: { answer: 'William Shakespeare' },
      context: 'Classic English literature',
    },
  ],
});

Python:

result = client.datasets.create_items(
    dataset_name="qa-dataset-v1",
    items=[
        {
            "input": {"question": "What is your return policy?"},
            "expectedOutput": "Items can be returned within 30 days.",
            "metadata": {"category": "returns"},
        },
        {
            "input": {"question": "How do I track my order?"},
            "expectedOutput": "Log in and visit the Orders page.",
        },
    ],
)
print(f"Created {result['itemCount']} items")

Items are sent in batches of 100.

Java:

import com.browserstack.aisdk.eval.model.CreateDatasetItemRequest;
import java.util.List;
import java.util.Map;

List<CreateDatasetItemRequest> items = List.of(
    CreateDatasetItemRequest.builder()
        .input("What causes Northern Lights?")
        .expectedOutput("Solar wind particles interact with Earth's magnetic field...")
        .context("Reference document: Aurora Borealis — NASA")
        .build(),

    CreateDatasetItemRequest.builder()
        .input("How far is the Moon from Earth?")
        .expectedOutput("Approximately 384,400 km on average.")
        .build(),

    CreateDatasetItemRequest.builder()
        .input("What is photosynthesis?")
        .expectedOutput("The process plants use to convert sunlight into energy.")
        .metadata(Map.of("category", "biology", "difficulty", "easy"))
        .build()
);

CreateDatasetItemsResponse result = datasets.createItems("my-dataset", items);
System.out.println("Added " + result.getItemCount() + " items");
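
The item shape used in this section (a required input plus optional expectedOutput, context, and metadata) can be assembled with a small helper so that unset optional fields are left out of the payload. This is an illustrative sketch: `make_item` is a hypothetical function, not part of the SDK, and it assumes the SDK accepts items with optional keys omitted, as the Python example above suggests.

```python
from typing import Any, Optional


def make_item(
    input: Any,
    expected_output: Any = None,
    context: Any = None,
    metadata: Optional[dict] = None,
) -> dict:
    """Build an item dict, omitting optional fields that were not provided."""
    if input is None:
        raise ValueError("input is required")
    item = {"input": input}
    # Keys use the camelCase names shown in the SDK examples.
    if expected_output is not None:
        item["expectedOutput"] = expected_output
    if context is not None:
        item["context"] = context
    if metadata is not None:
        item["metadata"] = metadata
    return item


items = [
    make_item({"question": "What is your return policy?"},
              expected_output="Items can be returned within 30 days.",
              metadata={"category": "returns"}),
    make_item({"question": "How do I track my order?"}),
]
print(items[1])  # {'input': {'question': 'How do I track my order?'}}
```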

Import from CSV

Your CSV file should have headers that match the dataset item fields:

input,expectedOutput,context,metadata
"What is 2+2?","4","math textbook","{""difficulty"": ""easy""}"
"Capital of France?","Paris","geography quiz","{""difficulty"": ""medium""}"
"Explain recursion","A function that calls itself","CS fundamentals","{""difficulty"": ""hard""}"

Supported columns: input, expectedOutput, context, metadata, id, sourceTraceId, sourceObservationId, status. Only input is required.
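
The doubled quotes in the metadata column of the sample above are standard CSV escaping for a JSON string. Rather than writing them by hand, you can let Python's `csv` module produce them. The sketch below is one way to do that: `items_to_csv` is a hypothetical helper, and the assumption that the importer treats an empty cell as an absent optional field is not confirmed by this page.

```python
import csv
import io
import json

FIELDS = ["input", "expectedOutput", "context", "metadata"]

rows = [
    {"input": "What is 2+2?", "expectedOutput": "4",
     "context": "math textbook", "metadata": {"difficulty": "easy"}},
    {"input": "Capital of France?", "expectedOutput": "Paris"},
]


def items_to_csv(items: list[dict]) -> str:
    """Serialize items to CSV text, encoding metadata as a JSON string."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS,
                            quoting=csv.QUOTE_ALL, lineterminator="\n")
    writer.writeheader()
    for item in items:
        row = dict(item)
        if "metadata" in row:
            # json.dumps yields {"difficulty": "easy"}; the csv module
            # then doubles the inner quotes when writing the cell.
            row["metadata"] = json.dumps(row["metadata"])
        # Missing optional fields are written as empty cells.
        writer.writerow(row)
    return buf.getvalue()


print(items_to_csv(rows))
```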

TypeScript:

const result = await datasets.createItems({
  datasetName: 'qa-golden-set',
  fileUrl: '/path/to/dataset.csv',
});
console.log(`Imported ${result.itemCount} items`);

Python:

result = client.datasets.create_items(
    dataset_name="qa-golden-set",
    file_url="/path/to/dataset.csv",
    options={"batchSize": 50},
)
print(f"Imported {result['itemCount']} items")

Java:

CreateDatasetItemsResponse result = datasets.createItemsFromCsv(
    "qa-golden-set",
    "/path/to/dataset.csv"
);
System.out.println("Imported " + result.getItemCount() + " items");

To override the batch size in Java, pass it as a third argument:

datasets.createItemsFromCsv("qa-golden-set", "/path/to/dataset.csv", 50);
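
To make the batch size concrete, the sketch below shows how a list of items splits into batches of 100, or into a smaller size when overridden. It only illustrates the idea: `chunked` is a hypothetical helper, and the SDK's internal batching logic is not shown on this page.

```python
from typing import Iterator, Sequence, TypeVar

T = TypeVar("T")


def chunked(items: Sequence[T], batch_size: int = 100) -> Iterator[Sequence[T]]:
    """Yield consecutive slices of items, each at most batch_size long."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


all_items = list(range(250))  # stand-in for 250 dataset items
print([len(batch) for batch in chunked(all_items)])      # [100, 100, 50]
print([len(batch) for batch in chunked(all_items, 50)])  # [50, 50, 50, 50, 50]
```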