BrowserStack AI Evals

Evaluators API

Manage evaluators and evaluator lists for automated LLM evaluation.

Evaluators define the scoring functions applied to traces. Evaluator lists group multiple evaluators together and are referenced when creating experiments and evaluation executions.

Evaluator Lists

Create Evaluator List

POST /api/public/evaluator-lists

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| name | string | Yes | Unique name for the evaluator list |
| description | string | No | Description |
| evaluators | array | Yes | At least one evaluator config |
| evaluators[].evaluatorId | string | Yes | ID of the evaluator |
| evaluators[].params | array | Yes | Parameter bindings |
| evaluators[].params[].key | string | Yes | Parameter key |
| evaluators[].params[].value | string | No | Parameter value |
| evaluators[].params[].dataType | string | Yes | One of: string, integer, float, boolean, string[], integer[], float[], boolean[], list, dict |

curl -X POST https://evals-api.browserstack.com/api/public/evaluator-lists \
  -u "pk-lf-...:sk-lf-..." \
  -H "Content-Type: application/json" \
  -d '{
    "name": "rag-evaluators",
    "description": "Evaluators for RAG pipeline",
    "evaluators": [
      {
        "evaluatorId": "correctness-evaluator-id",
        "params": [
          { "key": "threshold", "dataType": "float", "value": "0.7" }
        ]
      },
      {
        "evaluatorId": "faithfulness-evaluator-id",
        "params": [
          { "key": "strict", "dataType": "boolean", "value": "true" }
        ]
      }
    ]
  }'

Response:

{
  "id": "eval-list-uuid-1",
  "name": "rag-evaluators",
  "description": "Evaluators for RAG pipeline",
  "projectId": "proj-xyz",
  "createdAt": "2026-04-03T10:00:00.000Z",
  "updatedAt": "2026-04-03T10:00:00.000Z",
  "evaluatorConfigs": [
    {
      "id": "config-uuid-1",
      "evaluatorId": "correctness-evaluator-id",
      "evaluator": {
        "id": "correctness-evaluator-id",
        "name": "Correctness",
        "description": "Measures factual accuracy",
        "order": 1,
        "createdAt": "2026-01-01T00:00:00.000Z",
        "updatedAt": "2026-01-01T00:00:00.000Z"
      },
      "params": [
        { "key": "threshold", "value": "0.7", "dataType": "float" }
      ]
    }
  ]
}
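
The request body above can also be assembled programmatically. The following is a minimal sketch, not an official client: `build_evaluator_list_payload` and `ALLOWED_DATA_TYPES` are hypothetical names introduced here, and the validation only mirrors the field table (required `name`, at least one evaluator config, `dataType` drawn from the documented set). Whether the server enforces exactly these rules is an assumption.

```python
import json

# dataType values copied from the field table in this section.
ALLOWED_DATA_TYPES = {
    "string", "integer", "float", "boolean",
    "string[]", "integer[]", "float[]", "boolean[]",
    "list", "dict",
}


def build_evaluator_list_payload(name, evaluators, description=None):
    """Return a dict shaped like the Create Evaluator List request body.

    `evaluators` is a list of (evaluator_id, params) pairs, where `params`
    is a list of (key, data_type, value) tuples. Hypothetical helper.
    """
    if not name:
        raise ValueError("name is required")
    if not evaluators:
        raise ValueError("at least one evaluator config is required")

    configs = []
    for evaluator_id, params in evaluators:
        bindings = []
        for key, data_type, value in params:
            if data_type not in ALLOWED_DATA_TYPES:
                raise ValueError(f"unsupported dataType: {data_type}")
            binding = {"key": key, "dataType": data_type}
            if value is not None:  # value is optional per the field table
                binding["value"] = value
            bindings.append(binding)
        configs.append({"evaluatorId": evaluator_id, "params": bindings})

    payload = {"name": name, "evaluators": configs}
    if description is not None:
        payload["description"] = description
    return payload


payload = build_evaluator_list_payload(
    "rag-evaluators",
    [
        ("correctness-evaluator-id", [("threshold", "float", "0.7")]),
        ("faithfulness-evaluator-id", [("strict", "boolean", "true")]),
    ],
    description="Evaluators for RAG pipeline",
)
body = json.dumps(payload)  # string suitable for the -d argument or an HTTP client
```

Parameter values are passed as strings here because the curl example sends "0.7" and "true" as strings alongside their `dataType`.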

List Evaluator Lists

GET /api/public/evaluator-lists

| Parameter | Type | Description |
| --- | --- | --- |
| page | integer | Page number (default: 1) |
| limit | integer | Items per page (default: 50) |
| orderBy | object | { column: string, order: "ASC" \| "DESC" } |

curl "https://evals-api.browserstack.com/api/public/evaluator-lists?page=1&limit=20" \
  -u "pk-lf-...:sk-lf-..."

Response:

{
  "evaluators": [
    {
      "id": "eval-list-uuid-1",
      "name": "rag-evaluators",
      "description": "Evaluators for RAG pipeline",
      "projectId": "proj-xyz",
      "createdAt": "2026-04-03T10:00:00.000Z",
      "updatedAt": "2026-04-03T10:00:00.000Z",
      "evaluatorConfigs": [...]
    }
  ],
  "totalCount": 1
}
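
To walk every page of results, a client can keep requesting pages until it has seen `totalCount` items. The sketch below assumes that `totalCount` is the total across all pages rather than the size of the current page; the source does not say which. `page_url`, `iter_evaluator_lists`, and the `fetch_page` callback are hypothetical names, and the HTTP call itself is left to the caller so the loop stays independent of any particular client library.

```python
from urllib.parse import urlencode

BASE_URL = "https://evals-api.browserstack.com/api/public/evaluator-lists"


def page_url(page=1, limit=50):
    """Build the list URL with the documented page and limit parameters."""
    return f"{BASE_URL}?{urlencode({'page': page, 'limit': limit})}"


def iter_evaluator_lists(fetch_page, limit=50):
    """Yield every evaluator list, requesting one page at a time.

    `fetch_page(page, limit)` is supplied by the caller and must return the
    parsed JSON body, i.e. a dict with "evaluators" and "totalCount" keys.
    """
    page = 1
    seen = 0
    while True:
        body = fetch_page(page, limit)
        items = body.get("evaluators", [])
        if not items:
            return  # an empty page ends the iteration
        for item in items:
            yield item
        seen += len(items)
        # ASSUMPTION: totalCount counts items across all pages.
        # If the key is missing, iteration stops after the first page.
        if seen >= body.get("totalCount", 0):
            return
        page += 1
```

Passing the fetcher in as a function also makes the loop easy to exercise against canned responses before pointing it at the real endpoint.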

Get Evaluator List

GET /api/public/evaluator-lists/{evaluatorListId}

curl "https://evals-api.browserstack.com/api/public/evaluator-lists/eval-list-uuid-1" \
  -u "pk-lf-...:sk-lf-..."

Delete Evaluator List

DELETE /api/public/evaluator-lists/{evaluatorListId}

curl -X DELETE "https://evals-api.browserstack.com/api/public/evaluator-lists/eval-list-uuid-1" \
  -u "pk-lf-...:sk-lf-..."
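
The curl examples in this section pass credentials with `-u`, which sends them as HTTP Basic authentication. As a rough sketch of the same request from Python's standard library, the code below builds (but does not send) a request object. `basic_auth_header` and `build_request` are hypothetical helper names, and `username` / `password` stand for the two halves of the `-u` value; the key values themselves are elided in the source and are not filled in here.

```python
import base64
import urllib.request

BASE_URL = "https://evals-api.browserstack.com/api/public"


def basic_auth_header(username, password):
    """Encode credentials the way `curl -u username:password` does."""
    raw = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


def build_request(method, path, username, password):
    """Return an unsent urllib Request for the given method and API path."""
    request = urllib.request.Request(f"{BASE_URL}{path}", method=method)
    request.add_header("Authorization", basic_auth_header(username, password))
    return request


# Example: prepare the Delete Evaluator List call.
# Sending it would be urllib.request.urlopen(request); that step is omitted
# here because it needs real credentials and performs a destructive action.
request = build_request(
    "DELETE", "/evaluator-lists/eval-list-uuid-1", "your-username", "your-password"
)
```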

Evaluators

Individual evaluator definitions are available through the evaluators endpoints, which list all evaluators or return a single evaluator by ID.

List Evaluators

GET /api/public/evaluators

curl "https://evals-api.browserstack.com/api/public/evaluators" \
  -u "pk-lf-...:sk-lf-..."

Get Evaluator

GET /api/public/evaluators/{evaluatorId}

curl "https://evals-api.browserstack.com/api/public/evaluators/correctness-evaluator-id" \
  -u "pk-lf-...:sk-lf-..."
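
Creating an evaluator list requires evaluator IDs, so a client will often want to look IDs up by name. The response shape of these two endpoints is not shown in this section; the sketch below assumes each evaluator object carries at least the `id` and `name` fields seen in the embedded `evaluator` object of the Create Evaluator List response. `index_evaluators_by_name` is a hypothetical helper, and it takes an already-extracted list of evaluator objects because the response envelope is unknown.

```python
def index_evaluators_by_name(evaluators):
    """Map evaluator name to evaluator ID.

    ASSUMPTION: each item is a dict with "id" and "name" keys, as in the
    `evaluator` object embedded in the create response.
    """
    index = {}
    for evaluator in evaluators:
        name = evaluator["name"]
        if name in index:
            # Uniqueness of names is not documented, so fail loudly
            # instead of silently picking one of the duplicates.
            raise ValueError(f"duplicate evaluator name: {name}")
        index[name] = evaluator["id"]
    return index


# Example with an object shaped like the one in the create response.
ids = index_evaluators_by_name(
    [{"id": "correctness-evaluator-id", "name": "Correctness"}]
)
correctness_id = ids["Correctness"]
```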

Parameter Data Types

| dataType | Description |
| --- | --- |
| string | Single string value |
| integer | Integer number |
| float | Floating-point number |
| boolean | true or false |
| string[] | Array of strings |
| integer[] | Array of integers |
| float[] | Array of floats |
| boolean[] | Array of booleans |
| list | Generic list |
| dict | Generic object/dictionary |
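
Because `value` travels as a string next to its `dataType`, a client reading params back may want to convert them to native values. The following is a sketch under stated assumptions: scalar values follow the forms used in the examples in this section ("0.7", "true"), while array, list, and dict values are ASSUMED to be JSON-encoded strings, which the source does not confirm. `decode_param` is a hypothetical helper name.

```python
import json

SCALAR_DECODERS = {"string": str, "integer": int, "float": float}
LIST_TYPES = {"string[]", "integer[]", "float[]", "boolean[]", "list"}


def decode_param(param):
    """Convert a {"key", "value", "dataType"} binding to a Python value."""
    data_type = param["dataType"]
    raw = param.get("value")
    if raw is None:  # value is optional
        return None
    if data_type in SCALAR_DECODERS:
        return SCALAR_DECODERS[data_type](raw)
    if data_type == "boolean":
        if raw not in ("true", "false"):
            raise ValueError(f"not a boolean: {raw!r}")
        return raw == "true"
    if data_type in LIST_TYPES or data_type == "dict":
        # ASSUMPTION: container values are JSON-encoded strings.
        decoded = json.loads(raw)
        expected = dict if data_type == "dict" else list
        if not isinstance(decoded, expected):
            raise ValueError(f"value does not match dataType {data_type}")
        return decoded
    raise ValueError(f"unsupported dataType: {data_type}")


threshold = decode_param({"key": "threshold", "value": "0.7", "dataType": "float"})
strict = decode_param({"key": "strict", "value": "true", "dataType": "boolean"})
```

If the service turns out to encode containers differently, only the container branch needs to change.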