BrowserStack AI Evals
Tracing

Tracing Overview

Capture LLM calls, spans, events, and scores across your AI pipeline with BrowserStack AI Evals tracing.

Tracing

Tracing gives you full visibility into every step of your AI pipeline — from the first user input to the final model response. BrowserStack AI Evals captures LLM calls, retrieval steps, tool invocations, and evaluation scores in a structured, searchable hierarchy.

What Gets Captured

SignalDescription
TracesTop-level container for a single request or pipeline run. Carries user ID, session ID, tags, and metadata.
GenerationsIndividual LLM calls with model name, input messages, output, token usage, and latency.
SpansNon-LLM steps such as retrieval, tool calls, or pre/post-processing.
EventsPoint-in-time observations within a trace or span (e.g., cache miss, tool selected).
ScoresEvaluation results attached to any trace or observation.

Two Approaches

Auto-instrumentation — import one line and every LLM call in your app is traced automatically. Supports OpenAI, Anthropic, Bedrock, Vertex AI, LangChain, and more.

Manual tracing — create traces, spans, and generations explicitly for full control over the trace structure. Useful for custom pipelines or non-LLM steps.

You can mix both approaches: auto-instrumentation captures LLM calls while manual traces add context and scores around them.

Get Started