Capture LLM calls, spans, events, and scores across your AI pipeline with BrowserStack AI Evals tracing.

Tracing

Tracing gives you full visibility into every step of your AI pipeline — from the first user input to the final model response. BrowserStack AI Evals captures LLM calls, retrieval steps, tool invocations, and evaluation scores in a structured, searchable hierarchy.

What Gets Captured

Signal	Description
Traces	Top-level container for a single request or pipeline run. Carries user ID, session ID, tags, and metadata.
Generations	Individual LLM calls with model name, input messages, output, token usage, and latency.
Spans	Non-LLM steps such as retrieval, tool calls, or pre/post-processing.
Events	Point-in-time observations within a trace or span (e.g., cache miss, tool selected).
Scores	Evaluation results attached to any trace or observation.

Two Approaches

Auto-instrumentation — import one line and every LLM call in your app is traced automatically. Supports OpenAI, Anthropic, Bedrock, Vertex AI, LangChain, and more.

Manual tracing — create traces, spans, and generations explicitly for full control over the trace structure. Useful for custom pipelines or non-LLM steps.

You can mix both approaches: auto-instrumentation captures LLM calls while manual traces add context and scores around them.

Tracing Overview

Tracing

What Gets Captured

Two Approaches

Get Started

Setup & Installation

Manual Tracing

Auto-Instrumentation

Model Providers

Frameworks

HTTP Instrumentation

Distributed Tracing

Media & File Uploads

PII Masking

On this page