Manual Tracing
Create traces, spans, generations, events, and scores to instrument any AI pipeline.
Manual tracing lets you instrument any code path — whether or not it uses a supported LLM provider. Use it to structure multi-step pipelines, capture inputs and outputs, and add evaluation scores.
Hierarchy
Trace
├── Generation (LLM call)
├── Span (sub-step: retrieval, tool call, etc.)
│   └── Event (point-in-time observation)
└── Score (evaluation result)
Creating a Trace
A trace is the top-level container for a single request or pipeline run.
Node.js:
import { AISDK } from '@browserstack/ai-sdk';
const testOps = new AISDK({
  publicKey: process.env.AISDK_PUBLIC_KEY,
  secretKey: process.env.AISDK_SECRET_KEY,
});
const trace = testOps.trace({
  name: 'qa-pipeline',
  userId: 'user-123',
  sessionId: 'session-456',
  input: { question: 'What is the capital of France?' },
  metadata: { source: 'api' },
  tags: ['production', 'v2'],
});
Python:
import os
from browserstack_ai_sdk import AISDK
client = AISDK(
    public_key=os.environ["AISDK_PUBLIC_KEY"],
    secret_key=os.environ["AISDK_SECRET_KEY"],
)
trace = client.trace(
    name="rag-pipeline",
    user_id="user-123",
    session_id="session-abc",
    tags=["rag", "production"],
    metadata={"version": "2.0"},
)
Java:
import com.browserstack.aisdk.TestOps;
import com.browserstack.aisdk.tracing.TraceManager;
import com.browserstack.aisdk.tracing.model.TraceBody;
TestOps sdk = TestOps.fromEnv();
TraceManager tm = sdk.traceManager();
var trace = tm.trace(TraceBody.builder()
        .name("answer-question")
        .input("What causes Northern Lights?")
        .userId("user-42")
        .sessionId("session-abc")
        .environment("production")
        .build());
Trace Options
Node.js:
| Field | Type | Required | Description |
|---|---|---|---|
| name | string | No | Display name for the trace. |
| userId | string | No | ID of the user that triggered this trace. |
| sessionId | string | No | Group multiple traces into a session. |
| input | any | No | Input data (shown in the dashboard). |
| output | any | No | Output data. Usually set when the trace completes. |
| metadata | Record<string, any> | No | Arbitrary key-value metadata. |
| tags | string[] | No | Labels for filtering in the dashboard. |
| id | string | No | Custom trace ID (auto-generated if omitted). |
Python:
| Field | Type | Required | Description |
|---|---|---|---|
| name | str | Yes | Display name for the trace. |
| user_id | str | No | ID of the user that triggered this trace. |
| session_id | str | No | Group multiple traces into a session. |
| input | any | No | Input data (shown in the dashboard). |
| output | any | No | Output data. Usually set via trace.update(). |
| metadata | dict | No | Arbitrary key-value metadata. |
| tags | list[str] | No | Labels for filtering in the dashboard. |
| public | bool | No | Whether the trace is publicly visible. |
| id | str | No | Custom trace ID (auto-generated if omitted). |
| expected_output | any | No | Expected output for evaluation. |
Java (all TraceBody fields):
| Field | Type | Required | Description |
|---|---|---|---|
| name | String | No | Display name for the trace. |
| input | Object | No | Root input (String, Map, or any serializable object). |
| output | Object | No | Final output. |
| userId | String | No | User identifier. |
| sessionId | String | No | Session identifier for grouping traces. |
| environment | String | No | Environment name (e.g., "production"). |
| release | String | No | Release version. |
| tags | List<String> | No | Arbitrary tags. |
| metadata | Map<String, Object> | No | Custom key-value metadata. |
| isPublic | Boolean | No | Whether the trace is publicly visible. |
| id | String | No | Custom trace ID (auto-generated if omitted). |
Custom trace IDs
You can attach a business or external identifier to a trace — useful for idempotent retries (the same request ID always maps to the same trace) and for looking up traces later by an ID your system already owns (an order ID, a request UUID, a user-message UUID).
There are two ways to do this. Both produce the same result:
- Pass any string as id when creating the trace. The SDK detects whether the value is a 32-character hex string and either preserves it or hashes it deterministically.
- Pre-compute the trace ID with generateTraceId(customId) / generate_trace_id(custom_id) and pass the result as id. This is useful when you need the trace ID before the trace is created (for example, to return it in an API response).
How the SDK resolves the id you pass:
| Input | Stored trace ID | customId field |
|---|---|---|
| null / undefined / empty / non-string | random UUID (no dashes) | not set |
| 32-character hex (with or without dashes, any case) | normalised lowercase hex | the original string, if it differed from the normalised form |
| any other string | SHA-256 of the input, truncated to 32 chars | the original string |
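The resolution rules in the table can be sketched in a few lines of Python. This is an illustration only, not the SDK's implementation: the function name resolve_trace_id is hypothetical, and the sketch assumes that "SHA-256 of the input, truncated to 32 chars" means the first 32 hex digits of the digest of the UTF-8 encoded string.

```python
import hashlib
import re
import uuid
from typing import Optional, Tuple

_HEX32 = re.compile(r"[0-9a-f]{32}")


def resolve_trace_id(value: object) -> Tuple[str, Optional[str]]:
    """Return (trace_id, custom_id) following the resolution table above."""
    # Missing, empty, or non-string input: random UUID without dashes, no custom ID.
    if not isinstance(value, str) or value == "":
        return uuid.uuid4().hex, None

    # 32-character hex, with or without dashes, any case: normalise to lowercase.
    normalised = value.replace("-", "").lower()
    if _HEX32.fullmatch(normalised):
        custom_id = value if value != normalised else None
        return normalised, custom_id

    # Any other string: SHA-256, truncated to 32 characters.
    # Assumption: UTF-8 encoding and the first 32 hex digits of the digest.
    digest = hashlib.sha256(value.encode("utf-8")).hexdigest()
    return digest[:32], value
```

Under these assumptions, calling resolve_trace_id("request-abc-123") twice returns the same 32-character ID both times, while resolve_trace_id(None) returns a fresh random ID on every call.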
Once a trace exists, read the original custom string via trace.customId (Node) / trace.custom_id (Python). The dashboard surfaces it on the trace detail page and lets you filter the Traces list by Custom ID.
Node.js:
import { AISDK } from '@browserstack/ai-sdk';
const testOps = new AISDK({
  publicKey: process.env.AISDK_PUBLIC_KEY,
  secretKey: process.env.AISDK_SECRET_KEY,
});
// Option 1: pass the custom string directly
const trace = testOps.trace({
  id: 'request-abc-123',
  name: 'qa-pipeline',
});
console.log(trace.id); // 32-char hex (deterministic for 'request-abc-123')
console.log(trace.customId); // 'request-abc-123'
// Option 2: pre-compute when you need the hex up front
const traceId = testOps.generateTraceId('request-abc-123');
const sameTrace = testOps.trace({ id: traceId, name: 'qa-pipeline' });
// sameTrace.id === trace.id
Python:
import os
from browserstack_ai_sdk import AISDK
client = AISDK(
    public_key=os.environ["AISDK_PUBLIC_KEY"],
    secret_key=os.environ["AISDK_SECRET_KEY"],
)
# Option 1: pass the custom string directly
trace = client.trace(id="request-abc-123", name="qa-pipeline")
print(trace.id)  # 32-char hex (deterministic for 'request-abc-123')
print(trace.custom_id)  # 'request-abc-123'
# Option 2: pre-compute when you need the hex up front
trace_id = client.generate_trace_id("request-abc-123")
same_trace = client.trace(id=trace_id, name="qa-pipeline")
# same_trace.id == trace.id
Java:
import com.browserstack.aisdk.TestOps;
import com.browserstack.aisdk.tracing.model.TraceBody;
TestOps sdk = TestOps.fromEnv();
// Pass the custom string directly as `id`
TraceBody body = TraceBody.builder()
        .id("request-abc-123")
        .name("qa-pipeline")
        .build();
var trace = sdk.tracing().trace(body);
System.out.println(trace.getId());
The Java SDK does not expose a generateTraceId helper yet. Pass the custom string directly as id; the backend applies the same resolution rules.
Custom IDs are deterministic: the same input string always produces the same trace ID. This is what makes retries idempotent — a retried request with the same custom ID writes to the same trace rather than creating a duplicate.
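To see why determinism makes retries idempotent, consider this minimal sketch. Everything in it is hypothetical: trace_id_for stands in for the SDK's ID helper (assumed here to be a truncated SHA-256 hex digest), and the traces dictionary stands in for the tracing backend.

```python
import hashlib
from typing import Dict, List


def trace_id_for(custom_id: str) -> str:
    # Assumption: stand-in for the SDK helper; SHA-256 hex digest cut to 32 chars.
    return hashlib.sha256(custom_id.encode("utf-8")).hexdigest()[:32]


# Hypothetical in-memory stand-in for the tracing backend, keyed by trace ID.
traces: Dict[str, List[str]] = {}


def handle_request(request_id: str, attempt: int) -> str:
    trace_id = trace_id_for(request_id)
    # setdefault creates the trace on first sight and reuses it on retries.
    traces.setdefault(trace_id, []).append(f"attempt-{attempt}")
    return trace_id


first = handle_request("request-abc-123", attempt=1)
retry = handle_request("request-abc-123", attempt=2)
other = handle_request("request-xyz-789", attempt=1)
```

Both attempts for request-abc-123 land in the same entry, so the store ends up with two traces rather than three.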
Updating a Trace
Node.js:
const trace = testOps.trace({
  name: 'qa-pipeline',
  input: { question: 'What is the capital of France?' },
});
// ... run your pipeline ...
trace.update({
  output: 'Paris is the capital of France.',
  metadata: { latencyMs: 320 },
});
Python:
trace = client.trace(name="qa-pipeline", input={"question": "What is the capital of France?"})
# ... run your pipeline ...
trace.update(output="Paris is the capital of France.")
Java:
trace.update(TraceBody.builder()
        .output("The Aurora Borealis is caused by solar wind particles...")
        .build());
Creating a Generation
A generation records a single LLM call. The examples below use OpenAI as the provider.
Node.js:
import OpenAI from 'openai';
const openai = new OpenAI();
const generation = trace.generation({
  name: 'openai-call',
  model: 'gpt-4o',
  modelParameters: { temperature: 0.3, maxTokens: 512 },
  input: [
    { role: 'system', content: 'You are a helpful assistant.' },
    { role: 'user', content: 'What is the capital of France?' },
  ],
});
const result = await openai.chat.completions.create({
  model: 'gpt-4o',
  // Pass the same parameters that were recorded in modelParameters above
  temperature: 0.3,
  max_tokens: 512,
  messages: [
    { role: 'system', content: 'You are a helpful assistant.' },
    { role: 'user', content: 'What is the capital of France?' },
  ],
});
generation.end({
  output: result.choices[0].message,
  usage: {
    input: result.usage?.prompt_tokens,
    output: result.usage?.completion_tokens,
  },
});
Generation Parameters
| Field | Type | Required | Description |
|---|---|---|---|
| name | string | No | Display name. |
| model | string | No | Model identifier (e.g. gpt-4o). |
| modelParameters | Record<string, any> | No | Parameters passed to the model. |
| input | any | No | Prompt or messages sent to the model. |
| output | any | No | Model response. Usually set via generation.end(). |
| usage | { input?: number; output?: number; total?: number } | No | Token counts. |
| metadata | Record<string, any> | No | Arbitrary metadata. |
| startTime | Date | No | Override start timestamp. |
| endTime | Date | No | Override end timestamp. |
Python:
import os
import openai
openai_client = openai.OpenAI(api_key=os.environ["OPENAI_API_KEY"])
generation = trace.start_generation(
    name="generate-answer",
    model="gpt-4o",
    input=[
        {"role": "system", "content": "Answer based on the provided context."},
        {"role": "user", "content": "What is the capital of France?"},
    ],
)
response = openai_client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": "Answer based on the provided context."},
        {"role": "user", "content": "What is the capital of France?"},
    ],
)
answer = response.choices[0].message.content
generation.update(
    output=answer,
    usage_details={"input": response.usage.prompt_tokens, "output": response.usage.completion_tokens},
)
generation.end()
Generation Parameters
| Field | Type | Required | Description |
|---|---|---|---|
| name | str | Yes | Display name. |
| model | str | No | Model identifier (e.g. gpt-4o). |
| input | any | No | Prompt or messages sent to the model. |
| output | any | No | Model response. Set via generation.update(). |
| prompt | any | No | Alias for input. |
| response | any | No | Alias for output. |
| expected_output | any | No | Expected output for evaluation. |
| context | any | No | Context to attach to the generation. |
| usage_details | dict | No | Token counts (set via generation.update()). |
| metadata | dict | No | Arbitrary metadata. |
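Provider responses do not always carry token counts, so it can help to build the usage_details dictionary defensively. The helper below is a hypothetical sketch, not part of the SDK: it assumes an OpenAI-style usage object with prompt_tokens and completion_tokens attributes and maps them to the input and output keys used in the example above, skipping anything that is missing.

```python
from typing import Any, Dict, Optional


def to_usage_details(usage: Optional[Any]) -> Dict[str, int]:
    """Map OpenAI-style token counts to the input/output keys used above."""
    if usage is None:
        # No usage data on the response: report nothing rather than guess.
        return {}
    details: Dict[str, int] = {}
    for key, attr in (("input", "prompt_tokens"), ("output", "completion_tokens")):
        count = getattr(usage, attr, None)
        if count is not None:
            details[key] = count
    return details
```

With this helper, the update call in the example would become generation.update(output=answer, usage_details=to_usage_details(response.usage)).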
Java:
import com.browserstack.aisdk.tracing.model.GenerationBody;
import com.browserstack.aisdk.tracing.model.Usage;
import com.openai.client.OpenAIClient;
import com.openai.client.okhttp.OpenAIOkHttpClient;
import com.openai.models.*;
import java.util.List;
import java.util.Map;
OpenAIClient openai = OpenAIOkHttpClient.fromEnv();
long genStart = System.currentTimeMillis();
ChatCompletion response = openai.chat().completions().create(
        ChatCompletionCreateParams.builder()
                .model("gpt-4o")
                .addUserMessage("What causes Northern Lights?")
                .build()
);
// content() returns an Optional<String> in the OpenAI Java SDK
String answer = response.choices().get(0).message().content().orElse("");
trace.generation(GenerationBody.builder()
        .name("gpt-4o-call")
        .model("gpt-4o")
        .input(List.of(Map.of("role", "user", "content", "What causes Northern Lights?")))
        .output(answer)
        // Illustrative token counts; in real code, read them from the response's usage data
        .usage(Usage.builder()
                .promptTokens(120)
                .completionTokens(85)
                .totalTokens(205)
                .build())
        .startTime(genStart)
        .endTime(System.currentTimeMillis())
        .modelParameters(Map.of("temperature", 0.7, "maxTokens", 512))
        .build());
Creating a Span
A span tracks a sub-step that is not an LLM call — for example a retrieval, a tool call, or a preprocessing step.
Node.js:
const trace = testOps.trace({ name: 'rag-pipeline' });
const retrievalSpan = trace.span({
  name: 'vector-search',
  input: { query: 'capital of France', topK: 5 },
});
// Stand-in for a real vector store client
const vectorStore = {
  search: async (query, { topK }) => [
    { id: 'doc1', text: 'Paris is the capital of France.', score: 0.92 },
    { id: 'doc2', text: 'France is a country in Europe.', score: 0.81 },
  ],
};
const docs = await vectorStore.search('capital of France', { topK: 5 });
retrievalSpan.end({
  output: docs,
  metadata: { source: 'pinecone' },
});
Python:
retrieval_span = trace.span(
    name="retrieve-documents",
    input={"query": "What is the capital of France?"},
)
documents = ["France is a country in Western Europe. Its capital is Paris."]
retrieval_span.update(output={"documents": documents})
retrieval_span.end()
Java:
import com.browserstack.aisdk.tracing.model.SpanBody;
long start = System.currentTimeMillis();
List<String> docs = vectorDb.search(query);
var retrieval = trace.span(SpanBody.builder()
        .name("vector-retrieval")
        .input(query)
        .output(docs)
        .startTime(start)
        .endTime(System.currentTimeMillis())
        .metadata(Map.of("collection", "knowledge-base", "topK", 5))
        .build());
Call span.end() to record the current timestamp if you didn't set endTime in the builder:
var span = trace.span(SpanBody.builder().name("processing").build());
// ... do work ...
span.end(); // records endTime = now
Span Parameters
Node.js:
| Field | Type | Required | Description |
|---|---|---|---|
| name | string | No | Display name. |
| input | any | No | Input data. |
| output | any | No | Output data. Usually set via span.end(). |
| metadata | Record<string, any> | No | Arbitrary metadata. |
| startTime | Date | No | Override start timestamp. |
| endTime | Date | No | Override end timestamp. |
Python:
| Field | Type | Required | Description |
|---|---|---|---|
| name | str | Yes | Display name. |
| input | any | No | Input data. |
| expected_output | any | No | Expected output for evaluation. |
| context | any | No | Context to attach to the span. |
| metadata | dict | No | Arbitrary metadata. |
Use span.update(output=...) to set output before calling span.end().
Java:
| Field | Type | Required | Description |
|---|---|---|---|
| name | String | No | Display name. |
| input | Object | No | Input data. |
| output | Object | No | Output data. |
| metadata | Map<String, Object> | No | Arbitrary metadata. |
| startTime | long | No | Override start timestamp (epoch ms). |
| endTime | long | No | Override end timestamp (epoch ms). |
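When you override timestamps yourself, the start and end times have to be captured around the work being measured, including when that work fails. The following sketch shows one way to collect them as epoch milliseconds. The timed_step helper is hypothetical and not part of the SDK; the resulting values would be passed to whichever span call your language uses.

```python
import time
from contextlib import contextmanager
from typing import Any, Dict, Iterator


@contextmanager
def timed_step(name: str) -> Iterator[Dict[str, Any]]:
    """Collect start/end timestamps (epoch ms) for one pipeline step."""
    record: Dict[str, Any] = {"name": name, "start_time": int(time.time() * 1000)}
    try:
        yield record
    finally:
        # Runs even if the step raises, so the end time is always recorded.
        record["end_time"] = int(time.time() * 1000)


with timed_step("vector-retrieval") as step:
    step["output"] = ["doc1", "doc2"]  # placeholder for the real work
```

Putting the end timestamp in a finally block means a failed step still gets a duration, which is usually what you want when inspecting slow or broken runs.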
Creating an Event
An event is a point-in-time observation within a trace or span.
Node.js:
const trace = testOps.trace({ name: 'agent-run' });
trace.event({
  name: 'tool-selected',
  input: { toolName: 'search_web' },
  metadata: { reasoning: 'User asked for current info' },
});
Python:
span = trace.start_span(name="agent-step")
span.create_event(
    name="cache-miss",
    input={"key": "query-123"},
)
span.end()
Java:
import com.browserstack.aisdk.tracing.model.EventBody;
trace.event(EventBody.builder()
        .name("cache-miss")
        .input(Map.of("key", query))
        .metadata(Map.of("reason", "ttl-expired"))
        .build());
Event Parameters
Node.js:
| Field | Type | Required | Description |
|---|---|---|---|
| name | string | No | Display name. |
| input | any | No | Input data. |
| output | any | No | Output data. |
| metadata | Record<string, any> | No | Arbitrary metadata. |
| level | string | No | Observation level (e.g. "DEBUG", "WARNING", "ERROR"). |
| startTime | Date | No | Override timestamp. |
Python:
| Field | Type | Required | Description |
|---|---|---|---|
| name | str | Yes | Display name. |
| input | any | No | Input data. |
| expected_output | any | No | Expected output for evaluation. |
| context | any | No | Context to attach to the event. |
| metadata | dict | No | Arbitrary metadata. |
Java:
| Field | Type | Required | Description |
|---|---|---|---|
| name | String | No | Display name. |
| input | Object | No | Input data. |
| metadata | Map<String, Object> | No | Arbitrary metadata. |
Adding Scores
Scores attach evaluation results to a trace or observation.
Node.js:
// Score on a trace
testOps.score({
  traceId: trace.id,
  name: 'relevance',
  value: 0.92,
  comment: 'Answer was highly relevant to the question.',
});
// Score on a specific generation
testOps.score({
  traceId: trace.id,
  observationId: generation.id,
  name: 'faithfulness',
  value: 1.0,
});
Python:
trace.score(
    name="correctness",
    value=1.0,
    comment="Answer is correct",
)
Java:
import com.browserstack.aisdk.tracing.model.ScoreBody;
// Score a generation
gen.score(ScoreBody.builder()
        .name("faithfulness")
        .value(0.92)
        .comment("Answer stays within the provided context.")
        .build());
// Score a trace
trace.score(ScoreBody.builder()
        .name("overall-quality")
        .value(0.85)
        .dataType("NUMERIC")
        .build());
Score Parameters
Node.js:
| Field | Type | Required | Description |
|---|---|---|---|
| name | string | Yes | Score name. |
| value | number \| string | Yes | Score value. Numeric for NUMERIC/BOOLEAN, string for CATEGORICAL. |
| traceId | string | No | Trace to attach the score to. |
| observationId | string | No | Observation to attach the score to. |
| comment | string | No | Comment explaining the score. |
| dataType | string | No | "NUMERIC", "BOOLEAN", or "CATEGORICAL". Auto-inferred if omitted. |
| metadata | Record<string, any> | No | Arbitrary metadata. |
Python:
| Field | Type | Required | Description |
|---|---|---|---|
| name | str | Yes | Score name. |
| value | float \| str | Yes | Score value. Numeric for NUMERIC/BOOLEAN, string for CATEGORICAL. |
| comment | str | No | Comment explaining the score. |
| data_type | str | No | "NUMERIC", "BOOLEAN", or "CATEGORICAL". Auto-inferred if omitted. |
| observation_id | str | No | Observation to attach the score to. |
| metadata | dict | No | Arbitrary metadata. |
Java:
| Field | Type | Required | Description |
|---|---|---|---|
| name | String | Yes | Score name. |
| value | double | Yes | Score value. |
| comment | String | No | Comment explaining the score. |
| dataType | String | No | "NUMERIC", "BOOLEAN", or "CATEGORICAL". |
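The tables say the data type is auto-inferred when omitted, but they do not spell out the inference rules. If you would rather not depend on them, you can choose the type yourself and pass it explicitly. The sketch below is one plausible mapping, written as an assumption rather than a description of the SDK's behaviour; infer_data_type is a hypothetical helper.

```python
from typing import Union


def infer_data_type(value: Union[bool, int, float, str]) -> str:
    """Pick a score data type from the Python type of the value (assumed rules)."""
    # bool is checked first because bool is a subclass of int in Python.
    if isinstance(value, bool):
        return "BOOLEAN"
    if isinstance(value, (int, float)):
        return "NUMERIC"
    if isinstance(value, str):
        return "CATEGORICAL"
    raise TypeError(f"unsupported score value: {value!r}")
```

Because the tables describe BOOLEAN values as numeric, a boolean result would still need converting to 1 or 0 before being sent as the score value.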
Nesting Observations
Spans and generations can be nested inside other spans.
Node.js:
const trace = testOps.trace({ name: 'agent' });
const agentSpan = trace.span({ name: 'agent-step-1' });
// Nest a generation inside a span
const generation = agentSpan.generation({
  name: 'tool-decision',
  model: 'gpt-4o',
  input: [{ role: 'user', content: 'Which tool should I use?' }],
});
generation.end({ output: 'search_web' });
agentSpan.end({});
Python: use start_span() and start_generation() to nest observations explicitly:
trace = client.trace(name="agent")
span = trace.start_span(
    name="agent-step-1",
    input={"query": "hello"},
)
gen = span.start_generation(
    name="llm-call",
    model="gpt-4o",
    input="Say hello in French",
)
gen.update(output="Bonjour!")
gen.end()
span.update(output="Bonjour!")
span.end()
client.flush()
Java: use the withWorkflow() helper to wrap a callable in an auto-created trace:
String result = tm.withWorkflow("answer-question", () -> {
    // All tracing here is nested under the auto-created trace
    return generateAnswer(question);
});
Frameworks
Use the SDK directly: create a trace, record generations and spans around your LLM calls, then update the trace with the final output. The examples below show the complete flow in each language.
Node.js:
import { AISDK } from '@browserstack/ai-sdk';
import OpenAI from 'openai';
const testOps = new AISDK({
  publicKey: process.env.AISDK_PUBLIC_KEY,
  secretKey: process.env.AISDK_SECRET_KEY,
});
const openai = new OpenAI();
async function runRagPipeline(question) {
  const trace = testOps.trace({
    name: 'rag-pipeline',
    input: { question },
    tags: ['rag', 'production'],
  });
  // Step 1: Retrieve context
  const retrievalSpan = trace.span({
    name: 'retrieval',
    input: { query: question },
  });
  const context = await retrieveDocuments(question);
  retrievalSpan.end({ output: context });
  // Step 2: Generate answer
  const generation = trace.generation({
    name: 'answer-generation',
    model: 'gpt-4o',
    input: [
      { role: 'system', content: 'Answer using the provided context.' },
      { role: 'user', content: `Context:\n${context}\n\nQuestion: ${question}` },
    ],
  });
  const response = await openai.chat.completions.create({
    model: 'gpt-4o',
    messages: [
      { role: 'system', content: 'Answer using the provided context.' },
      { role: 'user', content: `Context:\n${context}\n\nQuestion: ${question}` },
    ],
  });
  const answer = response.choices[0].message.content ?? '';
  generation.end({
    output: answer,
    usage: {
      input: response.usage?.prompt_tokens,
      output: response.usage?.completion_tokens,
    },
  });
  trace.update({ output: answer });
  await testOps.shutdown();
  return answer;
}
async function retrieveDocuments(question) {
  return 'Paris is the capital of France. France is a country in Europe.';
}
await runRagPipeline('What is the capital of France?');
Python:
import os
from browserstack_ai_sdk import AISDK
import openai
client = AISDK(
    public_key=os.environ["AISDK_PUBLIC_KEY"],
    secret_key=os.environ["AISDK_SECRET_KEY"],
)
openai_client = openai.OpenAI()
def run_rag_pipeline(question: str) -> str:
    trace = client.trace(
        name="rag-pipeline",
        user_id="user-123",
        session_id="session-abc",
        tags=["rag", "production"],
    )
    # Step 1: Retrieve context
    retrieval_span = trace.span(
        name="retrieve-documents",
        input={"query": question},
    )
    documents = ["France is a country in Western Europe. Its capital is Paris."]
    retrieval_span.update(output={"documents": documents})
    retrieval_span.end()
    # Step 2: Generate answer, passing the retrieved context to the model
    context = "\n".join(documents)
    messages = [
        {"role": "system", "content": "Answer based on the provided context."},
        {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {question}"},
    ]
    generation = trace.start_generation(
        name="generate-answer",
        model="gpt-4o",
        input=messages,
    )
    response = openai_client.chat.completions.create(
        model="gpt-4o",
        messages=messages,
    )
    answer = response.choices[0].message.content
    # Record the token counts reported by the provider
    generation.update(
        output=answer,
        usage_details={"input": response.usage.prompt_tokens, "output": response.usage.completion_tokens},
    )
    generation.end()
    trace.update(output=answer)
    trace.score(name="answer-quality", value=0.9)
    client.flush()
    return answer
result = run_rag_pipeline("What is the capital of France?")
print(result)
Java:
import com.browserstack.aisdk.TestOps;
import com.browserstack.aisdk.tracing.TraceManager;
import com.browserstack.aisdk.tracing.model.*;
import com.openai.client.OpenAIClient;
import com.openai.client.okhttp.OpenAIOkHttpClient;
import com.openai.models.*;
import java.util.List;
import java.util.Map;
public class TracingExample {
    public static void main(String[] args) throws Exception {
        TestOps sdk = TestOps.fromEnv();
        TraceManager tm = sdk.traceManager();
        String question = "What causes Northern Lights?";
        var trace = tm.trace(TraceBody.builder()
                .name("rag-pipeline")
                .input(question)
                .userId("user-42")
                .environment("production")
                .build());
        // Retrieve documents
        long t0 = System.currentTimeMillis();
        List<String> docs = List.of("Aurora borealis occurs when...", "Solar particles...");
        trace.span(SpanBody.builder()
                .name("retrieval")
                .input(question)
                .output(docs)
                .startTime(t0)
                .endTime(System.currentTimeMillis())
                .build());
        // Call the LLM
        OpenAIClient openai = OpenAIOkHttpClient.fromEnv();
        long genStart = System.currentTimeMillis();
        ChatCompletion response = openai.chat().completions().create(
                ChatCompletionCreateParams.builder()
                        .model("gpt-4o")
                        .addUserMessage(question)
                        .build()
        );
        // content() returns an Optional<String> in the OpenAI Java SDK
        String answer = response.choices().get(0).message().content().orElse("");
        var gen = trace.generation(GenerationBody.builder()
                .name("gpt-4o-call")
                .model("gpt-4o")
                .input(List.of(Map.of("role", "user", "content", question)))
                .output(answer)
                // Illustrative token counts; in real code, read them from the response's usage data
                .usage(Usage.builder().promptTokens(200).completionTokens(120).totalTokens(320).build())
                .startTime(genStart)
                .endTime(System.currentTimeMillis())
                .build());
        gen.score(ScoreBody.builder().name("faithfulness").value(0.95).build());
        trace.update(TraceBody.builder().output(answer).build());
        tm.flush();
        sdk.shutdown();
    }
}
Python: @observe Decorator
The Python SDK also provides an @observe decorator for automatic span wrapping:
import os
from browserstack_ai_sdk import observe, AISDK
import openai
client = AISDK(
    public_key=os.environ["AISDK_PUBLIC_KEY"],
    secret_key=os.environ["AISDK_SECRET_KEY"],
)
openai_client = openai.OpenAI()
@observe(name="summarize-article")
def summarize(article: str) -> str:
    response = openai_client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": "Summarize the following article in 3 bullet points."},
            {"role": "user", "content": article},
        ],
    )
    return response.choices[0].message.content
result = summarize("Article text goes here...")
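For intuition about what a tracing decorator does, here is a minimal stand-in built only on the standard library. It is not the SDK's @observe and makes no claim about how the SDK implements it: observe_sketch and the recorded_spans list are hypothetical, and the sketch simply records the name, arguments, result, and timing of each call.

```python
import functools
import time
from typing import Any, Callable, Dict, List, Optional

# Hypothetical in-memory sink; a real SDK would send these records to its backend.
recorded_spans: List[Dict[str, Any]] = []


def observe_sketch(name: Optional[str] = None) -> Callable:
    """Illustrative stand-in for a tracing decorator; not the SDK's @observe."""
    def decorator(func: Callable) -> Callable:
        @functools.wraps(func)
        def wrapper(*args: Any, **kwargs: Any) -> Any:
            span: Dict[str, Any] = {
                "name": name or func.__name__,
                "input": {"args": args, "kwargs": kwargs},
                "start_time": time.time(),
            }
            try:
                span["output"] = func(*args, **kwargs)
                return span["output"]
            except Exception as exc:
                # Keep the failure on the span, then let the caller see it.
                span["error"] = repr(exc)
                raise
            finally:
                span["end_time"] = time.time()
                recorded_spans.append(span)
        return wrapper
    return decorator


@observe_sketch(name="word-count")
def word_count(text: str) -> int:
    return len(text.split())


total = word_count("manual tracing with decorators")
```

The decorated function keeps its original name and return value, so callers are unaffected while each call leaves a span-like record behind.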