Automation Rules
Automatically route traces to annotation queues or datasets based on configurable conditions.
Automation rules let you act on incoming traces without manual intervention. Rules evaluate each trace against a set of conditions and, when matched, take a configured action — routing the trace to a human reviewer or adding it to a dataset.
All rule types are managed under Settings → Automation Rules in the dashboard.
Rule types
| Rule Type | Action |
|---|---|
| Human Review Routing | Add matching traces to an annotation queue for human labeling |
| Dataset Automation | Add matching traces to a dataset automatically |
For continuous LLM scoring on live traces (without human review), see Online Evaluations.
Human Review Routing Rules
Human Review rules automatically route traces that match your conditions into one or more annotation queues. Reviewers then label or score items in those queues.
Setting up a routing rule
Navigate to Logs → Create Automated Rules → Add to Human Review.
Create a new rule
Click + New and provide a name and optional description for the rule.
Add targeting filters
Define the conditions a trace must meet to be routed. Filters can check any trace attribute:
| Field | Operators | Example |
|---|---|---|
| Trace name | equals, contains, regex | name equals "support-chat" |
| Metadata keys | equals, contains | metadata.region equals "us-east" |
| Tags | contains | tags contains "flagged" |
| User ID | equals, contains | userId equals "user-42" |
| Score | less than, greater than | score(faithfulness) < 0.5 |
Combine multiple filters with AND / OR to express complex conditions.
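As a rough illustration of how AND and OR combinations behave, the conditions from the table can be written as plain Python predicates. This is only a sketch: the trace dictionary shape and the helper names are hypothetical, and treating a missing score as a non-match is an assumption rather than documented behavior.

```python
# Hypothetical trace shape; the keys mirror the fields in the filter table
# and are not a documented schema.
trace = {
    "name": "support-chat",
    "metadata": {"region": "us-east"},
    "tags": ["flagged", "beta"],
    "userId": "user-42",
    "scores": {"faithfulness": 0.42},
}


def is_support_chat(t: dict) -> bool:
    # name equals "support-chat"
    return t.get("name") == "support-chat"


def has_low_faithfulness(t: dict) -> bool:
    # score(faithfulness) < 0.5; assumption: a trace without the score does not match
    score = t.get("scores", {}).get("faithfulness")
    return score is not None and score < 0.5


def is_flagged(t: dict) -> bool:
    # tags contains "flagged"
    return "flagged" in t.get("tags", [])


# AND: every condition must hold.
matches_all = is_support_chat(trace) and has_low_faithfulness(trace)

# OR: at least one condition must hold.
matches_any = has_low_faithfulness(trace) or is_flagged(trace)

print(matches_all, matches_any)
```

An AND rule narrows the set of routed traces, while an OR rule widens it, so OR-heavy rules usually pair well with a lower sampling rate.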
Select annotation queues
Choose one or more annotation queues as the destination for matching traces. A single rule can fan out to multiple queues.
Set sampling rate
The sampling rate (0–100%) controls what fraction of matching traces is actually routed. Set it below 100% to route a representative sample without overwhelming reviewers.
Save and activate
Save the rule. It starts Active by default. Pause or re-activate from the rules table at any time.
Use cases
- Flag low-quality responses — route traces where an evaluator score falls below a threshold to a "needs review" queue.
- Review new model outputs — route all traces from a newly deployed model version for comparison.
- Compliance sampling — route a percentage of traces in regulated topic areas for audit review.
- Disagreement resolution — route traces where automated evaluators disagree for human arbitration.
Rules table
| Column | Description |
|---|---|
| Rule Name | Name and optional description |
| Status | Active (green) or Paused (amber) |
| Review Queues | Destination annotation queue(s) |
| Total Items Added | Cumulative count of traces routed by this rule |
| Created | Creation timestamp |
Deleting a rule does not remove items already added to queues. Previously routed traces remain in their queues.
Dataset Automation Rules
Dataset Automation rules add traces that match your conditions to a dataset automatically. This lets you build and grow datasets from production traffic without manual curation.
Setting up a dataset rule
Navigate to Logs → Create Automated Rules → Add to Dataset.
Create a new rule
Click + New and give the rule a name and optional description.
Add targeting filters
Define which traces to capture. The filter configuration is identical to Human Review Routing Rules — any trace attribute can be used as a condition.
Select a target dataset
Choose an existing dataset to add matched traces to. The trace (or a specific observation within it) is saved as a new dataset item.
Set sampling rate
Control what fraction (0–100%) of matching traces are added. For high-volume pipelines, a 5–10% sample often captures sufficient diversity without creating an unmanageably large dataset.
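To pick a rate, it can help to estimate how many items a rule is likely to add. The arithmetic below is a back-of-the-envelope sketch; the function names and the traffic figures are illustrative assumptions, not values from the product.

```python
def expected_items(matching_traces_per_day: int, rate_percent: float, days: int) -> float:
    """Expected dataset items: matching traces x sampling rate x days."""
    return matching_traces_per_day * rate_percent * days / 100


def rate_for_target(target_items: int, matching_traces_per_day: int, days: int) -> float:
    """Sampling rate (in percent) expected to yield target_items, capped at 100."""
    return min(100.0, 100 * target_items / (matching_traces_per_day * days))


# Illustrative numbers only: 200,000 matching traces per day over 30 days.
print(expected_items(200_000, 5, 30))
print(round(rate_for_target(50_000, 200_000, 30), 2))
```

These are expected values; the actual count for a given period depends on which traces happen to be sampled.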
Save and activate
Save the rule. It starts Active by default.
Use cases
- Golden dataset construction — capture high-scoring traces as ground-truth examples for future experiments.
- Failure case collection — collect traces below a quality threshold to build a regression dataset.
- Diverse coverage — sample across user cohorts or topic areas to ensure dataset diversity.
- Continuous dataset refresh — keep datasets up to date with recent production patterns rather than relying on static snapshots.
Rules table
| Column | Description |
|---|---|
| Rule Name | Name and optional description |
| Status | Active (green) or Paused (amber) |
| Target Dataset | Dataset that receives matched traces |
| Total Items Added | Cumulative dataset items created by this rule |
| Created | Creation timestamp |
Shared concepts
Targeting filters
All rule types share the same targeting filter structure. Filters are evaluated against each new trace at ingestion time. Each filter specifies:
- Field: the trace attribute to check (name, metadata key, tag, user ID, score, etc.)
- Operator: how to compare (equals, contains, regex, less than, greater than)
- Value: the comparison value
- Logical operator: AND or OR, to chain with the next filter
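The four parts of a filter map naturally onto a small data structure. The sketch below is one possible model, not the product's implementation: the `Filter` class, the dotted field paths, the strict left-to-right evaluation of the chain, and the default of AND when no logical operator is set are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass
from typing import Any, Callable, Optional

# Operator names follow the list above; the implementations are assumptions.
OPERATORS: dict[str, Callable[[Any, Any], bool]] = {
    "equals": lambda actual, expected: actual == expected,
    "contains": lambda actual, expected: expected in actual,
    "regex": lambda actual, expected: re.search(expected, actual) is not None,
    "less than": lambda actual, expected: actual < expected,
    "greater than": lambda actual, expected: actual > expected,
}


@dataclass
class Filter:
    field: str                     # dotted path such as "metadata.region" (hypothetical convention)
    operator: str                  # one of the OPERATORS keys
    value: Any                     # comparison value
    logical: Optional[str] = None  # "AND" or "OR", joining this filter to the next one


def lookup(trace: dict, path: str) -> Any:
    """Walk a dotted path through nested dicts; return None if any key is missing."""
    current: Any = trace
    for key in path.split("."):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


def check(trace: dict, f: Filter) -> bool:
    """Evaluate one filter; a missing attribute never matches (assumption)."""
    actual = lookup(trace, f.field)
    if actual is None:
        return False
    return OPERATORS[f.operator](actual, f.value)


def evaluate(trace: dict, filters: list[Filter]) -> bool:
    """Fold the chain left to right, using each filter's logical operator."""
    if not filters:
        return True
    result = check(trace, filters[0])
    for previous, current in zip(filters, filters[1:]):
        outcome = check(trace, current)
        if previous.logical == "OR":
            result = result or outcome
        else:  # "AND", or unset (treated as AND in this sketch)
            result = result and outcome
    return result


rule_filters = [
    Filter("name", "equals", "support-chat", "AND"),
    Filter("scores.faithfulness", "less than", 0.5),
]
print(evaluate({"name": "support-chat", "scores": {"faithfulness": 0.3}}, rule_filters))
```

Left-to-right folding gives no precedence to AND over OR. If the real evaluator groups conditions differently, a mixed chain could match a different set of traces than this sketch suggests.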
Sampling
Sampling is deterministic per trace ID: for a given rule, the same trace always receives the same include-or-exclude decision. As a result, a trace that is re-evaluated is not added to a queue a second time.
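One common way to get this kind of per-ID determinism is to hash the trace ID into a fixed bucket and compare the bucket with the sampling rate. The sketch below shows that technique; it is an assumption about how such sampling could work, not a description of the product's internals, and `is_sampled` and its parameters are hypothetical names.

```python
import hashlib

BUCKETS = 10_000  # resolution of 0.01%


def is_sampled(trace_id: str, rule_id: str, rate_percent: float) -> bool:
    """Return the same decision every time for a given (rule, trace) pair."""
    # Including the rule ID lets different rules sample different subsets.
    digest = hashlib.sha256(f"{rule_id}:{trace_id}".encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % BUCKETS  # 0 .. 9999
    return bucket < rate_percent * BUCKETS / 100


# Re-evaluating the same trace yields the same answer.
first = is_sampled("trace-123", "rule-a", 25)
second = is_sampled("trace-123", "rule-a", 25)
print(first == second)
```

With this scheme a trace that is sampled at a low rate stays sampled when the rate is raised, because its bucket does not change.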
Accessing rules and rule status
Navigate to Settings → Project Settings → Automated Rules → Human Review or Dataset.
Rules can be Active or Paused. Pausing a rule stops it from processing new traces but does not affect items already added. Re-activating a rule resumes processing from that point forward — traces ingested while the rule was paused are not back-filled.
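The pause and re-activate semantics can be modeled with a tiny simulation. This is an illustrative sketch only; the `Rule` class and its methods are hypothetical and do not correspond to a real API.

```python
from dataclasses import dataclass, field


@dataclass
class Rule:
    active: bool = True                              # rules start Active by default
    routed: list[str] = field(default_factory=list)  # items already added

    def pause(self) -> None:
        self.active = False  # items already in `routed` are left untouched

    def activate(self) -> None:
        self.active = True   # no back-fill: skipped traces are not revisited

    def on_ingest(self, trace_id: str) -> None:
        # Traces are only considered at ingestion time, while the rule is active.
        if self.active:
            self.routed.append(trace_id)


rule = Rule()
rule.on_ingest("t1")   # routed
rule.pause()
rule.on_ingest("t2")   # ingested while paused: skipped for good
rule.activate()
rule.on_ingest("t3")   # routed
print(rule.routed)
```

In this model `t2` never reaches the destination, which matches the rule that traces ingested during a pause are not back-filled.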