Annotate Directly from Experiment Compare View

Add human annotations while reviewing experiment results side-by-side.
You can now annotate traces directly from the experiment compare view: review outputs, assign scores, and leave comments with full experiment context, streamlining the workflow of running experiments and adding human feedback.
What’s New
Human annotation is now available directly in the experiment compare view. Select any cell to open the annotation side panel, where you can assign scores and leave comments while maintaining full experiment context. Use the resulting annotation scores to compare experiment results across different prompt versions, model configurations, or iterations of your application.
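
For example, once annotation scores are available outside the UI, they can be aggregated per experiment to support this kind of comparison. The sketch below is illustrative only; the record shape and field names (`experiment`, `score_name`, `value`) are assumptions, not the product's export format.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical export of annotation scores; field names are illustrative only.
annotation_scores = [
    {"experiment": "prompt-v1", "score_name": "accuracy", "value": 0.6},
    {"experiment": "prompt-v1", "score_name": "accuracy", "value": 0.8},
    {"experiment": "prompt-v2", "score_name": "accuracy", "value": 0.9},
    {"experiment": "prompt-v2", "score_name": "accuracy", "value": 1.0},
]

def average_by_experiment(scores, score_name):
    """Group annotation scores by experiment and return the mean per experiment."""
    grouped = defaultdict(list)
    for record in scores:
        if record["score_name"] == score_name:
            grouped[record["experiment"]].append(record["value"])
    return {experiment: mean(values) for experiment, values in grouped.items()}

print(average_by_experiment(annotation_scores, "accuracy"))
# {'prompt-v1': 0.7, 'prompt-v2': 0.95}
```
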
How It Works

- Run an experiment: Execute an experiment via UI or SDK to test prompt versions or models
- Open compare view: Navigate to the experiment comparison to review results side-by-side
- Select a cell: Click any experiment item to open the annotation panel
- Add scores: Assign values for the score configs you’ve set up
- Leave comments: Add context for your team about specific outputs
- Move to next item: Click “Annotate” on the next cell you’d like to review
The UI updates optimistically as you add scores, providing immediate feedback while data persists in the background. Summary metrics in the compare view reflect your annotations as you work.
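
To make the first step concrete, here is a minimal, self-contained sketch of what an experiment run amounts to: a task applied to each dataset item, producing the rows that the compare view lays out side-by-side. The dataset, task, and result shape are illustrative assumptions, not the SDK's actual interface.

```python
# Illustrative dataset items; in practice these come from a dataset you manage.
dataset = [
    {"id": "item-1", "input": "What is the capital of France?", "expected": "Paris"},
    {"id": "item-2", "input": "What is 2 + 2?", "expected": "4"},
]

def task(item: dict) -> str:
    """Placeholder for the model call under test (e.g. a specific prompt version)."""
    return f"model output for: {item['input']}"

# Each row pairs a dataset item with the output produced by the prompt/model
# under test; these rows are what the compare view places side by side for annotation.
experiment_run = [
    {
        "item_id": item["id"],
        "input": item["input"],
        "output": task(item),
        "expected": item["expected"],
    }
    for item in dataset
]

for row in experiment_run:
    print(row["item_id"], "->", row["output"])
```
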
Score Configurations
Before annotating, create score configs to standardize scoring criteria across your team. Score configs support:
- Numerical scores: Define min/max ranges for quantitative evaluation
- Categorical scores: Set custom categories for classification tasks
- Binary scores: Use boolean values for pass/fail judgments
Standardized score configs ensure consistency and make results comparable across experiments and team members. You can update score configs at any time.
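
To make the three types concrete, the sketch below shows one way numerical, categorical, and binary score configs could be represented and used to validate annotation values. The class, field names, and validation rules are illustrative assumptions, not the product's schema.

```python
from dataclasses import dataclass, field
from typing import Optional

# Illustrative representation of the three score config types; the field
# names and validation rules are assumptions, not the product's schema.
@dataclass
class ScoreConfig:
    name: str
    data_type: str                      # "NUMERIC", "CATEGORICAL", or "BOOLEAN"
    min_value: Optional[float] = None   # NUMERIC only
    max_value: Optional[float] = None   # NUMERIC only
    categories: list[str] = field(default_factory=list)  # CATEGORICAL only

    def validate(self, value) -> bool:
        """Check that an annotation value is allowed under this config."""
        if self.data_type == "NUMERIC":
            return self.min_value <= value <= self.max_value
        if self.data_type == "CATEGORICAL":
            return value in self.categories
        if self.data_type == "BOOLEAN":
            return isinstance(value, bool)
        return False

accuracy = ScoreConfig("accuracy", "NUMERIC", min_value=0, max_value=1)
tone = ScoreConfig("tone", "CATEGORICAL", categories=["formal", "neutral", "casual"])
passes = ScoreConfig("passes_review", "BOOLEAN")

print(accuracy.validate(0.8))   # True
print(tone.validate("casual"))  # True
print(passes.validate("yes"))   # False: boolean configs expect True/False
```
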
Getting Started
- Set up score configs for your evaluation criteria
- Run an experiment via the UI or SDK
- Open the compare view and start annotating