Human Annotation for LLM apps
Collaborate with your team and add scores via the Langfuse UI. You can add scores to both traces and observations within a trace.
Why label data manually?
- Collaboration: Enable team collaboration by inviting other internal members to annotate a subset of traces and observations. This manual evaluation can enhance the overall accuracy and reliability of your results by incorporating diverse perspectives and expertise. See Annotation queues to manage and prioritize the annotation tasks effectively.
- Annotation data consistency: Create score configurations for annotation workflows to ensure that all team members use standardized scoring criteria. You can configure categorical, numerical, or binary score types to capture different aspects of your data.
- Evaluation of new product features: Human annotation is useful for new use cases where no other scores have been collected yet.
- Benchmarking of other scores: Establish a human baseline score that can be used as a benchmark to compare and evaluate other scores. This can provide a clear standard of reference and enhance the objectivity of your performance evaluations.
Langfuse supports annotation of single traces and annotation queues.

1. Annotation of single traces
From any trace, you can annotate scores on different dimensions. See the step-by-step guide below for details.
Availability: Hobby (Full), Pro (Full), Team (Full), Self Hosted (Full)
Create score configurations
To use annotation, you need to create a score configuration. You can create multiple score configurations for different types of scores. Score configurations are immutable. However, you can archive configs if you no longer want to use them in annotation. Archived configs can be restored at any time.
To create a score configuration:
1. Navigate to Settings, locate the Score Configs table, and click Add new score config.
2. Specify the name of the score configuration and the type of score you want to create. You can choose between Categorical, Numeric, and Boolean score types.
3. Optionally, add a description to provide additional context for your team members.
Your configs are now available for annotation of traces and observations.
Data Labelling on LLM traces or observations
To annotate a trace or observation:
1. Navigate to the trace detail view.
2. Click on the Annotate button.
3. Select the scores you want to add.
4. Fill in the score values. The scores are saved automatically. Annotation scores can be edited or deleted at any time.
5. Optionally, add comments to individual scores.
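Standardized score configurations are what keep the values entered in this step consistent across annotators. The sketch below illustrates the kind of validation this implies for each of the three score types; it is purely illustrative (the field names are assumptions, and Langfuse performs such checks in the UI, not via this code):

```python
# Illustrative sketch: validate an annotator's input against a score config
# before saving it. Field names ("data_type", "min", "max", "categories")
# are assumptions for illustration, not the Langfuse API schema.

def validate_score(config: dict, value) -> bool:
    """Return True if `value` is a valid score for the given config."""
    if config["data_type"] == "CATEGORICAL":
        return value in config["categories"]
    if config["data_type"] == "NUMERIC":
        return config["min"] <= value <= config["max"]
    if config["data_type"] == "BOOLEAN":
        return value in (0, 1)  # assumption: booleans stored as 0/1
    return False

relevance = {"data_type": "NUMERIC", "min": 0, "max": 1}
quality = {"data_type": "CATEGORICAL", "categories": ["poor", "acceptable", "good"]}

assert validate_score(relevance, 0.7)       # in range
assert not validate_score(relevance, 3)     # out of range
assert validate_score(quality, "good")      # known category
```

Because every annotator scores against the same config, out-of-range or misspelled values are rejected at entry time rather than cleaned up later.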
View scores on trace or observation
Upon completing annotation, click on the Scores tab to view a table of all the scores that have been added to the trace or observation.
2. Annotation Queues
Availability: Hobby (Public Beta), Pro (Public Beta), Team (Public Beta), Self Hosted (Enterprise Edition)
Annotation queues allow you to manage and prioritize your annotation tasks in a structured way. This feature is particularly useful for large-scale projects that benefit from human-in-the-loop evaluation at scale. Queues streamline this process by letting you specify which traces or observations you’d like to annotate, and on which dimensions.
Create annotation queues
Set up annotation queues to specify which traces/observations you’d like to annotate on which dimensions. Queues are fully mutable and editable even after annotation tasks have been created. You may also add/remove annotation tasks to/from queues at any time.
Populate annotation queues
Once you have created annotation queues, you can assign traces or observations to them. The easiest way to do this at scale is to navigate to the trace table view, optionally adjust the filters to narrow down the traces/observations you’d like to annotate, and then use the Actions > Add to queue button in the top right corner. You can also add single traces and observations to queues via the Annotate dropdown on the detail view.
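The bulk flow above amounts to: filter the trace table, then add every match to a queue in one action. A minimal local sketch of that logic, using made-up trace fields and a plain deque as a stand-in for a Langfuse queue:

```python
# Sketch of the bulk "Actions > Add to queue" flow: filter traces, then
# enqueue all matches. Trace fields and the queue are illustrative stand-ins.
from collections import deque

traces = [
    {"id": "t1", "user_feedback": "negative", "model": "gpt-4o"},
    {"id": "t2", "user_feedback": "positive", "model": "gpt-4o"},
    {"id": "t3", "user_feedback": "negative", "model": "gpt-4o-mini"},
]

annotation_queue = deque()

# Narrow down with a filter (here: only traces with negative user feedback)...
matches = [t for t in traces if t["user_feedback"] == "negative"]

# ...then add all matches to the queue in one action.
annotation_queue.extend(t["id"] for t in matches)

print(list(annotation_queue))  # → ['t1', 't3']
```

Filtering first keeps the queue focused, e.g. on traces where users reported problems, rather than annotating a random slice of production traffic.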
Process annotation tasks
Navigate to the Annotate tab to view all the annotation queues that have been created. Select the queue you’d like to process. You will see all annotation tasks sequentially in the queue. Our queues are designed to be first-in-first-out (FIFO) and are fully concurrency safe. After adding scores on the defined dimensions, click Complete + next to move to the next annotation task. If you’d like to view an overview of all annotation tasks of a given queue, you can click on the queue name in the queue table. This overview shows the status of each task.
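The FIFO, concurrency-safe behavior described above can be sketched with Python's thread-safe standard-library queue, which guarantees that two consumers never receive the same item. This is only an illustration of the processing model, not Langfuse's implementation:

```python
# Sketch of FIFO annotation-task processing. `queue.Queue` is FIFO and safe
# for concurrent consumers, mirroring the guarantee that two annotators
# never pull the same task. Task ids are hypothetical.
import queue

tasks = queue.Queue()
for task_id in ["task-1", "task-2", "task-3"]:
    tasks.put(task_id)

completed = []
while not tasks.empty():
    task = tasks.get()      # "Complete + next": fetch the next task...
    completed.append(task)  # ...score it on the defined dimensions...
    tasks.task_done()       # ...and mark it complete.

print(completed)  # → ['task-1', 'task-2', 'task-3'] (insertion order)
```

The FIFO property means tasks are processed in the order they were added to the queue, so older traces are annotated before newer ones.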