
Human Annotation

Human Annotation is a manual evaluation method for collaboratively annotating traces and observations with scores.

In Langfuse, you can annotate individual traces or observations directly, or use Annotation Queues to work through larger batches of traces to score.


Why use Human Annotation?

  • Collaboration: Enable team collaboration by inviting internal team members to annotate a subset of traces and observations. Incorporating diverse perspectives and expertise can enhance the overall accuracy and reliability of your results.
  • Annotation data consistency: Create score configurations for annotation workflows so that all team members use standardized scoring criteria. Configure categorical, numerical, or binary score types to capture different aspects of your data.
  • Evaluation of new product features: Human annotation is useful for new use cases where no other scores have been collected yet.
  • Benchmarking of other scores: Establish a human baseline score that can be used as a benchmark to compare and evaluate other scores. This provides a clear standard of reference and enhances the objectivity of your performance evaluations.

Annotation of Single Traces

Manual Annotation of single traces and observations is available in the trace detail view.

Prerequisite: Create a ScoreConfig

To use Human Annotation, you need to have at least one score configuration (ScoreConfig) set up. See how to create and manage ScoreConfigs for details.
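If you prefer to set up this prerequisite programmatically, the minimal sketch below creates a categorical ScoreConfig through the public REST API. It assumes the `POST /api/public/score-configs` endpoint, Basic auth with your project's public/secret keys, and the shown field names and data types; verify the exact contract against the Langfuse API reference.

```python
import os
import requests

# Minimal sketch: create a categorical ScoreConfig via the public REST API.
# Endpoint path, field names, and category format are assumptions — verify
# against the Langfuse API reference before use.
host = os.environ.get("LANGFUSE_HOST", "https://cloud.langfuse.com")
auth = (os.environ["LANGFUSE_PUBLIC_KEY"], os.environ["LANGFUSE_SECRET_KEY"])  # Basic auth

resp = requests.post(
    f"{host}/api/public/score-configs",
    auth=auth,
    json={
        "name": "helpfulness",         # shown to annotators in the Annotate form
        "dataType": "CATEGORICAL",     # assumed values: NUMERIC, CATEGORICAL, BOOLEAN
        "categories": [
            {"label": "not helpful", "value": 0},
            {"label": "partially helpful", "value": 1},
            {"label": "helpful", "value": 2},
        ],
        "description": "Human judgment of how helpful the response is",
    },
    timeout=10,
)
resp.raise_for_status()
print(resp.json())  # contains the id of the new ScoreConfig
```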

Trigger Annotation on a Trace or Observation

On a trace or observation detail view, click Annotate to open the annotation form.


Select ScoreConfigs to use


Set Score values


See newly added Scores

To see your newly added scores, click on the Scores tab on the trace or observation detail view.

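Annotated scores can also be retrieved programmatically, for example to double-check that your annotations landed on the right trace. A minimal sketch, assuming the `GET /api/public/scores` list endpoint and a `traceId`/`source` filter (verify parameter names and response shape in the API reference):

```python
import os
import requests

# Minimal sketch: list scores attached to one trace via the public REST API.
# Endpoint path, query parameters, and response shape are assumptions — check
# the Langfuse API reference for your version.
host = os.environ.get("LANGFUSE_HOST", "https://cloud.langfuse.com")
auth = (os.environ["LANGFUSE_PUBLIC_KEY"], os.environ["LANGFUSE_SECRET_KEY"])

resp = requests.get(
    f"{host}/api/public/scores",
    auth=auth,
    params={"traceId": "<your-trace-id>", "source": "ANNOTATION"},  # assumed filter names
    timeout=10,
)
resp.raise_for_status()
for score in resp.json().get("data", []):
    print(score.get("name"), score.get("value"), score.get("comment"))
```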

Annotation Queues

Annotation queues allow you to manage and prioritize your annotation tasks in a structured way. This is particularly useful for large-scale projects that benefit from human-in-the-loop evaluation. Queues streamline the process by letting you specify which traces or observations you'd like to annotate on which dimensions.

Create Annotation Queues

Prerequisite: Create a ScoreConfig

To create an annotation queue, you need at least one score configuration (ScoreConfig) set up. See how to create and manage ScoreConfigs for details.

Go to Annotation Queues View

  • Navigate to Your Project > Human Annotation to see all your annotation queues.
  • Click on New queue to create a new queue.


Fill out Create Queue Form

  • Select the ScoreConfigs you want to use for this queue.
  • Set the Queue name and Description (optional).


  • Click on Create queue to create the queue.

Run Annotation Queues

Populate Annotation Queues

Once you have created annotation queues, you can assign traces or observations to them.

To add multiple traces or observations to a queue:

  1. Navigate to the trace table view and optionally adjust the filters.
  2. Select traces or observations via the checkboxes.
  3. Click on the “Actions” dropdown menu.
  4. Click on Add to queue to add the selected traces or observations to a queue.
  5. Select the queue you want to add the traces or observations to.


To view and work through a queue:

  1. Navigate to Your Project > Human Annotation.
  2. Option 1: Click on the queue name to view the associated annotation tasks.
  3. Option 2: Click on “Process queue” to start processing the queue.


Process Annotation Tasks

You will see an annotation task for each item in the queue.

  1. On the Annotate card, add scores on the defined dimensions.
  2. Click on Complete + next to move to the next annotation task or to finish the queue.

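Scores do not have to come from the annotation UI alone; you can also ingest them via the API, for example to backfill labels collected in an external tool and compare them against your human baseline. A minimal sketch, assuming the `POST /api/public/scores` endpoint and the shown payload fields (verify against the API reference):

```python
import os
import requests

# Minimal sketch: attach a score to a trace programmatically via the public REST API.
# Payload fields and the score name are assumptions for illustration — verify
# against the Langfuse API reference.
host = os.environ.get("LANGFUSE_HOST", "https://cloud.langfuse.com")
auth = (os.environ["LANGFUSE_PUBLIC_KEY"], os.environ["LANGFUSE_SECRET_KEY"])

resp = requests.post(
    f"{host}/api/public/scores",
    auth=auth,
    json={
        "traceId": "<your-trace-id>",
        "name": "external-label",      # hypothetical score name
        "value": 1,
        "comment": "Imported from an external labeling tool",
    },
    timeout=10,
)
resp.raise_for_status()
print(resp.json())
```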

Manage Annotation Queues via API

You can enqueue, manage, and dequeue annotation tasks via the API, which allows you to scale and automate your annotation workflows.
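A rough end-to-end sketch of such a workflow, assuming annotation-queue endpoints under `/api/public/annotation-queues` (list queues, add an item, list items, delete an item), Basic auth, and a hypothetical queue named "production-review"; the paths, payload fields, and `objectType`/`status` values are assumptions to verify against the Langfuse API reference:

```python
import os
import requests

# Minimal sketch of managing annotation queue items via the public REST API.
# All endpoint paths, payload fields, and enum values below are assumptions —
# consult the Langfuse API reference for the exact contract.
host = os.environ.get("LANGFUSE_HOST", "https://cloud.langfuse.com")
auth = (os.environ["LANGFUSE_PUBLIC_KEY"], os.environ["LANGFUSE_SECRET_KEY"])

# 1) Find the target queue (assumed list endpoint)
queues = requests.get(f"{host}/api/public/annotation-queues", auth=auth, timeout=10)
queues.raise_for_status()
queue_id = next(q["id"] for q in queues.json()["data"] if q["name"] == "production-review")

# 2) Enqueue a trace as an item of that queue
item = requests.post(
    f"{host}/api/public/annotation-queues/{queue_id}/items",
    auth=auth,
    json={"objectId": "<trace-id>", "objectType": "TRACE"},  # assumed enum: TRACE / OBSERVATION
    timeout=10,
)
item.raise_for_status()
item_id = item.json()["id"]

# 3) Inspect pending items
items = requests.get(
    f"{host}/api/public/annotation-queues/{queue_id}/items",
    auth=auth,
    params={"status": "PENDING"},  # assumed status filter
    timeout=10,
)
items.raise_for_status()
print(items.json())

# 4) Dequeue the item once it has been handled
requests.delete(
    f"{host}/api/public/annotation-queues/{queue_id}/items/{item_id}",
    auth=auth,
    timeout=10,
).raise_for_status()
```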

