
Scores Data Model

This page describes the data model for score-related objects in Langfuse. For an overview of what scores are and when to use them, see the Scores overview. For datasets, experiment runs, and function definitions, see the Experiments data model.

Scores

Scores are the data objects that store evaluation results. A score attaches an evaluation result to a trace, observation, session, or dataset run. Scores can be added manually via annotations, programmatically via the SDK/API, or automatically via LLM-as-a-Judge evaluators.


Scores have the following properties:

  • Each Score references exactly one of Trace, Observation, Session, or DatasetRun
  • Scores are either numeric, categorical, boolean, or text (see Score Types)
  • Scores can optionally be linked to a ScoreConfig to ensure they comply with a specific schema
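The first property can be made concrete with a small check. The following is a minimal sketch, not part of the Langfuse SDK: `score_level` is a hypothetical helper, and it assumes that an observation-level score may also carry the id of its parent trace, which the list above does not state.

```python
from typing import Optional


def score_level(
    trace_id: Optional[str] = None,
    observation_id: Optional[str] = None,
    session_id: Optional[str] = None,
    dataset_run_id: Optional[str] = None,
) -> str:
    """Return the kind of object a score is attached to (hypothetical helper)."""
    # Assumption (not stated in the text above): an observation-level score may
    # also carry the id of its parent trace, so a traceId next to an
    # observationId is not counted as a second target.
    if observation_id is not None:
        trace_id = None

    targets = {
        "trace": trace_id,
        "observation": observation_id,
        "session": session_id,
        "dataset_run": dataset_run_id,
    }
    present = [level for level, ref in targets.items() if ref is not None]
    if len(present) != 1:
        raise ValueError(f"expected exactly one target, got {present or 'none'}")
    return present[0]


print(score_level(trace_id="trace-1"))                          # trace
print(score_level(trace_id="trace-1", observation_id="obs-1"))  # observation
```

Calling the helper with no ids, or with both a session id and a dataset run id, raises a ValueError.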

Score object

| Attribute | Type | Required | Description |
| --- | --- | --- | --- |
| id | string | Yes | Unique identifier of the score. Auto-generated by SDKs. Can also be used as an idempotency key to update scores. |
| name | string | Yes | Name of the score, e.g. user_feedback, hallucination_eval |
| value | number | No | Numeric value of the score. Always defined for numeric and boolean scores. Optional for categorical scores. Not used for text scores. |
| stringValue | string | No | String value of the score. Used for categorical, boolean (string equivalent), and text data types. Automatically set for categorical scores based on the config if the configId is provided. |
| dataType | string | No | Automatically set based on the config data type when the configId is provided. Otherwise can be defined manually as NUMERIC, CATEGORICAL, BOOLEAN, or TEXT |
| source | string | Yes | Automatically set based on the source of the score. Can be either API, EVAL, or ANNOTATION |
| comment | string | No | Evaluation comment, commonly used for user feedback, eval reasoning output, or internal notes |
| traceId | string | No | Id of the trace the score relates to |
| observationId | string | No | Id of the observation (e.g. LLM call) the score relates to |
| sessionId | string | No | Id of the session the score relates to |
| datasetRunId | string | No | Id of the dataset run the score relates to |
| configId | string | No | Score config id to ensure that the score follows a specific schema. Can be defined in the Langfuse UI or via API. |

Common Use Cases

| Level | Description |
| --- | --- |
| Trace | Used for evaluation of a single interaction (most common). |
| Observation | Used for evaluation of a single observation below the trace level. |
| Session | Used for comprehensive evaluation of outputs across multiple interactions. |
| Dataset Run | Used for performance scores of a Dataset Run. |

Score Config

Score configs are used to ensure that your scores follow a specific schema. Using score configs allows you to standardize your scoring schema across your team and ensure that scores are consistent and comparable for future analysis.

You can define a ScoreConfig in the Langfuse UI or via our API. Configs are immutable but can be archived (and restored anytime).

A score config includes:

  • Score name
  • Data type: NUMERIC, CATEGORICAL, BOOLEAN, TEXT
  • Constraints on the score value (min/max for numeric, custom categories for categorical, 1-500 characters for text data types)

ScoreConfig object

| Attribute | Type | Required | Description |
| --- | --- | --- | --- |
| id | string | Yes | Unique identifier of the score config. |
| name | string | Yes | Name of the score config, e.g. user_feedback, hallucination_eval |
| dataType | string | Yes | Can be either NUMERIC, CATEGORICAL, BOOLEAN, or TEXT |
| isArchived | boolean | No | Whether the score config is archived. Defaults to false |
| minValue | number | No | Sets minimum value for numerical scores. If not set, the minimum value defaults to -∞ |
| maxValue | number | No | Sets maximum value for numerical scores. If not set, the maximum value defaults to +∞ |
| categories | list | No | Defines categories for categorical scores. List of objects with label-value pairs |
| description | string | No | Provides further description of the score configuration |
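The constraints a config places on a score can be sketched as a local validation function. This is an illustration under stated assumptions, not Langfuse's own validation logic: `ScoreConfig`, `Category`, and `validate` are hypothetical names, the min/max bounds are treated as inclusive, boolean scores are assumed to take the values 0 or 1, and categorical scores are checked by category label.

```python
import math
from dataclasses import dataclass
from typing import Optional, Tuple, Union


@dataclass(frozen=True)
class Category:
    label: str
    value: float


@dataclass(frozen=True)  # frozen, since configs are described as immutable
class ScoreConfig:
    # Hypothetical local model of the ScoreConfig object, not an SDK class.
    id: str
    name: str
    data_type: str  # NUMERIC, CATEGORICAL, BOOLEAN, or TEXT
    is_archived: bool = False
    min_value: float = -math.inf  # unset minimum defaults to -inf
    max_value: float = math.inf   # unset maximum defaults to +inf
    categories: Tuple[Category, ...] = ()
    description: Optional[str] = None


def validate(config: ScoreConfig, value: Union[float, str]) -> None:
    """Raise ValueError if value does not satisfy the config's constraints."""
    if config.data_type == "NUMERIC":
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            raise ValueError("numeric score requires a number")
        # Assumption: both bounds are inclusive.
        if not (config.min_value <= value <= config.max_value):
            raise ValueError(f"{value} is outside the configured range")
    elif config.data_type == "CATEGORICAL":
        # Assumption: categorical scores are checked by category label.
        labels = {category.label for category in config.categories}
        if not isinstance(value, str) or value not in labels:
            raise ValueError(f"{value!r} is not a configured category")
    elif config.data_type == "BOOLEAN":
        # Assumption: boolean scores are stored as 0 or 1.
        if isinstance(value, str) or value not in (0, 1):
            raise ValueError("boolean score must be 0 or 1")
    elif config.data_type == "TEXT":
        if not isinstance(value, str) or not (1 <= len(value) <= 500):
            raise ValueError("text score must be 1-500 characters")
    else:
        raise ValueError(f"unknown data type {config.data_type!r}")


accuracy = ScoreConfig(id="cfg-1", name="accuracy", data_type="NUMERIC",
                       min_value=0, max_value=1)
tone = ScoreConfig(id="cfg-2", name="tone", data_type="CATEGORICAL",
                   categories=(Category("polite", 1), Category("rude", 0)))

validate(accuracy, 0.5)   # passes silently
validate(tone, "polite")  # passes silently
try:
    validate(accuracy, 1.5)
except ValueError as err:
    print(err)  # 1.5 is outside the configured range
```

Checking scores against a config before sending them is one way to catch schema violations early; the authoritative check is whatever Langfuse applies when a configId is provided.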
