How to use Langfuse-hosted Evaluators on Dataset Runs?
Before running an Experiment via the SDK, set up the evaluators you want to run in the Langfuse UI. Configure a running evaluator with the following settings:
- Dataset: Filter which source dataset the evaluator should run on.
- Scope: Choose whether the evaluator targets only new Dataset Runs, past Dataset Runs (for backfilling), or both.
- Sampling: To manage costs and evaluation throughput, you can configure the evaluator to run on a percentage (e.g., 5%) of Dataset Run items.
See the LLM-as-a-Judge documentation for more details.
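Once the evaluator is configured, any new Dataset Run you create via the SDK is picked up automatically. The following is a minimal sketch using the Python SDK's dataset helpers (v2-style API, `item.observe`); the dataset name `capital-cities`, the run name `experiment-v1`, and the `my_app()` function are placeholders, not part of Langfuse.

```python
from langfuse import Langfuse

# Credentials are read from LANGFUSE_PUBLIC_KEY / LANGFUSE_SECRET_KEY / LANGFUSE_HOST.
langfuse = Langfuse()

def my_app(question: str) -> str:
    # Placeholder for your actual application / LLM call.
    return f"Answer to: {question}"

# Placeholder dataset name; replace with the dataset your evaluator is scoped to.
dataset = langfuse.get_dataset("capital-cities")

for item in dataset.items:
    # item.observe() creates a trace and links it to the named Dataset Run,
    # so a running evaluator configured in the UI can score it.
    with item.observe(run_name="experiment-v1") as trace_id:
        output = my_app(item.input)
        # Attach the output to the linked trace for the evaluator to inspect.
        langfuse.trace(id=trace_id, output=output)

# Ensure all queued events are sent before the script exits.
langfuse.flush()
```

If your evaluator is scoped to new runs on this dataset, the run `experiment-v1` will be evaluated (subject to the sampling percentage you configured).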