06 Experiments
Workshop source
Workshop material is maintained in the public langfuse/langfuse-workshop repository. Use the repository for the runnable app, checkpoint branches, and local setup.
Learner guide: 06 Experiments
Instructor notes
- The key idea is reuse: the experiment runner calls the same runSupportConversation(...) as the web app.
- Contrast deterministic scoring (keyword_overlap) with LLM-as-a-judge scoring (correctness).
- Confirm the default evaluator model before the Correctness setup. If learners skipped setup, send them to Project Settings → LLM Connections first.
- Keep concurrency at one for workshops so traces and console output are easy to follow.
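The reuse, deterministic scoring, and concurrency points above can be sketched in TypeScript. This is a minimal illustration, not the contents of scripts/run-dataset.ts. The body of runSupportConversation here is a stand-in stub, and DatasetItem, tokenize, keywordOverlap, and runSequentially are hypothetical names. The sketch assumes one plausible definition of the score (the fraction of distinct expected-answer words that also appear in the output); the workshop's actual keyword_overlap may be defined differently.

```typescript
// Hypothetical dataset item shape, for illustration only.
interface DatasetItem {
  input: string;
  expectedOutput: string;
}

// Stand-in stub for the shared entry point. The real function lives in the
// workshop repository; this one only returns a canned answer.
async function runSupportConversation(question: string): Promise<string> {
  return `Regarding "${question}": reset your password from the settings page.`;
}

// Lowercase, split on non-alphanumeric characters, keep distinct non-empty words.
function tokenize(text: string): string[] {
  const words = text
    .toLowerCase()
    .split(/[^a-z0-9]+/)
    .filter((w) => w.length > 0);
  return words.filter((w, i) => words.indexOf(w) === i);
}

// Deterministic score (assumed definition): share of distinct expected words
// that also occur in the output. Same inputs always give the same score.
function keywordOverlap(output: string, expected: string): number {
  const expectedWords = tokenize(expected);
  if (expectedWords.length === 0) return 0;
  const outputWords = tokenize(output);
  const hits = expectedWords.filter((w) => outputWords.indexOf(w) !== -1).length;
  return hits / expectedWords.length;
}

// Concurrency of one: each item is awaited before the next one starts,
// so traces and console output appear in dataset order.
async function runSequentially(items: DatasetItem[]): Promise<number[]> {
  const scores: number[] = [];
  for (const item of items) {
    const output = await runSupportConversation(item.input);
    scores.push(keywordOverlap(output, item.expectedOutput));
  }
  return scores;
}
```

Because the score depends only on the two strings, rerunning it never changes the result, which is the contrast to draw with the LLM-as-a-judge correctness score.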
Demo rhythm
- Skim the numbered sections in scripts/run-dataset.ts.
- Configure the Correctness evaluator for dataset runs.
- Run npm run dataset:run.
- Open the run table, per-item traces, and chart view.
Watch for
- Correctness evaluator mapping. query comes from dataset input, generation from run output, and ground_truth from expected output.
- "No default model set" means Langfuse needs an LLM connection/default evaluator model; it is not fixed by editing .env.
- Slow asynchronous evaluator results; refresh after the run finishes.
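The Correctness evaluator mapping in the first bullet can be pictured as a small function. In the workshop the mapping is set when configuring the evaluator, not written by hand, so the sketch below only illustrates which field feeds which variable. The DatasetRunRecord shape and the buildCorrectnessVariables name are hypothetical; only query, generation, and ground_truth come from the notes.

```typescript
// Hypothetical record combining one dataset item with the output of one run.
interface DatasetRunRecord {
  datasetInput: string;
  runOutput: string;
  expectedOutput: string;
}

// The three variables named in the Correctness evaluator mapping.
interface CorrectnessVariables {
  query: string;
  generation: string;
  ground_truth: string;
}

// query <- dataset input, generation <- run output, ground_truth <- expected output.
function buildCorrectnessVariables(record: DatasetRunRecord): CorrectnessVariables {
  return {
    query: record.datasetInput,
    generation: record.runOutput,
    ground_truth: record.expectedOutput,
  };
}

// Illustrative values only.
const exampleVariables = buildCorrectnessVariables({
  datasetInput: "How do I reset my password?",
  runOutput: "Open settings and choose Reset password.",
  expectedOutput: "Use the Reset password option in settings.",
});
```

A common mistake to look for is swapping generation and ground_truth, which makes the judge grade the expected answer against the model output.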