04 Monitoring
Workshop source
Workshop material is maintained in the public langfuse/langfuse-workshop repository. Use the repository for the runnable app, checkpoint branches, and local setup.
Learner guide: 04 Monitoring
Instructor notes
- This is still a UI-first chapter, but it now mixes two evaluator types: LLM-as-a-judge for semantic signals and a code evaluator for a deterministic frustration signal.
- Before the first evaluator, confirm the project has Project Settings → LLM Connections configured and a default evaluator model saved. Fresh projects otherwise show "No default model set" before learners can pick the published templates.
- Explain why the two monitors target different observations: out-of-scope needs the system prompt on the generation, while disagreement needs the conversation history on the agent root.
- Explain why the all-caps monitor is code-based: no model call is needed when a simple deterministic rule is enough.
- Use the first few evaluator results as a debugging exercise, not just a pass/fail check.
Demo rhythm
- Configure Out-of-Scope Request on final generation observations.
- Configure User Disagreement on the
dad-it-support-chat-turnagent observation. - Configure the all-caps code evaluator on the same
dad-it-support-chat-turnagent observation. - Send one clean in-scope turn, one out-of-scope turn, one disagreement turn, and one all-caps turn.
Watch for
- Accidentally choosing the wrong template for User Disagreement.
- Treating the Langfuse API keys from
.envas enough for evaluators. Judge-based evaluators also need the Langfuse-side LLM connection. - Mapping
last_user_messageto the last transcript item on a final generation; final generations include tool messages after the user turn. - For the all-caps signal, prefer the Python version in the learner docs rather than fighting the TypeScript editor.
- Learners assuming the all-caps score is a guarantee of anger. Frame it as a triage signal, not a verdict.
Was this page helpful?