Launch Week 5 ยท Day 1: Experiments in CI/CD โ†’
Workshop08 Wrap-up
WorkshopLessons08 Wrap-up

08 Wrap-up

Workshop source

Workshop material is maintained in the public langfuse/langfuse-workshop repository. Use the repository for the runnable app, checkpoint branches, and local setup.

View this Markdown file

Starting point

git checkout checkpoint/08-wrap-up

You have walked through every loop step.

What you should be able to do now

  • Trace an LLM app end-to-end and read the result as a debugging surface.
  • Connect prompts to traces so a prompt change has a measurable next-trace effect.
  • Detect interesting production behavior (out-of-scope, disagreement) automatically.
  • Turn product scope into a starter dataset of realistic examples.
  • Run experiments on the same agent code with no parallel implementation.
  • Compare runs after changes and decide which setup is better โ€” by score and by reading individual outputs.

Bigger picture

Langfuse in this workshop is a shared surface, not just observability:

  • understanding behavior โ€” every interaction is inspectable
  • collecting representative examples โ€” production seeds datasets
  • comparing changes โ€” every prompt or code change has a baseline
  • improving systems continuously โ€” the loop closes back on itself

Good closing questions

  • What did tracing reveal that was invisible before?
  • Which production events would you monitor first in your own app?
  • What would you add to the starter dataset next?
  • What change would you test after the first prompt iteration?
  • Where in your real app would the /langfuse skill have saved you the most hand-rolling?

How to work on your own application

When you go back to your own codebase, do this in order:

  1. Run the Langfuse CLI so you can manage prompts, datasets, and runs from the terminal. The CLI uses the same project API keys as the SDKs, so there is no separate CLI login step:

    export LANGFUSE_PUBLIC_KEY="pk-lf-..."
    export LANGFUSE_SECRET_KEY="sk-lf-..."
    export LANGFUSE_BASE_URL="https://cloud.langfuse.com"
    
    npx langfuse-cli api __schema

    If you prefer a global binary, install the published langfuse-cli package:

    npm install -g langfuse-cli
    langfuse api __schema
  2. Install the Langfuse skill (/langfuse) โ€” it packages the recommended tracing, prompt management, monitoring, and evaluator patterns from this workshop and applies them to your codebase without you hand-rolling each piece.

  3. Pick the smallest LLM-using surface you have and wire observe(...) + observeOpenAI(...) first. Get one trace before you do anything else.

  4. Add user/session information only once you have at least two users or two sessions of traffic โ€” there's no point until then.

  5. Build your first dataset from real traces, discussions with experts, or past examples rather than from imagination. Production behavior will tell you over time what your dataset needs to cover.

  6. Run one experiment, change one thing, rerun. Then repeat.

For the bigger-picture material on each step, the Langfuse Academy has dedicated lessons:


Was this page helpful?