FAQ

How do I cut my Langfuse Cloud bill?

Langfuse Cloud pricing is based on the number of ingested units per billing period.

Units = Traces + Observations + Scores (data model)

Most cost spikes result from ingesting too many traces or overly verbose observations. You can cut costs quickly by sampling fewer traces or logging only essential data, while still preserving your core insights.
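To make the unit formula concrete, here is a rough back-of-the-envelope sketch. The function name and the traffic numbers are made up for illustration. It only counts units and does not model actual prices, so use the pricing calculator for cost figures.

```python
def estimate_units(traces, observations_per_trace, scores_per_trace, sample_rate=1.0):
    """Estimate billable units: traces + observations + scores (after sampling)."""
    kept_traces = round(traces * sample_rate)
    observations = kept_traces * observations_per_trace
    scores = kept_traces * scores_per_trace
    return kept_traces + observations + scores


# Hypothetical workload: 100k traces, 10 observations and 1 score per trace.
baseline = estimate_units(100_000, 10, 1)
# Trimming each trace to 4 observations (Option 1).
fewer_obs = estimate_units(100_000, 4, 1)
# Keeping all observations but sampling half of the traces (Option 2).
sampled = estimate_units(100_000, 10, 1, sample_rate=0.5)

print(baseline, fewer_obs, sampled)  # 1200000 600000 600000
```

In this made-up workload, observations dominate the total, which is why trimming observations per trace can save as much as dropping half of the traces.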

Use our pricing calculator to estimate how different usage levels impact your monthly costs.

You can track your unit consumption in real time on the “Langfuse Usage Management” dashboard.

Option 1: Reduce observations per trace

Every observation within a trace counts toward your unit total. Some observations may be overly detailed or irrelevant to your specific use case. To remove them:

  1. Review your traces to identify low-value or unnecessary observations.
  2. Update your integration/instrumentation to exclude these observations.
    • For most integrations, you define which observations are created. Thus, you can remove them by updating your instrumentation.
    • If you use the Python SDK (v3, OpenTelemetry-based), all OpenTelemetry spans are exported to Langfuse by default. If some observations are not relevant, you can filter out observations by instrumentation scope.
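The idea behind filtering by instrumentation scope can be sketched in plain Python. This is a conceptual illustration only, not the SDK's implementation: the `Span` class, the `filter_spans` helper, and the scope names are hypothetical. In the Python SDK itself, the filter is set on the client (to my knowledge via a `blocked_instrumentation_scopes` argument; treat that name as an assumption and confirm it in the SDK docs).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    name: str
    scope: str  # instrumentation scope: the library or module that emitted the span


def filter_spans(spans, blocked_scopes):
    """Drop every span whose instrumentation scope is in the blocked set."""
    blocked = set(blocked_scopes)
    return [span for span in spans if span.scope not in blocked]


# Hypothetical spans from one request: LLM calls plus database noise.
spans = [
    Span("chat-completion", "openai"),
    Span("SELECT users", "sqlalchemy"),
    Span("retrieve-docs", "my_app.rag"),
    Span("connect", "psycopg"),
]

kept = filter_spans(spans, ["sqlalchemy", "psycopg"])
print([span.name for span in kept])  # ['chat-completion', 'retrieve-docs']

# Assumed SDK equivalent (verify against the Python SDK v3 docs):
# langfuse = Langfuse(blocked_instrumentation_scopes=["sqlalchemy", "psycopg"])
```

Here two of the four spans would never be exported, so this trace would create half as many observations.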

Option 2: Sample fewer traces

Keeping all traces is often valuable for LLM application development. Unlike traditional observability:

  • Dynamic sampling based on error levels isn’t feasible since you only know if a trace is interesting after completion (through user feedback, LLM-as-a-judge evaluation, etc.).
  • Retaining all traces supports model distillation efforts down the line.

However, if your application operates at significant scale, sampling can be a reasonable cost-cutting strategy. Check out the sampling docs to learn more.
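As a minimal sketch of what trace-level sampling does, the snippet below keeps a fixed fraction of traces by hashing the trace ID. It illustrates the general technique and is not necessarily how the Langfuse SDKs implement it. The `should_sample` helper and the trace IDs are hypothetical. In practice you would set the sample rate through the SDK configuration described in the sampling docs rather than write this yourself.

```python
import hashlib


def should_sample(trace_id: str, sample_rate: float) -> bool:
    """Decide deterministically whether to keep a trace.

    The same trace ID always gets the same decision, so all observations
    and scores belonging to a trace are kept or dropped together.
    """
    if not 0.0 <= sample_rate <= 1.0:
        raise ValueError("sample_rate must be between 0.0 and 1.0")
    digest = hashlib.sha256(trace_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big")  # uniform integer in [0, 2**64)
    return bucket < sample_rate * 2**64


# Hypothetical trace IDs; keep roughly 20% of them.
trace_ids = [f"trace-{i}" for i in range(10_000)]
kept = sum(should_sample(trace_id, 0.2) for trace_id in trace_ids)
print(f"kept {kept} of {len(trace_ids)} traces")
```

Deciding per trace rather than per observation matters: dropping individual observations at random would leave incomplete traces that still count toward your units.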


Have questions regarding your Langfuse bill? Reach out to support.
