Langfuse Roadmap

Langfuse is open source and we want to be fully transparent about what we're working on and what's next. This roadmap is a living document, and we'll update it as we make progress.

Your feedback is highly appreciated. Feel like something is missing? Add new ideas on GitHub or vote on existing ones. Both are great ways to contribute to Langfuse and help us understand what is important to you.

🚀 Released

See the changelog for the 10 most recent items.

Subscribe to our mailing list to get occasional email updates about new features.

🚧 In progress

  • Tracing
    • Full-text search: Search across the inputs/outputs of traces, sessions, and datasets (#939)
  • Analytics
    • Customizable dashboards: Create and share custom dashboards based on metrics extracted from traces (#1011)
    • Query API for custom metric aggregations to replace the static daily metrics API
  • Make the Langfuse SDKs OpenTelemetry-native; the Langfuse server already supports OpenTelemetry (docs). A minimal export sketch follows this list.
  • Evaluation
    • Session-level scores (#2728)
    • Improvements to core eval views (e.g. compare run view)
    • Annotate dataset experiments
  • UI/UX: improvements across all core product features
  • Admin/SCIM API to programmatically manage organizations, projects, and users (#1007)
  • Self-hosting documentation: better guidance on selecting the right deployment option and scaling Langfuse
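
A note on the OpenTelemetry item above: since the Langfuse server already accepts OTLP traces over HTTP, you can export spans to it with the standard OpenTelemetry SDK before the native Langfuse SDK support lands. The sketch below is illustrative only; the endpoint path and Basic-auth scheme are assumptions to verify against the linked OpenTelemetry docs for your deployment.

```python
# Minimal sketch: exporting OpenTelemetry spans to Langfuse over OTLP/HTTP.
# Endpoint path and Basic-auth header are assumptions -- check the
# OpenTelemetry docs linked above for your deployment.
import base64

from opentelemetry import trace
from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor

LANGFUSE_HOST = "https://cloud.langfuse.com"  # or your self-hosted URL
token = base64.b64encode(b"pk-lf-...:sk-lf-...").decode()  # publicKey:secretKey

provider = TracerProvider()
provider.add_span_processor(
    BatchSpanProcessor(
        OTLPSpanExporter(
            endpoint=f"{LANGFUSE_HOST}/api/public/otel/v1/traces",
            headers={"Authorization": f"Basic {token}"},
        )
    )
)
trace.set_tracer_provider(provider)

tracer = trace.get_tracer("my-app")
with tracer.start_as_current_span("llm-call"):
    ...  # your instrumented LLM call goes here
```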

🔮 Planned

  • Agent Observability: Improve native support for agentic applications
    • Generalized agent graphs (#2669); a beta is available for LangGraph (docs)
    • Filtering for tool calls used within an execution
    • Opinionated agent evaluations
  • Evaluation
    • Simplified configuration of in-UI evals
    • More LLM-as-a-judge templates for RAG, agent, and conversational applications
    • Non-LLM evaluators: classifiers, custom code, regex, etc.
    • Evals on repeated spans/observations within a trace
    • Comparisons of different evaluation metrics
    • Improvements to the Langfuse SDKs to simplify the creation of experiments
    • End-to-end examples of how to run Langfuse experiments as regression tests in CI/CD pipelines
    • Add tracing to LLM-as-a-judge evals for better observability (debugging and cost tracking)
  • UI
    • Filters: saved views, new filter UI on tables
  • Prompt Management
    • Placeholders for chat messages (#2210)
    • Track prompt variables in production tracing, and simplify adding them to datasets and running prompt experiments (a sketch of today's variable flow follows this list)
    • LLM-assisted prompt engineering
    • Folders (#4874)
    • A/B testing
    • Native support for tool calls (#2624)
  • Data Platform
    • Webhooks to subscribe to changes within your Langfuse project (#1033)
    • Alerts: Create alerts for custom metric thresholds, errors, etc.
    • Rule-based routing between Langfuse product features
  • Prompt Experiments & Playground
    • Split view to compare different LLMs and prompts side by side before running a prompt experiment
    • Add tool-call and structured output support to prompt experiments
  • Langfuse Cloud
    • HIPAA compliance
    • Usage-based billing alerts
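
For context on the prompt-management items above: today, prompt variables are scalar {{placeholders}} compiled at fetch time via the SDKs; the planned chat-message placeholders (#2210) would extend this from scalar values to whole message lists. A minimal sketch of the current flow in Python (the prompt name and variables are hypothetical):

```python
# Minimal sketch of today's prompt-variable flow in the Python SDK.
# Prompt name and variables are hypothetical.
from langfuse import Langfuse

langfuse = Langfuse()  # reads LANGFUSE_PUBLIC_KEY / LANGFUSE_SECRET_KEY from env

# Fetch the current production version of a managed prompt.
prompt = langfuse.get_prompt("movie-critic")

# {{movie}} and {{tone}} placeholders in the prompt text are filled here.
compiled = prompt.compile(movie="Dune 2", tone="enthusiastic")
print(compiled)
```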

🙏 Feature requests and bug reports

The best way to support Langfuse is to share your feedback, report bugs, and upvote ideas suggested by others.

Feature requests and bug reports are tracked on GitHub.
