Langfuse Roadmap
Langfuse is open source and we want to be fully transparent about what we’re working on and what’s next. This roadmap is a living document and we’ll update it as we make progress.
Your feedback is highly appreciated. Feel like something is missing? Add new ideas on GitHub or vote on existing ones. Both are a great way to contribute to Langfuse and help us understand what is important to you.
🚀 Released
The 10 most recent changelog items:
- Protected prompt labels
- Playground support for Gemini 2.5 Pro Experimental
- Tool Calling and Structured Output in Playground
- Traces Table Peek View
- New Prompt View
- New Trace View
- OpenAI Response API support in SDKs
- Batch-export Scores via UI
- Public API for Annotation Queues
- Prompt Composability
Subscribe to our mailing list to get occasional email updates about new features.
🚧 In progress
- Tracing
  - Full-text search: Search across the inputs/outputs of traces, sessions, and datasets (#939)
- Analytics
  - Customizable dashboards: Create and share custom dashboards based on metrics extracted from traces (#1011)
  - Query API for custom metric aggregations to replace the static daily metrics API
- Make the Langfuse SDKs OpenTelemetry-native; the Langfuse server already supports OpenTelemetry (docs)
- Evaluation
  - Session-level scores (#2728)
  - Improvements to core eval views (e.g., the compare-run view)
  - Annotate dataset experiments
- UI/UX: improvements across all core product features
- Admin/SCIM API to programmatically manage organizations, projects, and users (#1007)
- Self-hosting documentation: provide better guidance on selecting the right deployment option and scaling Langfuse
🔮 Planned
- Agent Observability: Improve native support for agentic applications
- Evaluation
  - Simplified configuration of in-UI evals
  - More LLM-as-a-judge templates for RAG, agent, and conversational applications
  - Non-LLM evaluators: classifiers, custom code, regex, etc.
  - Evals on repeated spans/observations within a trace
  - Comparisons of different evaluation metrics
  - Improvements to the Langfuse SDKs to simplify the creation of experiments
  - End-to-end examples of running Langfuse experiments as regression tests in CI/CD pipelines
  - Add tracing to LLM-as-a-judge evals for better observability (debugging and cost tracking)
- UI
  - Filters: saved views, new filter UI on tables
- Prompt Management
- Data Platform
  - Webhooks to subscribe to changes within your Langfuse project (#1033)
  - Alerts: Create alerts for custom metric thresholds, errors, etc.
  - Rule-based routing between Langfuse product features
- Prompt Experiments & Playground
  - Split view to compare different LLMs and prompts side by side before running a prompt experiment
  - Tool-call and structured-output support in prompt experiments
- Langfuse Cloud
  - HIPAA compliance
  - Usage-based billing alerts
🙏 Feature requests and bug reports
The best way to support Langfuse is to share your feedback, report bugs, and upvote ideas suggested by others.