2024/04/19

Langfuse Launch Week #1

Unveiling Langfuse 2.0 in a week of releases

Langfuse

Join us for Launch Week #1

We’re excited to announce Langfuse’s first launch week. We’re kicking it off on Monday, April 22nd, and will release a major upgrade to the Langfuse platform every day until Friday.

⭐️ Star us on GitHub & see all of our releases!
Twitter will be our main channel for all of Launch Week #1
Join our first town hall on Wednesday

Launches

Day 0: OpenAI JS SDK Integration

import OpenAI from "openai";
import { observeOpenAI } from "langfuse";
 
// wrap the OpenAI SDK
const openai = observeOpenAI(new OpenAI());
 
// use the OpenAI SDK as you normally would
const res = await openai.chat.completions.create({
  model: "gpt-3.5-turbo", // required by the OpenAI SDK; this model name is only an example
  messages: [{ role: "system", content: "Tell me a story." }],
});

We launched a new wrapper for the OpenAI JS SDK that makes it easier to monitor OpenAI API usage. It automatically tracks prompts, completions, and API errors, and gives you insights into model usage and costs. After a soft launch during which we gathered user feedback, the integration is now fully available, complete with documentation and an example notebook.
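To make the idea of a tracking wrapper more concrete, here is a minimal sketch of the general pattern: a higher-order function that records the input, output, error, and duration of every call. This is not how `observeOpenAI` is implemented; `observeCalls`, `Observation`, and the `record` callback are hypothetical names used only for illustration.

```typescript
// Hypothetical record shape for illustration; not the Langfuse data model.
interface Observation<I, O> {
  input: I;
  output?: O;
  error?: string;
  durationMs: number;
}

// Wrap an async function so every call is recorded, whether it succeeds or throws.
function observeCalls<I, O>(
  fn: (input: I) => Promise<O>,
  record: (obs: Observation<I, O>) => void,
): (input: I) => Promise<O> {
  return async (input: I): Promise<O> => {
    const start = Date.now();
    try {
      const output = await fn(input);
      record({ input, output, durationMs: Date.now() - start });
      return output;
    } catch (err) {
      const error = err instanceof Error ? err.message : String(err);
      record({ input, error, durationMs: Date.now() - start });
      throw err; // rethrow so the caller still sees the original error
    }
  };
}
```

The wrapped function keeps the signature of the original, which is why code written against the unwrapped client can stay unchanged. A real integration would send each record to a tracing backend instead of a local callback.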

Day 1: PostHog Integration

We teamed up with PostHog (OSS product analytics) to integrate LLM-related product metrics into your existing PostHog dashboards. This integration is now available in public beta on Langfuse cloud. You can configure it within your Langfuse project settings. When activated, Langfuse sends metrics related to traces, generations, and scores to PostHog. You can then build custom dashboards to visualize the data or use the LLM Analytics dashboard template to get started quickly. See docs for more information.

Day 2: LLM Playground

We’re excited to introduce the LLM Playground to Langfuse. By making prompt engineering possible directly in Langfuse, we take another step in our mission to build a feature-complete LLM engineering platform that helps you along the full lifecycle of your LLM application. With the LLM Playground, you can now test and iterate on your prompts directly in Langfuse. Either start from scratch or jump into the playground from an existing prompt in your project. See the docs for more details and let us know what you think in the GitHub discussion.

Day 3: Decorator-based integration for Python

We’re happy to share that the decorator-based integration for Python now supports all Langfuse features and is the recommended way to use Langfuse in Python. The decorator makes integrating with Langfuse much easier: all inputs, outputs, and timings are captured automatically, and it works with all other Langfuse integrations (LangChain, LlamaIndex, OpenAI SDK, …). Head over to the Python Decorator docs to learn more. To celebrate this milestone, we wrote a blog post on the technical details and created the example notebook shown in the video, which demonstrates what’s really cool about the decorator. Thanks again to @lshalon and @AshisGhosh for your contributions to this!

Day 4: Datasets v2

We’re thrilled to release Datasets v2, featuring significant enhancements to the dataset experience in Langfuse. Improvements include a new editor powered by CodeMirror, metadata support on all objects, tables that render inputs/outputs side-by-side, the ability to link dataset runs to traces, and the option to create dataset items directly from traces. We’ve also extended the public API with new endpoints for programmatic management of datasets. Check out the changelog, which summarizes all the new features and improvements.

Day 5: Model-based Evaluations

On the final day of Launch Week 1, we’re happy to release the biggest change to Langfuse yet: Model-based evaluations that run right in Langfuse. Until now, it has been easy to measure LLM cost and latency in Langfuse. Quality is based on scores, which can come from user feedback, manual labeling, or evaluation pipelines that you build yourself using the Langfuse SDKs/API. Model-based Evaluations in Langfuse make it much easier to continuously evaluate your application on the dimensions you care about, such as hallucinations, toxicity, relevance, correctness, conciseness, and more. We provide some battle-tested templates to get you started, but you can also write your own templates to cover any niche use case specific to your application. Check out the changelog or watch the video to learn more about all the details.
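As a rough illustration of what a template-driven, model-based evaluation involves, the sketch below fills a prompt template with a trace's input and output, asks a judge model for a rating, and turns the reply into a score. It is a simplified sketch under stated assumptions, not the Langfuse implementation: `EvalTemplate`, `Score`, `fillTemplate`, `parseScore`, and `evaluate` are hypothetical names, and `judge` stands in for whatever LLM call you use.

```typescript
// Hypothetical shapes for illustration; not the Langfuse data model.
interface EvalTemplate {
  name: string;
  prompt: string; // may contain {{input}} and {{output}} placeholders
}

interface Score {
  traceId: string;
  name: string;
  value: number; // 0 (worst) to 1 (best)
}

// Replace {{key}} placeholders with matching values; unknown keys are left as-is.
function fillTemplate(prompt: string, vars: Record<string, string>): string {
  return prompt.replace(
    /\{\{(\w+)\}\}/g,
    (match: string, key: string) => vars[key] ?? match,
  );
}

// Take the first number in the judge's reply and clamp it to [0, 1].
function parseScore(reply: string): number | null {
  const found = reply.match(/\d+(?:\.\d+)?/);
  if (found === null) return null;
  return Math.min(1, Math.max(0, Number(found[0])));
}

// Run one evaluation; `judge` is an assumed callback wrapping any LLM call.
async function evaluate(
  template: EvalTemplate,
  traceId: string,
  vars: Record<string, string>,
  judge: (prompt: string) => Promise<string>,
): Promise<Score | null> {
  const reply = await judge(fillTemplate(template.prompt, vars));
  const value = parseScore(reply);
  return value === null ? null : { traceId, name: template.name, value };
}
```

Returning `null` when the reply contains no number keeps unparseable judge responses from being stored as misleading scores. A production setup would also want retries and a stricter output format than free text.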

Launch Week Events

Wednesday: First virtual town hall

You’re invited to our first virtual town hall. We (Max, Marc and Clemens) will be demoing new features in Langfuse, answering questions and talking about where we’re taking the project. We’re looking forward to hanging out!

When: Wednesday, April 24th, noon PT, 9pm CET
Recording on YouTube

Friday: Langfuse 2.0 on Product Hunt

We will end the week with the launch of Langfuse 2.0 on Product Hunt on Friday, April 26th. After our initial launch last year – which led to a Golden Kitty Award – we are very excited to be back on Product Hunt.

Launch post (Spoiler: Langfuse became the #1 product of the day 🥇)