Observability for Baseten with Langfuse
This guide shows you how to integrate Baseten with Langfuse. Baseten’s inference API is fully compatible with OpenAI’s client libraries, so you can use the Langfuse OpenAI drop-in replacement to trace the model calls your application makes through Baseten.
What is Baseten? Baseten is an inference platform that enables developers to deploy and scale machine learning models in production. It provides fast, reliable model inference with support for popular open-source models through an OpenAI-compatible API.
What is Langfuse? Langfuse is an open source LLM engineering platform that helps teams trace API calls, monitor performance, and debug issues in their AI applications.
Step 1: Install Dependencies
Make sure you have installed the necessary Python packages:
%pip install openai langfuse -q
Step 2: Set Up Environment Variables
import os
# Get keys for your project from the project settings page: https://cloud.langfuse.com
os.environ["LANGFUSE_PUBLIC_KEY"] = "pk-lf-..."
os.environ["LANGFUSE_SECRET_KEY"] = "sk-lf-..."
os.environ["LANGFUSE_BASE_URL"] = "https://cloud.langfuse.com" # 🇪🇺 EU region
# os.environ["LANGFUSE_BASE_URL"] = "https://us.cloud.langfuse.com" # 🇺🇸 US region
# Get your Baseten API key from https://app.baseten.co/settings/api_keys
os.environ["BASETEN_API_KEY"] = "..."
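Before creating the client, it can help to confirm that the placeholders above were actually replaced. The following is a minimal sketch using only the standard library; `missing_env_vars` is a hypothetical helper (not part of Langfuse or Baseten), and treating all four variables as required is an assumption based on the setup cell above.

```python
import os

# Variable names taken from the setup cell above.
REQUIRED_ENV_VARS = (
    "LANGFUSE_PUBLIC_KEY",
    "LANGFUSE_SECRET_KEY",
    "LANGFUSE_BASE_URL",
    "BASETEN_API_KEY",
)


def missing_env_vars(env=None):
    """Return required variables that are unset, empty, or still end in '...'.

    Hypothetical helper for illustration; `env` defaults to os.environ.
    """
    env = os.environ if env is None else env
    missing = []
    for name in REQUIRED_ENV_VARS:
        value = env.get(name, "").strip()
        # "pk-lf-..." and "..." are the placeholders used in this guide.
        if not value or value.endswith("..."):
            missing.append(name)
    return missing
```

Calling `missing_env_vars()` with no arguments checks the current process environment and returns an empty list when everything is filled in.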
Step 3: Langfuse OpenAI drop-in Replacement
In this step we use the Langfuse drop-in replacement for the OpenAI SDK: import openai from langfuse.openai instead of importing it directly.
To start using Baseten with OpenAI’s client libraries, pass in your Baseten API key to the api_key option, and change the base_url to https://inference.baseten.co/v1:
# instead of import openai:
from langfuse.openai import openai

client = openai.OpenAI(
    api_key=os.environ.get("BASETEN_API_KEY"),
    base_url="https://inference.baseten.co/v1",
)
Step 4: Run An Example
The following cell demonstrates how to call Baseten’s chat model using the traced OpenAI client. All API calls will be automatically traced by Langfuse.
response = client.chat.completions.create(
    model="zai-org/GLM-4.6",
    messages=[
        {"role": "system", "content": "You are a travel agent. Be descriptive and helpful."},
        {"role": "user", "content": "Tell me the top 3 things to do in San Francisco"},
    ],
    name="baseten-example-trace"
)
print(response.choices[0].message.content)
Step 5: See Traces in Langfuse
After running the example model call, you can see the traces in Langfuse. You will see detailed information about your Baseten API calls, including:
- Request parameters (model, messages, temperature, etc.)
- Response content
- Token usage statistics
- Latency metrics
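For a quick local cross-check of the token usage and latency figures listed above, something like the following sketch may be useful. It assumes the response follows the OpenAI chat completion shape (a `usage` object with `prompt_tokens`, `completion_tokens`, and `total_tokens`); `timed_call` and `usage_summary` are hypothetical helpers, and `fake_response` is a stand-in so the block runs without network access. Latency measured this way will not match the Langfuse value exactly, since it is taken at a different point.

```python
import time
from types import SimpleNamespace


def timed_call(fn, *args, **kwargs):
    """Call fn and return (result, elapsed_seconds) using a monotonic clock."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


def usage_summary(response):
    """Extract token counts from an OpenAI-style chat completion response.

    Returns None when the response carries no usage block.
    """
    usage = getattr(response, "usage", None)
    if usage is None:
        return None
    return {
        "prompt_tokens": usage.prompt_tokens,
        "completion_tokens": usage.completion_tokens,
        "total_tokens": usage.total_tokens,
    }


# Stand-in response with made-up numbers, used only to exercise the helpers.
fake_response = SimpleNamespace(
    usage=SimpleNamespace(prompt_tokens=12, completion_tokens=30, total_tokens=42)
)
summary = usage_summary(fake_response)
```

With the real client, `timed_call(client.chat.completions.create, model=..., messages=...)` would wrap the same call shown in Step 4.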

Interoperability with the Python SDK
You can use this integration together with the Langfuse SDKs to enrich the traces it produces.
The @observe() decorator wraps your instrumented code in a trace, and propagate_attributes attaches attributes such as user_id, session_id, tags, metadata, and version to the spans created inside it.
from langfuse import observe, propagate_attributes, get_client

langfuse = get_client()

@observe()
def my_llm_pipeline(input):
    # Add additional attributes (user_id, session_id, metadata, version, tags) to all spans created within this execution scope
    with propagate_attributes(
        user_id="user_123",
        session_id="session_abc",
        tags=["agent", "my-trace"],
        metadata={"email": "user@langfuse.com"},
        version="1.0.0"
    ):
        # YOUR APPLICATION CODE HERE
        result = call_llm(input)

        # Update the trace input and output
        langfuse.update_current_trace(
            input=input,
            output=result,
        )

        return result
Learn more about using the Decorator in the Langfuse SDK instrumentation docs.
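The pipeline snippet above calls call_llm without defining it. One possible shape for that function is sketched below, assuming it sends a single user message through an OpenAI-style client such as the one created in Step 3. `make_call_llm` and the echo client are hypothetical and exist only so the block runs offline; the model name is copied from the Step 4 example.

```python
from types import SimpleNamespace

# Model name copied from the Step 4 example.
DEFAULT_MODEL = "zai-org/GLM-4.6"


def make_call_llm(client, model=DEFAULT_MODEL):
    """Build a one-argument call_llm(prompt) bound to an OpenAI-style client."""

    def call_llm(prompt):
        response = client.chat.completions.create(
            model=model,
            messages=[{"role": "user", "content": prompt}],
        )
        return response.choices[0].message.content

    return call_llm


class _EchoCompletions:
    """Offline stand-in that mimics the shape of client.chat.completions."""

    def create(self, model, messages):
        text = f"[{model}] {messages[-1]['content']}"
        message = SimpleNamespace(content=text)
        return SimpleNamespace(choices=[SimpleNamespace(message=message)])


fake_client = SimpleNamespace(chat=SimpleNamespace(completions=_EchoCompletions()))
call_llm = make_call_llm(fake_client)
```

Passing the traced client from Step 3 instead of `fake_client` would route the call through Baseten; the factory keeps the one-argument call site used in my_llm_pipeline unchanged.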
Next Steps
Once you have instrumented your code, you can manage, evaluate and debug your application: