
Observability for Cerebras with Langfuse

This guide shows you how to integrate Cerebras with Langfuse. Cerebras’s API is fully compatible with OpenAI’s client libraries, so you can use the Langfuse OpenAI drop-in replacement to trace all parts of your application.

What is Cerebras? Cerebras is a high-throughput, low-latency inference platform built on the company’s Wafer-Scale Engine (WSE) processors and CS systems, optimized specifically for token generation at scale.

What is Langfuse? Langfuse is an open source LLM engineering platform that helps teams trace API calls, monitor performance, and debug issues in their AI applications.

Step 1: Install Dependencies

Make sure you have installed the necessary Python packages:

%pip install openai langfuse -q

Step 2: Set Up Environment Variables

import os
 
# Get keys for your project from the project settings page: https://cloud.langfuse.com
os.environ["LANGFUSE_PUBLIC_KEY"] = "pk-lf-..." 
os.environ["LANGFUSE_SECRET_KEY"] = "sk-lf-..." 
os.environ["LANGFUSE_BASE_URL"] = "https://cloud.langfuse.com" # 🇪🇺 EU region
# os.environ["LANGFUSE_BASE_URL"] = "https://us.cloud.langfuse.com" # 🇺🇸 US region
 
# Get your Cerebras API key from https://cloud.cerebras.ai/
os.environ["CEREBRAS_API_KEY"] = "csk-..."

Step 3: Langfuse OpenAI drop-in Replacement

In this step we use the native OpenAI drop-in replacement: instead of import openai, import the traced client with from langfuse.openai import openai.

To start using Cerebras with OpenAI’s client libraries, pass in your Cerebras API key to the api_key option, and change the base_url to https://api.cerebras.ai/v1:

# instead of import openai:
from langfuse.openai import openai
 
client = openai.OpenAI(
  api_key=os.environ.get("CEREBRAS_API_KEY"),
  base_url="https://api.cerebras.ai/v1",
)
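
If your application is asynchronous, the drop-in replacement also wraps OpenAI’s async client. The following is a minimal sketch under that assumption; it reuses the environment variables from Step 2 and the model from the example in Step 4:

import asyncio
from langfuse.openai import openai
 
# The async client is traced the same way as the sync client above
async_client = openai.AsyncOpenAI(
  api_key=os.environ.get("CEREBRAS_API_KEY"),
  base_url="https://api.cerebras.ai/v1",
)
 
async def main():
    response = await async_client.chat.completions.create(
        model="zai-glm-4.6",
        messages=[{"role": "user", "content": "Say hello in one short sentence."}],
    )
    print(response.choices[0].message.content)
 
# In a notebook cell, run: await main()
asyncio.run(main())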

Step 4: Run An Example

The following cell demonstrates how to call Cerebras’s chat model using the traced OpenAI client. All API calls will be automatically traced by Langfuse.

response = client.chat.completions.create(
  model="zai-glm-4.6",
  messages=[
    {"role": "system", "content": "You are a travel agent. Be descriptive and helpful."},
    {"role": "user", "content": "Tell me the top 3 things to do in San Francisco"},
  ],
  name="cerebras-travel-agent"
)
 
print(response.choices[0].message.content)
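
Streaming responses are traced as well; the wrapper records the generation once the stream has been fully consumed. A minimal sketch using the same client and model:

stream = client.chat.completions.create(
  model="zai-glm-4.6",
  messages=[
    {"role": "user", "content": "Give me a one-sentence fun fact about San Francisco."},
  ],
  stream=True,
)
 
# Print tokens as they arrive; Langfuse logs the full output at the end
for chunk in stream:
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")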

Step 5: See Traces in Langfuse

After running the example model call, you can see the traces in Langfuse. You will see detailed information about your Cerebras API calls, including:

  • Request parameters (model, messages, temperature, etc.)
  • Response content
  • Token usage statistics
  • Latency metrics

Langfuse Trace Example

Public example trace link in Langfuse
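
Traces are sent to Langfuse asynchronously in the background. In short-lived scripts, or if you want events to show up right away while experimenting, you can flush the queue explicitly. A minimal sketch using the Langfuse client:

from langfuse import get_client
 
# Blocks until all queued events have been delivered to Langfuse
get_client().flush()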

Interoperability with the Python SDK

You can use this integration together with the Langfuse Python SDK to add additional attributes to your traces.

The @observe() decorator provides a convenient way to automatically wrap your instrumented code, and propagate_attributes adds attributes such as user_id, session_id, tags, and metadata to all spans created within its scope.

from langfuse import observe, propagate_attributes, get_client
 
langfuse = get_client()
 
@observe()
def my_llm_pipeline(input):
    # Add additional attributes (user_id, session_id, metadata, version, tags) to all spans created within this execution scope
    with propagate_attributes(
        user_id="user_123",
        session_id="session_abc",
        tags=["agent", "my-trace"],
        metadata={"email": "user@langfuse.com"},
        version="1.0.0"
    ):
 
        # YOUR APPLICATION CODE HERE
        result = call_llm(input)
 
        # Update the trace input and output
        langfuse.update_current_trace(
            input=input,
            output=result,
        )
 
        return result
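
For illustration, here is one way to wire up the pipeline above. call_llm is a hypothetical helper standing in for "YOUR APPLICATION CODE HERE"; it reuses the traced Cerebras client from Step 3, so the model call is nested inside the my_llm_pipeline trace:

# Hypothetical helper; replace with your own application logic
def call_llm(prompt):
    response = client.chat.completions.create(
        model="zai-glm-4.6",
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content
 
print(my_llm_pipeline("Plan a weekend in San Francisco"))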

Learn more about using the @observe() decorator in the Langfuse Python SDK instrumentation docs.

Next Steps

Once you have instrumented your code, you can manage, evaluate, and debug your application in Langfuse.
