Observability for OpenAI SDK (Python)

Looking for the JS/TS version? Check it out here.

If you use the OpenAI Python SDK, you can use the Langfuse drop-in replacement to get full logging by changing only the import. This works with OpenAI and Azure OpenAI.

- import openai
+ from langfuse.openai import openai

Alternative imports:
+ from langfuse.openai import OpenAI, AsyncOpenAI, AzureOpenAI, AsyncAzureOpenAI

Langfuse automatically tracks:

All prompts/completions with support for streaming, async and functions
Latencies
API Errors
Model usage (tokens) and cost (USD) (learn more)

How it works

Install Langfuse SDK

The integration is compatible with OpenAI SDK versions >=0.27.8. It supports async functions and streaming for OpenAI SDK versions >=1.0.0.

pip install langfuse openai
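
The version thresholds above can be checked programmatically. The following is a minimal sketch using only the standard library; the helper names `parse_version` and `integration_support` are hypothetical and not part of Langfuse or the OpenAI SDK.

```python
def parse_version(version: str) -> tuple[int, ...]:
    """Turn '1.3.0' into (1, 3, 0); pre-release suffixes such as 'rc1' are ignored."""
    parts = []
    for piece in version.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits or 0))
    # Pad short versions such as '1.0' so tuple comparison behaves as expected.
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def integration_support(openai_version: str) -> dict[str, bool]:
    """Apply the thresholds stated in this guide to an OpenAI SDK version string."""
    v = parse_version(openai_version)
    return {
        "compatible": v >= (0, 27, 8),           # integration works at all
        "async_and_streaming": v >= (1, 0, 0),   # async functions and streaming
    }
```

In an application you would likely pass the installed version, for example `integration_support(importlib.metadata.version("openai"))`.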

Switch to Langfuse Wrapped OpenAI SDK

Add Langfuse credentials to your environment variables

.env

LANGFUSE_SECRET_KEY = "sk-lf-..."
LANGFUSE_PUBLIC_KEY = "pk-lf-..."
LANGFUSE_BASE_URL = "https://cloud.langfuse.com" # 🇪🇺 EU region
# Other Langfuse data regions include 🇺🇸 US: https://us.cloud.langfuse.com, 🇯🇵 Japan: https://jp.cloud.langfuse.com and ⚕️ HIPAA: https://hipaa.cloud.langfuse.com
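
Before starting the application, it can help to confirm that the credentials are actually present in the environment. This is a minimal sketch; the helper `missing_langfuse_vars` is hypothetical, and treating LANGFUSE_BASE_URL as optional is an assumption rather than something this guide states.

```python
import os
from typing import Mapping

# The two credential variables from the .env example above.
REQUIRED_VARS = ("LANGFUSE_SECRET_KEY", "LANGFUSE_PUBLIC_KEY")


def missing_langfuse_vars(env: Mapping[str, str] = os.environ) -> list[str]:
    """Return the names of required variables that are unset or blank."""
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]
```

Calling `missing_langfuse_vars()` with no arguments inspects the real process environment; passing a dict makes the check easy to test.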

Change import

- import openai
+ from langfuse.openai import openai

Alternative imports:
+ from langfuse.openai import OpenAI, AsyncOpenAI, AzureOpenAI, AsyncAzureOpenAI

Add Langfuse credentials to your code

openai.langfuse_public_key = "pk-lf-..."
openai.langfuse_secret_key = "sk-lf-..."
openai.langfuse_enabled = True # Default is True, set to False to disable Langfuse
openai.langfuse_host = "https://cloud.langfuse.com" # 🇪🇺 EU region
# Other Langfuse data regions include 🇺🇸 US: https://us.cloud.langfuse.com, 🇯🇵 Japan: https://jp.cloud.langfuse.com and ⚕️ HIPAA: https://hipaa.cloud.langfuse.com

# Set openai key via attribute
openai.api_key = "sk-..."

Optionally, verify the SDK's connection to the server. This check is not recommended for production usage.

from langfuse import get_client

get_client().auth_check()

Use OpenAI SDK as usual

No changes required.

Check out the notebook for end-to-end examples of the integration:

Example notebook

Troubleshooting

Queuing and batching of events

The Langfuse SDKs queue and batch events in the background to reduce the number of network requests and improve overall performance. In a long-running application, this works without any additional configuration.

If you are running a short-lived application, flush the Langfuse client before the application exits so that no queued events are lost.

from langfuse import get_client
from langfuse.openai import openai

# Flush via global client
langfuse = get_client()
langfuse.flush()
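
One way to make sure the flush also happens when the script fails is a try/finally wrapper. This is a sketch of the pattern only; `run_then_flush` is a hypothetical helper, and the `flush` argument stands in for `get_client().flush`.

```python
from typing import Callable, TypeVar

T = TypeVar("T")


def run_then_flush(job: Callable[[], T], flush: Callable[[], None]) -> T:
    """Run `job` and call `flush` afterwards, even if `job` raises."""
    try:
        return job()
    finally:
        # Runs on both success and failure, so queued events are sent
        # before the short-lived process exits.
        flush()
```

In a script this might be used as `run_then_flush(main, get_client().flush)`, where `main` is your own entry point.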

Learn more about queuing and batching of events here.

Assistants API

Tracing of the Assistants API is not supported by this integration because OpenAI Assistants keep server-side state that cannot easily be captured without additional API requests. We added more information on how to best track usage of the Assistants API in this FAQ.

Debug mode

If you are having issues with the integration, you can enable debug mode to get more information about the requests and responses.

from langfuse import Langfuse
from langfuse.openai import openai

# Enable debug via global client
langfuse = Langfuse(debug=True)

Alternatively, you can set the environment variable:

export LANGFUSE_DEBUG=true

Sampling

Use sampling to control the volume of traces collected by the Langfuse server. The sample rate is a fraction between 0.0 and 1.0; a rate of 0.1 keeps roughly 10% of traces.

from langfuse import Langfuse
from langfuse.openai import openai

# Set sampling via global client (default is 1.0)
langfuse = Langfuse(sample_rate=0.1)

Alternatively, you can set the environment variable:

export LANGFUSE_SAMPLE_RATE=0.1
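
To build intuition for what a sample rate of 0.1 means, here is a conceptual sketch of ratio-based sampling keyed on a trace ID. It is an illustration only and not Langfuse's implementation, which this guide does not describe; `keep_trace` and the hashing scheme are assumptions.

```python
import hashlib


def keep_trace(trace_id: str, sample_rate: float) -> bool:
    """Deterministically decide whether a trace falls inside the sampled fraction."""
    if not 0.0 <= sample_rate <= 1.0:
        raise ValueError("sample_rate must be between 0.0 and 1.0")
    digest = hashlib.sha256(trace_id.encode("utf-8")).digest()
    # Use 53 bits of the hash so the bucket is an exact float in [0, 1).
    bucket = (int.from_bytes(digest[:8], "big") >> 11) / 2**53
    return bucket < sample_rate
```

Because the decision depends only on the trace ID, every observation of the same trace gets the same answer, which is the property that makes ratio-based sampling keep traces whole.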

Disable tracing

You may disable sending traces to Langfuse by setting the appropriate flag.

from langfuse import Langfuse
from langfuse.openai import openai

# Disable via global client
langfuse = Langfuse(tracing_enabled=False)

Alternatively, you can set the environment variable:

export LANGFUSE_TRACING_ENABLED=false

Advanced usage

Custom trace properties

You can pass the following properties to any OpenAI method call, for example chat.completions.create:

Property	Description
`name`	Set `name` to identify a specific type of generation.
`metadata`	Set `metadata` with additional information that you want to see in Langfuse.
`trace_id`	See "Use Traces" (below) for more details.
`parent_observation_id`	See "Use Traces" (below) for more details.

Setting trace attributes (session_id, user_id, tags):

You have two options:

Option 1: Via metadata (simplest approach):

from langfuse.openai import openai

result = openai.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "system", "content": "You are a very accurate calculator."},
        {"role": "user", "content": "1 + 1 = "}
    ],
    name="test-chat",
    metadata={
        "langfuse_session_id": "session_123",
        "langfuse_user_id": "user_456",
        "langfuse_tags": ["calculator"],
        "someMetadataKey": "someValue"  # Regular metadata still works
    }
)
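
If several call sites set the same trace attributes, the metadata dictionary can be assembled in one place. This is a small sketch; `langfuse_metadata` is a hypothetical helper, and only the `langfuse_session_id`, `langfuse_user_id` and `langfuse_tags` keys come from the example above.

```python
from typing import Any, Optional, Sequence


def langfuse_metadata(
    session_id: Optional[str] = None,
    user_id: Optional[str] = None,
    tags: Optional[Sequence[str]] = None,
    **extra: Any,
) -> dict[str, Any]:
    """Build a metadata dict that mixes trace attributes with regular metadata."""
    metadata: dict[str, Any] = dict(extra)  # regular metadata passes through unchanged
    if session_id is not None:
        metadata["langfuse_session_id"] = session_id
    if user_id is not None:
        metadata["langfuse_user_id"] = user_id
    if tags:
        metadata["langfuse_tags"] = list(tags)
    return metadata
```

The result can then be passed as `metadata=langfuse_metadata(session_id="session_123", user_id="user_456")` in the call shown above.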

Option 2: Via enclosing span (for more control):

from langfuse import get_client, propagate_attributes
from langfuse.openai import openai

langfuse = get_client()

with langfuse.start_as_current_observation(as_type="span", name="calculator-request") as span:
    with propagate_attributes(
        session_id="session_123",
        user_id="user_456",
        tags=["calculator"]
    ):
        result = openai.chat.completions.create(
            model="gpt-3.5-turbo",
            messages=[
                {"role": "system", "content": "You are a very accurate calculator."},
                {"role": "user", "content": "1 + 1 = "}
            ],
            name="test-chat",
            metadata={"someMetadataKey": "someValue"},
        )

Use Traces

Langfuse Tracing groups multiple observations (any LLM or non-LLM call) into a single trace. By default, this integration creates a single trace for each OpenAI call. Wrapping calls in your own trace lets you:

Add non-OpenAI related observations to the trace.
Group multiple OpenAI calls into a single trace while customizing the trace.
Have more control over the trace structure.
Use all Langfuse Tracing features.

New to Langfuse Tracing? Check out this introduction to the basic concepts.

You can use any of the following options:

Python @observe() decorator - works with both v2 and v3
Use explicit span management - differs between v3 and v2

Option 1: Python Decorator

from langfuse import observe
from langfuse.openai import openai

@observe()
def capital_poem_generator(country):
  capital = openai.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "system", "content": "What is the capital of the country?"},
        {"role": "user", "content": country}],
    name="get-capital",
  ).choices[0].message.content

  poem = openai.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "system", "content": "You are a poet. Create a poem about this city."},
        {"role": "user", "content": capital}],
    name="generate-poem",
  ).choices[0].message.content
  return poem

capital_poem_generator("Bulgaria")

Option 2: Context Managers

from langfuse import get_client, propagate_attributes
from langfuse.openai import openai

langfuse = get_client()

with langfuse.start_as_current_observation(as_type="span", name="capital-poem-generator") as span:
    # Propagate trace attributes to all child observations
    with propagate_attributes(
        user_id="user_123",
        session_id="session_456",
        tags=["poetry", "capital"]
    ):
      capital = openai.chat.completions.create(
          model="gpt-3.5-turbo",
          messages=[
              {"role": "system", "content": "What is the capital of the country?"},
              {"role": "user", "content": "Bulgaria"}],
          name="get-capital",
      ).choices[0].message.content

      poem = openai.chat.completions.create(
          model="gpt-3.5-turbo",
          messages=[
              {"role": "system", "content": "You are a poet. Create a poem about this city."},
              {"role": "user", "content": capital}],
          name="generate-poem",
      ).choices[0].message.content

OpenAI token usage on streamed responses

OpenAI returns token usage on streamed responses only when the include_usage parameter in stream_options is set to True. To use the token counts reported directly by OpenAI, pass {"include_usage": True} in the stream_options argument.

With include_usage set to True, OpenAI sends the token usage in a final chunk whose choices list is empty. Check that choices is non-empty before indexing into it; otherwise the final chunk raises an IndexError in your loop.

from langfuse import get_client
from langfuse.openai import openai

client = openai.OpenAI()

stream = client.chat.completions.create(
  model="gpt-4o",
  messages=[{"role": "user", "content": "How are you?"}],
  stream=True,
  stream_options={"include_usage": True},
)

result = ""

for chunk in stream:
  # Check if chunk choices are not empty. OpenAI returns token usage in a final chunk with an empty choices list.
  if chunk.choices:
    result += chunk.choices[0].delta.content or ""

# Flush via global client
get_client().flush()
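
The chunk-handling loop above can be exercised offline with stand-in objects. In this sketch the `Delta`, `Choice`, `Usage` and `Chunk` dataclasses are simplified, hypothetical stand-ins that mimic only the attributes the loop touches; they are not the OpenAI SDK types.

```python
from dataclasses import dataclass, field
from typing import Iterable, Optional


@dataclass
class Delta:
    content: Optional[str] = None


@dataclass
class Choice:
    delta: Delta


@dataclass
class Usage:
    prompt_tokens: int
    completion_tokens: int
    total_tokens: int


@dataclass
class Chunk:
    choices: list[Choice] = field(default_factory=list)
    usage: Optional[Usage] = None


def collect_stream(stream: Iterable[Chunk]) -> tuple[str, Optional[Usage]]:
    """Concatenate streamed text and pick up the usage from the final chunk."""
    text, usage = "", None
    for chunk in stream:
        if chunk.choices:  # the usage-only chunk has an empty choices list
            text += chunk.choices[0].delta.content or ""
        if chunk.usage is not None:
            usage = chunk.usage
    return text, usage


fake_stream = [
    Chunk(choices=[Choice(Delta("I am "))]),
    Chunk(choices=[Choice(Delta("fine."))]),
    Chunk(choices=[Choice(Delta(None))]),       # a chunk without content
    Chunk(choices=[], usage=Usage(10, 4, 14)),  # final usage-only chunk
]
text, usage = collect_stream(fake_stream)
```

Dropping the `if chunk.choices` guard makes the last stand-in chunk raise an IndexError, which is the failure mode described above.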

OpenAI Beta APIs

Since OpenAI beta APIs are changing frequently across versions, we fully support only the stable APIs in the OpenAI SDK. If you are using a beta API, you can still use the Langfuse SDK by wrapping the OpenAI SDK manually with the @observe() decorator.

Structured Output

For structured output parsing, you have two fully instrumented options depending on your openai Python SDK version:

openai>=1.92.0 (recommended): use client.chat.completions.parse(...). OpenAI graduated parse and stream out of beta in v1.92.0, and Langfuse wraps the stable openai.resources.chat.completions.Completions.parse (and the async variant). You can pass a Pydantic model directly via response_format and still set Langfuse attributes such as name, metadata, langfuse_session_id, etc.
openai<1.92.0 (legacy): the parse helper is only available under client.beta.chat.completions.parse(...). Langfuse also wraps the beta path on these older versions, so attributes like name and metadata work there too.

On openai>=1.92.0, calls to client.beta.chat.completions.parse(...) are re-routed to the stable parse method by the OpenAI SDK, so they remain instrumented. Prefer client.chat.completions.parse(...) in new code.

from langfuse import get_client
from langfuse.openai import openai
from pydantic import BaseModel

class CalendarEvent(BaseModel):
    name: str
    date: str
    participants: list[str]

completion = openai.chat.completions.parse(
    model="gpt-4o-2024-08-06",
    messages=[
        {"role": "system", "content": "Extract the event information."},
        {
            "role": "user",
            "content": "Alice and Bob are going to a science fair on Friday.",
        },
    ],
    response_format=CalendarEvent,
    name="extract-calendar-event",
    metadata={"langfuse_tags": ["structured-output"]},
)

print(completion.choices[0].message.parsed)

# Flush via global client
get_client().flush()

If you cannot upgrade the OpenAI SDK and want to stay on chat.completions.create, you can still pass a Pydantic model by converting it with type_to_response_format_param from the OpenAI SDK:

from langfuse.openai import openai
from openai.lib._parsing._completions import type_to_response_format_param
from pydantic import BaseModel

class CalendarEvent(BaseModel):
    name: str
    date: str
    participants: list[str]

completion = openai.chat.completions.create(
    model="gpt-4o-2024-08-06",
    messages=[
        {"role": "system", "content": "Extract the event information."},
        {
            "role": "user",
            "content": "Alice and Bob are going to a science fair on Friday.",
        },
    ],
    response_format=type_to_response_format_param(CalendarEvent),
)
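
For orientation, a JSON-schema response format for the CalendarEvent model looks roughly like the hand-written dictionary below. This is a sketch: the exact dictionary produced by type_to_response_format_param may differ in detail, and `parse_calendar_event` is a hypothetical helper for decoding the reply when no Pydantic parsing is available.

```python
import json

# Hand-written approximation of a structured-output response_format payload.
calendar_event_format = {
    "type": "json_schema",
    "json_schema": {
        "name": "CalendarEvent",
        "strict": True,
        "schema": {
            "type": "object",
            "properties": {
                "name": {"type": "string"},
                "date": {"type": "string"},
                "participants": {"type": "array", "items": {"type": "string"}},
            },
            "required": ["name", "date", "participants"],
            "additionalProperties": False,
        },
    },
}


def parse_calendar_event(content: str) -> dict:
    """Decode the model's JSON reply and check that the required keys are present."""
    event = json.loads(content)
    required = calendar_event_format["json_schema"]["schema"]["required"]
    missing = [key for key in required if key not in event]
    if missing:
        raise ValueError(f"missing keys: {missing}")
    return event
```

With chat.completions.create the reply arrives as a JSON string in `completion.choices[0].message.content`, so a decoding step like this replaces the `parsed` attribute that the parse helper provides.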

Assistants API

Tracing of the Assistants API is not supported by this integration because OpenAI Assistants keep server-side state that cannot easily be captured without additional API requests. Check out this notebook for an end-to-end example of how to best track usage of the Assistants API in Langfuse.

Tracking of OpenAI API Errors

Langfuse automatically tracks and monitors OpenAI API errors if you use the native integration. They are captured via the level and statusMessage fields (see docs).

Learn more about how to get started here.

- import openai
+ from langfuse.openai import openai

# Cause an error by attempting to use a host that does not exist.
openai.base_url = "https://example.com"

country = openai.chat.completions.create(
  name="will-error",
  model="gpt-3.5-turbo",
  messages=[
      {"role": "user", "content": "How are you?"}],
)

Throws error 👆
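
Conceptually, capturing an error means recording the failure on the observation and then letting the exception propagate to the caller. The sketch below illustrates that idea with plain dictionaries; it is not Langfuse's internal code, and `call_and_record`, the `DEFAULT` and `ERROR` level values, and the message format are assumptions made for illustration.

```python
from typing import Any, Callable


def call_and_record(call: Callable[[], Any], observations: list[dict]) -> Any:
    """Run `call`, append a record with level and statusMessage, re-raise on failure."""
    record = {"level": "DEFAULT", "statusMessage": None}
    try:
        return call()
    except Exception as exc:
        record["level"] = "ERROR"
        record["statusMessage"] = f"{type(exc).__name__}: {exc}"
        raise  # the application still sees the original exception
    finally:
        observations.append(record)
```

The point of the pattern is that tracking does not swallow the error: your own exception handling around the OpenAI call keeps working as before.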
