
Instrumentation

To instrument your application to send traces to Langfuse, you can either use native library instrumentations that work out of the box, or use custom instrumentation methods for fine-grained control.

Custom Instrumentation

There are three main ways to create spans with the Langfuse Python SDK. All of them are fully interoperable with each other.
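
At a glance: the @observe() decorator wraps a function, context managers (start_as_current_span / start_as_current_generation) trace a block of code, and manually created observations give you full control over when a span starts and ends. A minimal sketch of the three approaches (the manually managed form assumes the SDK's start_span() and .end() methods and must be ended explicitly):

from langfuse import observe, get_client

langfuse = get_client()

# 1. Decorator: traces the function call, its input and its output
@observe()
def handle_request(query):
    return query.upper()

# 2. Context manager: the span is active for the enclosed block
with langfuse.start_as_current_span(name="request-handler"):
    handle_request("hello")  # nested under "request-handler"

# 3. Manually created span: you are responsible for ending it
manual_span = langfuse.start_span(name="manual-span")
# ... do work ...
manual_span.end()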

The @observe() decorator provides a convenient way to automatically trace function executions, including capturing their inputs, outputs, execution time, and any errors. It supports both synchronous and asynchronous functions.

from langfuse import observe
 
@observe()
def my_data_processing_function(data, parameter):
    # ... processing logic ...
    return {"processed_data": data, "status": "ok"}
 
@observe(name="llm-call", as_type="generation")
async def my_async_llm_call(prompt_text):
    # ... async LLM call ...
    return "LLM response"

Parameters:

  • name: Optional[str]: Custom name for the created span/generation. Defaults to the function name.
  • as_type: Optional[Literal["generation"]]: If set to "generation", a Langfuse generation object is created, suitable for LLM calls. Otherwise, a regular span is created.
  • capture_input: bool: Whether to capture function arguments as input. Defaults to env var LANGFUSE_OBSERVE_DECORATOR_IO_CAPTURE_ENABLED or True if not set.
  • capture_output: bool: Whether to capture function return value as output. Defaults to env var LANGFUSE_OBSERVE_DECORATOR_IO_CAPTURE_ENABLED or True if not set.
  • transform_to_string: Optional[Callable[[Iterable], str]]: For functions that return generators (sync or async), this callable can be provided to transform the collected chunks into a single string for the output field. If not provided, and all chunks are strings, they will be concatenated. Otherwise, the list of chunks is stored.
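
For functions that return generators, a minimal sketch of transform_to_string (the chunk shape here is hypothetical; adapt the callable to whatever your stream yields):

from langfuse import observe

@observe(as_type="generation",
         transform_to_string=lambda chunks: "".join(c["text"] for c in chunks))
def stream_response(prompt_text):
    # ... stream from an LLM; here we yield example chunk objects ...
    yield {"text": "Hello"}
    yield {"text": ", world"}

# Consuming the generator collects the chunks; the callable joins them
# into "Hello, world" for the observation's output field.
for chunk in stream_response("Say hello"):
    print(chunk["text"], end="")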

Trace Context and Special Keyword Arguments:

The @observe decorator automatically propagates the OTEL trace context. If a decorated function is called from within an active Langfuse span (or another OTEL span), the new observation will be nested correctly.
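
For example, a minimal sketch of a decorated function being nested under a span opened with a context manager:

from langfuse import observe, get_client

langfuse = get_client()

@observe()
def summarize(text):
    return text[:100]

with langfuse.start_as_current_span(name="handle-request"):
    # The observation created by @observe is nested under "handle-request"
    summarize("A long document ...")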

You can also pass special keyword arguments to a decorated function to control its tracing behavior:

  • langfuse_trace_id: str: Explicitly set the trace ID for this function call. Must be a valid W3C Trace Context trace ID (32-char hex). If you have a trace ID from an external system, you can use Langfuse.create_trace_id(seed=external_trace_id) to generate a valid deterministic ID.
  • langfuse_parent_observation_id: str: Explicitly set the parent observation ID. Must be a valid W3C Trace Context span ID (16-char hex).

@observe()
def my_function(a, b):
    return a + b
 
# Call with a specific trace context
my_function(1, 2, langfuse_trace_id="1234567890abcdef1234567890abcdef")

The @observe decorator captures the args, kwargs, and return value of decorated functions by default. This may lead to performance issues in your application if those objects are large or deeply nested. To avoid this, disable input/output capture for a specific function by passing capture_input=False / capture_output=False, or disable it globally by setting the environment variable LANGFUSE_OBSERVE_DECORATOR_IO_CAPTURE_ENABLED=False.
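
For example, a minimal sketch that disables capture for a single function:

from langfuse import observe

@observe(capture_input=False, capture_output=False)
def process_large_payload(payload):
    # Neither `payload` nor the return value is stored on the observation
    return {"status": "ok", "items": len(payload)}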

Nesting Observations

The function call hierarchy is automatically captured by the @observe decorator and reflected in the trace.

from langfuse import observe
 
@observe
def my_data_processing_function(data, parameter):
    # ... processing logic ...
    return {"processed_data": data, "status": "ok"}
 
 
@observe
def main_function(data, parameter):
    return my_data_processing_function(data, parameter)

Updating Observations

You can update observations with new information as your code executes.

  • For spans/generations created via context managers or assigned to variables: use the .update() method on the object.
  • To update the currently active observation in the context (without needing a direct reference to it): use langfuse.update_current_span() or langfuse.update_current_generation().

LangfuseSpan.update() / LangfuseGeneration.update() parameters:

  • input: Optional[Any]: Input data for the operation. (Applies to: both)
  • output: Optional[Any]: Output data from the operation. (Applies to: both)
  • metadata: Optional[Any]: Additional metadata (JSON-serializable). (Applies to: both)
  • version: Optional[str]: Version identifier for the code/component. (Applies to: both)
  • level: Optional[SpanLevel]: Severity: "DEBUG", "DEFAULT", "WARNING", "ERROR". (Applies to: both)
  • status_message: Optional[str]: A message describing the status, especially for errors. (Applies to: both)
  • completion_start_time: Optional[datetime]: Timestamp when the LLM started generating the completion (streaming). (Applies to: generations only)
  • model: Optional[str]: Name/identifier of the AI model used. (Applies to: generations only)
  • model_parameters: Optional[Dict[str, MapValue]]: Parameters used for the model call (e.g., temperature). (Applies to: generations only)
  • usage_details: Optional[Dict[str, int]]: Token usage (e.g., {"input_tokens": 10, "output_tokens": 20}). (Applies to: generations only)
  • cost_details: Optional[Dict[str, float]]: Cost information (e.g., {"total_cost": 0.0023}). (Applies to: generations only)
  • prompt: Optional[PromptClient]: Associated PromptClient object from Langfuse prompt management. (Applies to: generations only)

from langfuse import get_client
 
langfuse = get_client()
 
with langfuse.start_as_current_generation(name="llm-call", model="gpt-3.5-turbo") as gen:
    gen.update(input={"prompt": "Why is the sky blue?"})
    # ... make LLM call ...
    response_text = "Rayleigh scattering..."
    gen.update(
        output=response_text,
        usage_details={"input_tokens": 5, "output_tokens": 50},
        metadata={"confidence": 0.9}
    )
 
# Alternatively, update the current observation in context:
with langfuse.start_as_current_span(name="data-processing"):
    # ... some processing ...
    langfuse.update_current_span(metadata={"step1_complete": True})
    # ... more processing ...
    langfuse.update_current_span(output={"result": "final_data"})

Setting Trace Attributes

Trace-level attributes apply to the entire trace, not just a single observation. You can set or update these using:

  • The .update_trace() method on any LangfuseSpan or LangfuseGeneration object within that trace.
  • langfuse.update_current_trace() to update the trace associated with the currently active observation.

Trace attribute parameters:

  • name: Optional[str]: Name for the trace.
  • user_id: Optional[str]: ID of the user associated with this trace.
  • session_id: Optional[str]: Session identifier for grouping related traces.
  • version: Optional[str]: Version of your application/service for this trace.
  • input: Optional[Any]: Overall input for the entire trace.
  • output: Optional[Any]: Overall output for the entire trace.
  • metadata: Optional[Any]: Additional metadata for the trace.
  • tags: Optional[List[str]]: List of tags to categorize the trace.
  • public: Optional[bool]: Whether the trace should be publicly accessible (if configured).

Example: Setting Multiple Trace Attributes

from langfuse import get_client
 
langfuse = get_client()
 
with langfuse.start_as_current_span(name="initial-operation") as span:
    # Set trace attributes early
    span.update_trace(
        user_id="user_xyz",
        session_id="session_789",
        tags=["beta-feature", "llm-chain"]
    )
    # ...
    # Later, from another span in the same trace:
    with span.start_as_current_generation(name="final-generation") as gen:
        # ...
        langfuse.update_current_trace(output={"final_status": "success"}, public=True)

Trace Input/Output Behavior

In v3, trace input and output are automatically set from the root observation (first span/generation) by default. This differs from v2 where integrations could set trace-level inputs/outputs directly.

Default Behavior

from langfuse import get_client
 
langfuse = get_client()
 
with langfuse.start_as_current_span(
    name="user-request",
    input={"query": "What is the capital of France?"}  # This becomes the trace input
) as root_span:
 
    with langfuse.start_as_current_generation(
        name="llm-call",
        model="gpt-4o",
        input={"messages": [{"role": "user", "content": "What is the capital of France?"}]}
    ) as gen:
        response = "Paris is the capital of France."
        gen.update(output=response)
        # LLM generation input/output are separate from trace input/output
 
    root_span.update(output={"answer": "Paris"})  # This becomes the trace output

Override Default Behavior

If you need different trace inputs/outputs than the root observation, explicitly set them:

from langfuse import get_client
 
langfuse = get_client()
 
with langfuse.start_as_current_span(name="complex-pipeline") as root_span:
    # Root span has its own input/output
    root_span.update(input="Step 1 data", output="Step 1 result")
 
    # But trace should have different input/output (e.g., for LLM-as-a-judge)
    root_span.update_trace(
        input={"original_query": "User's actual question"},
        output={"final_answer": "Complete response", "confidence": 0.95}
    )
 
    # Now trace input/output are independent of root span input/output

Critical for LLM-as-a-Judge Features

LLM-as-a-judge and evaluation features typically rely on trace-level inputs and outputs. Make sure to set these appropriately:

from langfuse import observe, get_client
 
langfuse = get_client()
 
@observe()
def process_user_query(user_question: str):
    # LLM processing...
    answer = call_llm(user_question)
 
    # Explicitly set trace input/output for evaluation features
    langfuse.update_current_trace(
        input={"question": user_question},
        output={"answer": answer}
    )
 
    return answer

Trace and Observation IDs

Langfuse uses W3C Trace Context compliant IDs:

  • Trace IDs: 32-character lowercase hexadecimal string (16 bytes).
  • Observation IDs (Span IDs): 16-character lowercase hexadecimal string (8 bytes).

You can retrieve these IDs:

  • langfuse.get_current_trace_id(): Gets the trace ID of the currently active observation.
  • langfuse.get_current_observation_id(): Gets the ID of the currently active observation.
  • span_obj.trace_id and span_obj.id: Access IDs directly from a LangfuseSpan or LangfuseGeneration object.

For scenarios where you need to generate IDs outside of an active trace (e.g., to link scores to traces/observations that will be created later, or to correlate with external systems), use:

  • Langfuse.create_trace_id(seed: Optional[str] = None) (static method): Generates a new trace ID. If a seed is provided, the ID is deterministic; use the same seed to get the same ID. This is useful for correlating external IDs with Langfuse traces.

from langfuse import get_client, Langfuse
 
langfuse = get_client()
 
# Get current IDs
with langfuse.start_as_current_span(name="my-op") as current_op:
    trace_id = langfuse.get_current_trace_id()
    observation_id = langfuse.get_current_observation_id()
    print(f"Current Trace ID: {trace_id}, Current Observation ID: {observation_id}")
    print(f"From object: Trace ID: {current_op.trace_id}, Observation ID: {current_op.id}")
 
# Generate IDs deterministically
external_request_id = "req_12345"
deterministic_trace_id = Langfuse.create_trace_id(seed=external_request_id)
print(f"Deterministic Trace ID for {external_request_id}: {deterministic_trace_id}")

Linking to Existing Traces (Trace Context)

If you have a trace_id (and optionally a parent_span_id) from an external source (e.g., another service, a batch job), you can link new observations to it using the trace_context parameter. Note that OpenTelemetry offers native cross-service context propagation, so this is not necessarily required for calls between services that are instrumented with OTEL.

from langfuse import get_client
 
langfuse = get_client()
 
existing_trace_id = "abcdef1234567890abcdef1234567890" # From an upstream service
existing_parent_span_id = "fedcba0987654321" # Optional parent span in that trace
 
with langfuse.start_as_current_span(
    name="process-downstream-task",
    trace_context={
        "trace_id": existing_trace_id,
        "parent_span_id": existing_parent_span_id # If None, this becomes a root span in the existing trace
    }
) as span:
    # This span is now part of the trace `existing_trace_id`
    # and a child of `existing_parent_span_id` if provided.
    print(f"This span's trace_id: {span.trace_id}") # Will be existing_trace_id
    pass

Client Management

flush()

Manually triggers the sending of all buffered observations (spans, generations, scores, media metadata) to the Langfuse API. This is useful in short-lived scripts or before exiting an application to ensure all data is persisted.

from langfuse import get_client
 
langfuse = get_client()
# ... create traces and observations ...
langfuse.flush() # Ensures all pending data is sent

The flush() method blocks until the queued data is processed by the respective background threads.

shutdown()

Gracefully shuts down the Langfuse client. This includes:

  1. Flushing all buffered data (similar to flush()).
  2. Waiting for background threads (for data ingestion and media uploads) to finish their current tasks and terminate.

It’s crucial to call shutdown() before your application exits to prevent data loss and ensure clean resource release. The SDK automatically registers an atexit hook to call shutdown() on normal program termination, but manual invocation is recommended in scenarios like:

  • Long-running daemons or services when they receive a shutdown signal.
  • Applications where atexit might not reliably trigger (e.g., certain serverless environments or forceful terminations).

from langfuse import get_client
 
langfuse = get_client()
# ... application logic ...
 
# Before exiting:
langfuse.shutdown()

Native Instrumentations

The Langfuse Python SDK has native integrations for the OpenAI and LangChain SDKs. You can also use any other OTel-based instrumentation library to automatically trace your calls in Langfuse.

OpenAI Integration

Langfuse offers a drop-in replacement for the OpenAI Python SDK to automatically trace all your OpenAI API calls. Simply change your import statement:

- import openai
+ from langfuse.openai import openai
 
# Your existing OpenAI code continues to work as is
# For example:
# client = openai.OpenAI()
# completion = client.chat.completions.create(...)

What’s automatically captured:

  • Requests & Responses: All prompts/completions, including support for streaming, async operations, and function/tool calls.
  • Timings: Latencies for API calls.
  • Errors: API errors are captured with their details.
  • Model Usage: Token counts (input, output, total).
  • Cost: Estimated cost in USD (based on model and token usage).
  • Media: Input audio and output audio from speech-to-text and text-to-speech endpoints.

The integration is fully interoperable with @observe and manual tracing methods (start_as_current_span, etc.). If an OpenAI call is made within an active Langfuse span, the OpenAI generation will be correctly nested under it.
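
For example, a minimal sketch of an OpenAI call nested under a decorated function (using the drop-in import shown above):

from langfuse import observe
from langfuse.openai import openai

client = openai.OpenAI()

@observe()  # creates a span for the function call
def answer_question(question: str) -> str:
    # The OpenAI generation is nested under the span created by @observe
    completion = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": question}],
    )
    return completion.choices[0].message.content

answer_question("What is OpenTelemetry?")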

Passing Langfuse arguments to OpenAI calls:

You can pass Langfuse-specific arguments directly to OpenAI client methods. These will be used to enrich the trace data.

from langfuse import get_client
from langfuse.openai import openai
 
langfuse = get_client()
 
client = openai.OpenAI()
 
with langfuse.start_as_current_span(name="qna-bot-openai") as span:
    langfuse.update_current_trace(tags=["qna-bot-openai"])
 
    # This will be traced as a Langfuse generation
    response = client.chat.completions.create(
        name="qna-bot-openai",  # Custom name for this generation in Langfuse
        metadata={"user_tier": "premium", "request_source": "web_api"}, # will be added to the Langfuse generation
        model="gpt-4o",
        messages=[{"role": "user", "content": "What is OpenTelemetry?"}],
    )

Setting trace attributes via metadata:

You can set trace attributes (session_id, user_id, tags) directly on OpenAI calls using special fields in the metadata parameter:

from langfuse.openai import openai
 
client = openai.OpenAI()
 
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello"}],
    metadata={
        "langfuse_session_id": "session_123",
        "langfuse_user_id": "user_456",
        "langfuse_tags": ["production", "chat-bot"],
        "custom_field": "additional metadata"  # Regular metadata fields work too
    }
)

The special metadata fields are:

  • langfuse_session_id: Sets the session ID for the trace
  • langfuse_user_id: Sets the user ID for the trace
  • langfuse_tags: Sets tags for the trace (should be a list of strings)

Supported Langfuse arguments: name, metadata, langfuse_prompt
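
For example, a sketch of linking a generation to a managed prompt via langfuse_prompt (the prompt name "qa-prompt" and its question variable are hypothetical; this assumes such a text prompt exists in Langfuse prompt management):

from langfuse import get_client
from langfuse.openai import openai

langfuse = get_client()
client = openai.OpenAI()

# Hypothetical prompt; returns a PromptClient from Langfuse prompt management
prompt = langfuse.get_prompt("qa-prompt")

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": prompt.compile(question="What is OpenTelemetry?")}],
    langfuse_prompt=prompt,  # links this generation to the managed prompt version
)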

Learn more in the OpenAI integration documentation.

Langchain Integration

Langfuse provides a callback handler for Langchain to trace its operations.

Setup:

Initialize the CallbackHandler and add it to your Langchain calls, either globally or per-call.

from langfuse import get_client
from langfuse.langchain import CallbackHandler
from langchain_openai import ChatOpenAI # Example LLM
from langchain_core.prompts import ChatPromptTemplate
 
langfuse = get_client()
 
# Initialize the Langfuse handler
langfuse_handler = CallbackHandler()
 
# Example: Using it with an LLM call
llm = ChatOpenAI(model_name="gpt-4o")
prompt = ChatPromptTemplate.from_template("Tell me a joke about {topic}")
chain = prompt | llm
 
with langfuse.start_as_current_span(name="joke-chain") as span:
    langfuse.update_current_trace(tags=["joke-chain"])
 
    response = chain.invoke({"topic": "cats"}, config={"callbacks": [langfuse_handler]})
    print(response)

Setting trace attributes via metadata:

You can set trace attributes (session_id, user_id, tags) directly during chain invocation using special fields in the metadata configuration:

from langfuse.langchain import CallbackHandler
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
 
# Initialize the Langfuse handler
langfuse_handler = CallbackHandler()
 
# Create your LangChain components
llm = ChatOpenAI(model_name="gpt-4o")
prompt = ChatPromptTemplate.from_template("Tell me a joke about {topic}")
chain = prompt | llm
 
# Set trace attributes via metadata in chain invocation
response = chain.invoke(
    {"topic": "cats"},
    config={
        "callbacks": [langfuse_handler],
        "metadata": {
            "langfuse_session_id": "session_123",
            "langfuse_user_id": "user_456",
            "langfuse_tags": ["production", "humor-bot"],
            "custom_field": "additional metadata"  # Regular metadata fields work too
        }
    }
)

The special metadata fields are:

  • langfuse_session_id: Sets the session ID for the trace
  • langfuse_user_id: Sets the user ID for the trace
  • langfuse_tags: Sets tags for the trace (should be a list of strings)

You can also pass update_trace=True when initializing the CallbackHandler to force a trace update with the chain's input, output, and metadata.
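
For example:

from langfuse.langchain import CallbackHandler

# Also update the trace with the chain's input, output and metadata
langfuse_handler = CallbackHandler(update_trace=True)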

What’s captured:

The callback handler maps various Langchain events to Langfuse observations:

  • Chains (on_chain_start, on_chain_end, on_chain_error): Traced as spans.
  • LLMs (on_llm_start, on_llm_end, on_llm_error, on_chat_model_start): Traced as generations, capturing model name, prompts, responses, and usage if available from the LLM provider.
  • Tools (on_tool_start, on_tool_end, on_tool_error): Traced as spans, capturing tool input and output.
  • Retrievers (on_retriever_start, on_retriever_end, on_retriever_error): Traced as spans, capturing the query and retrieved documents.
  • Agents (on_agent_action, on_agent_finish): Agent actions and final finishes are captured within their parent chain/agent span.

Langfuse attempts to parse model names, usage, and other relevant details from the information provided by Langchain. The metadata argument in Langchain calls can be used to pass additional information to Langfuse, including langfuse_prompt to link with managed prompts.

Learn more in the Langchain integration documentation.

Third-party integrations

The Langfuse SDK seamlessly integrates with any third-party library that uses OpenTelemetry instrumentation. When these libraries emit spans, they are automatically captured and properly nested within your trace hierarchy. This enables unified tracing across your entire application stack without requiring any additional configuration.

For example, if you’re using OpenTelemetry-instrumented databases, HTTP clients, or other services alongside your LLM operations, all these spans will be correctly organized within your traces in Langfuse.

You can use any third-party, OTEL-based instrumentation library for Anthropic to automatically trace all your Anthropic API calls in Langfuse.

In this example, we are using the opentelemetry-instrumentation-anthropic library.

from anthropic import Anthropic
from opentelemetry.instrumentation.anthropic import AnthropicInstrumentor
 
from langfuse import get_client
 
# This will automatically emit OTEL-spans for all Anthropic API calls
AnthropicInstrumentor().instrument()
 
langfuse = get_client()
anthropic_client = Anthropic()
 
with langfuse.start_as_current_span(name="myspan"):
    # This will be traced as a Langfuse generation nested under the current span
    message = anthropic_client.messages.create(
        model="claude-3-7-sonnet-20250219",
        max_tokens=1024,
        messages=[{"role": "user", "content": "Hello, Claude"}],
    )
 
    print(message.content)
 
# Flush events to Langfuse in short-lived applications
langfuse.flush()

Learn more in the Anthropic integration documentation.
