Instrumentation
To instrument your application to send traces to Langfuse, you can use either native library instrumentations that work out of the box, or use custom instrumentation methods for fine-grained control.
Custom Instrumentation
There are three main ways to create spans with the Langfuse Python SDK. All of them are fully interoperable with each other.
The @observe() decorator provides a convenient way to automatically trace function executions, including capturing their inputs, outputs, execution time, and any errors. It supports both synchronous and asynchronous functions.
from langfuse import observe

@observe()
def my_data_processing_function(data, parameter):
    # ... processing logic ...
    return {"processed_data": data, "status": "ok"}

@observe(name="llm-call", as_type="generation")
async def my_async_llm_call(prompt_text):
    # ... async LLM call ...
    return "LLM response"
Parameters:

- name: Optional[str]: Custom name for the created span/generation. Defaults to the function name.
- as_type: Optional[Literal["generation"]]: If set to "generation", a Langfuse generation object is created, suitable for LLM calls. Otherwise, a regular span is created.
- capture_input: bool: Whether to capture function arguments as input. Defaults to the env var LANGFUSE_OBSERVE_DECORATOR_IO_CAPTURE_ENABLED, or True if not set.
- capture_output: bool: Whether to capture the function return value as output. Defaults to the env var LANGFUSE_OBSERVE_DECORATOR_IO_CAPTURE_ENABLED, or True if not set.
- transform_to_string: Optional[Callable[[Iterable], str]]: For functions that return generators (sync or async), this callable can be provided to transform the collected chunks into a single string for the output field. If not provided and all chunks are strings, they are concatenated; otherwise, the list of chunks is stored (see the sketch after this list).
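For example, a minimal sketch of a few of these parameters in use; the function names, payloads, and chunk values below are illustrative:

from langfuse import observe

# Skip recording a potentially large input payload, but still trace the call
@observe(name="bulk-ingest", capture_input=False)
def ingest_documents(documents):
    return {"ingested": len(documents)}

# A generator traced as a generation; collected chunks are joined into one output string
@observe(as_type="generation", transform_to_string=lambda chunks: "".join(chunks))
def stream_completion(prompt_text):
    # ... yield string chunks, e.g. from a streaming LLM response ...
    for chunk in ["Hello", ", ", "world"]:
        yield chunk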
Trace Context and Special Keyword Arguments:

The @observe decorator automatically propagates the OTEL trace context. If a decorated function is called from within an active Langfuse span (or another OTEL span), the new observation will be nested correctly.

You can also pass special keyword arguments to a decorated function to control its tracing behavior:

- langfuse_trace_id: str: Explicitly set the trace ID for this function call. Must be a valid W3C Trace Context trace ID (32-char hex). If you have a trace ID from an external system, you can use Langfuse.create_trace_id(seed=external_trace_id) to generate a valid deterministic ID.
- langfuse_parent_observation_id: str: Explicitly set the parent observation ID. Must be a valid W3C Trace Context span ID (16-char hex).
@observe()
def my_function(a, b):
    return a + b

# Call with a specific trace context
my_function(1, 2, langfuse_trace_id="1234567890abcdef1234567890abcdef")
By default, the observe decorator captures the args, kwargs, and return value of decorated functions. This may lead to performance issues in your application if these objects are large or deeply nested. To avoid this, explicitly disable function IO capture on the decorated function by passing capture_input / capture_output with value False, or globally by setting the environment variable LANGFUSE_OBSERVE_DECORATOR_IO_CAPTURE_ENABLED=False.
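A minimal sketch of the per-function option; the function and its payload are illustrative:

from langfuse import observe

# Trace timings and errors, but do not record the (potentially huge) payload
@observe(capture_input=False, capture_output=False)
def process_large_payload(payload):
    # ... heavy processing on a large or deeply nested object ...
    return {"status": "ok"}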
Nesting Observations
The function call hierarchy is automatically captured by the @observe decorator and reflected in the trace.
from langfuse import observe

@observe
def my_data_processing_function(data, parameter):
    # ... processing logic ...
    return {"processed_data": data, "status": "ok"}

@observe
def main_function(data, parameter):
    return my_data_processing_function(data, parameter)
Updating Observations
You can update observations with new information as your code executes.
- For spans/generations created via context managers or assigned to variables: use the .update() method on the object.
- To update the currently active observation in the context (without needing a direct reference to it): use langfuse.update_current_span() or langfuse.update_current_generation().
LangfuseSpan.update() / LangfuseGeneration.update() parameters:
Parameter | Type | Description | Applies To |
---|---|---|---|
input | Optional[Any] | Input data for the operation. | Both |
output | Optional[Any] | Output data from the operation. | Both |
metadata | Optional[Any] | Additional metadata (JSON-serializable). | Both |
version | Optional[str] | Version identifier for the code/component. | Both |
level | Optional[SpanLevel] | Severity: "DEBUG", "DEFAULT", "WARNING", "ERROR". | Both |
status_message | Optional[str] | A message describing the status, especially for errors. | Both |
completion_start_time | Optional[datetime] | Timestamp when the LLM started generating the completion (streaming). | Generation |
model | Optional[str] | Name/identifier of the AI model used. | Generation |
model_parameters | Optional[Dict[str, MapValue]] | Parameters used for the model call (e.g., temperature). | Generation |
usage_details | Optional[Dict[str, int]] | Token usage (e.g., {"input_tokens": 10, "output_tokens": 20}). | Generation |
cost_details | Optional[Dict[str, float]] | Cost information (e.g., {"total_cost": 0.0023}). | Generation |
prompt | Optional[PromptClient] | Associated PromptClient object from Langfuse prompt management. | Generation |
from langfuse import get_client

langfuse = get_client()

with langfuse.start_as_current_generation(name="llm-call", model="gpt-3.5-turbo") as gen:
    gen.update(input={"prompt": "Why is the sky blue?"})

    # ... make LLM call ...
    response_text = "Rayleigh scattering..."

    gen.update(
        output=response_text,
        usage_details={"input_tokens": 5, "output_tokens": 50},
        metadata={"confidence": 0.9}
    )

# Alternatively, update the current observation in context:
with langfuse.start_as_current_span(name="data-processing"):
    # ... some processing ...
    langfuse.update_current_span(metadata={"step1_complete": True})
    # ... more processing ...
    langfuse.update_current_span(output={"result": "final_data"})
Setting Trace Attributes
Trace-level attributes apply to the entire trace, not just a single observation. You can set or update these using:
- The .update_trace() method on any LangfuseSpan or LangfuseGeneration object within that trace.
- langfuse.update_current_trace() to update the trace associated with the currently active observation.
Trace attribute parameters:
Parameter | Type | Description |
---|---|---|
name | Optional[str] | Name for the trace. |
user_id | Optional[str] | ID of the user associated with this trace. |
session_id | Optional[str] | Session identifier for grouping related traces. |
version | Optional[str] | Version of your application/service for this trace. |
input | Optional[Any] | Overall input for the entire trace. |
output | Optional[Any] | Overall output for the entire trace. |
metadata | Optional[Any] | Additional metadata for the trace. |
tags | Optional[List[str]] | List of tags to categorize the trace. |
public | Optional[bool] | Whether the trace should be publicly accessible (if configured). |
Example: Setting Multiple Trace Attributes
from langfuse import get_client

langfuse = get_client()

with langfuse.start_as_current_span(name="initial-operation") as span:
    # Set trace attributes early
    span.update_trace(
        user_id="user_xyz",
        session_id="session_789",
        tags=["beta-feature", "llm-chain"]
    )
    # ...

    # Later, from another span in the same trace:
    with span.start_as_current_generation(name="final-generation") as gen:
        # ...
        langfuse.update_current_trace(output={"final_status": "success"}, public=True)
Trace Input/Output Behavior
In v3, trace input and output are automatically set from the root observation (first span/generation) by default. This differs from v2 where integrations could set trace-level inputs/outputs directly.
Default Behavior
from langfuse import get_client

langfuse = get_client()

with langfuse.start_as_current_span(
    name="user-request",
    input={"query": "What is the capital of France?"}  # This becomes the trace input
) as root_span:
    with langfuse.start_as_current_generation(
        name="llm-call",
        model="gpt-4o",
        input={"messages": [{"role": "user", "content": "What is the capital of France?"}]}
    ) as gen:
        response = "Paris is the capital of France."
        gen.update(output=response)
        # LLM generation input/output are separate from trace input/output

    root_span.update(output={"answer": "Paris"})  # This becomes the trace output
Override Default Behavior
If you need different trace inputs/outputs than the root observation, explicitly set them:
from langfuse import get_client

langfuse = get_client()

with langfuse.start_as_current_span(name="complex-pipeline") as root_span:
    # Root span has its own input/output
    root_span.update(input="Step 1 data", output="Step 1 result")

    # But trace should have different input/output (e.g., for LLM-as-a-judge)
    root_span.update_trace(
        input={"original_query": "User's actual question"},
        output={"final_answer": "Complete response", "confidence": 0.95}
    )

    # Now trace input/output are independent of root span input/output
Critical for LLM-as-a-Judge Features
LLM-as-a-judge and evaluation features typically rely on trace-level inputs and outputs. Make sure to set these appropriately:
from langfuse import observe, get_client

langfuse = get_client()

@observe()
def process_user_query(user_question: str):
    # LLM processing...
    answer = call_llm(user_question)

    # Explicitly set trace input/output for evaluation features
    langfuse.update_current_trace(
        input={"question": user_question},
        output={"answer": answer}
    )
    return answer
Trace and Observation IDs
Langfuse uses W3C Trace Context compliant IDs:
- Trace IDs: 32-character lowercase hexadecimal string (16 bytes).
- Observation IDs (Span IDs): 16-character lowercase hexadecimal string (8 bytes).
You can retrieve these IDs:
- langfuse.get_current_trace_id(): Gets the trace ID of the currently active observation.
- langfuse.get_current_observation_id(): Gets the ID of the currently active observation.
- span_obj.trace_id and span_obj.id: Access IDs directly from a LangfuseSpan or LangfuseGeneration object.
For scenarios where you need to generate IDs outside of an active trace (e.g., to link scores to traces/observations that will be created later, or to correlate with external systems), use:
- Langfuse.create_trace_id(seed: Optional[str] = None) (static method): Generates a new trace ID. If a seed is provided, the ID is deterministic; use the same seed to get the same ID. This is useful for correlating external IDs with Langfuse traces.
from langfuse import get_client, Langfuse

langfuse = get_client()

# Get current IDs
with langfuse.start_as_current_span(name="my-op") as current_op:
    trace_id = langfuse.get_current_trace_id()
    observation_id = langfuse.get_current_observation_id()
    print(f"Current Trace ID: {trace_id}, Current Observation ID: {observation_id}")
    print(f"From object: Trace ID: {current_op.trace_id}, Observation ID: {current_op.id}")

# Generate IDs deterministically
external_request_id = "req_12345"
deterministic_trace_id = Langfuse.create_trace_id(seed=external_request_id)
print(f"Deterministic Trace ID for {external_request_id}: {deterministic_trace_id}")
Linking to Existing Traces (Trace Context)
If you have a trace_id (and optionally a parent_span_id) from an external source (e.g., another service, a batch job), you can link new observations to it using the trace_context parameter. Note that OpenTelemetry offers native cross-service context propagation, so this is not necessarily required for calls between services that are instrumented with OTEL.
from langfuse import get_client

langfuse = get_client()

existing_trace_id = "abcdef1234567890abcdef1234567890"  # From an upstream service
existing_parent_span_id = "fedcba0987654321"  # Optional parent span in that trace

with langfuse.start_as_current_span(
    name="process-downstream-task",
    trace_context={
        "trace_id": existing_trace_id,
        "parent_span_id": existing_parent_span_id  # If None, this becomes a root span in the existing trace
    }
) as span:
    # This span is now part of the trace `existing_trace_id`
    # and a child of `existing_parent_span_id` if provided.
    print(f"This span's trace_id: {span.trace_id}")  # Will be existing_trace_id
Client Management
flush()
Manually triggers the sending of all buffered observations (spans, generations, scores, media metadata) to the Langfuse API. This is useful in short-lived scripts or before exiting an application to ensure all data is persisted.
from langfuse import get_client
langfuse = get_client()
# ... create traces and observations ...
langfuse.flush() # Ensures all pending data is sent
The flush() method blocks until the queued data is processed by the respective background threads.
shutdown()
Gracefully shuts down the Langfuse client. This includes:
- Flushing all buffered data (similar to flush()).
- Waiting for background threads (for data ingestion and media uploads) to finish their current tasks and terminate.
It’s crucial to call shutdown() before your application exits to prevent data loss and ensure clean resource release. The SDK automatically registers an atexit hook to call shutdown() on normal program termination, but manual invocation is recommended in scenarios like:
- Long-running daemons or services when they receive a shutdown signal.
- Applications where atexit might not reliably trigger (e.g., certain serverless environments or forceful terminations).
from langfuse import get_client
langfuse = get_client()
# ... application logic ...
# Before exiting:
langfuse.shutdown()
Native Instrumentations
The Langfuse Python SDK has native integrations for the OpenAI and LangChain SDKs. You can also use any other OTel-based instrumentation library to automatically trace your calls in Langfuse.
OpenAI Integration
Langfuse offers a drop-in replacement for the OpenAI Python SDK to automatically trace all your OpenAI API calls. Simply change your import statement:
- import openai
+ from langfuse.openai import openai
# Your existing OpenAI code continues to work as is
# For example:
# client = openai.OpenAI()
# completion = client.chat.completions.create(...)
What’s automatically captured:
- Requests & Responses: All prompts/completions, including support for streaming, async operations, and function/tool calls (see the streaming sketch after this list).
- Timings: Latencies for API calls.
- Errors: API errors are captured with their details.
- Model Usage: Token counts (input, output, total).
- Cost: Estimated cost in USD (based on model and token usage).
- Media: Input audio and output audio from speech-to-text and text-to-speech endpoints.
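For instance, streaming calls through the drop-in client are traced like any other call. A minimal sketch, assuming OPENAI_API_KEY and the Langfuse credentials are set in the environment:

from langfuse.openai import openai

client = openai.OpenAI()

# The streamed completion is captured as a single Langfuse generation
stream = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Write a haiku about tracing."}],
    stream=True,
)
for chunk in stream:
    print(chunk.choices[0].delta.content or "", end="")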
The integration is fully interoperable with @observe and manual tracing methods (start_as_current_span, etc.). If an OpenAI call is made within an active Langfuse span, the OpenAI generation will be correctly nested under it.
Passing Langfuse arguments to OpenAI calls:
You can pass Langfuse-specific arguments directly to OpenAI client methods. These will be used to enrich the trace data.
from langfuse import get_client
from langfuse.openai import openai

langfuse = get_client()
client = openai.OpenAI()

with langfuse.start_as_current_span(name="qna-bot-openai") as span:
    langfuse.update_current_trace(tags=["qna-bot-openai"])

    # This will be traced as a Langfuse generation
    response = client.chat.completions.create(
        name="qna-bot-openai",  # Custom name for this generation in Langfuse
        metadata={"user_tier": "premium", "request_source": "web_api"},  # Added to the Langfuse generation
        model="gpt-4o",
        messages=[{"role": "user", "content": "What is OpenTelemetry?"}],
    )
Setting trace attributes via metadata:

You can set trace attributes (session_id, user_id, tags) directly on OpenAI calls using special fields in the metadata parameter:
from langfuse.openai import openai

client = openai.OpenAI()

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello"}],
    metadata={
        "langfuse_session_id": "session_123",
        "langfuse_user_id": "user_456",
        "langfuse_tags": ["production", "chat-bot"],
        "custom_field": "additional metadata"  # Regular metadata fields work too
    }
)
The special metadata fields are:
- langfuse_session_id: Sets the session ID for the trace
- langfuse_user_id: Sets the user ID for the trace
- langfuse_tags: Sets tags for the trace (should be a list of strings)
Supported Langfuse arguments: name, metadata, langfuse_prompt
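As a sketch of the langfuse_prompt argument, assuming a text prompt named "qa-prompt" with a {{question}} variable exists in Langfuse prompt management:

from langfuse import get_client
from langfuse.openai import openai

langfuse = get_client()
client = openai.OpenAI()

# Fetch the managed prompt (name and variable are illustrative)
prompt = langfuse.get_prompt("qa-prompt")

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": prompt.compile(question="What is OpenTelemetry?")}],
    langfuse_prompt=prompt,  # Links this generation to the managed prompt
)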
Learn more in the OpenAI integration documentation.
Langchain Integration
Langfuse provides a callback handler for Langchain to trace its operations.
Setup:
Initialize the CallbackHandler and add it to your Langchain calls, either globally or per-call.
from langfuse import get_client
from langfuse.langchain import CallbackHandler
from langchain_openai import ChatOpenAI  # Example LLM
from langchain_core.prompts import ChatPromptTemplate

langfuse = get_client()

# Initialize the Langfuse handler
langfuse_handler = CallbackHandler()

# Example: Using it with an LLM call
llm = ChatOpenAI(model_name="gpt-4o")
prompt = ChatPromptTemplate.from_template("Tell me a joke about {topic}")
chain = prompt | llm

with langfuse.start_as_current_span(name="joke-chain") as span:
    langfuse.update_current_trace(tags=["joke-chain"])
    response = chain.invoke({"topic": "cats"}, config={"callbacks": [langfuse_handler]})
    print(response)
Setting trace attributes via metadata:

You can set trace attributes (session_id, user_id, tags) directly during chain invocation using special fields in the metadata configuration:
from langfuse.langchain import CallbackHandler
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate

# Initialize the Langfuse handler
langfuse_handler = CallbackHandler()

# Create your LangChain components
llm = ChatOpenAI(model_name="gpt-4o")
prompt = ChatPromptTemplate.from_template("Tell me a joke about {topic}")
chain = prompt | llm

# Set trace attributes via metadata in chain invocation
response = chain.invoke(
    {"topic": "cats"},
    config={
        "callbacks": [langfuse_handler],
        "metadata": {
            "langfuse_session_id": "session_123",
            "langfuse_user_id": "user_456",
            "langfuse_tags": ["production", "humor-bot"],
            "custom_field": "additional metadata"  # Regular metadata fields work too
        }
    }
)
The special metadata fields are:
- langfuse_session_id: Sets the session ID for the trace
- langfuse_user_id: Sets the user ID for the trace
- langfuse_tags: Sets tags for the trace (should be a list of strings)
You can also pass update_trace=True when initializing the CallbackHandler to force a trace update with the chain's input, output, and metadata.
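A minimal sketch of that option:

from langfuse.langchain import CallbackHandler

# Trace input/output/metadata are updated from the chain's top-level run
langfuse_handler = CallbackHandler(update_trace=True)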
What’s captured:
The callback handler maps various Langchain events to Langfuse observations:
- Chains (on_chain_start, on_chain_end, on_chain_error): Traced as spans.
- LLMs (on_llm_start, on_llm_end, on_llm_error, on_chat_model_start): Traced as generations, capturing model name, prompts, responses, and usage if available from the LLM provider.
- Tools (on_tool_start, on_tool_end, on_tool_error): Traced as spans, capturing tool input and output.
- Retrievers (on_retriever_start, on_retriever_end, on_retriever_error): Traced as spans, capturing the query and retrieved documents.
- Agents (on_agent_action, on_agent_finish): Agent actions and final finishes are captured within their parent chain/agent span.
Langfuse attempts to parse model names, usage, and other relevant details from the information provided by Langchain. The metadata argument in Langchain calls can be used to pass additional information to Langfuse, including langfuse_prompt to link with managed prompts.
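A sketch of linking a managed prompt this way, assuming a Langfuse text prompt named "joke-prompt" exists and its content is used as the chain's template:

from langfuse import get_client
from langfuse.langchain import CallbackHandler
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate

langfuse = get_client()
langfuse_handler = CallbackHandler()

# Fetch the managed prompt (name is illustrative)
langfuse_prompt = langfuse.get_prompt("joke-prompt")

# Build the chain from the managed prompt's template
prompt = ChatPromptTemplate.from_template(langfuse_prompt.get_langchain_prompt())
chain = prompt | ChatOpenAI(model_name="gpt-4o")

response = chain.invoke(
    {"topic": "cats"},
    config={
        "callbacks": [langfuse_handler],
        # Passing the prompt client via metadata links the resulting generation to the managed prompt
        "metadata": {"langfuse_prompt": langfuse_prompt},
    },
)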
Learn more in the Langchain integration documentation.
Third-party integrations
The Langfuse SDK seamlessly integrates with any third-party library that uses OpenTelemetry instrumentation. When these libraries emit spans, they are automatically captured and properly nested within your trace hierarchy. This enables unified tracing across your entire application stack without requiring any additional configuration.
For example, if you’re using OpenTelemetry-instrumented databases, HTTP clients, or other services alongside your LLM operations, all these spans will be correctly organized within your traces in Langfuse.
You can use any third-party, OTEL-based instrumentation library for Anthropic to automatically trace all your Anthropic API calls in Langfuse.
In this example, we are using the opentelemetry-instrumentation-anthropic library.
from anthropic import Anthropic
from opentelemetry.instrumentation.anthropic import AnthropicInstrumentor

from langfuse import get_client

# This will automatically emit OTEL-spans for all Anthropic API calls
AnthropicInstrumentor().instrument()

langfuse = get_client()
anthropic_client = Anthropic()

with langfuse.start_as_current_span(name="myspan"):
    # This will be traced as a Langfuse generation nested under the current span
    message = anthropic_client.messages.create(
        model="claude-3-7-sonnet-20250219",
        max_tokens=1024,
        messages=[{"role": "user", "content": "Hello, Claude"}],
    )
    print(message.content)

# Flush events to Langfuse in short-lived applications
langfuse.flush()
Learn more in the Anthropic integration documentation.