Instrumentation
To instrument your application to send traces to Langfuse, you can use either native library instrumentations that work out of the box, or use custom instrumentation methods for fine-grained control.
Custom Instrumentation
There are three main ways to create spans with the Langfuse Python SDK. All of them are fully interoperable with each other.
The @observe() decorator provides a convenient way to automatically trace function executions, including capturing their inputs, outputs, execution time, and any errors. It supports both synchronous and asynchronous functions.
from langfuse import observe

@observe()
def my_data_processing_function(data, parameter):
    # ... processing logic ...
    return {"processed_data": data, "status": "ok"}

@observe(name="llm-call", as_type="generation")
async def my_async_llm_call(prompt_text):
    # ... async LLM call ...
    return "LLM response"
Parameters:

- name: Optional[str]: Custom name for the created span/generation. Defaults to the function name.
- as_type: Optional[Literal["generation"]]: If set to "generation", a Langfuse generation object is created, suitable for LLM calls. Otherwise, a regular span is created.
- capture_input: bool: Whether to capture function arguments as input. Defaults to the env var LANGFUSE_OBSERVE_DECORATOR_IO_CAPTURE_ENABLED, or True if not set.
- capture_output: bool: Whether to capture the function return value as output. Defaults to the env var LANGFUSE_OBSERVE_DECORATOR_IO_CAPTURE_ENABLED, or True if not set.
- transform_to_string: Optional[Callable[[Iterable], str]]: For functions that return generators (sync or async), this callable can be provided to transform the collected chunks into a single string for the output field. If not provided and all chunks are strings, they are concatenated; otherwise, the list of chunks is stored (see the sketch after this list).
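For example, a minimal sketch of a few of these parameters in use; the function names, payloads, and chunk values below are illustrative:

from langfuse import observe

# Skip recording a potentially large input payload, but still trace the call
@observe(name="bulk-ingest", capture_input=False)
def ingest_documents(documents):
    return {"ingested": len(documents)}

# A generator traced as a generation; collected chunks are joined into one output string
@observe(as_type="generation", transform_to_string=lambda chunks: "".join(chunks))
def stream_completion(prompt_text):
    # ... yield string chunks, e.g. from a streaming LLM response ...
    for chunk in ["Hello", ", ", "world"]:
        yield chunk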
Trace Context and Special Keyword Arguments:

The @observe decorator automatically propagates the OTEL trace context. If a decorated function is called from within an active Langfuse span (or another OTEL span), the new observation will be nested correctly.

You can also pass special keyword arguments to a decorated function to control its tracing behavior:

- langfuse_trace_id: str: Explicitly set the trace ID for this function call. Must be a valid W3C Trace Context trace ID (32-char hex). If you have a trace ID from an external system, you can use Langfuse.create_trace_id(seed=external_trace_id) to generate a valid deterministic ID.
- langfuse_parent_observation_id: str: Explicitly set the parent observation ID. Must be a valid W3C Trace Context span ID (16-char hex).
@observe()
def my_function(a, b):
    return a + b

# Call with a specific trace context
my_function(1, 2, langfuse_trace_id="1234567890abcdef1234567890abcdef")
By default, the observe decorator captures the args, kwargs, and return value of decorated functions. This may lead to performance issues in your application if these objects are large or deeply nested. To avoid this, explicitly disable function IO capture on the decorated function by passing capture_input / capture_output with value False, or globally by setting the environment variable LANGFUSE_OBSERVE_DECORATOR_IO_CAPTURE_ENABLED=False.
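A minimal sketch of the per-function option; the function and its payload are illustrative:

from langfuse import observe

# Trace timings and errors, but do not record the (potentially huge) payload
@observe(capture_input=False, capture_output=False)
def process_large_payload(payload):
    # ... heavy processing on a large or deeply nested object ...
    return {"status": "ok"}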
Nesting Observations
The function call hierarchy is automatically captured by the @observe decorator and reflected in the trace.
from langfuse import observe

@observe
def my_data_processing_function(data, parameter):
    # ... processing logic ...
    return {"processed_data": data, "status": "ok"}

@observe
def main_function(data, parameter):
    return my_data_processing_function(data, parameter)
Updating Observations
You can update observations with new information as your code executes.
- For spans/generations created via context managers or assigned to variables: use the .update() method on the object.
- To update the currently active observation in the context (without needing a direct reference to it): use langfuse.update_current_span() or langfuse.update_current_generation().
LangfuseSpan.update() / LangfuseGeneration.update() parameters:
Parameter | Type | Description | Applies To |
---|---|---|---|
input | Optional[Any] | Input data for the operation. | Both |
output | Optional[Any] | Output data from the operation. | Both |
metadata | Optional[Any] | Additional metadata (JSON-serializable). | Both |
version | Optional[str] | Version identifier for the code/component. | Both |
level | Optional[SpanLevel] | Severity: "DEBUG", "DEFAULT", "WARNING", "ERROR". | Both |
status_message | Optional[str] | A message describing the status, especially for errors. | Both |
completion_start_time | Optional[datetime] | Timestamp when the LLM started generating the completion (streaming). | Generation |
model | Optional[str] | Name/identifier of the AI model used. | Generation |
model_parameters | Optional[Dict[str, MapValue]] | Parameters used for the model call (e.g., temperature). | Generation |
usage_details | Optional[Dict[str, int]] | Token usage (e.g., {"input_tokens": 10, "output_tokens": 20}). | Generation |
cost_details | Optional[Dict[str, float]] | Cost information (e.g., {"total_cost": 0.0023}). | Generation |
prompt | Optional[PromptClient] | Associated PromptClient object from Langfuse prompt management. | Generation |
from langfuse import get_client

langfuse = get_client()

with langfuse.start_as_current_generation(name="llm-call", model="gpt-3.5-turbo") as gen:
    gen.update(input={"prompt": "Why is the sky blue?"})

    # ... make LLM call ...
    response_text = "Rayleigh scattering..."

    gen.update(
        output=response_text,
        usage_details={"input_tokens": 5, "output_tokens": 50},
        metadata={"confidence": 0.9}
    )

# Alternatively, update the current observation in context:
with langfuse.start_as_current_span(name="data-processing"):
    # ... some processing ...
    langfuse.update_current_span(metadata={"step1_complete": True})
    # ... more processing ...
    langfuse.update_current_span(output={"result": "final_data"})
Setting Trace Attributes
Trace-level attributes apply to the entire trace, not just a single observation. You can set or update these using:
- The .update_trace() method on any LangfuseSpan or LangfuseGeneration object within that trace.
- langfuse.update_current_trace() to update the trace associated with the currently active observation.
Trace attribute parameters:
Parameter | Type | Description |
---|---|---|
name | Optional[str] | Name for the trace. |
user_id | Optional[str] | ID of the user associated with this trace. |
session_id | Optional[str] | Session identifier for grouping related traces. |
version | Optional[str] | Version of your application/service for this trace. |
input | Optional[Any] | Overall input for the entire trace. |
output | Optional[Any] | Overall output for the entire trace. |
metadata | Optional[Any] | Additional metadata for the trace. |
tags | Optional[List[str]] | List of tags to categorize the trace. |
public | Optional[bool] | Whether the trace should be publicly accessible (if configured). |
Example: Setting Multiple Trace Attributes
from langfuse import get_client

langfuse = get_client()

with langfuse.start_as_current_span(name="initial-operation") as span:
    # Set trace attributes early
    span.update_trace(
        user_id="user_xyz",
        session_id="session_789",
        tags=["beta-feature", "llm-chain"]
    )
    # ...

    # Later, from another span in the same trace:
    with span.start_as_current_generation(name="final-generation") as gen:
        # ...
        langfuse.update_current_trace(output={"final_status": "success"}, public=True)
Trace Input/Output Behavior
In v3, trace input and output are automatically set from the root observation (first span/generation) by default. This differs from v2 where integrations could set trace-level inputs/outputs directly.
Default Behavior
from langfuse import get_client

langfuse = get_client()

with langfuse.start_as_current_span(
    name="user-request",
    input={"query": "What is the capital of France?"}  # This becomes the trace input
) as root_span:
    with langfuse.start_as_current_generation(
        name="llm-call",
        model="gpt-4o",
        input={"messages": [{"role": "user", "content": "What is the capital of France?"}]}
    ) as gen:
        response = "Paris is the capital of France."
        gen.update(output=response)
        # LLM generation input/output are separate from trace input/output

    root_span.update(output={"answer": "Paris"})  # This becomes the trace output
Override Default Behavior
If you need different trace inputs/outputs than the root observation, explicitly set them:
from langfuse import get_client

langfuse = get_client()

with langfuse.start_as_current_span(name="complex-pipeline") as root_span:
    # Root span has its own input/output
    root_span.update(input="Step 1 data", output="Step 1 result")

    # But trace should have different input/output (e.g., for LLM-as-a-judge)
    root_span.update_trace(
        input={"original_query": "User's actual question"},
        output={"final_answer": "Complete response", "confidence": 0.95}
    )

    # Now trace input/output are independent of root span input/output
Critical for LLM-as-a-Judge Features
LLM-as-a-judge and evaluation features typically rely on trace-level inputs and outputs. Make sure to set these appropriately:
from langfuse import observe, get_client

langfuse = get_client()

@observe()
def process_user_query(user_question: str):
    # LLM processing...
    answer = call_llm(user_question)

    # Explicitly set trace input/output for evaluation features
    langfuse.update_current_trace(
        input={"question": user_question},
        output={"answer": answer}
    )
    return answer
Trace and Observation IDs
Langfuse uses W3C Trace Context compliant IDs:
- Trace IDs: 32-character lowercase hexadecimal string (16 bytes).
- Observation IDs (Span IDs): 16-character lowercase hexadecimal string (8 bytes).
You can retrieve these IDs:
- langfuse.get_current_trace_id(): Gets the trace ID of the currently active observation.
- langfuse.get_current_observation_id(): Gets the ID of the currently active observation.
- span_obj.trace_id and span_obj.id: Access IDs directly from a LangfuseSpan or LangfuseGeneration object.
For scenarios where you need to generate IDs outside of an active trace (e.g., to link scores to traces/observations that will be created later, or to correlate with external systems), use:
- Langfuse.create_trace_id(seed: Optional[str] = None) (static method): Generates a new trace ID. If a seed is provided, the ID is deterministic; use the same seed to get the same ID. This is useful for correlating external IDs with Langfuse traces.
from langfuse import get_client, Langfuse

langfuse = get_client()

# Get current IDs
with langfuse.start_as_current_span(name="my-op") as current_op:
    trace_id = langfuse.get_current_trace_id()
    observation_id = langfuse.get_current_observation_id()
    print(f"Current Trace ID: {trace_id}, Current Observation ID: {observation_id}")
    print(f"From object: Trace ID: {current_op.trace_id}, Observation ID: {current_op.id}")

# Generate IDs deterministically
external_request_id = "req_12345"
deterministic_trace_id = Langfuse.create_trace_id(seed=external_request_id)
print(f"Deterministic Trace ID for {external_request_id}: {deterministic_trace_id}")
Linking to Existing Traces (Trace Context)
If you have a trace_id (and optionally a parent_span_id) from an external source (e.g., another service, a batch job), you can link new observations to it using the trace_context parameter. Note that OpenTelemetry offers native cross-service context propagation, so this is not necessarily required for calls between services that are instrumented with OTEL.
from langfuse import get_client

langfuse = get_client()

existing_trace_id = "abcdef1234567890abcdef1234567890"  # From an upstream service
existing_parent_span_id = "fedcba0987654321"  # Optional parent span in that trace

with langfuse.start_as_current_span(
    name="process-downstream-task",
    trace_context={
        "trace_id": existing_trace_id,
        "parent_span_id": existing_parent_span_id  # If None, this becomes a root span in the existing trace
    }
) as span:
    # This span is now part of the trace `existing_trace_id`
    # and a child of `existing_parent_span_id` if provided.
    print(f"This span's trace_id: {span.trace_id}")  # Will be existing_trace_id
Client Management
flush()
Manually triggers the sending of all buffered observations (spans, generations, scores, media metadata) to the Langfuse API. This is useful in short-lived scripts or before exiting an application to ensure all data is persisted.
from langfuse import get_client
langfuse = get_client()
# ... create traces and observations ...
langfuse.flush() # Ensures all pending data is sent
The flush() method blocks until the queued data is processed by the respective background threads.
shutdown()
Gracefully shuts down the Langfuse client. This includes:
- Flushing all buffered data (similar to flush()).
- Waiting for background threads (for data ingestion and media uploads) to finish their current tasks and terminate.
It’s crucial to call shutdown() before your application exits to prevent data loss and ensure clean resource release. The SDK automatically registers an atexit hook to call shutdown() on normal program termination, but manual invocation is recommended in scenarios like:
- Long-running daemons or services when they receive a shutdown signal.
- Applications where atexit might not reliably trigger (e.g., certain serverless environments or forceful terminations).
from langfuse import get_client
langfuse = get_client()
# ... application logic ...
# Before exiting:
langfuse.shutdown()
Native Instrumentations
The Langfuse Python SDK has native integrations for the OpenAI and LangChain SDKs. You can also use any other OTel-based instrumentation library to automatically trace your calls in Langfuse.
OpenAI Integration
Langfuse offers a drop-in replacement for the OpenAI Python SDK to automatically trace all your OpenAI API calls. Simply change your import statement:
- import openai
+ from langfuse.openai import openai
# Your existing OpenAI code continues to work as is
# For example:
# client = openai.OpenAI()
# completion = client.chat.completions.create(...)
What’s automatically captured:
- Requests & Responses: All prompts/completions, including support for streaming, async operations, and function/tool calls (see the streaming sketch after this list).
- Timings: Latencies for API calls.
- Errors: API errors are captured with their details.
- Model Usage: Token counts (input, output, total).
- Cost: Estimated cost in USD (based on model and token usage).
- Media: Input audio and output audio from speech-to-text and text-to-speech endpoints.
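For instance, streaming calls through the drop-in client are traced like any other call. A minimal sketch, assuming OPENAI_API_KEY and the Langfuse credentials are set in the environment:

from langfuse.openai import openai

client = openai.OpenAI()

# The streamed completion is captured as a single Langfuse generation
stream = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Write a haiku about tracing."}],
    stream=True,
)
for chunk in stream:
    print(chunk.choices[0].delta.content or "", end="")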
The integration is fully interoperable with @observe and manual tracing methods (start_as_current_span, etc.). If an OpenAI call is made within an active Langfuse span, the OpenAI generation will be correctly nested under it.
Passing Langfuse arguments to OpenAI calls:
You can pass Langfuse-specific arguments directly to OpenAI client methods. These will be used to enrich the trace data.
from langfuse import get_client
from langfuse.openai import openai

langfuse = get_client()
client = openai.OpenAI()

with langfuse.start_as_current_span(name="qna-bot-openai") as span:
    langfuse.update_current_trace(tags=["qna-bot-openai"])

    # This will be traced as a Langfuse generation
    response = client.chat.completions.create(
        name="qna-bot-openai",  # Custom name for this generation in Langfuse
        metadata={"user_tier": "premium", "request_source": "web_api"},  # Added to the Langfuse generation
        model="gpt-4o",
        messages=[{"role": "user", "content": "What is OpenTelemetry?"}],
    )
Setting trace attributes via metadata:

You can set trace attributes (session_id, user_id, tags) directly on OpenAI calls using special fields in the metadata parameter:
from langfuse.openai import openai

client = openai.OpenAI()

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello"}],
    metadata={
        "langfuse_session_id": "session_123",
        "langfuse_user_id": "user_456",
        "langfuse_tags": ["production", "chat-bot"],
        "custom_field": "additional metadata"  # Regular metadata fields work too
    }
)
The special metadata fields are:
- langfuse_session_id: Sets the session ID for the trace
- langfuse_user_id: Sets the user ID for the trace
- langfuse_tags: Sets tags for the trace (should be a list of strings)
Supported Langfuse arguments: name, metadata, langfuse_prompt
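As a sketch of the langfuse_prompt argument, assuming a text prompt named "qa-prompt" with a {{question}} variable exists in Langfuse prompt management:

from langfuse import get_client
from langfuse.openai import openai

langfuse = get_client()
client = openai.OpenAI()

# Fetch the managed prompt (name and variable are illustrative)
prompt = langfuse.get_prompt("qa-prompt")

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": prompt.compile(question="What is OpenTelemetry?")}],
    langfuse_prompt=prompt,  # Links this generation to the managed prompt
)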
Learn more in the OpenAI integration documentation.
Langchain Integration
Langfuse provides a callback handler for Langchain to trace its operations.
Setup:
Initialize the CallbackHandler and add it to your Langchain calls, either globally or per-call.
from langfuse import get_client
from langfuse.langchain import CallbackHandler
from langchain_openai import ChatOpenAI  # Example LLM
from langchain_core.prompts import ChatPromptTemplate

langfuse = get_client()

# Initialize the Langfuse handler
langfuse_handler = CallbackHandler()

# Example: Using it with an LLM call
llm = ChatOpenAI(model_name="gpt-4o")
prompt = ChatPromptTemplate.from_template("Tell me a joke about {topic}")
chain = prompt | llm

with langfuse.start_as_current_span(name="joke-chain") as span:
    langfuse.update_current_trace(tags=["joke-chain"])
    response = chain.invoke({"topic": "cats"}, config={"callbacks": [langfuse_handler]})
    print(response)
Setting trace attributes via metadata:

You can set trace attributes (session_id, user_id, tags) directly during chain invocation using special fields in the metadata configuration:
from langfuse.langchain import CallbackHandler
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate

# Initialize the Langfuse handler
langfuse_handler = CallbackHandler()

# Create your LangChain components
llm = ChatOpenAI(model_name="gpt-4o")
prompt = ChatPromptTemplate.from_template("Tell me a joke about {topic}")
chain = prompt | llm

# Set trace attributes via metadata in chain invocation
response = chain.invoke(
    {"topic": "cats"},
    config={
        "callbacks": [langfuse_handler],
        "metadata": {
            "langfuse_session_id": "session_123",
            "langfuse_user_id": "user_456",
            "langfuse_tags": ["production", "humor-bot"],
            "custom_field": "additional metadata"  # Regular metadata fields work too
        }
    }
)
The special metadata fields are:
- langfuse_session_id: Sets the session ID for the trace
- langfuse_user_id: Sets the user ID for the trace
- langfuse_tags: Sets tags for the trace (should be a list of strings)
You can also pass update_trace=True when initializing the CallbackHandler to force a trace update with the chain's input, output, and metadata.
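A minimal sketch of that option:

from langfuse.langchain import CallbackHandler

# Trace input/output/metadata are updated from the chain's top-level run
langfuse_handler = CallbackHandler(update_trace=True)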
What’s captured:
The callback handler maps various Langchain events to Langfuse observations:
- Chains (on_chain_start, on_chain_end, on_chain_error): Traced as spans.
- LLMs (on_llm_start, on_llm_end, on_llm_error, on_chat_model_start): Traced as generations, capturing model name, prompts, responses, and usage if available from the LLM provider.
- Tools (on_tool_start, on_tool_end, on_tool_error): Traced as spans, capturing tool input and output.
- Retrievers (on_retriever_start, on_retriever_end, on_retriever_error): Traced as spans, capturing the query and retrieved documents.
- Agents (on_agent_action, on_agent_finish): Agent actions and final finishes are captured within their parent chain/agent span.
Langfuse attempts to parse model names, usage, and other relevant details from the information provided by Langchain. The metadata argument in Langchain calls can be used to pass additional information to Langfuse, including langfuse_prompt to link with managed prompts.
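A sketch of linking a managed prompt this way, assuming a Langfuse text prompt named "joke-prompt" exists and its content is used as the chain's template:

from langfuse import get_client
from langfuse.langchain import CallbackHandler
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate

langfuse = get_client()
langfuse_handler = CallbackHandler()

# Fetch the managed prompt (name is illustrative)
langfuse_prompt = langfuse.get_prompt("joke-prompt")

# Build the chain from the managed prompt's template
prompt = ChatPromptTemplate.from_template(langfuse_prompt.get_langchain_prompt())
chain = prompt | ChatOpenAI(model_name="gpt-4o")

response = chain.invoke(
    {"topic": "cats"},
    config={
        "callbacks": [langfuse_handler],
        # Passing the prompt client via metadata links the resulting generation to the managed prompt
        "metadata": {"langfuse_prompt": langfuse_prompt},
    },
)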
Learn more in the Langchain integration documentation.
Third-party integrations
The Langfuse SDK seamlessly integrates with any third-party library that uses OpenTelemetry instrumentation. When these libraries emit spans, they are automatically captured and properly nested within your trace hierarchy. This enables unified tracing across your entire application stack without requiring any additional configuration.
For example, if you’re using OpenTelemetry-instrumented databases, HTTP clients, or other services alongside your LLM operations, all these spans will be correctly organized within your traces in Langfuse.
You can use any third-party, OTEL-based instrumentation library for Anthropic to automatically trace all your Anthropic API calls in Langfuse.
In this example, we are using the opentelemetry-instrumentation-anthropic library.
from anthropic import Anthropic
from opentelemetry.instrumentation.anthropic import AnthropicInstrumentor

from langfuse import get_client

# This will automatically emit OTEL-spans for all Anthropic API calls
AnthropicInstrumentor().instrument()

langfuse = get_client()
anthropic_client = Anthropic()

with langfuse.start_as_current_span(name="myspan"):
    # This will be traced as a Langfuse generation nested under the current span
    message = anthropic_client.messages.create(
        model="claude-3-7-sonnet-20250219",
        max_tokens=1024,
        messages=[{"role": "user", "content": "Hello, Claude"}],
    )
    print(message.content)

# Flush events to Langfuse in short-lived applications
langfuse.flush()
Learn more in the Anthropic integration documentation.