
Observability & Tracing for LangChain & LangGraph (Python & JS/TS)

Langfuse Tracing integrates with LangChain through LangChain Callbacks (Python, JS). The Langfuse SDK automatically captures detailed traces of your LangChain executions and creates properly nested observations for chains, LLMs, tools, and retrievers, so you can monitor, analyze, and debug your LangChain applications.

This documentation has been updated to show examples for the new Python SDK v3. If you are looking for documentation for the Python SDK version 2, see here.

Add Langfuse to your Langchain Application

You can configure the integration via (1) constructor arguments of the Langfuse client or (2) environment variables. Get your Langfuse credentials from the Langfuse dashboard.

Install the SDK:

pip install langfuse

Set environment variables:

export LANGFUSE_PUBLIC_KEY="your-public-key"
export LANGFUSE_SECRET_KEY="your-secret-key"
export LANGFUSE_HOST="https://cloud.langfuse.com"  # Optional: defaults to https://cloud.langfuse.com

Initialize the handler and pass it to your chain:

from langfuse.langchain import CallbackHandler
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
 
# Initialize the Langfuse handler
langfuse_handler = CallbackHandler()
 
# Create your LangChain components
llm = ChatOpenAI(model_name="gpt-4o")
prompt = ChatPromptTemplate.from_template("Tell me a joke about {topic}")
chain = prompt | llm
 
# Run your chain with Langfuse tracing
response = chain.invoke({"topic": "cats"}, config={"callbacks": [langfuse_handler]})
print(response.content)

Done. Now you can explore detailed traces and metrics in the Langfuse dashboard.
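
If no traces show up, a common cause is that the credentials were never exported in the shell that runs the script. The following is a minimal pre-flight sketch using only the standard library. It relies on the variable names shown above; the helper name `missing_langfuse_vars` is hypothetical and not part of the Langfuse SDK.

```python
import os

# Variable names from the setup above; LANGFUSE_HOST is optional and not checked.
REQUIRED_VARS = ("LANGFUSE_PUBLIC_KEY", "LANGFUSE_SECRET_KEY")


def missing_langfuse_vars(environ=None):
    """Return the required Langfuse variables that are unset or empty.

    Hypothetical helper for illustration; pass a dict to check something
    other than the real process environment.
    """
    env = os.environ if environ is None else environ
    return [name for name in REQUIRED_VARS if not env.get(name)]
```

Calling `missing_langfuse_vars()` before creating the `CallbackHandler` returns an empty list when both credentials are set, and the names of the missing variables otherwise.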

Supported LangChain interfaces

Feature/interface | Python | JS/TS
LCEL
invoke()
run()
call()
predict()
async
batch()(✅)
streaming

We are interested in your feedback! Raise an issue on GitHub to request support for additional interfaces.

Additional Configuration

Configuration Options

The CallbackHandler does not accept any constructor arguments for trace attributes or global settings.

  • Global settings (like sample_rate, tracing_enabled) must be set when initializing the Langfuse client via Langfuse() constructor or environment variables
  • Trace attributes (like user_id, session_id, tags) can be set either:
    • Via metadata fields in chain invocation (langfuse_user_id, langfuse_session_id, langfuse_tags)
    • Via an enclosing span using span.update_trace() as shown in the examples below

Dynamic Trace Attributes

You can set trace attributes dynamically for each LangChain execution. The approach differs between SDK versions:

For Python SDK v3, you have two options to set trace attributes dynamically:

Option 1: Via metadata fields in chain invocation (simplest approach):

from langfuse.langchain import CallbackHandler
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
 
langfuse_handler = CallbackHandler()
 
llm = ChatOpenAI(model_name="gpt-4o")
prompt = ChatPromptTemplate.from_template("Tell me a joke about {topic}")
chain = prompt | llm
 
# Set trace attributes dynamically via metadata
response = chain.invoke(
    {"topic": "cats"},
    config={
        "callbacks": [langfuse_handler],
        "metadata": {
            "langfuse_user_id": "random-user",
            "langfuse_session_id": "random-session",
            "langfuse_tags": ["random-tag-1", "random-tag-2"]
        }
    }
)
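
When many call sites set the same attributes, the metadata keys can be built in one place. The sketch below assumes only the three metadata keys shown above (`langfuse_user_id`, `langfuse_session_id`, `langfuse_tags`); the function name `langfuse_run_config` is hypothetical and not part of either library.

```python
def langfuse_run_config(handler, user_id=None, session_id=None, tags=None):
    """Build a LangChain `config` dict that carries Langfuse trace attributes.

    Hypothetical helper: only attributes that were actually provided
    are added to the metadata.
    """
    metadata = {}
    if user_id is not None:
        metadata["langfuse_user_id"] = user_id
    if session_id is not None:
        metadata["langfuse_session_id"] = session_id
    if tags:
        metadata["langfuse_tags"] = list(tags)
    return {"callbacks": [handler], "metadata": metadata}
```

With the handler and chain from the example above, a call could then look like `chain.invoke({"topic": "cats"}, config=langfuse_run_config(langfuse_handler, user_id="random-user"))`.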

Option 2: Via enclosing span (for more control):

from langfuse import get_client
from langfuse.langchain import CallbackHandler
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
 
langfuse = get_client()
langfuse_handler = CallbackHandler()
 
llm = ChatOpenAI(model_name="gpt-4o")
prompt = ChatPromptTemplate.from_template("Tell me a joke about {topic}")
chain = prompt | llm
 
# Set trace attributes dynamically via enclosing span
with langfuse.start_as_current_span(name="dynamic-langchain-trace") as span:
    span.update_trace(
        user_id="random-user",
        session_id="random-session",
        tags=["random-tag-1", "random-tag-2"],
        input={"topic": "cats"}
    )
 
    response = chain.invoke({"topic": "cats"}, config={"callbacks": [langfuse_handler]})
 
    span.update_trace(output={"response": response.content})

Predefined Trace ID + Add Evaluation or User Feedback Score

Predefined Trace ID

To score a LangChain execution, you need its trace ID. You can either wrap the execution in a span that sets a predefined trace ID, or retrieve the last trace ID that a callback handler created via langfuse_handler.last_trace_id.

from langfuse import get_client, Langfuse
from langfuse.langchain import CallbackHandler
 
langfuse = get_client()
 
# Generate deterministic trace ID from external system
external_request_id = "req_12345"
predefined_trace_id = Langfuse.create_trace_id(seed=external_request_id)
 
langfuse_handler = CallbackHandler()
 
# Use the predefined trace ID with trace_context
with langfuse.start_as_current_span(
    name="langchain-request",
    trace_context={"trace_id": predefined_trace_id}
) as span:
    span.update_trace(
        user_id="user_123",
        input={"person": "Ada Lovelace"}
    )
 
    # LangChain execution will be part of this trace.
    # `chain` is assumed to be a runnable whose prompt takes a {person} variable.
    response = chain.invoke(
        {"person": "Ada Lovelace"},
        config={"callbacks": [langfuse_handler]}
    )
 
    span.update_trace(output={"response": response})
 
print(f"Trace ID: {predefined_trace_id}")  # Use this for scoring later
print(f"Trace ID: {langfuse_handler.last_trace_id}") # Care needed in concurrent environments where handler is reused
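
The example passes the external request ID as a seed instead of using it as the trace ID directly. This is presumably because trace IDs are expected in the W3C Trace Context format (32 lowercase hexadecimal characters, not all zeros); that format requirement is an assumption here and is not stated on this page. The standard-library sketch below checks a candidate ID against that format before it is passed in `trace_context`. The name `is_w3c_trace_id` is hypothetical.

```python
import re

# W3C Trace Context trace-id: exactly 32 lowercase hex characters.
_TRACE_ID_RE = re.compile(r"[0-9a-f]{32}")


def is_w3c_trace_id(value):
    """Return True if `value` looks like a valid W3C trace ID.

    Hypothetical helper for illustration; the all-zero ID is
    reserved as invalid by the W3C specification.
    """
    return (
        isinstance(value, str)
        and _TRACE_ID_RE.fullmatch(value) is not None
        and value != "0" * 32
    )
```

An ID such as "req_12345" fails this check, which is why it is converted with `Langfuse.create_trace_id(seed=...)` first.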

Add Score to Trace

There are multiple ways to score a trace in Python SDK v3. See Scoring documentation for more details.

from langfuse import get_client
 
langfuse = get_client()
# `predefined_trace_id` is the trace ID created in the previous example
 
# Option 1: Use the yielded span object from the context manager
with langfuse.start_as_current_span(
    name="langchain-request",
    trace_context={"trace_id": predefined_trace_id}
) as span:
    # ... LangChain execution ...
 
    # Score using the span object
    span.score_trace(
        name="user-feedback",
        value=1,
        data_type="NUMERIC",
        comment="This was correct, thank you"
    )
 
# Option 2: Use langfuse.score_current_trace() if still in context
with langfuse.start_as_current_span(name="langchain-request") as span:
    # ... LangChain execution ...
 
    # Score using current context
    langfuse.score_current_trace(
        name="user-feedback",
        value=1,
        data_type="NUMERIC"
    )
 
# Option 3: Use create_score() with trace ID (when outside context)
langfuse.create_score(
    trace_id=predefined_trace_id,
    name="user-feedback",
    value=1,
    data_type="NUMERIC",
    comment="This was correct, thank you"
)
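
In practice, user feedback often arrives later as a thumbs-up or thumbs-down from a UI. The sketch below maps such a signal to the keyword arguments used in Option 3 above. The helper name `feedback_score_kwargs` and the 1/0 encoding are assumptions made for illustration, not a Langfuse convention.

```python
def feedback_score_kwargs(trace_id, thumbs_up, comment=None):
    """Translate a thumbs-up/down signal into create_score() keyword arguments.

    Hypothetical helper: encodes thumbs-up as 1 and thumbs-down as 0,
    using the same parameter names as the create_score() call above.
    """
    kwargs = {
        "trace_id": trace_id,
        "name": "user-feedback",
        "value": 1 if thumbs_up else 0,
        "data_type": "NUMERIC",
    }
    if comment:
        kwargs["comment"] = comment
    return kwargs
```

The result can be unpacked into the client call, for example `langfuse.create_score(**feedback_score_kwargs(predefined_trace_id, thumbs_up=True))`.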

Interoperability with Langfuse SDKs

The Langchain integration works seamlessly with the Langfuse SDK to create comprehensive traces that combine Langchain operations with other application logic.

Common use cases:

  • Add non-Langchain related observations to the trace
  • Group multiple Langchain runs into a single trace
  • Set trace-level attributes (user_id, session_id, etc.)
  • Add custom spans for business logic around LLM calls

Learn more about the structure of a trace here.

from langfuse import observe, get_client
from langfuse.langchain import CallbackHandler
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
 
@observe() # Automatically log function as a trace to Langfuse
def process_user_query(user_input: str):
    langfuse = get_client()
 
    # Update trace attributes
    langfuse.update_current_trace(
        name="user-query-processing",
        session_id="session-1234",
        user_id="user-5678",
        input={"query": user_input}
    )
 
    # Initialize the Langfuse handler - automatically inherits the current trace context
    langfuse_handler = CallbackHandler()
 
    # Your Langchain code - will be nested under the @observe trace
    llm = ChatOpenAI(model_name="gpt-4o")
    prompt = ChatPromptTemplate.from_template("Respond to: {input}")
    chain = prompt | llm
 
    result = chain.invoke({"input": user_input}, config={"callbacks": [langfuse_handler]})
 
    # Update trace with final output
    langfuse.update_current_trace(output={"response": result.content})
 
    return result.content
 
# Usage
answer = process_user_query("What is the capital of France?")

If you pass these callback handlers to your LangChain code, the events will be nested under the respective trace or span in Langfuse.

See the Langchain + decorator observability cookbook for an example of this in action (Python).

Queuing and flushing

The Langfuse SDKs queue and batch events in the background to reduce the number of network requests and improve overall performance. In a long-running application, this works without any additional configuration.

If you are running a short-lived application, you need to shut down Langfuse to ensure that all events are flushed before the application exits.

from langfuse import get_client
 
# Shutdown the underlying singleton instance
get_client().shutdown()
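
In a short script, an exception raised before the shutdown call would skip it and could drop queued events. One way to guard against that is a small context manager, sketched below. It assumes only that the client exposes `shutdown()` as shown above; the name `shutdown_on_exit` is hypothetical.

```python
from contextlib import contextmanager


@contextmanager
def shutdown_on_exit(client):
    """Yield `client` and call client.shutdown() when the block ends.

    Hypothetical helper: the `finally` clause runs on normal exit
    and when the body raises, so queued events still get flushed.
    """
    try:
        yield client
    finally:
        client.shutdown()
```

Usage would look like `with shutdown_on_exit(get_client()): ...`, with the LangChain calls placed inside the block.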

If you want to flush events synchronously at a certain point, you can use the flush method. This will wait for all events that are still in the background queue to be sent to the Langfuse API. This is usually discouraged in production environments.

from langfuse import get_client
 
# Flush the underlying singleton instance
get_client().flush()

Serverless environments (JS/TS)

Since LangChain version 0.3.0, the callbacks on which Langfuse relies run in the background: execution does not wait for a callback to return before continuing. Prior to 0.3.0, callbacks were blocking. If you run code in serverless environments such as Google Cloud Functions, AWS Lambda, or Cloudflare Workers, set your callbacks to be blocking so that they have time to finish or time out. You can do this by either:

  • setting the LANGCHAIN_CALLBACKS_BACKGROUND environment variable to "false"
  • importing the global awaitAllCallbacks method and awaiting it to ensure all callbacks finish

Read more about awaiting callbacks here in the Langchain docs.

Azure OpenAI model names

Please add the model keyword argument to the AzureOpenAI or AzureChatOpenAI class to have the model name parsed correctly in Langfuse.

from langchain_openai import AzureChatOpenAI
 
llm = AzureChatOpenAI(
    azure_deployment="my-gpt-4o-deployment",
    model="gpt-4o",
)

Upgrade Paths for Langchain Integration

This doc is a collection of upgrade paths for different versions of the integration. If you want to add the integration to your project, you should start with the latest version and follow the integration guide above.

Langfuse and Langchain are under active development. Thus, we are constantly improving the integration. This means that we sometimes need to make breaking changes to our APIs or need to react to breaking changes in Langchain. We try to keep these to a minimum and to provide clear upgrade paths when we do make them.

Python

From v2.x.x to v3.x.x

Python SDK v3 introduces a completely revised Langfuse core with a new observability API. While the LangChain integration still relies on a CallbackHandler, nearly all ergonomics have changed. The table below highlights the most important breaking changes:

Topic | v2 | v3
Package import | from langfuse.callback import CallbackHandler | from langfuse.langchain import CallbackHandler
Client handling | Multiple instantiated clients | Singleton pattern, access via get_client()
Trace/Span context | CallbackHandler optionally accepted root to group runs | Use context managers with langfuse.start_as_current_span(...)
Dynamic trace attrs | Pass via LangChain config (e.g. metadata["langfuse_user_id"]) | Use metadata["langfuse_user_id"] OR span.update_trace(user_id=...)
Constructor args | CallbackHandler(sample_rate=..., user_id=...) | No constructor args – use Langfuse client or spans

Minimal migration example:

# Install the latest SDK (>=3.0.0) from your shell:
#   pip install --upgrade langfuse
 
# v2 Code (for reference)
# from langfuse.callback import CallbackHandler
# handler = CallbackHandler()
# chain.invoke({"topic": "cats"}, config={"callbacks": [handler]})
 
# v3 Code
from langfuse import Langfuse, get_client
from langfuse.langchain import CallbackHandler
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
 
# 1. Create/Configure Langfuse client (once at startup)
Langfuse(
    public_key="your-public-key",
    secret_key="your-secret-key",
)
 
# 2. Access singleton instance and create handler
langfuse = get_client()
handler = CallbackHandler()
 
# 3. Option 1: Use metadata in chain invocation (simplest migration)
llm = ChatOpenAI(model_name="gpt-4o")
prompt = ChatPromptTemplate.from_template("Tell me a joke about {topic}")
chain = prompt | llm
 
response = chain.invoke(
    {"topic": "cats"},
    config={
        "callbacks": [handler],
        "metadata": {"langfuse_user_id": "user_123"}
    }
)
 
# 3. Option 2: Wrap LangChain execution in a span (for more control)
# with langfuse.start_as_current_span(name="tell-joke") as span:
#     span.update_trace(user_id="user_123", input={"topic": "cats"})
#     response = chain.invoke({"topic": "cats"}, config={"callbacks": [handler]})
#     span.update_trace(output={"joke": response.content})
 
# (Optional) Flush events in short-lived scripts
langfuse.flush()

  • All arguments such as sample_rate or tracing_enabled must now be provided when constructing the Langfuse client (or via environment variables) – not on the handler.
  • Functions like flush() and shutdown() moved to the client instance (get_client().flush()).
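
Code that has to run against both SDK generations during a migration can pick the import path from the installed version. The sketch below is based on the two import paths listed in the table above. The function names are hypothetical, and the assumption that every release below 3 uses `langfuse.callback` comes from the examples on this page rather than from a verified compatibility matrix.

```python
from importlib import import_module
from importlib.metadata import version


def handler_module_for(sdk_version):
    """Return the module that provides CallbackHandler for an SDK version string."""
    major = int(sdk_version.split(".", 1)[0])
    return "langfuse.langchain" if major >= 3 else "langfuse.callback"


def load_callback_handler():
    """Import CallbackHandler from whichever module matches the installed SDK.

    Raises importlib.metadata.PackageNotFoundError if langfuse is not installed.
    """
    module = import_module(handler_module_for(version("langfuse")))
    return module.CallbackHandler
```

This only resolves the import; the constructor arguments still differ between v2 and v3 as described above, so call sites need their own version handling.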

From v1.x.x to v2.x.x

The CallbackHandler can be used in multiple invocations of a Langchain chain as shown below.

from langfuse.callback import CallbackHandler
langfuse_handler = CallbackHandler(PUBLIC_KEY, SECRET_KEY)
 
# Setup Langchain
from langchain.chains import LLMChain
...
chain = LLMChain(llm=llm, prompt=prompt, callbacks=[langfuse_handler])
 
# Add Langfuse handler as callback
chain.run(input="<first_user_input>", callbacks=[langfuse_handler])
chain.run(input="<second_user_input>", callbacks=[langfuse_handler])
 

Previously, invoking the chain multiple times grouped the observations in one trace.

TRACE
|
|-- SPAN: Retrieval
|   |
|   |-- SPAN: LLM Chain
|   |   |
|   |   |-- GENERATION: ChatOpenAi
|-- SPAN: Retrieval
|   |
|   |-- SPAN: LLM Chain
|   |   |
|   |   |-- GENERATION: ChatOpenAi

We changed this, so that each invocation will end up on its own trace. This allows us to derive the user inputs and outputs to Langchain applications.

TRACE_1
|
|-- SPAN: Retrieval
|   |
|   |-- SPAN: LLM Chain
|   |   |
|   |   |-- GENERATION: ChatOpenAi
 
TRACE_2
|
|-- SPAN: Retrieval
|   |
|   |-- SPAN: LLM Chain
|   |   |
|   |   |-- GENERATION: ChatOpenAi

If you still want to group multiple invocations on one trace, you can use the Langfuse SDK combined with the Langchain integration (more details).

from langfuse import Langfuse
langfuse = Langfuse()
 
# Get Langchain handler for a trace
trace = langfuse.trace()
langfuse_handler = trace.get_langchain_handler()
 
# langfuse_handler will use the trace for all invocations

JS/TS

From v2.x.x to v3.x.x

Requires langchain ^0.1.10. Langchain released a new stable version of the Callback Handler interface and this version of the Langfuse SDK implements it. Older versions are no longer supported.

From v1.x.x to v2.x.x

The CallbackHandler can be used in multiple invocations of a Langchain chain as shown below.

import { CallbackHandler } from "langfuse-langchain";
 
// create a handler
const langfuseHandler = new CallbackHandler({
  publicKey: LANGFUSE_PUBLIC_KEY,
  secretKey: LANGFUSE_SECRET_KEY,
});
 
import { LLMChain } from "langchain/chains";
 
// create a chain
const chain = new LLMChain({
  llm: model,
  prompt,
  callbacks: [langfuseHandler],
});
 
// execute the chain
await chain.call(
  { product: "<user_input_one>" },
  { callbacks: [langfuseHandler] }
);
await chain.call(
  { product: "<user_input_two>" },
  { callbacks: [langfuseHandler] }
);

Previously, invoking the chain multiple times grouped the observations in one trace.

TRACE
|
|-- SPAN: Retrieval
|   |
|   |-- SPAN: LLM Chain
|   |   |
|   |   |-- GENERATION: ChatOpenAi
|-- SPAN: Retrieval
|   |
|   |-- SPAN: LLM Chain
|   |   |
|   |   |-- GENERATION: ChatOpenAi

We changed this, so that each invocation will end up on its own trace. This is a more sensible default setting for most users.

TRACE_1
|
|-- SPAN: Retrieval
|   |
|   |-- SPAN: LLM Chain
|   |   |
|   |   |-- GENERATION: ChatOpenAi
 
TRACE_2
|
|-- SPAN: Retrieval
|   |
|   |-- SPAN: LLM Chain
|   |   |
|   |   |-- GENERATION: ChatOpenAi

If you still want to group multiple invocations on one trace, you can use the Langfuse SDK combined with the Langchain integration (more details).

const trace = langfuse.trace({ id: "special-id" });
// CallbackHandler will use the trace with the id "special-id" for all invocations
const langfuseHandler = new CallbackHandler({ root: trace });
