This is a Jupyter notebook

DSPy - Observability & Tracing

This cookbook demonstrates how to use DSPy with Langfuse. DSPy is a framework that systematically optimizes language model prompts and weights, making it easier to build and refine complex systems with LMs by automating the tuning process and improving reliability. For further information on DSPy, please visit the documentation.

Note: For this integration, we use the MLflow instrumentation library which sends traces to Langfuse’s OpenTelemetry backend.

Prerequisites

Install the latest versions of DSPy and MLflow. For example:

%pip install dspy mlflow opentelemetry-exporter-otlp-proto-http

Step 1: Setup Langfuse Environment Variables

First, we configure the environment variables. We set the OpenTelemetry endpoint, protocol, and authorization headers so that the traces from DSPy (via MLflow) are correctly sent to Langfuse. You can get your Langfuse API keys by signing up for Langfuse Cloud or self-hosting Langfuse.

import os
import base64
 
LANGFUSE_PUBLIC_KEY="pk-lf-..."
LANGFUSE_SECRET_KEY="sk-lf-..."
LANGFUSE_AUTH=base64.b64encode(f"{LANGFUSE_PUBLIC_KEY}:{LANGFUSE_SECRET_KEY}".encode()).decode()
 
os.environ["OTEL_EXPORTER_OTLP_TRACES_ENDPOINT"] = "https://cloud.langfuse.com/api/public/otel/v1/traces"  # 🇪🇺 EU data region
# os.environ["OTEL_EXPORTER_OTLP_TRACES_ENDPOINT"] = "https://us.cloud.langfuse.com/api/public/otel/v1/traces"  # 🇺🇸 US data region
os.environ["OTEL_EXPORTER_OTLP_TRACES_HEADERS"] = f"Authorization=Basic {LANGFUSE_AUTH}"
os.environ["OTEL_EXPORTER_OTLP_TRACES_PROTOCOL"] = "http/protobuf"
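The Basic auth header is simply the base64 encoding of `public_key:secret_key`. If traces do not appear in Langfuse, a quick round-trip check (using only the standard library) confirms the header was built correctly:

```python
import base64

# Placeholder keys, as in the setup cell above
public_key = "pk-lf-..."
secret_key = "sk-lf-..."

auth = base64.b64encode(f"{public_key}:{secret_key}".encode()).decode()

# Decoding should round-trip to the original "public:secret" pair
decoded = base64.b64decode(auth).decode()
assert decoded == f"{public_key}:{secret_key}"
```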

Step 2: Enable MLflow Tracing for DSPy

Next, we use MLflow’s autologging module for DSPy to automatically capture your DSPy traces. A single call instruments DSPy’s LM calls.

import mlflow
mlflow.dspy.autolog()

Step 3: Configure DSPy

Next, we set up DSPy. This involves initializing a language model and configuring DSPy to use it. You can then run various DSPy modules that showcase its features.

 
import dspy
lm = dspy.LM('openai/gpt-4o-mini', api_key='sk-proj-...')
dspy.configure(lm=lm)

Step 4: Running DSPy Modules with Observability

Here are a few examples from the DSPy documentation showing core features. Each example automatically sends trace data to Langfuse via MLflow.

Example 1: Using the Chain-of-Thought Module (Math Reasoning)

math = dspy.ChainOfThought("question -> answer: float")
math(question="Two dice are tossed. What is the probability that the sum equals two?")
Prediction(
    reasoning='When two dice are tossed, each die has 6 faces, resulting in a total of 6 * 6 = 36 possible outcomes. The only way to achieve a sum of 2 is if both dice show a 1 (1,1). There is only 1 favorable outcome for this event. Therefore, the probability of the sum equaling 2 is the number of favorable outcomes divided by the total number of outcomes, which is 1/36.',
    answer=0.027777777777777776
)
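The model’s answer matches the exact value 1/36 ≈ 0.0278. You can verify the counting argument from the reasoning with a few lines of standard-library Python:

```python
from fractions import Fraction

# Enumerate all 6 * 6 = 36 equally likely outcomes of two dice
outcomes = [(a, b) for a in range(1, 7) for b in range(1, 7)]

# Only (1, 1) sums to 2
favorable = [o for o in outcomes if sum(o) == 2]

probability = Fraction(len(favorable), len(outcomes))
print(probability, float(probability))  # 1/36 0.027777777777777776
```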

Example 2: Building a RAG Pipeline

def search_wikipedia(query: str) -> list[str]:
    results = dspy.ColBERTv2(url='http://20.102.90.50:2017/wiki17_abstracts')(query, k=3)
    return [x['text'] for x in results]
 
rag = dspy.ChainOfThought('context, question -> response')
 
question = "What's the name of the castle that David Gregory inherited?"
rag(context=search_wikipedia(question), question=question)
Prediction(
    reasoning='The context states that David Gregory inherited Kinnairdy Castle in 1664. Therefore, the name of the castle he inherited is Kinnairdy Castle.',
    response='The name of the castle that David Gregory inherited is Kinnairdy Castle.'
)

Example 3: Running a ReAct Agent with Tools

def evaluate_math(expression: str):
    return dspy.PythonInterpreter({}).execute(expression)
 
def search_wikipedia(query: str):
    results = dspy.ColBERTv2(url='http://20.102.90.50:2017/wiki17_abstracts')(query, k=3)
    return [x['text'] for x in results]
 
react = dspy.ReAct("question -> answer: float", tools=[evaluate_math, search_wikipedia])
 
pred = react(question="What is 9362158 divided by the year of birth of David Gregory of Kinnairdy castle?")
print(pred.answer)
5765.0

Disabling Auto Tracing

To disable auto tracing, pass the disable=True parameter:

import mlflow
mlflow.dspy.autolog(disable=True)

MLflow Trace Decorator

If you want to trace additional application logic, you can use the MLflow trace decorator. This allows you to capture the inputs and outputs of a function by adding the @mlflow.trace decorator to its definition.

Note: For other native Langfuse integrations that do not rely on an OpenTelemetry instrumentation module (such as OpenAI, LangChain, or Hugging Face), you can use the Langfuse decorator to trace additional application logic.

import mlflow
 
# Mark any function with the trace decorator to automatically capture input(s) and output(s)
@mlflow.trace
def some_function(x, y, z=2):
    return x + (y - z)
 
# Invoking the function will generate a trace that is logged to the active experiment
some_function(2, 4)

Step 5: Viewing Traces in Langfuse

After running your DSPy application, you can inspect the traced events in Langfuse:

Example trace in Langfuse

Public example trace link in Langfuse
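In short-lived scripts, spans may still be buffered in the exporter when the process exits, so the last traces can be lost. A hedged sketch of forcing a flush before exit, assuming MLflow has registered the OpenTelemetry SDK's TracerProvider globally (depending on your MLflow version, the exporter may be configured differently):

```python
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider

provider = trace.get_tracer_provider()

# The SDK TracerProvider exposes force_flush(); the no-op default provider does not.
if isinstance(provider, TracerProvider):
    provider.force_flush()
```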
