OSS Observability for OpenAI Assistants API
This is a Jupyter notebook

Cookbook: Observability for OpenAI Assistants API with Langfuse

This cookbook demonstrates how to use the Langfuse observe decorator (opens in a new tab) to trace calls made to the OpenAI Assistants API (opens in a new tab). It covers creating an assistant, running it on a thread, and observing the execution with Langfuse tracing (opens in a new tab).

Note: The native OpenAI SDK wrapper (opens in a new tab) does not support tracing of the OpenAI assistants API, you need to instrument it via the decorator as shown in this notebook.

What is the Assistants API?

The Assistants API from OpenAI allows developers to build AI assistants that can utilize multiple tools and data sources in parallel, such as code interpreters, file search, and custom tools created by calling functions. These assistants can access OpenAI's language models like GPT-4 with specific prompts, maintain persistent conversation histories, and process various file formats like text, images, and spreadsheets. Developers can fine-tune the language models on their own data and control aspects like output randomness. The API provides a framework for creating AI applications that combine language understanding with external tools and data.

Example Trace Output

OpenAI Assistants Trace in Langfuse


Install the required packages:

%pip install --upgrade openai langfuse

Set your environment:

import os
# Get keys for your project from the project settings page
os.environ["LANGFUSE_PUBLIC_KEY"] = ""
os.environ["LANGFUSE_SECRET_KEY"] = ""
os.environ["LANGFUSE_HOST"] = "" # 🇪🇺 EU region
# os.environ["LANGFUSE_HOST"] = "" # 🇺🇸 US region
# Your openai key
os.environ["OPENAI_API_KEY"] = ""

Step by step

1. Creating an Assistant

Use the client.beta.assistants.create method to create a new assistant. Alternatively you can also create the assistant via the OpenAI console:

from langfuse.decorators import observe
from openai import OpenAI
def create_assistant():
    client = OpenAI()
    assistant = client.beta.assistants.create(
        name="Math Tutor",
        instructions="You are a personal math tutor. Answer questions briefly, in a sentence or less.",
    return assistant
assistant = create_assistant()
print(f"Created assistant: {}")

Public link of example trace (opens in a new tab) of assistant creation

2. Running the Assistant

Create a thread and run the assistant on it:

def run_assistant(assistant_id, user_input):
    client = OpenAI()
    thread = client.beta.threads.create()
    client.beta.threads.messages.create(, role="assistant", content="I am a math tutor that likes to help math students, how can I help?"
    client.beta.threads.messages.create(, role="user", content=user_input
    run = client.beta.threads.runs.create(,
    return run, thread
user_input = "I need to solve the equation `3x + 11 = 14`. Can you help me?"
run, thread = run_assistant(, user_input)
print(f"Created run: {}")

Public link of example trace (opens in a new tab) of message and trace creation

3. Getting the Response

Retrieve the assistant's response from the thread:

import json
from langfuse.decorators import langfuse_context
def get_response(thread_id, run_id):
    client = OpenAI()
    messages = client.beta.threads.messages.list(thread_id=thread_id, order="asc")
    assistant_response =[-1].content[0].text.value
    # get run for token counts
    run_log = client.beta.threads.runs.retrieve(
    message_log = client.beta.threads.messages.list(
    input_messages = [{"role": message.role, "content": message.content[0].text.value} for message in[::-1][:-1]]
    # log internal generation within the openai assistant as a separate child generation to langfuse
    # get langfuse client used by the decorator, uses the low-level Python SDK
    langfuse_client = langfuse_context._get_langfuse()
    # pass trace_id and current observation ids to the newly created child generation
    return assistant_response, run
response = get_response(,
print(f"Assistant response: {response[0]}")

Public link of example trace (opens in a new tab) of fetching the response

All in one trace

import time
def run_math_tutor(user_input):
    assistant = create_assistant()
    run, thread = run_assistant(, user_input)
    time.sleep(5) # notebook only, wait for the assistant to finish
    response = get_response(,
    return response[0]
user_input = "I need to solve the equation `3x + 11 = 14`. Can you help me?"
response = run_math_tutor(user_input)
print(f"Assistant response: {response}")

The Langfuse trace shows the flow of creating the assistant, running it on a thread with user input, and retrieving the response, along with the captured input/output data.

Public link of example trace (opens in a new tab)

OpenAI Assistants Trace in Langfuse

Learn more

If you use non-Assistants API endpoints, you can use the OpenAI SDK wrapper for tracing. Check out the Langfuse documentation (opens in a new tab) for more details.

Was this page useful?

Questions? We're here to help

Subscribe to updates