
Observability for OpenAI SDK (JS/TS)

Looking for the Python version? Check it out here.

The Langfuse JS/TS SDK offers a wrapper function around the OpenAI SDK, enabling you to easily add observability to your OpenAI calls. This includes tracking latencies, time-to-first-token on stream responses, errors, and model usage.

import OpenAI from "openai";
import { observeOpenAI } from "@langfuse/openai";
 
const openai = observeOpenAI(new OpenAI());
 
const res = await openai.chat.completions.create({
  model: "gpt-4o",
  messages: [{ role: "system", content: "Tell me a story about a dog." }],
});

Langfuse automatically tracks:

  • All prompts/completions with support for streaming and function calling
  • Total latencies and time-to-first-token
  • OpenAI API Errors
  • Model usage (tokens) and cost (USD) (learn more)

How it works

Install Langfuse SDK

The integration is compatible with OpenAI SDK versions >=4.0.0.

npm install @langfuse/openai openai

Register your credentials

Add your Langfuse credentials to your environment variables. Make sure that you have a .env file in your project root and a package like dotenv to load the variables.

.env
LANGFUSE_SECRET_KEY="sk-lf-..."
LANGFUSE_PUBLIC_KEY="pk-lf-..."
LANGFUSE_BASE_URL="https://cloud.langfuse.com" # 🇪🇺 EU region
# LANGFUSE_BASE_URL="https://us.cloud.langfuse.com" # 🇺🇸 US region
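The variables must be loaded before any Langfuse code runs. A minimal sketch of guarding against missing credentials at startup (`requireEnv` is a hypothetical helper, not part of the Langfuse SDK):

```typescript
// If you use dotenv, `import "dotenv/config"` at the top of your entrypoint
// populates process.env from the .env file before anything else runs.

// Hypothetical helper: fail fast when a credential is missing.
function requireEnv(name: string): string {
  const value = process.env[name];
  if (!value) {
    throw new Error(`Missing environment variable: ${name}`);
  }
  return value;
}

// Example usage once the variables are loaded:
// const secretKey = requireEnv("LANGFUSE_SECRET_KEY");
```

Failing fast here surfaces misconfiguration at startup rather than as silently missing traces later.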

Initialize OpenTelemetry

The Langfuse TypeScript SDK’s tracing is built on top of OpenTelemetry, so you need to set up the OpenTelemetry SDK. The LangfuseSpanProcessor is the key component that sends traces to Langfuse.

import { NodeSDK } from "@opentelemetry/sdk-node";
import { LangfuseSpanProcessor } from "@langfuse/otel";
 
const sdk = new NodeSDK({
  spanProcessors: [new LangfuseSpanProcessor()],
});
 
sdk.start();

Call OpenAI methods with the wrapped client

With your environment configured, call OpenAI SDK methods as usual from the wrapped client.

import OpenAI from "openai";
import { observeOpenAI } from "@langfuse/openai";
 
const openai = observeOpenAI(new OpenAI());
 
const res = await openai.chat.completions.create({
  messages: [{ role: "system", content: "Tell me a story about a dog." }],
  model: "gpt-4o",
  max_tokens: 300,
});

Done!✨ You now have full observability of your OpenAI calls in Langfuse.

Check out the notebook for end-to-end examples of the integration.

Troubleshooting

Queuing and batching of events

The Langfuse SDKs queue and batch events in the background to reduce the number of network requests and improve overall performance. In a long-running application, this works without any additional configuration.

If you are running a short-lived application, you need to flush the span processor to ensure that all events are sent before the application exits.

await langfuseSpanProcessor.forceFlush();
 
// If you have previously initialized a Langfuse client, you can use that for the flush call
await langfuse.flush();

Learn more about queuing and batching of events here.

Assistants API

Tracing of the Assistants API is not supported by this integration, as OpenAI Assistants keep server-side state that cannot easily be captured without additional API requests. We added more information on how to best track usage of the Assistants API in this FAQ.

Advanced usage

Custom trace properties

You can add the following properties to the langfuseConfig of the observeOpenAI function to use additional Langfuse features:

  • generationName: Set to identify a specific type of generation.
  • langfusePrompt: Pass a created or fetched Langfuse prompt to link it with the resulting generations.
  • metadata: Additional information that you want to see in Langfuse.
  • sessionId: The current session ID.
  • userId: The current user ID.
  • tags: Categorize and filter traces.

Example:

const res = await observeOpenAI(new OpenAI(), {
  generationName: "Traced generation",
  metadata: { someMetadataKey: "someValue" },
  sessionId: "session-id",
  userId: "user-id",
  tags: ["tag1", "tag2"],
}).chat.completions.create({
  messages: [{ role: "system", content: "Tell me a story about a dog." }],
  model: "gpt-3.5-turbo",
  max_tokens: 300,
});

Adding custom properties requires you to wrap the OpenAI SDK with the observeOpenAI function and pass the properties as the second langfuseConfig argument. Since the Langfuse client here is a singleton and the same client is used for all calls, you do not need to worry about mistakenly having multiple clients running.

With Langfuse Prompt management you can effectively manage and version your prompts. You can link your OpenAI generations to a prompt by passing the langfusePrompt property to the observeOpenAI function.

import { observeOpenAI } from "@langfuse/openai";
import OpenAI from "openai";
 
const langfusePrompt = await langfuse.prompt.get("my-prompt"); // Fetch a previously created prompt
 
const res = await observeOpenAI(new OpenAI(), {
  langfusePrompt,
}).completions.create({
  prompt: langfusePrompt.prompt,
  model: "gpt-3.5-turbo-instruct",
  max_tokens: 300,
});

Resulting generations are now linked to the prompt in Langfuse, allowing you to track prompt usage and performance.

When working with chat prompts, you must typecast the compiled prompt messages as OpenAI.ChatCompletionMessageParam[] or use a type-guard utility function as Langfuse message roles can be arbitrary strings whereas the OpenAI type definition is more restrictive.
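A minimal sketch of such a type-guard (the types and helper below are illustrative, not part of the Langfuse SDK, and a real compiled prompt may carry additional roles such as `tool`):

```typescript
// Illustrative types: a compiled Langfuse chat prompt allows arbitrary role
// strings, while OpenAI's chat API accepts only a fixed set of roles.
type CompiledMessage = { role: string; content: string };
type OpenAIChatRole = "system" | "user" | "assistant";
type OpenAIChatMessage = { role: OpenAIChatRole; content: string };

// Hypothetical type-guard narrowing a compiled message to an OpenAI message.
function isOpenAIChatMessage(msg: CompiledMessage): msg is OpenAIChatMessage {
  return ["system", "user", "assistant"].includes(msg.role);
}

const compiled: CompiledMessage[] = [
  { role: "system", content: "You are a helpful assistant." },
  { role: "user", content: "Tell me a story about a dog." },
];

// Messages with unsupported roles are filtered out rather than typecast blindly.
const safeMessages: OpenAIChatMessage[] = compiled.filter(isOpenAIChatMessage);
```

Filtering with a guard keeps the compiler happy without an unchecked `as` cast, at the cost of silently dropping messages with unexpected roles.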

OpenAI token usage on streamed responses

OpenAI returns token usage on streamed responses only when the include_usage parameter in stream_options is set to true. If you would like to benefit from OpenAI’s directly provided token usage, set stream_options: { include_usage: true } in your request.

import OpenAI from "openai";
import { observeOpenAI } from "@langfuse/openai";
 
const openai = observeOpenAI(new OpenAI());
 
const stream = await openai.chat.completions.create({
  model: "gpt-4",
  messages: [{ role: "user", content: "How are you?" }],
  stream: true,
  stream_options: { include_usage: true },
});
 
let result = "";
 
for await (const chunk of stream) {
  // Check if chunk choices are not empty. OpenAI returns token usage in a final chunk with an empty choices list.
  result += chunk.choices[0]?.delta?.content || "";
}
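If you also want to read the usage yourself, the final chunk (the one with the empty choices list) carries it. A mocked sketch of that chunk shape (simplified fields, no network call):

```typescript
// Mocked chunks approximating what OpenAI streams with include_usage: true.
type MockChunk = {
  choices: { delta: { content?: string } }[];
  usage: { prompt_tokens: number; completion_tokens: number; total_tokens: number } | null;
};

const chunks: MockChunk[] = [
  { choices: [{ delta: { content: "Hello" } }], usage: null },
  { choices: [{ delta: { content: "!" } }], usage: null },
  // Final chunk: empty choices, usage populated.
  { choices: [], usage: { prompt_tokens: 5, completion_tokens: 2, total_tokens: 7 } },
];

let text = "";
let usage: MockChunk["usage"] = null;
for (const chunk of chunks) {
  text += chunk.choices[0]?.delta?.content ?? "";
  if (chunk.usage) usage = chunk.usage; // only the final chunk carries usage
}
```

The same loop shape works on the real stream: accumulate deltas while choices are non-empty, and pick up usage from the trailing chunk.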
