Observability for BytePlus with Langfuse
This guide shows you how to integrate BytePlus with Langfuse. BytePlus API endpoints for chat, language and code, image, and embedding models are fully compatible with OpenAI's API, which lets you use the Langfuse OpenAI drop-in replacement to trace all parts of your application.
What is BytePlus? BytePlus is a suite of AI-powered APIs and services developed by ByteDance, including speech, video, and recommendation technologies. Langfuse integrates with BytePlus to trace and evaluate LLM workflows that use BytePlus tools, enabling observability across generation and user interaction.
What is Langfuse? Langfuse is an open source LLM engineering platform that helps teams trace API calls, monitor performance, and debug issues in their AI applications.
Note: You can also use BytePlus models in the Langfuse Playground and for LLM-as-a-Judge evaluations using the OpenAI adapter. Find out how to set up an LLM Connection in Langfuse here.
Step 1: Install Dependencies
Make sure you have installed the necessary Python packages:
%pip install openai langfuse
Step 2: Set Up Environment Variables
Next, set up your Langfuse API keys. You can get these keys by signing up for a free Langfuse Cloud account or by self-hosting Langfuse. These environment variables are essential for the OpenAI drop-in replacement to authenticate and send data to your Langfuse project.
Find a guide on creating your BytePlus API keys for model services here.
import os
# Get keys for your project from the project settings page: https://cloud.langfuse.com
os.environ["LANGFUSE_PUBLIC_KEY"] = "pk-lf-..."
os.environ["LANGFUSE_SECRET_KEY"] = "sk-lf-..."
os.environ["LANGFUSE_HOST"] = "https://cloud.langfuse.com" # 🇪🇺 EU region
# os.environ["LANGFUSE_HOST"] = "https://us.cloud.langfuse.com" # 🇺🇸 US region
# Get your BytePlus API key from the BytePlus console
os.environ["ARK_API_KEY"] = "***"
Step 3: Langfuse OpenAI drop-in Replacement
In this step, we use the native OpenAI drop-in replacement by importing openai from langfuse.openai instead of the standard openai package.

To start using BytePlus models with OpenAI's client libraries, pass your BytePlus API key to the api_key option and change the base_url to https://ark.ap-southeast.bytepluses.com/api/v3:
# Instead of: import openai
from langfuse.openai import openai

client = openai.OpenAI(
    api_key=os.environ.get("ARK_API_KEY"),
    base_url="https://ark.ap-southeast.bytepluses.com/api/v3",
)
Note: The OpenAI drop-in replacement is fully compatible with the Langfuse Python SDK and the @observe() decorator, so you can trace all parts of your application.
Step 4: Run an Example
The following cell demonstrates how to call the Kimi K2 model via BytePlus using the traced OpenAI client. All API calls will be automatically traced by Langfuse.
# Non-streaming:
print("----- standard request -----")
completion = client.chat.completions.create(
    # Specify the Ark inference endpoint ID you created (replace with your own endpoint ID)
    model="kimi-k2-250711",
    messages=[
        {"role": "system", "content": "You're an AI assistant"},
        {"role": "user", "content": "What is Langfuse?"},
    ],
    name="BytePlus-Generation",  # Optional: set the name of the generation in Langfuse
)
print(completion.choices[0].message.content)
# Streaming:
print("----- streaming request -----")
stream = client.chat.completions.create(
    # Specify the Ark inference endpoint ID you created (replace with your own endpoint ID)
    model="kimi-k2-250711",
    messages=[
        {"role": "system", "content": "You're an AI assistant"},
        {"role": "user", "content": "What is Langfuse?"},
    ],
    name="BytePlus-Generation",  # Optional: set the name of the generation in Langfuse
    stream=True,  # Stream the response content back chunk by chunk
)
for chunk in stream:
    if not chunk.choices:
        continue
    # delta.content can be None (e.g., in the final chunk), so fall back to ""
    print(chunk.choices[0].delta.content or "", end="")
print()
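BytePlus embedding endpoints are OpenAI-compatible as well, so the same traced client can capture embedding calls. A minimal sketch, assuming you have an embedding model available in your BytePlus account (the model ID below is a placeholder):

# The embeddings endpoint is traced by Langfuse just like chat completions
response = client.embeddings.create(
    model="your-embedding-endpoint-id",  # Placeholder: replace with your BytePlus embedding model or endpoint ID
    input="What is Langfuse?",
)
print(f"Embedding dimensions: {len(response.data[0].embedding)}")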
Step 5: See Traces in Langfuse
After running the example model call, you can see the traces in Langfuse. You will see detailed information about your BytePlus API calls, including:
- Request parameters (model, messages, temperature, etc.)
- Response content
- Token usage statistics
- Latency metrics
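Note: Langfuse sends events asynchronously in the background. If you run these examples as a short-lived script rather than in a notebook, flush the client before the process exits so no traces are lost:

from langfuse import get_client

# Block until all buffered events have been delivered to Langfuse
get_client().flush()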
Interoperability with the Python SDK
You can use this integration together with the Langfuse Python SDK to add additional attributes to the trace.

The @observe() decorator provides a convenient way to automatically wrap your instrumented code and add attributes to the trace.
from langfuse import observe, get_client

langfuse = get_client()

@observe()
def my_instrumented_function(input):
    output = my_llm_call(input)

    # Attach additional attributes to the current trace
    langfuse.update_current_trace(
        input=input,
        output=output,
        user_id="user_123",
        session_id="session_abc",
        tags=["agent", "my-trace"],
        metadata={"email": "[email protected]"},
        version="1.0.0",
    )

    return output
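In the snippet above, my_llm_call is a stand-in for your own code. As a minimal sketch, it could simply call the traced BytePlus client from Step 3, in which case the generation appears as a child observation inside the same trace (the helper name is illustrative):

def my_llm_call(input):
    # Nested call to the traced client from Step 3
    completion = client.chat.completions.create(
        model="kimi-k2-250711",
        messages=[{"role": "user", "content": input}],
    )
    return completion.choices[0].message.content

print(my_instrumented_function("What is Langfuse?"))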
Learn more about using the Decorator in the Python SDK docs.
Next Steps
Once you have instrumented your code, you can manage, evaluate and debug your application: