# Helicone Integration
In this guide, we’ll show you how to integrate Langfuse with Helicone.
**What is Helicone?** Helicone is an open-source AI gateway that gives you access to 100+ AI models through an OpenAI-compatible interface. It offers features like intelligent routing, automatic failover, caching, cost tracking, and more.

**What is Langfuse?** Langfuse is an open-source LLM engineering platform that helps teams trace LLM calls, monitor performance, and debug issues in their AI applications.
Since Helicone is OpenAI-compatible, we can use Langfuse's native integration with the OpenAI SDK, which is available in both Python and TypeScript.
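Concretely, the integration is an import swap: anywhere you would import the OpenAI SDK, import Langfuse's wrapped version instead, and every call made through it is traced automatically. A minimal sketch:

```python
# Instead of:  import openai
# use the Langfuse drop-in wrapper, which traces all OpenAI-compatible calls:
from langfuse.openai import openai
```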
## Get started
- In your terminal, install the following packages if you haven't already:

```bash
pip install langfuse openai python-dotenv
```

- Then, create a `.env` file in your project and add your environment variables:

```bash
HELICONE_API_KEY=sk-helicone-... # Get it from your Helicone dashboard
LANGFUSE_SECRET_KEY=sk-lf-...
LANGFUSE_PUBLIC_KEY=pk-lf-...
LANGFUSE_BASE_URL=https://cloud.langfuse.com # 🇪🇺 EU region
# LANGFUSE_BASE_URL=https://us.cloud.langfuse.com # 🇺🇸 US region
```
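Optionally, verify that your Langfuse credentials resolve before running the examples. A minimal sketch, assuming the v3 Python SDK, where `get_client()` reads the `LANGFUSE_*` variables from the environment:

```python
from dotenv import load_dotenv
from langfuse import get_client

load_dotenv()  # load LANGFUSE_* and HELICONE_API_KEY from .env

# auth_check() calls the Langfuse API with your keys and returns True on success
if not get_client().auth_check():
    raise RuntimeError("Langfuse credentials are invalid or the host is unreachable")
```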
## Example 1: Simple LLM Call

We use Langfuse's OpenAI SDK wrapper to automatically log Helicone calls as generations in Langfuse.
- The `base_url` is set to Helicone's AI Gateway endpoint.
- You can replace `"gpt-4o"` with any model available in Helicone's model registry.
- The `api_key` uses your Helicone API key, which handles authentication with the model providers.
```python
from langfuse.openai import openai
import os
from dotenv import load_dotenv

load_dotenv()

# Create an OpenAI client with Helicone's gateway endpoint
client = openai.OpenAI(
    api_key=os.getenv("HELICONE_API_KEY"),
    base_url="https://ai-gateway.helicone.ai/"
)

# Make a chat completion request
response = client.chat.completions.create(
    model="gpt-4o",  # Or any of 100+ other models
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Tell me a fun fact about space."}
    ],
    name="fun-fact-request"  # Optional: name of the generation in Langfuse
)

print(response.choices[0].message.content)
```
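Langfuse sends events asynchronously in the background, so short-lived scripts can exit before everything is delivered. A minimal sketch, assuming the v3 Python SDK's `get_client()` helper, that flushes pending events before shutdown:

```python
from langfuse import get_client

# Block until all queued trace events have been delivered to Langfuse
get_client().flush()
```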
## Example 2: Nested LLM Calls

By using the `@observe()` decorator, we capture execution details of any Python function, including nested LLM calls, inputs, outputs, and execution times. This provides in-depth observability with minimal code changes.
- The `@observe()` decorator captures inputs, outputs, and execution details of the functions.
- Nested functions `summarize_text` and `analyze_sentiment` are also decorated, creating a hierarchy of traces.
- Each LLM call within the functions is logged, providing a detailed trace of the execution flow.
```python
from langfuse import observe
from langfuse.openai import openai
import os
from dotenv import load_dotenv

load_dotenv()

# Create an OpenAI client with Helicone's gateway endpoint
client = openai.OpenAI(
    base_url="https://ai-gateway.helicone.ai/",
    api_key=os.getenv("HELICONE_API_KEY")
)

@observe()  # This decorator enables tracing of the function
def analyze_text(text: str):
    # First LLM call: summarize the text
    summary_response = summarize_text(text)
    summary = summary_response.choices[0].message.content

    # Second LLM call: analyze the sentiment of the summary
    sentiment_response = analyze_sentiment(summary)
    sentiment = sentiment_response.choices[0].message.content

    return {
        "summary": summary,
        "sentiment": sentiment
    }

@observe()  # Nested function to be traced
def summarize_text(text: str):
    return client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": "You summarize texts in a concise manner."},
            {"role": "user", "content": f"Summarize the following text:\n{text}"}
        ],
        name="summarize-text"
    )

@observe()  # Nested function to be traced
def analyze_sentiment(summary: str):
    return client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": "You analyze the sentiment of texts."},
            {"role": "user", "content": f"Analyze the sentiment of the following summary:\n{summary}"}
        ],
        name="analyze-sentiment"
    )

# Example usage
text_to_analyze = "OpenAI's GPT-4 model has significantly advanced the field of AI, setting new standards for language generation."
analyze_text(text_to_analyze)
```
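You can also enrich the trace from inside an observed function, for example with a user or session ID for filtering in the Langfuse UI. A minimal sketch, assuming the v3 Python SDK's `get_client()` and `update_current_trace()`; the IDs shown are placeholders:

```python
from langfuse import get_client, observe

@observe()
def answer_question(question: str):
    # Attach identifiers to the trace opened by @observe(); placeholder values
    get_client().update_current_trace(
        user_id="user-123",
        session_id="session-abc",
        tags=["helicone-gateway"],
    )
    # ... make LLM calls through the Helicone client as above ...
```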
## Example 3: Streaming Responses

The Langfuse OpenAI wrapper also traces streaming responses. This example reuses the `client` from above:

```python
stream = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "user", "content": "Write a haiku about a robot."}
    ],
    stream=True,
    name="streaming-story"
)

print("🤖 Assistant (streaming):")
for chunk in stream:
    if chunk.choices[0].delta.content is not None:
        print(chunk.choices[0].delta.content, end="", flush=True)
print("\n")
```
## Example 4: Multi-Provider Access

Helicone provides access to 100+ models across many providers through a single interface. Simply change the model name to use a different provider:
```python
# Use Anthropic's Claude
response = client.chat.completions.create(
    model="claude-3.5-sonnet-v2/anthropic",
    messages=[{"role": "user", "content": "Hello!"}]
)

# Fall back to Gemini 2.5 Flash Lite if Claude 3.5 Sonnet is unavailable
response = client.chat.completions.create(
    model="claude-3.5-sonnet-v2/anthropic,gemini-2.5-flash-lite/google-ai-studio",
    messages=[{"role": "user", "content": "Hello!"}]
)
```
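Provider fallbacks combine naturally with tracing. A minimal sketch using the same comma-separated fallback syntax shown above; the helper name `ask` is hypothetical:

```python
from langfuse import observe
from langfuse.openai import openai
import os

client = openai.OpenAI(
    base_url="https://ai-gateway.helicone.ai/",
    api_key=os.getenv("HELICONE_API_KEY"),
)

# Helicone tries each model in the list from left to right until one succeeds
FALLBACK_CHAIN = "claude-3.5-sonnet-v2/anthropic,gemini-2.5-flash-lite/google-ai-studio"

@observe()  # hypothetical helper; the generation is logged with the model that answered
def ask(prompt: str) -> str:
    response = client.chat.completions.create(
        model=FALLBACK_CHAIN,
        messages=[{"role": "user", "content": prompt}],
        name="fallback-chat",
    )
    return response.choices[0].message.content

print(ask("Hello!"))
```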
## Learn More

- Helicone Model Registry: Browse all available models in Helicone's model registry.
- Helicone Documentation: Documentation for Helicone.
- Helicone GitHub: GitHub repository for Helicone.
- Helicone Twitter: Stay up to date with Helicone’s latest news and updates.