
Cookbook: Observability for Groq Models (Python)

This cookbook shows two ways to interact with Groq models and trace them with Langfuse:

  1. Using the OpenAI SDK to interact with the Groq model
  2. Using the Groq SDK to interact with Groq models

By following these examples, you’ll learn how to log and trace interactions with Groq language models, enabling you to debug and evaluate the performance of your AI-driven applications.

ℹ️ Note: Langfuse is also natively integrated with LangChain, LlamaIndex, LiteLLM, and other frameworks. If you use one of them, any use of Groq models is instrumented right away.
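
For illustration, here is a minimal sketch of what that looks like with LangChain. This assumes the langchain-groq package and the Langfuse CallbackHandler for LangChain; it is not part of the two options covered in this cookbook.

# Minimal sketch (assumption): Groq via LangChain, traced by the Langfuse callback handler
# Requires: pip install langfuse langchain-groq
from langfuse.langchain import CallbackHandler
from langchain_groq import ChatGroq
 
# Reads LANGFUSE_* and GROQ_API_KEY from the environment (set in the next step)
langfuse_handler = CallbackHandler()
llm = ChatGroq(model="llama3-8b-8192")
 
response = llm.invoke(
    "Write a haiku about tracing LLM calls",
    config={"callbacks": [langfuse_handler]},
)
print(response.content)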

Overview

In this notebook, we will explore various use cases where Langfuse can be integrated with the Groq SDK, including:

  • Basic LLM Calls: Learn how to wrap standard Groq model interactions with Langfuse’s @observe decorator for comprehensive logging.
  • Chained Function Calls: See how to manage and observe complex workflows where multiple model interactions are linked together to produce a final result.
  • Streaming Support: Discover how to use Langfuse with streaming responses from Groq models, ensuring that real-time interactions are fully traceable.

To get started, set up your environment variables for Langfuse and Groq:

import os
 
# Get keys for your project from the project settings page: https://cloud.langfuse.com
os.environ["LANGFUSE_PUBLIC_KEY"] = "pk-lf-..." 
os.environ["LANGFUSE_SECRET_KEY"] = "sk-lf-..." 
os.environ["LANGFUSE_HOST"] = "https://cloud.langfuse.com" # 🇪🇺 EU region
# os.environ["LANGFUSE_HOST"] = "https://us.cloud.langfuse.com" # 🇺🇸 US region
 
# Your Groq API key
os.environ["GROQ_API_KEY"] = "gsk_..."

Option 1: Using the OpenAI SDK to interact with the Groq model

Note: This example shows how to use the OpenAI Python SDK. If you use JS/TS, have a look at our OpenAI JS/TS SDK.

Install Required Packages

%pip install langfuse openai --upgrade

Import Necessary Modules

Instead of importing openai directly, import it from langfuse.openai. Also, import any other necessary modules.

# Instead of: import openai
from langfuse.openai import OpenAI

Initialize the OpenAI Client for the Groq Model

Initialize the OpenAI client, but point it to the Groq API endpoint and authenticate with your Groq API key.

client = OpenAI(
    base_url="https://api.groq.com/openai/v1",
    api_key=os.environ.get("GROQ_API_KEY")
)

Chat Completion Request

Use the client to make a chat completion request to the Groq model.

completion = client.chat.completions.create(
    model="llama3-8b-8192",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {
            "role": "user",
            "content": "Write a poem about language models"
        }
    ]
)
print(completion.choices[0].message.content)
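
Langfuse sends events asynchronously in the background. In short-lived environments such as scripts or notebooks, flush the queue before the process exits so that no traces are lost:

from langfuse import get_client
 
# Make sure all buffered events are sent to Langfuse before the process ends
get_client().flush()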

Example trace in Langfuse

Option 2: Using the Groq SDK to interact with Groq models

For more detailed guidance on the Groq SDK or the @observe decorator from Langfuse, please refer to the Groq Documentation and the Langfuse Documentation.

Install Required Packages

%pip install groq langfuse

Initialize the Groq client:

from groq import Groq
 
# Initialize Groq client
groq_client = Groq(api_key=os.environ["GROQ_API_KEY"])

Examples

Basic LLM Calls

We integrate the Groq SDK with Langfuse using the @observe decorator, which logs and traces interactions with large language models (LLMs). The @observe(as_type="generation") variant specifically marks the call as an LLM generation, capturing inputs, outputs, and model parameters. The resulting groq_chat_completion function can then be reused across your project.

from langfuse import observe, get_client
langfuse = get_client()
 
# Function to handle Groq chat completion calls, wrapped with @observe to log the LLM interaction
@observe(as_type="generation")
def groq_chat_completion(**kwargs):
    # Clone kwargs to avoid modifying the original input
    kwargs_clone = kwargs.copy()
 
    # Extract relevant parameters from kwargs
    messages = kwargs_clone.pop('messages', None)
    model = kwargs_clone.pop('model', None)
    temperature = kwargs_clone.pop('temperature', None)
    max_tokens = kwargs_clone.pop('max_tokens', None)
    top_p = kwargs_clone.pop('top_p', None)
 
    # Filter and prepare model parameters for logging
    model_parameters = {
        "max_tokens": max_tokens,
        "temperature": temperature,
        "top_p": top_p
    }
    model_parameters = {k: v for k, v in model_parameters.items() if v is not None}
 
    # Log the input and model parameters before calling the LLM
    langfuse.update_current_generation(
        input=messages,
        model=model,
        model_parameters=model_parameters,
        metadata=kwargs_clone,
    )
 
    # Call the Groq model to generate a response
    response = groq_client.chat.completions.create(**kwargs)
 
    # Log approximate usage (character counts as a simple proxy) and the output content after the LLM call
    choice = response.choices[0]
    langfuse.update_current_generation(
        usage_details={
            "input": len(str(messages)),
            "output": len(choice.message.content)
        },
        output=choice.message.content
    )
 
    # Return the model's response object
    return response

Simple Example

In the following example, we also added the decorator to the top-level function find_best_painter_from. This function calls the groq_chat_completion function, which is decorated with @observe(as_type="generation"). This hierarchical setup helps to trace more complex applications that involve multiple LLM calls and other non-LLM methods decorated with @observe.

@observe()
def find_best_painter_from(country="France"):
    response = groq_chat_completion(
        model="llama3-70b-8192",
        max_tokens=1024,
        temperature=0.4,
        messages=[
            {
                "role": "user",
                "content": f"this is a test"
            }
        ]
    )
    return response.choices[0].message.content
 
print(find_best_painter_from())

Example trace in Langfuse

Chained Completions

This example demonstrates chaining multiple LLM calls using the @observe() decorator. The first call identifies the best painter from a specified country, and the second call uses that painter’s name to find their most famous painting. Both interactions are logged by Langfuse as we use the wrapped groq_chat_completion method created above, ensuring full traceability across the chained requests.

from langfuse import observe, get_client
langfuse = get_client()
 
@observe()
def find_best_painting_from(country="France"):
    response = groq_chat_completion(
        model="llama3-70b-8192",
        max_tokens=1024,
        temperature=0.1,
        messages=[
            {
                "role": "user",
                "content": f"Who is the best painter from {country}? Only provide the name."
            }
        ]
    )
    painter_name = response.choices[0].message.content.strip()
 
    response = groq_chat_completion(
        model="llama3-70b-8192",
        max_tokens=1024,
        messages=[
            {
                "role": "user",
                "content": f"What is the most famous painting of {painter_name}? Answer in one short sentence."
            }
        ]
    )
    return response.choices[0].message.content
 
print(find_best_painting_from("Germany"))

Example trace in Langfuse

Streaming Completions

The following example demonstrates how to handle streaming responses from the Groq model using the @observe(as_type="generation") decorator. The process is similar to the completion example but includes handling streamed data in real-time.

from langfuse import observe, get_client
langfuse = get_client()
 
@observe(as_type="generation")
def stream_groq_chat_completion(**kwargs):
    kwargs_clone = kwargs.copy()
    messages = kwargs_clone.pop('messages', None)
    model = kwargs_clone.pop('model', None)
    temperature = kwargs_clone.pop('temperature', None)
    max_tokens = kwargs_clone.pop('max_tokens', None)
    top_p = kwargs_clone.pop('top_p', None)
 
    model_parameters = {
        "max_tokens": max_tokens,
        "temperature": temperature,
        "top_p": top_p
    }
    model_parameters = {k: v for k, v in model_parameters.items() if v is not None}
 
    langfuse.update_current_generation(
        input=messages,
        model=model,
        model_parameters=model_parameters,
        metadata=kwargs_clone,
    )
 
    stream = groq_client.chat.completions.create(stream=True, **kwargs)
    final_response = ""
    for chunk in stream:
        # delta.content can be None (e.g. on the final chunk); fall back to an empty string
        content = chunk.choices[0].delta.content or ""
        final_response += content
        yield content
 
    # Log approximate usage (word count as a simple proxy) and the streamed output
    langfuse.update_current_generation(
        usage_details={
            "total_tokens": len(final_response.split())
        },
        output=final_response
    )

Usage:

@observe()
def stream_find_best_five_painter_from(country="France"):
    response_chunks = stream_groq_chat_completion(
        model="llama3-70b-8192",
        max_tokens=1024,
        messages=[
            {
                "role": "user",
                "content": f"Who are the best five painters from {country}? Give me a list of names and their most famous painting."
            }
        ]
    )
    final_response = ""
    for chunk in response_chunks:
        final_response += str(chunk)
        print(chunk, end="")
 
    return final_response
 
stream_find_best_five_painter_from("Spain")

Example trace in Langfuse

Interoperability with the Python SDK

You can use this integration together with the Langfuse Python SDK to add additional attributes to the trace.

The @observe() decorator provides a convenient way to automatically wrap your instrumented code and add additional attributes to the trace.

from langfuse import observe, get_client
 
langfuse = get_client()
 
@observe()
def my_instrumented_function(input):
    # my_llm_call is a placeholder for your own LLM call (e.g. the groq_chat_completion function above)
    output = my_llm_call(input)
 
    langfuse.update_current_trace(
        input=input,
        output=output,
        user_id="user_123",
        session_id="session_abc",
        tags=["agent", "my-trace"],
        metadata={"email": "[email protected]"},
        version="1.0.0"
    )
 
    return output

Learn more about using the Decorator in the Python SDK docs.

Next Steps

Once you have instrumented your code, you can manage, evaluate, and debug your application in Langfuse.
