Build an LLM Chat UI with 🤗 Gradio and trace it with 🪢 Langfuse
This is a simple end-to-end example notebook which showcases how to integrate a Gradio application with Langfuse for LLM Observability and Evaluation.
We recommend running this notebook in Google Colab (see link above).
Thank you to @tkmamidi for the original implementation and contributions to this notebook.
Introduction
What is Gradio?
Gradio is an open-source Python library that enables quick creation of web interfaces for machine learning models, APIs, and Python functions. It allows developers to wrap any Python function with an interactive UI that can be easily shared or embedded, making it ideal for demos, prototypes, and ML model deployment. See docs for more details.
What is Langfuse?
Langfuse is an open-source LLM engineering platform that helps build reliable LLM applications via LLM Application Observability, Evaluation, Experiments, and Prompt Management. See docs for more details.
Walkthrough
We’ve recorded a walkthrough of the implementation below. You can follow along with the video or the notebook.
Outline
This notebook will show you how to
- Build a simple chat interface in Python and render it in a notebook using Gradio Chatbot
- Add Langfuse Tracing to the chatbot
- Implement additional Langfuse tracing features used frequently in chat applications: chat sessions, user feedback
Setup
Install requirements. We use OpenAI for this simple example. We could use any model here.
# pinning httpx as the latest version is not compatible with the OpenAI SDK at the time of creating this notebook
!pip install gradio langfuse openai httpx==0.27.2
Set credentials and initialize the Langfuse SDK client, which is used later on to add user feedback.
You can either create a free Langfuse Cloud account or self-host Langfuse in a couple of minutes.
import os
# Get keys for your project from the project settings page
# https://cloud.langfuse.com
os.environ["LANGFUSE_PUBLIC_KEY"] = ""
os.environ["LANGFUSE_SECRET_KEY"] = ""
os.environ["LANGFUSE_HOST"] = "https://cloud.langfuse.com" # 🇪🇺 EU region
# os.environ["LANGFUSE_HOST"] = "https://us.cloud.langfuse.com" # 🇺🇸 US region
# Your openai key
# We use OpenAI for this demo, could easily change to other models
os.environ["OPENAI_API_KEY"] = ""
import gradio as gr
import json
import uuid
from langfuse import Langfuse
langfuse = Langfuse()
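The credentials above are left empty on purpose, so it is easy to launch the app with a key missing. As a minimal sketch, a check like the following could run before the chatbot starts. The helper name `missing_env_vars` is hypothetical and not part of Langfuse or OpenAI; only the variable names come from the setup cell.

```python
import os

# Variable names taken from the setup cell above
REQUIRED_VARS = ("LANGFUSE_PUBLIC_KEY", "LANGFUSE_SECRET_KEY", "OPENAI_API_KEY")


def missing_env_vars(required=REQUIRED_VARS, environ=None):
    """Return the names of required variables that are unset or empty.

    Hypothetical helper for illustration; `environ` defaults to os.environ.
    """
    env = os.environ if environ is None else environ
    return [name for name in required if not env.get(name)]


missing = missing_env_vars()
if missing:
    print("Set these variables before launching the app:", ", ".join(missing))
```

An empty string counts as missing here, which matches the placeholder values in the cell above.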
Implementation of Chat functions
Sessions/Threads
Each chat message belongs to a thread in the Gradio Chatbot, which can be reset using clear (reference).
We keep a global session_id that is created, and later reset, by the set_new_session_id method below. This session_id will be used for Langfuse Sessions.
session_id = None
def set_new_session_id():
    global session_id
    session_id = str(uuid.uuid4())
# Initialize
set_new_session_id()
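A module-level global is shared by everyone connected to the same running app, which is fine for a single-user notebook demo. As a rough sketch of an alternative, the same reset logic can live in a small object that could be kept per user (for example in Gradio's `gr.State`). The `SessionState` class is hypothetical and not part of Gradio or Langfuse.

```python
import uuid


class SessionState:
    """Hypothetical holder for the current chat session id."""

    def __init__(self):
        self.session_id = None
        self.reset()

    def reset(self):
        # Start a new thread: later traces would be grouped under the new id
        self.session_id = str(uuid.uuid4())
        return self.session_id


state = SessionState()
first = state.session_id
second = state.reset()  # e.g. called when the chat is cleared
print(first != second)
```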
Response handler
When implementing the respond method, we use the Langfuse @observe() decorator to automatically log each response to Langfuse Tracing.
In addition, we use the OpenAI integration, as it simplifies instrumenting the LLM call to capture model parameters, token counts, and other metadata. Alternatively, we could use the integrations with LangChain, LlamaIndex, or other frameworks, or instrument the call itself with the decorator (example).
# Langfuse decorator
from langfuse.decorators import observe, langfuse_context
# Optional: automated instrumentation via OpenAI SDK integration
# See note above regarding alternative implementations
from langfuse.openai import openai
# Global reference for the current trace_id which is used to later add user feedback
current_trace_id = None
# Add decorator here to capture overall timings, input/output, and manipulate trace metadata via `langfuse_context`
@observe()
async def create_response(
    prompt: str,
    history,
):
    # Save trace id in global var to add feedback later
    global current_trace_id
    current_trace_id = langfuse_context.get_current_trace_id()
    # Add session_id to Langfuse Trace to enable session tracking
    global session_id
    langfuse_context.update_current_trace(
        name="gradio_demo_chat",
        session_id=session_id,
        input=prompt,
    )
    # Add prompt to history
    if not history:
        history = [{"role": "system", "content": "You are a friendly chatbot"}]
    history.append({"role": "user", "content": prompt})
    yield history
    # Get completion via OpenAI SDK
    # Auto-instrumented by Langfuse via the import, see alternative in note above
    response = {"role": "assistant", "content": ""}
    oai_response = openai.chat.completions.create(
        messages=history,
        model="gpt-4o-mini",
    )
    response["content"] = oai_response.choices[0].message.content or ""
    # Customize trace output for better readability in Langfuse Sessions
    langfuse_context.update_current_trace(
        output=response["content"],
    )
    yield history + [response]
async def respond(prompt: str, history):
    async for message in create_response(prompt, history):
        yield message
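The handler above yields twice: once with the user message appended, and once more with the assistant reply. The following offline sketch isolates that two-step pattern so it can be run without Langfuse or an OpenAI key. `fake_completion`, `create_response_sketch`, and `collect` are hypothetical stand-ins, not part of any library.

```python
import asyncio


def fake_completion(messages):
    # Stand-in for the OpenAI call so the sketch runs offline (hypothetical helper)
    return f"echo: {messages[-1]['content']}"


async def create_response_sketch(prompt, history):
    # Same two-step pattern as the handler above, without tracing or network calls
    if not history:
        history = [{"role": "system", "content": "You are a friendly chatbot"}]
    history.append({"role": "user", "content": prompt})
    yield history  # first update: the user message appears immediately

    reply = {"role": "assistant", "content": fake_completion(history)}
    yield history + [reply]  # second update: the assistant reply is appended


async def collect(prompt, history):
    # Copy each update so later changes to the list do not alter earlier snapshots
    return [list(update) async for update in create_response_sketch(prompt, history)]


updates = asyncio.run(collect("Hi", []))
for update in updates:
    print([message["role"] for message in update])
```

When run as a script, `asyncio.run` drives the generator. Inside a notebook, where an event loop is already running, `await collect("Hi", [])` would be used instead.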
User feedback handler
We implement user feedback tracking in Langfuse via the like event for the Gradio chatbot (reference). This method reuses the current trace id available in the global state of this application.
def handle_like(data: gr.LikeData):
    global current_trace_id
    if data.liked:
        langfuse.score(value=1, name="user-feedback", trace_id=current_trace_id)
    else:
        langfuse.score(value=0, name="user-feedback", trace_id=current_trace_id)
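Because the handler reads a single global trace id, feedback is always attached to the most recent trace, even if the user rates an earlier message. One possible refinement, sketched here under assumptions, is to remember a trace id per message position and look it up when a like arrives. `LikeEvent`, `remember_trace`, and `score_kwargs` are hypothetical; the sketch assumes the like event exposes an integer `index` and a boolean `liked`, and it only builds the arguments for `langfuse.score` rather than calling it.

```python
from dataclasses import dataclass


@dataclass
class LikeEvent:
    """Stand-in for gr.LikeData with only the fields used here (assumption)."""
    index: int
    liked: bool


# Hypothetical registry: position of an assistant message in the history -> trace id
trace_ids_by_index = {}


def remember_trace(message_index, trace_id):
    # Would be called once per response, after the trace id is known
    trace_ids_by_index[message_index] = trace_id


def score_kwargs(event):
    """Build the keyword arguments that would be passed to langfuse.score."""
    trace_id = trace_ids_by_index.get(event.index)
    if trace_id is None:
        return None  # unknown message, nothing to score
    return {
        "value": 1 if event.liked else 0,
        "name": "user-feedback",
        "trace_id": trace_id,
    }


remember_trace(2, "trace-a")
remember_trace(4, "trace-b")
print(score_kwargs(LikeEvent(index=2, liked=True)))
```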
Retries
Allow the user to retry a completion via the Gradio Chatbot retry event (docs). This is not specific to the integration with Langfuse.
async def handle_retry(history, retry_data: gr.RetryData):
    new_history = history[: retry_data.index]
    previous_prompt = history[retry_data.index]["content"]
    async for message in respond(previous_prompt, new_history):
        yield message
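The slicing in the retry handler can be checked on a small sample history. This sketch assumes, as the handler's use of `history[retry_data.index]["content"]` implies, that the retry event reports the index of the user message whose answer should be regenerated. `split_for_retry` is a hypothetical helper for illustration.

```python
def split_for_retry(history, index):
    """Return (history before the retried prompt, the prompt text)."""
    return history[:index], history[index]["content"]


history = [
    {"role": "system", "content": "You are a friendly chatbot"},
    {"role": "user", "content": "Tell me a joke"},
    {"role": "assistant", "content": "Why did the chicken cross the road?"},
]

# Assumes the retry event points at the user message (index 1 here)
new_history, previous_prompt = split_for_retry(history, 1)
print(previous_prompt)
print(len(new_history))
```

The old assistant reply and the prompt itself are dropped from the history, and the prompt is then sent again through the normal response handler.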
Run Gradio Chatbot
After implementing all methods above, we can now put together the Gradio Chatbot and launch it. If run within Colab, you should see an embedded Chatbot interface.
with gr.Blocks() as demo:
    gr.Markdown("# Chatbot using 🤗 Gradio + 🪢 Langfuse")
    chatbot = gr.Chatbot(
        label="Chat",
        type="messages",
        show_copy_button=True,
        avatar_images=(
            None,
            "https://static.langfuse.com/cookbooks/gradio/hf-logo.png",
        ),
    )
    prompt = gr.Textbox(max_lines=1, label="Chat Message")
    prompt.submit(respond, [prompt, chatbot], [chatbot])
    chatbot.retry(handle_retry, chatbot, [chatbot])
    chatbot.like(handle_like, None, None)
    chatbot.clear(set_new_session_id)
if __name__ == "__main__":
    demo.launch(share=True, debug=True)
Explore data in Langfuse
When interacting with the Chatbot, you should see traces, sessions, and feedback scores in your Langfuse project. See video above for a walkthrough.
Example trace, session, and user feedback in Langfuse (public link):
If you have any questions or feedback, please join the Langfuse Discord or create a new thread on GitHub Discussions.