Multi-Modality

Langfuse supports rendering Multi-Modal traces, including both text and image formats. We follow OpenAI's format convention.

How to trace Multi-Modal content in Langfuse?

To utilize our Multi-Modal Trace support, your trace or observation input/output should include a list of messages comprising the conversation so far. Each message should contain a role (system, user, assistant) and content. To display multi-modal content, you can pass a combination of text and image URLs. The content property of the messages follows the OpenAI convention.
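
For example, with the Python SDK you can attach such a message list as the input and output of a trace. The snippet below is a minimal sketch assuming the low-level `langfuse.trace()` method of the Python SDK (v2) and a hypothetical trace name; adapt it to your SDK version or framework integration.

```python
from langfuse import Langfuse

# Credentials are read from LANGFUSE_PUBLIC_KEY, LANGFUSE_SECRET_KEY and LANGFUSE_HOST
langfuse = Langfuse()

# Conversation so far, following the OpenAI message format
messages = [
    {
        "role": "system",
        "content": "You are an AI trained to describe and interpret images.",
    },
    {
        "role": "user",
        "content": [
            {"type": "text", "text": "What's happening in this image?"},
            {"type": "image_url", "image_url": {"url": "https://example.com/image.jpg"}},
        ],
    },
]

# Log a trace whose input is the multi-modal message list;
# the output is the assistant response (hard-coded here for illustration)
langfuse.trace(
    name="describe-image",
    input=messages,
    output="A dog is catching a frisbee in a park.",
)

# Ensure all buffered events are sent before the script exits
langfuse.flush()
```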

We plan to extend support to base64 images, file attachments (e.g., PDFs), and audio soon. Please add your upvote and any thoughts to this thread.

Visual Representation in Langfuse

When the "Markdown" option is enabled in the Langfuse UI, you can click on the image icon to preview the image inline.

(Screenshot: Trace in the Langfuse UI)

Content Format

| Content | Type | Description |
| --- | --- | --- |
| Default: Text content | string | The text contents of the message. |
| Multi-Modal: Array of content parts | array | An array of content parts with a defined type; each can be of type `text` or `image_url`. You can pass multiple images by adding multiple `image_url` content parts. |

Content Examples

```json
{
  "content": [
    {
      "role": "system",
      "content": "You are an AI trained to describe and interpret images. Describe the main objects and actions in the image."
    },
    {
      "role": "user",
      "content": [
        {
          "type": "text",
          "text": "What's happening in this image?"
        },
        {
          "type": "image_url",
          "image_url": {
            "url": "https://example.com/image.jpg"
          }
        }
      ]
    }
  ]
}
```

Content Part Types

| Property | Type | Description |
| --- | --- | --- |
| type | string | Type of the content part, either `text` or `image_url` |
| text | string | Text content of the message (for `text` parts) |
| image_url | object | Object with a `url` property pointing to the image (for `image_url` parts) |
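
If you use the Langfuse OpenAI integration, multi-modal messages passed to the OpenAI SDK are captured on the trace automatically. The snippet below is a sketch assuming the Python drop-in import `langfuse.openai` and a vision-capable model (the model name is an assumption).

```python
# Drop-in replacement for the OpenAI SDK; calls made through it are traced in Langfuse
from langfuse.openai import openai

response = openai.chat.completions.create(
    model="gpt-4o",  # assumption: any vision-capable model works here
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What's happening in this image?"},
                {
                    "type": "image_url",
                    "image_url": {"url": "https://example.com/image.jpg"},
                },
            ],
        }
    ],
)
print(response.choices[0].message.content)
```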

For more details and examples, refer to our OpenAI cookbook.
