Datasets

A dataset is a collection of inputs and expected outputs that is used to test your application. Both UI-based and SDK-based experiments support Langfuse Datasets.

Langfuse Dataset View

Why use datasets?

Create test cases for your application with real production traces
Collaboratively create and collect dataset items with your team
Have a single source of truth for your test data

Get Started

Creating a dataset

Each dataset has a name that is unique within a project.

langfuse.create_dataset(
    name="<dataset_name>",
    # optional description
    description="My first dataset",
    # optional metadata
    metadata={
        "author": "Alice",
        "date": "2022-01-01",
        "type": "benchmark"
    }
)

See Python SDK docs for details on how to initialize the Python client.

import { LangfuseClient } from "@langfuse/client";

const langfuse = new LangfuseClient();

await langfuse.api.datasets.create({
  name: "<dataset_name>",
  // optional description
  description: "My first dataset",
  // optional metadata
  metadata: {
    author: "Alice",
    date: "2022-01-01",
    type: "benchmark",
  },
});

Navigate to Your Project > Datasets
Click on + New dataset to create a new dataset.

Create dataset

Upload or create new dataset items

Dataset items can be added to a dataset by providing the input and optionally the expected output. If preferred, dataset items can be imported using the CSV uploader in the Langfuse UI.

langfuse.create_dataset_item(
    dataset_name="<dataset_name>",
    # any python object or value, optional
    input={
        "text": "hello world"
    },
    # any python object or value, optional
    expected_output={
        "text": "hello world"
    },
    # metadata, optional
    metadata={
        "model": "llama3",
    }
)

See Python SDK docs for details on how to initialize the Python client.

import { LangfuseClient } from "@langfuse/client";
 
const langfuse = new LangfuseClient();
 
await langfuse.api.datasetItems.create({
  datasetName: "<dataset_name>",
  // any JS object or value
  input: {
    text: "hello world",
  },
  // any JS object or value, optional
  expectedOutput: {
    text: "hello world",
  },
  // metadata, optional
  metadata: {
    model: "llama3",
  },
});

See JS/TS SDK docs for details on how to initialize the JS/TS client.
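When many items need to be added at once from code rather than through the CSV uploader, the single-item call above can be wrapped in a loop. The following is a minimal sketch, not an official bulk API: `rows_to_items` and `add_items` are hypothetical helpers, the CSV column names `input` and `expected_output` are assumptions, and `client` is assumed to be any object exposing `create_dataset_item` as in the Python snippet above.

```python
import csv
import io
from typing import Any, Iterable


def rows_to_items(rows: Iterable[dict[str, str]]) -> list[dict[str, Any]]:
    """Map rows with 'input' and optional 'expected_output' columns to item kwargs."""
    items: list[dict[str, Any]] = []
    for row in rows:
        item: dict[str, Any] = {"input": {"text": row["input"]}}
        expected = row.get("expected_output")
        if expected:  # the expected output is optional, so skip empty cells
            item["expected_output"] = {"text": expected}
        items.append(item)
    return items


def add_items(client: Any, dataset_name: str, items: Iterable[dict[str, Any]]) -> int:
    """Call client.create_dataset_item once per item and return how many were sent."""
    count = 0
    for item in items:
        client.create_dataset_item(dataset_name=dataset_name, **item)
        count += 1
    return count


# Example input: an in-memory CSV with the assumed column names.
SAMPLE_CSV = "input,expected_output\nhello world,hello world\nhow are you?,\n"
sample_items = rows_to_items(csv.DictReader(io.StringIO(SAMPLE_CSV)))
```

With an initialized client, `add_items(langfuse, "<dataset_name>", sample_items)` would issue one `create_dataset_item` call per row.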

Create synthetic datasets

To bootstrap a dataset, you often want to create synthetic examples to test your application. LLMs are good at generating these when prompted for common questions or tasks.

To get started, have a look at this cookbook for examples of how to generate synthetic datasets:

Notebook: Synthetic Datasets
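As a rough illustration of the idea (the cookbook remains the reference), generated questions can be normalized and de-duplicated before they are stored as dataset items. This sketch assumes a hypothetical `generate` callable that wraps whatever LLM call you use and returns a list of question strings for a topic; `build_synthetic_items` and the `synthetic` / `topic` metadata keys are likewise illustrative names, not part of the Langfuse SDK.

```python
from typing import Any, Callable


def build_synthetic_items(
    topics: list[str],
    generate: Callable[[str], list[str]],
) -> list[dict[str, Any]]:
    """Turn generated questions into dataset item kwargs, dropping duplicates."""
    seen: set[str] = set()
    items: list[dict[str, Any]] = []
    for topic in topics:
        for question in generate(topic):
            text = question.strip()
            key = text.lower()
            if not text or key in seen:
                continue  # skip blank lines and case-insensitive repeats
            seen.add(key)
            items.append(
                {
                    "input": {"text": text},
                    # Mark the item so synthetic data can be told apart later.
                    "metadata": {"synthetic": True, "topic": topic},
                }
            )
    return items
```

Each returned dictionary can then be passed to `langfuse.create_dataset_item(dataset_name="<dataset_name>", **item)`. Expected outputs are left unset here so that a reviewer can add them afterwards.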

Create items from production data

A common workflow is to select production traces where the application did not perform as expected. An expert then adds the expected output, so that new versions of your application can be tested on the same data.

langfuse.create_dataset_item(
    dataset_name="<dataset_name>",
    input={ "text": "hello world" },
    expected_output={ "text": "hello world" },
    # link to a trace
    source_trace_id="<trace_id>",
    # optional: link to a specific span, event, or generation
    source_observation_id="<observation_id>"
)

import { LangfuseClient } from "@langfuse/client";
 
const langfuse = new LangfuseClient();
 
await langfuse.api.datasetItems.create({
  datasetName: "<dataset_name>",
  input: { text: "hello world" },
  expectedOutput: { text: "hello world" },
  // link to a trace
  sourceTraceId: "<trace_id>",
  // optional: link to a specific span, event, or generation
  sourceObservationId: "<observation_id>",
});

In the UI, use + Add to dataset on any observation (span, event, generation) of a production trace.
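The selection step of this workflow can be sketched in code as well. The example below assumes the traces have already been fetched and are available as plain dictionaries with hypothetical `id`, `input`, and `score` keys; this shape and the helper `items_from_low_scoring_traces` are assumptions for illustration, not the SDK's trace objects. Only `source_trace_id` comes from the snippet above.

```python
from typing import Any


def items_from_low_scoring_traces(
    traces: list[dict[str, Any]],
    threshold: float = 0.5,
) -> list[dict[str, Any]]:
    """Build dataset item kwargs for traces whose score is below the threshold."""
    items: list[dict[str, Any]] = []
    for trace in traces:
        score = trace.get("score")
        if score is None or score >= threshold:
            continue  # keep only traces that were scored and performed poorly
        items.append(
            {
                "input": trace["input"],
                # Link the item back to the trace it was derived from.
                "source_trace_id": trace["id"],
            }
        )
    return items
```

The expected output is deliberately omitted so that an expert can add it later. Each result can be passed to `langfuse.create_dataset_item(dataset_name="<dataset_name>", **item)`.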

Edit/archive dataset items

You can edit or archive dataset items. Archiving items will remove them from future experiment runs.

You can upsert items by providing the id of the item you want to update.

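langfuse.create_dataset_item(
    # name of the dataset the item belongs to
    dataset_name="<dataset_name>",
    # id of the existing item to update
    id="<item_id>",
    # example: update status to "ARCHIVED"
    status="ARCHIVED"
)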

import { LangfuseClient } from "@langfuse/client";
 
const langfuse = new LangfuseClient();
 
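await langfuse.api.datasetItems.create({
  // name of the dataset the item belongs to
  datasetName: "<dataset_name>",
  // id of the existing item to update
  id: "<item_id>",
  // example: update status to "ARCHIVED"
  status: "ARCHIVED",
});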

In the UI, you can edit the item by clicking on the item id. To archive or delete the item, click on the dots next to the item and select Archive or Delete.
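To archive several items in one go, the upsert call can be repeated per item id. This is a minimal sketch under stated assumptions: `archive_items` is a hypothetical helper, `client` is any object exposing `create_dataset_item`, and passing `dataset_name` alongside `id` is an assumption about what the upsert requires. How fields omitted from the upsert are treated is not covered here.

```python
from typing import Any, Iterable


def archive_items(client: Any, dataset_name: str, item_ids: Iterable[str]) -> list[str]:
    """Upsert each item with status ARCHIVED and return the ids that were processed."""
    archived: list[str] = []
    for item_id in item_ids:
        client.create_dataset_item(
            dataset_name=dataset_name,
            id=item_id,
            status="ARCHIVED",  # archived items are excluded from future runs
        )
        archived.append(item_id)
    return archived
```

With an initialized client, a call such as `archive_items(langfuse, "<dataset_name>", ["<item_id>"])` would issue one upsert per id.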

Dataset runs

Once you have created a dataset, you can use it to test and evaluate your application.

Experiments via SDK
Experiments via UI

Learn more about the Experiments data model.
