
Trace AI APIs through Kong API Gateway with Langfuse

This guide demonstrates how to integrate Langfuse into your Kong API Gateway to automatically monitor, debug, and evaluate AI API calls without modifying your application code.

What is Kong API Gateway? Kong Gateway is a cloud-native, platform-agnostic, scalable API Gateway that manages APIs and microservices. It acts as a central point of control for API traffic, providing features like authentication, rate limiting, and monitoring.

What is Langfuse? Langfuse is an open-source observability platform for AI agents. It helps you visualize and monitor LLM calls, tool usage, cost, latency, and more.

Features

  • Zero-code instrumentation: Automatic tracing for AI API calls proxied through Kong
  • Multi-provider support: OpenAI-compatible APIs, vLLM, and custom providers
  • Rich context capture: User sessions, conversations, and metadata
  • Performance metrics: Latency, throughput, and token-level analytics
  • Non-blocking architecture: Async operation with minimal overhead
  • Production-ready: Error resilience and graceful degradation

Supported AI Providers

Provider            Endpoints                                               Status
OpenAI-Compatible   /v1/chat/completions, /v1/completions, /v1/embeddings   ✅
vLLM                /generate, /v1/completions                              ✅
Custom Providers    Extensible detection framework                          ✅

1. Install the Kong Plugin

Below we install the Kong Langfuse Tracing plugin using one of the available methods.

Prerequisites

  • Kong Gateway 3.0+ installed and running
  • Langfuse account (sign up)
  • Access to Kong’s Admin API
Option 1: LuaRocks

luarocks install kong-plugin-ai-tracing

Option 2: From Source

git clone https://github.com/Ramtinboreili/kong-langfuse-tracing.git
cd kong-langfuse-tracing
luarocks make rockspec/kong-plugin-ai-tracing-1.0.0-1.rockspec

Option 3: Docker Compose

version: '3.8'
services:
  kong:
    image: kong:3.4
    environment:
      KONG_PLUGINS: bundled,ai-tracing
      KONG_LUA_PACKAGE_PATH: /usr/local/kong/plugins/?.lua;;
      KONG_DATABASE: postgres
      KONG_PG_HOST: postgres
      KONG_PG_USER: kong
      KONG_PG_PASSWORD: kong
    volumes:
      - ./plugins/ai-tracing:/usr/local/kong/plugins/ai-tracing
    ports:
      - "8000:8000"
      - "8001:8001"

Enable the Plugin in Kong

Add the plugin to your Kong configuration:

# In kong.conf
plugins = bundled,ai-tracing
 
# Or via environment variable
export KONG_PLUGINS=bundled,ai-tracing

Restart Kong Gateway to load the plugin.

2. Configure Langfuse Credentials

Next, set up your Langfuse API keys. You can get these keys by signing up for a free Langfuse Cloud account or by self-hosting Langfuse.

# Get keys for your project from the project settings page
export LANGFUSE_PUBLIC_KEY="pk-lf-..."
export LANGFUSE_SECRET_KEY="sk-lf-..."
export LANGFUSE_BASE_URL="https://cloud.langfuse.com" # πŸ‡ͺπŸ‡Ί EU region
# export LANGFUSE_BASE_URL="https://us.cloud.langfuse.com" # πŸ‡ΊπŸ‡Έ US region

3. Enable Plugin on Kong Service

Configure the plugin for your AI service using Kong’s Admin API:

curl -X POST http://localhost:8001/services/YOUR_AI_SERVICE/plugins \
  -H "Content-Type: application/json" \
  -d '{
    "name": "ai-tracing",
    "config": {
      "langfuse_enabled": true,
      "langfuse_public_key": "pk-lf-...",
      "langfuse_secret_key": "sk-lf-...",
      "langfuse_endpoint": "https://cloud.langfuse.com/api/public/ingestion",
      "environment": "production"
    }
  }'
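If you prefer scripting over raw curl, the same registration can be done from Python. A minimal sketch using only the standard library; the service name and Admin API address are placeholders for your own deployment:

```python
import json
import urllib.request


def build_ai_tracing_config(public_key: str, secret_key: str,
                            environment: str = "production") -> dict:
    """Build the plugin payload in the shape Kong's Admin API expects."""
    return {
        "name": "ai-tracing",
        "config": {
            "langfuse_enabled": True,
            "langfuse_public_key": public_key,
            "langfuse_secret_key": secret_key,
            "langfuse_endpoint": "https://cloud.langfuse.com/api/public/ingestion",
            "environment": environment,
        },
    }


def enable_plugin(service: str, payload: dict,
                  admin_url: str = "http://localhost:8001") -> dict:
    """POST the payload to the service's plugin collection on the Admin API."""
    req = urllib.request.Request(
        f"{admin_url}/services/{service}/plugins",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

The helper names are ours, not part of the plugin; adapt the payload to whichever optional parameters (timeout, log level) you need.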

Configuration Parameters

Parameter             Type     Default                                           Required  Description
langfuse_enabled      boolean  false                                             Yes       Enable/disable Langfuse integration
langfuse_public_key   string   -                                                 Yes       Your Langfuse public API key
langfuse_secret_key   string   -                                                 Yes       Your Langfuse secret API key
langfuse_endpoint     string   https://cloud.langfuse.com/api/public/ingestion   No        Langfuse API endpoint
langfuse_timeout      number   5000                                              No        HTTP timeout in milliseconds
environment           string   production                                        No        Environment tag for filtering traces
log_level             string   info                                              No        Logging verbosity (debug, info, warn, error)

For self-hosted Langfuse instances, update langfuse_endpoint to your instance URL followed by /api/public/ingestion.
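A tiny helper avoids the easy mistake of doubling or dropping the slash when building that URL. A sketch (the function name is ours, and the self-hosted host below is a made-up example):

```python
def ingestion_endpoint(base_url: str) -> str:
    """Append the ingestion path, tolerating a trailing slash on the base URL."""
    return base_url.rstrip("/") + "/api/public/ingestion"
```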

4. Hello World Example

Below we make a simple AI request through Kong Gateway. The plugin automatically captures the request and creates a trace in Langfuse.

curl -X POST http://kong-gateway:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "X-User-Id: user-12345" \
  -d '{
    "model": "gpt-4",
    "messages": [
      {"role": "user", "content": "Explain quantum computing"}
    ],
    "temperature": 0.7,
    "max_tokens": 500
  }'

Example trace in Langfuse

Clicking the link above (or your own project link) lets you view all observations, token usage, latencies, etc., for debugging or optimization.

5. Adding Context with Headers

Enrich your traces with user and session context using HTTP headers. This allows you to filter and analyze traces by user, session, or conversation.

curl -X POST http://kong-gateway:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "X-User-Id: user-12345" \
  -H "X-Session-Id: session-abc" \
  -H "X-Chat-Id: chat-789" \
  -H "X-Organization-Id: org-acme" \
  -d '{
    "model": "gpt-4",
    "messages": [
      {"role": "user", "content": "What is machine learning?"}
    ]
  }'

Supported Context Headers

Header              Description              Example
X-User-Id           Unique user identifier   user-12345
X-Session-Id        Session identifier       session-abc
X-Chat-Id           Conversation/chat ID     chat-789
X-Message-Id        Individual message ID    msg-54321
X-Organization-Id   Organization context     org-acme
X-Project-Id        Project identifier       project-xyz
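A small helper keeps these header names consistent across call sites and skips any identifier you don't have. A sketch (the function name is ours):

```python
def build_context_headers(user_id, session_id=None, chat_id=None,
                          message_id=None, organization_id=None,
                          project_id=None):
    """Map identifiers to the header names the plugin recognizes,
    omitting any that are not provided."""
    candidates = {
        "X-User-Id": user_id,
        "X-Session-Id": session_id,
        "X-Chat-Id": chat_id,
        "X-Message-Id": message_id,
        "X-Organization-Id": organization_id,
        "X-Project-Id": project_id,
    }
    return {name: value for name, value in candidates.items() if value}
```

Pass the result as the `headers` argument of your HTTP client alongside `Content-Type`.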

Example trace with context

6. Adding Metadata

Include additional metadata in your request body for richer traces. This is useful for tracking features, experiments, or user-specific variables.

curl -X POST http://kong-gateway:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "X-User-Id: user-12345" \
  -d '{
    "model": "gpt-4",
    "messages": [
      {"role": "user", "content": "Recommend a movie"}
    ],
    "metadata": {
      "user_id": "user-12345",
      "chat_id": "chat-789",
      "project_id": "project-xyz",
      "features": {
        "web_search": true,
        "image_generation": false
      },
      "variables": {
        "user_tier": "premium",
        "language": "en"
      }
    }
  }'

Python / FastAPI Application

import httpx
from fastapi import FastAPI
 
KONG_URL = "http://kong-gateway:8000"
 
async def chat_with_ai(user_id: str, session_id: str, message: str):
    async with httpx.AsyncClient() as client:
        response = await client.post(
            f"{KONG_URL}/v1/chat/completions",
            json={
                "model": "gpt-4",
                "messages": [{"role": "user", "content": message}],
                "metadata": {
                    "user_id": user_id,
                    "session_id": session_id
                }
            },
            headers={
                "Content-Type": "application/json",
                "X-User-Id": user_id,
                "X-Session-Id": session_id
            }
        )
        return response.json()
 
# Usage (`await` is only valid inside an async function; in a plain
# script, drive the coroutine with asyncio.run)
import asyncio
 
result = asyncio.run(chat_with_ai("user-123", "session-abc", "Hello!"))
print(result)

Captured Data

The plugin creates comprehensive traces in Langfuse with the following structure:

{
  "trace": {
    "id": "trace-12345",
    "name": "/v1/chat/completions",
    "userId": "user-12345",
    "sessionId": "session-abc",
    "metadata": {
      "provider": "openai_compatible",
      "model": "gpt-4",
      "status_code": 200,
      "environment": "production",
      "total_duration_ms": 1250,
      "time_per_token_ms": 12.5,
      "throughput_tokens_per_second": 80.0
    }
  },
  "observations": [
    {
      "type": "generation",
      "name": "chat_completion",
      "usage": {
        "promptTokens": 150,
        "completionTokens": 25,
        "totalTokens": 175
      },
      "metadata": {
        "temperature": 0.7,
        "max_tokens": 500,
        "finish_reason": "stop"
      }
    }
  ]
}
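If you post-process exported traces, token usage can be read back out of this structure. A sketch against the shape shown above (the helper name is ours):

```python
def total_tokens(trace_payload: dict) -> int:
    """Sum totalTokens across all generation observations in a trace payload."""
    return sum(
        obs.get("usage", {}).get("totalTokens", 0)
        for obs in trace_payload.get("observations", [])
        if obs.get("type") == "generation"
    )
```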

Performance Metrics

  • Total Duration: End-to-end request processing time
  • Time per Token: Average latency per generated token
  • Throughput: Tokens processed per second
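These metrics are straightforward arithmetic over the trace fields. A sketch with illustrative numbers (the helper names are ours; we assume the per-token figures are computed over generated tokens):

```python
def time_per_token_ms(total_duration_ms: float, generated_tokens: int) -> float:
    """Average latency per generated token, in milliseconds."""
    return total_duration_ms / generated_tokens


def throughput_tokens_per_second(total_duration_ms: float,
                                 generated_tokens: int) -> float:
    """Tokens processed per second over the whole request."""
    return generated_tokens / (total_duration_ms / 1000.0)
```

For example, a 1250 ms request that generated 100 tokens works out to 12.5 ms per token and 80 tokens per second.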

Token Analytics

  • Prompt Tokens: Input token count
  • Completion Tokens: Output token count
  • Total Tokens: Combined usage
  • Cost Tracking: Monitor spending across requests
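Cost can be sanity-checked directly from the token counts. A sketch with hypothetical per-1K-token rates; substitute your provider's actual pricing:

```python
def estimate_cost_usd(prompt_tokens: int, completion_tokens: int,
                      prompt_rate_per_1k: float = 0.03,
                      completion_rate_per_1k: float = 0.06) -> float:
    """Estimate request cost from token counts.

    The default rates are illustrative placeholders, not real pricing.
    """
    return (prompt_tokens / 1000 * prompt_rate_per_1k
            + completion_tokens / 1000 * completion_rate_per_1k)
```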

Environment-Specific Configuration

You can configure different Langfuse projects for development, staging, and production environments.

Development Environment

curl -X POST http://localhost:8001/services/ai-service-dev/plugins \
  --data "name=ai-tracing" \
  --data "config.langfuse_enabled=true" \
  --data "config.langfuse_public_key=pk-lf-dev-xxx" \
  --data "config.langfuse_secret_key=sk-lf-dev-xxx" \
  --data "config.environment=development" \
  --data "config.log_level=debug"

Production Environment

curl -X POST http://localhost:8001/services/ai-service-prod/plugins \
  --data "name=ai-tracing" \
  --data "config.langfuse_enabled=true" \
  --data "config.langfuse_public_key=pk-lf-prod-xxx" \
  --data "config.langfuse_secret_key=sk-lf-prod-xxx" \
  --data "config.environment=production" \
  --data "config.log_level=warn"

Troubleshooting

Data Not Appearing in Langfuse

  1. Verify Credentials: Ensure your Langfuse API keys are correct
  2. Check Connectivity: Confirm Kong can reach the Langfuse endpoint
  3. Review Logs: Check Kong logs for errors

# View plugin status
curl http://localhost:8001/services/YOUR_SERVICE/plugins | \
  jq '.data[] | select(.name=="ai-tracing")'
 
# Check Kong logs
docker-compose logs kong | grep "ai-tracing"
 
# Test Langfuse endpoint
curl -I https://cloud.langfuse.com/api/public/ingestion

Missing User Context

  • Ensure HTTP headers are properly set in requests
  • Verify header names match expected format (X-User-Id, not X-UserId)
  • Check that headers are forwarded through any proxies

Performance Issues

  • Monitor Kong metrics during high load
  • Adjust langfuse_timeout if experiencing timeouts
  • Check async timer performance in Kong logs

Enable Debug Logging

curl -X PATCH http://localhost:8001/plugins/PLUGIN_ID \
  --data "config.log_level=debug"

Advanced Usage

Custom AI Providers

Extend the plugin to support additional AI providers by modifying the detection logic in handler.lua:

local function detect_ai_provider(path, headers)
  if path:find("/v1/chat/completions") then
    return "openai_compatible"
  elseif path:find("/anthropic") then
    return "anthropic"
  elseif path:find("/cohere") then
    return "cohere"
  else
    return "custom_provider"
  end
end

Integration with Other Kong Plugins

The AI tracing plugin works alongside other Kong plugins:

  • Rate Limiting: Combine with rate limiting for cost control
  • Authentication: Use with key-auth or JWT for user identification
  • Request Transformer: Modify requests before tracing
⚠️ Security Best Practices:

  • Store Langfuse API keys securely (use Kong’s vault integration or environment variables)
  • Review exported data for PII compliance
  • Restrict access to plugin configuration in production
  • Ensure Kong-Langfuse communication uses HTTPS
  • Consider data retention policies for sensitive content
