SDK-level prompt caching
The latest release of the Langfuse Python and JS/TS SDKs includes a new prompt caching feature that improves the reliability and performance of your applications.
After launching the highly anticipated prompt management feature in Langfuse, we’ve improved its performance and reliability in the Python and JS/TS SDKs through prompt caching. The SDKs store fetched prompts in memory, cutting down on server requests and making your application faster and more resilient. The cache uses a default TTL of 60 seconds, which can be adjusted per prompt in the SDK. Should a refetch of a prompt fail, the SDK automatically falls back to the cached version, so your application’s availability is not compromised.
How it works
# Get current production prompt version and cache for 5 minutes
prompt = langfuse.get_prompt("prompt name", cache_ttl_seconds=300)
# Get a specific prompt version and cache for 5 minutes
prompt = langfuse.get_prompt("prompt name", version=3, cache_ttl_seconds=300)
# Disable caching for a prompt
prompt = langfuse.get_prompt("prompt name", cache_ttl_seconds=0)
Upgrade path
To benefit from prompt caching, upgrade to the latest version of the Python and JS/TS SDKs. Caching is enabled by default with a 60-second TTL, so you don’t need to change any code to start using it.
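Both SDKs are published as langfuse on PyPI and npm, so upgrading is a single command:

# Python SDK
pip install --upgrade langfuse
# JS/TS SDK
npm install langfuse@latest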
More details
Check out the full documentation for more details on how to use this feature.