Overview
Caching is a technique used to temporarily store copies of data or computation results to improve performance by reducing the need to repeatedly fetch or compute the same data from slower or more resource-intensive sources. In the context of AI applications, caching provides several important benefits:

- 🚀 Performance improvement - Avoid repeating expensive operations like API calls or complex calculations
- 💰 Cost reduction - Minimize repeated calls to paid services (like external APIs or LLM providers)
- ⚡ Latency reduction - Deliver faster responses to users by serving cached results
- 🔄 Consistency - Ensure consistent responses for identical inputs
Core concepts
Cache types
BeeAI framework offers several cache implementations out of the box:

| Type | Description |
|---|---|
| UnconstrainedCache | Simple in-memory cache with no limits |
| SlidingCache | In-memory cache that maintains a maximum number of entries |
| FileCache | Persistent cache that stores data on disk |
| NullCache | Special implementation that performs no caching (useful for testing) |
All cache implementations share the BaseCache interface, making them interchangeable in your code.
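To make that interchangeability concrete, here is a stdlib-only sketch of a shared interface with one trivial in-memory implementation. The names mirror the table above, but the signatures are illustrative, not the exact BeeAI API (which is, for instance, asynchronous):

```python
from abc import ABC, abstractmethod
from typing import Any, Optional


class BaseCache(ABC):
    """Minimal shared cache interface (illustrative sketch)."""

    @abstractmethod
    def set(self, key: str, value: Any) -> None: ...

    @abstractmethod
    def get(self, key: str) -> Optional[Any]: ...

    @abstractmethod
    def has(self, key: str) -> bool: ...


class UnconstrainedCache(BaseCache):
    """Unbounded in-memory cache backed by a plain dict."""

    def __init__(self) -> None:
        self._store: dict[str, Any] = {}

    def set(self, key: str, value: Any) -> None:
        self._store[key] = value

    def get(self, key: str) -> Optional[Any]:
        return self._store.get(key)

    def has(self, key: str) -> bool:
        return key in self._store
```

Because every cache type honors the same interface, swapping this unconstrained cache for a bounded or file-backed variant is a one-line change at the construction site.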
Usage patterns
BeeAI framework supports several caching patterns:

| Usage pattern | Description |
|---|---|
| Direct caching | Manually store and retrieve values |
| Function decoration | Automatically cache function returns |
| Tool integration | Cache tool execution results |
| LLM integration | Cache model responses |
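As a taste of the first two patterns, here is a stdlib-only sketch of direct caching and of a decorator that caches function returns. The code is illustrative, not the BeeAI API:

```python
import functools
from typing import Any, Callable

# Direct caching: manually store and retrieve values by explicit key.
cache: dict[str, Any] = {}
cache["answer"] = 42          # store
value = cache.get("answer")   # retrieve

# Function decoration: cache returns keyed by the argument tuple.
def cached(fn: Callable) -> Callable:
    store: dict[tuple, Any] = {}

    @functools.wraps(fn)
    def wrapper(*args):
        if args not in store:
            store[args] = fn(*args)  # compute once per argument tuple
        return store[args]

    return wrapper


@cached
def slow_square(n: int) -> int:
    return n * n


slow_square(4)  # computed
slow_square(4)  # served from the cache
```

Tool and LLM integration follow the same idea: the cache key is derived from the tool input or the prompt, and the stored value is the tool result or model response.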
Basic usage
Caching function output
The simplest way to use caching is to wrap a function that produces deterministic output:

Using with tools
BeeAI framework’s caching system integrates seamlessly with tools:

Using with LLMs
You can also cache LLM responses to save on API costs:

Cache types
UnconstrainedCache
The simplest cache type, with no constraints on size or entry lifetime. Good for development and smaller applications.

SlidingCache
Maintains a maximum number of entries, removing the oldest entries when the limit is reached.

FileCache
Persists cache data to disk, allowing data to survive application restarts. Use it when caches must survive process restarts or you need to share state between workers. Persisted entries still respect TTL and eviction settings, so design your limits accordingly.

With custom provider
Seed a file-backed cache from another provider when you want to warm the disk cache before first use or promote hot data captured in memory. The example below clones an UnconstrainedCache into the JSON file cache so new processes can reuse it immediately.
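A minimal sketch of both ideas, persistence plus seeding from an in-memory provider, using a JSON file. This is a stdlib-only illustration; the real FileCache API differs:

```python
import json
import tempfile
from pathlib import Path
from typing import Any


class JsonFileCache:
    """Toy persistent cache: every write is flushed to a JSON file."""

    def __init__(self, path: Path) -> None:
        self._path = path
        self._store: dict[str, Any] = (
            json.loads(path.read_text()) if path.exists() else {}
        )

    def set(self, key: str, value: Any) -> None:
        self._store[key] = value
        self._path.write_text(json.dumps(self._store))

    def get(self, key: str) -> Any:
        return self._store.get(key)

    @classmethod
    def from_provider(cls, path: Path, provider: dict[str, Any]) -> "JsonFileCache":
        """Warm the file cache by cloning entries from another source."""
        cache = cls(path)
        for key, value in provider.items():
            cache.set(key, value)
        return cache


path = Path(tempfile.gettempdir()) / "demo_cache.json"
warm = JsonFileCache.from_provider(path, {"greeting": "hello"})

# A fresh instance pointed at the same file sees the seeded data.
reloaded = JsonFileCache(path)
reloaded.get("greeting")
```

Flushing on every write is the simplest durability strategy; a production cache would batch writes or use an append-only log instead.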
NullCache
A special cache that implements the BaseCache interface but performs no caching. Useful for testing or temporarily disabling caching.
It exists to enable the Null Object pattern: code can always call cache methods without first checking whether caching is enabled.
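A sketch of the idea: because the null object satisfies the cache interface, callers need no `if cache is not None` branches. The code below is illustrative, not the BeeAI implementation:

```python
from typing import Any, Optional


class NullCache:
    """Implements the cache interface but never stores anything."""

    def set(self, key: str, value: Any) -> None:
        pass  # intentionally a no-op

    def get(self, key: str) -> Optional[Any]:
        return None  # always a miss

    def has(self, key: str) -> bool:
        return False


def fetch(key: str, cache) -> str:
    """Same call site works whether caching is real or disabled."""
    if cache.has(key):
        return cache.get(key)
    result = f"computed:{key}"
    cache.set(key, result)
    return result


fetch("x", NullCache())  # always recomputes, never raises
```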
Advanced usage
Cache decorator
Create a reusable decorator when you want to keep caching logic close to your functions without wiring cache calls manually.

CacheFn helper
For more dynamic caching needs, the CacheFn helper provides a functional approach:
It is well-suited for API tokens or other resources that return an expiry with each refresh—call update_ttl before returning the value so the cache matches the upstream lifetime.
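The pattern can be sketched with a small helper that caches the result of a factory until a TTL expires, where the factory itself adjusts the TTL after each refresh. The names `CachedValue`, `update_ttl`, and `fetch_token` below are illustrative stand-ins, not the real CacheFn API:

```python
import time
from typing import Any, Callable


class CachedValue:
    """Caches a factory's result until its TTL expires."""

    def __init__(self, factory: Callable[["CachedValue"], Any], ttl: float) -> None:
        self._factory = factory
        self._ttl = ttl
        self._value: Any = None
        self._expires_at = 0.0

    def update_ttl(self, ttl: float) -> None:
        # Call from inside the factory, e.g. with the token's real lifetime.
        self._ttl = ttl

    def get(self) -> Any:
        now = time.monotonic()
        if now >= self._expires_at:
            self._value = self._factory(self)  # refresh on expiry
            self._expires_at = now + self._ttl
        return self._value


def fetch_token(cache: CachedValue) -> str:
    # Pretend the auth server reports a 60-second expiry with the token.
    cache.update_ttl(60.0)
    return "token-123"


token = CachedValue(fetch_token, ttl=0.0)
token.get()  # first call refreshes and adopts the upstream TTL
```

Because the factory sets the TTL before returning, the cached lifetime always tracks what the upstream service actually granted.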
Creating a custom cache provider
You can create your own cache implementation by extending the BaseCache class:
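As a sketch of the shape such a subclass takes, here is a toy TTL cache built on a minimal base class. The real BaseCache has more methods and is asynchronous; the names and signatures here are illustrative:

```python
import time
from abc import ABC, abstractmethod
from typing import Any, Optional


class BaseCache(ABC):
    """Minimal stand-in for the framework base class."""

    @abstractmethod
    def set(self, key: str, value: Any) -> None: ...

    @abstractmethod
    def get(self, key: str) -> Optional[Any]: ...


class TTLCache(BaseCache):
    """Custom provider: entries expire `ttl` seconds after being set."""

    def __init__(self, ttl: float) -> None:
        self._ttl = ttl
        self._store: dict[str, tuple[float, Any]] = {}

    def set(self, key: str, value: Any) -> None:
        self._store[key] = (time.monotonic() + self._ttl, value)

    def get(self, key: str) -> Optional[Any]:
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if time.monotonic() >= expires_at:
            del self._store[key]  # lazily evict expired entries
            return None
        return value


cache = TTLCache(ttl=30.0)
cache.set("k", "v")
cache.get("k")  # "v" while fresh, None once 30 s have passed
```

Lazy eviction on read keeps the implementation simple; a background sweep would be needed if expired entries must also be reclaimed when never read again.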
Examples
Python
Explore reference cache implementations in Python
TypeScript
Explore reference cache implementations in TypeScript