Tool-Level Caching
Complete guide to optimizing tool performance with intelligent caching, including TTL expiration, LRU eviction, context-aware keys, and comprehensive metrics.
Table of Contents
- Overview
- Quick Start
- Core Concepts
- Configuration
- Cache Key Generation
- Usage Patterns
- Advanced Techniques
- Performance
- Monitoring & Debugging
- Best Practices
- Testing
- Troubleshooting
- API Reference
Overview
Tool-Level Caching provides intelligent, automatic caching for any tool in the Spice Framework. It dramatically reduces execution time for expensive operations like database queries, external API calls, and heavy computations.
Why Tool-Level Caching?
Without Caching:
Request 1: fetch_user(id=123) → Database query (100ms)
Request 2: fetch_user(id=123) → Database query (100ms)
Request 3: fetch_user(id=123) → Database query (100ms)
Total: 300ms
With Caching:
Request 1: fetch_user(id=123) → Database query (100ms) → CACHED
Request 2: fetch_user(id=123) → Cache hit (0.001ms)
Request 3: fetch_user(id=123) → Cache hit (0.001ms)
Total: ~100ms (about 67% less time, roughly 3x faster)
Key Features
| Feature | Description | Benefit |
|---|---|---|
| TTL Expiration | Automatic time-based invalidation | Fresh data without manual clearing |
| LRU Eviction | Least Recently Used removal | Bounded memory usage |
| Context-Aware Keys | Tenant/user/session in keys | Perfect for multi-tenancy |
| Metrics | Hits, misses, hit rate tracking | Monitor cache efficiency |
| Custom Keys | Full control over key generation | Optimize for your use case |
| Thread-Safe | Concurrent access without locks | High-performance under load |
| DSL Integration | Works with contextAwareTool | Clean, declarative syntax |
When to Use Caching
✅ Good Use Cases:
- Database queries (user lookups, policy retrieval)
- External API calls (weather, geocoding, translation)
- Expensive computations (NLP, image processing, PDF parsing)
- Static/semi-static data (configuration, catalogs)
- High-frequency repeated requests
❌ Bad Use Cases:
- Real-time data (stock prices, live updates)
- Write operations (mutations should invalidate the cache, not be cached)
- One-time operations (no repeated access)
- Tiny operations (<1ms execution time)
Quick Start
1. Basic Caching
The simplest way to add caching:
// Original tool
val userLookup = SimpleTool("user_lookup") { params ->
val userId = params["id"] as String
database.findUser(userId) // Expensive!
}
// Add caching with one line
val cachedUserLookup = userLookup.cached(
ttl = 300, // Cache for 5 minutes
maxSize = 1000 // Keep up to 1000 users
)
// Use it normally
val result = cachedUserLookup.execute(mapOf("id" to "user-123"))
// First call: Database query (slow)
// Second call: Cache hit (instant!)
2. Context-Aware Caching
For multi-tenant tools, use the cache {} DSL:
val policyLookup = contextAwareTool("policy_lookup") {
description = "Get tenant policy configuration"
param("policyType", "string", "Policy type", required = true)
// Configure caching with context awareness
cache {
// Cache key includes tenant ID automatically
keyBuilder = { params, context ->
"${context.tenantId}|policy:${params["policyType"]}"
}
ttl = 600 // 10 minutes
maxSize = 500 // Up to 500 cached policies in total, shared across tenants
}
execute { params, context ->
val policyType = params["policyType"] as String
val tenantId = context.tenantId!!
// Expensive policy fetch
policyService.getPolicy(tenantId, policyType)
}
}
// Execute with context
withAgentContext("tenantId" to "ACME") {
val result = policyLookup.execute(mapOf("policyType" to "premium"))
// First call: Fetches from service
// Second call: Cache hit (tenant-isolated)
}
3. Zero-Config Caching
For simple cases, let the framework handle everything:
val weatherTool = contextAwareTool("weather") {
param("city", "string", required = true)
// Default caching: auto key, 1hr TTL, 1000 entries
cache { }
execute { params, context ->
val city = params["city"] as String
weatherApi.getWeather(city)
}
}
Core Concepts
Cache Lifecycle
┌─────────────────────────────────────────────────────────┐
│ 1. Request arrives with parameters                      │
└─────────────┬───────────────────────────────────────────┘
              │
              ▼
┌─────────────────────────────────────────────────────────┐
│ 2. Generate cache key from params + context             │
│    Example: "tenant:ACME|user:123|doc:456"              │
└─────────────┬───────────────────────────────────────────┘
              │
              ▼
┌─────────────────────────────────────────────────────────┐
│ 3. Check cache for existing entry                       │
└─────────────┬───────────────────────────────────────────┘
              │
        ┌─────┴─────┐
        │           │
    HIT │      MISS │
        │           │
        ▼           ▼
    ┌────────┐  ┌─────────────────────────┐
    │ Return │  │ 4. Execute tool         │
    │ cached │  │ 5. Store result in cache│
    │ result │  │ 6. Return result        │
    └────────┘  └─────────────────────────┘
Cache Entry Structure
data class CacheEntry(
val result: SpiceResult<ToolResult>, // Cached tool result
val timestamp: Long, // Creation time (milliseconds)
val accessTime: Long // Last access time (for LRU)
)
TTL (Time-To-Live)
TTL defines how long an entry stays valid:
cache { ttl = 300 } // 300 seconds = 5 minutes
// Timeline:
// t=0s : Entry created
// t=150s : Entry accessed → Cache hit
// t=300s : TTL reached (the entry expires once its age exceeds 300s)
// t=301s : Entry accessed → Cache miss (expired)
Expiration Check:
fun isExpired(entry: CacheEntry): Boolean {
val age = System.currentTimeMillis() - entry.timestamp
return age > (ttl * 1000)
}
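The boundary behavior of this check is easy to misread, so here is a minimal standalone sketch of it. This is not the framework's implementation: `isExpiredAt` and its explicit `nowMillis` parameter are hypothetical, introduced only so the comparison can be exercised without waiting on a real clock.

```kotlin
// Hypothetical helper: the same comparison as the expiration check above,
// with the current time passed in so it can be tested deterministically.
fun isExpiredAt(createdAtMillis: Long, nowMillis: Long, ttlSeconds: Long): Boolean {
    val ageMillis = nowMillis - createdAtMillis
    // Strict '>' means an entry is still valid at exactly ttlSeconds of age.
    return ageMillis > ttlSeconds * 1000
}
```

With a TTL of 300 seconds, an entry aged 150 s or exactly 300 s would still be served under this comparison, while one aged 301 s would count as a miss.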
LRU (Least Recently Used) Eviction
When the cache is full, the least recently accessed entry is removed:
cache { maxSize = 3 }
// Operations:
put("A") → Cache: [A]
put("B") → Cache: [A, B]
put("C") → Cache: [A, B, C] // Full!
get("A") → Cache: [B, C, A] // A moved to end (recently used)
put("D") → Cache: [C, A, D] // B evicted (least recently used)
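The trace above can be reproduced with a small sketch built on `java.util.LinkedHashMap` in access-order mode. This only illustrates the LRU policy; it is single-threaded and is not a claim about how CachedTool stores entries. `LruMap` and `replayLruTrace` are hypothetical names.

```kotlin
import java.util.LinkedHashMap

// Sketch only: accessOrder = true moves an entry to the end of the
// iteration order whenever it is read or written.
class LruMap<K, V>(private val maxSize: Int) : LinkedHashMap<K, V>(16, 0.75f, true) {
    // Invoked by LinkedHashMap after each insertion; returning true drops the eldest entry.
    override fun removeEldestEntry(eldest: MutableMap.MutableEntry<K, V>?): Boolean =
        size > maxSize
}

// Replays the trace and returns the keys from least to most recently used.
fun replayLruTrace(): List<String> {
    val cache = LruMap<String, Int>(maxSize = 3)
    cache["A"] = 1
    cache["B"] = 2
    cache["C"] = 3   // Cache is now full
    cache["A"]       // Reading A makes B the least recently used entry
    cache["D"] = 4   // Inserting D evicts B
    return cache.keys.toList()
}
```

Calling `replayLruTrace()` returns the keys in the order C, A, D, which matches the last line of the trace.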
Context-Aware Keys
Cache keys can include context information for multi-tenant isolation:
// Without context: UNSAFE for multi-tenant
cache {
keyBuilder = { params, _ ->
"doc:${params["id"]}" // ❌ Same key for all tenants!
}
}
// With context: SAFE for multi-tenant
cache {
keyBuilder = { params, context ->
"${context.tenantId}|doc:${params["id"]}" // ✅ Tenant-isolated!
}
}
Configuration
CacheConfigBlock DSL
cache {
// Custom key builder
keyBuilder = { params, context ->
"${context.tenantId}|${params["id"]}"
}
// Time-to-live in seconds
ttl = 300
// Maximum cache size
maxSize = 1000
// Enable/disable metrics collection
enableMetrics = true
}
ToolCacheConfig (Direct API)
val config = ToolCacheConfig(
maxSize = 1000,
ttl = 3600,
enableMetrics = true,
keyBuilder = { params, context ->
"${context?.tenantId}:${params["id"]}" // context is AgentContext? in ToolCacheConfig, so use a safe call
}
)
val cachedTool = CachedTool(baseTool, config)
Default Values
ToolCacheConfig(
maxSize = 1000, // 1000 entries
ttl = 3600, // 1 hour
enableMetrics = true, // Metrics enabled
keyBuilder = null // Auto-generated keys
)
Cache Key Generation
Automatic Key Generation
When keyBuilder is not provided, keys are generated automatically:
// Inputs:
params = mapOf("sku" to "ABC123", "version" to "v2")
context = AgentContext(tenantId = "ACME", userId = "user-456")
// Generated key:
"sku=ABC123|version=v2::tenantId=ACME|userId=user-456"
// → SHA-256 hash → "7f3a8c2d..."
Algorithm:
- Sort parameters alphabetically
- Format as key1=value1|key2=value2
- Append context as ::tenantId=X|userId=Y
- Compute SHA-256 hash for consistent length
Parameters Excluded:
- Internal parameters starting with __ (e.g., __context, __internal)
- Null values
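The algorithm and exclusion rules above can be sketched in plain Kotlin as follows. This is an illustration under stated assumptions, not the framework's code: the context is modeled as a simple map, context keys are sorted the same way as parameters, and the function names (`canonicalKeyString`, `sha256Hex`, `autoCacheKey`) are hypothetical.

```kotlin
import java.security.MessageDigest

// Builds the readable form of the key: sorted params, then "::", then sorted context.
// Keys starting with "__" and entries with null values are skipped.
fun canonicalKeyString(params: Map<String, Any?>, context: Map<String, String?>): String {
    val paramPart = params
        .filter { (key, value) -> !key.startsWith("__") && value != null }
        .toSortedMap()
        .entries.joinToString("|") { "${it.key}=${it.value}" }
    val contextPart = context
        .filterValues { it != null }
        .toSortedMap()
        .entries.joinToString("|") { "${it.key}=${it.value}" }
    return "$paramPart::$contextPart"
}

// Hashes a string with SHA-256 and renders it as 64 lowercase hex characters.
fun sha256Hex(input: String): String =
    MessageDigest.getInstance("SHA-256")
        .digest(input.toByteArray(Charsets.UTF_8))
        .joinToString("") { "%02x".format(it) }

// Fixed-length key: the hash of the canonical string.
fun autoCacheKey(params: Map<String, Any?>, context: Map<String, String?>): String =
    sha256Hex(canonicalKeyString(params, context))
```

Because parameters are sorted before hashing, the same logical request produces the same key regardless of the order in which its parameters were supplied.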
Custom Key Builders
Simple Keys
cache {
keyBuilder = { params, _ ->
params["id"] as String // Just use ID
}
}
Multi-Parameter Keys
cache {
keyBuilder = { params, context ->
val userId = params["userId"] as String
val docId = params["docId"] as String
"${context.tenantId}:$userId:$docId"
}
}
Normalized Keys
cache {
keyBuilder = { params, context ->
val query = (params["query"] as String).lowercase().trim()
val category = params["category"] as? String ?: "all"
"${context.tenantId}|search:$query:$category"
}
}
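Normalization like this is often factored into a helper so every key builder applies the same rules. The sketch below is one possible version; `normalizeForKey` is a hypothetical name, and collapsing internal whitespace is an extra step that the example above does not perform.

```kotlin
// Hypothetical helper: canonicalizes free-text input before it goes into a cache key.
fun normalizeForKey(raw: String): String =
    raw.trim()
        .lowercase()
        // Extra step beyond the example above: collapse runs of whitespace to one space.
        .replace(Regex("\\s+"), " ")
```

With this helper, inputs that differ only in case or spacing map to the same key segment, which tends to raise the hit rate for free-text queries.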
Conditional Keys
cache {
keyBuilder = { params, context ->
val includeArchived = params["includeArchived"] as? Boolean ?: false
if (includeArchived) {
// Different cache for archived queries
"${context.tenantId}|with-archived|${params["query"]}"
} else {
"${context.tenantId}|active-only|${params["query"]}"
}
}
}
Hierarchical Keys
cache {
keyBuilder = { params, context ->
// Cache at different levels
when {
context.userId != null ->
"user:${context.userId}|${params["id"]}"
context.tenantId != null ->
"tenant:${context.tenantId}|${params["id"]}"
else ->
"global|${params["id"]}"
}
}
}
Usage Patterns
Pattern 1: Database Query Caching
Scenario: Reduce database load for frequently accessed users.
val userProfileTool = contextAwareTool("get_user_profile") {
description = "Fetch complete user profile"
param("userId", "string", "User identifier", required = true)
param("includePreferences", "boolean", "Include user preferences", required = false)
cache {
keyBuilder = { params, context ->
val userId = params["userId"] as String
val includePrefs = params["includePreferences"] as? Boolean ?: false
"${context.tenantId}|user:$userId|prefs:$includePrefs"
}
ttl = 600 // 10 minutes (profiles don't change often)
maxSize = 5000 // Cache 5000 user profiles
}
execute { params, context ->
val userId = params["userId"] as String
val includePrefs = params["includePreferences"] as? Boolean ?: false
// Expensive database queries
val profile = userRepository.findById(userId, context.tenantId)
if (includePrefs) {
val prefs = preferencesRepository.findByUserId(userId)
profile.copy(preferences = prefs)
} else {
profile
}
}
}
// Usage:
withAgentContext("tenantId" to "ACME") {
// First call: Database queries (slow)
val profile1 = userProfileTool.execute(mapOf("userId" to "u123"))
// Second call: Cache hit (instant!)
val profile2 = userProfileTool.execute(mapOf("userId" to "u123"))
}
Metrics:
- Without caching: 150ms average query time
- With caching: 0.001ms on cache hit
- Expected hit rate: 85-95% for active users
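To see what these numbers imply for average latency, the usual weighted-average estimate can be written out as below. It is a back-of-the-envelope sketch: `expectedLatencyMs` is a hypothetical helper, and it ignores key generation and other per-request overhead.

```kotlin
// Expected average latency in milliseconds for a given hit rate:
// hits pay the cache cost, misses pay the full query cost.
fun expectedLatencyMs(hitRate: Double, missMs: Double, hitMs: Double): Double {
    require(hitRate in 0.0..1.0) { "hitRate must be between 0 and 1" }
    return hitRate * hitMs + (1 - hitRate) * missMs
}
```

Plugging in the figures above (150 ms per query, 0.001 ms per hit) gives roughly 22.5 ms on average at an 85% hit rate and roughly 7.5 ms at 95%.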
Pattern 2: External API Caching
Scenario: Cache slow third-party API responses.
val geocodingTool = contextAwareTool("geocode_address") {
description = "Convert address to coordinates"
param("address", "string", "Full address", required = true)
param("country", "string", "Country code", required = false)
cache {
keyBuilder = { params, _ ->
// Geocoding doesn't need tenant isolation
val address = (params["address"] as String).lowercase().trim()
val country = params["country"] as? String ?: "US"
"geocode:$country:$address"
}
ttl = 86400 // 24 hours (addresses don't move!)
maxSize = 10000 // Cache 10k addresses
}
execute { params, context ->
val address = params["address"] as String
val country = params["country"] as? String ?: "US"
// Expensive external API call (500ms+)
geocodingApiClient.geocode(address, country)
}
}
// Usage:
val coords = geocodingTool.execute(mapOf(
"address" to "1600 Amphitheatre Parkway, Mountain View, CA"
))
Benefits:
- Reduces API costs (pay-per-request APIs)
- Faster response (no network latency)
- Resilience (already-cached addresses keep resolving during a brief API outage, until their TTL expires)
Pattern 3: Computation Caching
Scenario: Cache expensive document analysis results.
val documentAnalysisTool = contextAwareTool("analyze_document") {
description = "Perform NLP analysis on document"
param("documentId", "string", "Document identifier", required = true)
param("analysisType", "string", "Type of analysis", required = true)
cache {
keyBuilder = { params, context ->
val docId = params["documentId"] as String
val analysisType = params["analysisType"] as String
"${context.tenantId}|doc:$docId|analysis:$analysisType"
}
ttl = 3600 // 1 hour
maxSize = 1000 // 1000 analysis results
}
execute { params, context ->
val docId = params["documentId"] as String
val analysisType = params["analysisType"] as String
// Load document
val document = documentRepository.load(docId, context.tenantId)
// Expensive NLP processing (5-30 seconds!)
when (analysisType) {
"sentiment" -> nlpService.analyzeSentiment(document.content)
"entities" -> nlpService.extractEntities(document.content)
"summary" -> nlpService.generateSummary(document.content)
else -> throw IllegalArgumentException("Unknown analysis type")
}
}
}
// Usage:
withAgentContext("tenantId" to "ACME") {
// First analysis: 15 seconds
val sentiment = documentAnalysisTool.execute(mapOf(
"documentId" to "doc-789",
"analysisType" to "sentiment"
))
// Re-run same analysis: 0.001 seconds (cached!)
val sentiment2 = documentAnalysisTool.execute(mapOf(
"documentId" to "doc-789",
"analysisType" to "sentiment"
))
}
Pattern 4: Configuration Caching
Scenario: Cache rarely-changing tenant configuration.
val tenantConfigTool = contextAwareTool("get_tenant_config") {
description = "Get tenant configuration"
cache {
keyBuilder = { _, context ->
// Key only by tenant (no parameters)
"config:${context.tenantId}"
}
ttl = 3600 // 1 hour
maxSize = 100 // 100 tenants
}
execute { _, context ->
val tenantId = context.tenantId!!
// Load configuration (database + compute defaults)
configService.loadTenantConfig(tenantId)
}
}
// Manual cache invalidation on config update
fun updateTenantConfig(tenantId: String, newConfig: Config) {
    configService.save(tenantId, newConfig)
    // clearCache() drops the cached config of every tenant, not only tenantId
    tenantConfigTool.clearCache()
}
Pattern 5: Search Result Caching
Scenario: Cache search results for common queries.
val searchTool = contextAwareTool("search_documents") {
description = "Full-text document search"
param("query", "string", "Search query", required = true)
param("filters", "object", "Search filters", required = false)
param("page", "integer", "Page number", required = false)
cache {
keyBuilder = { params, context ->
val query = (params["query"] as String).lowercase().trim()
val filters = params["filters"] as? Map<*, *> ?: emptyMap<String, Any>()
val page = params["page"] as? Int ?: 1
// Serialize filters to string
val filterStr = filters.entries
.sortedBy { it.key.toString() }
.joinToString("|") { "${it.key}=${it.value}" }
"${context.tenantId}|search:$query|filters:$filterStr|page:$page"
}
ttl = 300 // 5 minutes (search indexes update frequently)
maxSize = 5000 // Cache popular searches
}
execute { params, context ->
val query = params["query"] as String
val filters = params["filters"] as? Map<*, *> ?: emptyMap<String, Any>()
val page = params["page"] as? Int ?: 1
// Expensive full-text search
searchEngine.search(
query = query,
filters = filters,
page = page,
tenantId = context.tenantId
)
}
}
Pattern 6: Multi-Level Caching
Scenario: Cache at different granularities for flexibility.
val productDataTool = contextAwareTool("get_product_data") {
description = "Get product information with variant data"
param("sku", "string", "Product SKU", required = true)
param("includeInventory", "boolean", "Include inventory data", required = false)
cache {
keyBuilder = { params, context ->
val sku = params["sku"] as String
val includeInventory = params["includeInventory"] as? Boolean ?: false
if (includeInventory) {
// Separate namespace for responses that include inventory
// (inventory changes frequently)
"${context.tenantId}|product-with-inv:$sku"
} else {
// Namespace for base product data
"${context.tenantId}|product-base:$sku"
}
}
// A single TTL applies to both namespaces; the key builder only returns a key
// and cannot override it. For a shorter inventory TTL, use a separate cached tool.
ttl = 600
maxSize = 10000
}
execute { params, context ->
val sku = params["sku"] as String
val includeInventory = params["includeInventory"] as? Boolean ?: false
val baseProduct = productRepository.findBySku(sku, context.tenantId)
if (includeInventory) {
val inventory = inventoryService.getInventory(sku)
baseProduct.copy(inventory = inventory)
} else {
baseProduct
}
}
}
Advanced Techniques
Cache Warming
Pre-populate cache with frequently accessed data:
suspend fun warmUserCache(tenantId: String, userIds: List<String>) {
withAgentContext("tenantId" to tenantId) {
userIds.forEach { userId ->
// Execute once to populate cache
userLookupTool.execute(mapOf("userId" to userId))
}
}
}
// Call during application startup or maintenance window
warmUserCache("ACME", listOf("admin", "user-1", "user-2"))
Conditional Caching
Cache only certain results:
val conditionalCacheTool = contextAwareTool("conditional_cache") {
param("priority", "string", required = true)
cache {
keyBuilder = { params, context ->
val priority = params["priority"] as String
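// Only reuse cached results for non-urgent requests
if (priority == "urgent") {
// A unique key per call forces a miss. The result is still stored,
// so each urgent call occupies a slot until it expires or is evicted.
"${context.tenantId}|urgent:${System.currentTimeMillis()}"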
} else {
"${context.tenantId}|normal:${params["id"]}"
}
}
ttl = 300
maxSize = 1000
}
execute { params, context ->
// Execute logic...
}
}
Cache Layers
Combine tool caching with application-level caching:
// L1: Application cache (e.g., Redis)
val appCache = RedisCacheManager()
// L2: Tool-level cache (in-memory)
val cachedTool = baseTool.cached(ttl = 300)
// Check L1 first, then L2
suspend fun getCachedResult(key: String, params: Map<String, Any>): Result {
    // Try L1 (Redis)
    appCache.get(key)?.let { return it }
    // Fall through to L2 (tool cache); on a miss there, the tool itself runs
    val result = cachedTool.execute(params)
    // Store in L1 for next time
    appCache.set(key, result)
    return result
}
Cache Partitioning
Separate caches for different use cases:
// High-frequency, short-lived cache
val shortLivedTool = baseTool.cached(ttl = 60, maxSize = 10000)
// Low-frequency, long-lived cache
val longLivedTool = baseTool.cached(ttl = 3600, maxSize = 100)
// Route based on context
fun selectTool(context: AgentContext): Tool {
return if (context.metadata["cacheStrategy"] == "short") {
shortLivedTool
} else {
longLivedTool
}
}
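Dynamic TTL
Vary the effective lifetime based on data characteristics. As configured in this guide, the cache block takes a single ttl, so a TTL hint placed in the key only separates entries into namespaces; every entry still expires after the configured ttl. To enforce different lifetimes, give each data type its own cached tool (see Cache Partitioning):
cache {
    keyBuilder = { params, context ->
        val dataType = params["dataType"] as String
        val id = params["id"] as String
        // TTL hint encoded in the key (informational; not enforced by the cache)
        val ttlHint = when (dataType) {
            "static" -> 86400 // 24 hours
            "dynamic" -> 300 // 5 minutes
            "realtime" -> 10 // 10 seconds
            else -> 600
        }
        "${context.tenantId}|$dataType:$id|ttl:$ttlHint"
    }
    ttl = 600 // The TTL actually applied to every entry
    maxSize = 5000
}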
Performance
Benchmarks
Typical performance characteristics:
| Operation | Time | Notes |
|---|---|---|
| Cache Hit | ~1 µs | In-memory map lookup |
| Cache Miss | Tool execution + ~1 µs | Store in cache |
| Key Generation (Auto) | ~10 µs | SHA-256 hash |
| Key Generation (Custom) | ~0.1 µs | String concatenation |
| LRU Eviction | ~50 µs | Find + remove LRU entry |
| Metrics Update | ~0.01 µs | Atomic increment |
Memory Usage
Estimate memory footprint:
// Formula (avgResultSize in bytes):
// Memory (MB) = (maxSize × avgResultSize) / 1024 / 1024
// Example 1: User profiles
// maxSize = 5000
// avgResultSize = 2 KB = 2048 bytes
// Memory = (5000 × 2048) / 1024 / 1024 ≈ 10 MB
// Example 2: Search results
// maxSize = 10000
// avgResultSize = 10 KB = 10240 bytes
// Memory = (10000 × 10240) / 1024 / 1024 ≈ 98 MB
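The formula can be wrapped in a small helper for capacity planning. This is a sketch: `estimateCacheMemoryMb` is a hypothetical name, and the result covers only the cached payloads, not per-entry overhead such as keys, timestamps, and map nodes.

```kotlin
// Estimated payload memory (in MB, 1 MB = 1024 * 1024 bytes) when the cache is full.
// Per-entry overhead is not included, so real usage will be somewhat higher.
fun estimateCacheMemoryMb(maxSize: Int, avgResultSizeBytes: Long): Double =
    maxSize.toLong() * avgResultSizeBytes / (1024.0 * 1024.0)
```

For the two examples above, the helper yields about 9.8 MB for 5000 entries of 2 KB and about 97.7 MB for 10000 entries of 10 KB.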
Monitoring Memory:
val stats = cachedTool.getCacheStats()
val estimatedMemoryMB = (stats.size * 2) / 1024 // Assume 2KB per entry
if (estimatedMemoryMB > 100) {
println("Warning: Cache using ${estimatedMemoryMB}MB")
}
Optimization Tips
1. Right-Size Your Cache
// Too small: Low hit rate, frequent evictions
cache { maxSize = 10 } // ❌
// Too large: Wasted memory
cache { maxSize = 1000000 } // ❌
// Just right: Based on working set
cache { maxSize = 5000 } // ✅
2. Choose Optimal TTL
// Static data: Long TTL
cache { ttl = 86400 } // 24 hours
// Semi-static: Medium TTL
cache { ttl = 3600 } // 1 hour
// Dynamic data: Short TTL
cache { ttl = 300 } // 5 minutes
// Real-time data: Don't cache!
// (no cache block)
3. Efficient Key Builders
// ❌ Inefficient: Serialize to JSON
cache {
keyBuilder = { params, context ->
json.encodeToString(params) // Slow!
}
}
// ✅ Efficient: String concatenation
cache {
keyBuilder = { params, context ->
"${context.tenantId}|${params["id"]}" // Fast!
}
}
4. Disable Metrics in Production (if not needed)
cache {
enableMetrics = false // Saves ~0.01 µs per operation
}
Concurrency Performance
CachedTool is designed for concurrent use:
// Load test: 1000 concurrent requests
// (launch needs a coroutine scope, so run this inside a suspend function)
val tool = baseTool.cached(ttl = 3600)
coroutineScope {
    repeat(1000) {
        launch {
            tool.execute(params)
        }
    }
}
// Performance characteristics:
// - No global lock on reads
// - ConcurrentHashMap scales well as concurrency grows
// - Atomic metrics updates
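The combination of a concurrent map and atomic counters can be sketched without the framework, using plain JVM threads. Everything here is illustrative: `CountingCache` and `runLoadTest` are hypothetical names, and using `computeIfAbsent` to load each key at most once is a design choice of this sketch, not a documented property of CachedTool.

```kotlin
import java.util.concurrent.ConcurrentHashMap
import java.util.concurrent.Executors
import java.util.concurrent.TimeUnit
import java.util.concurrent.atomic.AtomicLong

// Sketch: thread-safe cache with hit/miss counters.
class CountingCache<K : Any, V : Any>(private val loader: (K) -> V) {
    private val store = ConcurrentHashMap<K, V>()
    val hits = AtomicLong()
    val misses = AtomicLong()

    fun get(key: K): V {
        var loaded = false
        // computeIfAbsent applies the loader at most once per key, atomically.
        val value = store.computeIfAbsent(key) { k ->
            loaded = true
            loader(k)
        }
        if (loaded) misses.incrementAndGet() else hits.incrementAndGet()
        return value
    }
}

// Fires `requests` lookups over `uniqueKeys` keys from a thread pool
// and returns (hits, misses).
fun runLoadTest(requests: Int, uniqueKeys: Int, threads: Int = 8): Pair<Long, Long> {
    val cache = CountingCache<String, String> { key -> "value-for-$key" }
    val pool = Executors.newFixedThreadPool(threads)
    repeat(requests) { i ->
        pool.execute { cache.get((i % uniqueKeys).toString()) }
    }
    pool.shutdown()
    pool.awaitTermination(30, TimeUnit.SECONDS)
    return cache.hits.get() to cache.misses.get()
}
```

In this sketch, 1000 requests over 10 distinct keys produce exactly 10 misses and 990 hits, because each key is loaded once no matter how many threads ask for it at the same time.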
Monitoring & Debugging
Cache Statistics
Get real-time cache metrics:
val stats = cachedTool.getCacheStats()
println("""
Tool: ${stats.toolName}
Size: ${stats.size} / ${stats.maxSize}
Hits: ${stats.hits}
Misses: ${stats.misses}
Hit Rate: ${"%.2f".format(stats.hitRate * 100)}%
TTL: ${stats.ttl}s
""".trimIndent())
Output:
Tool: user_lookup
Size: 347 / 1000
Hits: 8532
Misses: 1247
Hit Rate: 87.25%
TTL: 600s
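The hit rate in this output is simply hits divided by total lookups. A minimal sketch of that calculation follows; `hitRate` and `formatPercent` are hypothetical helpers, and returning 0.0 for an unused cache is an assumption of this sketch rather than documented framework behavior.

```kotlin
import java.util.Locale

// Hit rate as a fraction in [0, 1]; returns 0.0 before any lookup has been recorded.
fun hitRate(hits: Long, misses: Long): Double {
    val total = hits + misses
    return if (total == 0L) 0.0 else hits.toDouble() / total
}

// Formats a fraction as a percentage with two decimals, e.g. 0.5 -> "50.00%".
fun formatPercent(fraction: Double): String =
    String.format(Locale.ROOT, "%.2f%%", fraction * 100)
```

Guarding the zero-lookup case avoids a NaN result when statistics are read before the first request.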
Logging
Enable debug logging to trace cache behavior:
// In your logging config
logger("io.github.noailabs.spice.performance.CachedTool").level = Level.DEBUG
// Logs:
// [DEBUG] Cache key generated: tenant:ACME|user:123
// [DEBUG] Cache HIT for key: tenant:ACME|user:123
// [DEBUG] Cache MISS for key: tenant:ACME|user:456
// [DEBUG] LRU eviction: removed key tenant:OLD|user:789
Metrics Export
Export cache metrics to monitoring systems:
// Periodically export to Prometheus/Grafana
fun exportCacheMetrics() {
val tools = listOf(userLookup, policyLookup, documentAnalysis)
tools.forEach { tool ->
val stats = tool.getCacheStats()
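// Include the tool name so that tools do not overwrite each other's series
metricsRegistry.gauge("cache_size.${stats.toolName}", stats.size.toDouble())
metricsRegistry.gauge("cache_hit_rate.${stats.toolName}", stats.hitRate)
metricsRegistry.counter("cache_hits.${stats.toolName}", stats.hits)
metricsRegistry.counter("cache_misses.${stats.toolName}", stats.misses)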
}
}
// Schedule export every 30 seconds
scheduler.scheduleAtFixedRate(::exportCacheMetrics, 0, 30, TimeUnit.SECONDS)
Debug Utilities
Inspect cache contents:
// Access internal cache (for debugging only!)
val ttlSeconds = 300L
val cachedTool = userLookup.cached(ttl = ttlSeconds)
val internalCache = (cachedTool as CachedTool).getCache()
// Print all keys
internalCache.keys.forEach { key ->
    println("Cached key: $key")
}
// Check specific entry
val entry = internalCache["tenant:ACME|user:123"]
if (entry != null) {
    val age = System.currentTimeMillis() - entry.timestamp
    println("Entry age: ${age / 1000}s")
    println("Expires in: ${(ttlSeconds * 1000 - age) / 1000}s")
}
Best Practices
1. Always Include Tenant in Multi-Tenant Systems
// ❌ BAD: Tenant data leak
cache {
keyBuilder = { params, _ ->
"user:${params["userId"]}" // Same key for all tenants!
}
}
// ✅ GOOD: Tenant isolation
cache {
keyBuilder = { params, context ->
"${context.tenantId}|user:${params["userId"]}"
}
}
2. Cache Only Successful Results
CachedTool automatically skips caching errors:
execute { params, context ->
val result = riskyOperation()
if (result.isError) {
// Not cached! ✅
return@execute ToolResult.error(result.message)
}
// Cached only if successful ✅
ToolResult.success(result.data)
}
3. Choose TTL Based on Data Volatility
// User profiles: change infrequently
cache { ttl = 3600 } // 1 hour
// Search results: change frequently
cache { ttl = 300 } // 5 minutes
// Real-time stock prices: don't cache
// (no cache block)
4. Monitor Hit Rates
// Low hit rate = ineffective caching
val stats = cachedTool.getCacheStats()
if (stats.hitRate < 0.5) {
log.warn("Low cache hit rate: ${stats.hitRate}")
// Consider:
// - Increasing maxSize
// - Increasing TTL
// - Reviewing key builder logic
}
5. Clear Cache on Mutations
// When data changes, invalidate cache
fun updateUser(userId: String, newData: UserData) {
userRepository.update(userId, newData)
userLookupTool.clearCache() // Invalidate entire cache
}
// Or use short TTL for frequently updated data
cache { ttl = 60 } // 1 minute
6. Test Cache Behavior
@Test
fun `test caching works`() = runBlocking {
var execCount = 0
val tool = contextAwareTool("test") {
cache { ttl = 300 }
execute { _, _ ->
execCount++
"result"
}
}
withAgentContext("tenantId" to "TEST") {
tool.execute(mapOf("id" to "123"))
tool.execute(mapOf("id" to "123"))
tool.execute(mapOf("id" to "123"))
assertEquals(1, execCount) // Only executed once!
}
}
7. Handle Missing Context Gracefully
cache {
keyBuilder = { params, context ->
val tenantId = context.tenantId ?: "default" // Fallback
"$tenantId|${params["id"]}"
}
}
8. Use Structured Keys
// ❌ BAD: Ambiguous
"user123policy456" // Where does user end and policy start?
// ✅ GOOD: Clear structure
"tenant:ACME|user:123|policy:456"
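One way to keep keys structured is to build them through a small helper instead of ad hoc string templates. The sketch below is a suggestion only; `structuredKey` is a hypothetical function, and rejecting delimiter characters is a design choice of this sketch.

```kotlin
// Hypothetical helper: joins labeled segments into a key such as "tenant:ACME|user:123".
// Labels and values containing a delimiter are rejected so segments cannot run together.
fun structuredKey(vararg segments: Pair<String, String>): String {
    require(segments.isNotEmpty()) { "At least one segment is required" }
    return segments.joinToString("|") { (label, value) ->
        require(':' !in label && '|' !in label) { "Delimiter in label: $label" }
        require('|' !in value) { "Delimiter in value for segment: $label" }
        "$label:$value"
    }
}
```

For example, `structuredKey("tenant" to "ACME", "user" to "123", "policy" to "456")` produces the structured key shown above, and two requests whose values merely concatenate to the same text still get different keys.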
Testing
Unit Tests
Test cache hit/miss behavior:
@Test
fun `first call misses cache, second hits`() = runBlocking {
val tool = baseTool.cached(ttl = 300)
withAgentContext("tenantId" to "TEST") {
tool.execute(mapOf("id" to "123")) // Miss
tool.execute(mapOf("id" to "123")) // Hit
val stats = tool.getCacheStats()
assertEquals(1, stats.hits)
assertEquals(1, stats.misses)
}
}
Test TTL expiration:
@Test
fun `cache expires after TTL`() = runBlocking {
val tool = baseTool.cached(ttl = 1) // 1 second
// Note: .cached() uses parameter-only cache keys (no context needed)
tool.execute(mapOf("id" to "123"))
delay(1100) // Wait for expiration
tool.execute(mapOf("id" to "123"))
val stats = tool.getCacheStats()
assertEquals(0, stats.hits) // Both were misses
assertEquals(2, stats.misses)
}
Test LRU eviction:
@Test
fun `LRU eviction removes least recently used`() = runBlocking {
    val tool = baseTool.cached(ttl = 3600, maxSize = 2)
    tool.execute(mapOf("id" to "A")) // Miss. Cache: [A]
    tool.execute(mapOf("id" to "B")) // Miss. Cache: [A, B]
    tool.execute(mapOf("id" to "A")) // Hit. Cache: [B, A] (A accessed)
    tool.execute(mapOf("id" to "C")) // Miss. Cache: [A, C] (B evicted!)
    tool.execute(mapOf("id" to "A")) // Hit: A survived the eviction
    val stats = tool.getCacheStats()
    assertEquals(2, stats.size)
    assertEquals(2, stats.hits) // Would be 1 if A had been evicted instead of B
    assertEquals(3, stats.misses)
}
Test tenant isolation:
@Test
fun `different tenants have separate cache entries`() = runBlocking {
val tool = contextAwareTool("test") {
cache {
keyBuilder = { params, context ->
"${context.tenantId}|${params["id"]}"
}
}
execute { params, _ -> "result:${params["id"]}" }
}
// Tenant A
withAgentContext("tenantId" to "TENANT_A") {
tool.execute(mapOf("id" to "123"))
}
// Tenant B (different cache entry!)
withAgentContext("tenantId" to "TENANT_B") {
tool.execute(mapOf("id" to "123"))
}
val stats = tool.getCacheStats()
assertEquals(0, stats.hits) // Both were misses (different keys)
assertEquals(2, stats.misses)
assertEquals(2, stats.size) // 2 separate entries
}
Integration Tests
Verify that caching reduces repository calls (a stub stands in for the real database here):
@Test
fun `caching reduces database queries`() = runBlocking {
var dbQueryCount = 0
val mockRepo = object : UserRepository {
override fun findById(id: String): User {
dbQueryCount++
return User(id, "Test User")
}
}
val tool = contextAwareTool("user_lookup") {
cache { ttl = 300 }
execute { params, _ ->
mockRepo.findById(params["id"] as String)
}
}
repeat(10) {
tool.execute(mapOf("id" to "user-123"))
}
assertEquals(1, dbQueryCount) // Only 1 query despite 10 calls!
}
Load Tests
Test concurrent access:
@Test
fun `cache handles concurrent access`() = runBlocking {
val tool = baseTool.cached(ttl = 3600)
// 1000 concurrent requests
val jobs = List(1000) {
async {
tool.execute(mapOf("id" to (it % 10).toString()))
}
}
jobs.awaitAll()
val stats = tool.getCacheStats()
assertTrue(stats.size <= 10) // At most 10 unique entries
assertTrue(stats.hits > 900) // High hit rate
}
Troubleshooting
Cache Not Working
Symptom: Every request is a cache miss.
Possible Causes:
- Context not propagated:
// ❌ Wrong: No context
tool.execute(params)
// ✅ Correct: With context
withAgentContext("tenantId" to "ACME") {
    tool.execute(params)
}
- Inconsistent parameters:
// These generate different cache keys:
tool.execute(mapOf("id" to "123", "debug" to true))
tool.execute(mapOf("id" to "123", "debug" to false))
- TTL too short:
cache { ttl = 1 } // Expires too quickly!
Low Hit Rate
Symptom: Hit rate is consistently below 50%.
Solutions:
- Increase cache size:
cache { maxSize = 10000 } // Was 1000
- Increase TTL:
cache { ttl = 3600 } // Was 300
- Review key builder:
// ❌ Too specific (creates many keys)
keyBuilder = { params, context ->
    "${context.tenantId}|${params}|${System.currentTimeMillis()}"
}
// ✅ More general (better reuse)
keyBuilder = { params, context ->
    "${context.tenantId}|${params["id"]}"
}
Memory Issues
Symptom: High memory usage or OOM errors.
Solutions:
- Reduce maxSize:
cache { maxSize = 1000 } // Was 10000
- Reduce TTL:
cache { ttl = 300 } // Entries expire faster
- Clean up expired entries on a schedule:
scheduler.scheduleAtFixedRate({
    cachedTool.cleanupExpired()
}, 0, 5, TimeUnit.MINUTES)
Stale Data
Symptom: Getting outdated cached results.
Solutions:
- Reduce TTL:
cache { ttl = 60 } // 1 minute instead of 1 hour
- Clear cache on mutations:
fun updateData() {
    repository.update()
    cachedTool.clearCache()
}
- Use cache versioning:
cache {
    keyBuilder = { params, context ->
        val version = getCurrentDataVersion()
        "${context.tenantId}|${params["id"]}|v:$version"
    }
}
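The versioning approach relies on some source of a data version, which this guide leaves open. One possible stand-in is a counter bumped on every write, sketched below; `DataVersion` and `versionedKey` are hypothetical names, not part of the framework.

```kotlin
import java.util.concurrent.atomic.AtomicLong

// Hypothetical stand-in for getCurrentDataVersion(): a counter bumped on every write.
class DataVersion {
    private val version = AtomicLong(1)
    fun current(): Long = version.get()
    fun bump(): Long = version.incrementAndGet()
}

// Builds a key in the same shape as the versioned key builder above.
fun versionedKey(tenantId: String, id: String, version: Long): String =
    "$tenantId|$id|v:$version"
```

After a `bump()`, lookups produce a new key, so entries written under the old version are never read again. They are not removed immediately, though; they stay in the cache until TTL expiration or LRU eviction reclaims them.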
Tenant Data Leak
Symptom: Users seeing other tenants' data.
Solution: Always include tenant in cache key:
// ❌ INSECURE
cache {
    keyBuilder = { params, _ ->
        "user:${params["userId"]}"
    }
}
// ✅ SECURE
cache {
keyBuilder = { params, context ->
"${context.tenantId}|user:${params["userId"]}"
}
}
API Reference
CachedTool
class CachedTool(
private val delegate: Tool,
private val config: ToolCacheConfig = ToolCacheConfig()
) : Tool {
// Get cache statistics
fun getCacheStats(): ToolCacheStats
// Clear all cached entries
fun clearCache()
// Remove expired entries
fun cleanupExpired()
// Execute with caching
override suspend fun execute(parameters: Map<String, Any>): SpiceResult<ToolResult>
}
ToolCacheConfig
data class ToolCacheConfig(
val maxSize: Int = 1000,
val ttl: Long = 3600,
val enableMetrics: Boolean = true,
val keyBuilder: ((Map<String, Any>, AgentContext?) -> String)? = null
)
ToolCacheStats
data class ToolCacheStats(
val toolName: String,
val size: Int,
val maxSize: Int,
val hits: Long,
val misses: Long,
val hitRate: Double,
val ttl: Long
) {
override fun toString(): String =
"""
Tool Cache Statistics ($toolName):
- Size: $size / $maxSize
- Hits: $hits
- Misses: $misses
- Hit Rate: ${"%.2f".format(hitRate * 100)}%
- TTL: ${ttl}s
""".trimIndent()
}
Extension Functions
// Wrap tool with caching
fun Tool.cached(
keyBuilder: ((Map<String, Any>, AgentContext?) -> String)? = null,
ttl: Long = 3600,
maxSize: Int = 1000,
enableMetrics: Boolean = true
): Tool
// Create cached tool
fun cachedTool(
delegate: Tool,
config: ToolCacheConfig = ToolCacheConfig()
): Tool
DSL (ContextAwareTool)
contextAwareTool("my_tool") {
// ... tool config ...
cache {
keyBuilder = { params, context ->
"${context.tenantId}|${params["id"]}"
}
ttl = 300
maxSize = 1000
enableMetrics = true
}
execute { params, context ->
// ... tool logic ...
}
}
Related Documentation
- Output Validation - Validate cached results
- Context-Aware Tools - Build tools with context
- Tool Pipeline DSL - Chain cached tools
- Performance Overview - Other optimization techniques
Summary
Tool-Level Caching provides:
✅ Automatic caching with minimal configuration
✅ TTL-based expiration for fresh data
✅ LRU eviction for bounded memory
✅ Context-aware keys for multi-tenancy
✅ Comprehensive metrics for monitoring
✅ Thread safety for high concurrency
✅ Flexible key builders for custom logic
Start optimizing your tools today!