exact-cache
Category: cache · Cloud + Local · Status: v1 — production
Hashes the canonical request. If the same hash has a cached response, return it without calling the provider. A hit costs only the key-value lookup. Only identical canonical requests match, so there is no risk of false positives.
What it does
The dumbest possible cache. No embeddings, no similarity, no surprises.
When to use it
✅ Dev/test loops (same request fired repeatedly)
✅ Idempotent endpoints (look up X, format X)
✅ Workloads that hit the same prompt verbatim
✅ As the first module in a pipeline that also has semantic-cache — exact hits short-circuit before the more expensive embedding lookup.
❌ Anything with a timestamp, request ID, or per-call random nonce in the prompt
Configuration
exact-cache:
  ttlSeconds: 3600
  scope: 'user'        # 'user' | 'global'
  includeSystem: true  # include system prompt in the hash?
  excludeFields: []    # canonical fields to exclude from the hash
Metrics emitted
- cache.exact.hit (boolean)
- cache.exact.lookup_ms (number)
Examples
Default — fast cache for true repeats:
exact-cache:
  ttlSeconds: 3600
Long TTL for stable workloads:
exact-cache:
  ttlSeconds: 86400
Composed with semantic (recommended order):
pipeline:
  - exact-cache:
      ttlSeconds: 3600
  - semantic-cache:
      similarity: 0.88
      ttlSeconds: 3600
The exact cache catches identical hits in microseconds. The semantic cache catches paraphrases in the low-millisecond range.
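The ordering above can be sketched as a chain of lookups where the first hit wins. This is an illustration only: the `CacheModule` interface, `runPipeline`, and both stub modules are hypothetical names, and the real module API is not shown in this document.

```typescript
// Hypothetical module shape: a lookup returns a cached response, or undefined on a miss.
interface CacheModule {
  name: string;
  lookup(prompt: string): string | undefined;
}

// Walk the modules in order and stop at the first hit,
// so a cheap module placed first shields the more expensive ones behind it.
function runPipeline(
  modules: CacheModule[],
  prompt: string,
): { response: string; servedBy: string } | undefined {
  for (const mod of modules) {
    const response = mod.lookup(prompt);
    if (response !== undefined) {
      return { response, servedBy: mod.name };
    }
  }
  return undefined; // full miss: the caller would go on to the provider
}

// Stub modules that record whether they were consulted.
const consulted: string[] = [];
const exactStub: CacheModule = {
  name: "exact-cache",
  lookup: (p) => {
    consulted.push("exact-cache");
    return p === "What is 2 + 2?" ? "4" : undefined; // verbatim match only
  },
};
const semanticStub: CacheModule = {
  name: "semantic-cache",
  lookup: (p) => {
    consulted.push("semantic-cache");
    return p.includes("2 + 2") ? "4" : undefined; // stand-in for a similarity match
  },
};

const hit = runPipeline([exactStub, semanticStub], "What is 2 + 2?");
```

With the verbatim prompt, only the exact stub is consulted. A paraphrase such as "Tell me 2 + 2" falls through to the semantic stub.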
How it works
- Pre hook:
  - Canonicalize the request (sort keys, normalize whitespace, drop metadata).
  - SHA-256 the canonical JSON → cache key.
  - kv.get() the key.
  - If hit: return cached response, skip provider call.
  - If miss: continue.
- Post hook (on miss):
  - kv.set() the response under the canonical key with a TTL of ttlSeconds.
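A minimal sketch of the two hooks, under stated assumptions: the `Kv` interface is synchronous here for brevity (a real backend would likely be async), the metadata field names `requestId` and `timestamp` are guesses, and "normalize whitespace" is read as trimming and collapsing runs of whitespace inside strings. `canonicalize`, `cacheKey`, and `handle` are hypothetical names, not the module's actual API.

```typescript
import { createHash } from "node:crypto";

type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

// Assumed metadata fields to drop; the real list is not given in this document.
const METADATA_FIELDS = new Set(["requestId", "timestamp"]);

// Recursively sort object keys, collapse whitespace in strings, and drop metadata.
function canonicalize(value: Json): Json {
  if (typeof value === "string") return value.trim().replace(/\s+/g, " ");
  if (Array.isArray(value)) return value.map((item) => canonicalize(item));
  if (value !== null && typeof value === "object") {
    const out: { [key: string]: Json } = {};
    const sorted = Object.entries(value).sort(([a], [b]) => (a < b ? -1 : a > b ? 1 : 0));
    for (const [key, child] of sorted) {
      if (!METADATA_FIELDS.has(key)) out[key] = canonicalize(child);
    }
    return out;
  }
  return value; // null, boolean, number pass through unchanged
}

// SHA-256 of the canonical JSON, hex-encoded, used as the cache key.
function cacheKey(request: Json): string {
  return createHash("sha256").update(JSON.stringify(canonicalize(request))).digest("hex");
}

// Hypothetical key-value interface standing in for the real backend.
interface Kv {
  get(key: string): string | undefined;
  set(key: string, value: string, ttlSeconds: number): void;
}

function handle(
  kv: Kv,
  request: Json,
  callProvider: (request: Json) => string,
  ttlSeconds = 3600,
): string {
  const key = cacheKey(request);            // pre hook: canonicalize and hash
  const cached = kv.get(key);
  if (cached !== undefined) return cached;  // hit: skip the provider call
  const response = callProvider(request);   // miss: continue to the provider
  kv.set(key, response, ttlSeconds);        // post hook: store under the same key
  return response;
}

// Usage: two requests that differ only in key order, whitespace, and metadata.
const store = new Map<string, string>();
const kv: Kv = {
  get: (k) => store.get(k),
  set: (k, v, _ttlSeconds) => { store.set(k, v); }, // TTL is ignored in this stub
};
let providerCalls = 0;
const provider = (_request: Json): string => { providerCalls += 1; return "hello"; };

const a = handle(kv, { model: "m", prompt: "hi  there", requestId: "1" }, provider);
const b = handle(kv, { requestId: "2", prompt: " hi there ", model: "m" }, provider);
```

Both calls canonicalize to the same JSON, so the second is served from the store and the provider stub runs once.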
Streaming
Exact hits on streaming requests replay as a synthetic SSE stream — same shape as a real one.
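One way such a replay could look, as a sketch rather than the module's actual wire format: the cached text is cut into fixed-size chunks and each is framed as a Server-Sent Events `data:` line. The `{ delta }` payload shape, the `[DONE]` sentinel, and the `replayAsSse` name are assumptions made for illustration.

```typescript
// Yield a cached response as a sequence of SSE frames.
function* replayAsSse(cached: string, chunkSize = 16): Generator<string> {
  for (let i = 0; i < cached.length; i += chunkSize) {
    const delta = cached.slice(i, i + chunkSize);
    // JSON-encode the chunk so newlines inside it cannot break SSE framing.
    yield `data: ${JSON.stringify({ delta })}\n\n`;
  }
  yield "data: [DONE]\n\n"; // assumed end-of-stream sentinel
}

// Usage: collect the frames a client would receive for one cached response.
const frames = Array.from(replayAsSse("Hello, cached world!", 8));
```

Each frame ends with a blank line, which is what terminates an event in the SSE format. A client that concatenates the `delta` values gets the cached response back unchanged.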
Cloud vs Local
| Mode | Backend |
|---|---|
| Cloud | Upstash Redis (REST) |
| Local | In-memory Map with TTL sweep |
Local mode has no persistence for exact-cache — restarting the container clears it. Use semantic-cache if you need cache survival across restarts (semantic-cache lives in SQLite).
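For illustration, an in-memory Map with a TTL sweep might look like the sketch below. The `TtlMap` class and its methods are hypothetical; the actual local backend is not shown in this document. The clock is injectable so expiry can be exercised without waiting, and in a long-running process `sweep()` would typically be called from a timer.

```typescript
class TtlMap<V> {
  private entries = new Map<string, { value: V; expiresAt: number }>();
  private now: () => number;

  // `now` returns milliseconds; it defaults to the wall clock.
  constructor(now: () => number = Date.now) {
    this.now = now;
  }

  set(key: string, value: V, ttlSeconds: number): void {
    this.entries.set(key, { value, expiresAt: this.now() + ttlSeconds * 1000 });
  }

  get(key: string): V | undefined {
    const entry = this.entries.get(key);
    if (entry === undefined) return undefined;
    if (entry.expiresAt <= this.now()) {
      this.entries.delete(key); // lazily drop an expired entry on read
      return undefined;
    }
    return entry.value;
  }

  // Remove every expired entry and report how many were removed.
  sweep(): number {
    const t = this.now();
    let removed = 0;
    for (const [key, entry] of this.entries) {
      if (entry.expiresAt <= t) {
        this.entries.delete(key);
        removed += 1;
      }
    }
    return removed;
  }

  get size(): number {
    return this.entries.size;
  }
}

// Usage with a fake clock: "a" lives 10 s, "b" lives 60 s.
let clock = 0;
const cache = new TtlMap<string>(() => clock);
cache.set("a", "1", 10);
cache.set("b", "2", 60);
clock = 30_000; // 30 s later: "a" has expired, "b" has not
const swept = cache.sweep();
```

Everything lives in process memory, which is why a restart clears it.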