exact-cache

Category: cache · Cloud + Local · Status: v1 — production

Hashes the canonical request. If a response is already cached under that hash, it is returned without calling the provider. A hit costs only a key-value lookup, and because the match is exact there is no risk of false positives.

What it does

The dumbest possible cache. No embeddings, no similarity, no surprises.

When to use it

✅ Dev/test loops (same request fired repeatedly)
✅ Idempotent endpoints (look up X, format X)
✅ Workloads that hit the same prompt verbatim
✅ As the first module in a pipeline that also has semantic-cache: exact hits short-circuit before the more expensive embedding lookup.

❌ Anything with a timestamp, request ID, or per-call random nonce in the prompt

Configuration

exact-cache:
  ttlSeconds: 3600
  scope: 'user'          # 'user' | 'global'
  includeSystem: true    # include system prompt in the hash?
  excludeFields: []      # canonical fields to exclude from the hash
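To make the options concrete, here is a minimal sketch of how they might be typed and resolved. The names `ExactCacheConfig`, `resolveConfig`, and `scopedKey` are hypothetical and do not come from the module source. The sketch assumes that the values shown above double as defaults and that `scope: 'user'` namespaces cache keys per user while `'global'` shares them across users.

```typescript
// Hypothetical typing of the documented options; not taken from exact-cache.ts.
type ExactCacheScope = "user" | "global";

interface ExactCacheConfig {
  ttlSeconds: number;
  scope: ExactCacheScope;
  includeSystem: boolean;
  excludeFields: string[];
}

// Assumption: the values in the configuration block are also the defaults.
const DEFAULTS: ExactCacheConfig = {
  ttlSeconds: 3600,
  scope: "user",
  includeSystem: true,
  excludeFields: [],
};

// Merge user-supplied options over the assumed defaults and validate the TTL.
function resolveConfig(partial: Partial<ExactCacheConfig> = {}): ExactCacheConfig {
  const merged: ExactCacheConfig = { ...DEFAULTS, ...partial };
  if (!Number.isFinite(merged.ttlSeconds) || merged.ttlSeconds <= 0) {
    throw new RangeError("ttlSeconds must be a positive number");
  }
  return merged;
}

// Assumption: 'user' scope prefixes the key with the user id; 'global' does not.
function scopedKey(hash: string, cfg: ExactCacheConfig, userId: string): string {
  return cfg.scope === "user"
    ? `exact:user:${userId}:${hash}`
    : `exact:global:${hash}`;
}
```

Under these assumptions, two users sending the same request get separate entries with `scope: 'user'` and share one entry with `scope: 'global'`.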

Metrics emitted

  • cache.exact.hit (boolean)
  • cache.exact.lookup_ms (number)

Examples

Default — fast cache for true repeats:

exact-cache:
  ttlSeconds: 3600

Long TTL for stable workloads:

exact-cache:
  ttlSeconds: 86400

Composed with semantic (recommended order):

pipeline:
  - exact-cache:
      ttlSeconds: 3600
  - semantic-cache:
      similarity: 0.88
      ttlSeconds: 3600

The exact cache catches identical requests in microseconds. The semantic cache catches paraphrases in the low-millisecond range.
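The ordering matters because the first module that produces a response ends the lookup. The following is a rough sketch of that short-circuit, not the actual pipeline runner: the `CacheModule` shape and `runPre` are hypothetical, and the "semantic" module is a placeholder string comparison standing in for a real embedding lookup.

```typescript
// Hypothetical module shape: pre() returns a cached response, or undefined on a miss.
interface CacheModule {
  name: string;
  pre(prompt: string): string | undefined;
}

// Run pre hooks in pipeline order and stop at the first hit.
function runPre(
  pipeline: CacheModule[],
  prompt: string,
): { response?: string; consulted: string[] } {
  const consulted: string[] = [];
  for (const mod of pipeline) {
    consulted.push(mod.name);
    const response = mod.pre(prompt);
    if (response !== undefined) return { response, consulted };
  }
  return { consulted }; // every module missed; the provider would be called next
}

// Stand-ins for illustration only.
const exactStore = new Map<string, string>([["What is 2+2?", "4"]]);
const exact: CacheModule = {
  name: "exact-cache",
  pre: (p) => exactStore.get(p),
};
const semantic: CacheModule = {
  name: "semantic-cache",
  // Placeholder for an embedding similarity search.
  pre: (p) => (p.toLowerCase() === "what is two plus two?" ? "4" : undefined),
};
```

With `[exact, semantic]`, a verbatim repeat of "What is 2+2?" is answered after consulting only `exact-cache`; the paraphrase falls through to the second module.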

How it works

  1. Pre hook:

    • Canonicalize the request (sort keys, normalize whitespace, drop metadata).
    • SHA256 the canonical JSON → cache key.
    • kv.get() the key.
    • If hit: return cached response, skip provider call.
    • If miss: continue.
  2. Post hook (on miss):

    • kv.set() the response under the canonical key with ttlSeconds TTL.
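The steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in `exact-cache.ts`: `canonicalize`, `cacheKey`, `handle`, and the synchronous `Kv` interface are hypothetical names, whitespace normalization is assumed to mean trimming and collapsing runs of whitespace inside strings, and the set of dropped metadata fields is passed in because the real list is not documented here.

```typescript
import { createHash } from "node:crypto";

type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

// Sort object keys, normalize whitespace in strings, and drop named fields.
function canonicalize(value: Json, drop: ReadonlySet<string>): Json {
  if (typeof value === "string") return value.trim().replace(/\s+/g, " ");
  if (Array.isArray(value)) return value.map((v) => canonicalize(v, drop));
  if (value !== null && typeof value === "object") {
    const out: { [key: string]: Json } = {};
    for (const key of Object.keys(value).sort()) {
      const child = value[key];
      if (child !== undefined && !drop.has(key)) out[key] = canonicalize(child, drop);
    }
    return out;
  }
  return value; // numbers, booleans, null pass through unchanged
}

// SHA-256 over the canonical JSON gives the cache key (hex encoded).
function cacheKey(request: Json, drop: ReadonlySet<string> = new Set<string>()): string {
  const canonical = JSON.stringify(canonicalize(request, drop));
  return createHash("sha256").update(canonical).digest("hex");
}

// Hypothetical synchronous stand-in for the kv store used by the hooks.
interface Kv {
  get(key: string): string | undefined;
  set(key: string, value: string, ttlSeconds: number): void;
}

function handle(
  request: Json,
  kv: Kv,
  ttlSeconds: number,
  callProvider: (request: Json) => string,
): { response: string; hit: boolean } {
  const key = cacheKey(request);
  const cached = kv.get(key); // pre hook: look up the canonical key
  if (cached !== undefined) return { response: cached, hit: true };
  const response = callProvider(request); // miss: continue to the provider
  kv.set(key, response, ttlSeconds); // post hook: store under the same key
  return { response, hit: false };
}
```

Because keys are sorted before hashing, `{ b: 1, a: "x" }` and `{ a: "x", b: 1 }` map to the same cache key, while any change to a field that is not dropped produces a different one.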

Streaming

Exact hits on streaming requests replay as a synthetic SSE stream — same shape as a real one.
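One way to picture the replay is a generator that slices the cached text into chunks and frames each as a Server-Sent Events `data:` line. This is only a sketch: `replayAsSse`, the `{ delta }` payload, the chunk size, and the closing `[DONE]` sentinel are assumptions chosen for illustration, and the real module presumably mirrors whatever event shape the upstream provider emits.

```typescript
// Replay a cached completion as a sequence of SSE-framed events.
// Payload shape and the [DONE] terminator are illustrative assumptions.
function* replayAsSse(text: string, chunkSize: number = 16): Generator<string> {
  for (let i = 0; i < text.length; i += chunkSize) {
    const delta = text.slice(i, i + chunkSize);
    // JSON.stringify escapes newlines, so each event stays on one data line.
    yield `data: ${JSON.stringify({ delta })}\n\n`;
  }
  yield "data: [DONE]\n\n";
}
```

A client that concatenates the `delta` fields in order recovers the original cached text, so it can consume a replayed hit with the same parsing code it uses for a live stream.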

Cloud vs Local

Mode    Backend
Cloud   Upstash Redis (REST)
Local   In-memory Map with TTL sweep

Local mode has no persistence for exact-cache — restarting the container clears it. Use semantic-cache if you need cache survival across restarts (semantic-cache lives in SQLite).
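For intuition about the Local backend, an in-memory Map with a TTL sweep could look roughly like this. `TtlMap` is a hypothetical name and the sketch makes its own choices: expiry is checked lazily on `get`, `sweep()` is called explicitly rather than on a timer, and the clock is injectable so behavior is easy to test. Nothing here survives a process restart, which matches the persistence caveat above.

```typescript
// Minimal in-memory store with per-entry expiry; illustrative only.
class TtlMap<V> {
  private entries = new Map<string, { value: V; expiresAt: number }>();
  private now: () => number;

  constructor(now: () => number = Date.now) {
    this.now = now; // injectable clock, in milliseconds
  }

  set(key: string, value: V, ttlSeconds: number): void {
    this.entries.set(key, { value, expiresAt: this.now() + ttlSeconds * 1000 });
  }

  get(key: string): V | undefined {
    const entry = this.entries.get(key);
    if (entry === undefined) return undefined;
    if (entry.expiresAt <= this.now()) {
      this.entries.delete(key); // lazily evict an expired entry
      return undefined;
    }
    return entry.value;
  }

  // Remove every expired entry and report how many were dropped.
  sweep(): number {
    const t = this.now();
    let removed = 0;
    for (const [key, entry] of this.entries) {
      if (entry.expiresAt <= t) {
        this.entries.delete(key);
        removed++;
      }
    }
    return removed;
  }

  get size(): number {
    return this.entries.size;
  }
}
```

In a long-running process, `sweep()` would typically be scheduled with `setInterval` so that entries nobody reads again still get freed.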

Source

packages/modules-core/src/exact-cache.ts
