prompt-optimizer
Category: optimization · Cloud + Local · Status: v1.0 — production
Anthropic’s prefix cache rewards stable prefixes: identical leading bytes get cached for ~5 minutes. The cheapest input tokens are the ones never re-billed. This module makes sure you’re getting that win automatically.
What it does
- Stable tool ordering — sorts your `tools` array by name. Two requests with the same toolset (in any order) now serialize identically at the prefix.
- Auto cache markers — stamps `cache_control: { type: 'ephemeral' }` on the last system block. Anthropic caches everything up to and including a marker, so this gives you the maximal cacheable region.
- Optional assistant breakpoint — for multi-turn sessions, can mark the last assistant message as a mid-conversation cache breakpoint.
For non-Anthropic providers this is a no-op — the markers ride on the canonical `SystemBlock`, and providers that don’t recognize them simply ignore them.
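The two core transforms can be sketched as follows. This is a simplified illustration, not the module's real API: the `ToolDef` shape and the function names are assumptions; only the `cache_control: { type: 'ephemeral' }` marker format comes from the docs above.

```typescript
// Illustrative shapes — the real module's types may differ.
type ToolDef = { name: string; description?: string };
type SystemBlock = {
  type: 'text';
  text: string;
  cache_control?: { type: 'ephemeral' };
};

// Stable tool ordering: sort by name so two requests with the same
// toolset (in any order) serialize to identical leading bytes.
function sortTools(tools: ToolDef[]): ToolDef[] {
  return [...tools].sort((a, b) => a.name.localeCompare(b.name));
}

// Auto cache marker: stamp the last system block. Anthropic caches
// everything up to and including the marked block, so marking the last
// one yields the maximal cacheable region.
function stampCacheMarker(system: SystemBlock[]): SystemBlock[] {
  if (system.length === 0) return system;
  const out = system.map((block) => ({ ...block }));
  out[out.length - 1].cache_control = { type: 'ephemeral' };
  return out;
}
```

Both helpers copy their input rather than mutating it, so the original request object stays untouched if a later middleware needs the pre-optimized form.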
When to use it
- Any workload with a stable system prompt + variable user message (almost everything).
- High-traffic apps where prompt cache savings compound.
- Whenever you’re using tools — the tool list is usually stable; sorting it helps.
Configuration
```yaml
prompt-optimizer:
  cacheControl: 'auto'        # 'auto' | 'manual' | 'off'
  separateStatic: true        # sort tools alphabetically — default true
  minCacheableChars: 1024     # don't bother below this — default 1024
  markAssistantHistory: false # also mark last assistant turn — default false
```
Metrics emitted
- `prompt-optimizer.mode` — the configured mode.
- `prompt-optimizer.applied` — whether the request was actually mutated.
- `prompt-optimizer.assistant_breakpoint_index` — set when `markAssistantHistory` is on and a breakpoint was placed.
Compatibility
Place `prompt-optimizer` AFTER `ipc` — `ipc` may reshape the prompt, then `prompt-optimizer` lays cache markers on the resulting stable prefix.
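Assuming the middleware chain is declared as an ordered list (the `pipeline` key here is illustrative; your config's key name may differ), the ordering constraint looks like:

```yaml
pipeline:
  - ipc               # may reshape the prompt first
  - prompt-optimizer  # then lays cache markers on the stable prefix
```

Reversing the two would let `ipc` edit bytes inside the marked prefix, breaking the exact-prefix match the cache depends on.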