prompt-optimizer

Category: optimization · Cloud + Local · Status: v1.0 — production

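Anthropic’s prefix cache rewards stable prefixes: an identical leading prefix stays cached for about 5 minutes by default, and later requests that hit the cache pay a reduced rate for those tokens. The cheapest input tokens are the ones you never pay full price for twice. This module makes sure you get that win automatically.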

What it does

  • Stable tool ordering — sorts your tools array by name. Two requests with the same toolset (in any order) now serialize identically at the prefix.
  • Auto cache markers — stamps cache_control: { type: 'ephemeral' } on the last system block. Anthropic caches everything up to and including a marker, so this gives you the maximal cacheable region.
  • Optional assistant breakpoint — for multi-turn sessions, can mark the last assistant message as a mid-conversation cache breakpoint.
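The first two behaviors above can be sketched roughly as follows. This is a minimal illustration, not the module's actual code: the `Tool`, `SystemBlock`, and `CanonicalRequest` shapes and the `optimizePrefix` function are hypothetical names assumed for the example.

```typescript
// Hypothetical request shapes for illustration; not the module's real types.
interface Tool {
  name: string;
  description?: string;
}

interface SystemBlock {
  type: 'text';
  text: string;
  cache_control?: { type: 'ephemeral' };
}

interface CanonicalRequest {
  tools?: Tool[];
  system: SystemBlock[];
}

// Returns a copy of the request with tools sorted by name and an
// ephemeral cache marker on the last system block. The input is not mutated.
function optimizePrefix(req: CanonicalRequest): CanonicalRequest {
  const lastIndex = req.system.length - 1;

  // Stamp the marker only on the final system block.
  const system = req.system.map((block, i): SystemBlock =>
    i === lastIndex ? { ...block, cache_control: { type: 'ephemeral' } } : block,
  );

  const out: CanonicalRequest = { ...req, system };

  if (req.tools) {
    // Plain code-unit comparison keeps the order independent of locale settings.
    out.tools = [...req.tools].sort((a, b) =>
      a.name < b.name ? -1 : a.name > b.name ? 1 : 0,
    );
  }
  return out;
}
```

With this sketch, two requests that list the same tools in a different order produce the same serialized prefix, which is the property the cache depends on.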

For non-Anthropic providers this is a no-op — the markers ride on the canonical SystemBlock and providers that don’t recognize them just ignore them.

When to use it

  • Any workload with a stable system prompt + variable user message (almost everything).
  • High-traffic apps where prompt cache savings compound.
  • Whenever you’re using tools — the tool list is usually stable; sorting it helps.

Configuration

prompt-optimizer:
  cacheControl: 'auto'          # 'auto' | 'manual' | 'off'
  separateStatic: true          # sort tools alphabetically; default true
  minCacheableChars: 1024       # don't bother below this; default 1024
  markAssistantHistory: false   # also mark last assistant turn; default false
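To make the options concrete, here is a rough sketch of how they might be resolved and how the `minCacheableChars` threshold could gate marker placement. Several details are assumptions made for illustration: the `PromptOptimizerConfig` type, `resolveConfig`, and `shouldPlaceMarker` are hypothetical names, `'auto'` is assumed to be the default mode, and the threshold is assumed to be measured against the combined length of the system text. The page does not say how `'manual'` and `'off'` differ, so the sketch only treats `'auto'` as placing markers.

```typescript
type CacheControlMode = 'auto' | 'manual' | 'off';

// Hypothetical config shape mirroring the documented keys.
interface PromptOptimizerConfig {
  cacheControl: CacheControlMode;
  separateStatic: boolean;
  minCacheableChars: number;
  markAssistantHistory: boolean;
}

// Defaults taken from the comments in the configuration above.
// The 'auto' default for cacheControl is an assumption.
const DEFAULTS: PromptOptimizerConfig = {
  cacheControl: 'auto',
  separateStatic: true,
  minCacheableChars: 1024,
  markAssistantHistory: false,
};

// Merge user-supplied options over the defaults.
function resolveConfig(
  partial: Partial<PromptOptimizerConfig> = {},
): PromptOptimizerConfig {
  return { ...DEFAULTS, ...partial };
}

// Decide whether an automatic marker is worth placing.
// Assumption: the threshold applies to the total system text length.
function shouldPlaceMarker(
  cfg: PromptOptimizerConfig,
  systemTexts: string[],
): boolean {
  if (cfg.cacheControl !== 'auto') return false;
  const totalChars = systemTexts.reduce((sum, text) => sum + text.length, 0);
  return totalChars >= cfg.minCacheableChars;
}
```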

Metrics emitted

  • prompt-optimizer.mode — the configured mode.
  • prompt-optimizer.applied — whether the request was actually mutated.
  • prompt-optimizer.assistant_breakpoint_index — set when markAssistantHistory is on and a breakpoint was placed.
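As a rough illustration of how these metrics relate to the assistant breakpoint, the sketch below finds the last assistant message and records its index. The `Message` and `OptimizerMetrics` shapes and the `buildMetrics` helper are hypothetical; only the three metric names come from the list above.

```typescript
// Hypothetical message shape for illustration.
interface Message {
  role: 'user' | 'assistant';
  content: string;
}

// Metric keys as documented; the value types are assumptions.
interface OptimizerMetrics {
  'prompt-optimizer.mode': string;
  'prompt-optimizer.applied': boolean;
  'prompt-optimizer.assistant_breakpoint_index'?: number;
}

// Index of the last assistant message, or -1 if there is none.
function lastAssistantIndex(messages: Message[]): number {
  for (let i = messages.length - 1; i >= 0; i--) {
    if (messages[i]?.role === 'assistant') return i;
  }
  return -1;
}

// Build the metrics record for one request.
// `mutated` says whether earlier steps (sorting, system marker) changed the request.
function buildMetrics(
  mode: string,
  mutated: boolean,
  markAssistantHistory: boolean,
  messages: Message[],
): OptimizerMetrics {
  const metrics: OptimizerMetrics = {
    'prompt-optimizer.mode': mode,
    'prompt-optimizer.applied': mutated,
  };
  if (markAssistantHistory) {
    const index = lastAssistantIndex(messages);
    if (index >= 0) {
      // Placing a breakpoint counts as mutating the request.
      metrics['prompt-optimizer.assistant_breakpoint_index'] = index;
      metrics['prompt-optimizer.applied'] = true;
    }
  }
  return metrics;
}
```

In this sketch the breakpoint index is left unset when the option is off or when the conversation has no assistant turn yet, matching the "set when ... a breakpoint was placed" wording above.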

Compatibility

Place prompt-optimizer AFTER ipc: ipc may reshape the prompt, then prompt-optimizer lays cache markers on the resulting stable prefix.

Source

packages/modules-core/src/prompt-optimizer.ts
