
mcp-optimizer

Category: optimization · Cloud + Local · Status: v1 — production

Embeds each tool’s name and description, scores each tool against the current user message, and drops the tools that score below the relevance threshold. The kept set is stable per session, so provider prompt caches don’t shatter.

What it does

A typical MCP-using agent ships 30–80 tools on every request, even though the user is only asking about one. mcp-optimizer keeps the relevant subset.

  • Before: 67k tokens of tool definitions
  • After: 8k tokens of tool definitions
  • Result: 88% reduction in MCP overhead

When to use it

✅ Any agent that uses MCP tools
✅ Claude Code, Cline, Continue.dev, custom MCP clients
✅ Multi-server MCP setups (filesystem + GitHub + Slack + …)

❌ Apps that don’t use tools
❌ Apps where every tool is always relevant (rare)

Configuration

```yaml
mcp-optimizer:
  relevanceThreshold: 0.6         # 0.0 - 1.0; lower = keep more tools
  preserveTools: []               # always keep these (by name)
  embeddingModel: 'voyage-3-lite' # or 'text-embedding-3-small'
  minTools: 1                     # never drop below this many
```

Metrics emitted

  • mcp-optimizer.tools.before (number)
  • mcp-optimizer.tools.after (number)
  • mcp-optimizer.tokens.saved (number)
  • mcp-optimizer.duration_ms (number)

Examples

Conservative — keep most tools, drop only obvious mismatches:

```yaml
mcp-optimizer:
  relevanceThreshold: 0.4
```

Aggressive — strip hard, save tokens:

```yaml
mcp-optimizer:
  relevanceThreshold: 0.75
  preserveTools: ['read_file', 'write_file', 'bash']
```

Coding-assistant tuned — keep file ops always, drop the rest by relevance:

```yaml
mcp-optimizer:
  relevanceThreshold: 0.6
  preserveTools:
    - read_file
    - write_file
    - bash
    - grep
    - glob
```

How it works

  1. Pre hook: extract the text of the user’s last message.
  2. Embed the user message.
  3. For each tool in `request.tools`: embed `${name}: ${description}`. (Cached by tool hash — the first request pays the cost; subsequent ones hit the cache.)
  4. Compute cosine similarity. Keep tools above `relevanceThreshold`, plus any listed in `preserveTools`. Always keep at least `minTools`.
  5. Replace `request.tools` with the kept subset.
  6. Attach `metadata['mcp-optimizer.tokens.saved']` for downstream visibility.
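The selection in steps 4–5 can be sketched as follows. This is a minimal illustration, not the module’s actual code; the `Tool` shape and function names are assumptions:

```typescript
interface Tool {
  name: string;
  description: string;
}

// Cosine similarity between two equal-length embedding vectors.
function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb) || 1);
}

// Keep tools scoring at or above the threshold, plus preserved names;
// never return fewer than minTools (backfilled by descending score).
function selectTools(
  tools: Tool[],
  scores: number[], // one similarity score per tool, same order
  threshold: number,
  preserve: string[],
  minTools: number,
): Tool[] {
  const ranked = tools
    .map((tool, i) => ({ tool, score: scores[i] }))
    .sort((a, b) => b.score - a.score);
  const kept = ranked.filter(
    ({ tool, score }) => score >= threshold || preserve.includes(tool.name),
  );
  // Backfill with the highest-scoring dropped tools if below minTools.
  for (const r of ranked) {
    if (kept.length >= minTools) break;
    if (!kept.some((k) => k.tool.name === r.tool.name)) kept.push(r);
  }
  return kept.map((k) => k.tool);
}
```

With a threshold of 0.6, a tool scoring 0.9 is kept on merit, a preserved tool is kept regardless of score, and everything else is dropped unless `minTools` forces a backfill.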

The kept subset is stable per session: the input tool set, threshold, and user message are hashed into a session key, so the same cache prefix lands at Anthropic on the next turn. This is critical for prompt cache hit rates.
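A session key of that shape could be derived like this. The exact hashing scheme isn’t documented here, so treat this as an illustrative sketch:

```typescript
import { createHash } from "node:crypto";

// Derive a stable session key from the tool set, threshold, and user
// message, so repeated requests with the same inputs map to the same
// kept subset (and therefore the same provider cache prefix).
function sessionKey(
  toolNames: string[],
  threshold: number,
  userMessage: string,
): string {
  return createHash("sha256")
    .update([...toolNames].sort().join("\n")) // order-insensitive tool set
    .update(`\n${threshold}\n`)
    .update(userMessage)
    .digest("hex")
    .slice(0, 16);
}
```

Sorting the tool names before hashing means servers that register tools in a different order on reconnect still land on the same key.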

Cloud vs Local

| Mode  | Embedding backend |
| ----- | ----------------- |
| Cloud | Voyage AI (configurable) — falls back to a deterministic stub if no key |
| Local | Same — uses your `VOYAGE_API_KEY` if set, else the stub |

The stub hashes character trigrams with SHA-256 and projects them into a 256-dimensional vector. Quality is poor but stable, so caches behave deterministically in tests.
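A deterministic stub of that design might look like the following. This is a sketch of the stated idea (trigram hashing projected to 256 dimensions), not the module’s actual implementation:

```typescript
import { createHash } from "node:crypto";

const DIM = 256;

// Deterministic pseudo-embedding: hash each character trigram with
// SHA-256, scatter a signed unit into a fixed 256-dim vector, then
// L2-normalize so cosine similarity behaves sensibly.
function stubEmbed(text: string): number[] {
  const vec = new Array<number>(DIM).fill(0);
  for (let i = 0; i + 3 <= text.length; i++) {
    const digest = createHash("sha256").update(text.slice(i, i + 3)).digest();
    const idx = digest.readUInt16BE(0) % DIM; // bucket from first 2 bytes
    const sign = digest[2] & 1 ? 1 : -1;      // sign from the 3rd byte
    vec[idx] += sign;
  }
  const norm = Math.sqrt(vec.reduce((s, v) => s + v * v, 0)) || 1;
  return vec.map((v) => v / norm);
}
```

Because the output depends only on the input text, the same tool description always embeds to the same vector, which is what keeps cache behavior reproducible in tests.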

Source

packages/modules-core/src/mcp-optimizer.ts
