# Using prxy.monster with the Anthropic SDK
prxy.monster speaks the native Anthropic API at https://api.prxy.monster — same `/v1/messages` shape, same headers, same streaming format. Set one env var and your existing `@anthropic-ai/sdk` (or `anthropic` Python) code routes through prxy.monster unchanged.
## Install

```shell
npm install @anthropic-ai/sdk
# or
pip install anthropic
```

## Configure

```shell
export ANTHROPIC_BASE_URL=https://api.prxy.monster
export ANTHROPIC_API_KEY=prxy_live_xxxxxxxxxxxxxxxxxxxxxxxx
```

The Anthropic SDK uses `ANTHROPIC_BASE_URL` with no `/v1` suffix. The OpenAI SDK uses `OPENAI_BASE_URL` with a `/v1` suffix. Don't mix them.
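If you want to see what the SDKs actually send, here is a stdlib-only Python sketch of the request shape the proxy mirrors: a POST to `/v1/messages` with `x-api-key` and `anthropic-version` headers (these follow Anthropic's documented API; the request is constructed but not sent here).

```python
import json
import os
import urllib.request

# Falls back to the prxy.monster endpoint when the env var is unset.
base = os.environ.get("ANTHROPIC_BASE_URL", "https://api.prxy.monster")

body = {
    "model": "claude-sonnet-4-6",
    "max_tokens": 256,
    "messages": [{"role": "user", "content": "hi"}],
}

req = urllib.request.Request(
    f"{base}/v1/messages",
    data=json.dumps(body).encode(),
    headers={
        "x-api-key": os.environ.get("ANTHROPIC_API_KEY", ""),
        "anthropic-version": "2023-06-01",
        "content-type": "application/json",
    },
    method="POST",
)

# urllib.request.urlopen(req) would actually send it; omitted here.
print(req.full_url)  # e.g. https://api.prxy.monster/v1/messages
```

This is only to illustrate the wire format; in practice let the SDK read the env vars and build the request for you.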
## Code change

None.

### Node

```js
// Before AND after — no diff
import Anthropic from '@anthropic-ai/sdk';

const client = new Anthropic(); // reads ANTHROPIC_BASE_URL and ANTHROPIC_API_KEY from env

const msg = await client.messages.create({
  model: 'claude-sonnet-4-6',
  max_tokens: 256,
  messages: [{ role: 'user', content: 'hi' }],
});
```

If you prefer to be explicit:

```js
const client = new Anthropic({
  baseURL: 'https://api.prxy.monster',
  apiKey: process.env.ANTHROPIC_API_KEY,
});
```

## Verify
```shell
curl https://api.prxy.monster/health
```

Or:

```shell
prxy doctor
```

## What you get
- Infinite context — long Claude conversations stop hitting the context wall.
- MCP optimization — if you wire MCP servers through Claude tool-use, irrelevant tool defs get pruned per request (~67k → ~8k tokens on a real Claude Code session).
- Semantic cache — similar prompts return cached responses.
- Pattern memory — "the issue was X, the fix is Y"-style insights get logged and re-injected.
## Recommended pipeline

For long-running Claude apps or coding agents:

```shell
PRXY_PIPE=mcp-optimizer,semantic-cache,patterns,ipc
```

For Claude Code specifically, add the compaction bridge:

```shell
PRXY_PIPE=mcp-optimizer,compaction-bridge,rehydrator,semantic-cache,patterns,ipc
```

## Streaming
Anthropic's per-block SSE format passes through unchanged — `message_start`, `content_block_start`, `content_block_delta`, `message_stop` all flow exactly as Anthropic emits them.
```js
const stream = client.messages.stream({
  model: 'claude-sonnet-4-6',
  max_tokens: 1024,
  messages: [{ role: 'user', content: 'write a haiku' }],
});

for await (const event of stream) {
  if (event.type === 'content_block_delta' && event.delta.type === 'text_delta') {
    process.stdout.write(event.delta.text);
  }
}
```

Cache hits are replayed as synthetic Anthropic SSE, so your stream parser doesn't notice the difference.
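To make the pass-through concrete, here is a toy Python parser run over a hand-written sample of Anthropic-style SSE lines (the sample is illustrative, not captured from a live response; it uses the event names listed above). It concatenates the `text_delta` payloads the same way the Node loop does.

```python
import json

# Hand-written sample of Anthropic-style SSE lines (illustrative only).
sample = (
    'event: message_start\n'
    'data: {"type": "message_start"}\n\n'
    'event: content_block_delta\n'
    'data: {"type": "content_block_delta", "delta": {"type": "text_delta", "text": "Hel"}}\n\n'
    'event: content_block_delta\n'
    'data: {"type": "content_block_delta", "delta": {"type": "text_delta", "text": "lo"}}\n\n'
    'event: message_stop\n'
    'data: {"type": "message_stop"}\n\n'
)

def collect_text(sse: str) -> str:
    """Concatenate text_delta payloads from raw SSE lines."""
    out = []
    for line in sse.splitlines():
        if not line.startswith("data: "):
            continue
        event = json.loads(line[len("data: "):])
        if (event.get("type") == "content_block_delta"
                and event.get("delta", {}).get("type") == "text_delta"):
            out.append(event["delta"]["text"])
    return "".join(out)

print(collect_text(sample))  # Hello
```

Because cache hits replay the same synthetic event stream, a parser like this (or the SDK's own stream handling) sees no difference between a live response and a cached one.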
## Common issues

- Vertex / Bedrock SDK variants — `@anthropic-ai/vertex-sdk` and `@anthropic-ai/bedrock-sdk` use cloud-provider auth, so they don't talk to prxy.monster directly. Use the standard `@anthropic-ai/sdk` and route Bedrock models via `model: "bedrock/<model-id>"` (see Bedrock provider).
- Beta headers (`anthropic-beta: prompt-caching-2024-07-31`, etc.) — pass-through. We don't strip them.
- Tool use / `input_schema` — pass-through. `mcp-optimizer` reads tool defs to score relevance; tool execution still happens client-side as usual.
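Putting the first two bullets together, a request body and header set might look like the following Python sketch (`<model-id>` is a placeholder exactly as in the list above; the beta header value is the one quoted there):

```python
import json

# Payload routing a Bedrock-hosted model through the standard messages shape.
# "bedrock/<model-id>" is a placeholder; substitute a real Bedrock model ID.
body = {
    "model": "bedrock/<model-id>",
    "max_tokens": 256,
    "messages": [{"role": "user", "content": "hi"}],
}

# Beta headers pass through untouched, so keep sending them as usual.
headers = {
    "x-api-key": "prxy_live_xxxxxxxxxxxxxxxxxxxxxxxx",
    "anthropic-version": "2023-06-01",
    "anthropic-beta": "prompt-caching-2024-07-31",
    "content-type": "application/json",
}

payload = json.dumps(body)
```

With the standard SDK, the equivalent is simply passing `model: "bedrock/<model-id>"` to `client.messages.create(...)` and setting any beta headers the way you normally would.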
## Full example

Plain Node script: github.com/Ekkos-Technologies-Inc/prxy-examples/tree/main/examples/anthropic-quickstart