
Using prxy.monster with the Anthropic SDK

prxy.monster speaks the native Anthropic API at https://api.prxy.monster — same /v1/messages shape, same headers, same streaming format. Set one env var, your existing @anthropic-ai/sdk (or anthropic Python) code routes through prxy.monster unchanged.

Install

```bash
npm install @anthropic-ai/sdk
# or: pip install anthropic
```

Configure

```bash
export ANTHROPIC_BASE_URL=https://api.prxy.monster
export ANTHROPIC_API_KEY=prxy_live_xxxxxxxxxxxxxxxxxxxxxxxx
```

The Anthropic SDK uses ANTHROPIC_BASE_URL (no /v1 suffix); the OpenAI SDK uses OPENAI_BASE_URL with a /v1 suffix. Don't mix them up.
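For contrast, here are both configurations side by side. Only the ANTHROPIC_* pair is needed for this guide; the OPENAI_* lines are shown purely to illustrate the /v1 difference:

```bash
# Anthropic SDK — base URL WITHOUT a /v1 suffix
export ANTHROPIC_BASE_URL=https://api.prxy.monster
export ANTHROPIC_API_KEY=prxy_live_xxxxxxxxxxxxxxxxxxxxxxxx

# OpenAI SDK — base URL WITH a /v1 suffix (contrast only)
export OPENAI_BASE_URL=https://api.prxy.monster/v1
export OPENAI_API_KEY=prxy_live_xxxxxxxxxxxxxxxxxxxxxxxx
```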

Code change

None.

```js
// Before AND after — no diff
import Anthropic from '@anthropic-ai/sdk';

const client = new Anthropic(); // reads env

const msg = await client.messages.create({
  model: 'claude-sonnet-4-6',
  max_tokens: 256,
  messages: [{ role: 'user', content: 'hi' }],
});
```

If you prefer explicit:

```js
const client = new Anthropic({
  baseURL: 'https://api.prxy.monster',
  apiKey: process.env.ANTHROPIC_API_KEY,
});
```

Verify

```bash
curl https://api.prxy.monster/health
```

Or:

```bash
prxy doctor
```

What you get

  • Infinite context — long Claude conversations stop hitting the context wall.
  • MCP optimization — if you wire MCP servers through Claude tool-use, irrelevant tool defs get pruned per request (~67k → ~8k tokens on a real Claude Code session).
  • Semantic cache — similar prompts return cached responses.
  • Pattern memory — "the issue was X, the fix is Y"-style insights get logged and re-injected.

For long-running Claude apps or coding agents:

```bash
PRXY_PIPE=mcp-optimizer,semantic-cache,patterns,ipc
```

For Claude Code specifically, add the compaction bridge:

```bash
PRXY_PIPE=mcp-optimizer,compaction-bridge,rehydrator,semantic-cache,patterns,ipc
```
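To make the semantic-cache stage concrete, here is a toy sketch of the underlying idea — not prxy.monster's implementation — using bag-of-words cosine similarity to decide whether a new prompt is similar enough to a cached one. The threshold and similarity function are assumptions for illustration:

```js
// Toy semantic cache: returns a cached response when a new prompt is
// sufficiently similar to a previously seen one. Illustrative only.
function bagOfWords(text) {
  const counts = new Map();
  for (const w of text.toLowerCase().split(/\W+/).filter(Boolean)) {
    counts.set(w, (counts.get(w) ?? 0) + 1);
  }
  return counts;
}

function cosine(a, b) {
  let dot = 0, na = 0, nb = 0;
  for (const [w, c] of a) { na += c * c; if (b.has(w)) dot += c * b.get(w); }
  for (const c of b.values()) nb += c * c;
  return na && nb ? dot / Math.sqrt(na * nb) : 0;
}

class ToySemanticCache {
  constructor(threshold = 0.8) {
    this.threshold = threshold;
    this.entries = []; // { vec, response }
  }
  get(prompt) {
    const vec = bagOfWords(prompt);
    for (const e of this.entries) {
      if (cosine(vec, e.vec) >= this.threshold) return e.response;
    }
    return null;
  }
  set(prompt, response) {
    this.entries.push({ vec: bagOfWords(prompt), response });
  }
}
```

A production cache would compare embedding vectors rather than word counts, but the lookup shape is the same: vectorize, compare, return on a hit.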

Streaming

Anthropic’s per-block SSE format passes through unchanged — message_start, content_block_start, content_block_delta, message_stop all flow exactly as Anthropic emits them.

```js
const stream = client.messages.stream({
  model: 'claude-sonnet-4-6',
  max_tokens: 1024,
  messages: [{ role: 'user', content: 'write a haiku' }],
});

for await (const event of stream) {
  if (event.type === 'content_block_delta' && event.delta.type === 'text_delta') {
    process.stdout.write(event.delta.text);
  }
}
```

Cache hits are replayed as synthetic Anthropic SSE so your stream parser doesn’t notice the difference.
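Because live responses and cache replays share the same event shapes, you can write one consumer against the event types and never branch on where the response came from. A minimal sketch (pure function, no network — the sample events mimic Anthropic's SSE payload shapes):

```js
// Accumulate assistant text from a sequence of Anthropic-style stream
// events. Works identically for live responses and synthetic cache replays.
function collectText(events) {
  let text = '';
  for (const event of events) {
    if (event.type === 'content_block_delta' && event.delta?.type === 'text_delta') {
      text += event.delta.text;
    }
  }
  return text;
}

// Sample event sequence shaped like the Anthropic streaming format:
const sampleEvents = [
  { type: 'message_start' },
  { type: 'content_block_start', index: 0 },
  { type: 'content_block_delta', index: 0, delta: { type: 'text_delta', text: 'hel' } },
  { type: 'content_block_delta', index: 0, delta: { type: 'text_delta', text: 'lo' } },
  { type: 'message_stop' },
];
```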

Common issues

  • Vertex / Bedrock SDK variants — @anthropic-ai/vertex-sdk and @anthropic-ai/bedrock-sdk use cloud-provider auth, so they don't talk to prxy.monster directly. Use the standard @anthropic-ai/sdk and route Bedrock models via model: "bedrock/<model-id>" (see Bedrock provider).
  • Beta headers (anthropic-beta: prompt-caching-2024-07-31 etc.) — pass-through. We don’t strip them.
  • Tool use / input_schema — pass-through. mcp-optimizer reads tool defs to score relevance; tool execution still happens client-side as usual.
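To illustrate the kind of pruning mcp-optimizer performs, here is a naive keyword-overlap stand-in — not the real scoring algorithm, which is internal to prxy.monster — that keeps only the tool definitions relevant to the current user message:

```js
// Naive stand-in for relevance-based tool pruning: score each tool def
// (name + description; real defs also carry input_schema) by keyword
// overlap with the user message, then keep the top-k matches.
function pruneTools(tools, userMessage, k = 3) {
  const words = new Set(userMessage.toLowerCase().split(/\W+/).filter(Boolean));
  const scored = tools.map((tool) => {
    const haystack = `${tool.name} ${tool.description ?? ''}`.toLowerCase();
    let score = 0;
    for (const w of words) if (haystack.includes(w)) score += 1;
    return { tool, score };
  });
  return scored
    .sort((a, b) => b.score - a.score)
    .slice(0, k)
    .filter((s) => s.score > 0)
    .map((s) => s.tool);
}
```

The real optimizer works on full MCP tool definitions per request; the point of the sketch is only the shape of the transformation — many tool defs in, few relevant tool defs out.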

Full example

Plain Node script: github.com/Ekkos-Technologies-Inc/prxy-examples/tree/main/examples/anthropic-quickstart 
