
Using prxy.monster with the Anthropic SDK

prxy.monster speaks the native Anthropic API at https://api.prxy.monster — same /v1/messages shape, same headers, same streaming format. Set one env var, your existing @anthropic-ai/sdk (or anthropic Python) code routes through prxy.monster unchanged.

Install

```bash
npm install @anthropic-ai/sdk
# or: pip install anthropic
```

Configure

```bash
export ANTHROPIC_BASE_URL=https://api.prxy.monster
export ANTHROPIC_API_KEY=prxy_live_xxxxxxxxxxxxxxxxxxxxxxxx
```

The Anthropic SDK uses ANTHROPIC_BASE_URL (no /v1 suffix); the OpenAI SDK uses OPENAI_BASE_URL with a /v1 suffix. Don't mix them up.
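For contrast, here are both configurations side by side. Only the ANTHROPIC_* pair is needed for this guide; the OPENAI_* lines are shown purely to illustrate the /v1 difference:

```bash
# Anthropic SDK — base URL WITHOUT a /v1 suffix
export ANTHROPIC_BASE_URL=https://api.prxy.monster
export ANTHROPIC_API_KEY=prxy_live_xxxxxxxxxxxxxxxxxxxxxxxx

# OpenAI SDK — base URL WITH a /v1 suffix (contrast only)
export OPENAI_BASE_URL=https://api.prxy.monster/v1
export OPENAI_API_KEY=prxy_live_xxxxxxxxxxxxxxxxxxxxxxxx
```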

Code change

None.

```js
// Before AND after — no diff
import Anthropic from '@anthropic-ai/sdk';

const client = new Anthropic(); // reads env

const msg = await client.messages.create({
  model: 'claude-sonnet-4-6',
  max_tokens: 256,
  messages: [{ role: 'user', content: 'hi' }],
});
```

If you prefer explicit:

```js
const client = new Anthropic({
  baseURL: 'https://api.prxy.monster',
  apiKey: process.env.ANTHROPIC_API_KEY,
});
```

Verify

```bash
curl https://api.prxy.monster/health
```

Or:

```bash
prxy doctor
```

What you get

  • Infinite context — long Claude conversations stop hitting the context wall.
  • MCP optimization — if you wire MCP servers through Claude tool-use, irrelevant tool defs get pruned per request (~67k → ~8k tokens on a real Claude Code session).
  • Semantic cache — similar prompts return cached responses.
  • Pattern memory — "the issue was X, the fix is Y"-style insights get logged and re-injected.

For long-running Claude apps or coding agents:

```bash
PRXY_PIPE=mcp-optimizer,semantic-cache,patterns,ipc
```

For Claude Code specifically, add the compaction bridge:

```bash
PRXY_PIPE=mcp-optimizer,compaction-bridge,rehydrator,semantic-cache,patterns,ipc
```
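To make the semantic-cache stage concrete, here is a toy sketch of the underlying idea — not prxy.monster's implementation — using bag-of-words cosine similarity to decide whether a new prompt is similar enough to a cached one. The threshold and similarity function are assumptions for illustration:

```js
// Toy semantic cache: returns a cached response when a new prompt is
// sufficiently similar to a previously seen one. Illustrative only.
function bagOfWords(text) {
  const counts = new Map();
  for (const w of text.toLowerCase().split(/\W+/).filter(Boolean)) {
    counts.set(w, (counts.get(w) ?? 0) + 1);
  }
  return counts;
}

function cosine(a, b) {
  let dot = 0, na = 0, nb = 0;
  for (const [w, c] of a) { na += c * c; if (b.has(w)) dot += c * b.get(w); }
  for (const c of b.values()) nb += c * c;
  return na && nb ? dot / Math.sqrt(na * nb) : 0;
}

class ToySemanticCache {
  constructor(threshold = 0.8) {
    this.threshold = threshold;
    this.entries = []; // { vec, response }
  }
  get(prompt) {
    const vec = bagOfWords(prompt);
    for (const e of this.entries) {
      if (cosine(vec, e.vec) >= this.threshold) return e.response;
    }
    return null;
  }
  set(prompt, response) {
    this.entries.push({ vec: bagOfWords(prompt), response });
  }
}
```

A production cache would compare embedding vectors rather than word counts, but the lookup shape is the same: vectorize, compare, return on a hit.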

Streaming

Anthropic’s per-block SSE format passes through unchanged — message_start, content_block_start, content_block_delta, message_stop all flow exactly as Anthropic emits them.

```js
const stream = client.messages.stream({
  model: 'claude-sonnet-4-6',
  max_tokens: 1024,
  messages: [{ role: 'user', content: 'write a haiku' }],
});

for await (const event of stream) {
  if (event.type === 'content_block_delta' && event.delta.type === 'text_delta') {
    process.stdout.write(event.delta.text);
  }
}
```

Cache hits are replayed as synthetic Anthropic SSE so your stream parser doesn’t notice the difference.
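Because live responses and cache replays share the same event shapes, you can write one consumer against the event types and never branch on where the response came from. A minimal sketch (pure function, no network — the sample events mimic Anthropic's SSE payload shapes):

```js
// Accumulate assistant text from a sequence of Anthropic-style stream
// events. Works identically for live responses and synthetic cache replays.
function collectText(events) {
  let text = '';
  for (const event of events) {
    if (event.type === 'content_block_delta' && event.delta?.type === 'text_delta') {
      text += event.delta.text;
    }
  }
  return text;
}

// Sample event sequence shaped like the Anthropic streaming format:
const sampleEvents = [
  { type: 'message_start' },
  { type: 'content_block_start', index: 0 },
  { type: 'content_block_delta', index: 0, delta: { type: 'text_delta', text: 'hel' } },
  { type: 'content_block_delta', index: 0, delta: { type: 'text_delta', text: 'lo' } },
  { type: 'message_stop' },
];
```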

Common issues

  • Vertex / Bedrock SDK variants — @anthropic-ai/vertex-sdk and @anthropic-ai/bedrock-sdk use cloud-provider auth, so they don't talk to prxy.monster directly. Use the standard @anthropic-ai/sdk and route Bedrock models via model: "bedrock/<model-id>" (see Bedrock provider).
  • Beta headers (anthropic-beta: prompt-caching-2024-07-31 etc.) — pass-through. We don’t strip them.
  • Tool use / input_schema — pass-through. mcp-optimizer reads tool defs to score relevance; tool execution still happens client-side as usual.
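To illustrate the kind of pruning mcp-optimizer performs, here is a naive keyword-overlap stand-in — not the real scoring algorithm, which is internal to prxy.monster — that keeps only the tool definitions relevant to the current user message:

```js
// Naive stand-in for relevance-based tool pruning: score each tool def
// (name + description; real defs also carry input_schema) by keyword
// overlap with the user message, then keep the top-k matches.
function pruneTools(tools, userMessage, k = 3) {
  const words = new Set(userMessage.toLowerCase().split(/\W+/).filter(Boolean));
  const scored = tools.map((tool) => {
    const haystack = `${tool.name} ${tool.description ?? ''}`.toLowerCase();
    let score = 0;
    for (const w of words) if (haystack.includes(w)) score += 1;
    return { tool, score };
  });
  return scored
    .sort((a, b) => b.score - a.score)
    .slice(0, k)
    .filter((s) => s.score > 0)
    .map((s) => s.tool);
}
```

The real optimizer works on full MCP tool definitions per request; the point of the sketch is only the shape of the transformation — many tool defs in, few relevant tool defs out.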

Full example

Plain Node script: github.com/Ekkos-Technologies-Inc/prxy-examples/tree/main/examples/anthropic-quickstart 
