`POST /v1/messages`

Anthropic-compatible Messages API. It mirrors the Anthropic Messages API request and response shapes one-to-one, so anything that works against api.anthropic.com works against this endpoint once you swap the base URL.

Endpoint


POST https://api.prxy.monster/v1/messages

Local mode: http://localhost:3099/v1/messages.

Headers

Header	Required	Notes
`Authorization: Bearer <key>`	yes	Your `prxy_live_xxx` key. Local mode: any string.
`Content-Type: application/json`	yes
`x-prxy-pipe`	no	Overrides the pipeline for this request only. Comma-separated list of module names.
`anthropic-version`	no	Forwarded to provider.
`anthropic-beta`	no	Forwarded to provider.
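
As a rough illustration of the endpoint and header rules above, the sketch below picks the URL and builds the required headers for cloud or local mode. `request_target`, `CLOUD_URL`, and `LOCAL_URL` are hypothetical names introduced for this example and are not part of any SDK; the version string in the usage lines is only a sample value.

```python
from typing import Dict, Optional, Tuple

# URLs taken from the Endpoint section; the constant names are illustrative.
CLOUD_URL = "https://api.prxy.monster/v1/messages"
LOCAL_URL = "http://localhost:3099/v1/messages"


def request_target(
    api_key: str,
    local: bool = False,
    anthropic_version: Optional[str] = None,
) -> Tuple[str, Dict[str, str]]:
    """Return (url, headers) for one request (hypothetical helper)."""
    url = LOCAL_URL if local else CLOUD_URL
    headers = {
        "Authorization": f"Bearer {api_key}",  # required; any string in local mode
        "Content-Type": "application/json",    # required
    }
    if anthropic_version is not None:
        # Optional header; per the table it is forwarded to the provider.
        headers["anthropic-version"] = anthropic_version
    return url, headers


cloud_url, cloud_headers = request_target("prxy_live_xxx", anthropic_version="2023-06-01")
local_url, local_headers = request_target("dev", local=True)
```

The helper only prepares values; any HTTP client can send them.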

Request body

The body follows the full Anthropic Messages schema. The gateway validates it with Zod and forwards it to the provider.


{
  model: string;                  // e.g. "claude-sonnet-4-6"
  max_tokens: number;             // positive integer
  messages: Array<{
    role: 'user' | 'assistant';
    content: string | ContentBlock[];
  }>;
  system?: string | SystemBlock[];
  temperature?: number;           // 0–2
  top_p?: number;                 // 0–1
  top_k?: number;                 // positive integer
  stop_sequences?: string[];
  stream?: boolean;
  tools?: Tool[];
  metadata?: Record<string, unknown>;
}

`ContentBlock` is one of `text`, `image`, `tool_use`, or `tool_result`. `SystemBlock` supports `cache_control` for prompt caching.
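
To catch obvious mistakes before a request leaves the client, the constraints listed in the schema above can be checked locally. The following is a minimal sketch under that assumption: `validate_request` is a hypothetical helper, it covers only the scalar fields and message roles, and it is not the gateway's actual Zod schema.

```python
from typing import List


def validate_request(body: dict) -> List[str]:
    """Return a list of problems; an empty list means no issue was found."""
    problems: List[str] = []

    if not isinstance(body.get("model"), str):
        problems.append("model must be a string")

    max_tokens = body.get("max_tokens")
    # bool is a subclass of int in Python, so it is excluded explicitly.
    if isinstance(max_tokens, bool) or not isinstance(max_tokens, int) or max_tokens <= 0:
        problems.append("max_tokens must be a positive integer")

    messages = body.get("messages")
    if not isinstance(messages, list):
        problems.append("messages must be an array")
    else:
        for i, message in enumerate(messages):
            if not isinstance(message, dict):
                problems.append(f"messages[{i}] must be an object")
                continue
            if message.get("role") not in ("user", "assistant"):
                problems.append(f"messages[{i}].role must be 'user' or 'assistant'")
            if not isinstance(message.get("content"), (str, list)):
                problems.append(f"messages[{i}].content must be a string or an array")

    # Ranges as documented in the schema comments above.
    for name, low, high in (("temperature", 0, 2), ("top_p", 0, 1)):
        value = body.get(name)
        if value is None:
            continue
        if isinstance(value, bool) or not isinstance(value, (int, float)) or not low <= value <= high:
            problems.append(f"{name} must be a number between {low} and {high}")

    top_k = body.get("top_k")
    if top_k is not None and (isinstance(top_k, bool) or not isinstance(top_k, int) or top_k <= 0):
        problems.append("top_k must be a positive integer")

    return problems


ok = validate_request({
    "model": "claude-sonnet-4-6",
    "max_tokens": 1024,
    "messages": [{"role": "user", "content": "hi"}],
})
bad = validate_request({
    "model": "claude-sonnet-4-6",
    "max_tokens": 0,
    "messages": [{"role": "system", "content": "x"}],
    "temperature": 3,
})
```

Here `ok` is empty and `bad` lists one problem each for `max_tokens`, the message role, and `temperature`.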

Response (non-streaming)


{
  "id": "msg_01abc...",
  "type": "message",
  "role": "assistant",
  "model": "claude-sonnet-4-6",
  "content": [
    { "type": "text", "text": "Hello!" }
  ],
  "stop_reason": "end_turn",
  "stop_sequence": null,
  "usage": {
    "input_tokens": 12,
    "output_tokens": 5,
    "cache_read_input_tokens": 0,
    "cache_creation_input_tokens": 0
  }
}
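
A short sketch of reading this response follows. It assumes the body has already been received as a JSON string; `extract_text` is a hypothetical helper that joins all `text` blocks and skips other block types such as `tool_use`.

```python
import json

# The sample response from above, as it might arrive over the wire.
raw = """
{
  "id": "msg_01abc...",
  "type": "message",
  "role": "assistant",
  "model": "claude-sonnet-4-6",
  "content": [{ "type": "text", "text": "Hello!" }],
  "stop_reason": "end_turn",
  "stop_sequence": null,
  "usage": {
    "input_tokens": 12,
    "output_tokens": 5,
    "cache_read_input_tokens": 0,
    "cache_creation_input_tokens": 0
  }
}
"""


def extract_text(message: dict) -> str:
    """Concatenate the text of every `text` content block."""
    return "".join(
        block["text"]
        for block in message.get("content", [])
        if block.get("type") == "text"
    )


message = json.loads(raw)
text = extract_text(message)
usage = message["usage"]
tokens = usage["input_tokens"] + usage["output_tokens"]
```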

Response (streaming, `stream: true`)

Server-Sent Events with Anthropic’s event-typed envelope:


event: message_start
data: { "type": "message_start", "message": { ... } }

event: content_block_start
data: { "type": "content_block_start", "index": 0, "content_block": { ... } }

event: content_block_delta
data: { "type": "content_block_delta", "index": 0, "delta": { "type": "text_delta", "text": "Hello" } }

event: content_block_stop
data: { "type": "content_block_stop", "index": 0 }

event: message_delta
data: { "type": "message_delta", "delta": { "stop_reason": "end_turn" } }

event: message_stop
data: { "type": "message_stop" }

Cache hits on streaming requests are replayed as a synthetic stream in this exact format. Your client cannot distinguish a cache replay from a real stream: the events and field shapes are the same.
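
The event sequence above can be consumed with a small line-based parser. The sketch below is a simplified reader that handles only `event:` and `data:` fields, not the full Server-Sent Events format; `iter_sse_events` and `collect_text` are hypothetical names, and the sample stream fills the elided `{ ... }` payloads with placeholder values for illustration only.

```python
import json
from typing import Iterable, Iterator, List, Optional, Tuple


def iter_sse_events(lines: Iterable[str]) -> Iterator[Tuple[Optional[str], dict]]:
    """Yield (event_name, parsed_data) pairs from an iterable of SSE lines."""
    event: Optional[str] = None
    data_parts: List[str] = []
    for line in lines:
        line = line.rstrip("\r\n")
        if line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data_parts.append(line[len("data:"):].strip())
        elif line == "" and data_parts:
            # A blank line terminates one event.
            yield event, json.loads("\n".join(data_parts))
            event, data_parts = None, []
    if data_parts:  # the stream ended without a trailing blank line
        yield event, json.loads("\n".join(data_parts))


def collect_text(lines: Iterable[str]) -> Tuple[str, Optional[str]]:
    """Accumulate text deltas and capture the final stop_reason."""
    chunks: List[str] = []
    stop_reason: Optional[str] = None
    for event, data in iter_sse_events(lines):
        if event == "content_block_delta" and data["delta"].get("type") == "text_delta":
            chunks.append(data["delta"]["text"])
        elif event == "message_delta":
            stop_reason = data["delta"].get("stop_reason")
    return "".join(chunks), stop_reason


sample = """event: message_start
data: {"type": "message_start", "message": {"role": "assistant"}}

event: content_block_start
data: {"type": "content_block_start", "index": 0, "content_block": {"type": "text", "text": ""}}

event: content_block_delta
data: {"type": "content_block_delta", "index": 0, "delta": {"type": "text_delta", "text": "Hello"}}

event: content_block_delta
data: {"type": "content_block_delta", "index": 0, "delta": {"type": "text_delta", "text": "!"}}

event: content_block_stop
data: {"type": "content_block_stop", "index": 0}

event: message_delta
data: {"type": "message_delta", "delta": {"stop_reason": "end_turn"}}

event: message_stop
data: {"type": "message_stop"}
"""

text, stop_reason = collect_text(sample.splitlines())
```

Because a cache replay uses the same events, the same reader should work whether or not the response came from cache.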

Examples

curl


curl https://api.prxy.monster/v1/messages \
  -H "Authorization: Bearer prxy_live_xxx" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-sonnet-4-6",
    "max_tokens": 1024,
    "messages": [
      { "role": "user", "content": "Write a haiku about distributed systems." }
    ]
  }'

Anthropic SDK (Node)


import Anthropic from '@anthropic-ai/sdk';
 
const client = new Anthropic({
  baseURL: 'https://api.prxy.monster',
  apiKey: process.env.PRXY_KEY,
});
 
const msg = await client.messages.create({
  model: 'claude-sonnet-4-6',
  max_tokens: 1024,
  messages: [{ role: 'user', content: 'Write a haiku about distributed systems.' }],
});
 
console.log(msg.content[0]);

Anthropic SDK (Python)


import os

from anthropic import Anthropic
 
client = Anthropic(
    base_url="https://api.prxy.monster",
    api_key=os.environ["PRXY_KEY"],
)
 
msg = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Write a haiku about distributed systems."}],
)
 
print(msg.content[0])

Per-request pipeline override


curl https://api.prxy.monster/v1/messages \
  -H "Authorization: Bearer prxy_live_xxx" \
  -H "x-prxy-pipe: exact-cache,patterns" \
  -H "Content-Type: application/json" \
  -d '{ ... }'

The override applies to this single call only, which makes it useful for A/B testing pipelines.
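
One way to set up such an A/B comparison is to prepare the same body once per variant and change only the `x-prxy-pipe` header. The sketch below does just the preparation step and sends nothing; `ab_requests` and the variant labels are hypothetical, and treating `None` as "send no override header" is an assumption of this example.

```python
import json
from typing import Dict, List, Optional, Sequence


def ab_requests(
    api_key: str,
    body: dict,
    variants: Dict[str, Optional[Sequence[str]]],
) -> List[dict]:
    """Prepare one request per variant; None means no pipeline override."""
    payload = json.dumps(body).encode("utf-8")  # identical bytes for every variant
    prepared = []
    for label, modules in variants.items():
        headers = {
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        }
        if modules is not None:
            # Comma-separated module names, applied to this request only.
            headers["x-prxy-pipe"] = ",".join(modules)
        prepared.append({"label": label, "headers": headers, "data": payload})
    return prepared


body = {
    "model": "claude-sonnet-4-6",
    "max_tokens": 1024,
    "messages": [{"role": "user", "content": "Write a haiku about distributed systems."}],
}
prepared = ab_requests(
    "prxy_live_xxx",
    body,
    {"default": None, "cache-and-patterns": ["exact-cache", "patterns"]},
)
```

Each entry can then be sent with any HTTP client and the responses compared per label.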

Error codes

Status	`error.type`	When
400	`invalid_request`	Body fails schema validation.
401	`authentication_error`	Missing, malformed, or revoked key.
402	`payment_required`	Out of credits (cloud paid tier).
403	`permission_error`	Action not allowed in current mode (e.g. local-mode billing).
404	`not_found`	Endpoint or resource not found.
429	`cost_limit_per_request` / `cost_limit_per_day` / `cost_limit_per_month`	`cost-guard` enforced a cap.
429	`rate_limit_error`	Per-key request rate limit exceeded.
502	`upstream_error`	Provider returned 5xx after all retries.
503	`service_unavailable`	Storage backend or critical dependency down.

Error body shape:


{
  "type": "error",
  "error": {
    "type": "cost_limit_per_day",
    "message": "Daily cost cap exceeded",
    "limit": 5.00,
    "spent": 4.87,
    "estimated": 0.21,
    "resets_at": "2026-04-28T00:00:00.000Z"
  }
}
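
A client can use the status code and `error.type` together to decide what to do next. The sketch below is one possible policy, not behavior prescribed by the gateway: `should_retry` and `seconds_until_reset` are hypothetical helpers, and the choice to retry on rate limits and 502/503 but not on cost caps is an assumption. It also assumes `resets_at` may be absent on some error types.

```python
from datetime import datetime, timezone
from typing import Optional

COST_LIMIT_TYPES = {
    "cost_limit_per_request",
    "cost_limit_per_day",
    "cost_limit_per_month",
}


def seconds_until_reset(error: dict, now: datetime) -> Optional[float]:
    """Seconds until `resets_at`, or None when the field is absent."""
    resets_at = error.get("resets_at")
    if resets_at is None:
        return None
    # Older Python versions reject a trailing "Z", so normalise it first.
    reset_time = datetime.fromisoformat(resets_at.replace("Z", "+00:00"))
    return max(0.0, (reset_time - now).total_seconds())


def should_retry(status: int, body: dict) -> bool:
    """Assumed policy: retry transient failures, never a cost cap."""
    error_type = body.get("error", {}).get("type")
    if status == 429 and error_type in COST_LIMIT_TYPES:
        return False  # assumed: an immediate retry would hit the same cap
    return status in (429, 502, 503)


body = {
    "type": "error",
    "error": {
        "type": "cost_limit_per_day",
        "message": "Daily cost cap exceeded",
        "limit": 5.00,
        "spent": 4.87,
        "estimated": 0.21,
        "resets_at": "2026-04-28T00:00:00.000Z",
    },
}
now = datetime(2026, 4, 27, 23, 0, 0, tzinfo=timezone.utc)
wait = seconds_until_reset(body["error"], now)
retry = should_retry(429, body)
```

With the fixed `now` shown here, `wait` is one hour and `retry` is false.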
