
POST /v1/messages

An Anthropic-compatible Messages API. It mirrors the shape of the Anthropic Messages API one-to-one: a request that works against api.anthropic.com works against this endpoint once you swap the base URL and authenticate with your prxy key.

Endpoint

POST https://api.prxy.monster/v1/messages

Local mode: http://localhost:3099/v1/messages.

Headers

| Header | Required | Notes |
| --- | --- | --- |
| Authorization: Bearer <key> | yes | Your prxy_live_xxx key. Local mode: any string. |
| Content-Type: application/json | yes | |
| x-prxy-pipe | no | Override pipeline for this request only. Comma-separated list of module names. |
| anthropic-version | no | Forwarded to provider. |
| anthropic-beta | no | Forwarded to provider. |
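
As a rough illustration of the headers above, the following sketch assembles the URL and fetch options for a call. The helper `buildMessagesRequest` and its option names are hypothetical, not part of any published SDK; only the URLs and header names come from this page.

```typescript
// Sketch only: builds the URL and fetch options for POST /v1/messages.
// `buildMessagesRequest` and `BuildOptions` are hypothetical names.

const CLOUD_BASE = "https://api.prxy.monster";
const LOCAL_BASE = "http://localhost:3099";

interface BuildOptions {
  apiKey: string;            // prxy_live_xxx in cloud mode; any string in local mode
  local?: boolean;           // true targets the local-mode URL
  anthropicVersion?: string; // forwarded to the provider when set
  anthropicBeta?: string;    // forwarded to the provider when set
}

interface BuiltRequest {
  url: string;
  init: { method: "POST"; headers: Record<string, string>; body: string };
}

function buildMessagesRequest(body: unknown, opts: BuildOptions): BuiltRequest {
  const headers: Record<string, string> = {
    "Authorization": `Bearer ${opts.apiKey}`,
    "Content-Type": "application/json",
  };
  // Optional headers are only sent when the caller supplies them.
  if (opts.anthropicVersion) headers["anthropic-version"] = opts.anthropicVersion;
  if (opts.anthropicBeta) headers["anthropic-beta"] = opts.anthropicBeta;

  const base = opts.local ? LOCAL_BASE : CLOUD_BASE;
  return {
    url: `${base}/v1/messages`,
    init: { method: "POST", headers, body: JSON.stringify(body) },
  };
}
```

The result can be passed straight to `fetch(url, init)`. Keeping request construction separate from the network call makes it easy to inspect what would be sent.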

Request body

The body follows the full Anthropic Messages schema. The gateway validates it with Zod and forwards it to the provider.

    {
      model: string;                 // e.g. "claude-sonnet-4-6"
      max_tokens: number;            // positive integer
      messages: Array<{
        role: 'user' | 'assistant';
        content: string | ContentBlock[];
      }>;
      system?: string | SystemBlock[];
      temperature?: number;          // 0–2
      top_p?: number;                // 0–1
      top_k?: number;                // positive integer
      stop_sequences?: string[];
      stream?: boolean;
      tools?: Tool[];
      metadata?: Record<string, unknown>;
    }

ContentBlock is one of: text, image, tool_use, tool_result. SystemBlock supports cache_control for prompt caching.
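
To catch obvious mistakes before a request leaves the client, the numeric bounds noted in the schema can be checked locally. The sketch below is a client-side pre-check only; it is not the gateway's Zod schema, and `checkRequest` is a hypothetical helper that covers only the bounds documented above.

```typescript
// Sketch only: client-side pre-check of the documented numeric bounds.
// NOT the gateway's Zod schema; `checkRequest` is a hypothetical helper.

interface MessagesRequest {
  model: string;
  max_tokens: number;
  messages: Array<{ role: "user" | "assistant"; content: string | object[] }>;
  temperature?: number;
  top_p?: number;
  top_k?: number;
}

function checkRequest(req: MessagesRequest): string[] {
  const problems: string[] = [];
  const isPositiveInt = (n: number): boolean => Number.isInteger(n) && n > 0;
  const inRange = (n: number, lo: number, hi: number): boolean => n >= lo && n <= hi;

  if (!isPositiveInt(req.max_tokens)) {
    problems.push("max_tokens must be a positive integer");
  }
  if (req.temperature !== undefined && !inRange(req.temperature, 0, 2)) {
    problems.push("temperature must be between 0 and 2");
  }
  if (req.top_p !== undefined && !inRange(req.top_p, 0, 1)) {
    problems.push("top_p must be between 0 and 1");
  }
  if (req.top_k !== undefined && !isPositiveInt(req.top_k)) {
    problems.push("top_k must be a positive integer");
  }
  return problems; // an empty array means no documented bound was violated
}
```

An empty result does not guarantee the gateway will accept the body, since the server-side schema may enforce rules this sketch does not know about.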

Response (non-streaming)

    {
      "id": "msg_01abc...",
      "type": "message",
      "role": "assistant",
      "model": "claude-sonnet-4-6",
      "content": [
        { "type": "text", "text": "Hello!" }
      ],
      "stop_reason": "end_turn",
      "stop_sequence": null,
      "usage": {
        "input_tokens": 12,
        "output_tokens": 5,
        "cache_read_input_tokens": 0,
        "cache_creation_input_tokens": 0
      }
    }
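
Because `content` is an array of typed blocks rather than a single string, clients usually need a small step to recover the plain text. A minimal sketch, assuming only the fields shown in the response above (`responseText` is a hypothetical helper):

```typescript
// Sketch only: pulls the plain text out of a non-streaming response.
// `responseText` is a hypothetical helper; field names follow the sample response.

interface ContentBlock {
  type: string;
  text?: string; // present on "text" blocks
}

interface MessageResponse {
  content: ContentBlock[];
  stop_reason: string | null;
}

function responseText(msg: MessageResponse): string {
  return msg.content
    // Non-text blocks (for example tool_use) carry no `text` and are skipped.
    .filter((block) => block.type === "text" && typeof block.text === "string")
    .map((block) => block.text as string)
    .join("");
}
```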

Response (streaming, stream: true)

Server-Sent Events with Anthropic’s event-typed envelope:

    event: message_start
    data: { "type": "message_start", "message": { ... } }

    event: content_block_start
    data: { "type": "content_block_start", "index": 0, "content_block": { ... } }

    event: content_block_delta
    data: { "type": "content_block_delta", "index": 0, "delta": { "type": "text_delta", "text": "Hello" } }

    event: content_block_stop
    data: { "type": "content_block_stop", "index": 0 }

    event: message_delta
    data: { "type": "message_delta", "delta": { "stop_reason": "end_turn" } }

    event: message_stop
    data: { "type": "message_stop" }

Cache hits on streaming requests are replayed as a synthetic stream in this exact format. Your client cannot distinguish a cache replay from a real stream: the events and field shapes are the same.
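
To show how the event sequence above turns back into text, the sketch below parses an SSE payload that has already been read into a string and concatenates the `text_delta` fragments. `parseSse` and `collectStreamText` are hypothetical helpers; a real client would feed chunks incrementally and buffer partial events, which this sketch leaves out.

```typescript
// Sketch only: parses a complete SSE payload held in a string.
// `parseSse` and `collectStreamText` are hypothetical helpers.

interface SseEvent {
  event: string;
  data: string;
}

function parseSse(raw: string): SseEvent[] {
  const events: SseEvent[] = [];
  // Events are separated by a blank line; normalize CRLF first.
  for (const chunk of raw.replace(/\r\n/g, "\n").split("\n\n")) {
    let event = "message"; // SSE default when no event field is given
    const dataLines: string[] = [];
    for (const line of chunk.split("\n")) {
      if (line.startsWith("event:")) event = line.slice(6).trim();
      else if (line.startsWith("data:")) dataLines.push(line.slice(5).trim());
    }
    if (dataLines.length > 0) events.push({ event, data: dataLines.join("\n") });
  }
  return events;
}

function collectStreamText(raw: string): { text: string; stopReason: string | null } {
  let text = "";
  let stopReason: string | null = null;
  for (const { event, data } of parseSse(raw)) {
    const payload = JSON.parse(data);
    if (event === "content_block_delta" && payload.delta?.type === "text_delta") {
      text += payload.delta.text; // append each text fragment in arrival order
    } else if (event === "message_delta" && payload.delta?.stop_reason) {
      stopReason = payload.delta.stop_reason;
    }
  }
  return { text, stopReason };
}
```

Since cache replays use the same events and field shapes, the same parsing code should handle both cases without a special branch.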

Examples

    curl https://api.prxy.monster/v1/messages \
      -H "Authorization: Bearer prxy_live_xxx" \
      -H "Content-Type: application/json" \
      -d '{
        "model": "claude-sonnet-4-6",
        "max_tokens": 1024,
        "messages": [
          { "role": "user", "content": "Write a haiku about distributed systems." }
        ]
      }'

Per-request pipeline override

    curl https://api.prxy.monster/v1/messages \
      -H "Authorization: Bearer prxy_live_xxx" \
      -H "x-prxy-pipe: exact-cache,patterns" \
      -H "Content-Type: application/json" \
      -d '{ ... }'

The override applies to this single call only. Useful for A/B testing.
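
One way an A/B test over pipelines might look from the client side is sketched below. `pipeHeader` and `chooseVariant` are hypothetical helpers, the round-robin assignment is just one possible strategy, and rejecting an empty module list is an assumption (this page does not say how the gateway treats one). The module names are the ones used in the curl example above.

```typescript
// Sketch only: `pipeHeader` and `chooseVariant` are hypothetical helpers.
// Rejecting an empty list is an assumption about sensible client behavior.

function pipeHeader(modules: string[]): Record<string, string> {
  const names = modules.map((m) => m.trim());
  if (names.length === 0 || names.some((m) => m === "" || m.includes(","))) {
    throw new Error("x-prxy-pipe needs at least one module name, none containing commas");
  }
  return { "x-prxy-pipe": names.join(",") };
}

// Round-robin assignment: request n gets variant n mod (number of variants).
function chooseVariant<T>(requestNumber: number, variants: T[]): T {
  return variants[requestNumber % variants.length];
}

const pipelineVariants = [["exact-cache"], ["exact-cache", "patterns"]];

const abHeaders: Record<string, string> = {
  "Authorization": "Bearer prxy_live_xxx",
  "Content-Type": "application/json",
  ...pipeHeader(chooseVariant(1, pipelineVariants)),
};
```

Because the override is scoped to a single call, no state needs to be reset between variants; each request simply carries its own header.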

Error codes

| Status | error.type | When |
| --- | --- | --- |
| 400 | invalid_request | Body fails schema validation. |
| 401 | authentication_error | Missing / malformed / revoked key. |
| 402 | payment_required | Out of credits (cloud paid tier). |
| 403 | permission_error | Action not allowed in current mode (e.g. local-mode billing). |
| 404 | not_found | Endpoint or resource not found. |
| 429 | cost_limit_per_request / cost_limit_per_day / cost_limit_per_month | cost-guard enforced a cap. |
| 429 | rate_limit_error | Per-key request rate limit exceeded. |
| 502 | upstream_error | Provider returned 5xx after all retries. |
| 503 | service_unavailable | Storage backend or critical dependency down. |

Error body shape:

    {
      "type": "error",
      "error": {
        "type": "cost_limit_per_day",
        "message": "Daily cost cap exceeded",
        "limit": 5.00,
        "spent": 4.87,
        "estimated": 0.21,
        "resets_at": "2026-04-28T00:00:00.000Z"
      }
    }
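
Since 429 covers both rate limits and cost caps, a client has to look at `error.type` before deciding what to do. The sketch below shows one possible policy; `planAfterError` is a hypothetical helper, and the choices of which errors to retry are assumptions rather than documented guidance. It also assumes `limit`, `spent`, `estimated`, and `resets_at` may be absent on errors other than the cost-limit example shown.

```typescript
// Sketch only: one possible client policy. `planAfterError` is hypothetical,
// and which errors count as retryable is an assumption, not documented behavior.

interface GatewayError {
  type: "error";
  error: {
    type: string;
    message: string;
    limit?: number;      // assumed optional outside cost-limit errors
    spent?: number;
    estimated?: number;
    resets_at?: string;  // ISO 8601 timestamp, as in the sample body
  };
}

type Action =
  | { kind: "retry" }
  | { kind: "wait_until"; at: Date }
  | { kind: "fail"; reason: string };

function planAfterError(status: number, body: GatewayError): Action {
  const { type, message, resets_at } = body.error;

  if (status === 429 && type.startsWith("cost_limit_")) {
    // A per-request cap will not clear with time, so only wait on day/month caps.
    if (type !== "cost_limit_per_request" && resets_at !== undefined) {
      const at = new Date(resets_at);
      if (!Number.isNaN(at.getTime())) return { kind: "wait_until", at };
    }
    return { kind: "fail", reason: message };
  }
  // Assumed transient: rate limiting, upstream failure, dependency outage.
  if (status === 429 || status === 502 || status === 503) return { kind: "retry" };
  return { kind: "fail", reason: message };
}
```

A production client would likely add backoff and a retry budget around the `retry` case; those are left out here to keep the branching visible.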