POST /v1/messages
Anthropic-compatible Messages API. Mirrors the Anthropic Messages API shape one-to-one. Anything that works against api.anthropic.com works against this endpoint when you swap the base URL.
Endpoint
POST https://api.prxy.monster/v1/messages
Local mode: http://localhost:3099/v1/messages.
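Since only the base URL differs between cloud and local mode, a client can pick it in one place. The following is a minimal sketch; the `Mode` type and the `messagesUrl` helper are hypothetical names for illustration, and only the two URLs come from this page.

```typescript
// Hypothetical helper: only the two base URLs below come from this page.
type Mode = "cloud" | "local";

const BASE_URLS: Record<Mode, string> = {
  cloud: "https://api.prxy.monster",
  local: "http://localhost:3099",
};

// Returns the full Messages endpoint URL for the chosen mode.
function messagesUrl(mode: Mode): string {
  return `${BASE_URLS[mode]}/v1/messages`;
}
```

Keeping the mode switch in a single function means the rest of the client code does not need to know which deployment it is talking to.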
Headers
| Header | Required | Notes |
|---|---|---|
| Authorization: Bearer <key> | yes | Your prxy_live_xxx key. Local mode: any string. |
| Content-Type: application/json | yes | |
| x-prxy-pipe | no | Override pipeline for this request only. Comma list of module names. |
| anthropic-version | no | Forwarded to provider. |
| anthropic-beta | no | Forwarded to provider. |
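As a rough illustration of the header table, the sketch below assembles the two required headers and adds the optional ones only when they are set. `HeaderOptions` and `buildHeaders` are hypothetical names, not part of the gateway; the header names and the comma-separated format of x-prxy-pipe are taken from the table.

```typescript
// Hypothetical options object; field names are illustrative only.
interface HeaderOptions {
  apiKey: string;            // prxy_live_xxx in cloud mode; any string in local mode
  pipe?: string[];           // module names, sent as a comma list in x-prxy-pipe
  anthropicVersion?: string; // forwarded to the provider
  anthropicBeta?: string;    // forwarded to the provider
}

// Builds the request headers: the two required ones are always present,
// optional ones are added only when a value was supplied.
function buildHeaders(opts: HeaderOptions): Record<string, string> {
  const headers: Record<string, string> = {
    "Authorization": `Bearer ${opts.apiKey}`,
    "Content-Type": "application/json",
  };
  if (opts.pipe && opts.pipe.length > 0) {
    headers["x-prxy-pipe"] = opts.pipe.join(",");
  }
  if (opts.anthropicVersion) {
    headers["anthropic-version"] = opts.anthropicVersion;
  }
  if (opts.anthropicBeta) {
    headers["anthropic-beta"] = opts.anthropicBeta;
  }
  return headers;
}
```

The returned object can be passed as the `headers` option of a `fetch` call or any other HTTP client that accepts a plain string map.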
Request body
The full Anthropic Messages schema. The gateway validates with Zod and forwards to the provider.
{
  model: string; // e.g. "claude-sonnet-4-6"
  max_tokens: number; // positive integer
  messages: Array<{
    role: 'user' | 'assistant';
    content: string | ContentBlock[];
  }>;
  system?: string | SystemBlock[];
  temperature?: number; // 0–2
  top_p?: number; // 0–1
  top_k?: number; // positive integer
  stop_sequences?: string[];
  stream?: boolean;
  tools?: Tool[];
  metadata?: Record<string, unknown>;
}
ContentBlock is one of: text, image, tool_use, tool_result. SystemBlock supports cache_control for prompt caching.
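The numeric constraints noted in the schema comments can also be checked on the client before a request is sent, which avoids a round trip that would end in a 400. The sketch below is a hand-written pre-check and not the gateway's actual Zod schema; `MessagesRequest` (a trimmed-down version of the schema) and `validateRequest` are hypothetical names.

```typescript
// Trimmed-down request type covering only the fields checked below.
interface MessagesRequest {
  model: string;
  max_tokens: number;
  messages: Array<{ role: "user" | "assistant"; content: string | unknown[] }>;
  temperature?: number;
  top_p?: number;
  top_k?: number;
  stream?: boolean;
}

// Returns a list of human-readable problems; an empty list means the
// numeric constraints documented in the schema comments are satisfied.
function validateRequest(req: MessagesRequest): string[] {
  const errors: string[] = [];
  const isPositiveInt = (n: number): boolean => Number.isInteger(n) && n > 0;

  if (!isPositiveInt(req.max_tokens)) {
    errors.push("max_tokens must be a positive integer");
  }
  if (req.temperature !== undefined && (req.temperature < 0 || req.temperature > 2)) {
    errors.push("temperature must be between 0 and 2");
  }
  if (req.top_p !== undefined && (req.top_p < 0 || req.top_p > 1)) {
    errors.push("top_p must be between 0 and 1");
  }
  if (req.top_k !== undefined && !isPositiveInt(req.top_k)) {
    errors.push("top_k must be a positive integer");
  }
  return errors;
}
```

The gateway remains the source of truth for validation; a check like this only catches the obvious range mistakes early.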
Response (non-streaming)
{
  "id": "msg_01abc...",
  "type": "message",
  "role": "assistant",
  "model": "claude-sonnet-4-6",
  "content": [
    { "type": "text", "text": "Hello!" }
  ],
  "stop_reason": "end_turn",
  "stop_sequence": null,
  "usage": {
    "input_tokens": 12,
    "output_tokens": 5,
    "cache_read_input_tokens": 0,
    "cache_creation_input_tokens": 0
  }
}
Response (streaming, stream: true)
Server-Sent Events with Anthropic’s event-typed envelope:
event: message_start
data: { "type": "message_start", "message": { ... } }
event: content_block_start
data: { "type": "content_block_start", "index": 0, "content_block": { ... } }
event: content_block_delta
data: { "type": "content_block_delta", "index": 0, "delta": { "type": "text_delta", "text": "Hello" } }
event: content_block_stop
data: { "type": "content_block_stop", "index": 0 }
event: message_delta
data: { "type": "message_delta", "delta": { "stop_reason": "end_turn" } }
event: message_stop
data: { "type": "message_stop" }
Cache hits on streaming requests are replayed as a synthetic stream in this exact format. Your client cannot distinguish a cache replay from a real stream: the events and field shapes are the same.
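To make the event envelope concrete, here is a minimal sketch of a parser that splits an SSE payload into events and joins the text deltas. It assumes the whole payload is already available as one string and that events are separated by blank lines, as is standard for Server-Sent Events; a real client would feed it chunks incrementally. `SseEvent`, `parseSse`, and `collectText` are hypothetical names.

```typescript
interface SseEvent {
  event: string; // value of the "event:" line
  data: unknown; // parsed JSON from the "data:" line(s)
}

// Splits a complete SSE payload into events. Blocks without a data line
// (for example a trailing empty block) are skipped.
function parseSse(raw: string): SseEvent[] {
  const events: SseEvent[] = [];
  for (const block of raw.split(/\r?\n\r?\n/)) {
    let eventName = "";
    const dataLines: string[] = [];
    for (const line of block.split(/\r?\n/)) {
      if (line.startsWith("event:")) {
        eventName = line.slice("event:".length).trim();
      } else if (line.startsWith("data:")) {
        dataLines.push(line.slice("data:".length).trim());
      }
    }
    if (dataLines.length > 0) {
      events.push({ event: eventName, data: JSON.parse(dataLines.join("\n")) });
    }
  }
  return events;
}

// Concatenates the text of every text_delta in arrival order.
function collectText(events: SseEvent[]): string {
  let text = "";
  for (const { event, data } of events) {
    if (event !== "content_block_delta") continue;
    const delta = (data as { delta?: { type?: string; text?: string } }).delta;
    if (delta?.type === "text_delta" && typeof delta.text === "string") {
      text += delta.text;
    }
  }
  return text;
}
```

Because cache replays use the same events and field shapes, the same parser should handle both replayed and live streams without a separate code path.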
Examples
curl
curl https://api.prxy.monster/v1/messages \
  -H "Authorization: Bearer prxy_live_xxx" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-sonnet-4-6",
    "max_tokens": 1024,
    "messages": [
      { "role": "user", "content": "Write a haiku about distributed systems." }
    ]
  }'
Per-request pipeline override
curl https://api.prxy.monster/v1/messages \
  -H "Authorization: Bearer prxy_live_xxx" \
  -H "x-prxy-pipe: exact-cache,patterns" \
  -H "Content-Type: application/json" \
  -d '{ ... }'
The override applies to this single call only. Useful for A/B testing.
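One way an A/B test could use the override is to assign each caller to a pipeline variant deterministically, so the same caller always hits the same pipeline. The sketch below is only an illustration under assumptions: the two variants, the bucketing key, and the helper names (`VARIANTS`, `fnv1a`, `pickVariant`, `pipeHeaderFor`) are hypothetical, and the hash is an FNV-1a-style hash over UTF-16 code units rather than anything the gateway prescribes. The module names exact-cache and patterns are the ones from the curl example.

```typescript
// Hypothetical variants: A runs one module, B runs two.
const VARIANTS: Record<"A" | "B", string[]> = {
  A: ["exact-cache"],
  B: ["exact-cache", "patterns"],
};

// FNV-1a-style 32-bit hash over UTF-16 code units. Deterministic, so a
// given key always maps to the same bucket.
function fnv1a(input: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash;
}

// Even hashes go to variant A, odd hashes to variant B.
function pickVariant(key: string): "A" | "B" {
  return fnv1a(key) % 2 === 0 ? "A" : "B";
}

// Builds the x-prxy-pipe header for the variant assigned to this key.
function pipeHeaderFor(key: string): Record<string, string> {
  return { "x-prxy-pipe": VARIANTS[pickVariant(key)].join(",") };
}
```

Merging the result into the request headers applies the variant to that call only, which matches the single-call scope of the override.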
Error codes
| Status | error.type | When |
|---|---|---|
| 400 | invalid_request | Body fails schema validation. |
| 401 | authentication_error | Missing / malformed / revoked key. |
| 402 | payment_required | Out of credits (cloud paid tier). |
| 403 | permission_error | Action not allowed in current mode (e.g. local-mode billing). |
| 404 | not_found | Endpoint or resource not found. |
| 429 | cost_limit_per_request / cost_limit_per_day / cost_limit_per_month | cost-guard enforced a cap. |
| 429 | rate_limit_error | Per-key request rate limit exceeded. |
| 502 | upstream_error | Provider returned 5xx after all retries. |
| 503 | service_unavailable | Storage backend or critical dependency down. |
Error body shape:
{
  "type": "error",
  "error": {
    "type": "cost_limit_per_day",
    "message": "Daily cost cap exceeded",
    "limit": 5.00,
    "spent": 4.87,
    "estimated": 0.21,
    "resets_at": "2026-04-28T00:00:00.000Z"
  }
}
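A client might turn the status code and error body into a simple decision: retry, wait until the cap resets, or give up. The sketch below shows one possible policy, not a prescribed one. It assumes that the cost fields (limit, spent, estimated, resets_at) may be absent on other error types, and it treats 429 rate limits, 502, and 503 as retryable; `GatewayError`, `Action`, and `decideAction` are hypothetical names.

```typescript
// Error body shape from this page; the cost fields are treated as optional
// because only a cost-limit example is documented.
interface GatewayError {
  type: "error";
  error: {
    type: string;
    message: string;
    limit?: number;
    spent?: number;
    estimated?: number;
    resets_at?: string;
  };
}

type Action =
  | { kind: "retry" }
  | { kind: "wait_until"; at: Date }
  | { kind: "fail"; reason: string };

// One possible client policy (an assumption, not gateway behavior):
// cost caps wait for resets_at when present, transient statuses retry,
// everything else fails with the server's message.
function decideAction(status: number, body: GatewayError): Action {
  const { type, message, resets_at } = body.error;
  if (status === 429 && type.startsWith("cost_limit_")) {
    return resets_at
      ? { kind: "wait_until", at: new Date(resets_at) }
      : { kind: "fail", reason: message };
  }
  if (status === 429 || status === 502 || status === 503) {
    return { kind: "retry" };
  }
  return { kind: "fail", reason: message };
}
```

In practice the retry branch would usually be paired with a backoff delay and an attempt limit, which are left out here to keep the sketch short.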