
POST /v1/chat/completions

OpenAI-compatible Chat Completions API. The gateway translates between OpenAI’s wire shape and the canonical internal representation: modules see the canonical form, while your client sees the OpenAI shape.

Endpoint

POST https://api.prxy.monster/v1/chat/completions

Headers

Header                          Required  Notes
Authorization: Bearer <key>     yes       Your prxy_live_xxx key.
Content-Type: application/json  yes       —
x-prxy-pipe                     no        Per-request pipeline override.

Request body

{
  model: string;                      // any provider's model name
  messages: Array<{
    role: 'system' | 'user' | 'assistant' | 'tool';
    content: string | ContentPart[];
    name?: string;
    tool_calls?: ToolCall[];
    tool_call_id?: string;
  }>;
  max_tokens?: number;
  temperature?: number;
  top_p?: number;
  stream?: boolean;
  stop?: string | string[];
  tools?: Array<{
    type: 'function';
    function: { name: string; description?: string; parameters: object };
  }>;
  tool_choice?: 'none' | 'auto' | { type: 'function'; function: { name: string } };
  response_format?: { type: 'text' | 'json_object' };
  seed?: number;
}

System message hoisting: the first role: 'system' message is hoisted to the canonical system field. Any subsequent system messages are inlined into the adjacent user/assistant turn, preserving their intent across translators.
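The hoisting rule can be sketched as a small transform. This is illustrative, not the gateway's actual code; in particular, the exact inlining format (here, a bracketed `[system]` prefix appended to the adjacent turn) is an assumption.

```typescript
// Sketch of system-message hoisting: first system message becomes the
// canonical `system` field; later ones are folded into the adjacent turn.
type Msg = { role: "system" | "user" | "assistant" | "tool"; content: string };

function hoistSystem(messages: Msg[]): { system?: string; messages: Msg[] } {
  let system: string | undefined;
  const out: Msg[] = [];
  for (const m of messages) {
    if (m.role !== "system") { out.push(m); continue; }
    if (system === undefined) { system = m.content; continue; } // first: hoist
    const prev = out[out.length - 1];
    if (prev && (prev.role === "user" || prev.role === "assistant")) {
      prev.content += `\n\n[system] ${m.content}`; // inline into adjacent turn
    } else {
      out.push({ role: "user", content: `[system] ${m.content}` });
    }
  }
  return { system, messages: out };
}
```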

Response (non-streaming)

{
  "id": "chatcmpl-...",
  "object": "chat.completion",
  "created": 1714200000,
  "model": "gpt-4o-mini",
  "choices": [
    {
      "index": 0,
      "message": { "role": "assistant", "content": "Hello!" },
      "finish_reason": "stop"
    }
  ],
  "usage": { "prompt_tokens": 12, "completion_tokens": 5, "total_tokens": 17 }
}

Response (streaming)

data: {"id":"chatcmpl-...","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"role":"assistant"}}]}

data: {"id":"chatcmpl-...","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"content":"Hello"}}]}

data: {"id":"chatcmpl-...","object":"chat.completion.chunk","choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}

data: [DONE]
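Consuming the stream amounts to parsing `data:` lines, stopping at the `[DONE]` sentinel, and concatenating `delta.content` fragments in order. A minimal sketch, assuming each `data:` line carries one `chat.completion.chunk` as shown above:

```typescript
// Assemble the assistant reply from raw SSE lines.
type Chunk = {
  choices: Array<{ delta: { content?: string }; finish_reason?: string | null }>;
};

function accumulate(sseLines: string[]): string {
  let text = "";
  for (const line of sseLines) {
    if (!line.startsWith("data:")) continue;       // skip blanks/comments
    const payload = line.slice(5).trim();
    if (payload === "[DONE]") break;               // end-of-stream sentinel
    const chunk: Chunk = JSON.parse(payload);
    text += chunk.choices[0]?.delta.content ?? ""; // deltas concatenate in order
  }
  return text;
}
```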

Examples

curl https://api.prxy.monster/v1/chat/completions \
  -H "Authorization: Bearer prxy_live_xxx" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o-mini",
    "messages": [{ "role": "user", "content": "hi" }]
  }'

Cross-provider routing

The model field accepts any model name from any wired provider. The gateway’s router (currently explicit-match; smart routing planned for v1.1) picks the upstream:

{ "model": "claude-sonnet-4-6", "messages": [...] }   // → Anthropic
{ "model": "gpt-4o", "messages": [...] }              // → OpenAI
{ "model": "gemini-2.0-pro", "messages": [...] }      // → Google (v1.1)

Translation is automatic. A request with model: 'claude-sonnet-4-6' sent to /v1/chat/completions (OpenAI shape) is converted to canonical form, forwarded to Anthropic, and the response is converted back to OpenAI shape on the way out.
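Explicit-match routing can be pictured as a prefix lookup. The table below is an assumption for illustration only; the gateway's real model-to-upstream mapping is internal.

```typescript
// Illustrative explicit-match router: first matching model-name prefix wins.
const UPSTREAM_PREFIXES: Array<[prefix: string, upstream: string]> = [
  ["claude-", "anthropic"],
  ["gpt-", "openai"],
  ["gemini-", "google"], // v1.1 per the docs above
];

function routeModel(model: string): string | undefined {
  const hit = UPSTREAM_PREFIXES.find(([prefix]) => model.startsWith(prefix));
  return hit?.[1]; // undefined → no wired provider matches
}
```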

Errors

Same error shape and codes as POST /v1/messages, except wrapped in OpenAI’s error format:

{
  "error": {
    "type": "cost_limit_per_day",
    "message": "Daily cost cap exceeded",
    "code": "cost_limit_per_day"
  }
}
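A client can narrow an arbitrary response body to this wrapped shape before branching on the error type. A minimal sketch; treating `cost_limit_*` types as back-off-worthy is an assumption for the example, not documented gateway semantics:

```typescript
// Narrow unknown payloads to the OpenAI-wrapped error shape shown above.
type GatewayError = { error: { type: string; message: string; code: string } };

function isGatewayError(x: unknown): x is GatewayError {
  const e = (x as GatewayError | null)?.error;
  return typeof e?.type === "string" && typeof e?.message === "string";
}

// Example policy: back off when a cost cap (per-day, etc.) was hit.
function shouldBackOff(x: unknown): boolean {
  return isGatewayError(x) && x.error.type.startsWith("cost_limit_");
}
```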