POST /v1/chat/completions
OpenAI-compatible Chat Completions API. The gateway translates between OpenAI’s wire shape and the canonical internal representation; modules see the canonical form, your client sees the OpenAI shape.
Endpoint
POST https://api.prxy.monster/v1/chat/completions
Headers
| Header | Required | Notes |
|---|---|---|
| Authorization: Bearer <key> | yes | Your prxy_live_xxx key. |
| Content-Type: application/json | yes | |
| x-prxy-pipe | no | Per-request pipeline override. |
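For illustration, a per-request pipeline override only needs the extra header; the pipeline name my-pipeline below is a hypothetical placeholder, not a built-in:
const res = await fetch('https://api.prxy.monster/v1/chat/completions', {
  method: 'POST',
  headers: {
    Authorization: 'Bearer prxy_live_xxx',
    'Content-Type': 'application/json',
    'x-prxy-pipe': 'my-pipeline', // hypothetical pipeline name
  },
  body: JSON.stringify({
    model: 'gpt-4o-mini',
    messages: [{ role: 'user', content: 'hi' }],
  }),
});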
Request body
{
model: string; // any provider's model name
messages: Array<{
role: 'system' | 'user' | 'assistant' | 'tool';
content: string | ContentPart[];
name?: string;
tool_calls?: ToolCall[];
tool_call_id?: string;
}>;
max_tokens?: number;
temperature?: number;
top_p?: number;
stream?: boolean;
stop?: string | string[];
tools?: Array<{
type: 'function';
function: { name: string; description?: string; parameters: object };
}>;
tool_choice?: 'none' | 'auto' | { type: 'function'; function: { name: string } };
response_format?: { type: 'text' | 'json_object' };
seed?: number;
}
System message hoisting: the first role: 'system' message gets hoisted to the canonical system field. Subsequent system messages are inlined into the surrounding user/assistant turn (preserving intent across translators).
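To make the hoisting concrete, here is an illustrative OpenAI-shape request; the comments describe what the gateway does with each system message (the canonical form itself is internal and not shown):
const body = {
  model: 'gpt-4o-mini',
  messages: [
    // First system message: hoisted to the canonical system field.
    { role: 'system', content: 'You are terse.' },
    { role: 'user', content: 'Summarize this doc.' },
    // Later system message: not hoisted; inlined into the surrounding
    // user/assistant turn before translation.
    { role: 'system', content: 'Answer in one sentence.' },
    { role: 'user', content: 'Go ahead.' },
  ],
};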
Response (non-streaming)
{
"id": "chatcmpl-...",
"object": "chat.completion",
"created": 1714200000,
"model": "gpt-4o-mini",
"choices": [
{
"index": 0,
"message": { "role": "assistant", "content": "Hello!" },
"finish_reason": "stop"
}
],
"usage": {
"prompt_tokens": 12,
"completion_tokens": 5,
"total_tokens": 17
}
}
Response (streaming)
data: {"id":"chatcmpl-...","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"role":"assistant"}}]}
data: {"id":"chatcmpl-...","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"content":"Hello"}}]}
data: {"id":"chatcmpl-...","object":"chat.completion.chunk","choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}
data: [DONE]
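Each chunk arrives as a server-sent event on a data: line, terminated by data: [DONE]. A minimal consumer sketch, assuming a fetch-capable runtime such as Node 18+ (no error handling or reconnection):
const res = await fetch('https://api.prxy.monster/v1/chat/completions', {
  method: 'POST',
  headers: {
    Authorization: 'Bearer prxy_live_xxx',
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    model: 'gpt-4o-mini',
    messages: [{ role: 'user', content: 'hi' }],
    stream: true,
  }),
});

const reader = res.body!.getReader();
const decoder = new TextDecoder();
let buffer = '';

while (true) {
  const { done, value } = await reader.read();
  if (done) break;
  buffer += decoder.decode(value, { stream: true });

  const lines = buffer.split('\n');
  buffer = lines.pop() ?? ''; // keep any partial line for the next read
  for (const line of lines) {
    if (!line.startsWith('data: ')) continue;
    const payload = line.slice('data: '.length);
    if (payload === '[DONE]') continue; // end-of-stream marker
    const chunk = JSON.parse(payload);
    const delta = chunk.choices[0]?.delta?.content;
    if (delta) process.stdout.write(delta);
  }
}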
Examples
curl
curl https://api.prxy.monster/v1/chat/completions \
-H "Authorization: Bearer prxy_live_xxx" \
-H "Content-Type: application/json" \
-d '{
"model": "gpt-4o-mini",
"messages": [{ "role": "user", "content": "hi" }]
}'
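TypeScript (openai SDK)
Because the endpoint is OpenAI-compatible, existing OpenAI client libraries should work by pointing their base URL at the gateway. A minimal sketch using the official openai npm package (v4+); only the API key and baseURL differ from a stock OpenAI setup:
import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: 'prxy_live_xxx',
  baseURL: 'https://api.prxy.monster/v1',
});

const completion = await client.chat.completions.create({
  model: 'gpt-4o-mini',
  messages: [{ role: 'user', content: 'hi' }],
});

console.log(completion.choices[0].message.content);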
Cross-provider routing
The model field accepts any model from any wired provider. The gateway’s router (currently: explicit-match; v1.1: smart routing) picks the upstream:
{ "model": "claude-sonnet-4-6", "messages": [...] } // → Anthropic
{ "model": "gpt-4o", "messages": [...] } // → OpenAI
{ "model": "gemini-2.0-pro", "messages": [...] } // → Google (v1.1)Translation is automatic. A request that says model: 'claude-sonnet-4-6' against /v1/chat/completions (OpenAI shape) gets converted to canonical, sent to Anthropic, response converted back to OpenAI shape on the way out.
Errors
Same error shape and codes as POST /v1/messages, except wrapped in OpenAI’s error format:
{
"error": {
"type": "cost_limit_per_day",
"message": "Daily cost cap exceeded",
"code": "cost_limit_per_day"
}
}
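A minimal handling sketch; it relies only on the documented error body and does not assume specific HTTP status codes:
const res = await fetch('https://api.prxy.monster/v1/chat/completions', {
  method: 'POST',
  headers: {
    Authorization: 'Bearer prxy_live_xxx',
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    model: 'gpt-4o-mini',
    messages: [{ role: 'user', content: 'hi' }],
  }),
});

if (!res.ok) {
  const { error } = await res.json();
  if (error.type === 'cost_limit_per_day') {
    // Daily cost cap exceeded; back off until the cap resets.
  }
  throw new Error(`${error.type}: ${error.message}`);
}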