Using prxy.monster with the OpenAI SDK

prxy.monster exposes a fully OpenAI-compatible API at https://api.prxy.monster/v1. Any client that talks to the OpenAI API talks to prxy.monster.

Install

```sh
npm install openai   # Node
pip install openai   # Python
```

Configure

The official OpenAI client respects OPENAI_BASE_URL everywhere — Node, Python, edge.

```sh
export OPENAI_BASE_URL=https://api.prxy.monster/v1
export OPENAI_API_KEY=prxy_live_xxxxxxxxxxxxxxxxxxxxxxxx
```

Code change

None. Both the Node and Python openai clients automatically pick up the environment variables.

```js
// Before AND after — no diff
import OpenAI from 'openai';

const client = new OpenAI();
const r = await client.chat.completions.create({
  model: 'gpt-4o-mini',
  messages: [{ role: 'user', content: 'hi' }],
});
```

If you prefer to configure the client explicitly:

```js
const client = new OpenAI({
  baseURL: 'https://api.prxy.monster/v1',
  apiKey: process.env.OPENAI_API_KEY,
});
```

Verify

```sh
curl https://api.prxy.monster/health
```

Or, with the CLI:

```sh
prxy doctor
```

What you get

  • Infinite context — chat.completions.create calls compress old turns instead of dropping them.
  • Semantic cache — similar prompts hit cache, return in 15-30ms.
  • Pattern memory — successful answers get learned and re-injected.
  • Cost guards — hard per-request budget caps before the OpenAI bill arrives.
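As intuition for the caching bullets above — purely an illustration, not prxy.monster's actual implementation — an exact-match cache can key on a canonical serialization of the request body, so that field order doesn't matter. All names here (`cacheKey`, `lookup`, `store`) are assumptions for the sketch:

```javascript
// Illustrative sketch of exact-match request caching. A real server-side
// module would hash the canonical string (e.g. SHA-256) rather than use it
// directly as a Map key, and a semantic cache would compare embeddings.
function stableStringify(value) {
  if (Array.isArray(value)) {
    return '[' + value.map(stableStringify).join(',') + ']';
  }
  if (value && typeof value === 'object') {
    // Sort keys so { model, messages } and { messages, model } serialize identically.
    return '{' + Object.keys(value).sort()
      .map((k) => JSON.stringify(k) + ':' + stableStringify(value[k]))
      .join(',') + '}';
  }
  return JSON.stringify(value);
}

function cacheKey(body) {
  return stableStringify(body);
}

const cache = new Map();

function lookup(body) {
  return cache.get(cacheKey(body)); // undefined on a miss
}

function store(body, response) {
  cache.set(cacheKey(body), response);
}
```

The key property is that two requests with identical content but different JSON field order hit the same cache entry.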

These features are enabled through the PRXY_PIPE module list:

```sh
PRXY_PIPE=mcp-optimizer,semantic-cache,patterns,ipc
```

For batch / cost-sensitive workloads, add exact-cache first:

```sh
PRXY_PIPE=exact-cache,semantic-cache,cost-guard,patterns
```
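Conceptually, PRXY_PIPE names an ordered chain of middleware: each module sees the request and either short-circuits (a cache hit, a budget violation) or passes it along. The following sketch shows only the composition idea — the module names, request fields, and return shapes are assumptions, not prxy's internals:

```javascript
// Illustrative middleware chain. Each module receives the request and a
// next() continuation; returning without calling next() short-circuits.
function buildPipeline(modules) {
  return (request) =>
    modules.reduceRight(
      (next, mod) => () => mod(request, next),
      () => ({ source: 'upstream' }) // final fallthrough: the real API
    )();
}

// Toy stand-ins for exact-cache and cost-guard.
const exactCache = (req, next) =>
  req.cached ? { source: 'exact-cache' } : next();

const costGuard = (req, next) => {
  if (req.estimatedCost > req.budget) throw new Error('budget exceeded');
  return next();
};

const pipe = buildPipeline([exactCache, costGuard]);
```

Ordering matters in such a chain: putting exact-cache first means a hit never even reaches the cost guard, which is why the batch configuration above lists it first.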

Streaming

```js
const stream = await client.chat.completions.create({
  model: 'gpt-4o-mini',
  messages: [{ role: 'user', content: 'tell a story' }],
  stream: true,
});

for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content ?? '');
}
```

Works identically. Cache hits replay as synthetic SSE.
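For intuition, a synthetic SSE replay is just the cached completion re-serialized as `data:` frames the SDK already knows how to parse. The chunk shape below follows the public Chat Completions streaming format; the helper names and per-character chunking are assumptions for illustration:

```javascript
// Illustrative only: serialize a cached completion as SSE frames and parse
// them back, the way a cache-hit replay could present itself to the client.
function toSSE(text) {
  const frames = [...text].map((ch) =>
    `data: ${JSON.stringify({ choices: [{ delta: { content: ch } }] })}\n\n`
  );
  return frames.join('') + 'data: [DONE]\n\n';
}

function fromSSE(raw) {
  let out = '';
  for (const line of raw.split('\n')) {
    if (!line.startsWith('data: ')) continue;
    const payload = line.slice(6);
    if (payload === '[DONE]') break; // sentinel that ends the stream
    out += JSON.parse(payload).choices[0]?.delta?.content ?? '';
  }
  return out;
}
```

Because the frames are indistinguishable from upstream ones, the `for await` loop above consumes a replay without any code change.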

Compatibility notes

  • Function calling / tools — pass-through. mcp-optimizer prunes irrelevant tool defs automatically if you ship many.
  • response_format: { type: 'json_object' } — pass-through.
  • Assistants API (v1/assistants, threads, runs) — currently not proxied. Use the Chat Completions API instead.
  • Embeddings (v1/embeddings) — pass-through, used internally by the semantic-cache module too.
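The mcp-optimizer bullet can be pictured as a relevance filter over tool definitions before the request goes upstream. Here is a naive keyword-overlap version — purely illustrative, since prxy's actual pruning strategy is not documented here; a real optimizer would score with embeddings rather than substring checks:

```javascript
// Naive sketch: keep only the tools whose descriptions overlap with the
// prompt, so a request shipping dozens of tool defs sends far fewer tokens.
function pruneTools(tools, prompt, keep = 2) {
  const words = prompt.toLowerCase().split(/\W+/);
  const scored = tools.map((tool) => ({
    tool,
    score: words.filter(
      (w) => w.length > 2 && tool.function.description.toLowerCase().includes(w)
    ).length,
  }));
  return scored
    .sort((a, b) => b.score - a.score)
    .slice(0, keep)
    .map((s) => s.tool);
}
```

The pruned array drops into the standard `tools` parameter of `chat.completions.create` unchanged.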

Full example

Plain Node script: github.com/Ekkos-Technologies-Inc/prxy-examples/tree/main/examples/openai-quickstart 

prxy.monster speaks the OpenAI Chat Completions wire format. Newer OpenAI features (Responses API, Realtime API) are not yet proxied — track /changelog for support.
