# Using prxy.monster with the OpenAI SDK
prxy.monster exposes a fully OpenAI-compatible API at `https://api.prxy.monster/v1`. Any client that can talk to the OpenAI API can talk to prxy.monster.
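Concretely, "OpenAI-compatible" means only the host changes; the request shape is the standard Chat Completions payload. A minimal sketch of what a raw request looks like (`buildRequest` is a hypothetical helper for illustration, not SDK code):

```javascript
// Sketch: the same Chat Completions payload, pointed at a different host.
// buildRequest is illustrative only — the SDK does this internally.
function buildRequest(baseUrl, apiKey, body) {
  return {
    url: `${baseUrl}/chat/completions`,
    headers: {
      Authorization: `Bearer ${apiKey}`,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify(body),
  };
}

const req = buildRequest('https://api.prxy.monster/v1', 'prxy_live_example', {
  model: 'gpt-4o-mini',
  messages: [{ role: 'user', content: 'hi' }],
});
console.log(req.url); // https://api.prxy.monster/v1/chat/completions
```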
## Install

```sh
npm install openai
# or
pip install openai
```

## Configure
The official OpenAI client respects OPENAI_BASE_URL everywhere — Node, Python, edge.
```sh
export OPENAI_BASE_URL=https://api.prxy.monster/v1
export OPENAI_API_KEY=prxy_live_xxxxxxxxxxxxxxxxxxxxxxxx
```

## Code change
None. Both the `openai` Node SDK and the `openai` Python SDK pick up these environment variables automatically.
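Under the hood this is plain base-URL resolution with a fallback to the official endpoint. A hypothetical sketch of that behavior (`resolveBaseUrl` is illustrative, not a real SDK function):

```javascript
// Sketch (assumption, not SDK source): resolve the base URL from the
// environment, falling back to the official OpenAI endpoint.
function resolveBaseUrl(env = process.env) {
  const base = env.OPENAI_BASE_URL ?? 'https://api.openai.com/v1';
  return base.replace(/\/+$/, ''); // normalize trailing slashes
}

console.log(resolveBaseUrl({ OPENAI_BASE_URL: 'https://api.prxy.monster/v1/' }));
// https://api.prxy.monster/v1
```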
### Node
```js
// Before AND after — no diff
import OpenAI from 'openai';

const client = new OpenAI();
const r = await client.chat.completions.create({
  model: 'gpt-4o-mini',
  messages: [{ role: 'user', content: 'hi' }],
});
```

If you prefer explicit configuration:
```js
const client = new OpenAI({
  baseURL: 'https://api.prxy.monster/v1',
  apiKey: process.env.OPENAI_API_KEY,
});
```

## Verify
```sh
curl https://api.prxy.monster/health
```

Or, with the CLI:

```sh
prxy doctor
```

## What you get
- Infinite context — `chat.completions.create` calls compress old turns instead of dropping them.
- Semantic cache — similar prompts hit the cache and return in 15–30 ms.
- Pattern memory — successful answers are learned and re-injected.
- Cost guards — hard per-request budget caps before the OpenAI bill arrives.
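The cost-guard idea above can be sketched as a pre-flight check: estimate spend before the request goes out and reject anything over a per-request cap. The price constant and the ~4-characters-per-token heuristic below are illustrative assumptions, not prxy.monster's actual numbers:

```javascript
// Sketch of a cost guard. The price and token heuristic are assumptions.
const PRICE_PER_1K_INPUT_TOKENS_USD = 0.00015; // assumed, not a real quote

function estimateCostUsd(messages) {
  const chars = messages.reduce((n, m) => n + m.content.length, 0);
  const tokens = Math.ceil(chars / 4); // rough ~4 chars/token heuristic
  return (tokens / 1000) * PRICE_PER_1K_INPUT_TOKENS_USD;
}

function guardRequest(messages, capUsd) {
  const cost = estimateCostUsd(messages);
  if (cost > capUsd) {
    throw new Error(`estimated $${cost.toFixed(6)} exceeds cap $${capUsd}`);
  }
  return cost;
}
```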
## Recommended pipeline

```sh
PRXY_PIPE=mcp-optimizer,semantic-cache,patterns,ipc
```

For batch / cost-sensitive workloads, add exact-cache first:

```sh
PRXY_PIPE=exact-cache,semantic-cache,cost-guard,patterns
```

## Streaming
```js
const stream = await client.chat.completions.create({
  model: 'gpt-4o-mini',
  messages: [{ role: 'user', content: 'tell a story' }],
  stream: true,
});

for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content ?? '');
}
```

Streaming works identically through the proxy. Cache hits are replayed as synthetic SSE.
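A cache hit replays the same `data:` frames a live stream would emit, ending with `data: [DONE]`. A minimal sketch of parsing that wire format (`parseSseLine` is a hypothetical helper, not part of the SDK, which handles this for you):

```javascript
// Sketch: OpenAI-style SSE frames are "data: <json>" lines terminated by
// "data: [DONE]". parseSseLine is illustrative only.
function parseSseLine(line) {
  if (!line.startsWith('data: ')) return null; // blank lines, comments, etc.
  const payload = line.slice('data: '.length);
  if (payload === '[DONE]') return { done: true };
  return { done: false, chunk: JSON.parse(payload) };
}

const sample = 'data: {"choices":[{"delta":{"content":"Once"}}]}';
console.log(parseSseLine(sample).chunk.choices[0].delta.content); // Once
```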
## Common issues
- Function calling / tools — pass-through. `mcp-optimizer` prunes irrelevant tool defs automatically if you ship many.
- `response_format: { type: 'json_object' }` — pass-through.
- Assistants API (`v1/assistants`, threads, runs) — currently not proxied. Use the Chat Completions API instead.
- Embeddings (`v1/embeddings`) — pass-through, used internally by the `semantic-cache` module too.
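The embeddings pass-through is what a semantic-cache-style lookup builds on: compare prompt embeddings by cosine similarity and treat near-duplicates as hits. A sketch under assumed names and an assumed 0.92 threshold (not prxy.monster's documented cutoff):

```javascript
// Sketch of a semantic-cache-style match: cosine similarity over embedding
// vectors. The 0.92 threshold and function names are illustrative assumptions.
function cosine(a, b) {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

function isCacheHit(a, b, threshold = 0.92) {
  return cosine(a, b) >= threshold;
}

console.log(isCacheHit([1, 0, 0], [1, 0, 0])); // true
```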
## Full example
Plain Node script: github.com/Ekkos-Technologies-Inc/prxy-examples/tree/main/examples/openai-quickstart
prxy.monster speaks the OpenAI Chat Completions wire format. Newer OpenAI features (Responses API, Realtime API) are not yet proxied — track /changelog for support.