# Using prxy.monster with the Vercel AI SDK
The Vercel AI SDK (`ai` plus the `@ai-sdk/*` provider packages) is the most popular Node-side AI SDK. Both the OpenAI and Anthropic providers respect the standard `OPENAI_BASE_URL` / `ANTHROPIC_BASE_URL` env vars, so for most setups the integration requires zero code changes.
## Install
You probably already have these. If not:
```bash
npm install ai @ai-sdk/openai @ai-sdk/anthropic
```

## Configure
Pick the provider you’re using.
### Anthropic provider
```bash
export ANTHROPIC_BASE_URL=https://api.prxy.monster
export ANTHROPIC_API_KEY=prxy_live_xxxxxxxxxxxxxxxxxxxxxxxx
```
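### OpenAI provider

The analogous variables for the OpenAI provider (the `/v1` suffix mirrors the explicit `baseURL` in the factory example below; confirm the exact path for your prxy.monster setup):

```bash
export OPENAI_BASE_URL=https://api.prxy.monster/v1
export OPENAI_API_KEY=prxy_live_xxxxxxxxxxxxxxxxxxxxxxxx
```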
## Code change

None. Both `@ai-sdk/openai` and `@ai-sdk/anthropic` read those env vars automatically.

```ts
// Before AND after: no diff
import { anthropic } from '@ai-sdk/anthropic';
import { generateText } from 'ai';
const { text } = await generateText({
  model: anthropic('claude-sonnet-4-6'),
  prompt: 'Why is the sky blue?',
});
```

If you prefer to be explicit (or you've configured the provider via a factory), pass `baseURL`:

```ts
import { createAnthropic } from '@ai-sdk/anthropic';
const anthropic = createAnthropic({
  baseURL: 'https://api.prxy.monster',
  apiKey: process.env.ANTHROPIC_API_KEY, // your prxy key
});
```

```ts
import { createOpenAI } from '@ai-sdk/openai';
const openai = createOpenAI({
  baseURL: 'https://api.prxy.monster/v1',
  apiKey: process.env.OPENAI_API_KEY,
});
```
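Either factory-created instance then drops in wherever the default export was used, for example:

```ts
import { generateText } from 'ai';

// `anthropic` is the factory-created instance from the snippet above.
const { text } = await generateText({
  model: anthropic('claude-sonnet-4-6'),
  prompt: 'Why is the sky blue?',
});
```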
## Verify

```bash
curl https://api.prxy.monster/health
# → {"status":"ok"}Or, if you’ve installed the CLI (@prxy/cli):
prxy doctor
```
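Or check programmatically at app startup; a minimal sketch, with the response shape taken from the curl output above:

```ts
// Fail fast at boot if the proxy is unreachable or unhealthy.
const res = await fetch('https://api.prxy.monster/health');
if (!res.ok || (await res.json()).status !== 'ok') {
  throw new Error('prxy.monster health check failed');
}
```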
## What you get

- Infinite context (`ipc`): your `streamText`/`generateText` calls compress old messages instead of dropping them.
- Semantic cache (`semantic-cache`): repeated similar prompts return cached responses at near-zero latency (see the timing sketch after this list).
- Pattern memory (`patterns`): successful answers are learned and re-injected into similar future calls.
- Cost guards (`cost-guard`): hard per-request / per-day budget caps.
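To see the semantic cache at work, a quick sketch (the second, paraphrased prompt should come back near-instantly when `semantic-cache` is in your pipeline; exact timings will vary):

```ts
import { anthropic } from '@ai-sdk/anthropic';
import { generateText } from 'ai';

// The first call hits the provider; the paraphrase should hit the cache.
for (const prompt of ['Why is the sky blue?', 'Why does the sky look blue?']) {
  const start = Date.now();
  const { text } = await generateText({
    model: anthropic('claude-sonnet-4-6'),
    prompt,
  });
  console.log(`${Date.now() - start}ms  ${text.slice(0, 60)}`);
}
```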
## Recommended pipeline for AI SDK apps
For a typical Next.js chat or RAG app:
```bash
PRXY_PIPE=mcp-optimizer,semantic-cache,patterns,ipc
```

For high-traffic production where cost is the priority:
```bash
PRXY_PIPE=exact-cache,semantic-cache,cost-guard,patterns
```

For agents that use a lot of tools (MCP, function calling):
```bash
PRXY_PIPE=mcp-optimizer,semantic-cache,patterns,ipc
```

See Customize a pipeline for the full reference.
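Putting it together for a Next.js app, an env file might look like the sketch below. Where `PRXY_PIPE` is actually read depends on how your prxy.monster instance is deployed; see the pipeline reference.

```bash
# .env.local (illustrative; PRXY_PIPE placement depends on your prxy deployment)
ANTHROPIC_BASE_URL=https://api.prxy.monster
ANTHROPIC_API_KEY=prxy_live_xxxxxxxxxxxxxxxxxxxxxxxx
PRXY_PIPE=mcp-optimizer,semantic-cache,patterns,ipc
```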
## Streaming
`streamText` works exactly the same. Cache hits are replayed as synthetic SSE, so the streaming UI doesn't notice; cache misses pass through the provider stream untouched.
```ts
import { anthropic } from '@ai-sdk/anthropic';
import { streamText } from 'ai';
const result = streamText({
  model: anthropic('claude-sonnet-4-6'),
  prompt: 'Tell me a story',
});

for await (const chunk of result.textStream) {
  process.stdout.write(chunk);
}
```
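In a Next.js route handler the pattern is the same; a minimal sketch, assuming AI SDK v4's `toDataStreamResponse` (the helper's name shifts across major versions, so check your installed version):

```ts
// app/api/chat/route.ts
import { anthropic } from '@ai-sdk/anthropic';
import { streamText } from 'ai';

export async function POST(req: Request) {
  const { messages } = await req.json();

  const result = streamText({
    model: anthropic('claude-sonnet-4-6'),
    messages,
  });

  // Cache hits arrive as synthetic SSE, so the client sees an ordinary stream.
  return result.toDataStreamResponse();
}
```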
## Common issues

- Headers being overwritten? The AI SDK passes your `apiKey` through as `Authorization: Bearer …`, which is exactly what prxy.monster expects. Don't add a custom `Authorization` header on top.
- Using the `tools` parameter? The `mcp-optimizer` module sees those tool definitions and prunes the irrelevant ones. If you only want a few tools loaded, you don't need to do anything; the optimizer handles it.
- `generateObject`/`streamObject`? Both work. Schema-based structured output is just a tool call under the hood, and prxy.monster passes it through (see the sketch below).
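For instance, a minimal `generateObject` sketch (the zod schema is illustrative):

```ts
import { anthropic } from '@ai-sdk/anthropic';
import { generateObject } from 'ai';
import { z } from 'zod';

// Structured output is a tool call under the hood; prxy.monster passes it through.
const { object } = await generateObject({
  model: anthropic('claude-sonnet-4-6'),
  schema: z.object({
    title: z.string(),
    tags: z.array(z.string()),
  }),
  prompt: 'Suggest a title and tags for a post about CSS grid.',
});
```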
## Full example
Working Next.js 15 chat app: [github.com/Ekkos-Technologies-Inc/prxy-examples/tree/main/examples/nextjs-vercel-ai](https://github.com/Ekkos-Technologies-Inc/prxy-examples/tree/main/examples/nextjs-vercel-ai)
Check the Vercel AI SDK docs for the exact options on your installed version; the SDK ships frequent updates, and option names occasionally shift.