
Using prxy.monster with LangChain JS

LangChain JS uses provider chat models (ChatOpenAI, ChatAnthropic) that accept a configuration.baseURL (OpenAI) or clientOptions.baseURL (Anthropic) override. One arg per model — every chain, agent, and LangGraph node built on top inherits it.

Install

You probably have these already.

npm install @langchain/openai @langchain/anthropic @langchain/core

Configure

You can either set env vars OR pass options explicitly. Both work; explicit is clearer when reviewing code.

import { ChatOpenAI } from '@langchain/openai';

const llm = new ChatOpenAI({
  model: 'gpt-4o',
  apiKey: process.env.OPENAI_API_KEY, // your prxy key
  configuration: {
    baseURL: 'https://api.prxy.monster/v1',
  },
});

const r = await llm.invoke('hi');
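
ChatAnthropic takes the same override under clientOptions. A minimal sketch mirroring the example above:

import { ChatAnthropic } from '@langchain/anthropic';

const llm = new ChatAnthropic({
  model: 'claude-sonnet-4-6',
  apiKey: process.env.ANTHROPIC_API_KEY, // your prxy key
  clientOptions: {
    baseURL: 'https://api.prxy.monster',
  },
});

const r = await llm.invoke('hi');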

Code change

If you used env vars: zero. If you wired explicit configuration: the baseURL line above is the only diff.
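
For reference, the env-var route is a shell-level change only. A sketch, assuming your installed OpenAI and Anthropic SDK versions read OPENAI_BASE_URL and ANTHROPIC_BASE_URL (confirm for the versions you have installed):

export OPENAI_API_KEY=...                           # your prxy key
export OPENAI_BASE_URL=https://api.prxy.monster/v1
export ANTHROPIC_API_KEY=...                        # your prxy key
export ANTHROPIC_BASE_URL=https://api.prxy.monster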

Verify

curl https://api.prxy.monster/health

Build a one-line chain and run it — successful response confirms routing.
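
A minimal sketch of such a chain, wiring the proxied model from the Configure step into a prompt and a string parser:

import { ChatOpenAI } from '@langchain/openai';
import { ChatPromptTemplate } from '@langchain/core/prompts';
import { StringOutputParser } from '@langchain/core/output_parsers';

const llm = new ChatOpenAI({
  model: 'gpt-4o',
  configuration: { baseURL: 'https://api.prxy.monster/v1' },
});

// prompt -> proxied model -> plain string
const chain = ChatPromptTemplate.fromTemplate('Say hello to {name}')
  .pipe(llm)
  .pipe(new StringOutputParser());

console.log(await chain.invoke({ name: 'prxy' })); // any response confirms routing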

What you get

  • Caching across chains — every chain that uses the patched LLM gets semantic-cache for free.
  • Infinite context for long-running agents — ipc compresses old messages, so your agents stop hitting the wall.
  • Pattern memory — successful agent solutions get re-injected on similar future runs.
  • MCP optimization — if you use LangChain’s MCP tool integration, irrelevant tool defs get pruned per call.

How this composes with LangChain memory

LangChain memory (BufferMemory, SummaryMemory, etc.) lives inside your chain. prxy.monster lives between your chain and the LLM. They don’t conflict — they handle different layers:

Concern                        | LangChain memory | prxy.monster
Per-chain conversation state   | Yes              | No
Cross-chain caching            | No               | Yes (semantic-cache)
Cross-session pattern learning | No               | Yes (patterns)
Token-budget management        | Manual           | Automatic (ipc, cost-guard)
Tool-def overhead              | No               | Yes (mcp-optimizer)

Use both. They compose cleanly.
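
A minimal sketch of the two layers together: RunnableWithMessageHistory and InMemoryChatMessageHistory from @langchain/core keep per-session state in your process, while every call to the proxied model still routes through prxy.monster (the histories map and session id below are illustrative):

import { ChatOpenAI } from '@langchain/openai';
import { ChatPromptTemplate, MessagesPlaceholder } from '@langchain/core/prompts';
import { InMemoryChatMessageHistory } from '@langchain/core/chat_history';
import { RunnableWithMessageHistory } from '@langchain/core/runnables';

// LangChain memory layer: per-session conversation state, held client-side
const histories = new Map<string, InMemoryChatMessageHistory>();

// prxy.monster layer: the model itself points at the proxy
const llm = new ChatOpenAI({
  model: 'gpt-4o',
  configuration: { baseURL: 'https://api.prxy.monster/v1' },
});

const prompt = ChatPromptTemplate.fromMessages([
  new MessagesPlaceholder('history'),
  ['human', '{input}'],
]);

const chain = new RunnableWithMessageHistory({
  runnable: prompt.pipe(llm),
  getMessageHistory: (sessionId) => {
    if (!histories.has(sessionId)) histories.set(sessionId, new InMemoryChatMessageHistory());
    return histories.get(sessionId)!;
  },
  inputMessagesKey: 'input',
  historyMessagesKey: 'history',
});

// History is assembled client-side; the resulting request still goes through prxy.monster
await chain.invoke({ input: 'hi' }, { configurable: { sessionId: 'user-42' } });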

LangGraph

LangGraph wraps the same chat models. Set baseURL on the underlying ChatOpenAI / ChatAnthropic instance you pass to nodes — every node inherits it.

import { ChatAnthropic } from '@langchain/anthropic';
import { StateGraph } from '@langchain/langgraph';

const llm = new ChatAnthropic({
  model: 'claude-sonnet-4-6',
  clientOptions: { baseURL: 'https://api.prxy.monster' },
});

// Every node that uses `llm` now goes through prxy.monster
const graph = new StateGraph(...)
  .addNode('reason', async (state) => llm.invoke(state.messages))
  .addNode('answer', async (state) => llm.invoke(state.messages))
  .compile();

LangSmith tracing is unaffected — traces happen client-side before the request leaves your process.

For RAG / chain workflows:

PRXY_PIPE=semantic-cache,patterns,ipc,cost-guard

For agentic workflows with tool use:

PRXY_PIPE=mcp-optimizer,semantic-cache,patterns,ipc

Common issues

  • maxTokens vs max_tokens — LangChain JS uses camelCase; the SDK translates to snake_case before hitting the wire. prxy.monster sees the snake_case version. Nothing to do.
  • Streaming — llm.stream(...) works. Cache hits replay as synthetic SSE (see the sketch below this list).
  • Custom callbacks — fire client-side, before the request leaves. They see the original chain state, not what the proxy did.
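
A minimal streaming sketch against the proxied model; per the note above, a cache hit comes back through the same iterator as synthetic SSE:

import { ChatOpenAI } from '@langchain/openai';

const llm = new ChatOpenAI({
  model: 'gpt-4o',
  configuration: { baseURL: 'https://api.prxy.monster/v1' },
});

// Chunks stream back through the proxy, whether generated live or replayed from cache
const stream = await llm.stream('Explain semantic caching in one sentence.');
for await (const chunk of stream) {
  process.stdout.write(typeof chunk.content === 'string' ? chunk.content : '');
}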

Full example

LangChain RAG pipeline with semantic caching: github.com/Ekkos-Technologies-Inc/prxy-examples/tree/main/examples/langchain-rag 

Verify the exact constructor option name (configuration.baseURL for OpenAI, clientOptions.baseURL for Anthropic) against the LangChain JS docs for your installed version.
