Using prxy.monster with LangChain Python
LangChain Python’s chat model classes (`ChatOpenAI`, `ChatAnthropic`) accept a `base_url` constructor arg (`anthropic_api_url` on `ChatAnthropic`). Set it once and every chain, agent, and LangGraph node downstream inherits it.
Install
```bash
pip install langchain-openai langchain-anthropic langchain-core
```
Configure
ChatOpenAI
```python
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(
    model="gpt-4o",
    base_url="https://api.prxy.monster/v1",
    api_key="prxy_live_xxxxxxxxxxxxxxxxxxxxxxxx",
)
r = llm.invoke("hi")
```
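ChatAnthropic
A sketch of the same pattern on the Anthropic side. Note there is no `/v1` suffix on the Anthropic URL (matching the LangGraph example below); `api_key` may be `anthropic_api_key` on older langchain-anthropic versions:
```python
from langchain_anthropic import ChatAnthropic

llm = ChatAnthropic(
    model="claude-sonnet-4-6",
    anthropic_api_url="https://api.prxy.monster",
    api_key="prxy_live_xxxxxxxxxxxxxxxxxxxxxxxx",  # anthropic_api_key on older versions
)
r = llm.invoke("hi")
```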
Code change
If you used env vars: zero. If you wired explicit args: the `base_url` / `anthropic_api_url` line above is the only diff.
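A minimal sketch of the zero-code-change path, assuming the standard env var names read by langchain-openai (`OPENAI_API_KEY`, `OPENAI_API_BASE`); confirm them for your installed version, per the note at the end of this page:
```python
import os
from langchain_openai import ChatOpenAI

# routing is configured entirely in the environment; constructor args untouched
os.environ["OPENAI_API_BASE"] = "https://api.prxy.monster/v1"
os.environ["OPENAI_API_KEY"] = "prxy_live_xxxxxxxxxxxxxxxxxxxxxxxx"

llm = ChatOpenAI(model="gpt-4o")  # picks both up from the environment
```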
Verify
```bash
curl https://api.prxy.monster/health
```
Run any chain; a successful response confirms routing.
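For an end-to-end check, a minimal chain sketch (prompt wording is illustrative):
```python
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4o", base_url="https://api.prxy.monster/v1")
chain = ChatPromptTemplate.from_template("Say hello to {name}.") | llm

print(chain.invoke({"name": "prxy"}).content)  # any reply confirms routing
```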
What you get
- Semantic cache across every chain that uses the patched LLM.
- Infinite context for long agent conversations (the `ipc` module).
- Pattern memory: successful solutions get learned and re-injected.
- Cost guards: hard per-request budget caps.
LangChain memory vs prxy.monster
LangChain memory (ConversationBufferMemory, ConversationSummaryMemory, etc.) is inside your chain. prxy.monster is between your chain and the LLM. They don’t conflict.
Use LangChain memory for chain-local conversation state. Use prxy.monster for cross-chain caching, cross-session pattern memory, and token-budget management.
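A sketch of the two layers together: `RunnableWithMessageHistory` (langchain-core) holds the chain-local transcript, while the proxied model handles everything past the process boundary. The session plumbing below is illustrative:
```python
from langchain_core.chat_history import InMemoryChatMessageHistory
from langchain_core.runnables.history import RunnableWithMessageHistory
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4o", base_url="https://api.prxy.monster/v1")

store = {}  # chain-local conversation state lives in your process

def get_history(session_id: str) -> InMemoryChatMessageHistory:
    if session_id not in store:
        store[session_id] = InMemoryChatMessageHistory()
    return store[session_id]

chat = RunnableWithMessageHistory(llm, get_history)

# LangChain keeps the transcript; prxy.monster only sees the outgoing requests
chat.invoke("hi", config={"configurable": {"session_id": "user-42"}})
```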
LangGraph
```python
from typing import Annotated, TypedDict
from langchain_anthropic import ChatAnthropic
from langchain_core.messages import AnyMessage
from langgraph.graph import END, START, StateGraph
from langgraph.graph.message import add_messages

class MyState(TypedDict):
    messages: Annotated[list[AnyMessage], add_messages]

llm = ChatAnthropic(
    model="claude-sonnet-4-6",
    anthropic_api_url="https://api.prxy.monster",
)
graph = StateGraph(MyState)
graph.add_node("reason", lambda s: {"messages": [llm.invoke(s["messages"])]})
graph.add_node("answer", lambda s: {"messages": [llm.invoke(s["messages"])]})
graph.add_edge(START, "reason")
graph.add_edge("reason", "answer")
graph.add_edge("answer", END)
app = graph.compile()
# every node that uses `llm` routes through prxy.monster
```
LangSmith tracing is unaffected; it happens client-side before the request leaves the process.
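Invoking the compiled graph (the message content is illustrative):
```python
out = app.invoke({"messages": [("user", "hi")]})
print(out["messages"][-1].content)
```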
CrewAI
CrewAI agents accept any LangChain LLM. The same ChatAnthropic / ChatOpenAI instance with the prxy base URL plugs in:
```python
from crewai import Agent
from langchain_anthropic import ChatAnthropic

llm = ChatAnthropic(
    model="claude-sonnet-4-6",
    anthropic_api_url="https://api.prxy.monster",
)
researcher = Agent(role="Researcher", goal="...", backstory="...", llm=llm)
```
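To run it, a minimal crew sketch; the task wording is illustrative, and CrewAI's `Task`/`Crew` argument names vary slightly across versions:
```python
from crewai import Crew, Task

task = Task(
    description="Summarize the latest findings on topic X.",
    expected_output="A three-bullet summary.",
    agent=researcher,
)
crew = Crew(agents=[researcher], tasks=[task])
print(crew.kickoff())  # every LLM call the crew makes routes through prxy.monster
```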
Recommended pipeline
RAG chains:
```bash
PRXY_PIPE=semantic-cache,patterns,ipc,cost-guard
```
Agentic workflows with tool use:
```bash
PRXY_PIPE=mcp-optimizer,semantic-cache,patterns,ipc
```
Common issues
- Async chains (`await llm.ainvoke(...)`) work identically; streaming via `astream` works too (see the sketch after this list).
- Tool calling (`llm.bind_tools([...])`) is pass-through; `mcp-optimizer` prunes irrelevant tool defs per call.
- Structured output (`llm.with_structured_output(MyModel)`) is pass-through. Schema-based extraction is just a tool call under the hood.
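A combined sketch of all three paths; the `Answer` model and the prompts are illustrative:
```python
import asyncio
from pydantic import BaseModel
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4o", base_url="https://api.prxy.monster/v1")

class Answer(BaseModel):
    text: str
    confidence: float

async def main() -> None:
    # async invocation routes exactly like the sync path
    print((await llm.ainvoke("hi")).content)

    # streaming: chunks arrive incrementally through the proxy
    async for chunk in llm.astream("count to three"):
        print(chunk.content, end="", flush=True)

    # tool calling: tool definitions travel in the request body as usual
    tooled = llm.bind_tools([Answer])  # pydantic models double as tool schemas
    print((await tooled.ainvoke("Is water wet? Call the Answer tool.")).tool_calls)

    # structured output: a tool call under the hood, passed through untouched
    print(await llm.with_structured_output(Answer).ainvoke("Is water wet?"))

asyncio.run(main())
```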
Full example
A minimal Python LangChain script will land in examples/langchain-rag; until then, the JS version maps one-to-one onto the Python equivalent.
Verify the exact constructor argument name against the LangChain Python docs for your installed version. `base_url` (OpenAI) and `anthropic_api_url` (Anthropic) are stable as of langchain 0.3.x.