# Migrating from LiteLLM
LiteLLM is most often used as a Python SDK that normalizes calls across providers. It also ships a proxy server. If you’re using the proxy (`litellm --port 4000`), this guide is for you. If you’re using the SDK only, you’d be stacking it in front of prxy.monster, not replacing it.
## What’s the same
- OpenAI-compatible HTTP API at the gateway.
- Cross-provider routing (Anthropic / OpenAI / Google / Groq / etc.).
- BYOK pattern — you bring the provider keys.
- Cost tracking + budget controls.
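Concretely, the same OpenAI-SDK call works against either gateway, since both expose the OpenAI chat-completions shape. A minimal Python sketch; the model id is an assumption, so check your gateway’s model list for the exact name:

```python
# Minimal sketch: one OpenAI-SDK call, two interchangeable gateways.
# Assumption: the model id "anthropic/claude-sonnet-4-6" is illustrative.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:3099/v1",  # or http://localhost:4000 for LiteLLM
    api_key="prxy_local_anything",        # local mode; use a real key for cloud
)

resp = client.chat.completions.create(
    model="anthropic/claude-sonnet-4-6",
    messages=[{"role": "user", "content": "Say hello in one word."}],
)
print(resp.choices[0].message.content)
```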
## What’s different
| | LiteLLM Proxy | prxy.monster |
|---|---|---|
| Pipeline | Linear plugins | Composable modules with explicit ordering |
| Caching | Redis-only | Redis (cloud) or SQLite (local) |
| Local mode | Self-hosted but stateful | Single Docker container, zero config |
| Persistent memory | No | `patterns` module |
| Anthropic shape (`/v1/messages`) | Translation through canonical | Native — Anthropic clients work unchanged (see the sketch below) |
| TypeScript SDK | Limited | Full canonical types in `@prxy/shared-types` |
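The Anthropic-shape row matters most in practice: an unmodified Anthropic client can point straight at the gateway. A sketch, assuming local mode, an arbitrary local key, and a pass-through model id:

```python
# Sketch of the "native Anthropic shape" row: the stock Anthropic SDK
# talks to the gateway's /v1/messages endpoint with no shape translation.
# Assumptions: local base_url, arbitrary local key, pass-through model id.
import anthropic

client = anthropic.Anthropic(
    base_url="http://localhost:3099",
    api_key="prxy_local_anything",
)

msg = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=64,
    messages=[{"role": "user", "content": "ping"}],
)
print(msg.content[0].text)
```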
## Diff: env + config

LiteLLM `config.yaml`:

```yaml
model_list:
  - model_name: claude
    litellm_params:
      model: anthropic/claude-sonnet-4-6
      api_key: os.environ/ANTHROPIC_API_KEY

litellm_settings:
  cache: true
  cache_params:
    type: redis
    host: redis-host
```

prxy.monster equivalent (much shorter):

```bash
# .env
ANTHROPIC_API_KEY=sk-ant-xxx
PRXY_PIPE='exact-cache,semantic-cache,cost-guard,patterns'
```

```bash
# Run
docker run -d -p 3099:3099 -v ~/.prxy:/data \
  -e ANTHROPIC_API_KEY=sk-ant-xxx \
  -e PRXY_PIPE='exact-cache,semantic-cache,cost-guard' \
  prxymonster/local:latest
```

No external Redis required for local mode — SQLite handles caching.
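A rough way to confirm the SQLite-backed exact cache is working is to send the identical request twice and compare latency. This sketch relies on timing rather than any cache-hit header, since no response metadata is assumed here:

```python
# Fire the same request twice; the warm call should return near-instantly
# from the local exact-cache. Model id is an assumption, as above.
import time
from openai import OpenAI

client = OpenAI(base_url="http://localhost:3099/v1", api_key="prxy_local_anything")

def timed_call() -> float:
    start = time.perf_counter()
    client.chat.completions.create(
        model="anthropic/claude-sonnet-4-6",
        messages=[{"role": "user", "content": "What is 2 + 2?"}],
    )
    return time.perf_counter() - start

print(f"cold: {timed_call():.2f}s")  # hits the provider
print(f"warm: {timed_call():.2f}s")  # should hit exact-cache
```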
## Diff: client code

```diff
- # LiteLLM proxy
- export OPENAI_BASE_URL=http://localhost:4000
- export OPENAI_API_KEY=sk-litellm-xxx
+ # prxy.monster (cloud)
+ export OPENAI_BASE_URL=https://api.prxy.monster/v1
+ export OPENAI_API_KEY=prxy_live_xxx
+ # OR prxy.monster (local)
+ export OPENAI_BASE_URL=http://localhost:3099/v1
+ export OPENAI_API_KEY=prxy_local_anything
```
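The client code itself does not change, because the official OpenAI SDK reads both variables from the environment. A minimal Python check (model id assumed, as above):

```python
# No constructor arguments: the SDK picks up OPENAI_BASE_URL and
# OPENAI_API_KEY from the environment, so the switch is env-only.
from openai import OpenAI

client = OpenAI()

resp = client.chat.completions.create(
    model="anthropic/claude-sonnet-4-6",  # assumed model id
    messages=[{"role": "user", "content": "hello"}],
)
print(resp.choices[0].message.content)
```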
## Mapping LiteLLM features

| LiteLLM | prxy.monster |
|---|---|
| `cache: true` | `exact-cache` + `semantic-cache` modules |
| `max_budget` | `cost-guard` `perDay` / `perMonth` |
| `fallbacks: [...]` | `router` module (v1.1) |
| `router: usage-based-routing` | `router` strategy `smart` (v1.1) |
| `success_callback` / `failure_callback` | Custom module’s `post` hook (interim shim below) |
| `litellm_settings.guardrails` | `guardrails` module (v1.1) |
| `set_verbose` | `LOG_LEVEL=debug` env var |
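The callback row deserves a note: until you port your `success_callback` / `failure_callback` logic into a custom module’s `post` hook, a client-side shim reproduces the behavior. This is an illustration around the standard OpenAI SDK, not a prxy.monster API:

```python
# Interim shim: wrap the OpenAI client to mimic LiteLLM's success/failure
# callbacks. The long-term home for this logic is a custom module's post hook.
from openai import OpenAI

client = OpenAI()  # env-configured, as above

def completion_with_callbacks(on_success, on_failure, **kwargs):
    try:
        resp = client.chat.completions.create(**kwargs)
    except Exception as exc:
        on_failure(exc)   # mirrors failure_callback
        raise
    on_success(resp)      # mirrors success_callback
    return resp

resp = completion_with_callbacks(
    on_success=lambda r: print("tokens:", r.usage.total_tokens),
    on_failure=lambda e: print("failed:", e),
    model="anthropic/claude-sonnet-4-6",  # assumed model id
    messages=[{"role": "user", "content": "hi"}],
)
```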
## What you gain

- Native Anthropic shape — the Claude SDK works unchanged. LiteLLM normalizes everything to the OpenAI shape on the wire, even for Anthropic models.
- Persistent memory — the `patterns` module beats LiteLLM’s caching for repetitive coding/support workloads.
- Type safety end-to-end — `@prxy/shared-types` is shipped TypeScript. LiteLLM is Python-first.
- Single-container local mode — no Redis, no Postgres, no Docker Compose. Just one container.
## What you lose (and the workaround)
| LiteLLM feature | prxy.monster status |
|---|---|
| Massive provider list (~50) | Anthropic + OpenAI + Google + Groq in v1. More via custom provider modules. |
| Python-native SDK | Use the OpenAI Python SDK against our `/v1/chat/completions` endpoint — same shape. |
| `litellm.completion()` async batching | Use the OpenAI SDK’s async client directly (`acreate` in pre-1.0 SDKs, `AsyncOpenAI` from 1.0 on; sketch below). |
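For the async-batching row, the fan-out LiteLLM gives you is a few lines with the plain OpenAI SDK. A sketch assuming `openai>=1.0` and the env configuration above:

```python
# Batch-style async fan-out with the stock OpenAI SDK and asyncio.gather.
# Model id is an assumption, as above.
import asyncio
from openai import AsyncOpenAI

client = AsyncOpenAI()  # env-configured, same as the sync client

async def ask(prompt: str) -> str:
    resp = await client.chat.completions.create(
        model="anthropic/claude-sonnet-4-6",
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content

async def main():
    answers = await asyncio.gather(*(ask(p) for p in ["a?", "b?", "c?"]))
    print(answers)

asyncio.run(main())
```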
If you’ve built a lot of glue around LiteLLM’s callback hooks, those map cleanly to prxy.monster custom modules. See SDK → Module interface for the migration target.