# Migrating from LiteLLM
LiteLLM is most often used as a Python SDK that normalizes calls across providers. It also ships a proxy server. If you’re using the proxy (`litellm --port 4000`), this guide is for you. If you’re using the SDK only, you’d be stacking it in front of prxy.monster, not replacing it.
## What’s the same
- OpenAI-compatible HTTP API at the gateway.
- Cross-provider routing (Anthropic / OpenAI / Google / Groq / etc.).
- BYOK pattern — you bring the provider keys.
- Cost tracking + budget controls.
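Concretely, the same OpenAI-SDK call works against either gateway, since both expose the OpenAI chat-completions shape. A minimal Python sketch; the model id is an assumption, so check your gateway’s model list for the exact name:

```python
# Minimal sketch: one OpenAI-SDK call, two interchangeable gateways.
# Assumption: the model id "anthropic/claude-sonnet-4-6" is illustrative.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:3099/v1",  # or http://localhost:4000 for LiteLLM
    api_key="prxy_local_anything",        # local mode; use a real key for cloud
)

resp = client.chat.completions.create(
    model="anthropic/claude-sonnet-4-6",
    messages=[{"role": "user", "content": "Say hello in one word."}],
)
print(resp.choices[0].message.content)
```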
## What’s different
| | LiteLLM Proxy | prxy.monster |
|---|---|---|
| Pipeline | Linear plugins | Composable modules with explicit ordering |
| Caching | Redis-only | Redis (cloud) or SQLite (local) |
| Local mode | Self-hosted but stateful | Single Docker container, zero config |
| Persistent memory | No | `patterns` module |
| Anthropic shape (`/v1/messages`) | Translation through canonical | Native — Anthropic clients work unchanged (see the sketch below) |
| TypeScript SDK | Limited | Full canonical types in `@prxy/shared-types` |
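The Anthropic-shape row matters most in practice: an unmodified Anthropic client can point straight at the gateway. A sketch, assuming local mode, an arbitrary local key, and a pass-through model id:

```python
# Sketch of the "native Anthropic shape" row: the stock Anthropic SDK
# talks to the gateway's /v1/messages endpoint with no shape translation.
# Assumptions: local base_url, arbitrary local key, pass-through model id.
import anthropic

client = anthropic.Anthropic(
    base_url="http://localhost:3099",
    api_key="prxy_local_anything",
)

msg = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=64,
    messages=[{"role": "user", "content": "ping"}],
)
print(msg.content[0].text)
```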
## Diff: env + config

LiteLLM `config.yaml`:

```yaml
model_list:
  - model_name: claude
    litellm_params:
      model: anthropic/claude-sonnet-4-6
      api_key: os.environ/ANTHROPIC_API_KEY

litellm_settings:
  cache: true
  cache_params:
    type: redis
    host: redis-host
```

prxy.monster equivalent (much shorter):

```bash
# .env
ANTHROPIC_API_KEY=sk-ant-xxx
PRXY_PIPE='exact-cache,semantic-cache,cost-guard,patterns'
```

```bash
# Run
docker run -d -p 3099:3099 -v ~/.prxy:/data \
  -e ANTHROPIC_API_KEY=sk-ant-xxx \
  -e PRXY_PIPE='exact-cache,semantic-cache,cost-guard' \
  prxymonster/local:latest
```

No external Redis required for local mode — SQLite handles caching.
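A rough way to confirm the SQLite-backed exact cache is working is to send the identical request twice and compare latency. This sketch relies on timing rather than any cache-hit header, since no response metadata is assumed here:

```python
# Fire the same request twice; the warm call should return near-instantly
# from the local exact-cache. Model id is an assumption, as above.
import time
from openai import OpenAI

client = OpenAI(base_url="http://localhost:3099/v1", api_key="prxy_local_anything")

def timed_call() -> float:
    start = time.perf_counter()
    client.chat.completions.create(
        model="anthropic/claude-sonnet-4-6",
        messages=[{"role": "user", "content": "What is 2 + 2?"}],
    )
    return time.perf_counter() - start

print(f"cold: {timed_call():.2f}s")  # hits the provider
print(f"warm: {timed_call():.2f}s")  # should hit exact-cache
```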
## Diff: client code

```diff
- # LiteLLM proxy
- export OPENAI_BASE_URL=http://localhost:4000
- export OPENAI_API_KEY=sk-litellm-xxx
+ # prxy.monster (cloud)
+ export OPENAI_BASE_URL=https://api.prxy.monster/v1
+ export OPENAI_API_KEY=prxy_live_xxx
+ # OR prxy.monster (local)
+ export OPENAI_BASE_URL=http://localhost:3099/v1
+ export OPENAI_API_KEY=prxy_local_anything
```
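The client code itself does not change, because the official OpenAI SDK reads both variables from the environment. A minimal Python check (model id assumed, as above):

```python
# No constructor arguments: the SDK picks up OPENAI_BASE_URL and
# OPENAI_API_KEY from the environment, so the switch is env-only.
from openai import OpenAI

client = OpenAI()

resp = client.chat.completions.create(
    model="anthropic/claude-sonnet-4-6",  # assumed model id
    messages=[{"role": "user", "content": "hello"}],
)
print(resp.choices[0].message.content)
```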
## Mapping LiteLLM features

| LiteLLM | prxy.monster |
|---|---|
| `cache: true` | `exact-cache` + `semantic-cache` modules |
| `max_budget` | `cost-guard` `perDay` / `perMonth` |
| `fallbacks: [...]` | `router` module (v1.1) |
| `router: usage-based-routing` | `router` strategy `smart` (v1.1) |
| `success_callback` / `failure_callback` | Custom module’s `post` hook (interim shim below) |
| `litellm_settings.guardrails` | `guardrails` module (v1.1) |
| `set_verbose` | `LOG_LEVEL=debug` env var |
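The callback row deserves a note: until you port your `success_callback` / `failure_callback` logic into a custom module’s `post` hook, a client-side shim reproduces the behavior. This is an illustration around the standard OpenAI SDK, not a prxy.monster API:

```python
# Interim shim: wrap the OpenAI client to mimic LiteLLM's success/failure
# callbacks. The long-term home for this logic is a custom module's post hook.
from openai import OpenAI

client = OpenAI()  # env-configured, as above

def completion_with_callbacks(on_success, on_failure, **kwargs):
    try:
        resp = client.chat.completions.create(**kwargs)
    except Exception as exc:
        on_failure(exc)   # mirrors failure_callback
        raise
    on_success(resp)      # mirrors success_callback
    return resp

resp = completion_with_callbacks(
    on_success=lambda r: print("tokens:", r.usage.total_tokens),
    on_failure=lambda e: print("failed:", e),
    model="anthropic/claude-sonnet-4-6",  # assumed model id
    messages=[{"role": "user", "content": "hi"}],
)
```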
## What you gain

- Native Anthropic shape — the Claude SDK works unchanged. LiteLLM normalizes everything to the OpenAI shape on the wire, even for Anthropic models.
- Persistent memory — the `patterns` module beats LiteLLM’s caching for repetitive coding/support workloads.
- Type safety end-to-end — `@prxy/shared-types` is shipped TypeScript. LiteLLM is Python-first.
- Single-container local mode — no Redis, no Postgres, no Docker Compose. Just one container.
## What you lose (and the workaround)
| LiteLLM feature | prxy.monster status |
|---|---|
| Massive provider list (~50) | Anthropic + OpenAI + Google + Groq in v1. More via custom provider modules. |
| Python-native SDK | Use the OpenAI Python SDK against our `/v1/chat/completions` endpoint — same shape. |
| `litellm.completion()` async batching | Use the OpenAI SDK’s async client directly (`acreate` in pre-1.0 SDKs, `AsyncOpenAI` from 1.0 on; sketch below). |
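For the async-batching row, the fan-out LiteLLM gives you is a few lines with the plain OpenAI SDK. A sketch assuming `openai>=1.0` and the env configuration above:

```python
# Batch-style async fan-out with the stock OpenAI SDK and asyncio.gather.
# Model id is an assumption, as above.
import asyncio
from openai import AsyncOpenAI

client = AsyncOpenAI()  # env-configured, same as the sync client

async def ask(prompt: str) -> str:
    resp = await client.chat.completions.create(
        model="anthropic/claude-sonnet-4-6",
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content

async def main():
    answers = await asyncio.gather(*(ask(p) for p in ["a?", "b?", "c?"]))
    print(answers)

asyncio.run(main())
```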
If you’ve built a lot of glue around LiteLLM’s callback hooks, those map cleanly to prxy.monster custom modules. See SDK → Module interface for the migration target.