
Customer support bot

For chat agents that field hundreds of similar questions per day. Built around aggressive caching plus tight cost caps to keep per-conversation cost bounded.

What this pipeline is good at

  • 30–50% cache hit rate on common questions.
  • Hard $-per-conversation cap so a single user can’t drain your budget.
  • (v1.1) PII redaction before requests leave the gateway.
  • (v1.1) Smart routing — Haiku for simple questions, Sonnet for hard ones.

The pipeline

PRXY_PIPE='exact-cache,semantic-cache,cost-guard,patterns'

When v1.1 ships, add guardrails first and router last:

PRXY_PIPE='guardrails,exact-cache,semantic-cache,cost-guard,patterns,router'
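The v1 to v1.1 change is a small edit to the pipeline string. A minimal sketch of that edit, assuming PRXY_PIPE is a plain comma-separated list of module names as shown above; `upgrade_pipe` is a hypothetical helper for illustration, not part of prxy.monster:

```python
def upgrade_pipe(pipe: str) -> str:
    """Return a v1.1-style pipeline: guardrails first, router last.

    Hypothetical helper; assumes the pipeline is a comma-separated
    list of module names.
    """
    modules = [m.strip() for m in pipe.split(",") if m.strip()]
    # Drop existing occurrences so running this twice changes nothing.
    middle = [m for m in modules if m not in ("guardrails", "router")]
    return ",".join(["guardrails", *middle, "router"])


v1 = "exact-cache,semantic-cache,cost-guard,patterns"
print(upgrade_pipe(v1))
```

The order of the middle modules is left untouched, so the caching and cost-guard reasoning still applies after the upgrade.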

Why this order

  1. Caches first — exact then semantic. Most common questions hit the cache, never hit the provider.
  2. cost-guard after caches — no point burning cap budget on a request the cache would have answered.
  3. patterns last — only relevant for cache misses; injects context before the provider call.
  4. (v1.1) guardrails before everything — strip PII before any module sees the request.
  5. (v1.1) router after everything else — picks the cheapest model that can handle what’s left after all the optimizations.
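The ordering argument can be checked with a toy model. The sketch below is not how prxy.monster is implemented; `ToyPipeline`, its fields, and the lowercase cache key are all assumptions made for illustration, and the patterns module is left out. It shows only the property the list relies on: a cache hit returns before cost-guard is consulted, so it never consumes cap budget.

```python
from dataclasses import dataclass, field


@dataclass
class ToyPipeline:
    """Toy model of cache -> cost-guard -> provider ordering (illustrative only)."""
    cap_cents: int                    # hypothetical spend cap, in cents
    call_cost_cents: int = 2          # $0.02 per provider call
    spent_cents: int = 0
    cache: dict = field(default_factory=dict)

    def handle(self, question: str) -> str:
        key = question.strip().lower()  # toy key normalization (assumption)
        # 1. Cache first: a hit returns immediately and spends nothing.
        if key in self.cache:
            return "cache"
        # 2. cost-guard only ever sees cache misses.
        if self.spent_cents + self.call_cost_cents > self.cap_cents:
            return "blocked"
        # 3. Simulated provider call; the answer is cached for next time.
        self.spent_cents += self.call_cost_cents
        self.cache[key] = f"answer to {key!r}"
        return "provider"


pipe = ToyPipeline(cap_cents=4)
results = [pipe.handle(q) for q in [
    "How do I reset my password?",
    "how do i reset my password?",   # hit: costs nothing
    "Where is my invoice?",
    "Can I change my plan?",         # miss, but the cap is used up
    "Where is my invoice?",          # still served from cache after the cap
]]
print(results, pipe.spent_cents)
```

With cost-guard placed before the caches, the second and fifth requests would have been counted against the cap even though neither reaches the provider.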

Cost math

For a support bot doing 10,000 conversations/month, ~3 turns each:

Without prxy.monster: 30,000 calls × $0.02 each = $600/mo
With this pipeline: 18,000 cache misses × $0.02 + 12,000 hits × $0 = $360/mo
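The same arithmetic as a small script, so you can plug in your own volume. The 40% hit rate is the one implied by 12,000 hits out of 30,000 calls; `monthly_cost_dollars` is a hypothetical helper, and it assumes cache hits cost nothing, as the figures above do.

```python
def monthly_cost_dollars(conversations: int, turns: int,
                         call_cost_cents: int, hit_rate_pct: int) -> float:
    """Monthly provider spend, assuming cache hits are free."""
    calls = conversations * turns
    # Integer math keeps the miss count exact for whole-percent hit rates.
    misses = calls * (100 - hit_rate_pct) // 100
    return misses * call_cost_cents / 100


baseline = monthly_cost_dollars(10_000, 3, call_cost_cents=2, hit_rate_pct=0)
cached = monthly_cost_dollars(10_000, 3, call_cost_cents=2, hit_rate_pct=40)
print(baseline, cached)

# The 30-50% hit-rate range quoted for this pipeline brackets the result.
for pct in (30, 50):
    print(pct, monthly_cost_dollars(10_000, 3, 2, pct))
```

At the low end of the range (30%) the bill is $420/mo, and at the high end (50%) it is $300/mo.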

Plus cap protection: a single user stuck in a buggy loop at 10 calls/sec costs you at most $0.50 instead of $50.
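A rough check of those two numbers, assuming the same $0.02 per call as the cost math and that every looped call misses the cache. Under those assumptions, $50 corresponds to 2,500 uncapped calls, or about 250 seconds of looping. The function and constant names are hypothetical.

```python
from typing import Optional

CALL_COST_CENTS = 2    # $0.02 per call (assumed, from the cost math)
CAP_CENTS = 50         # $0.50 cap
CALLS_PER_SEC = 10     # runaway loop rate


def runaway_spend_cents(seconds: int, cap_cents: Optional[int]) -> int:
    """Spend from a loop running for `seconds`, with or without a cap."""
    uncapped = seconds * CALLS_PER_SEC * CALL_COST_CENTS
    if cap_cents is None:
        return uncapped
    # Calls are allowed only while the next one still fits under the cap.
    allowed_calls = cap_cents // CALL_COST_CENTS
    return min(uncapped, allowed_calls * CALL_COST_CENTS)


uncapped = runaway_spend_cents(250, cap_cents=None)
capped = runaway_spend_cents(250, cap_cents=CAP_CENTS)
seconds_to_cap = (CAP_CENTS // CALL_COST_CENTS) / CALLS_PER_SEC
print(uncapped / 100, capped / 100, seconds_to_cap)
```

In this model the cap is reached after 25 paid calls, about 2.5 seconds into the loop, and everything after that is rejected.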

Variants

Knowledge-base Q&A only (no per-user data):

pipeline:
  - semantic-cache:
      similarity: 0.85      # looser: KB answers tolerate more variance
      scope: 'global'
      ttlSeconds: 604800    # one week
  - cost-guard: { perRequest: 0.10 }
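To see what loosening the threshold does, here is a toy comparison. It assumes `similarity` is a cosine-similarity cutoff over embedding vectors, which this page does not confirm; the two-dimensional vectors and the `is_hit` helper are made up for illustration.

```python
import math


def cosine(a: list, b: list) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def is_hit(query: list, cached: list, threshold: float) -> bool:
    """Hypothetical hit rule: reuse the cached answer at or above the threshold."""
    return cosine(query, cached) >= threshold


query_vec = [1.0, 0.0]    # stand-in embedding of the incoming question
cached_vec = [7.0, 4.0]   # stand-in embedding of a cached question

score = cosine(query_vec, cached_vec)   # about 0.868
print(round(score, 3),
      is_hit(query_vec, cached_vec, 0.85),
      is_hit(query_vec, cached_vec, 0.90))
```

A pair of questions scoring around 0.87 would reuse the cached answer at 0.85 but fall through to the provider at 0.90, which is the trade-off between hit rate and answer precision that the threshold controls.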

High-traffic with strict redaction (when v1.1 lands):

pipeline:
  - guardrails:
      pii_redact: true
      custom_patterns: ['/sk-[a-zA-Z0-9]{32,}/']
  - exact-cache: { ttlSeconds: 3600 }
  - semantic-cache: { similarity: 0.90, scope: 'global' }
  - cost-guard: { perRequest: 0.05, perDay: 0.50 }
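It is worth testing a custom pattern locally before relying on it. The sketch below applies the same expression with Python's `re` module, minus the `/.../` delimiters. How guardrails actually rewrites a match is not specified here, so the `[REDACTED]` placeholder and the `redact` helper are assumptions.

```python
import re

# Same expression as the custom_patterns entry, without the /.../ delimiters.
API_KEY = re.compile(r"sk-[a-zA-Z0-9]{32,}")


def redact(text: str, placeholder: str = "[REDACTED]") -> str:
    """Replace every match with a placeholder (hypothetical behavior)."""
    return API_KEY.sub(placeholder, text)


long_key_msg = "My key is sk-" + "a1B2" * 8 + ", please help."
short_key_msg = "sk-abc123 is too short to match"

print(redact(long_key_msg))
print(redact(short_key_msg))
```

The `{32,}` quantifier means anything shorter than 32 characters after `sk-` passes through unchanged, so check it against the key formats you actually expect to see.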
