guardrails
Category: safety · Cloud + Local · Status: v1.0 — production (regex backend)
Content filtering at the gateway layer instead of in your app code.
What it does
Pluggable content filtering. v1 ships the regex backend — local, zero-cost, no third-party calls.
- PII redaction: emails, phone numbers, SSNs, credit cards (16-digit) replaced with placeholders before the request leaves the gateway.
- Profanity block: short built-in list, can be extended via custom_patterns.
- Custom regex policies: your own deny/redact rules.
When to use it
- Customer-facing apps where users send unmoderated input.
- Compliance-sensitive workloads (SOC 2, HIPAA, etc.).
- Internal tools where you need to keep secrets out of model logs (for example, blocking a pasted API key).
Configuration
guardrails:
  pii_redact: true          # default false
  profanity_block: false    # default false
  custom_patterns:
    - 'sk-[a-zA-Z0-9]{20,}' # block accidental API key paste
    - '^password:'          # block "password:" leading lines
  backend: 'regex'          # only 'regex' in v1
  on_pii: 'redact'          # 'redact' | 'block' | 'log-only'
Behavior on violation
| Match type | Default action |
|---|---|
| PII (email/SSN/card/phone) | Redact in place, continue |
| Profanity | 400 short-circuit |
| Custom pattern | 400 short-circuit |
Set on_pii: 'block' to short-circuit on PII too. Set on_pii: 'log-only' to only count matches in metadata without mutating the request.
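The table and the three on_pii modes can be pictured as a small dispatch function. The following is a minimal Python sketch, not the gateway's implementation: apply_guardrails, Result, and the email-only EMAIL pattern are hypothetical names, and checking custom patterns before PII is an assumption, since the evaluation order is not stated here. The value of blocked_by for a PII block is also not documented, so the sketch leaves it unset.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Simplified stand-in for the built-in PII patterns (email only, for brevity).
EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


@dataclass
class Result:
    status: int                       # 200 = forwarded, 400 = short-circuited
    text: str                         # text that would be sent upstream
    pii_matches: int = 0              # number of PII strings matched
    blocked_by: Optional[str] = None  # set only for custom-pattern blocks here


def apply_guardrails(text, on_pii="redact", custom_patterns=()):
    # Assumption: custom patterns are evaluated first and short-circuit with 400.
    for pattern in custom_patterns:
        if re.search(pattern, text):
            return Result(400, text, blocked_by=f"custom:{pattern}")

    matches = len(EMAIL.findall(text))
    if matches and on_pii == "block":
        # Short-circuit on PII; the documented blocked_by value is unknown.
        return Result(400, text, pii_matches=matches)
    if matches and on_pii == "redact":
        text = EMAIL.sub("[REDACTED_EMAIL]", text)
    # 'log-only' falls through: matches are counted, text is left untouched.
    return Result(200, text, pii_matches=matches)
```

Calling apply_guardrails("contact bob@example.com") under these assumptions yields a 200 result whose text is "contact [REDACTED_EMAIL]", while on_pii="log-only" returns the original text with pii_matches set to 1.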
PII patterns built in
- Email → [REDACTED_EMAIL]
- US SSN (123-45-6789) → [REDACTED_SSN]
- 16-digit credit card (4111-1111-1111-1111) → [REDACTED_CARD]
- North-American phone ((555) 123-4567) → [REDACTED_PHONE]
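To make the placeholder mapping concrete, here is a rough Python sketch of regex-based redaction for the four categories above. The expressions are illustrative guesses that match the sample values in the list; the gateway's actual patterns are not shown in this document and may be stricter or looser. PII_PATTERNS and redact_pii are hypothetical names.

```python
import re

# Illustrative patterns only; assumed, not taken from the gateway source.
PII_PATTERNS = [
    ("[REDACTED_EMAIL]", re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")),
    ("[REDACTED_SSN]", re.compile(r"\b\d{3}-\d{2}-\d{4}\b")),
    # Four groups of four digits, optionally separated by a space or hyphen.
    ("[REDACTED_CARD]", re.compile(r"\b\d{4}(?:[ -]?\d{4}){3}\b")),
    # "(555) 123-4567" or "555-123-4567" style numbers.
    ("[REDACTED_PHONE]",
     re.compile(r"(?:\(\d{3}\)\s?|\b\d{3}[-.\s])\d{3}[-.\s]\d{4}\b")),
]


def redact_pii(text):
    """Replace each PII match with its placeholder and return (text, count)."""
    total = 0
    for placeholder, pattern in PII_PATTERNS:
        text, n = pattern.subn(placeholder, text)
        total += n
    return text, total
```

The returned count corresponds loosely to what a redaction counter would track: one increment per matched string, regardless of category.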
Metrics emitted
- guardrails.backend: which backend ran.
- guardrails.stats.pii_redactions: how many PII strings matched.
- guardrails.stats.blocked_by: 'profanity' or 'custom:<pattern>' when short-circuited.
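A consumer reading these metrics might look them up by dotted name. The sketch below assumes the metadata arrives as a nested dictionary keyed by the path segments; that shape is a guess based on the metric names, and both get_metric and the sample metadata are hypothetical.

```python
def get_metric(metadata, dotted_key, default=None):
    """Look up a dotted metric name such as 'guardrails.stats.pii_redactions'."""
    node = metadata
    for part in dotted_key.split("."):
        if not isinstance(node, dict) or part not in node:
            return default
        node = node[part]
    return node


# Hypothetical metadata for a request that was redacted but not blocked.
metadata = {
    "guardrails": {
        "backend": "regex",
        "stats": {"pii_redactions": 2},
    }
}

backend = get_metric(metadata, "guardrails.backend")
redactions = get_metric(metadata, "guardrails.stats.pii_redactions", 0)
blocked_by = get_metric(metadata, "guardrails.stats.blocked_by")  # absent here
```

Returning a default for missing keys keeps the lookup safe for requests that were never short-circuited, where blocked_by would presumably not be set.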
Roadmap
- v1.1: backend: 'callout' for NVIDIA NIM / Anthropic Constitutional / OpenAI Moderation backends. Same config surface; just point at a different inference engine.