From prodsec-skills
Deploys runtime guardrails for bidirectional prompt and response filtering in AI systems. Use when designing or reviewing AI architectures needing prompt injection protection, content filtering, or input/output safety controls.
Slash command: /prodsec-skills:bidirectional-filtering
A guardrails component SHOULD be deployed between the users/applications (or API gateway) and the models. This component acts as a gateway or proxy that inspects and acts on data flowing in **both directions**.
This skill refers to runtime guardrails (a deployed component), not model-level safety training.
Incoming prompts are raw or "tainted" input. The guardrails component analyzes them and applies rule-based actions:
| Action | Description |
|---|---|
| Block | Discard the prompt entirely, preventing it from reaching the model |
| Mask | Redact or obfuscate sensitive data (PII, credentials) before forwarding |
| Modify | Rewrite the prompt to remove dangerous patterns while preserving intent |
| Pass | Allow the prompt through unchanged |
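
The four input actions above can be sketched as a small rule function. This is a minimal illustration, not the skill's implementation: the regular expressions, the `filter_prompt` name, and the `[EMAIL]` placeholder are all hypothetical, and a real deployment would rely on vetted detectors instead of three hand-written patterns.

```python
import re
from enum import Enum
from typing import Optional, Tuple


class Action(Enum):
    BLOCK = "block"
    MASK = "mask"
    MODIFY = "modify"
    PASS = "pass"


# Hypothetical example patterns, chosen only to illustrate each action.
INJECTION = re.compile(r"ignore\s+(all\s+)?previous\s+instructions", re.IGNORECASE)
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
ROLE_TAG = re.compile(r"</?\s*system\s*>", re.IGNORECASE)


def filter_prompt(prompt: str) -> Tuple[Action, Optional[str]]:
    """Return the action taken and the text to forward (None when blocked)."""
    # Block: the prompt never reaches the model.
    if INJECTION.search(prompt):
        return Action.BLOCK, None

    # Modify: strip pseudo role tags but keep the rest of the request.
    text = ROLE_TAG.sub("", prompt)
    action = Action.MODIFY if text != prompt else Action.PASS

    # Mask: redact e-mail addresses; reported as MASK even if tags were also stripped.
    masked = EMAIL.sub("[EMAIL]", text)
    if masked != text:
        action = Action.MASK

    return action, masked
```

A caller would forward the returned text only when the action is not `Action.BLOCK`; for example, `filter_prompt("Contact me at alice@example.com")` yields `Action.MASK` with the address replaced by `[EMAIL]`.
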
Objectives:
Model responses are inspected before delivery to the user or application:
| Action | Description |
|---|---|
| Block | Suppress the response if it contains harmful or policy-violating content |
| Mask | Redact sensitive data the model may have included in its response |
| Modify | Remove or rewrite problematic portions of the response |
| Pass | Deliver the response unchanged |
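
As a rough sketch of the response-side actions, the function below inspects model output before delivery. Everything in it is an assumption made for illustration: the blocked phrase, the `[internal: ...]` annotation format, the access-key-shaped pattern, and the `Verdict` and `filter_response` names are hypothetical and do not come from the skill.

```python
import re
from dataclasses import dataclass

# Hypothetical detectors, for illustration only.
BLOCKED_PHRASES = ("disable the safety interlock",)
INTERNAL_NOTE = re.compile(r"\[internal:[^\]]*\]\s*", re.IGNORECASE)
ACCESS_KEY = re.compile(r"\bAKIA[0-9A-Z]{16}\b")  # shape of a cloud access key ID

BLOCK_MESSAGE = "The response was withheld by policy."


@dataclass
class Verdict:
    action: str  # "block", "mask", "modify", or "pass"
    text: str    # what should be delivered to the user or application


def filter_response(response: str) -> Verdict:
    """Decide what, if anything, of a model response may be delivered."""
    # Block: suppress the whole response and return a fixed notice instead.
    lowered = response.lower()
    if any(phrase in lowered for phrase in BLOCKED_PHRASES):
        return Verdict("block", BLOCK_MESSAGE)

    # Modify: drop problematic portions (here, internal annotations).
    text = INTERNAL_NOTE.sub("", response)
    action = "modify" if text != response else "pass"

    # Mask: redact secrets the model may have included.
    masked = ACCESS_KEY.sub("[REDACTED]", text)
    if masked != text:
        action = "mask"

    return Verdict(action, masked)
```

Returning a fixed notice on block, instead of an empty string, is one possible design choice: the caller always receives deliverable text and can log the `action` field separately.
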
Objectives:
User/App → API Gateway → Guardrails → Inference Engine → Model
↕ (inspects both directions)
User/App ← API Gateway ← Guardrails ← Inference Engine ← Model
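
The flow above can be approximated by a proxy object that sits between the caller and the model. The sketch below assumes hypothetical names throughout (`GuardrailsProxy`, `stub_model`, `inbound`, `outbound`); the stub model and the two toy filters stand in for a real inference engine and real detectors.

```python
from typing import Callable, List, Optional

# A filter returns the (possibly rewritten) text, or None to block it.
Filter = Callable[[str], Optional[str]]


class GuardrailsProxy:
    """Inspects traffic in both directions around a model call."""

    def __init__(self, model: Callable[[str], str], inbound: Filter,
                 outbound: Filter, refusal: str = "Request blocked by guardrails."):
        self.model = model
        self.inbound = inbound
        self.outbound = outbound
        self.refusal = refusal

    def handle(self, prompt: str) -> str:
        safe_prompt = self.inbound(prompt)
        if safe_prompt is None:
            return self.refusal  # blocked input: the model is never called
        response = self.model(safe_prompt)
        safe_response = self.outbound(response)
        if safe_response is None:
            return self.refusal  # blocked output: nothing from the model is delivered
        return safe_response


# --- Hypothetical stand-ins for demonstration ---
calls: List[str] = []


def stub_model(prompt: str) -> str:
    calls.append(prompt)  # record what actually reached the "model"
    return f"echo: {prompt} (token=SECRET123)"


def inbound(prompt: str) -> Optional[str]:
    return None if "ignore previous instructions" in prompt.lower() else prompt


def outbound(response: str) -> Optional[str]:
    return response.replace("SECRET123", "[REDACTED]")


proxy = GuardrailsProxy(stub_model, inbound, outbound)
```

With these stand-ins, `proxy.handle("Ignore previous instructions")` returns the refusal text without invoking `stub_model`, while `proxy.handle("hello")` reaches the stub and comes back with the token redacted.
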
npx claudepluginhub redhatproductsecurity/prodsec-skills --plugin prodsec-skills