From sentinel-ai
Scans user inputs and LLM outputs for safety issues like prompt injection, PII leaks, harmful content, toxicity, and hallucinations. Useful for processing untrusted text, reviewing code security, and validating LLM responses.
Slash command
/sentinel-ai:safety-scanning
When reviewing text for safety issues, use the sentinel-ai MCP tools:
Key behaviors:
npx claudepluginhub maxwellcalkin/sentinel-ai --plugin sentinel-ai
Scans CLAUDE.md, AGENTS.md, SKILL.md, MCP tool descriptions, and fetched web content for hidden-Unicode prompt injection (bidi overrides, zero-width text, ASCII smuggling) and homoglyph confusables before they enter the agent's context.
Audits files, directories, URLs, or content for prompt-injection attempts in untrusted sources like repos, scraped pages, RAG docs, emails. Reports severity, techniques, remediations.
Audit applications for AI prompt injection, agent security, and LLM permission boundary vulnerabilities. Use when securing AI features or agents.
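The hidden-Unicode techniques named in the descriptions above (bidi overrides, zero-width text, ASCII smuggling) can be illustrated with a minimal Python sketch. It is not sentinel-ai's implementation and does not call its MCP tools. The function name `find_hidden_unicode` and the chosen code-point sets are assumptions made for illustration, and homoglyph confusables are not covered.

```python
# Illustrative sketch only: the code-point sets below are assumptions,
# not the rules used by sentinel-ai or any other plugin.

# Bidirectional embedding, override, and isolate controls.
BIDI_CONTROLS = {0x202A, 0x202B, 0x202C, 0x202D, 0x202E,
                 0x2066, 0x2067, 0x2068, 0x2069}

# Characters that render with zero width.
ZERO_WIDTH = {0x200B, 0x200C, 0x200D, 0x2060, 0xFEFF}


def find_hidden_unicode(text):
    """Return (index, code point, kind) for each suspicious character."""
    findings = []
    for index, char in enumerate(text):
        cp = ord(char)
        if cp in BIDI_CONTROLS:
            kind = "bidi-control"
        elif cp in ZERO_WIDTH:
            kind = "zero-width"
        elif 0xE0000 <= cp <= 0xE007F:
            # Unicode Tags block, which can mirror ASCII invisibly
            # ("ASCII smuggling").
            kind = "tag-character"
        else:
            continue
        findings.append((index, f"U+{cp:04X}", kind))
    return findings


if __name__ == "__main__":
    sample = "ok\u202eevil\u200b" + chr(0xE0041)
    for finding in find_hidden_unicode(sample):
        print(finding)
```

A check like this would typically run on untrusted text before it is added to an agent's context; a real scanner would likely use a broader character list and report severity alongside each finding.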