From ultraprompt
**DEFAULT for AI agent safety reviews — dispatches security-auditor + risk-and-controls-reviewer with AI/agent safety focus.**
How this skill is triggered — by the user, by Claude, or both
Slash command
/ultraprompt:ai-agent-safety-review [system|tool|prompt|focus]When to use
Manual-only. Invoke for AI/LLM safety review: tool-calling boundaries, prompt injection vectors, retrieval trust, memory/context handling, autonomy controls, or prompt hardening for a specific prompt or skill body.
[system|tool|prompt|focus]This skill is limited to the following tools:
The summary Claude sees in its skill listing — used to decide when to auto-load this skill
Apply discipline per `${CLAUDE_PLUGIN_ROOT}/_shared/DISCIPLINE.md` (covers `$ARGUMENTS` handling, evidence, validation, and safety).
Apply discipline per ${CLAUDE_PLUGIN_ROOT}/_shared/DISCIPLINE.md (covers $ARGUMENTS handling, evidence, validation, and safety).
Dispatch target: ultraprompt:auditor (focus: ai-safety). See ${CLAUDE_PLUGIN_ROOT}/_shared/DISPATCH-POLICY.md for the full V8 dispatch decision tree, Task call template, and inline-override conditions.
LLM systems have a unique attack surface: prompts and retrieved content are mixed with instructions, and the model treats both as authoritative. Trust boundaries must be explicit (this is data, that is instruction). Tool-calling is privilege escalation: each tool a model can call is an action it can take. Autonomy controls (human-in-loop, confirmation, audit) are part of the design, not afterthoughts.
Run prompt-injection eval cases (test corpus of known injection strings). Test tool argument validation with adversarial inputs. Test memory isolation across simulated users. For prompts: run before/after eval suite if available.
Schema below + ${CLAUDE_PLUGIN_ROOT}/_shared/OUTPUT-CONTRACT.md + concise-review style.
schema:
- field: Scope
type: section
required: true
evidence_rule: "none"
- field: Trust Boundary Map
type: section
required: true
evidence_rule: "none"
- field: Tool Audit
type: section
required: true
evidence_rule: "none"
- field: Retrieval Audit
type: section
required: true
evidence_rule: "none"
- field: Memory/Context Audit
type: section
required: true
evidence_rule: "none"
- field: Autonomy Audit
type: section
required: true
evidence_rule: "none"
- field: Hardenings Applied
type: section
required: true
evidence_rule: "none"
- field: Remaining Risks
type: section
required: true
evidence_rule: "named risk + likelihood + impact"
Scope | Trust Boundary Map | Tool Audit (per tool: side effect, confirmation gate, hardening) | Retrieval Audit | Memory/Context Audit | Autonomy Audit | Hardenings Applied | Remaining Risks
Dispatch auditor with focus=ai-safety. See _shared/playbooks/prompt-injection-patterns.md and _shared/playbooks/prompt-hardening-checklist.md.
This skill answers to V4 names: prompt-hardening. The router resolves them to ai-agent-safety-review and notes the alias in its response.
Guides creation, editing, and verification of skills for AI coding agents using test-driven development with subagent scenarios. Use when authoring or debugging skills.
npx claudepluginhub sokoliem/ultraprompt --plugin ultraprompt