From Setup Eval
Deep security audit of the agent setup. Runs all deterministic security rules (prompt injection, credential access, data exfiltration, obfuscation, reverse shells, AST behavioral analysis, taint tracking, MCP permission analysis, tool poisoning, YARA signatures, CVE lookups) plus LLM-based semantic security review. Use when the user asks about security, safety, wants to audit their setup, or needs a pre-deployment security check.
Slash command
/setup-eval:setup-eval-security
Deep security audit combining deterministic checks with semantic analysis. Two stages: fast pattern-based scanning, then qualitative review of flagged components.
Before doing anything else, ask the user:
Where should I present the results?
- Terminal - print the report here in the conversation
- File - write a markdown report to a file (you'll choose the path)
Wait for their answer before proceeding.
Determine the setup path. If the user doesn't specify one, use the current working directory.
uv run python skills/setup-eval-security/scripts/run_security_scan.py <setup-path>
If the user has a ~/.claude/ directory, pass it as the second argument:
uv run python skills/setup-eval-security/scripts/run_security_scan.py <setup-path> ~/.claude
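The choice between the two invocations above can be sketched in Python. This is only an illustration: the helper name `build_scan_command` and its `home` parameter are hypothetical and not part of the plugin. It defaults to the current working directory and appends `~/.claude` only when that directory exists.

```python
from pathlib import Path

# Script path copied from the commands above.
SCRIPT = "skills/setup-eval-security/scripts/run_security_scan.py"


def build_scan_command(setup_path=None, home=None):
    """Return the argv list for the scan (hypothetical helper).

    setup_path: directory to audit; falls back to the current working directory.
    home: home directory to probe for .claude; falls back to Path.home().
    """
    setup = Path(setup_path) if setup_path else Path.cwd()
    home_dir = Path(home) if home else Path.home()

    cmd = ["uv", "run", "python", SCRIPT, str(setup)]

    # Pass ~/.claude as the second argument only if it actually exists.
    claude_dir = home_dir / ".claude"
    if claude_dir.is_dir():
        cmd.append(str(claude_dir))
    return cmd
```

The returned list could be handed to `subprocess.run(cmd, capture_output=True, text=True)` to capture the JSON the script prints.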
Read the JSON output. Note which checks were skipped and why.
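As a rough sketch of picking out the skipped checks: the scanner's real output schema is not shown here, so the field names below (`checks`, `name`, `status`, `reason`) and the sample data are assumptions for illustration only. Adjust them to whatever the script actually emits.

```python
import json


def skipped_checks(raw_json):
    """Return (name, reason) pairs for checks marked as skipped.

    ASSUMPTION: a hypothetical schema with a top-level "checks" list whose
    items carry "name", "status", and "reason" keys. The real scanner output
    may be shaped differently.
    """
    report = json.loads(raw_json)
    return [
        (check.get("name", "<unnamed>"), check.get("reason", "no reason given"))
        for check in report.get("checks", [])
        if check.get("status") == "skipped"
    ]


# Made-up sample data, not real scanner output.
sample = json.dumps({
    "checks": [
        {"name": "prompt_injection", "status": "ran"},
        {"name": "yara", "status": "skipped", "reason": "yara not installed"},
    ]
})

for name, reason in skipped_checks(sample):
    print(f"skipped: {name} ({reason})")
```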
For every component that has security findings, read the actual file content. You need the real content for the semantic review.
Read rubric/security-review-rubric.md for the review criteria and output format.
For each component, answer the 4 security checks from the rubric. Prioritize components with deterministic findings, but check all components. Use the exact format specified in the rubric (CLEAN/FLAG per check, with evidence).
Read report-format.md and format the combined results following that structure.
Include:
At the very end of the report, include the exact timing:
Evaluated with: setup-eval v{version} (claude-code-plugin)
Duration: [X minutes Y seconds]
Get {version} by running: uv run python -c "import importlib.metadata; print(importlib.metadata.version('setup-eval'))"
Record the timestamp of your first tool call in Step 2 and compute the exact difference when you finish.
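The footer lines can be assembled along these lines. This is a minimal sketch: `format_duration` and `report_footer` are hypothetical helper names, and the `unknown` fallback for a missing package is an assumption, not documented plugin behavior.

```python
import importlib.metadata


def format_duration(seconds):
    """Render elapsed seconds as 'X minutes Y seconds'."""
    minutes, secs = divmod(int(round(seconds)), 60)
    return f"{minutes} minutes {secs} seconds"


def report_footer(start, end, package="setup-eval"):
    """Build the two footer lines from start/end timestamps in seconds."""
    try:
        version = importlib.metadata.version(package)
    except importlib.metadata.PackageNotFoundError:
        # Assumed fallback for when the package is not installed.
        version = "unknown"
    return (
        f"Evaluated with: setup-eval v{version} (claude-code-plugin)\n"
        f"Duration: {format_duration(end - start)}"
    )
```

For example, taking `time.time()` at the first tool call and again on completion, then passing both values to `report_footer`, yields the two lines in the format shown above.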
npx claudepluginhub redhat-community-ai-tools/harness-eval-lab --plugin setup-eval