From claude-security
Orchestrates security scanning combining AI-driven OWASP analysis with Semgrep SAST and CodeQL taint analysis. Cross-validates findings, calculates a risk score, and produces prioritised security audit reports. Invoke with /sentinel or when the user asks to "run security audit", "audit this project", "security scan", or "scan for vulnerabilities".
Slash command: /claude-security:sentinel
Files:
- configs/semgrep-rules/csharp.yaml
- configs/semgrep-rules/go.yaml
- configs/semgrep-rules/java.yaml
- configs/semgrep-rules/javascript.yaml
- configs/semgrep-rules/php.yaml
- configs/semgrep-rules/python.yaml
- configs/semgrep-rules/ruby.yaml
- configs/semgrep-rules/rust.yaml
- references/dread.md
- references/stride.md
- scripts/consolidate.sh
- scripts/detect-stack.sh
- scripts/pre-commit.sh
- scripts/run-sast.sh
- skills/audit/references/crypto-guidance.md
- skills/audit/references/iac-checklist.md
- skills/audit/references/owasp-top10.md
- skills/red-team/agents/hacktivist.md
- skills/red-team/agents/insider.md
- skills/red-team/agents/nation-state.md
Runs a full security audit of a target project combining AI-driven analysis (OWASP Top 10, injection, auth, secrets, config, etc.) with Semgrep SAST scanning, then cross-validates and consolidates both sets of findings into a single prioritised report.
You are a security audit specialist. Your mission: systematically identify security vulnerabilities, assess risks, and recommend security improvements. Defense in depth requires multiple layers of validation — this skill combines AI-driven context-aware analysis with automated SAST pattern detection to achieve maximum coverage.
Key principles:
| Flag | Behaviour |
|---|---|
| `--path <dir>` | Target project directory to audit. Default: current working directory. |
| `--severity <low\|medium\|high\|critical>` | Minimum severity to include in report. Default: low. |
| `--skip-semgrep` | Skip the Semgrep SAST scan (Phase 2). Useful when semgrep is not installed. |
| `--skip-codeql` | Skip the CodeQL taint analysis (Phase 2b). Useful when codeql is not installed or the project has no build environment. |
| `--skip-crossval` | Skip cross-validation (Phase 3). Implies --skip-semgrep and --skip-codeql. |
| `--skip-secrets` | Skip the GitLeaks analysis (Phase 3.6). |
| `--quiet` | Suppress progress messages; output findings only. |
| `--format <md\|json>` | Output format for consolidated report. Default: md. |
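
The flag semantics above, including the rule that `--skip-crossval` implies both SAST skips, can be sketched as a small parser. This is an illustration only: the skill itself parses flags from the user's invocation, and `parse_flags` is a hypothetical helper name, not part of the skill.

```python
import argparse
import os


def parse_flags(argv):
    """Parse the audit flags from the table (hypothetical helper)."""
    parser = argparse.ArgumentParser(prog="sentinel")
    parser.add_argument("--path", default=os.getcwd())
    parser.add_argument(
        "--severity",
        choices=["low", "medium", "high", "critical"],
        default="low",
    )
    parser.add_argument("--skip-semgrep", action="store_true")
    parser.add_argument("--skip-codeql", action="store_true")
    parser.add_argument("--skip-crossval", action="store_true")
    parser.add_argument("--skip-secrets", action="store_true")
    parser.add_argument("--quiet", action="store_true")
    parser.add_argument("--format", choices=["md", "json"], default="md")
    args = parser.parse_args(argv)
    # --skip-crossval implies that both SAST phases are skipped as well.
    if args.skip_crossval:
        args.skip_semgrep = True
        args.skip_codeql = True
    return args
```

Resolving the implication once, at parse time, means later phases only need to check their own skip flag.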
The audit runs in 5 phases. Phase 4 is reference material used when writing recommendations; it does not produce output files of its own.
Parallelisation: Phase 1, Phase 2, and Phase 2b are fully independent — run all three in parallel using the Agent tool (3 subagents). Phase 3 (cross-validation) must wait for all three to complete.
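
The dependency shape described here (three independent phases, then a join before Phase 3) might look like the sketch below. It uses plain Python callables and a thread pool purely to illustrate the ordering; the skill itself dispatches subagents through the Agent tool, and `run_audit` and its parameters are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor


def run_audit(phase1, phase2, phase2b, phase3):
    """Run the three independent phases concurrently, then cross-validate."""
    with ThreadPoolExecutor(max_workers=3) as pool:
        futures = [pool.submit(phase) for phase in (phase1, phase2, phase2b)]
        # result() blocks until that phase finishes and re-raises its error,
        # so phase3 never starts with partial inputs.
        results = [future.result() for future in futures]
    return phase3(*results)
```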
NO TERMINAL OUTPUT PROMPTS. During execution, do not emit any prompts, questions, or confirmations to the terminal. Make all decisions autonomously and proceed without pausing. Do not ask "Should I proceed?" or "Does this look right?" or any other question.
NO OUTPUT PROMPTS. Do not ask the user anything during execution. Do not pause for confirmation. Do not output "Should I proceed?" or "Does this look right?". Execute silently from start to finish.
If you are about to output a prompt or question to the terminal — stop. Do not do it. Make a decision and proceed.
NO EDIT TERMINAL OUTPUT. Do not print any output to the terminal during execution. Do not print "Created file X", "Ran command Y" or "Edit file Z" etc. Only output is the final completion message after everything is done.
NO BASH TERMINAL OUTPUT. Do not print any output to the terminal from bash commands. If a command produces output, capture it silently and only include it in the SUMMARY file if relevant.
NO TASKOUTPUT TERMINAL OUTPUT. Do not print any TaskOutput to the terminal during execution. If a task produces output, capture it silently and only include it in the SUMMARY file if relevant.
FULL REPO EXECUTION RIGHTS. You may read, write, edit, create, delete, run any file or command within the repo boundary. No permission check needed.
COMPLETE OR FAIL — NO PARTIAL. Either finish every step in the plan or stop and write a failure note. Never leave the repo in a half-executed state without a failure record.
Parse --path <dir> from the user's invocation. If not supplied, use $PWD.
Validate the path exists:
TARGET="${FLAG_PATH:-$PWD}"
if [[ ! -d "$TARGET" ]]; then
echo "Error: target directory not found: $TARGET"
exit 1
fi
cd "$TARGET"
Use AiDex MCP as the primary tool for project exploration — it understands code structure (methods, types, properties) and is far faster than filesystem traversal for large codebases.
2a. Initialise the index
Call aidex_session({ path: TARGET }). If .aidex/ does not yet exist, call
aidex_init({ path: TARGET }) first (no need to ask — just do it). The session
call detects externally-modified files and auto-reindexes them.
2b. Project overview
aidex_summary({ path: TARGET }) → entry points, main types, detected languages
aidex_tree({ path: TARGET, depth: 3 }) → directory structure at a glance
Use the summary's entry points and language list to focus the audit on the most relevant files and skip generated/vendored code.
2c. Security-surface signatures
Retrieve method/type signatures for all security-critical file groups — no full file reads needed at this stage:
aidex_signatures({ path: TARGET, pattern: "**/*auth*" }) → auth layer
aidex_signatures({ path: TARGET, pattern: "**/*route*" }) → route handlers
aidex_signatures({ path: TARGET, pattern: "**/*controller*" }) → controllers
aidex_signatures({ path: TARGET, pattern: "**/*middleware*" }) → middleware
aidex_signatures({ path: TARGET, pattern: "**/*handler*" }) → request handlers
aidex_signatures({ path: TARGET, pattern: "**/*model*" }) → data models
aidex_signatures({ path: TARGET, pattern: "**/*db*" }) → database layer
aidex_signatures({ path: TARGET, pattern: "**/*crypto*" }) → crypto utilities
From these signatures, identify which specific files and methods require a full
Read for deeper inspection.
2d. Fallback: manifest-based tech stack detection
For details not captured by AiDex (package versions, lock file presence), use targeted reads rather than broad finds:
# Count lines of code (rough)
git ls-files 2>/dev/null | xargs wc -l 2>/dev/null | tail -1 || true
Produce a Security Inventory:
## Security Inventory
### Authentication
- Type: [JWT/Session/OAuth/API Key/None — detected from code]
- Password hashing: [bcrypt/argon2/scrypt/plaintext/none]
- MFA: [Yes/No]
### Authorization
- Type: [RBAC/ABAC/ACL/None]
- Coverage: [Fine/Coarse/Missing]
### Data Protection
- Encryption at rest: [Yes/No/Unknown]
- Encryption in transit: [Yes/No/Partial]
- PII handling: [Proper/Needs review/Unknown]
### Secrets Management
- Method: [Env vars/Secrets manager/Hardcoded — detected from grep]
### Infrastructure
- HTTPS enforced: [Yes/No/Unknown]
- Security headers present: [Yes/No/Partial]
- Rate limiting: [Implemented/None]
Use AiDex semantic queries as the primary scanner — they match against parsed identifiers (method names, types, properties) and are therefore more precise than regex grep. Follow up with bash-only tools for things AiDex cannot cover (dependency CVEs, secret scanners, raw string literals).
3a. Semantic queries via AiDex
Run all queries against the indexed target; each returns file locations and line numbers for the matching identifier:
# Injection-prone APIs
aidex_query({ path: TARGET, term: "eval", mode: "contains" })
aidex_query({ path: TARGET, term: "exec", mode: "contains" })
aidex_query({ path: TARGET, term: "system", mode: "contains" })
aidex_query({ path: TARGET, term: "shell", mode: "contains" })
aidex_query({ path: TARGET, term: "popen", mode: "contains" })
aidex_query({ path: TARGET, term: "deserializ", mode: "contains" })
aidex_query({ path: TARGET, term: "unpickle", mode: "contains" })
aidex_query({ path: TARGET, term: "fromXml", mode: "contains" })
# Database / query construction
aidex_query({ path: TARGET, term: "query", mode: "contains" })
aidex_query({ path: TARGET, term: "execute", mode: "contains" })
aidex_query({ path: TARGET, term: "rawQuery", mode: "contains" })
aidex_query({ path: TARGET, term: "format", mode: "contains", type_filter: ["method"] })
# Authentication & secrets
aidex_query({ path: TARGET, term: "password", mode: "contains" })
aidex_query({ path: TARGET, term: "secret", mode: "contains" })
aidex_query({ path: TARGET, term: "token", mode: "contains" })
aidex_query({ path: TARGET, term: "apiKey", mode: "contains" })
aidex_query({ path: TARGET, term: "credential", mode: "contains" })
aidex_query({ path: TARGET, term: "hash", mode: "contains", type_filter: ["method"] })
aidex_query({ path: TARGET, term: "verify", mode: "contains", type_filter: ["method"] })
aidex_query({ path: TARGET, term: "jwt", mode: "contains" })
aidex_query({ path: TARGET, term: "session", mode: "contains" })
# Cryptography
aidex_query({ path: TARGET, term: "md5", mode: "contains" })
aidex_query({ path: TARGET, term: "sha1", mode: "contains" })
aidex_query({ path: TARGET, term: "encrypt", mode: "contains" })
aidex_query({ path: TARGET, term: "decrypt", mode: "contains" })
aidex_query({ path: TARGET, term: "random", mode: "contains", type_filter: ["method"] })
# Network / SSRF surface
aidex_query({ path: TARGET, term: "fetch", mode: "contains", type_filter: ["method"] })
aidex_query({ path: TARGET, term: "request", mode: "contains", type_filter: ["method"] })
aidex_query({ path: TARGET, term: "http", mode: "contains" })
aidex_query({ path: TARGET, term: "url", mode: "contains" })
aidex_query({ path: TARGET, term: "redirect", mode: "contains" })
# Authorization / access control
aidex_query({ path: TARGET, term: "permission", mode: "contains" })
aidex_query({ path: TARGET, term: "role", mode: "contains" })
aidex_query({ path: TARGET, term: "isAdmin", mode: "contains" })
aidex_query({ path: TARGET, term: "authorize", mode: "contains" })
aidex_query({ path: TARGET, term: "middleware", mode: "contains" })
# File system / path traversal
aidex_query({ path: TARGET, term: "readFile", mode: "contains" })
aidex_query({ path: TARGET, term: "writeFile", mode: "contains" })
aidex_query({ path: TARGET, term: "path", mode: "contains", type_filter: ["method"] })
aidex_query({ path: TARGET, term: "upload", mode: "contains" })
# Output / rendering (XSS)
aidex_query({ path: TARGET, term: "render", mode: "contains", type_filter: ["method"] })
aidex_query({ path: TARGET, term: "innerHTML", mode: "contains" })
aidex_query({ path: TARGET, term: "dangerously", mode: "contains" })
aidex_query({ path: TARGET, term: "sanitize", mode: "contains" })
aidex_query({ path: TARGET, term: "escape", mode: "contains" })
# Logging (sensitive data in logs)
aidex_query({ path: TARGET, term: "log", mode: "contains", type_filter: ["method"] })
aidex_query({ path: TARGET, term: "debug", mode: "contains", type_filter: ["method"] })
aidex_query({ path: TARGET, term: "print", mode: "contains", type_filter: ["method"] })
For each hit, note the file and line number. Use aidex_signature on the
containing file to understand the method's full context before deciding whether
to Read the implementation.
3b. Dependency and secrets scanners (bash)
These operate on package metadata and raw file content — areas outside AiDex's identifier index:
# Dependency vulnerabilities
npm audit --audit-level=high 2>/dev/null || true
pip-audit 2>/dev/null || safety check 2>/dev/null || true
go mod verify 2>/dev/null || true # module integrity check only, not a CVE scan
# Private key material (raw string, not an identifier)
grep -rn -E "BEGIN (RSA|EC|DSA|OPENSSH|PGP) PRIVATE KEY" \
--exclude-dir=.git . 2>/dev/null | head -10 || true
# Secret / credential scanners
gitleaks detect --source=. 2>/dev/null || true
trufflehog filesystem . --no-update 2>/dev/null || true
# Language-specific SAST
bandit -r . -ll 2>/dev/null || true # Python
eslint --plugin security . 2>/dev/null || true # JS/TS (if configured)
snyk test 2>/dev/null || true # all ecosystems
Use the AiDex query results from Step 1.3 and the signatures from Step 1.2 to
identify which files and methods to read. Only call Read on files that
contain suspicious identifiers — do not bulk-read entire directories.
Prioritised reading order:
*auth*, *login*, *session*, *jwt*)
For each file flagged by AiDex, use aidex_signature first to confirm the
method exists and understand its signature, then Read only the relevant method
body and its immediate callsite context.
Assess each OWASP category based on what the code actually does (as revealed by the semantic index), not just filename heuristics. For each finding, record:
VULN-NNN (sequential, zero-padded to 3 digits)
Cover all 10 categories at minimum:
| Category | Key Checks |
|---|---|
| A01 Broken Access Control | IDOR, missing authz on endpoints, metadata manipulation, CORS, path traversal |
| A02 Cryptographic Failures | HTTP data transmission, weak algos (MD5/SHA1 for passwords), hardcoded keys, data in logs |
| A03 Injection | SQL, NoSQL, command, LDAP injection; parameterised query usage |
| A04 Insecure Design | Missing rate limiting, no account lockout, weak password policy, undefined trust boundaries |
| A05 Security Misconfiguration | Default creds, unnecessary features, verbose errors, missing security headers, debug mode |
| A06 Vulnerable Components | Outdated deps with CVEs; unmaintained packages |
| A07 Authentication Failures | Brute-force protection, session tokens in URLs, sessions not invalidated on logout |
| A08 Software Integrity Failures | Missing lock files, insecure CI/CD, unsafe deserialization |
| A09 Logging Failures | No security event logging, sensitive data in logs, insufficient audit trail |
| A10 SSRF | User-controlled URLs in server requests, missing URL allowlist validation |
Also check for secrets, privilege escalation, and AWS/cloud-specific misconfigurations if infrastructure code is present.
Why this is its own step: IDOR / missing-authz defects are absence-of-call
bugs — there's no malicious pattern to grep for, only a missing one. Semgrep
--config=auto has no rule for them. CodeQL security-extended has no IDOR
rule either (it ships taint-flow queries, which don't model project-specific
authorization contracts). If you skip this walk, the audit will silently miss
this entire class of vulnerability — exactly the failure mode Sentinel exists
to prevent. Do not rely on the SAST tools to catch IDOR; they cannot.
Walk every HTTP-exposed handler. For each @RestController /
@Controller / Express route / Django view / Rails action / Gin handler /
ASP.NET controller etc.:
Enumerate handlers that take resource identifiers from the request
(@PathVariable, @RequestParam, req.params.id, req.query.id,
params[:id], c.Param("id"), [FromRoute] Guid id, …). UUIDs,
integer IDs, slugs, MSA numbers, journey IDs, request IDs — anything that
names a specific row the caller is trying to read or mutate.
For each handler, follow the parameter into the data access layer. You're tracing one question: between the request entering the handler and the parameter reaching a repository / DAO / SQL call, is there any authorization check that involves the currently authenticated user?
What counts as an authorization check. Watch for any of:
| Form | Examples |
|---|---|
| Declarative annotation | @PreAuthorize, @PostAuthorize, @RolesAllowed, @Secured("ROLE_X") that names a per-resource SpEL or role, [Authorize(Policy=…)], permission_classes, Pundit authorize @post, CanCanCan authorize! :read, @post |
| Project-named function | anything reading like canAccess, isAuthorizedFor, hasAccessTo, assertOwnership, *Permission(...), *Access(...), verifyOwns(...), ensureUserCan... |
| Filtered query | repository method that scopes by user/tenant in the WHERE clause (findByIdAndOwnerId, where(user_id: current_user.id), RLS policy declarations) |
| Middleware / interceptor | route-level middleware that resolves the resource and 403s if the user can't access it (note: a generic @Secured("USER") that only confirms the caller is logged in is not sufficient — it does not check this user can access this resource) |
Authentication ≠ authorization. withLoggedInUser { … }, JWT
verification, req.user != null, @Secured("USER") — these only prove
someone is logged in. They do not prove that someone owns the resource
identified by the path variable. Flag handlers that have authentication
but no resource-level authorization.
Watch for the "userDTO is passed but ignored" anti-pattern. A common
shape is: the controller resolves the logged-in user and passes a userDTO
(or currentUser, principal, User u) into the service, and the service
accepts the parameter but never uses it for an ownership predicate. The
parameter is plumbed in to look right but is dead in the body. Read the
service method top-to-bottom and confirm userDTO actually drives a
query filter or a check — not just a log line.
Watch for handlers that never resolve the caller's identity. Some IDORs are
even simpler than the above: the handler is @Secured("USER"), so the caller
is authenticated, but it never resolves which user is calling it (no
principal, no SecurityContextHolder, no req.user). It then performs a state
change keyed on the path variable. Anyone authenticated can call it for any
target ID.
Record each candidate. For every handler that fails the walk, produce
a finding even if you're not 100% sure — IDOR confirmations are cheap for
the human reader (10 seconds of code-reading) and the cost of missing one
is high. Mark uncertain ones as Medium severity with (verify) in the
title and let the reviewer decide. Severity guidance:
@Secured("USER")) and no resource check.
Examples of true positives vs. false positives.
// ❌ IDOR (true positive) — userDTO passed but never used for filtering
override fun getMessagesByJourneyIdAndType(
userDTO: UserDTO, journeyId: String, type: MessageTypeEnum
): List<MessageDTO> {
return messageRepositoryService.findAllByJourneyIdSorted(journeyId, type)
// userDTO is unused — any authed user can read any journey
}
// ❌ IDOR (true positive) — handler never resolves the calling user
@Secured("USER")
@PostMapping("/messagePause/{journeyIdOrRequestId}")
fun pauseMessage(@PathVariable journeyIdOrRequestId: String, …): ResponseEntity<Void> {
messageService.pauseMessage(journeyIdOrRequestId, messagePause, …)
// no withLoggedInUser, no ownership check
}
// ✅ Safe — ownership asserted before repository access
override fun deleteDraftAttachment(userDTO: UserDTO, attachmentId: UUID) {
val attachment = attachmentRepo.findById(attachmentId)
?: throw NotFoundException()
require(premiseRepositoryService.isUserHasAccessToPremise(
userDTO.email!!, attachment.premiseId!!))
attachmentRepo.delete(attachment)
}
# ✅ Safe — query filters by current user
def get_messages(request, journey_id):
return Message.objects.filter(
journey_id=journey_id,
journey__owner=request.user # ownership baked into the query
)
When you finish the walk, write a one-line summary in the Security
Inventory under "Authorization model" stating which primitive(s) the
project uses (@PreAuthorize, withLoggedInUser + imperative
isUserHasAccessToPremise, RLS policies, Pundit, …). This both documents
coverage and gives a future Sentinel run a hint about what to look for.
Architectural recommendation to surface when imperative authz is detected.
If the project uses imperative ownership checks (per-method if (canAccess())
rather than @PreAuthorize / equivalent), include this in the report's
recommendations:
"Authorization is performed imperatively across N controllers; consider migrating to declarative authorization (@PreAuthorize / route-level middleware / RLS policies) so future regressions can be caught by generic static analysis. Currently the only reliable detection is a full manual controller walk on every audit."
JavaScript/Node.js
// ❌ SQL Injection
db.query("SELECT * FROM users WHERE id = " + req.params.id);
// ✅ Parameterised
db.query("SELECT * FROM users WHERE id = $1", [req.params.id]);
// ❌ Command Injection
exec("ls " + userInput);
// ✅ Safe
spawn("ls", [userInput]);
Python
# ❌ SQL Injection
cursor.execute(f"SELECT * FROM users WHERE id = {user_id}")
# ✅ Parameterised
cursor.execute("SELECT * FROM users WHERE id = %s", (user_id,))
# ❌ Command Injection
os.system(f"ls {user_input}")
# ✅ Safe
subprocess.run(["ls", user_input], check=True)
Go
// ❌ SQL Injection
db.Query(fmt.Sprintf("SELECT * FROM users WHERE id = %s", userId))
// ✅ Parameterised
db.Query("SELECT * FROM users WHERE id = $1", userId)
C#
// ❌ SQL Injection
command.CommandText = $"SELECT * FROM users WHERE id = {userId}";
// ✅ Parameterised
var cmd = new SqlCommand("SELECT * FROM users WHERE id = @Id", conn);
cmd.Parameters.AddWithValue("@Id", userId);
PHP
// ❌ SQL Injection
$query = "SELECT * FROM users WHERE id = $userId";
// ✅ Prepared statement
$stmt = $pdo->prepare("SELECT * FROM users WHERE id = ?");
$stmt->execute([$userId]);
Verify all of these are present in HTTP responses:
| Header | Recommended Value |
|---|---|
| `Content-Security-Policy` | `default-src 'self'; script-src 'self'` |
| `X-Frame-Options` | `DENY` or `SAMEORIGIN` |
| `X-Content-Type-Options` | `nosniff` |
| `Strict-Transport-Security` | `max-age=31536000; includeSubDomains` |
| `Referrer-Policy` | `strict-origin-when-cross-origin` |
| `Permissions-Policy` | restrict unnecessary browser features |
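
As a rough illustration of the presence check, the sketch below reports which of the six headers are absent from a response. It assumes the response headers are already available as a mapping; `missing_security_headers` is a hypothetical helper, and it checks presence only, not whether the values match the recommendations.

```python
REQUIRED_HEADERS = [
    "Content-Security-Policy",
    "X-Frame-Options",
    "X-Content-Type-Options",
    "Strict-Transport-Security",
    "Referrer-Policy",
    "Permissions-Policy",
]


def missing_security_headers(response_headers):
    """Return required headers absent from a response (names are case-insensitive)."""
    present = {name.lower() for name in response_headers}
    return [header for header in REQUIRED_HEADERS if header.lower() not in present]
```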
Before writing any report files, resolve the output directory and generate a timestamp so all reports from this run share the same timestamp suffix:
OUTPUT_DIR="$(git rev-parse --show-toplevel 2>/dev/null || echo "$PWD")/reports"
mkdir -p "$OUTPUT_DIR"
TIMESTAMP=$(date +"%Y%m%d_%H%M%S")
REPORT_FILE="${OUTPUT_DIR}/security-audit-${TIMESTAMP}.md"
CONSOLIDATED_FILE="${OUTPUT_DIR}/security-audit-consolidated-${TIMESTAMP}.md"
SEMGREP_JSON="${OUTPUT_DIR}/semgrep-results.json"
CODEQL_SARIF="${OUTPUT_DIR}/codeql-results.sarif"
# Normalized JSON files for consolidate.sh (written at end of each phase)
AI_FINDINGS_JSON="${OUTPUT_DIR}/ai-findings.json"
SEMGREP_NORMALIZED="${OUTPUT_DIR}/semgrep-normalized.json"
CODEQL_NORMALIZED="${OUTPUT_DIR}/codeql-normalized.json"
CONSOLIDATED_JSON="${OUTPUT_DIR}/consolidated-findings.json"
Save the full findings to ${REPORT_FILE}, inside the reports/ directory at the project root:
cat > "${REPORT_FILE}" << 'REPORTEOF'
# Security Audit Report
## Executive Summary
- **Project**: [detected project name]
- **Audit Date**: [today's date]
- **Auditor**: Claude (AI-driven) via /sentinel
- **Overall Risk Level**: [Critical / High / Medium / Low]
## Security Inventory
[from Step 1.2]
## Findings Summary
| Severity | Count | Fixed | Remaining |
|----------|-------|-------|-----------|
| 🔴 Critical | X | 0 | X |
| 🟠 High | X | 0 | X |
| 🟡 Medium | X | 0 | X |
| 🟢 Low | X | 0 | X |
## Detailed Findings
[one section per VULN-NNN]
### [VULN-001] [Title]
**Severity**: [Critical/High/Medium/Low]
**Category**: [OWASP A0X]
**File**: `path/to/file.ext:line`
**Description**: [what it is and why it matters]
**Evidence**:
[code snippet or grep output]
**Recommendation**: [specific fix with code example]
## Positive Observations
[good security practices found in the codebase]
## Quick Remediation Commands
```bash
[dependency upgrade commands, config fixes, etc.]
```
REPORTEOF
Report **must** be written to `${REPORT_FILE}` (inside `reports/`, never a temp directory).
After writing the report, also emit a machine-readable findings file for Phase 3 consolidation:
```bash
# Write AI findings in consolidate.sh-compatible format
jq -n --argjson findings '[
{ "title": "VULN-001 title", "file": "path/file.py", "line": 42,
"severity": "HIGH", "message": "description", "source_tool": "claude",
"cwe": "CWE-89" }
]' '{tool: "claude", findings: $findings}' > "${AI_FINDINGS_JSON}"
```
Replace the placeholder array with the actual findings array built during the audit.
Each finding object: {title, file, line, severity, message, source_tool: "claude", cwe}.
Severity values: CRITICAL, HIGH, MEDIUM, LOW.
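
One way to guard the findings format before it reaches consolidation is a small validator, sketched below. The required keys and severity values come from the description above; `build_ai_findings` and `write_ai_findings` are hypothetical helper names, and how consolidate.sh itself validates its input is not stated here.

```python
import json

ALLOWED_SEVERITIES = {"CRITICAL", "HIGH", "MEDIUM", "LOW"}
REQUIRED_KEYS = {"title", "file", "line", "severity", "message", "source_tool", "cwe"}


def build_ai_findings(findings):
    """Validate each finding and wrap the list in the {tool, findings} envelope."""
    for finding in findings:
        missing = REQUIRED_KEYS - finding.keys()
        if missing:
            raise ValueError(f"finding is missing keys: {sorted(missing)}")
        if finding["severity"] not in ALLOWED_SEVERITIES:
            raise ValueError(f"invalid severity: {finding['severity']}")
    return {"tool": "claude", "findings": list(findings)}


def write_ai_findings(findings, path):
    """Write the validated envelope to the given path as JSON."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(build_ai_findings(findings), handle, indent=2)
```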
Skip this phase if --skip-semgrep or --skip-crossval was supplied.
SEMGREP_AVAILABLE=true
if ! command -v semgrep >/dev/null 2>&1; then
echo "[appsec] Warning: semgrep not found — skipping SAST scan."
echo "[appsec] Install with: pip install semgrep"
# Record the gap; skip the scan step and let Phase 3 note it
SEMGREP_AVAILABLE=false
fi
echo "[appsec] Running Semgrep SAST scan..."
SKILL_DIR="${CLAUDE_SKILL_DIR:-$(dirname "$0")}"
CUSTOM_RULES_DIR="${SKILL_DIR}/configs/semgrep-rules"
# Run with both auto (default ruleset) and custom rules layered on top.
# Use --config twice: once for the community rules, once for the local rules directory.
if [[ -d "${CUSTOM_RULES_DIR}" ]] && ls "${CUSTOM_RULES_DIR}"/*.yaml >/dev/null 2>&1; then
semgrep scan --json \
--config=auto \
--config="${CUSTOM_RULES_DIR}" \
--output="${SEMGREP_JSON}" . || true
else
semgrep scan --json --config=auto --output="${SEMGREP_JSON}" . || true
fi
# Exit code 1 from semgrep means findings were detected — not a fatal error
Output is written to ${SEMGREP_JSON} (inside the reports/ directory).
The Semgrep JSON schema:
{
"results": [
{
"check_id": "rule-id",
"path": "file/path",
"start": {"line": 10, "col": 5},
"end": {"line": 10, "col": 20},
"extra": {
"message": "Finding description",
"severity": "ERROR|WARNING|INFO",
"metadata": {
"category": "security",
"cwe": ["CWE-79"],
"owasp": ["A03:2021"]
}
}
}
],
"errors": []
}
Severity mapping from Semgrep to standard scale:
| Semgrep | Standard |
|---|---|
| `ERROR` | Critical / High |
| `WARNING` | Medium |
| `INFO` | Low |
Categorise results:
semgrep-supply-chain or r2c-security-audit)
Count findings per category and severity for use in Phase 3.
Then normalize to the consolidate.sh input format:
# Normalize Semgrep JSON → consolidate.sh format
jq '{
tool: "semgrep",
findings: [.results[] | {
title: .check_id,
file: .path,
line: (.start.line // 0),
severity: (
if .extra.severity == "ERROR" then "HIGH"
elif .extra.severity == "WARNING" then "MEDIUM"
else "LOW" end
),
message: .extra.message,
source_tool: "semgrep",
cwe: (.extra.metadata.cwe // null)
}]
}' "${SEMGREP_JSON}" > "${SEMGREP_NORMALIZED}"
Skip this phase if --skip-codeql or --skip-crossval was supplied.
Use gh codeql (the GitHub CLI extension) as the preferred runner. Fall back
to the standalone codeql binary only if gh is unavailable.
if gh codeql --version >/dev/null 2>&1; then
CODEQL_CMD="gh codeql"
elif command -v codeql >/dev/null 2>&1; then
CODEQL_CMD="codeql"
else
echo "[appsec] Warning: neither 'gh codeql' nor 'codeql' found — skipping taint analysis."
echo "[appsec] Install with: gh extension install github/gh-codeql"
echo "[appsec] OR: brew install codeql"
CODEQL_AVAILABLE=false
fi
Map every supported language present in the project to CodeQL language identifiers. Unlike phase 1 (which picks only the dominant language), CodeQL should run for every language found so that polyglot codebases receive full coverage.
| Detected file(s) | CodeQL language |
|---|---|
| `*.py` | `python` |
| `*.js`, `*.ts`, `*.jsx`, `*.tsx` | `javascript` |
| `*.java`, `pom.xml`, `*.gradle` | `java` |
| `*.go`, `go.mod` | `go` |
| `*.cs`, `*.csproj` | `csharp` |
| `*.rb`, `Gemfile` | `ruby` |
| `*.cpp`, `*.c`, `*.h` | `cpp` |
| `*.swift` | `swift` |
# Collect ALL languages present (not just the dominant one)
CODEQL_LANGS=()
declare -A seen_langs # de-duplicate (e.g. js + ts both map to javascript)
for ext_lang in "py:python" "js:javascript" "jsx:javascript" "ts:javascript" "tsx:javascript" \
"java:java" "go:go" "cs:csharp" "rb:ruby" \
"cpp:cpp" "c:cpp" "h:cpp" "swift:swift"; do
ext="${ext_lang%%:*}"; lang="${ext_lang##*:}"
if [[ -n "${seen_langs[$lang]+x}" ]]; then continue; fi
count=$(find . -name "*.${ext}" \
-not -path "*/node_modules/*" \
-not -path "*/.git/*" \
-not -path "*/vendor/*" \
-not -path "*/dist/*" \
2>/dev/null | wc -l)
if [[ $count -gt 0 ]]; then
CODEQL_LANGS+=("$lang")
seen_langs[$lang]=1
fi
done
# Also detect via manifest files for languages with few source files
[[ -f go.mod ]] && [[ -z "${seen_langs[go]+x}" ]] && CODEQL_LANGS+=("go") && seen_langs[go]=1
[[ -f Gemfile ]] && [[ -z "${seen_langs[ruby]+x}" ]] && CODEQL_LANGS+=("ruby") && seen_langs[ruby]=1
# Group the -name tests so the .git exclusion applies to both patterns
[[ -n "$(find . \( -name 'pom.xml' -o -name '*.gradle' \) -not -path '*/.git/*' 2>/dev/null | head -1)" ]] \
&& [[ -z "${seen_langs[java]+x}" ]] && CODEQL_LANGS+=("java") && seen_langs[java]=1
if [[ ${#CODEQL_LANGS[@]} -eq 0 ]]; then
echo "[appsec] Could not detect any supported CodeQL language — skipping."
CODEQL_AVAILABLE=false
else
echo "[appsec] Detected CodeQL languages: ${CODEQL_LANGS[*]}"
fi
For each detected language, create a database and run the security-extended query suite. Compiled languages (Java, C#, C++) require a working build environment; skip gracefully if database creation fails.
CODEQL_SARIF_FILES=() # collect per-language SARIF paths for Phase 3
for CODEQL_LANG in "${CODEQL_LANGS[@]}"; do
DB_DIR=".codeql-db-${CODEQL_LANG}"
LANG_SARIF="${OUTPUT_DIR}/codeql-results-${CODEQL_LANG}.sarif"
echo "[appsec] Creating CodeQL database (language: ${CODEQL_LANG})..."
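# Test the command's own exit status; piping into tail would mask a failure
if ! CREATE_LOG=$($CODEQL_CMD database create "${DB_DIR}" \
--language="${CODEQL_LANG}" \
--source-root=. \
--overwrite 2>&1); then
echo "${CREATE_LOG}" | tail -5
echo "[appsec] Warning: CodeQL database creation failed for ${CODEQL_LANG} — skipping."
continue
fi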
echo "[appsec] Running CodeQL security-extended analysis (${CODEQL_LANG})..."
$CODEQL_CMD database analyze "${DB_DIR}" \
"codeql/${CODEQL_LANG}-queries:codeql-suites/${CODEQL_LANG}-security-extended.qls" \
--format=sarif-latest \
--output="${LANG_SARIF}" \
2>&1 | tail -5 || true
if [[ -f "${LANG_SARIF}" ]]; then
CODEQL_SARIF_FILES+=("${LANG_SARIF}")
echo "[appsec] CodeQL results written: ${LANG_SARIF}"
fi
done
# Backward-compat alias: point CODEQL_SARIF at the first result file (used in Phase 3 template)
CODEQL_SARIF="${CODEQL_SARIF_FILES[0]:-${OUTPUT_DIR}/codeql-results.sarif}"
Database creation requires a working build environment for compiled languages (Java, C#, C++). For interpreted languages (Python, JS, Ruby) it works without a build step.
CodeQL produces SARIF 2.1.0. Key fields:
{
"runs": [{
"results": [{
"ruleId": "py/sql-injection",
"message": { "text": "..." },
"locations": [{ "physicalLocation": {
"artifactLocation": { "uri": "app/db.py" },
"region": { "startLine": 42 }
}}],
"properties": { "severity": "error", "precision": "high" }
}],
"tool": { "driver": { "rules": [{
"id": "py/sql-injection",
"properties": { "tags": ["security","correctness","external/cwe/cwe-089"] }
}] }}
}]
}
Severity mapping:
| CodeQL severity | Standard |
|---|---|
| `error` + `precision: high/very-high` | Critical / High |
| `error` + `precision: medium` | High / Medium |
| `warning` | Medium |
| `recommendation` | Low |
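
The mapping above gives ranges ("Critical / High") rather than single values, so any implementation has to pick one. The sketch below resolves each range to its upper bound, which is an assumption rather than something the table specifies. It also assumes severity and precision have already been extracted as strings, and that a missing severity is treated as `warning`. `codeql_to_standard` is a hypothetical name.

```python
def codeql_to_standard(severity, precision=None):
    """Map a CodeQL severity/precision pair onto CRITICAL/HIGH/MEDIUM/LOW."""
    severity = (severity or "warning").lower()
    precision = (precision or "").lower()
    if severity == "error":
        if precision in ("high", "very-high"):
            return "CRITICAL"  # table row: Critical / High (upper bound assumed)
        # Covers precision: medium (High / Medium, upper bound assumed) and
        # any precision the table does not list (assumed HIGH).
        return "HIGH"
    if severity == "warning":
        return "MEDIUM"
    # recommendation, plus anything unrecognised, is treated as LOW.
    return "LOW"
```

Choosing the upper bound errs toward over-reporting; a project that prefers fewer Critical findings could resolve the ranges downward instead.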
For taint-flow findings, extract the full source → sink path from codeFlows if present — this is CodeQL's differentiating value over Semgrep. Store taint paths in a separate variable for use during Step 3.2 cross-validation commentary (not in the normalized findings file).
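
A minimal sketch of the taint-path extraction, assuming the SARIF file has already been loaded into a Python dict (for example with `json.load`). It follows the SARIF 2.1.0 layout `codeFlows[].threadFlows[].locations[].location`; `taint_paths` and the shape of its return value are illustrative choices, not part of the skill.

```python
def taint_paths(sarif):
    """Return one source-to-sink path per thread flow found in a SARIF log."""
    paths = []
    for run in sarif.get("runs", []):
        for result in run.get("results", []):
            for flow in result.get("codeFlows", []):
                for thread in flow.get("threadFlows", []):
                    steps = []
                    for entry in thread.get("locations", []):
                        phys = entry.get("location", {}).get("physicalLocation", {})
                        uri = phys.get("artifactLocation", {}).get("uri", "unknown")
                        line = phys.get("region", {}).get("startLine", 0)
                        steps.append(f"{uri}:{line}")
                    if steps:
                        # steps[0] is the source and steps[-1] is the sink.
                        paths.append({"rule": result.get("ruleId"), "steps": steps})
    return paths
```

Results without `codeFlows` contribute nothing, so the function can be run over every result without pre-filtering.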
Count findings per severity for use in Phase 3.
Then normalize all per-language SARIF files into a single consolidate.sh input file:
# Normalize all CodeQL SARIF files → consolidate.sh format
# Build a rules map (ruleId → CWE tag) from the driver rules section
jq -s '{
tool: "codeql",
findings: [
.[] | .runs[]? |
(.tool.driver.rules // [] | map({key: .id, value: (.properties.tags // [] | map(select(startswith("external/cwe/"))) | first // null)}) | from_entries) as $rules |
.results[]? | {
title: .ruleId,
file: (.locations[0].physicalLocation.artifactLocation.uri // "unknown"),
line: (.locations[0].physicalLocation.region.startLine // 0),
severity: (
if (.properties.severity // "warning") == "error" then "HIGH"
elif (.properties.severity // "warning") == "warning" then "MEDIUM"
else "LOW" end
),
message: .message.text,
source_tool: "codeql",
cwe: ($rules[.ruleId] // null)
}
]
}' "${CODEQL_SARIF_FILES[@]}" > "${CODEQL_NORMALIZED}"
Skip this phase if --skip-crossval was supplied.
If semgrep or codeql was skipped or produced no output, produce a note in the
consolidated report explaining the gap.
Run consolidate.sh on the normalized files written at the end of each phase.
This deduplicates findings, assigns SENTINEL-XXX IDs, and produces a compact
structured summary — avoiding the need to cat raw JSON/SARIF files into context.
SKILL_DIR="${CLAUDE_SKILL_DIR:-$(dirname "$0")}"
CONSOLIDATE="${SKILL_DIR}/scripts/consolidate.sh"
# Collect whichever normalized files exist
NORM_FILES=()
[[ -f "${AI_FINDINGS_JSON}" ]] && NORM_FILES+=("${AI_FINDINGS_JSON}")
[[ -f "${SEMGREP_NORMALIZED}" ]] && NORM_FILES+=("${SEMGREP_NORMALIZED}")
[[ -f "${CODEQL_NORMALIZED}" ]] && NORM_FILES+=("${CODEQL_NORMALIZED}")
if [[ ${#NORM_FILES[@]} -gt 0 ]]; then
bash "${CONSOLIDATE}" "${NORM_FILES[@]}" > "${CONSOLIDATED_JSON}"
else
echo '{"findings":[],"summary":{"total":0},"metadata":{}}' > "${CONSOLIDATED_JSON}"
fi
# Read the compact summary — findings with id/title/file/line/severity/source_tool only
jq '{
summary,
metadata,
findings: [.findings[] | {id, title, file, line, severity, source_tool, cwe}]
}' "${CONSOLIDATED_JSON}"
This output is what Phase 3 analysis operates on. Do not cat the raw
SEMGREP_JSON or CODEQL_SARIF files — all relevant data is already extracted
into the normalized files. The source_tool field on each finding drives the
cross-validation table in Step 3.2.
Build a comparison table:
| Finding | Claude Audit | Semgrep | CodeQL | Final Severity | Status |
|---|---|---|---|---|---|
| SQL Injection in auth.py:45 | ✅ | ✅ | ✅ | Critical | CONFIRMED (all 3) |
| CVE-2023-12345 in requests | ❌ | ✅ | ❌ | High | NEW (Semgrep) |
| IDOR in user endpoint | ✅ | ❌ | ❌ | High | NEW (AI) |
| Taint flow: req→db.query | ❌ | ❌ | ✅ | Critical | NEW (CodeQL) |
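One way to derive the Status column is to group normalized findings by location and look at which tools reported each one. The sketch below makes several assumptions: findings match on an exact `(file, line)` pair, the `source_tool` values are `claude`, `semgrep` and `codeql` (only `codeql` appears in the normalization step above), and the label for two-tool matches is illustrative. The actual deduplication is done by `consolidate.sh`.

```python
from collections import defaultdict

# Assumed source_tool values and their display labels.
LABELS = {"claude": "AI", "semgrep": "Semgrep", "codeql": "CodeQL"}


def cross_validate(findings: list[dict]) -> list[dict]:
    """Group findings by (file, line) and assign a cross-validation status."""
    by_location = defaultdict(set)
    for finding in findings:
        by_location[(finding["file"], finding["line"])].add(finding["source_tool"])

    rows = []
    for (file, line), tools in sorted(by_location.items()):
        if len(tools) == len(LABELS):
            status = "CONFIRMED (all 3)"
        elif len(tools) > 1:
            status = f"CONFIRMED ({len(tools)} tools)"
        else:
            tool = next(iter(tools))
            status = f"NEW ({LABELS.get(tool, tool)})"
        rows.append({"file": file, "line": line,
                     "tools": sorted(tools), "status": status})
    return rows


if __name__ == "__main__":
    # Hypothetical findings, for illustration only.
    sample = [
        {"file": "auth.py", "line": 45, "source_tool": "claude"},
        {"file": "auth.py", "line": 45, "source_tool": "semgrep"},
        {"file": "auth.py", "line": 45, "source_tool": "codeql"},
        {"file": "users.py", "line": 12, "source_tool": "claude"},
    ]
    for row in cross_validate(sample):
        print(row["file"], row["line"], row["status"])
```

Exact line matching is deliberately strict; tools often report the same issue a line or two apart, so a tolerance window may be worth considering.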
Confidence tiers:
Deduplication rules:
Severity reconciliation — when tools disagree, use the highest across all tools and document each tool's assessment:
| Semgrep severity | Standard |
|---|---|
| ERROR | Critical / High |
| WARNING | Medium |
| INFO | Low |
For Semgrep-only CVE findings, map CVSS score: ≥9.0 → Critical, ≥7.0 → High, ≥4.0 → Medium, <4.0 → Low.
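The CVSS thresholds and the "use the highest" reconciliation rule can be sketched as follows. The function names and the per-tool dictionary shape are assumptions made for illustration.

```python
SEVERITY_RANK = {"Low": 1, "Medium": 2, "High": 3, "Critical": 4}


def cvss_to_severity(score: float) -> str:
    """Map a CVSS base score to a severity using the thresholds above."""
    if score >= 9.0:
        return "Critical"
    if score >= 7.0:
        return "High"
    if score >= 4.0:
        return "Medium"
    return "Low"


def reconcile(assessments: dict[str, str]) -> str:
    """Return the highest severity among the per-tool assessments."""
    return max(assessments.values(), key=SEVERITY_RANK.__getitem__)


if __name__ == "__main__":
    print(cvss_to_severity(7.5))
    # Hypothetical per-tool assessments for one finding.
    print(reconcile({"claude": "High", "semgrep": "Medium", "codeql": "Critical"}))
```

Keeping the full `assessments` mapping alongside the reconciled value makes it straightforward to document each tool's original rating in the report.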
Categorise findings into:
For each tool-only finding, note why the others likely missed it:
| Category | Why Claude missed | Why Semgrep missed | Why CodeQL missed |
|---|---|---|---|
| CVEs / SCA | no CVE DB lookup | n/a | n/a |
| Pattern injection | possible, check context | n/a | n/a |
| Taint flow (cross-file) | possible | single-file / no dataflow | n/a |
| Business logic | n/a | no semantic understanding | rule-based only |
| Multi-file chains | n/a | single-file analysis | may catch if taint-reachable |
| Architecture flaws | n/a | rule-based only | rule-based only |
| Unsupported language | n/a | broad language support | limited language support |
False Positives Analysis
Before including a finding in the consolidated report, assess whether it is a false positive:
Document any discarded false positives and the reasoning in the consolidated report (Part 4 — False Positives Analysis).
cat > "${CONSOLIDATED_FILE}" << 'CONSOLIDATEDEOF'
# Consolidated Security Audit Report
## Executive Summary
**Audit Date**: [today]
**Project**: [name]
**Audit Methods**: AI-Driven (Claude /sentinel) + Semgrep SAST + CodeQL Taint Analysis
### Key Findings
- **Total Unique Vulnerabilities**: [N]
- **Confirmed by All 3 Tools**: [N]
- **Confirmed by 2 Tools**: [N]
- **AI-Only**: [N]
- **Semgrep-Only**: [N]
- **CodeQL-Only**: [N]
### Severity Breakdown
| Severity | Count |
|----------|-------|
| 🔴 Critical | X |
| 🟠 High | X |
| 🟡 Medium | X |
| 🟢 Low | X |
---
## Part 1: Confirmed Vulnerabilities
> Issues detected by multiple tools — highest confidence
### [VULN-CONF-001] [Title]
- **File**: `path/to/file:line`
- **Severity**: [Critical/High/Medium/Low]
- **CWE**: CWE-XX
- **OWASP**: A0X:2021
- **Detection**:
- ✅ Claude: [original finding ID]
- ✅ Semgrep: rule `rule-id`
- ✅ CodeQL: rule `codeql/lang-queries:path/to/Rule.ql` (taint path: [source → sink])
**Description**: [what it is and why it matters]
**Evidence**:
[code snippet]
**Recommendation**: [specific fix with code example]
---
## Part 2: Semgrep-Specific Findings
> Issues detected by Semgrep (SAST/SCA) but not in AI audit
### [VULN-SEM-001] [Title]
- **File**: `path/to/file:line`
- **Severity**: [severity]
- **Semgrep Rule**: `rule-id`
- **Category**: [SCA/SAST/Secrets]
- **CWE**: CWE-XX
**Why Claude missed this**: [likely reason: specific CVE, pattern-based, etc.]
**Description**: [from Semgrep message]
**Recommendation**: [how to fix]
---
## Part 2b: CodeQL-Specific Findings
> Issues detected by CodeQL taint analysis but not found by AI audit or Semgrep
### [VULN-CQL-001] [Title]
- **File**: `path/to/file:line`
- **Severity**: [severity]
- **CodeQL Rule**: `rule-id`
- **CWE**: CWE-XX
- **Taint Flow**: `[source location] → [sanitizer skip / call chain] → [sink location]`
**Why other tools missed this**: [e.g. cross-file dataflow requires inter-procedural analysis]
**Description**: [from CodeQL message]
**Recommendation**: [how to fix — break the taint chain]
---
## Part 3: AI-Specific Findings
> Issues detected by Claude but not flagged by Semgrep or CodeQL
### [VULN-COP-001] [Title]
- **File**: `path/to/file:line`
- **Severity**: [severity]
- **Original ID**: [from SECURITY_AUDIT_REPORT.md]
**Why Semgrep missed this**: [likely reason: business logic, multi-file chain, design flaw]
**Description**: [from AI audit]
**Recommendation**: [from AI audit]
---
## Part 4: False Positives Analysis
### Semgrep False Positives
- [Any Semgrep findings discarded after manual review]
### AI False Positives
- [Any Claude findings where surrounding context neutralises the risk]
---
## Part 5: Dependency Vulnerabilities (SCA)
| Dependency | Version | CVE | Severity | CVSS | Fix Version |
|------------|---------|-----|----------|------|-------------|
### Remediation Commands
```bash
# Python
pip install --upgrade <package>==<fixed_version>
# Node.js
npm audit fix
# Go
go get <module>@<fixed_version>
```
| Type | File | Line | Severity | Action |
|---|---|---|---|---|
Immediate actions:
gitleaks detect --source=. --log-opts="HEAD~50..HEAD"
Total Files Scanned: [N]
Total Lines of Code: [N]
Files with Vulnerabilities: [N]
Vulnerability Density: [vulns per 1000 LOC]
Claude Detections: [N]
Semgrep Detections: [N]
Overlapping: [N] ([%]%)
Unique to Claude: [N]
Unique to Semgrep: [N]
Total Unique Issues: [N]
- Add `semgrep scan --config=auto` to CI/CD pipeline
- `reports/security-audit-${TIMESTAMP}.md`
- `reports/semgrep-results.json`
- `reports/codeql-results-<lang>.sarif` (one file per detected language)
The file **must** be written to `${CONSOLIDATED_FILE}` (inside `reports/`).
---
### Phase 4: Risk Score Calculation
After completing the consolidated report, calculate a numeric security score from the
finding counts recorded in the Severity Breakdown table.
Scoring formula (100 = perfect security, 0 = critical risk):
- Start at 100
- Each Critical finding: −15 points
- Each High finding: −8 points
- Each Medium finding: −3 points
- Each Low finding: −1 point
- Minimum score: 0
Score thresholds:
| Score | Risk Level |
|-------|------------|
| 90–100 | 🟢 LOW RISK |
| 70–89 | 🟡 MEDIUM RISK |
| 40–69 | 🟠 HIGH RISK |
| 0–39 | 🔴 CRITICAL RISK |
Render the score as a 20-block progress bar (filled blocks = `score ÷ 5`, rounded down):
Security Score: 72/100 [██████████████░░░░░░] MEDIUM RISK
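The formula, thresholds and progress bar above can be sketched in a few lines. Function names and the lowercase severity keys are assumptions; the constants come directly from the scoring rules.

```python
PENALTIES = {"critical": 15, "high": 8, "medium": 3, "low": 1}
# (minimum score, risk level), checked from highest to lowest.
THRESHOLDS = [(90, "LOW RISK"), (70, "MEDIUM RISK"),
              (40, "HIGH RISK"), (0, "CRITICAL RISK")]


def security_score(counts: dict[str, int]) -> int:
    """Start at 100, subtract the per-severity penalties, and floor at 0."""
    deducted = sum(PENALTIES[sev] * counts.get(sev, 0) for sev in PENALTIES)
    return max(0, 100 - deducted)


def risk_level(score: int) -> str:
    """Look up the risk level for a score in the threshold table."""
    for floor, label in THRESHOLDS:
        if score >= floor:
            return label
    return "CRITICAL RISK"


def render_bar(score: int) -> str:
    """Render a 20-block bar with score // 5 filled blocks."""
    filled = score // 5
    return "█" * filled + "░" * (20 - filled)


def scorecard(counts: dict[str, int]) -> str:
    score = security_score(counts)
    return f"Security Score: {score}/100 [{render_bar(score)}] {risk_level(score)}"


if __name__ == "__main__":
    # 1 Critical + 1 High + 1 Medium + 2 Low deducts 28 points, giving 72.
    print(scorecard({"critical": 1, "high": 1, "medium": 1, "low": 2}))
```

With those example counts the output matches the sample line above: 72 points, 14 filled blocks, MEDIUM RISK.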
Append the scorecard to `${CONSOLIDATED_FILE}` before the Appendix:
```markdown
---
## Security Score
**Score: [SCORE]/100** — [RISK LEVEL]
Security Score: [SCORE]/100 [████████████████░░░░] [RISK LEVEL]
| Severity | Count | Penalty |
|----------|-------|---------|
| 🔴 Critical | [N] | −[N×15] |
| 🟠 High | [N] | −[N×8] |
| 🟡 Medium | [N] | −[N×3] |
| 🟢 Low | [N] | −[N×1] |
| **Total deducted** | | **−[total]** |
```
This phase is reference material — use it when writing recommendations in the reports above. Do not execute these examples; adapt them to the actual vulnerable code found in the target project.
Password hashing — use bcrypt (min cost 12) or argon2id:
# Python
import bcrypt
hashed = bcrypt.hashpw(password.encode(), bcrypt.gensalt(rounds=12))
valid = bcrypt.checkpw(password.encode(), hashed)
// Node.js
const bcrypt = require('bcrypt');
const hash = await bcrypt.hash(password, 12);
const valid = await bcrypt.compare(password, hash);
Rate limiting — cap auth endpoints:
// Node.js / Express
const rateLimit = require('express-rate-limit');
app.post('/login', rateLimit({ windowMs: 15*60*1000, max: 5 }), loginHandler);
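# Flask (keyword arguments work across Flask-Limiter 2.x and 3.x)
from flask_limiter import Limiter
from flask_limiter.util import get_remote_address
limiter = Limiter(key_func=get_remote_address, app=app)
@app.route('/login', methods=['POST'])
@limiter.limit('5 per 15 minutes')
def login(): ...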
Never hardcode secrets. Use environment variables at minimum; prefer a secrets manager:
# ❌ Bad
API_KEY = 'sk-1234567890'
# ✅ Good — env var
import os
API_KEY = os.environ['API_KEY'] # raises KeyError if missing
# ✅ Better — AWS Secrets Manager
import boto3, json
def get_secret(name):
client = boto3.client('secretsmanager', region_name='ap-southeast-2')
return json.loads(client.get_secret_value(SecretId=name)['SecretString'])
# Python — Pydantic
from pydantic import BaseModel, EmailStr, Field
class UserInput(BaseModel):
email: EmailStr
name: str = Field(min_length=2, max_length=100)
// Node.js — Zod
import { z } from 'zod';
const schema = z.object({ email: z.string().email(), name: z.string().min(2).max(100) });
const result = schema.safeParse(req.body);
if (!result.success) return res.status(400).json(result.error);
# Python — markupsafe / bleach
from markupsafe import escape
safe = escape(user_input)
// Node.js — sanitize-html
const sanitizeHtml = require('sanitize-html');
const clean = sanitizeHtml(userInput, { allowedTags: ['b','i','em','strong'] });
const helmet = require('helmet');
app.use(helmet({
contentSecurityPolicy: { directives: { defaultSrc: ["'self'"], scriptSrc: ["'self'"] } },
hsts: { maxAge: 31536000, includeSubDomains: true }
}));
After writing both report files, output a brief terminal summary:
[sentinel] Security audit complete.
[sentinel] ─────────────────────────────────────────────────────
[sentinel] Target: <absolute path>
[sentinel] AI findings: X critical, X high, X medium, X low
[sentinel] Semgrep: X findings (or: skipped)
[sentinel] CodeQL: X findings across N language(s): [lang1, lang2, ...] (or: skipped / unsupported language)
[sentinel] Total unique: X issues (X confirmed by multiple tools)
[sentinel] ─────────────────────────────────────────────────────
[sentinel] Security Score: [SCORE]/100 [████████████░░░░░░░░] [RISK LEVEL]
[sentinel] ─────────────────────────────────────────────────────
[sentinel] Reports (./reports/):
[sentinel] ${REPORT_FILE}
[sentinel] ${CONSOLIDATED_FILE}
[sentinel] ${SEMGREP_JSON}
[sentinel] ${CODEQL_SARIF}
[sentinel] ─────────────────────────────────────────────────────
[sentinel] Next: address Critical and High findings first.
[sentinel] Run /sentinel --path . to re-audit after fixes.
| Severity | Description | Response Time |
|---|---|---|
| 🔴 Critical | RCE, auth bypass, exposed secrets | Immediate |
| 🟠 High | Data breach, privilege escalation | Within 24 h |
| 🟡 Medium | Limited-impact exploits | Within 1 week |
| 🟢 Low | Minor concerns, hygiene issues | Next sprint |
Every audit should:
npx claudepluginhub 0x1337c0d3/claude-security --plugin claude-security