From oh-my-secuaudit
Dataflow-based code clustering for security assessments. Groups (Endpoint, Sink) paths by shared review strategy so reviewers sample representative cases instead of exhaustively reviewing every path. Use when scoping manual review on a codebase with 50+ endpoints, repetitive sanitization patterns, or after initial SAST/SCA produces large finding sets that need triage.
Slash command: /oh-my-secuaudit:sec-cluster
- references/clustering_strategy_v4.md
- templates/CLUSTERS.md.tmpl
- templates/REVIEW_CHECKLIST.md.tmpl
- templates/auth_enum.sh
- templates/semgrep-rules/c2-hardcoded-shared-secret.yaml
- templates/semgrep-rules/c3-hostname-verifier-bypass.yaml
- templates/semgrep-rules/c4-sensitive-logging.yaml
- templates/semgrep-rules/c5-unsafe-deserialization.yaml
- templates/sweep.sh
A cluster does not guarantee identical results; it only makes it possible to apply the same review strategy to every path it contains.
The operating procedure is therefore to verify clusters while using them rather than trust them blindly.
- sec-audit-static findings (optional, accelerates Phase 1)
- references/clustering_strategy_v4.md for the full strategy.
- templates/ for output format examples.

Determine clustering applicability per the v4 strategy:
Include (requires dataflow analysis):
| Check Item | Clustering Reason |
|---|---|
| XSS | Sink context (HTML, JS, attribute) determines vulnerability |
| Data protection | Masking/encryption/exposure varies by flow |
| SSRF / path traversal / template injection | Input propagation path analysis required |
| Auth/authz (conditional) | Per-endpoint authorization application differs (v4 section 3.3) |
Exclude (static pattern matching sufficient):
Runtime.exec, ProcessBuilder)

Auth/Authz Re-definition (v4 section 3.3):
Auth is not "does a common module exist" but "is it applied per-endpoint." Include in clustering when endpoint-level authorization varies.
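
The "applied per-endpoint" question can be approximated with a quick scan before any deeper dataflow work. The sketch below is a heuristic only, not the logic of templates/auth_enum.sh: it assumes single-line annotations, checks only method-level `@PreAuthorize`/`@Secured`, and ignores class-level annotations, `SecurityFilterChain` rules, and interceptors. The function name `endpoints_without_auth` and the sample controller are hypothetical.

```python
import re

# Only the annotations named in this document are matched; extend as needed.
MAPPING = re.compile(r"@(?:Request|Get|Post)Mapping\b")
AUTH = re.compile(r"@(?:PreAuthorize|Secured)\b")
CLASS_DECL = re.compile(r"\bclass\b")


def endpoints_without_auth(java_source: str) -> list[str]:
    """Return declaration lines that carry a mapping annotation but no
    method-level @PreAuthorize/@Secured annotation (heuristic)."""
    missing = []
    pending = []  # annotation lines seen since the last declaration
    for raw in java_source.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("@"):
            pending.append(line)
            continue
        # First non-annotation line after annotations: treat as the declaration.
        # Class declarations are skipped because they are not endpoints.
        if pending and not CLASS_DECL.search(line):
            has_mapping = any(MAPPING.search(a) for a in pending)
            has_auth = any(AUTH.search(a) for a in pending)
            if has_mapping and not has_auth:
                missing.append(line)
        pending = []
    return missing


# Hypothetical controller used only to exercise the scan.
SAMPLE = '''
@RestController
public class UserController {

    @GetMapping("/users")
    public List<User> list() { return repo.findAll(); }

    @PreAuthorize("hasRole('ADMIN')")
    @PostMapping("/users")
    public User create(@RequestBody User u) { return repo.save(u); }
}
'''

print(endpoints_without_auth(SAMPLE))
```

A hit here means only that no method-level annotation was found; the endpoint may still be protected elsewhere, so each hit is a candidate for the auth/authz cluster rather than a finding.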
templates/semgrep-rules/ to the target codebase:
metavariable-regex for domain-specific field names

templates/sweep.sh (adapt module list):
./sweep.sh # all modules, human output
./sweep.sh --json # JSON output
./sweep.sh <module> # single module
For the auth/authz cluster (typically the largest):
- @RequestMapping/@GetMapping/@PostMapping endpoints
- @PreAuthorize, @Secured, SecurityFilterChain presence
- WebConfig/addInterceptors() auth interceptor registration
- templates/auth_enum.sh as a starting point (adapt grep patterns).

Define clusters using the (Endpoint, Sink) unit. For each cluster, document:
| Element | Description |
|---|---|
| Source | User input / external data entry point |
| Transformation | Processing logic |
| Validation/Sanitization | Filtering, encoding presence |
| Sink | Final output point (DB, HTTP response, file, external call) |
| Context | Auth state, data sensitivity, trust boundary |
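
One way to hold the elements in the table above is a small record per (Endpoint, Sink) path, grouped by a key that stands for "same review strategy". This is a minimal sketch: the `FlowPath` type, the choice of (sink, sanitization, context) as the grouping key, and the sample paths are all assumptions for illustration, not definitions from the v4 strategy.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class FlowPath:
    """One (Endpoint, Sink) path described by the table's elements."""
    endpoint: str        # e.g. "GET /search"
    source: str          # user input / external data entry point
    transformation: str  # processing logic
    sanitization: str    # filtering or encoding present ("none" if absent)
    sink: str            # final output point
    context: str         # auth state, data sensitivity, trust boundary


def cluster_key(path: FlowPath) -> tuple[str, str, str]:
    # Assumed grouping rule: paths that share sink, sanitization, and context
    # are candidates for the same review strategy.
    return (path.sink, path.sanitization, path.context)


def group_paths(paths: list[FlowPath]) -> dict[tuple[str, str, str], list[FlowPath]]:
    clusters = defaultdict(list)
    for path in paths:
        clusters[cluster_key(path)].append(path)
    return dict(clusters)


# Hypothetical paths used only to show the grouping.
paths = [
    FlowPath("GET /search", "query param q", "string concat", "none",
             "HTML response", "unauthenticated"),
    FlowPath("GET /tags", "query param name", "string concat", "none",
             "HTML response", "unauthenticated"),
    FlowPath("POST /profile", "form field bio", "trim", "HTML-encoded",
             "HTML response", "authenticated"),
]

for key, members in group_paths(paths).items():
    print(key, [p.endpoint for p in members])
```

The key is deliberately coarse; in line with the verify-while-using principle, a cluster that shows inconsistent verdicts is a signal to refine the key.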
Typical cluster categories:
Adjust cluster definitions to the target codebase. Not all categories apply to every project.
Per v4 section 7.5:
| Stage | Criteria | Sampling |
|---|---|---|
| Stage 1 (initial) | New cluster | 50%+ manual review, measure consistency |
| Stage 2 (stabilization) | Consistency >= 80% | Reduce to 30% sampling |
| Stage 3 (operational) | Miss rate < 5% for 2 consecutive cycles | Representative sample only |
| Re-verification trigger | Major code change, new framework, missed vuln | Reset to Stage 1 |
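
The stage table can be read as a small decision rule. The sketch below encodes it under two assumptions that the table does not state: stages are cumulative (Stage 3 also requires consistency of at least 80%), and "representative sample only" means one path per cluster. Function names are hypothetical.

```python
def sampling_stage(consistency: float,
                   miss_rates: list[float],
                   reverify: bool = False) -> int:
    """Pick the stage (1, 2, or 3) for a cluster.

    consistency: intra-cluster consistency as a fraction (0.0 to 1.0)
    miss_rates:  sample miss rate per review cycle, oldest first
    reverify:    True after a major code change, new framework, or missed vuln
    """
    if reverify or consistency < 0.80:
        return 1
    last_two = miss_rates[-2:]
    if len(last_two) == 2 and all(rate < 0.05 for rate in last_two):
        return 3
    return 2


def sample_size(stage: int, cluster_size: int) -> int:
    """Minimum number of paths to review manually in this cycle."""
    if cluster_size <= 0:
        return 0
    if stage == 1:
        return (cluster_size * 50 + 99) // 100  # ceil(50%), integer arithmetic
    if stage == 2:
        return (cluster_size * 30 + 99) // 100  # ceil(30%)
    return 1  # assumption: one representative path in Stage 3


stage = sampling_stage(consistency=0.9, miss_rates=[0.04, 0.03])
print(stage, sample_size(stage, cluster_size=40))
```

Stage 1 asks for "50%+", so the Stage 1 value is a floor rather than a target.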
For each sample, fill the review checklist (see templates/REVIEW_CHECKLIST.md.tmpl):
[X] (vulnerable), [N] (not vulnerable), or [partial]

Produce these artifacts in the target's architecture-review/ or assessment output directory:
CLUSTERS.md — Full cluster inventory with:
semgrep-rules/ — Adapted rules with results/SUMMARY.md
semgrep-rules/results/REVIEW_CHECKLIST.md — Completed review with:
Clustering is ineffective when:
| Condition | Reason |
|---|---|
| Reflection / dynamic dispatch | Static analysis cannot trace actual flow |
| AOP / proxy-based flow | Runtime-determined security processing |
| Framework internal hidden flow | Dataflow breaks inside framework |
| Runtime config-dependent sanitizer | Same code, different behavior by config |
| Template engine internal processing | Cannot trace internal escaping |
Fallback: tag paths that fail tracing in Phase 1, track them separately, and review them manually in this priority order: external input proximity > auth bypass potential > everything else.
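
The fallback ordering can be expressed as a sort key. This is a sketch only: the `FailedPath` fields are hypothetical flags a reviewer would set while tagging, and the sample entries are invented.

```python
from dataclasses import dataclass


@dataclass
class FailedPath:
    endpoint: str
    near_external_input: bool    # reviewer-assigned flag (hypothetical)
    auth_bypass_potential: bool  # reviewer-assigned flag (hypothetical)


def priority(path: FailedPath) -> int:
    # Lower number = reviewed earlier, following the order stated above.
    if path.near_external_input:
        return 0
    if path.auth_bypass_potential:
        return 1
    return 2


def review_order(paths: list[FailedPath]) -> list[FailedPath]:
    # sorted() is stable, so paths with equal priority keep their tagging order.
    return sorted(paths, key=priority)


failed = [
    FailedPath("GET /internal/report", False, False),
    FailedPath("POST /admin/reset", False, True),
    FailedPath("POST /upload", True, False),
]
print([p.endpoint for p in review_order(failed)])
```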
| Artifact | Description | Consumed By |
|---|---|---|
| CLUSTERS.md | Cluster definitions, measurements, cross-references | sec-audit-static, security-architecture-review |
| semgrep-rules/*.yaml | Codebase-adapted detection rules | sec-audit-static re-runs |
| semgrep-rules/results/SUMMARY.md | Detection statistics | CLUSTERS.md, architecture review |
| semgrep-rules/results/REVIEW_CHECKLIST.md | Sample review verdicts and consistency | Architecture review, next audit cycle |
Provide:
| Metric | Definition | Formula |
|---|---|---|
| Intra-cluster consistency | Same-verdict rate within cluster | (matching samples) / (reviewed samples) |
| Review efficiency | Time saved vs. exhaustive review | 1 - (clustered time) / (unclustered time) |
| Sample miss rate | Vulnerabilities missed by representative sampling | (mismatched samples) / (additional samples) |
| Reviewer agreement | Cross-reviewer verdict consistency | Cohen's Kappa or simple agreement rate |
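
Three of these metrics can be computed directly from the review checklist verdicts. The sketch below makes one assumption beyond the table: "matching samples" is taken to mean samples that share the cluster's most common verdict. Sample miss rate is left out because its inputs depend on how the additional samples are drawn. Verdict labels follow the checklist (`X`, `N`, `partial`); function names are hypothetical.

```python
from collections import Counter


def intra_cluster_consistency(verdicts: list[str]) -> float:
    """(matching samples) / (reviewed samples), where 'matching' is assumed
    to mean agreeing with the most common verdict in the cluster."""
    if not verdicts:
        raise ValueError("no reviewed samples")
    majority_count = Counter(verdicts).most_common(1)[0][1]
    return majority_count / len(verdicts)


def review_efficiency(clustered_time: float, unclustered_time: float) -> float:
    """1 - (clustered time) / (unclustered time)."""
    return 1 - clustered_time / unclustered_time


def cohens_kappa(reviewer_a: list[str], reviewer_b: list[str]) -> float:
    """Cohen's kappa for two reviewers' verdicts on the same samples."""
    if len(reviewer_a) != len(reviewer_b) or not reviewer_a:
        raise ValueError("need two equal-length, non-empty verdict lists")
    n = len(reviewer_a)
    observed = sum(a == b for a, b in zip(reviewer_a, reviewer_b)) / n
    count_a, count_b = Counter(reviewer_a), Counter(reviewer_b)
    # Chance agreement: sum over labels of P(a = label) * P(b = label).
    expected = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)
    if expected == 1:
        return 1.0  # both reviewers used a single identical label throughout
    return (observed - expected) / (1 - expected)


print(intra_cluster_consistency(["X", "X", "X", "N"]))
print(review_efficiency(clustered_time=10, unclustered_time=40))
print(cohens_kappa(["X", "X", "N", "N"], ["X", "N", "N", "N"]))
```

Kappa corrects for agreement expected by chance, so it will usually be lower than the simple agreement rate on the same verdicts.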
- references/clustering_strategy_v4.md: Full v4 strategy document
- templates/semgrep-rules/: Starter rule templates (5 categories)
- templates/sweep.sh: Module sweep runner
- templates/auth_enum.sh: Auth mechanism enumeration helper
- templates/CLUSTERS.md.tmpl: Cluster document template
- templates/REVIEW_CHECKLIST.md.tmpl: Review checklist template

npx claudepluginhub windshock/oh-my-secuaudit --plugin oh-my-secuaudit