CVE Research — Static Analysis Workflow
Purpose
Autonomous security vulnerability research on open-source repositories. Goal: find exploitable vulnerabilities that cross real security boundaries, are NOT intentional design, are NOT duplicates, and score >= Medium (4.0+) on CVSS v4.
2026 Upgrade — What Has Actually Produced High-Impact Findings
This version preserves the original phase discipline, but adds the patterns that repeatedly produced credible Medium/High/Critical candidates in the target set. Use these as prioritization accelerators, not as replacements for exact source-to-sink tracing.
High-yield themes:
- Auth-only is not authorization — endpoints that require a session but skip object, tenant, project, role, or feature permission checks are prime targets.
- Correct initial query, unsafe serializer or relationship expansion: a route may fetch a tenant-scoped object, then serialize global relationships (roles.all(), memberships.all(), logs, spend records, resources) without re-filtering.
- Global mutation inside scoped handlers: a tenant/project-scoped update that calls global clear(), delete, overwrite, or save operations can modify another tenant/project/user's state.
- Secondary protocol surfaces — SFTP, SMTP, MCP, WebSocket bridges, webhooks, callbacks, browser-control channels, and background workers often bypass the HTTP auth model.
- Validation after side effects — upload/publish/import flows that write, extract, cache, or register data before checking route-vs-body consistency often leave exploitable artifacts behind even when the request returns an error.
- Parser/filter mismatches — deny-lists and regex filters applied to raw text are weak when the downstream parser accepts alternate forms: IPv6-mapped IPv4, decimal/octal IPs, case changes, redirects, absolute paths, symlinks, ASN.1 variants, or archive member traversal.
- Trust libraries have security boundaries too — certificate verification, signature validation, policy parsing, and sandbox checks can be CVE-worthy when attacker-controlled bytes are accepted contrary to documented security semantics.
- Fresh-instance and bootstrap flows — first-run installers, public key exchange endpoints, registration/bootstrap tokens, and setup windows can become unauthenticated admin claim paths if they trust reachability instead of possession of a secret or local-only boundary.
- Cross-user observability APIs — logs, spend records, task runs, traces, events, files, resources, and audit details are often scoped by object ID but not by owner/tenant/project.
- Archive and package workflows — tar/zip extraction, package publish, plugin upload, recipe import, restore-from-backup, and extension install flows are high-value because file write can chain into config overwrite, persistence, or execution.
High-impact finding shape:
- A low-privileged or unauthenticated actor controls an identifier, URL, archive, manifest, WebSocket message, certificate, uploaded bundle, callback destination, or relationship payload.
- The code performs an auth check that is too coarse, or validates the wrong representation of the data.
- The sink reads, writes, fetches, trusts, serializes, deletes, forwards, or verifies data across a boundary the attacker should not cross.
- A local PoC proves the boundary, preferably with two principals/tenants/projects or an internal-only listener/file/artifact.
Phase 0 — Session Setup And Evidence Ledger
Before Phase 1, create a small scratch note for the target. Keep it terse:
- repo URL, commit/tag, install mode, exposed services, and default deployment assumptions
- SECURITY.md location and maintainer reporting path
- published advisories already known
- candidate list with status: surface, tracing, filtered, confirmed, reported
- PoC artifact paths and exact commands used
Never let the ledger become a dumping ground. It exists so the final report has exact provenance and so duplicate checks do not get lost.
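A minimal sketch of how the ledger could be kept as structured data rather than free text. The class and field names (Ledger, Candidate, Status) are illustrative assumptions; only the five status values come from the list above.

```python
from dataclasses import dataclass, field
from enum import Enum


class Status(Enum):
    # The five statuses named in the ledger description.
    SURFACE = "surface"
    TRACING = "tracing"
    FILTERED = "filtered"
    CONFIRMED = "confirmed"
    REPORTED = "reported"


@dataclass
class Candidate:
    title: str
    location: str                      # file:line pointer
    status: Status = Status.SURFACE
    note: str = ""                     # filter reason or PoC artifact path


@dataclass
class Ledger:
    repo_url: str
    commit: str
    candidates: list[Candidate] = field(default_factory=list)

    def add(self, title: str, location: str) -> Candidate:
        candidate = Candidate(title, location)
        self.candidates.append(candidate)
        return candidate

    def by_status(self, status: Status) -> list[Candidate]:
        return [c for c in self.candidates if c.status is status]
```

Keeping the status as an enum makes it harder for a candidate to drift into an undefined state between phases.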
Token Budget
One repository per session. Budget strictly.
- Phase 1 (Recon): ~5%
- Phase 2 (Architecture Mapping): ~10%
- Phase 2.5 (Documentation & Intent Analysis): ~5%
- Phase 3 (Git History + Advisory Check): ~10%
- Phase 4 (Attack Surface Identification): ~10%
- Phase 5 (Deep Analysis): ~45%
- Phase 5.5 (Pre-Report Validation): ~5%
- Phase 6 (Reporting): ~10%
At 90% token usage: STOP. Output everything found so far — confirmed vulnerabilities, suspected vulnerabilities, areas not reviewed, file:line pointers for manual follow-up.
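The budget above can be checked mechanically. The following is a sketch only: the function names and the idea of a known total token count are assumptions, while the percentages and the 90% stop rule are taken from the list above.

```python
# Phase budget in whole percent, as listed above. Integers avoid float rounding.
PHASE_BUDGET_PCT = {
    "1": 5, "2": 10, "2.5": 5, "3": 10,
    "4": 10, "5": 45, "5.5": 5, "6": 10,
}
STOP_PCT = 90


def phase_allowance(total_tokens: int) -> dict[str, int]:
    """Return the token allowance per phase for a given session total."""
    return {phase: total_tokens * pct // 100 for phase, pct in PHASE_BUDGET_PCT.items()}


def must_stop(used_tokens: int, total_tokens: int) -> bool:
    """True once usage reaches the 90% stop threshold."""
    return used_tokens * 100 >= total_tokens * STOP_PCT
```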
Phase 1 — Repository Reconnaissance
List all directories and files. Output ONLY tree structure (names, not contents). Skip:
- Package manifests and lock files: package.json, package-lock.json, yarn.lock, composer.lock, go.sum
- Container files: Dockerfile, docker-compose.yml, .dockerignore
- Editor and tool config: .gitignore, .editorconfig, .prettierrc, .eslintrc, config dotfiles
- Project meta-docs: LICENSE, README.md, CHANGELOG.md, CONTRIBUTING.md
- Dependency and cache directories: node_modules/, vendor/, .git/, __pycache__/, .venv/
- Static assets: *.css, *.svg, *.png, *.jpg, *.ico, fonts
- CI/CD: .github/workflows/, .gitlab-ci.yml, Jenkinsfile
- Test fixtures/mock data (but NOT test files, which reveal expected behavior)
Output tree. Move to Phase 2.
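One way to produce the names-only tree is sketched below. It covers only a subset of the skip list (font files and test fixtures are not handled), and the function names are illustrative assumptions.

```python
import fnmatch
import os

SKIP_DIRS = {"node_modules", "vendor", ".git", "__pycache__", ".venv"}
SKIP_PATHS = {os.path.join(".github", "workflows")}
SKIP_FILES = [
    "package.json", "package-lock.json", "yarn.lock", "composer.lock", "go.sum",
    "Dockerfile", "docker-compose.yml", ".dockerignore",
    ".gitignore", ".editorconfig", ".prettierrc", ".eslintrc",
    "LICENSE", "README.md", "CHANGELOG.md", "CONTRIBUTING.md",
    ".gitlab-ci.yml", "Jenkinsfile",
    "*.css", "*.svg", "*.png", "*.jpg", "*.ico",
]


def is_skipped(name: str) -> bool:
    return any(fnmatch.fnmatch(name, pattern) for pattern in SKIP_FILES)


def tree(root: str) -> list[str]:
    """Return an indented list of directory and file names, never contents."""
    lines = []
    for dirpath, dirnames, filenames in os.walk(root):
        rel = os.path.relpath(dirpath, root)
        depth = 0 if rel == "." else rel.count(os.sep) + 1
        if depth:
            lines.append("  " * (depth - 1) + os.path.basename(dirpath) + "/")
        # Prune in place so os.walk never descends into skipped directories.
        dirnames[:] = sorted(
            d for d in dirnames
            if d not in SKIP_DIRS
            and os.path.normpath(os.path.join(rel, d)) not in SKIP_PATHS
        )
        for name in sorted(filenames):
            if not is_skipped(name):
                lines.append("  " * depth + name)
    return lines
```

Calling print("\n".join(tree("."))) from the repository root prints the filtered structure.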
Phase 2 — Architecture Mapping
From tree structure alone, identify:
- Framework/language — What stack?
- Entry points — Where do HTTP requests enter? (routes, controllers, API handlers)
- Authentication layer — Where are auth checks? (middleware, decorators, guards)
- Data layer — Where is data stored/retrieved? (ORM, raw SQL, file I/O)
- Dangerous sinks: files likely containing exec(), system(), eval(), include(), unserialize(), yaml.load(), pickle.loads(), subprocess, template rendering, SQL building, redirect handling, file upload processing
Output brief architecture summary (10-15 lines). Identify TOP 5 attack surface areas. Move to Phase 2.5.
Phase 2.5 — Documentation & Intent Analysis
PURPOSE: Prevent wasting tokens on "by design" behavior the maintainer will reject.
Before deep-diving into code:
- Read SECURITY.md — Understand the project's security model, what they consider a vulnerability, and their threat model.
- Check GitHub Security Advisories: run gh api repos/{owner}/{repo}/security-advisories --paginate or check the Security tab. Look for:
- Existing advisories (open or closed) in the same vulnerability class you're targeting
- REJECTED advisories — these tell you what the maintainer does NOT consider a vulnerability
- Recently patched issues — if the same class was just fixed, your finding may be a duplicate
- Check project docs for security model descriptions:
- What is admin/operator responsibility vs. application responsibility?
- Features described as "configurable" or "opt-in" — these are NOT vulnerabilities
- Sandboxing claims (template engines, expression evaluators) — if documented as sandboxed, SSTI without sandbox escape won't be accepted
- Search issue tracker: run gh search issues --repo {owner}/{repo} "security" OR "by design" OR "wontfix" for the behavioral area you're examining. If a behavior was discussed and marked intentional, skip it.
- Check for extensibility points — If the project says "developers can add X validation via hooks/middleware/triggers," then the absence of that validation is the deployer's choice, not a vulnerability.
Output: List of behaviors confirmed as intentional (SKIP these) and areas with no defensive documentation (PROCEED).
Phase 2.5 Upgrade — Intent Checks That Prevent Bad Reports
Actively answer these before deep code tracing:
- Is this feature documented as operator-controlled, developer-extensible, local-only, or intentionally unsafe behind admin privileges?
- Does the project state that deployments must restrict network exposure? If so, a finding that depends on the operator exposing the service is weak unless the default or documented deployment is network-reachable.
- For bootstrap/install flows, does documentation tell operators to expose the service before claiming the first admin? If yes, remote first-claim can still be valid; if docs require localhost or a secret token, check whether code enforces it.
- For callbacks/webhooks/proxies, who controls the URL: unauthenticated user, normal user, tenant admin, system admin, or config file? SSRF is usually reportable only when an unprivileged party controls the URL.
- For shared-user or multi-tenant products, is cross-tenant membership expected? Shared identity can be legitimate, but cross-tenant role disclosure or mutation is usually not.
- For libraries, docs and inline comments are part of the security contract. If verifier behavior contradicts documented validation semantics, treat that as a strong signal.
Phase 3 — Git History + Advisory Check
- Security-related commits: git log --oneline --all -i --grep="fix" --grep="vuln" --grep="security" --grep="sanitiz" --grep="inject" --grep="CVE" --grep="patch" -n 30. The -i flag makes the match case-insensitive, so "Security" and "cve" are caught as well. Where there's one fix, there's often an incomplete fix nearby.
- Recent diffs in high-risk files: for the top 5 areas from Phase 2, run git log --oneline -n 10 -- <file> and git diff <old>..<new> -- <file> on interesting changes. Look for:
- Partial security fixes
- New user input handling without sanitization
- Removed security checks
- Existing CVEs/GHSAs: check commit messages AND the Security tab. If prior CVEs exist, look for the SAME class in OTHER parts of the codebase.
- CRITICAL: Duplicate check. Before proceeding, verify your target areas don't overlap with:
- Already-published GHSAs (check gh api repos/{owner}/{repo}/security-advisories)
- Recently closed security issues
- Rejected advisories in the same class (if maintainer rejected SSRF before, another SSRF report will likely be rejected too)
Output: Interesting commits/diffs, confirmed non-duplicate areas. Move to Phase 4.
Phase 3 Upgrade — Incomplete-Fix Hunting
When a repo has prior advisories, do not only search for the same vulnerable line. Search for parallel trust decisions:
- same middleware pattern used by a sibling router
- same deny-list reused by another outbound request path
- same object lookup followed by a different serializer
- same archive extraction in local and HTTP clients
- same permission check missing from one adjacent endpoint
- same parser used in verify, import, restore, preview, webhook, or worker paths
- same certificate/policy extension parsed correctly but enforced only inside a conditional gate
Use old advisories as a map of where maintainers already made mistakes. Variants in another module are often stronger than the original if they cross a clearer boundary.
Phase 4 — Attack Surface Identification
Read ONLY entry point files (routes, controllers, API definitions):
- Which endpoints accept user input (params, body, headers, cookies, uploads, path segments)
- Which endpoints have weak or missing authentication
- Source-to-sink paths — how user input flows into the application
For each endpoint, note:
- Source: Where user input enters
- Sink: Where it ends up (query, file op, command, template, redirect)
- Sanitization: What validation/escaping exists between source and sink
- Auth boundary: What privilege level is needed to reach this endpoint
Combine with Phase 3 findings. Rank by severity potential. Move to Phase 5.
Phase 4 Upgrade — Ranking Rubric
Prioritize candidates in this order:
- Unauthenticated boundary creation: first-run admin claim, public install/setup, unauthenticated SSRF/file write/RCE.
- Cross-tenant or cross-project data/control: user A sees or mutates user B, tenant A affects tenant B, project A reads project B resources.
- Low-privileged to admin: authenticated normal user reaches management APIs, password reset, role mutation, backup restore, config write, token minting.
- External data to trust decision: certificate verification, signature parsing, policy enforcement, auth token validation, sandbox escape.
- File write with follow-on use: upload/publish/import/extract/write paths that may later be executed, loaded, served, or trusted.
- SSRF with default bypass: unprivileged URL control plus internal reachability, response disclosure, webhook delivery, or cloud metadata access.
- Information disclosure with strong sensitivity: credentials, logs, spend records, private files, tenant names/roles, API keys, prompts, traces.
Downrank:
- admin-only configuration choices
- self-harm account settings
- non-default insecure deployment unless the project markets it as safe
- speculative chains without a working sink
- DoS-only outcomes
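The rubric can be applied as a simple sort key. This is a sketch under stated assumptions: the category identifiers below are invented short names for the rubric entries, and candidates are modelled as (title, category) pairs.

```python
# Priority order from the rubric, highest first. Identifiers are illustrative.
RANK = [
    "unauth_boundary",
    "cross_tenant",
    "low_priv_to_admin",
    "trust_decision",
    "file_write_followon",
    "ssrf_default_bypass",
    "sensitive_disclosure",
]
DOWNRANK = {
    "admin_only_config", "self_harm", "non_default_deployment",
    "speculative_chain", "dos_only",
}


def priority(category: str) -> int:
    """Lower number means analyze earlier; downranked classes sort last."""
    if category in DOWNRANK:
        return len(RANK)
    return RANK.index(category)  # raises ValueError for unknown categories


def rank(candidates: list[tuple[str, str]]) -> list[tuple[str, str]]:
    return sorted(candidates, key=lambda c: priority(c[1]))
```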
Phase 5 — Deep Analysis (Targeted)
Work through prioritized list from Phase 4. For EACH path:
- Trace the full data flow — source → every function → sink, reading every file in the chain.
- Identify bypasses — incomplete validation, type confusion, race conditions, logic errors, traversal, deserialization, second-order injection, SSTI, SSRF, file write+include chains.
- Assess exploitability — Can an EXTERNAL attacker trigger this? What privileges needed? Is it reachable?
Analyze ONE AT A TIME. Check token budget before next path.
Vulnerability Class Notes
RCE: File write + file include, command injection in exec/system/popen, deserialization gadgets, SSTI with sandbox escape. Template injection in sandboxed engines without proven escape is NOT worth reporting.
SQL Injection: String concat in queries, ORM raw methods with user input, ORDER BY/LIMIT injection.
SSRF: Endpoint fetches user-supplied URL — check internal IP blocking. CRITICAL: If the URL is configured by an admin (webhook, proxy target, ForwardAuth backend), it's the admin's operational choice — NOT SSRF. Only report where an unprivileged user controls the URL.
Path Traversal / LFI: User input in file paths/template names/include paths. Check if ../ survives sanitization.
Auth Bypass: Session/token logic errors, IDOR, privilege escalation, CSRF skip. CRITICAL: An authenticated user modifying their OWN MFA/security settings with a valid session is NOT a boundary violation. Many services intentionally allow this. Only report cross-boundary attacks: unauth→auth, userA→userB, user→admin.
Race Conditions: CRITICAL: Race conditions requiring pre-existing credentials (password + MFA code) are almost always Low severity (< 4.0 CVSS). Only report if: (1) race allows unauthenticated attacker to bypass auth entirely, or (2) impact is Critical (full account takeover from single leaked token).
XSS: Reflected/stored/DOM XSS with real impact. Skip self-XSS.
Phase 5 Upgrade — Proven Bug Archetypes
Use these concrete archetypes while tracing. A candidate is not reportable until the specific code path is proven in the current target.
A. Scoped lookup followed by unscoped relationship serialization
request tenant/project -> queryset filters base object -> serializer returns object.relationships.all()
Check include=, expand=, fields=, with=, nested serializers, preload hooks, GraphQL resolvers, and log/detail endpoints. Prove with two tenants/projects and a shared user/object.
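Archetype A can be illustrated with a framework-free model. Everything here (the Membership type, the in-memory table, the function names) is hypothetical and stands in for an ORM queryset plus serializer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Membership:
    user: str
    tenant: str
    role: str


MEMBERSHIPS = [
    Membership("alice", "tenant-a", "editor"),
    Membership("alice", "tenant-b", "owner"),
]


def serialize_user_unscoped(user: str, tenant: str) -> list[Membership]:
    # The base lookup is tenant-scoped...
    if not any(m.user == user and m.tenant == tenant for m in MEMBERSHIPS):
        raise LookupError("user not in tenant")
    # ...but the expansion behaves like memberships.all(): every tenant's rows.
    return [m for m in MEMBERSHIPS if m.user == user]


def serialize_user_scoped(user: str, tenant: str) -> list[Membership]:
    # The relationship is re-filtered by the request's tenant.
    rows = [m for m in MEMBERSHIPS if m.user == user and m.tenant == tenant]
    if not rows:
        raise LookupError("user not in tenant")
    return rows
```

A tenant-a caller using the unscoped variant learns that alice is an owner in tenant-b, which is the two-tenant proof the archetype asks for.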
B. Scoped update followed by global relationship mutation
tenant-scoped PATCH -> load shared user/object -> relation.clear()/delete()/set() -> recreate only current scope
Confirm whether the ORM relationship table is global. A tenant-scoped handler must filter deletes by tenant/project/namespace.
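A minimal sketch of Archetype B, assuming a global role table modelled as (user, tenant, role) tuples. The function names are hypothetical; the vulnerable variant mirrors a relation.clear() followed by re-creating only the current scope.

```python
Row = tuple[str, str, str]  # (user, tenant, role)


def update_roles_global(table: list[Row], user: str, tenant: str, roles: list[str]) -> list[Row]:
    # Vulnerable: clears every row for the user, regardless of tenant.
    kept = [row for row in table if row[0] != user]
    return kept + [(user, tenant, role) for role in roles]


def update_roles_scoped(table: list[Row], user: str, tenant: str, roles: list[str]) -> list[Row]:
    # Fixed: the delete is filtered by the tenant of the request.
    kept = [row for row in table if not (row[0] == user and row[1] == tenant)]
    return kept + [(user, tenant, role) for role in roles]
```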
C. Authenticated endpoint missing adjacent permission
@endpoint/auth_required -> no authorize("feature:write") -> modifies users/config/passwords/roles/resources
Compare sibling endpoints. If adjacent config write requires a permission and the password/role/resource write does not, trace the impact.
D. Validation after side effect
upload/import/publish -> write/extract/register/cache -> compare URL/body/manifest -> return 400 -> artifact remains
An error response does not make the code safe if the side effect already crossed a boundary. Prove with a leftover artifact, changed DB row, changed permission, or outbound request.
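The ordering problem in Archetype D is easiest to see side by side. This sketch uses a plain dict as a stand-in for a package registry; the function and key names are assumptions.

```python
def publish_unsafe(store: dict, route_name: str, manifest: dict) -> None:
    # Side effect first: the artifact is registered under the body's name.
    store[manifest["name"]] = manifest["payload"]
    # Validation second: the request fails, but the write already happened.
    if manifest["name"] != route_name:
        raise ValueError("route/body name mismatch")


def publish_safe(store: dict, route_name: str, manifest: dict) -> None:
    # Validate route-vs-body consistency before touching any state.
    if manifest["name"] != route_name:
        raise ValueError("route/body name mismatch")
    store[route_name] = manifest["payload"]
```

In the unsafe variant the caller receives an error, yet an entry for a package name they do not own remains in the store.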
E. Unsafe archive or bundle handling
attacker tar/zip/package -> extractall/copy path from manifest -> file outside destination
Check both publish-time and pull/install-time behavior. Check local and remote client paths separately.
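When reviewing an extraction path, it helps to know which members a safe extractor would have to reject. The helper below is a sketch using the standard tarfile module; the function name is an assumption, and it flags links conservatively rather than resolving them.

```python
import os
import tarfile


def unsafe_members(archive_path: str, dest: str) -> list[str]:
    """List member names that would land outside dest, plus any link members."""
    dest_real = os.path.realpath(dest)
    flagged = []
    with tarfile.open(archive_path) as tar:
        for member in tar.getmembers():
            # An absolute member name replaces dest entirely in os.path.join.
            target = os.path.realpath(os.path.join(dest_real, member.name))
            escapes = os.path.commonpath([dest_real, target]) != dest_real
            if escapes or member.issym() or member.islnk():
                flagged.append(member.name)
    return flagged
```

If the target's extraction code calls extractall on attacker-supplied archives without an equivalent check or an extraction filter, members reported here are candidates for a file-write PoC.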
F. Deny-list parser mismatch
raw user URL/path/cert/policy -> regex/string check -> downstream parser accepts different semantic target
For SSRF, test IPv6-mapped loopback, case changes, redirects, DNS rebinding, decimal/octal IPs, userinfo, fragments, and trailing dots. For paths, test absolute paths, symlinks, .., encoded traversal, and archive member names.
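The mismatch between a string deny-list and the address a parser actually produces can be shown with the standard ipaddress module. Both functions are hypothetical: the first imitates a weak filter, the second normalizes alternate spellings before deciding. Hostnames are left unresolved here, so DNS rebinding and redirects are out of scope for this sketch.

```python
import ipaddress
from typing import Optional


def naive_denylist_blocks(host: str) -> bool:
    # Imitation of a raw-text filter that only knows the common spellings.
    return host in {"localhost", "127.0.0.1"} or host.startswith(("10.", "192.168."))


def resolves_internal(host: str) -> Optional[bool]:
    """True/False for IP literals after normalization, None for hostnames."""
    try:
        if host.isdigit():
            # Decimal form, e.g. "2130706433" is 127.0.0.1.
            addr = ipaddress.ip_address(int(host))
        else:
            addr = ipaddress.ip_address(host.strip("[]"))
    except ValueError:
        return None
    # Unwrap IPv4-mapped IPv6 (::ffff:a.b.c.d) explicitly; older Python
    # versions do not treat the mapped form as loopback on their own.
    mapped = getattr(addr, "ipv4_mapped", None)
    if mapped is not None:
        addr = mapped
    return addr.is_loopback or addr.is_private or addr.is_link_local
```

Inputs that the naive filter allows but resolves_internal reports as internal are the ones worth sending through the target's real fetch path.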
G. Secondary protocol bypass
non-HTTP protocol/message -> separate auth model -> reaches same state/files/resources as HTTP API
Inspect SFTP/FTP/SMTP/MCP/WebSocket/gRPC/worker queues. Verify whether the HTTP authorization model is repeated on the protocol path.
H. WebSocket or browser bridge trust failure
public bind -> optional Origin check -> no token/session -> client message controls another connection/session
Origin is not authentication. Missing Origin from a non-browser client must be handled explicitly. Prove cross-connection forwarding or session hijack with two clients.
I. Library trust-boundary acceptance
attacker-controlled bytes -> parser preserves security extension -> verifier skips enforcement under a side condition
For crypto/security libraries, compare against standards, documented semantics, and another trusted implementation when feasible.
J. Cross-user log/resource lookup
authenticated user -> supplies request_id/session_id/resource_id -> query by id only -> returns another user's data
IDs may be "unguessable" yet the finding is still reportable when they are leaked, referenced elsewhere, or otherwise observable and the system promises tenant/user isolation. Calibrate Attack Requirements honestly.
Phase 5 Upgrade — PoC Standard
Every confirmed candidate should have a minimal verifier that:
- creates or identifies two principals/scopes when the bug is cross-boundary
- performs a negative control proving the caller should not have access
- triggers the vulnerable request/path
- captures the exact forbidden effect: response body, DB row, file artifact, internal listener hit, role deletion, session takeover, or trust acceptance
- writes artifacts to a stable path and prints 3-6 success indicators
- avoids destructive actions against real services; use disposable local state
Prefer a small reproducible script over a long narrative. The report can quote the script's important output rather than dumping the entire file.
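A verifier skeleton along these lines keeps the negative control and the exploit step separate. The names (Check, run_verifier, confirmed) are illustrative assumptions; the two callables would wrap whatever requests or file checks the specific target needs.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Check:
    name: str
    passed: bool


def run_verifier(
    negative_control: Callable[[], bool],
    exploit: Callable[[], bool],
) -> list[Check]:
    """Run the control first; attempt the exploit only if the control holds."""
    checks = [Check("negative control: direct access denied", negative_control())]
    if checks[0].passed:
        checks.append(Check("exploit: forbidden effect observed", exploit()))
    return checks


def confirmed(checks: list[Check]) -> bool:
    # Both steps must have run and passed for the boundary to count as proven.
    return len(checks) == 2 and all(c.passed for c in checks)
```

If the negative control fails, the caller already had access and the run proves nothing about a boundary, which is why the exploit step is skipped in that case.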
Phase 5.5 — Pre-Report Validation
Every finding MUST pass ALL checks below. If any check fails, DO NOT REPORT.
1. Security Boundary Check
Does this cross a real security boundary?
- Unauthenticated → Authenticated access: VALID
- User A → User B's data/actions: VALID
- Regular user → Admin privileges: VALID
- Authenticated user modifying own settings: INVALID (self-harm)
- Admin configuring their own system: INVALID (operational choice)
2. Intentional Behavior Check
Have you verified this is NOT:
- Documented as a deliberate design choice (in PRs, issues, docs)?
- An admin-configurable feature where the project provides tools to secure it?
- Covered by extensibility hooks the project expects deployers to use?
3. Duplicate Check
Have you confirmed:
- No existing GHSA (open, closed, or rejected) covers this?
- No recent commit already fixed this exact issue?
- No rejected advisory in the same vulnerability class exists?
4. CVSS v4 Estimation
Estimate the score:
- Attack Vector: N/A/L/P
- Attack Complexity: L/H
- Attack Requirements: N/P
- Privileges Required: N/L/H
- User Interaction: N/P/A
- Impact on the vulnerable system (VC/VI/VA) and on subsequent systems (SC/SI/SA): N/L/H each
If score < 4.0 → DO NOT REPORT. Race conditions with high privilege requirements are almost always < 4.0.
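To avoid malformed vectors in reports, the metric values can be validated before the string is assembled. This sketch only builds and checks the base vector; it does not compute the numeric score, which should come from an official CVSS v4.0 calculator. Note that a CVSS v4.0 base vector also carries the subsequent-system impact metrics SC/SI/SA, so they are included here. The function name is an assumption.

```python
# Base metrics in CVSS v4.0 vector order, with their allowed values.
ALLOWED = {
    "AV": set("NALP"), "AC": set("LH"), "AT": set("NP"),
    "PR": set("NLH"), "UI": set("NPA"),
    "VC": set("HLN"), "VI": set("HLN"), "VA": set("HLN"),
    "SC": set("HLN"), "SI": set("HLN"), "SA": set("HLN"),
}


def build_vector(**metrics: str) -> str:
    """Return a CVSS:4.0 base vector string, or raise ValueError."""
    missing = [name for name in ALLOWED if name not in metrics]
    if missing:
        raise ValueError(f"missing metrics: {missing}")
    parts = []
    for name, allowed in ALLOWED.items():
        value = metrics[name]
        if value not in allowed:
            raise ValueError(f"invalid value {value!r} for {name}")
        parts.append(f"{name}:{value}")
    return "CVSS:4.0/" + "/".join(parts)
```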
5. Attacker Model Reality Check
- Who is the realistic attacker?
- What do they already need? (creds, admin access, specific config, timing)
- Would a reasonable maintainer consider this a vulnerability in THEIR software?
- If exploitation requires the admin to have configured something specific → not a CVE
If any check fails → note as "Filtered: [reason]" and move on.
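The five checks can be expressed as a single gate that returns either a filter reason or a go-ahead. The dictionary keys and reason strings below are hypothetical; only the checks themselves and the 4.0 threshold come from this phase.

```python
# (finding key, filter reason) pairs, in the order the checks are listed.
CHECKS = [
    ("crosses_boundary", "no real security boundary crossed"),
    ("not_intentional", "documented or intentional behavior"),
    ("not_duplicate", "duplicate of an existing advisory or fix"),
    ("realistic_attacker", "needs an unrealistic attacker or admin-specific config"),
]
MIN_SCORE = 4.0


def validate(finding: dict) -> str:
    """Return 'Report' only if every check passes and the score is Medium+."""
    for key, reason in CHECKS:
        if not finding.get(key, False):
            return f"Filtered: {reason}"
    if finding.get("cvss", 0.0) < MIN_SCORE:
        return "Filtered: CVSS v4 estimate below 4.0"
    return "Report"
```

Missing keys count as failed checks, which matches the rule of defaulting to not reporting when in doubt.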
Phase 5.5 Upgrade — Boundary Calibration
Use these calibrated examples:
- Valid: tenant A admin receives tenant B roles/memberships for a shared user.
- Valid: tenant A admin's scoped role update deletes tenant B roles.
- Valid: authenticated project A user reads project B file/resource through unscoped handler.
- Valid: unauthenticated caller reaches loopback/private service through default SSRF deny-list bypass.
- Valid: low-privileged account resets another user's password when sibling config APIs require elevated permission.
- Valid: untrusted certificate chain is accepted despite documented path/security constraint enforcement.
- Valid: malicious package publisher causes a later pull/install by another user to write outside the chosen directory.
- Invalid by default: admin configures their own webhook/proxy/backend URL.
- Invalid by default: logged-in user changes their own MFA/password/API key.
- Invalid by default: sandboxed template expression without a proven sandbox escape or sensitive read.
- Invalid by default: race requiring password plus MFA/recovery code plus timing unless the impact is account takeover from materially less access.
Phase 6 — Reporting
Accuracy is everything. Only report what you can trace with certainty AND what passed Phase 5.5.
For each finding:
### [Vulnerability Title]
**Class**: [RCE/SQLi/SSRF/XSS/Path Traversal/Auth Bypass/etc.]
**Severity**: [Critical/High/Medium]
**CVSS v4 Estimate**: [score] (CVSS:4.0/AV:[N/A/L/P]/AC:[L/H]/AT:[N/P]/PR:[N/L/H]/UI:[N/P/A]/VC:[N/L/H]/VI:[N/L/H]/VA:[N/L/H]/SC:[N/L/H]/SI:[N/L/H]/SA:[N/L/H])
**Confidence**: [Confirmed/High/Medium]
**File(s)**: [exact file paths and line numbers]
**Auth Required**: [None/Any user/Admin]
**Security Boundary Crossed**: [unauth→auth / userA→userB / user→admin]
**Intentional Check**: [Confirmed not documented as intentional — checked: docs/issues/PRs/GHSA]
**Existing Advisory Check**: [No existing GHSA found]
**Logic Flow**: [source → function → function → sink, with file:line for each step]
**Why it works**: [1-2 sentences on what's broken]
**PoC Evidence**: [command, artifact path, exact success indicators]
**Negative Control**: [what proves the action should have been denied or scoped]
**Remediation**: [specific code-level fix, not generic "sanitize input"]
If stopped at 90% token usage:
### Not Yet Reviewed
[file:line paths not analyzed]
[anything suspicious noticed in passing]
Rules
- NEVER read files outside the prioritized attack surface unless a data flow leads there
- NEVER dump entire file contents — reference by file:line
- NEVER scan files sequentially — always follow data flows
- Prioritize DEPTH over BREADTH — one fully traced exploitable bug > 20 surface-level observations
- If a finding chains with another (file write + file include), trace BOTH sides
- Skip informational findings (missing headers, cookie flags, version disclosure)
- Skip DoS, regex complexity, missing rate limiting
- BE 100% ACCURATE on logic flows — every step must reference real code at real file:line
- Track token usage. At 90%, stop and report.
Rejection Prevention Rules
- NEVER report behavior documented as intentional by the maintainer. Check docs/issues/PRs BEFORE reporting.
- NEVER report admin-configurable features as vulnerabilities — if an admin opts into a behavior and the project provides tools to secure it, it's not a CVE.
- NEVER report self-harm scenarios — authenticated user modifying own account/MFA/settings with a valid session is not a boundary violation.
- NEVER report without checking existing GHSAs (open, closed, AND rejected) — duplicates destroy credibility with maintainers.
- ONLY report findings estimated at CVSS v4 >= 4.0 (Medium+). Race conditions with high privilege requirements are almost always Low.
- When in doubt, default to NOT reporting. A rejected advisory damages future credibility far more than a missed finding.
Output Hygiene
For each target, keep final artifacts separate from scratch:
- reports/<target>/ for polished reports, advisory drafts, and PDFs
- output/<target>/ or target-local output/ for verifier outputs
- tmp/ only for disposable build/runtime artifacts
Before sharing to another machine or maintainer, scrub:
- cookies, bearer tokens, .env*, private keys, real API keys, real user data
- large dependency folders (node_modules, .venv, caches)
- Docker volumes and DB dumps unless they are intentionally minimized evidence
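A quick pre-share scan can catch the most obvious leftovers in verifier output. The patterns below are illustrative assumptions and far from exhaustive; a clean result is not proof that an artifact is safe to share.

```python
import re

# Rough patterns for a few of the secret types listed above.
PATTERNS = {
    "bearer token": re.compile(r"Bearer\s+[A-Za-z0-9._~+/=-]{16,}"),
    "private key": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
    "cookie header": re.compile(r"(?im)^(?:set-)?cookie:\s*\S+"),
}


def find_secrets(text: str) -> list[str]:
    """Return the sorted names of every pattern that matches the text."""
    return sorted(name for name, pattern in PATTERNS.items() if pattern.search(text))
```

Running find_secrets over each file under output/ before packaging gives a short list of artifacts that need manual redaction.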
Keep the methodology files versioned separately from target repos so future findings improve the workflow without mutating old evidence.