From copilot-cli-toolkit
Classifies PR changes by file type and runs targeted quality gates for functional, non-functional, security, DevOps, DX, and observability concerns. Use after /build on git diffs.
Slash command: /copilot-cli-toolkit:test
@CLAUDE.md
Test: $ARGUMENTS
If $ARGUMENTS is empty, test the current branch diff against the base branch.
Detect the base branch from gh pr view --json baseRefName or fall back to main. Run git diff origin/<base-branch> --name-only and classify changed files:
| Type | Patterns | Gates to Run |
|---|---|---|
| CODE | *.py, *.ps1, *.ts, *.js, *.cs | All 6 gates |
| WORKFLOW | *.yml in .github/workflows/ | Gates 1, 3, 4 |
| CONFIG | *.json, *.yaml (non-workflow) | Gates 3, 4 |
| DOCS | *.md, *.txt, *.rst | Gate 5 only |
| MIXED | Combination | Apply per-file rules |
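The detection and classification steps above can be sketched in Python as follows. This is a minimal illustration, not the skill's actual implementation: the function names, the injectable `run` callable, and the `OTHER` bucket for files the table does not cover (for example `*.yml` outside `.github/workflows/`) are all assumptions made for the example.

```python
import json
import subprocess
from typing import Callable, List

CODE_EXT = (".py", ".ps1", ".ts", ".js", ".cs")
CONFIG_EXT = (".json", ".yaml")
DOCS_EXT = (".md", ".txt", ".rst")


def run_cmd(args: List[str]) -> str:
    """Default runner: execute a command and return its stdout."""
    return subprocess.run(args, capture_output=True, text=True, check=True).stdout


def detect_base_branch(run: Callable[[List[str]], str] = run_cmd) -> str:
    """Ask gh for the PR base branch; fall back to "main" on any failure."""
    try:
        out = run(["gh", "pr", "view", "--json", "baseRefName"])
        return json.loads(out)["baseRefName"] or "main"
    except Exception:
        # No PR, gh missing, or unparsable output: use the documented fallback.
        return "main"


def changed_files(base: str, run: Callable[[List[str]], str] = run_cmd) -> List[str]:
    """List files that differ from origin/<base>."""
    out = run(["git", "diff", f"origin/{base}", "--name-only"])
    return [line for line in out.splitlines() if line.strip()]


def classify_file(path: str) -> str:
    """Map one path to a row of the table; OTHER is a hypothetical catch-all."""
    p = path.replace("\\", "/")
    # Workflow check comes first so workflow files are never treated as CONFIG.
    if p.startswith(".github/workflows/") and p.endswith(".yml"):
        return "WORKFLOW"
    if p.endswith(CODE_EXT):
        return "CODE"
    if p.endswith(CONFIG_EXT):
        return "CONFIG"
    if p.endswith(DOCS_EXT):
        return "DOCS"
    return "OTHER"
```

Passing the command runner as a parameter keeps the logic testable without a real repository or an authenticated gh session.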
Print: PR TYPE: [type]. Running gates: [list].
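Skip non-applicable gates. Do not waste agent invocations on irrelevant dimensions. Gates are numbered in the order they run below: 1 Functional, 2 Non-Functional, 3 Security, 4 DevOps, 5 DX, 6 Observability.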
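One way to turn the classified files into a gate list and the status line is sketched below. It assumes that "apply per-file rules" means taking the union of the gates required by each file's type, and that gate numbers follow the order of the report table (1 Functional through 6 Observability). The function names and the `NONE` label for a diff with no recognized files are hypothetical.

```python
from typing import Dict, Iterable, List, Set

# Gate numbers per file type, copied from the classification table.
GATES_BY_TYPE: Dict[str, Set[int]] = {
    "CODE": {1, 2, 3, 4, 5, 6},
    "WORKFLOW": {1, 3, 4},
    "CONFIG": {3, 4},
    "DOCS": {5},
}


def select_gates(file_types: Iterable[str]) -> List[int]:
    """Union of the gates required by every file type present in the diff."""
    gates: Set[int] = set()
    for file_type in file_types:
        gates |= GATES_BY_TYPE.get(file_type, set())
    return sorted(gates)


def pr_type(file_types: Iterable[str]) -> str:
    """Single type if all recognized files agree, otherwise MIXED."""
    kinds = {t for t in file_types if t in GATES_BY_TYPE}
    if not kinds:
        return "NONE"  # hypothetical label: nothing in the diff matched the table
    return kinds.pop() if len(kinds) == 1 else "MIXED"


def status_line(file_types: Iterable[str]) -> str:
    """Format the line the skill prints before running any gate."""
    types = list(file_types)
    gates = ", ".join(str(g) for g in select_gates(types))
    return f"PR TYPE: {pr_type(types)}. Running gates: {gates}."
```

For a diff touching one workflow file and one Markdown file, `status_line(["WORKFLOW", "DOCS"])` yields `PR TYPE: MIXED. Running gates: 1, 3, 4, 5.`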
Invoke Skill(skill="code-qualities-assessment") for quality baseline.
Task(subagent_type="qa"): You are a senior QA engineer. Your job is to catch issues that will cause production incidents. Be skeptical. Cite specific file:line evidence for every finding. Evaluate:
Output: VERDICT: PASS|WARN|CRITICAL_FAIL with findings array.
Task(subagent_type="analyst"): You are a performance and reliability engineer. Focus on failure modes, not the happy path. Use measurable criteria, not subjective judgments. Evaluate:
Output: VERDICT: PASS|WARN|CRITICAL_FAIL with findings array.
Invoke Skill(skill="security-scan") for CWE pattern detection.
Task(subagent_type="security"): You are a security auditor performing OWASP Top 10 review. Assume every input is malicious. Reference CWE numbers for every finding. Evaluate:
Output: VERDICT: PASS|WARN|CRITICAL_FAIL with findings array including CWE references.
Task(subagent_type="devops"): You are a build and release engineer. Focus on pipeline safety, reproducibility, and supply chain security. Evaluate:
Output: VERDICT: PASS|WARN|CRITICAL_FAIL with findings array.
Task(subagent_type="critic"): You are a developer advocate reviewing from the consumer perspective. Would a new contributor understand this code? Would the API frustrate or delight? Evaluate:
Output: VERDICT: PASS|WARN|CRITICAL_FAIL with findings array.
Task(subagent_type="architect"): You are an SRE reviewing production readiness. If this code fails at 3am, can oncall diagnose it without reading the source? Evaluate:
Output: VERDICT: PASS|WARN|CRITICAL_FAIL with findings array.
Each gate MUST produce a verdict line and findings array:
GATE: [name]
VERDICT: PASS|WARN|CRITICAL_FAIL
FINDINGS:
- [SEVERITY] (file:line) description — recommendation
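Because every gate emits the same shape, its output can be parsed mechanically before synthesis. The sketch below is one possible parser, assuming each finding sits on a single line and uses the em-dash separator shown in the template; the `Finding` and `GateReport` types and the function name are illustrative, not part of the skill.

```python
import re
from dataclasses import dataclass, field
from typing import List

VERDICTS = {"PASS", "WARN", "CRITICAL_FAIL"}
# Matches "- [SEVERITY] (file:line) rest of the finding".
FINDING_RE = re.compile(r"^- \[(?P<severity>[^\]]+)\] \((?P<location>[^)]+)\) (?P<text>.+)$")
SEPARATOR = " \u2014 "  # em-dash between description and recommendation


@dataclass
class Finding:
    severity: str
    location: str
    description: str
    recommendation: str


@dataclass
class GateReport:
    name: str
    verdict: str
    findings: List[Finding] = field(default_factory=list)


def parse_gate_report(text: str) -> GateReport:
    """Parse one gate's output; raise if the GATE or VERDICT line is unusable."""
    name = None
    verdict = None
    findings: List[Finding] = []
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("GATE:"):
            name = line[len("GATE:"):].strip()
        elif line.startswith("VERDICT:"):
            verdict = line[len("VERDICT:"):].strip()
        else:
            match = FINDING_RE.match(line)
            if match:
                description, _, recommendation = match.group("text").partition(SEPARATOR)
                findings.append(Finding(
                    match.group("severity"),
                    match.group("location"),
                    description.strip(),
                    recommendation.strip(),
                ))
    if name is None or verdict not in VERDICTS:
        raise ValueError("gate output is missing a GATE line or a valid VERDICT line")
    return GateReport(name, verdict, findings)
```

Rejecting output without a valid verdict line makes a malformed gate response fail loudly instead of being counted as a silent pass.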
Synthesize into overall report:
| Gate | Verdict | Findings | Evidence |
|---|---|---|---|
| Functional | PASS/WARN/CRITICAL_FAIL | Count | file:line citations |
| Non-Functional | PASS/WARN/CRITICAL_FAIL | Count | file:line citations |
| Security | PASS/WARN/CRITICAL_FAIL | Count | CWE references |
| DevOps | PASS/WARN/CRITICAL_FAIL | Count | file:line citations |
| DX | PASS/WARN/CRITICAL_FAIL | Count | file:line citations |
| Observability | PASS/WARN/CRITICAL_FAIL | Count | file:line citations |
Overall verdict: CRITICAL_FAIL if any gate returns CRITICAL_FAIL; otherwise WARN if any gate returns WARN; otherwise PASS.
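The roll-up rule amounts to taking the most severe verdict among the gates that ran. A minimal sketch, assuming the three verdict strings are the only valid inputs and that an empty list counts as PASS (an assumption, since the source does not cover that case):

```python
from typing import Iterable

# Higher number means more severe.
SEVERITY = {"PASS": 0, "WARN": 1, "CRITICAL_FAIL": 2}


def overall_verdict(verdicts: Iterable[str]) -> str:
    """Return the most severe verdict among the gates that ran."""
    worst = "PASS"
    for verdict in verdicts:
        if verdict not in SEVERITY:
            raise ValueError(f"unknown verdict: {verdict!r}")
        if SEVERITY[verdict] > SEVERITY[worst]:
            worst = verdict
    return worst
```

A single CRITICAL_FAIL therefore dominates any number of WARN or PASS results.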
npx claudepluginhub rjmurillo/ai-agents