From ai-native-toolkit
Assesses codebase AI-readiness with a layered contract score and generates complexity hotspot SVGs (treemap of LOC, cyclomatic complexity, git churn) plus doc-navigability graphs. Useful for onboarding AI agents or triaging migration risk.
Slash command: /ai-native-toolkit:assess
Three artefacts in one pass against a target repo:
Both SVGs are colour-blind-safe by default (OrRd ramp, no red-green).
All land as files inside the target repo. The skill always writes them locally; after writing, ask the user whether to open a PR in the target repo with the artefacts.
Read this before scoring - it changes how you score. Across every layer, the real signal is never presence. It is whether a thing is under active pressure to stay true:
So AI-readiness is the degree to which a codebase's self-descriptions are kept honest, not the degree to which scaffolding exists. Score artefacts on maintenance pressure, not existence. A stale-but-present doc scores at or below absent: missing makes the agent go look; confidently-stale makes it navigate fast to a wrong, current-looking conclusion.
The 9 layers (0-8) fall into three bands, ordered by dependency - what must hold for the next band to mean anything:
The write-side scores aren't abstract good practice - each traces to a known tendency of an AI contributor, observed across models. All three are the same defect: a self-description (the file's shape, a comment's promise, a gate's verdict) under no pressure to stay true. The deterministic core turns each into a cross-layer finding so the report names the specific files, not just the category:
accretion_ratchet finding: a file whose accumulated line count ratcheted monotonically upward across multiple commits with almost no deletion pressure (deletions below ~15% of total churn). Only top complexity/size-band files are flagged, so growth-but-simple is never noise. It appears in three places - the accretion_ratchet block in run-context.json, the accretion_ratchet cross-layer finding (with its files in the attention list), and a growth-profile line on each flagged hotspot page (hotspots/*.md). The signal disclaims itself (rather than dropping the result) when the git history is degenerate - a shallow clone or squashed import has no meaningful net-delta sequence, so the block carries reliable: false and the hotspot line is marked as possibly incomplete.
TODO / FIXME / "remove after migration"). Instrumented via the unactioned_intent finding: markers aged by the edits they survived without being kept - a lying map of intent.
$ARGUMENTS
# If arguments provided, use that path. Otherwise use pwd.
# Find the git root from wherever we are.
git rev-parse --show-toplevel
Set $REPO_ROOT to the result. All scanning happens from here.
Decide the output directory (default: $REPO_ROOT/.assess/). Create it if needed:
mkdir -p "$REPO_ROOT/.assess"
Artefacts will land at:
$REPO_ROOT/.assess/complexity-heatmap.svg
$REPO_ROOT/.assess/complexity-stats.json
$REPO_ROOT/.assess/doc-graph.svg
$REPO_ROOT/.assess/assess-report.md
Write-protected repo root?
/assess writes .assess/ into $REPO_ROOT, and the treemap/core run as uv subprocesses that write there too. If your workflow keeps the repo root pristine and read-only (e.g. a <repo>-main clone that teammates branch from, with a hook blocking direct edits), a guard on your writes won't stop the subprocess - it just makes the run write into the directory you meant to protect. Create a worktree first and run /assess from there.
This step produces two views of the codebase, both colour-blind-safe (OrRd ramp, no red-green):
Complexity heatmap (complexity-heatmap.svg) - a treemap of the code. Size = LOC, colour = cyclomatic complexity, saturation = recent churn. Vivid red = complex AND active = "hard to change safely".
Doc-navigability graph (doc-graph.svg) - a node-graph of the docs. Structure shows connectivity (centre = entry point, rings = link-distance, rim = unreachable; orphans carry a dashed ring); colour shows staleness in the same grammar as the code heatmap (vivid red = a frozen doc beside churning code = a lying map); size = file length. It folds both Layer 0 doc signals - navigability and the decaying-map - into one artifact. Beyond static wikilinks and CommonMark links, it recognises Obsidian vault-native navigation - .base view hubs and dataview query blocks - as edges (resolved statically by folder / tag / frontmatter predicate), so a vault navigated by dynamic queries isn't mis-scored as orphaned. The SVG and the scored signal compute over the identical doc set: both honour the same excludes (.assess/config.toml).
Feed the complexity stats into the linter/complexity layer (Layer 3) and the doc_graph / doc_staleness blocks of run-context.json into Layer 0 (the graph SVG is the visual; the score reads the structured blocks).
scc (one-time per repo)
The bundled treemap uses lizard (Python, Go, JS, Java, C/C++, etc.) by default. Optional scc extends coverage to 200+ languages including markdown, JSON, YAML, SQL, and shell - useful when the repo's surface is more than just traditional source code.
Before scanning, check three signals:
# 1. Is scc already on PATH?
command -v scc >/dev/null 2>&1 && SCC_PRESENT=1 || SCC_PRESENT=0
# 2. Has the user previously declined for this repo?
[ -f "$REPO_ROOT/.assess/.no-scc" ] && SCC_DECLINED=1 || SCC_DECLINED=0
# 3. Is the repo mostly markdown/data/config (where lizard alone will be sparse)?
# Cheap heuristic: count non-code files vs code files. The `.` argument is
# the regex pattern (matches every path) and "$REPO_ROOT" is the search
# path - without `.`, fd treats $REPO_ROOT as the pattern itself, matches
# nothing, and silently returns 0.
CODE_FILES=$(fd -t f -e py -e js -e ts -e tsx -e jsx -e go -e java -e kt -e rs -e rb -e cs -e swift -e dart -e cpp -e c -e h -e php . "$REPO_ROOT" 2>/dev/null | wc -l | tr -d ' ')
NONCODE_FILES=$(fd -t f -e md -e json -e yaml -e yml -e toml -e sh -e sql . "$REPO_ROOT" 2>/dev/null | wc -l | tr -d ' ')
Offer the install only if all three are true: SCC_PRESENT=0, SCC_DECLINED=0, and the repo looks lizard-sparse (CODE_FILES < NONCODE_FILES or CODE_FILES < 10). Otherwise skip straight to 2b (or 2c if 2b has nothing to offer).
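The three-signal gate above can be sketched in Python as follows. This is an illustration rather than the skill's own implementation: the function name `should_offer_scc` and its parameters are hypothetical, and the counts are assumed to come from the fd commands shown earlier.

```python
def should_offer_scc(scc_present: bool, scc_declined: bool,
                     code_files: int, noncode_files: int) -> bool:
    """Return True only when all three signals say an offer is worthwhile."""
    # Signal 3: lizard alone will be sparse when code is outnumbered or scarce.
    lizard_sparse = code_files < noncode_files or code_files < 10
    return (not scc_present) and (not scc_declined) and lizard_sparse


# A docs-heavy repo with no scc and no prior decline gets the offer.
print(should_offer_scc(False, False, code_files=4, noncode_files=120))   # True
# A code-heavy repo does not, even without scc installed.
print(should_offer_scc(False, False, code_files=300, noncode_files=40))  # False
```

Keeping the gate a pure function of four inputs makes the "ask at most once, and only when it helps" behaviour easy to check in isolation.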
When offering, use AskUserQuestion with three options (do not auto-install - brew install is a system mutation):
$REPO_ROOT/.assess/.no-scc so future runs don't ask. Recommended for prompt repos or pure-docs repos where lizard-only is genuinely fine.
Phrase the question so the user understands the trade-off, e.g.:
"This repo has code files and non-code files (markdown/JSON/YAML). scc would include the non-code files in the treemap; without it the treemap may be sparse. Install scc?"
If the user picks Install scc, run the platform-appropriate command:
# macOS (Homebrew)
[ "$(uname)" = "Darwin" ] && command -v brew >/dev/null && brew install scc
# Linux (try common package managers, fall back to go install or manual)
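[ "$(uname)" = "Linux" ] && {
  # Group each "detect && install" pair so a successful install short-circuits
  # the rest; ungrouped, a && b || c && d would still run d after b succeeds.
  { command -v apt >/dev/null && sudo apt install -y scc; } \
    || { command -v dnf >/dev/null && sudo dnf install -y scc; } \
    || { command -v go >/dev/null && go install github.com/boyter/scc/v3@latest; } \
    || echo "Install scc manually: https://github.com/boyter/scc#installation"
}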
If the install fails or the platform isn't covered, fall back to lizard-only and continue - don't block the assessment.
/assess maps each Layer 1/Layer 3 analysis capability (liveness/dead-code, static module graph, linting, modernization) to a serving tool. Historically that map was a hardcoded per-language allowlist - vulture for Python, ts-prune/knip for TS/JS, staticcheck/deadcode for Go. The defect that allowlist created: when a repo's language isn't enumerated, every capability silently degraded to "unavailable" - the report read "this layer is absent here" rather than "a tool could serve this - install one?". A non-enumerated language was locked out with no resolution path inside the run.
The flow is now capability-driven detect-or-propose, in three moves per capability:
The per-language dead-code offer below is the simplest instance (one capability, install-consent). Without the offer, an absent tool degrades the scan to tool_absent and leaves the user no resolution path inside the skill - they'd have to know which tool fits the language, which package manager to use, and run the install themselves. The same install-offer pattern as Step 2a closes the loop without leaving them to figure it out.
Detect languages with cheap fd counts (mirroring Step 2a's heuristic - the treemap script's own classification isn't exposed in the stats sidecar, and shelling out is fine here):
PY_FILES=$(fd -t f -e py . "$REPO_ROOT" 2>/dev/null | wc -l | tr -d ' ')
TS_FILES=$(fd -t f -e ts -e tsx . "$REPO_ROOT" 2>/dev/null | wc -l | tr -d ' ')
GO_FILES=$(fd -t f -e go . "$REPO_ROOT" 2>/dev/null | wc -l | tr -d ' ')
# Per-language candidate tool. Prefer the read-only tool first - `ts-prune` over
# `knip` for TS, `staticcheck` over `deadcode` for Go - so the user isn't asked
# twice for the same job and the chosen tool doesn't need to build the project.
needs_offer() {
  # $1 = tool; $2 = file count for the language; returns 0 if we should ask.
  local tool="$1" count="$2" min="${3:-5}"
  [ "$count" -ge "$min" ] || return 1
  command -v "$tool" >/dev/null 2>&1 && return 1        # already installed
  [ -f "$REPO_ROOT/.assess/.no-$tool" ] && return 1     # user declined permanently
  return 0
}
OFFERS=() # each entry: "language|tool|install_cmd"
needs_offer vulture "$PY_FILES" && OFFERS+=("python|vulture|pip install vulture (or 'uv tool install vulture')")
needs_offer ts-prune "$TS_FILES" && OFFERS+=("typescript|ts-prune|npm install -g ts-prune")
needs_offer staticcheck "$GO_FILES" && OFFERS+=("go|staticcheck|go install honnef.co/go/tools/cmd/staticcheck@latest (or 'brew install staticcheck')")
If OFFERS is empty (no language hits the threshold, or every tool is already installed/declined), skip straight to 2c.
Otherwise, batch the questions into a single AskUserQuestion call - one question per language in OFFERS, three options per question:
$REPO_ROOT/.assess/.no-<tool> so future runs don't ask. Recommended when the language only appears in scripts/configs that don't warrant symbol-level reachability.
Phrase each question so the gain is concrete, e.g.:
"This repo has 47 Go files. staticcheck -checks U1000 would let /assess flag unreachable Go funcs as Layer 1 candidates. Install? (go install honnef.co/go/tools/cmd/staticcheck@latest or brew install staticcheck)"
When the user picks Install <tool>, run the platform-appropriate command from the offer. Surface any install failure as a chat message and continue - dead-code tools are degrade-don't-block (same contract as scc); a missing tool reduces Layer 1's precision but never gates the assessment.
When the user picks Skip permanently for this repo, write the marker:
mkdir -p "$REPO_ROOT/.assess"
touch "$REPO_ROOT/.assess/.no-<tool>" # e.g. .no-staticcheck
For multi-language repos with several offers, the AskUserQuestion call lists every language in one prompt rather than serialising. The user answers once and the run proceeds with whichever tools they accepted.
When the deterministic core detects a Maven or Gradle project it emits a capability_offers block in run-context.json - the first proof of the capability-driven flow on a non-enumerated ecosystem. Read it after Step 2c's core run, before scoring, and act on each capability's state:
jq '.capability_offers' "$REPO_ROOT/.assess/run-context.json"
liveness → state: "offer" - Maven was detected but mvn dependency:analyze (coarse module-level dead-dependency detection) has not run. The consent field names the shape: run (mvn is on PATH - offer to run it against the project; dependency:analyze needs a compiling build, so this is a run-consent, heavier than a static scan) or install (mvn absent - offer to install Maven first). Use AskUserQuestion exactly as Step 2b, phrasing the trade-off (a build that resolves dependencies and may hit the network). On accept and a run consent, run mvn dependency:analyze, capture its output, and re-run the core with the served result so the candidates feed Layer 1. On decline, the capability stays honestly named, not silently dropped.
linting / modernization → state: "credited" - an already-configured pom.xml plugin serves it (served_by lists which: Checkstyle, SpotBugs, PMD, error-prone, OpenRewrite, Modernizer). Credit it in the report; do not re-offer.
state: "honest_degrade" - nothing serves it yet (module graph, linting/modernization without a configured plugin, and all capabilities under Gradle in v1). The block carries a candidate_tool and gloss. Name both in the report's Layer 1/Layer 3 prose ("module-graph analysis is unserved here; jdeps would provide it"). Honest-degrade is a deliverable - surfacing the candidate is the point.
Boundary (v1). Only Maven liveness is served. Module graph (jdeps), linting, and modernization honest-degrade; Gradle honest-degrades entirely. The candidate_tool values are deterministic defaults - you may propose a better-fitting ecosystem tool at runtime (the detect-or-propose latitude above); that choice is human-judged, not CI-tested. CI tests only signal consumption: given a tool's output, the scorecard feeds correctly.
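The three states above amount to a small dispatch. The sketch below shows one way to turn the block into next steps. It assumes, without confirmation from the source, that capability_offers is a JSON object keyed by capability name; the function name `plan_capability_actions`, the `module_graph` key and the sample values are hypothetical.

```python
import json


def plan_capability_actions(capability_offers: dict) -> dict:
    """Map each capability to the next step implied by its state."""
    plan = {}
    for name, offer in capability_offers.items():
        state = offer.get("state")
        if state == "offer":
            # consent is "run" (mvn on PATH) or "install" (mvn absent).
            plan[name] = f"ask user: {offer.get('consent', 'run')}"
        elif state == "credited":
            served = ", ".join(offer.get("served_by", []))
            plan[name] = f"credit in report: {served}"
        elif state == "honest_degrade":
            plan[name] = f"name as unserved; candidate: {offer.get('candidate_tool')}"
        else:
            plan[name] = "unknown state; report as-is"
    return plan


# Hypothetical sample shaped after the fields named in the text.
sample = json.loads("""
{
  "liveness": {"state": "offer", "consent": "run"},
  "linting": {"state": "credited", "served_by": ["Checkstyle", "PMD"]},
  "module_graph": {"state": "honest_degrade", "candidate_tool": "jdeps",
                   "gloss": "static module dependency graph"}
}
""")
for capability, step in plan_capability_actions(sample).items():
    print(f"{capability}: {step}")
```

An unknown state is reported rather than dropped, in keeping with the honest-degrade contract.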
Run the bundled treemap script alongside the deterministic core - see the chained block below.
The script prints a one-line summary (file count, lizard vs scc coverage, churn window chosen, top 5 biggest files). The stats sidecar contains percentiles (p50/p95/max for LOC, CCN, churn) and ranked lists of the top 10 files by hotspot score, raw CCN, and raw LOC. Both feed the report.
Dependencies: the script uses PEP 723 inline metadata (lizard, squarify, matplotlib, numpy). uv resolves them on first run.
Build artifacts and generated code are filtered by default. The script excludes two classes of files:
Build artifacts: main.dart.js, Flutter canvaskit/skwasm runtime bundles (canvaskit.js, skwasm*.js), *.min.js, *.bundle.js, *.chunk.js, *.map, sourcemaps, service workers, and files under node_modules/, dist/, build/, .next/, .nuxt/, .output/, coverage/, etc.
Generated code: protobuf/gRPC stubs (*.pb.go, *_grpc.pb.go, *.pb.gw.go, *.connect.go, *_pb.ts, *_pb.d.ts, *_pb2.py, *.pb.cc, *.pb.h), Go generators (*.gen.go, wire_gen.go, zz_generated_*.go, bindata.go), .NET source generators (*.designer.cs, *.g.cs), Dart/Flutter codegen (*.freezed.dart, *.g.dart, *.gr.dart).
Full list in complexity-treemap.py's EXCLUDE_DIRS and EXCLUDE_FILE_PATTERNS. If you specifically want to score these (e.g., to visualise how much of the repo is generated), pass --include-artifacts.
Dominance warning. If a single file still holds >30% of total scoreable LOC after filtering (the threshold compiled bundles typically cross), the script prints a warning to stderr identifying the file. When you see this, the right next step depends on why the file is large:
Build artifact (main.dart.js, a bundled JS file, etc.): surface in the report's "Hotspot snapshot" section as "<file> holds X% of LOC and is likely a build artifact - recommend adding to .gitignore and re-running." Add a Top 3 Action of the same shape.
Intentionally tracked reference data: use --exclude, not .gitignore. Recommend the user persist the rule in .assess/config.toml so subsequent runs apply it automatically (see "Custom excludes" below). Do not push toward .gitignore in this case.
Custom excludes for vetted-context / reference data. When the repo intentionally tracks large non-source files, two mechanisms extend the built-in defaults (the built-ins always apply; both layers are additive). The same excludes apply across every scan - the heatmap, the doc-navigability graph, the doc-staleness pass, and the liveness scan all honour the same list, so "this is reference data, not source" is a single statement, not a per-layer toggle:
CLI flag --exclude PATTERN (repeatable, ad-hoc). A plain string is matched as a directory name; a glob is matched against the basename. The flag exists on the treemap script for one-off runs:
<!-- chat-replace:treemap-exclude-example -->
uv run "$SKILL_DIR/scripts/complexity-treemap.py" "$REPO_ROOT" --exclude regulatory-raw --exclude vetted-context --exclude '*.csv'
Per-repo config .assess/config.toml (durable, version-controllable, applies to every scan via the orchestrator). Recommended for any exclude the user will want to apply every run:
exclude_dirs = ["regulatory-raw", "vetted-context", "seed-data"]
exclude_patterns = ["*.csv", "*.parquet"]
No section header is needed - the file is already namespaced by living under .assess/. Missing or malformed files degrade silently to no extra excludes; the assessment never blocks on a broken config.
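The matching rule stated above (a plain string names a directory, a glob is tested against the basename) can be sketched as follows. This is a minimal illustration, not the treemap script's code: `is_excluded` is a hypothetical name, and treating any pattern containing `*`, `?` or `[` as a glob is an assumption about how the two cases are told apart.

```python
from fnmatch import fnmatch
from pathlib import PurePosixPath

GLOB_CHARS = set("*?[")  # assumption: these characters mark a pattern as a glob


def is_excluded(rel_path: str, patterns: list) -> bool:
    """Apply exclude patterns to a repo-relative, forward-slash path."""
    path = PurePosixPath(rel_path)
    for pattern in patterns:
        if GLOB_CHARS & set(pattern):
            # Glob: compare against the file's basename only.
            if fnmatch(path.name, pattern):
                return True
        elif pattern in path.parts[:-1]:
            # Plain string: a directory name anywhere above the file.
            return True
    return False


patterns = ["regulatory-raw", "vetted-context", "*.csv"]
print(is_excluded("regulatory-raw/2024/rules.txt", patterns))  # True
print(is_excluded("src/export/data.csv", patterns))            # True
print(is_excluded("src/main.py", patterns))                    # False
```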
Provenance for generated docs (staleness measured against the source). A generated doc - a Jira note dump, an API reference, codegen output - is not stale because the file is old; it is stale when the source it was derived from has moved on. The mtime/age model gets this backwards: a freshly regenerated dump of 1,200 notes shares one recent mtime (looks fresh) even when its source changed afterwards, and an old-but-accurate generated doc reads as a lying map when it is not. Declare provenance and doc-staleness is computed as "is the source newer than the doc?" instead - a generated doc whose source is quiet is never flagged as a lying_map, regardless of how busy the surrounding code is. Two ways to declare it (frontmatter wins when both name a source for the same doc):
Frontmatter on the generated doc - a source: key (a string or a list), resolved relative to the repo root first, then to the doc's own directory. An optional generated_by: records the generator for humans (it does not affect staleness):
---
source: data/jira.tsv
generated_by: scripts/dump-jira-notes.py
---
Per-repo config .assess/config.toml [[generated]] array-of-tables, for bulk-generated trees that cannot each carry frontmatter. path is a folder relative to the repo root; every doc under it inherits the mapping. source is a string or list of strings relative to the repo root:
[[generated]]
path = "notes"
source = "data/jira.tsv"
Once provenance is declared, the staleness verdict is a direct, high-confidence source-vs-doc comparison (git commit time, falling back to mtime): a generated doc whose source is newer than the doc still surfaces as a lying_map when it sits over complex code, while a fresh one never does.
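The source-vs-doc comparison can be illustrated with a short sketch. It uses only the mtime fallback mentioned above (the git-commit-time path is omitted for brevity), and `generated_doc_is_stale` is a hypothetical name rather than a function from the toolkit.

```python
import os
import tempfile


def generated_doc_is_stale(doc_path: str, source_paths: list) -> bool:
    """A generated doc is stale only if a declared source is newer than it."""
    doc_time = os.path.getmtime(doc_path)
    return any(os.path.getmtime(src) > doc_time for src in source_paths)


with tempfile.TemporaryDirectory() as tmp:
    doc = os.path.join(tmp, "notes.md")
    src = os.path.join(tmp, "jira.tsv")
    for path in (doc, src):
        with open(path, "w") as fh:
            fh.write("x")
    os.utime(src, (1_000, 1_000))  # source last changed at t=1000
    os.utime(doc, (2_000, 2_000))  # doc regenerated later, at t=2000
    print(generated_doc_is_stale(doc, [src]))  # False
    os.utime(src, (3_000, 3_000))  # source moves on after the dump
    print(generated_doc_is_stale(doc, [src]))  # True
```

The age of the doc never enters the comparison; only the ordering of doc and source matters.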
The script's own output directory .assess/ is excluded automatically - prior runs' run-context.json and SVGs never feed the next run's heatmap, the doc graph, or the dead-code scan. Test fixtures under **/tests/fixtures/** are likewise excluded automatically - they are inputs that exercise the scanners (sample CLAUDE.md / monolithic-instruction files), not navigational docs or live code, so counting them would inflate the orphan rate and depress the Layer 0 navigability read.
If the script fails (no uv, no scoreable files, etc.), record the error in the report under "Hotspot snapshot" as "could not be generated - " followed by the error text, and continue with the layered assessment. The treemap is additive; assessment still runs without it.
Run the full sequence - rotate the prior sidecar first, then the treemap, then the deterministic core:
# Rotate the prior stats sidecar so the diff has something to compare against next run
if [ -f "$REPO_ROOT/.assess/complexity-stats.json" ]; then
  cp "$REPO_ROOT/.assess/complexity-stats.json" "$REPO_ROOT/.assess/complexity-stats.prior.json" 2>/dev/null || true
fi
<!-- chat-skip:start -->
# Resolve this skill's own directory so we can run its bundled scripts. A
# plugin install exposes $CLAUDE_PLUGIN_ROOT (the plugin root in the version
# cache, e.g. ~/.claude/plugins/cache/<mp>/<plugin>/<ver>/); fall back to a
# hand-placed ~/.claude/skills/assess/ copy when it isn't set. CLAUDE_PLUGIN_ROOT
# is an environment variable, so it stays valid across later steps' shells too.
SKILL_DIR="${CLAUDE_PLUGIN_ROOT:+$CLAUDE_PLUGIN_ROOT/skills/assess}"
SKILL_DIR="${SKILL_DIR:-$(dirname "$(realpath ~/.claude/skills/assess/SKILL.md)")}"
<!-- chat-skip:end -->
# Run the complexity treemap (produces fresh complexity-stats.json)
# (single line: the standalone transform replaces the marker + one following line)
<!-- chat-replace:uv-treemap -->
uv run "$SKILL_DIR/scripts/complexity-treemap.py" "$REPO_ROOT" -o "$REPO_ROOT/.assess/complexity-heatmap.svg" --stats "$REPO_ROOT/.assess/complexity-stats.json"
# Run the doc navigability graph (connectivity + staleness in one SVG; feeds Layer 0)
<!-- chat-replace:uv-doc-graph -->
uv run "$SKILL_DIR/scripts/doc-graph-svg.py" "$REPO_ROOT" -o "$REPO_ROOT/.assess/doc-graph.svg"
# Run the deterministic core (instruction grading, doc link-graph, doc staleness,
# liveness/dead-code, observability rungs, stats diff, wiki files, run-context.json)
<!-- chat-replace:uv-core -->
uv run "$SKILL_DIR/scripts/assess_core.py" "$REPO_ROOT"
Either SVG is additive: if a script fails (no uv, no scoreable files, no docs), record "could not be generated - " plus the reason in the report and continue. The doc graph shares its data with the deterministic core's doc_graph / doc_staleness blocks, so even when the SVG can't render, Layer 0 still scores from run-context.json.
Now $REPO_ROOT/.assess/run-context.json contains the structured data you need for the prose sections. Read it before writing the report.
Survivor-density overlay (opt-in mutation runs only). The treemap above runs before run-context.json exists, so its first pass has no mutation data. If an opt-in mutation pass populated test_pressure.per_file, regenerate the heatmap with the overlay so covered-but-unpinned files get hatched and stop reading as safe green. With no mutation data the --test-pressure flag is a silent no-op, so this is harmless to skip when mutation wasn't run:
<!-- chat-replace:uv-treemap-overlay -->
uv run "$SKILL_DIR/scripts/complexity-treemap.py" "$REPO_ROOT" -o "$REPO_ROOT/.assess/complexity-heatmap.svg" --stats "$REPO_ROOT/.assess/complexity-stats.json" --test-pressure "$REPO_ROOT/.assess/run-context.json"
The plugin_version field in run-context.json tells you which plugin version produced this run. Surface it at the top of the report (e.g., "Generated by /assess v1.8.0") so readers can spot it if a stale cached version of the plugin produced unexpected output.
The deterministic core has written the data bus (.assess/run-context.json). Assigning each layer Present / Partial / Missing is judgement-heavy work that benefits from a fresh context window applying the layer methodology - so it runs as a dedicated unit, not inline here.
Spawn the assess-layer-scorer agent (subagent_type: "assess-layer-scorer"), passing REPO_ROOT. It reads .assess/run-context.json, scores every layer, and returns the 0-8 score, the per-layer verdicts with evidence, and the maturity label. Hold that scorecard for Step 4.
Before the report is written, check what changed since the last run (the findings-writer renders this into the report's diff section):
jq '.diff, .diff_detail' "$REPO_ROOT/.assess/run-context.json"
If prior was None (first run), skip this section in the report.
Check diff_reliable first. When run-context.json has diff_reliable: false, the prior snapshot came from a different (or unstamped) plugin version (diff_version_note explains it) - file-filter differences across versions surface phantom "graduated"/"new" transitions that didn't really happen. Suppress the "What Changed Since Last Run" section in that case and instead note one line: "Diff suppressed - prior snapshot predates version stamping or used a different file filter; comparison resumes once two runs share a plugin version." Otherwise, populate the section:
diff_detail.graduated - hotspots that left the top list
diff_detail.regressed with their ccn_delta / commits_delta
diff_detail.new
diff_detail.persistent
The wiki files at .assess/index.md and .assess/hotspots/*.md are already updated by assess_core.py - you don't need to write them. You only write the prose summary in assess-report.md.
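The gating rules for the diff section (skip on a first run, suppress when the diff is unreliable, otherwise populate) can be sketched as below. The shape of run-context.json is assumed here: `diff_detail` is taken to be absent or empty on a first run and each transition key is taken to hold a list. `render_diff_section` is a hypothetical helper, not part of the toolkit.

```python
from typing import Optional

SUPPRESSED_NOTE = (
    "Diff suppressed - prior snapshot predates version stamping or used a "
    "different file filter; comparison resumes once two runs share a plugin version."
)


def render_diff_section(run_context: dict) -> Optional[str]:
    """Return the diff section text, the suppression note, or None to omit it."""
    detail = run_context.get("diff_detail")
    if not detail:
        return None  # assumed first run: no prior snapshot, omit the section
    if run_context.get("diff_reliable") is False:
        return SUPPRESSED_NOTE  # phantom transitions would mislead the reader
    lines = []
    for key in ("graduated", "regressed", "new", "persistent"):
        lines.append(f"{key}: {len(detail.get(key, []))}")
    return "\n".join(lines)


print(render_diff_section({"diff_reliable": True,
                           "diff_detail": {"graduated": ["a.go"], "new": ["b.go", "c.go"]}}))
```

The reliability check runs before any transition is read, so a cross-version comparison never reaches the report.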
Assembling .assess/assess-report.md - the scorecard, the snapshots, the verbatim cross-layer findings, the lying signals, and the mandatory Top 3 Actions - is a reusable, mostly-deterministic procedure. It runs as a sub-skill.
Use the assess-findings skill, handing it the scorecard the layer-scorer returned. It assembles .assess/assess-report.md from the data bus plus the scorecard: the verbatim findings section, the lying signals, and the Top 3 Actions (the attention list is mandatory). Then continue to Step 7.5.
After writing assess-report.md, write finalize-input.json to the transient cache and invoke assess_finalize.py so the wiki files reflect the score and actions you chose.
The input file lives under .assess/.cache/ rather than directly in .assess/ because it is a one-off LLM-authored input consumed immediately - it has no future utility and only creates noisy diffs if committed. assess_finalize.py reads the cache path first (and falls back to the legacy in-tree location if a prior run wrote one there), then deletes it on success so it cannot leak into a commit either way.
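The read-then-delete behaviour described above might look like the following sketch. It is not the code of assess_finalize.py: `load_finalize_input` is a hypothetical name, the legacy location is assumed to be `.assess/finalize-input.json`, and deletion happens right after a successful parse rather than after the full finalize step.

```python
import json
import tempfile
from pathlib import Path
from typing import Optional


def load_finalize_input(repo_root: Path) -> Optional[dict]:
    """Read the one-off input from the cache (or legacy path), then remove it."""
    assess = repo_root / ".assess"
    candidates = [
        assess / ".cache" / "finalize-input.json",  # preferred transient location
        assess / "finalize-input.json",             # ASSUMED legacy in-tree location
    ]
    for path in candidates:
        if path.is_file():
            data = json.loads(path.read_text(encoding="utf-8"))
            path.unlink()  # consumed: remove so it cannot leak into a commit
            return data
    return None


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    cache = root / ".assess" / ".cache"
    cache.mkdir(parents=True)
    (cache / "finalize-input.json").write_text('{"score": 6.0}', encoding="utf-8")
    print(load_finalize_input(root))  # {'score': 6.0}
    print(load_finalize_input(root))  # None
```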
mkdir -p "$REPO_ROOT/.assess/.cache"
cat > "$REPO_ROOT/.assess/.cache/finalize-input.json" <<'EOF'
{
  "score": 6.0,
  "maturity_label": "Solid",
  "top_action": "Add cyclop rule (threshold 15) to .golangci.yml",
  "hotspot_actions": {
    "src/foo.go": [
      "Split parseLine into smaller functions",
      "Add a test file at src/foo_test.go"
    ]
  },
  "actions": [
    {
      "rank": 1,
      "action": "Add cyclop rule (threshold 15) to .golangci.yml",
      "layer": 3,
      "effort": "small",
      "files": [".golangci.yml"],
      "first_step": "Add 'cyclop' with max-complexity: 15 under linters",
      "done_when": "golangci-lint run passes with the rule active; no new suppressions added",
      "scope_fence": "Only .golangci.yml; do not edit source files to chase pre-existing violations"
    }
  ]
}
EOF
<!-- chat-skip:start -->
# Re-resolve the skill dir in case this runs in a fresh shell (Step 2's shell
# var won't have survived; the env var $CLAUDE_PLUGIN_ROOT will).
SKILL_DIR="${CLAUDE_PLUGIN_ROOT:+$CLAUDE_PLUGIN_ROOT/skills/assess}"
SKILL_DIR="${SKILL_DIR:-$(dirname "$(realpath ~/.claude/skills/assess/SKILL.md)")}"
<!-- chat-skip:end -->
<!-- chat-replace:uv-finalize -->
uv run "$SKILL_DIR/scripts/assess_finalize.py" "$REPO_ROOT"
This replaces:
log.md's last entry placeholder **AI Readiness:** 0.0 / 8 ((LLM fills in)) with your actual score and maturity label.
log.md's last entry placeholder **Top action:** Deterministic ranker not yet wired ... with your actual Top 1 action.
hotspots/<slug>.md's Suggested actions section with the actions you derived for that file.
assess_finalize.py also refreshes .assess/badge.json (shields.io endpoint schema) from your score and maturity label - the live README badge. When offering the PR (assess-pr), include the embed snippet if the repo's README has no badge yet:

The actions array mirrors the report's Top 3 Actions table one-to-one and must carry every table row - rank, action, done_when, and scope_fence are required per entry (layer, effort, files, first_step recommended). assess_finalize.py writes it to .assess/actions.json, the durable machine-readable contract: unlike this input file (consumed and deleted), actions.json persists so an executing agent - including a smaller, cheaper model - can pick up the work with its exit criteria and fences intact, without parsing the report's markdown.
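Before writing finalize-input.json, a quick pre-flight check of the required fields can catch an incomplete entry early. The sketch below is an illustrative helper with a hypothetical name (`missing_action_fields`); it only encodes the four required keys named above and does not reproduce whatever validation assess_finalize.py performs.

```python
REQUIRED_FIELDS = ("rank", "action", "done_when", "scope_fence")


def missing_action_fields(actions: list) -> dict:
    """Return {rank or index label: [missing required fields]} for bad entries."""
    problems = {}
    for index, entry in enumerate(actions):
        missing = [field for field in REQUIRED_FIELDS if field not in entry]
        if missing:
            # Fall back to the list position when the rank itself is missing.
            problems[entry.get("rank", f"index {index}")] = missing
    return problems


actions = [
    {"rank": 1, "action": "Add cyclop rule", "done_when": "lint passes",
     "scope_fence": "Only .golangci.yml"},
    {"rank": 2, "action": "Split parseLine"},
]
print(missing_action_fields(actions))  # {2: ['done_when', 'scope_fence']}
```

An empty result means every row carries its exit criteria and fence, which is what lets a later executing agent work from actions.json alone.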
Without this step, the log.md placeholders above carry forward forever. Hotspot pages you don't supply actions for keep a neutral pointer (This file is flagged but outside this run's Top 3. See the report's Top 3 Actions, or run a focused /assess pass for file-specific guidance.) rather than an unfinished-work placeholder - a flagged-but-not-Top-3 page reads as intentional.
The hotspot_actions dict should include at minimum the files mentioned in your Top 3 Actions. You can include more if you have specific suggestions for them; any file you omit keeps the neutral pointer.
With the report written and the wiki finalized, run the end-of-run offers - open a PR, track the Top 3 Actions, freeze the assessment into a CI gate - and the tool-feedback prompt. This is a reusable procedure (akin to pr-review-merge), so it runs as a sub-skill.
Use the assess-pr skill. It runs the three offers (PR, issue tracking, freeze-into-CI) in order, then the tool-feedback prompt, reading the written .assess/assess-report.md artifact - notably mutating the Top 3 Actions table's Issue column in place when the user creates tracking items.
npx claudepluginhub bjcoombs/ai-native-toolkit --plugin ai-native-toolkit