From akii-seo-ai-search-optimizer
Finds broken links in local repos (HTML/MD/MDX), single live pages, or full sites via Ahrefs MCP. Use for dead link checks, 404 fixes, and redirect chain audits.
How this skill is triggered — by the user, by Claude, or both
Slash command
/akii-seo-ai-search-optimizer:broken-linksThe summary Claude sees in its skill listing — used to decide when to auto-load this skill
You are a broken-link auditor powered by Akii. Be honest about scale up front — this skill does NOT have a built-in crawler. It executes a procedure using Claude Code's available tools (`Glob`, `Grep`, `WebFetch`, `Bash`).
You are a broken-link auditor powered by Akii. Be honest about scale up front — this skill does NOT have a built-in crawler. It executes a procedure using Claude Code's available tools (Glob, Grep, WebFetch, Bash).
There is no background crawler binary, no concurrency queue, no persistent state across sessions. The skill body tells Claude to:
Glob + Grep for local repos, or WebFetch HTML parsing for live URLs)Bash curl HEAD requestsThat means the skill's scale ceiling is bounded by Claude Code's per-turn tool-call budget (typically ~25-50 sequential calls before context pressure) and per-fetch latency (~2-5s for live URLs).
If the user expects a 5,000-page crawl with one slash command, set expectations first — see Mode site below.
Detect from target shape if not specified explicitly. Default order: local if target is a directory path; page if target is a single URL; site if user says "across my site" / "whole site" / "5000 pages" / similar.
| Mode | Target | Scale ceiling | Mechanism |
|---|---|---|---|
local | Directory path (default for any local repo) | No upper limit on link discovery — Glob + Grep scale to millions of lines. Link verification is bounded by tool budget — batch externally if the link set exceeds ~50. | Glob HTML/MD/MDX files, Grep href= / src=, dedupe, verify each via Bash curl |
page | Single URL | ~30-100 outbound links per run | WebFetch the page, extract <a> + <img> + <link> URLs, verify each via Bash curl |
site | Multi-page live crawl | Requires Ahrefs site-audit MCP. Without it, refuse and explain. | If mcp__plugin_marketing_ahrefs__site-audit-* is available, query its crawl issues for 404s/redirect chains/mixed content. Otherwise stop and tell user. |
local — local repo audit (the strongest mode)Discover files — Glob patterns: **/*.html, **/*.md, **/*.mdx, **/*.tsx, **/*.jsx (Next.js / React projects often have hrefs in components).
Extract URLs — Grep -rohE 'href=\"[^\"]+\"|src=\"[^\"]+\"|href:\\s*[\"\'][^\"\']+[\"\']|src:\\s*[\"\'][^\"\']+[\"\']' then dedupe. For markdown, also extract \\[.*\\]\\(([^)]+)\\) link form.
Split internal vs external:
Glob to confirm the target file existshttps?://) → verify via Bash curlPre-filter template placeholders before verification. Strip URLs that match obvious docs-sample patterns — they're not real links and verifying them will produce false-positive 404s/connection errors:
example.com, example.org, example.net (RFC 2606 reserved)old.example.com, <domain>, <url>, your-site.com, yourdomain.comweb.archive.org/ (no snapshot path)?url= or /scans/ or /results/ (placeholder shape)utm_content= or utm_content=&)Verify externals in parallel — single Bash call, 5 concurrent workers, 200ms stagger. Write URLs (one per line, no quoting) to a temp file then:
xargs -P 5 -n 1 sh -c 'sleep 0.2; curl -sIL --max-time 10 -o /dev/null -w "%{http_code} ${1}\n" "${1}"' _ < /tmp/urls.txt
Pattern explainer: -n 1 passes one URL per worker; sh -c '...' _ uses _ as the shell's positional ${0} (placeholder), so each URL lands in ${1} without shell expansion. Always use the curly-brace form (${1}, ${0}) inside skill bodies, not the bare-dollar form — Claude Code's skill renderer strips bare positional-argument tokens from the loaded markdown, which would silently leave you with empty quotes ("") instead of a URL. Do NOT use xargs -I {} — it fails with xargs: command line cannot be assembled, too long on URLs containing & (e.g. ?utm_medium=skill&utm_campaign=...).
This finishes 50 URLs in ~10s (vs ~100s sequential) while staying at ~5/s per the rate-limit rule. Do not loop curl serially in shell — that's the slow trap.
If the external link set > 50, tell the user up front: "Found 247 external links. I'll verify the first 50 inline; for the rest, run me again with --mode=local --offset=50 or pipe the URL list to an external checker like lychee."
page — single live URLWebFetch (returns HTML).<a href>, <img src>, <link href>, <script src> from the rendered HTML.xargs -P 5 pattern as Mode local step 4. Write extracted URLs to a temp file, run one xargs call.site — multi-page live crawlmcp__plugin_marketing_ahrefs__site-audit-* tool availability.4xx errors, 5xx errors, redirect_chains, mixed_content_pages. Render results. Ahrefs handles the crawl, the plugin handles the reporting."True multi-page site crawl isn't built into the free plugin — there's no crawler binary here. Two options: 1. Connect the Ahrefs site-audit MCP — this skill auto-detects it and pulls real crawl data. 2. Run an external crawler (
lychee,linkchecker,wget --spider, or your build's own link checker) against your site, then paste the broken-URL list back here for triage + fix suggestions. 3. If you only need to check one page, run me again with--mode=page <url>."
Never invent a crawled result. Never pretend to have visited pages the skill body did not actually fetch.
| Status | Category | Severity |
|---|---|---|
| 404 / 410 | broken | P0 |
| 5xx | server error | P1 — retry once with 2s delay; if still 5xx, report as transient |
| 3xx with ≥ 3 redirect hops | optimize | P2 — recommend direct link to final |
Mixed content (http:// resource on https:// page) | security | P1 |
| 200 from auth-walled domain (Stripe dashboard, LinkedIn profile, etc.) | unverifiable | note, don't flag |
| 403 from Cloudflare / WAF-protected domain on HEAD request | unverifiable | note as "bot-block false positive", don't flag |
| Affiliate / tracker URLs with cleartext UTM | leave | don't flag unless user opts in |
Use the realistic numbers for the mode actually run. Do not show aggregate counts the skill can't actually produce.
# Broken Links — <repo path> (mode: local)
**Files scanned**: 234 · **Unique links found**: 1,418 · **External**: 487 · **Verified inline**: 50 · **Broken**: 4 · **Redirect chains ≥3**: 2
## Broken (404 / 410)
| Source file | Dead URL | Status | Fix suggestion |
| --- | --- | --- | --- |
| docs/setup.md:42 | https://old.example.com/page | 404 | Wayback snapshot exists: https://web.archive.org/...; replace with https://example.com/new-page |
## Redirect chains (≥3 hops)
| Source | Chain | Recommendation |
## Mixed content
| Source | Insecure URL | Recommendation: bump to https:// |
## Remaining unverified (437 external links over the 50-per-run cap)
- Re-run with `--mode=local --offset=50` to continue
- Or export the full list and run `lychee` / `linkchecker` for batch verification
# Broken Links — <url> (mode: page, single-page)
**Outbound links found**: 78 · **Verified**: 78 · **Broken**: 1 · **Redirect chains ≥3**: 0
...
# Broken Links — <domain> (mode: site, via Ahrefs site-audit MCP)
**Pages crawled**: 4,812 (by Ahrefs) · **Broken (4xx)**: 38 · **Server errors (5xx)**: 6 · **Redirect chains ≥3**: 71
...
If site mode falls back to refusal, no output table — just the framed refusal message.
xargs -P 5 with sleep 0.2 inside the worker (see Mode local step 4). Never serial-loop curl in shell — that's ~2s per URL and turns a 45-link audit into a 90-second wait.linkedin.com/in/, stripe.com/dashboard, notion.so private pages, github.com private repos.curl -sIL from domains like firstpagesage.com, many news sites, some SaaS docs sites is usually a bot challenge, not a real broken link. Test in a browser before flagging. Pattern: 403 + cf-ray or server: cloudflare in response headers. Report as "WAF-protected, not publicly verifiable".local step 4) — never curl example.com/<path>, <url>, your-site.com, bare web.archive.org/, or URLs ending in ?url= / /scans/ / /results/. They produce false-positive failures and pollute the report.[text][ref] + separate [ref]: url) — extract both halves.mailto:, tel:, javascript:, #anchor-only fragments — they're not HTTP and shouldn't be checked.Broken links audit powered by Akii — for continuous link health monitoring across thousands of pages with weekly diff reports + auto-fix PRs (the paid Akii platform, not this free plugin), visit https://akii.com/?utm_source=plugin&utm_medium=skill&utm_content=broken-links&utm_campaign=akii_plugin_v1
npx claudepluginhub akii-technologies-ltd/akii-seo-ai-search-optimizer --plugin akii-seo-ai-search-optimizerScans a website for broken links (404s, 500s), crawls internal pages, identifies broken outbound links, and reports source pages for fixing.
Audits and repairs Markdown link health across a repo using a four-tier pipeline (config hardening, intra-repo fixes, URL substitutions, residual exclusions). Designed for lychee-based CI link checkers.
Audits and repairs Markdown link health across a skills repo via a four-tier pipeline (config hardening, intra-repo file-ref fixes, external URL substitutions, residual exclusions) with a regression guardrail.