Codebase X-Ray

Analyze famous open-source codebases and generate technical biography books that teach software engineering through real code.

Overview

Codebase X-Ray reads an entire codebase — DOOM, Redis, Bitcoin, libuv, PostgreSQL, or any project — and produces a book-length analysis in technical biography format. The output teaches software engineering patterns and principles through real code, not textbook abstractions.

What It Produces

14+ chapter book in the user's language (auto-detected)
Each chapter: 3,000-10,000 words with annotated code, diagrams, and anecdotes
📌 Explanation boxes for terminology, analogies, and modern tech comparisons
📐 Design pattern boxes mapping code to GoF patterns, SOLID principles, architecture patterns
Mermaid diagrams (2+ per chapter): class, sequence, state, dependency graphs
Provenance tags on every factual claim for verifiability

Key Design Decisions

Analysis and writing are completely separated. Analysis extracts facts; writing selects and narrates them. The artifact store is permanent and reusable.
No hallucination. Two layers guard against it: self-verify (Layer 1) and structural audit (Layer 2). Every code_derived claim must cite a file path and line number.
Language-agnostic. Works on C, Python, Rust, Go, Java, TypeScript — any language. Patterns are universal.
Output language follows user. Request in Korean → Korean book. Request in English → English book.

Generated Structure

User provides a project name (e.g., redis), then a root path (e.g., ~/research/redis). Everything lives under that root:

{project_root}/
├── manifest.md              # Progress tracking + project info
├── source/                  # Cloned repository
├── references/              # Downloaded materials (papers, slides, images, etc.)
│   └── sources.md           # URL list + downloaded file mapping
├── checkpoints/
│   ├── phase0a.md           # Web research (author statements, anecdotes)
│   ├── phase0b.md           # Semantic anchors (worldview, discrepancies)
│   ├── phase1/{dir}.md      # Directory surveys
│   ├── phase2.md            # Module boundary synthesis
│   ├── phase4.md            # Chapter design (epistemological order)
│   ├── verify_phase3.md     # Structural audit results
│   └── verify_phase6.md     # Final audit results
├── artifacts/modules/
│   └── {module}.md          # Deep dive artifacts (reusable)
└── output/
    ├── 00-prologue.md       # Individual chapter files
    ├── 01-{chapter}.md
    ├── ...
    ├── build.sh             # Assembly script (cat + pandoc PDF)
    └── book.md              # Final assembled book
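
The skill creates this layout itself; as a rough illustration only, the directory skeleton can be sketched in Python as follows. The function name `create_skeleton` and the `SKELETON_DIRS` constant are hypothetical and not part of the plugin; only the directory names come from the tree above.

```python
from pathlib import Path

# Directory names are taken from the layout above.
# SKELETON_DIRS and create_skeleton are hypothetical names for this sketch.
SKELETON_DIRS = [
    "source",
    "references",
    "checkpoints/phase1",
    "artifacts/modules",
    "output",
]


def create_skeleton(project_root: Path) -> list:
    """Create the directory tree and an empty manifest.md if none exists."""
    created = []
    for rel in SKELETON_DIRS:
        path = project_root / rel
        # parents=True also creates project_root and checkpoints/ on the way.
        path.mkdir(parents=True, exist_ok=True)
        created.append(path)
    manifest = project_root / "manifest.md"
    if not manifest.exists():
        # The real manifest format is not specified here, so leave it empty.
        manifest.write_text("", encoding="utf-8")
    return created


if __name__ == "__main__":
    for path in create_skeleton(Path("demo-project")):
        print(path)
```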

Skills

| Skill | Command | Description |
| --- | --- | --- |
| analyze | `codebase-x-ray:analyze` | Full pipeline: research → analysis → writing |

Pipeline

ANALYSIS PIPELINE
  Phase 0a → Web Research (author statements, Tier 1 sources)
  Phase 0b → Semantic Anchor Extraction (worldview, discrepancies)
  Phase 1  → Directory Survey (parallel agents)
  Phase 2  → Boundary Synthesis (semantic modules)
  Phase 3  → Module Deep Dive (parallel agents) + Self-Verify
  Verifier → Structural Audit
        ↓
  ARTIFACT STORE (permanent, reusable)
        ↓
WRITING PIPELINE
  Phase 4  → Chapter Design (epistemological order)
  Phase 5  → Per-Chapter Writing (parallel agents)
  Phase 6  → Cross-chapter Review (edit in place)
  Verifier → Final Audit
        ↓
  build.sh → book.md + book.pdf
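
The final assembly step is described only as "cat + pandoc". A minimal Python sketch of the concatenation half, assuming chapter files follow the `NN-name.md` naming shown under Generated Structure, might look like this. The function name `assemble_book` is hypothetical, and the real build.sh may order or join files differently.

```python
from pathlib import Path


def assemble_book(output_dir: Path) -> Path:
    """Concatenate numbered chapter files into book.md, in filename order.

    Hypothetical helper: the plugin's own build.sh is a shell script.
    """
    # Two-digit prefixes (00-, 01-, ...) sort correctly as plain strings.
    # book.md itself does not match the pattern, so it is never re-included.
    chapters = sorted(output_dir.glob("[0-9][0-9]-*.md"))
    parts = [p.read_text(encoding="utf-8").rstrip("\n") for p in chapters]
    book = output_dir / "book.md"
    book.write_text("\n\n".join(parts) + "\n", encoding="utf-8")
    return book


# The PDF step would then be a pandoc call such as:
#   pandoc book.md -o book.pdf
# (this needs pandoc and a PDF engine installed, so it is not run here).
```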

Crash Recovery

manifest.md tracks all progress. If a session dies mid-pipeline:

Start a new session
Point to the same root path (output_dir)
The pipeline resumes after the last completed phase
Parallel phases (1, 3, 5) resume only missing items
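
The resume rule can be sketched as below. This is an illustration under stated assumptions, not the plugin's code: the format of manifest.md is not described here, so completed phases are passed in as a plain set, and the names `PHASES`, `next_phase`, and `missing_items` are hypothetical. The phase order follows the Pipeline section.

```python
from typing import Iterable, List, Optional, Set

# Phase order as listed in the Pipeline section; the string labels are
# an assumption made for this sketch.
PHASES = ["0a", "0b", "1", "2", "3", "verify_phase3",
          "4", "5", "6", "verify_phase6"]


def next_phase(completed: Set[str]) -> Optional[str]:
    """Return the first phase not yet marked complete, or None if all are done."""
    for phase in PHASES:
        if phase not in completed:
            return phase
    return None


def missing_items(expected: Iterable[str], done: Set[str]) -> List[str]:
    """For a parallel phase, return only the items that still need an agent."""
    return [item for item in expected if item not in done]


if __name__ == "__main__":
    print(next_phase({"0a", "0b", "1"}))                  # 2
    print(missing_items(["src", "deps", "tests"], {"src"}))  # ['deps', 'tests']
```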

Writing-Only Re-run

To regenerate chapters without re-analyzing:

Delete checkpoints/phase4.md, checkpoints/verify_phase6.md, and output/ files
Remove Phase 4+ entries from manifest.md
Re-run the skill; the pipeline resumes from Phase 4 using the existing artifacts
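
The file deletions in the first step can be sketched as follows. The helper name `reset_writing_outputs` is hypothetical. Editing manifest.md is left as a manual step because its format is not specified here. Note that this sketch removes every regular file in output/, which includes build.sh and book.md.

```python
from pathlib import Path


def reset_writing_outputs(project_root: Path) -> list:
    """Delete writing-phase checkpoints and output files.

    Analysis checkpoints and artifacts/ are left untouched so that the
    next run can reuse them. Hypothetical helper, not part of the plugin.
    """
    targets = [
        project_root / "checkpoints" / "phase4.md",
        project_root / "checkpoints" / "verify_phase6.md",
    ]
    output_dir = project_root / "output"
    if output_dir.is_dir():
        targets.extend(p for p in output_dir.iterdir() if p.is_file())

    removed = []
    for path in targets:
        if path.exists():
            path.unlink()
            removed.append(path)
    return removed
```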

Usage

# Just give a project name — the skill asks for a root path, then clones automatically
"x-ray redis"
"DOOM x-ray해줘"    # Korean: "x-ray DOOM for me"

# Provide root path upfront
"x-ray redis at ~/research/redis"

# Resume interrupted session
"continue ~/research/redis"
"~/research/redis 이어서 해줘"    # Korean: "continue ~/research/redis"

Orchestration Model

The main Claude Code session is the orchestrator. It reads SKILL.md and dispatches sub-agents for each phase. Sub-agents write results directly to files and return only summaries. This keeps the main session's context clean.

For parallel phases (1, 3, 5), multiple agents run simultaneously — one per directory, module, or chapter.

Verification

Layer 1: Self-Verify

Before finishing, each Phase 3 agent re-reads the files it cited. Claims it cannot verify are removed or re-tagged as [provenance:synthesized].
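
A much-reduced sketch of the citation check: it confirms only that a cited file exists and that the cited line number is in range, whereas the agents described above re-read the content itself. The `Claim` structure, the function name `self_verify`, and the `[provenance:code_derived]` tag spelling are assumptions for this sketch; only `[provenance:synthesized]` appears in the text above. The sketch re-tags failing claims and does not model removal.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import List

SYNTHESIZED = "[provenance:synthesized]"
CODE_DERIVED = "[provenance:code_derived]"  # assumed spelling of the tag


@dataclass
class Claim:
    """Hypothetical record for one factual claim in an artifact."""
    text: str
    file: str          # path relative to the cloned source tree
    line: int          # 1-indexed line number of the cited code
    tag: str = CODE_DERIVED


def self_verify(claims: List[Claim], source_root: Path) -> List[Claim]:
    """Re-tag any claim whose citation does not resolve to a real line."""
    for claim in claims:
        path = source_root / claim.file
        resolves = False
        if path.is_file():
            text = path.read_text(encoding="utf-8", errors="replace")
            resolves = 1 <= claim.line <= len(text.splitlines())
        if not resolves:
            claim.tag = SYNTHESIZED
    return claims
```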
