repo-indexer
Index any codebase for persistent Claude context — minimal token overhead between sessions (~500 tokens for CLAUDE.md boot).

The Problem
Every new Claude session starts blank. You re-explain your architecture, your conventions, your stack — burning hundreds of tokens just to get Claude up to speed. For large codebases, this context tax is constant and expensive.
The Solution
repo-indexer runs a structured 6-phase analysis of your codebase and writes the results into a tiered memory system that scales across sessions with near-zero overhead:
L0: Claude Native Memory → repo roster, patterns (~100 tokens, always present)
L1: CLAUDE.md → boot loader only (<500 tokens, auto-loaded every session)
L2: .claude/memory/*.md → deep context files (loaded on-demand, only when needed)
L3: Conversation History → full analysis output (searchable, costs 0 tokens until used)
Claude loads L0 + L1 automatically. L2 and L3 are retrieved only when the task demands it. Files are pointers, not stores.
Quick Start
# Install the plugin
/plugin marketplace add jyshnkr/repo-indexer
# Install the skill
/plugin install repo-indexer
Then in any project directory:
index this repo
What It Does
Phase 1: Git Sync
Pulls latest from release > main > master to ensure analysis is current.
Phase 2: Detect Repo Type
Automatically classifies the codebase:
- Monorepo —
pnpm-workspace.yaml, turbo.json, packages/, apps/
- Microservices — multiple Dockerfiles,
docker-compose with 3+ services
- Single App — default when no strong signals are present
- Library —
pyproject.toml, Cargo.toml, setup.py, go.mod, or src/-only layout (no apps/)
Heuristic note: a single weak signal may still default to Single App.
Phase 3: Index
Analyzes 9 areas systematically:
- Config files (package.json, pyproject.toml, Cargo.toml, go.mod)
- Entry points (main, CLI, server bootstrap)
- Directory structure (depth 3)
- Core modules (business logic, services, models)
- API surface (routes, endpoints, schemas)
- Data layer (models, migrations, ORM)
- External dependencies (third-party integrations)
- Build/deploy (Dockerfile, CI/CD, Makefile)
- Tests (structure, fixtures, patterns)
Phase 4: Generate Output
- Full analysis written to conversation (L3) with
### SEARCH KEYWORDS for retrieval
- Minimal
.claude/ file tree created at repo root (L2)
CLAUDE.md created as a <500 token boot loader (L1)
Phase 5: Validate Token Budgets
python3 skills/repo-indexer/scripts/estimate-tokens.py
Validates budgets using a heuristic token estimate (CLAUDE.md must be under 500 tokens).
Phase 6: Suggest Native Memory Update
python3 skills/repo-indexer/scripts/generate-memory-update.py
Suggests 2–3 lines to add to Claude's native memory so the next session starts with repo awareness — no CLAUDE.md load required.
Token Budget
| Layer | Budget | When Loaded |
|---|
| L0: Native Memory | ~100–300 tokens | Always (free) |
| L1: CLAUDE.md | < 500 tokens | Every session start |
| L2: memory/*.md | < 10,000 tokens total | On-demand only |
| L3: Conversation History | 0 tokens | When searched |
Total auto-loaded per session: < 800 tokens. Everything else costs nothing until you need it.
Token counts are estimated via a bytes-per-token heuristic; treat these as guardrails, not exact model counts.
Use Cases
"Index this repo"
→ Full 6-phase workflow. Claude knows your project before you ask your first question.
"Set up Claude context for this project"
→ Same workflow. Optimized for team onboarding — every developer gets instant Claude context.
"Help me understand this codebase"
→ Checks existing Claude memory and past conversations first. If prior indexing found, uses it. If .claude/ exists, compares with current codebase, flags inconsistencies, updates incrementally.
Output Structure
After indexing, your repo gets:
your-project/
├── CLAUDE.md # <500 token boot loader (L1)
└── .claude/
├── memory/
│ ├── architecture.md # System design, diagrams, key flows (L2)
│ ├── conventions.md # Naming, patterns, git workflow (L2)
│ └── glossary.md # Domain terms, acronyms (L2)
├── plans/ # Empty — user-managed
└── checkpoints/ # Empty — user-managed
The <!-- USER --> marker in each file preserves your own notes through re-indexing.
Scripts Reference