repo-indexer

Index any codebase for persistent Claude context, with minimal token overhead between sessions (under 500 tokens for the CLAUDE.md boot loader).

The Problem

Every new Claude session starts blank. You re-explain your architecture, your conventions, your stack — burning hundreds of tokens just to get Claude up to speed. For large codebases, this context tax is constant and expensive.

The Solution

repo-indexer runs a structured 6-phase analysis of your codebase and writes the results into a tiered memory system that scales across sessions with near-zero overhead:

L0: Claude Native Memory  → repo roster, patterns (~100–300 tokens, always present)
L1: CLAUDE.md             → boot loader only (<500 tokens, auto-loaded every session)
L2: .claude/memory/*.md   → deep context files (loaded on-demand, only when needed)
L3: Conversation History  → full analysis output (searchable, costs 0 tokens until used)

Claude loads L0 + L1 automatically. L2 and L3 are retrieved only when the task demands it. Files are pointers, not stores.

Quick Start

# Add the marketplace
/plugin marketplace add jyshnkr/repo-indexer

# Install the plugin
/plugin install repo-indexer

Then in any project directory:

index this repo

What It Does

Phase 1: Git Sync

Pulls the latest commits from the first branch that exists, checked in the order release > main > master, so the analysis runs against current code.
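
The priority order can be read as "take the first branch in the list that exists". A minimal sketch of that selection step, assuming the caller already has the list of branch names; `pick_sync_branch` and `BRANCH_PRIORITY` are hypothetical names, not part of the plugin, and the pull itself is left out:

```python
from typing import Iterable, Optional

# Priority order taken from the README: release > main > master.
BRANCH_PRIORITY = ("release", "main", "master")


def pick_sync_branch(existing: Iterable[str]) -> Optional[str]:
    """Return the highest-priority branch that exists, or None if none match."""
    available = set(existing)
    for name in BRANCH_PRIORITY:
        if name in available:
            return name
    return None


print(pick_sync_branch(["master", "main", "feature/x"]))  # main
```

Returning `None` rather than raising leaves it to the caller to decide what to do with a repo that has none of the three branches.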

Phase 2: Detect Repo Type

Automatically classifies the codebase:

Monorepo: pnpm-workspace.yaml, turbo.json, packages/, apps/
Microservices: multiple Dockerfiles, docker-compose with 3+ services
Single App: default when no strong signals are present
Library: pyproject.toml, Cargo.toml, setup.py, go.mod, or src/-only layout (no apps/)

Heuristic note: a single weak signal may still default to Single App.
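
To make the signals concrete, here is a rough sketch of a classifier over a list of relative paths. The signal names come from the list above; the check order, the "workspace file, or both packages/ and apps/" rule, and the function name `classify_repo` are assumptions of this sketch, not the plugin's actual logic:

```python
from pathlib import PurePosixPath

MONOREPO_FILES = {"pnpm-workspace.yaml", "turbo.json"}
MONOREPO_DIRS = {"packages", "apps"}
LIBRARY_FILES = {"pyproject.toml", "Cargo.toml", "setup.py", "go.mod"}


def classify_repo(paths, compose_services=0):
    """Classify a repo from relative POSIX paths (hypothetical heuristic)."""
    parts = [PurePosixPath(p).parts for p in paths]
    root_files = {p[0] for p in parts if len(p) == 1}
    top_dirs = {p[0] for p in parts if len(p) > 1}
    dockerfiles = sum(1 for p in parts if p and p[-1] == "Dockerfile")

    # Assumed rule: a workspace file alone is a strong signal,
    # while packages/ or apps/ on its own is treated as weak.
    if (root_files & MONOREPO_FILES) or (MONOREPO_DIRS <= top_dirs):
        return "monorepo"
    if dockerfiles >= 2 or compose_services >= 3:
        return "microservices"
    if (root_files & LIBRARY_FILES) or top_dirs == {"src"}:
        return "library"
    return "single-app"  # default when no strong signal is present
```

Counting compose services requires parsing docker-compose, so the sketch takes that number as a parameter instead of reading the file.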

Phase 3: Index

Analyzes 9 areas systematically:

Config files (package.json, pyproject.toml, Cargo.toml, go.mod)
Entry points (main, CLI, server bootstrap)
Directory structure (depth 3)
Core modules (business logic, services, models)
API surface (routes, endpoints, schemas)
Data layer (models, migrations, ORM)
External dependencies (third-party integrations)
Build/deploy (Dockerfile, CI/CD, Makefile)
Tests (structure, fixtures, patterns)
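
As an illustration of the "depth 3" limit on the directory structure step, a small sketch that lists directories up to a fixed depth. Skipping hidden directories and the name `list_dirs` are assumptions here, not documented plugin behavior:

```python
import os


def list_dirs(root, max_depth=3):
    """Return relative directory paths up to max_depth levels below root."""
    root = os.path.abspath(root)
    found = []
    for current, dirnames, _ in os.walk(root):
        rel = os.path.relpath(current, root)
        depth = 0 if rel == "." else rel.count(os.sep) + 1
        # Assumption: hidden directories such as .git are not worth indexing.
        dirnames[:] = sorted(d for d in dirnames if not d.startswith("."))
        if depth >= max_depth:
            dirnames[:] = []  # stop os.walk from descending any further
        if depth > 0:
            found.append(rel.replace(os.sep, "/"))
    return found
```

Pruning `dirnames` in place is what keeps `os.walk` from visiting deeper levels, so large trees are not traversed in full.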

Phase 4: Generate Output

Full analysis written to conversation (L3) with ### SEARCH KEYWORDS for retrieval
Minimal .claude/ file tree created at repo root (L2)
CLAUDE.md created as a <500 token boot loader (L1)
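
A rough sketch of what creating the L1/L2 skeleton could look like, using the file names listed under Output Structure. The function name `scaffold` and the choice never to overwrite existing files are assumptions of this sketch:

```python
from pathlib import Path

# File names as listed under Output Structure in this README.
MEMORY_FILES = ("architecture.md", "conventions.md", "glossary.md")


def scaffold(repo_root):
    """Create the CLAUDE.md and .claude/ skeleton; return newly created files."""
    root = Path(repo_root)
    for sub in ("memory", "plans", "checkpoints"):
        (root / ".claude" / sub).mkdir(parents=True, exist_ok=True)

    targets = [root / "CLAUDE.md"]
    targets += [root / ".claude" / "memory" / name for name in MEMORY_FILES]

    created = []
    for path in targets:
        if not path.exists():  # assumption: never clobber existing files
            path.touch()
            created.append(path.relative_to(root).as_posix())
    return created
```

Running it a second time returns an empty list, which makes the step safe to repeat.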

Phase 5: Validate Token Budgets

python3 skills/repo-indexer/scripts/estimate-tokens.py

Validates budgets using a heuristic token estimate (CLAUDE.md must be under 500 tokens).
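
For intuition, a bytes-per-token estimate can be as small as the sketch below. The ratio of 4 bytes per token and the names `estimate_tokens` and `within_budget` are assumptions; this is not the code in estimate-tokens.py:

```python
import math

BYTES_PER_TOKEN = 4     # assumed ratio, not the script's actual constant
CLAUDE_MD_BUDGET = 500  # from this README: CLAUDE.md must be under 500 tokens


def estimate_tokens(text, bytes_per_token=BYTES_PER_TOKEN):
    """Estimate tokens from the UTF-8 byte length, rounding up."""
    return math.ceil(len(text.encode("utf-8")) / bytes_per_token)


def within_budget(text, budget=CLAUDE_MD_BUDGET):
    """True when the estimate is strictly under the budget."""
    return estimate_tokens(text) < budget
```

Under these assumptions, a CLAUDE.md of roughly 2,000 bytes sits right at the limit.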

Phase 6: Suggest Native Memory Update

python3 skills/repo-indexer/scripts/generate-memory-update.py

Suggests 2–3 lines to add to Claude's native memory so the next session starts with repo awareness — no CLAUDE.md load required.

Token Budget

| Layer | Budget | When Loaded |
|---|---|---|
| L0: Native Memory | ~100–300 tokens | Always (free) |
| L1: CLAUDE.md | < 500 tokens | Every session start |
| L2: memory/*.md | < 10,000 tokens total | On-demand only |
| L3: Conversation History | 0 tokens | When searched |

Total auto-loaded per session: < 800 tokens. Everything else costs nothing until you need it. Token counts are estimated via a bytes-per-token heuristic; treat these as guardrails, not exact model counts.
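
The 800-token figure is the sum of the upper bounds of the two auto-loaded layers (300 + 500). A small sketch of that arithmetic, using the budgets from the table; the `Layer` type and function name are illustrative only:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Layer:
    name: str
    max_tokens: int    # upper bound from the Token Budget table
    auto_loaded: bool  # loaded at session start without being asked


LAYERS = (
    Layer("L0: Native Memory", 300, True),
    Layer("L1: CLAUDE.md", 500, True),
    Layer("L2: memory/*.md", 10_000, False),
    Layer("L3: Conversation History", 0, False),
)


def auto_loaded_ceiling(layers=LAYERS):
    """Sum the budgets of the layers that load in every session."""
    return sum(layer.max_tokens for layer in layers if layer.auto_loaded)


print(auto_loaded_ceiling())  # 800
```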

Use Cases

"Index this repo" → Full 6-phase workflow. Claude knows your project before you ask your first question.

"Set up Claude context for this project" → Same workflow. Optimized for team onboarding — every developer gets instant Claude context.

"Help me understand this codebase" → Checks existing Claude memory and past conversations first. If a prior index is found, Claude uses it. If .claude/ already exists, Claude compares it with the current codebase, flags inconsistencies, and updates the files incrementally.

Output Structure

After indexing, your repo gets:

your-project/
├── CLAUDE.md                    # <500 token boot loader (L1)
└── .claude/
    ├── memory/
    │   ├── architecture.md      # System design, diagrams, key flows (L2)
    │   ├── conventions.md       # Naming, patterns, git workflow (L2)
    │   └── glossary.md          # Domain terms, acronyms (L2)
    ├── plans/                   # Empty — user-managed
    └── checkpoints/             # Empty — user-managed

A dedicated marker in each file preserves your own notes through re-indexing.
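
One way such a marker can work is sketched below: regenerated content replaces everything above the marker, and everything from the marker onward is carried over untouched. The marker string is a hypothetical placeholder (the real one is not shown in this README), and `merge_reindexed` is an illustrative name, not the plugin's code:

```python
# Hypothetical placeholder: the actual marker text is not shown in this README.
USER_MARKER = "<!-- USER NOTES -->"


def merge_reindexed(old_text, new_generated):
    """Swap in freshly generated content, keeping notes below the marker."""
    _, sep, user_notes = old_text.partition(USER_MARKER)
    generated = new_generated.rstrip("\n")
    if not sep:
        # No marker in the old file: add one so future notes survive.
        return f"{generated}\n\n{USER_MARKER}\n"
    return f"{generated}\n\n{USER_MARKER}{user_notes}"
```

Under this assumption, anything written above the marker is treated as generated and will be overwritten on the next run.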
