CoreGraph

One queryable code graph for multi-language and monorepo codebases — find callers, impact, dead code, and cross-file inconsistencies, with every relationship tagged by how much you can trust it.
CoreGraph is a Rust CLI (coregraph, v0.1.3, MIT). It indexes your source once,
serves the graph from a background daemon, and answers questions over an IPC
socket, an MCP bridge for LLM agents, an LSP bridge for editors, and an optional
HTTP API. Because every answer comes from the precomputed graph instead of
re-reading files, results are precise, fast to return, and small enough to hand
straight to an LLM — a few hundred tokens where grepping and pasting files would
cost thousands.
What is CoreGraph
CoreGraph builds an in-memory symbol graph of your codebase by combining two
analysis layers into a single result:
- tree-sitter extracts symbols — functions, methods, structs, classes,
enums, config keys, doc comments — from each file.
- stack-graphs resolves names across files — so a call site in one file
links to the definition it actually binds to in another, not just to anything
with the same name.
Both run inside one coregraph index pass — no language servers, build system,
or compiler toolchain required.
Every edge in the graph carries a confidence score (0.0–1.0), the origin
that produced it (e.g. resolved by stack-graphs vs. matched syntactically), and a
trust model. That means a consumer — an LLM agent or a human — can tell a
compiler-grade fact from a heuristic guess, and filter by --min-confidence
when it matters.
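
To make the confidence model concrete, here is a minimal Python sketch of how a consumer might filter edges by trust. The `Edge` record, its field names, the origin labels, and the inclusive threshold are illustrative assumptions, not CoreGraph's actual schema or output format.

```python
from dataclasses import dataclass


# Hypothetical edge record: field names and origin labels are assumptions
# made for illustration, not CoreGraph's real data model.
@dataclass(frozen=True)
class Edge:
    source: str        # symbol where the relationship starts (e.g. a caller)
    target: str        # symbol it points to (e.g. the callee)
    confidence: float  # score in the range 0.0 to 1.0
    origin: str        # which analysis produced the edge


def filter_edges(edges, min_confidence):
    """Keep edges at or above the threshold, most trusted first."""
    if not 0.0 <= min_confidence <= 1.0:
        raise ValueError("min_confidence must be between 0.0 and 1.0")
    kept = [e for e in edges if e.confidence >= min_confidence]
    return sorted(kept, key=lambda e: e.confidence, reverse=True)


# Made-up sample data: one resolved edge and two syntactic matches.
edges = [
    Edge("api.handler", "billing.charge", 0.95, "stack-graphs"),
    Edge("api.handler", "legacy.charge", 0.40, "syntactic"),
    Edge("jobs.retry", "billing.charge", 0.70, "syntactic"),
]

trusted = filter_edges(edges, min_confidence=0.6)
for e in trusted:
    print(f"{e.source} -> {e.target} ({e.origin}, {e.confidence:.2f})")
```

With a threshold of 0.6 the low-scoring `legacy.charge` match is dropped, which is the kind of filtering `--min-confidence` is described as doing before results reach a model.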
Highlights
- Token-efficient — built for LLM agents. Every answer comes from the
precomputed graph, so CoreGraph returns the exact symbols and edges a question
touches instead of whole files. --output-format llm emits compact, structured
text. Results are paged against a --token-budget (default 8000; --fast caps it
at 2000, --full raises it to 16000) and report a live budget: used/total
counter, and --min-confidence filters low-trust noise before it reaches the
model. A caller lookup that would otherwise mean pasting several files lands in
a few hundred tokens.
- Fast, and stays fast. A single index pass builds the whole graph (~280 files
in ~2.3s on this repo), then a background daemon serves every later query from
memory over an IPC socket, with no re-indexing per command. Idle projects are
snapshotted to disk and warm-load on the next query (skipping tree-sitter
extraction) unless a source file changed, so repeat queries are effectively
instant and never stale.
- Many languages, one graph. Java, TypeScript, JavaScript, Python, Go, Rust,
and Kotlin each get both symbol extraction and cross-file name resolution,
alongside config (YAML/TOML/JSON) and Markdown layers — all unified into one
index you query identically regardless of language.
- Monorepo-native. One graph spans every package, service, and language in
the repo at once: a reference that crosses a directory or language boundary
resolves to its real definition, so cross-package call paths and shared
definitions are reachable in a single query rather than scattered across
per-language indexes. The daemon caches multiple projects under an LRU with an
optional heap budget, keeping a large polyglot repo responsive.
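
The token-budget paging described under "Token-efficient" can be sketched roughly as follows. This is a simplified Python illustration, not CoreGraph's implementation: the four-characters-per-token estimate, the greedy packing strategy, the `page_results` helper, and the sample result strings are all assumptions. Only the three budget values come from the list above.

```python
# Budget values quoted in the Highlights list; the keys are just labels here.
BUDGETS = {"default": 8000, "fast": 2000, "full": 16000}


def estimate_tokens(text):
    # Crude assumption: roughly four characters per token. The tokenizer and
    # accounting CoreGraph actually uses are not described in this document.
    return max(1, len(text) // 4)


def page_results(results, budget):
    """Greedily pack results into pages that each fit within `budget` tokens.

    Returns a list of (items, used_tokens) tuples. A single result larger
    than the budget gets a page of its own instead of being dropped.
    """
    if budget <= 0:
        raise ValueError("budget must be positive")
    pages, current, used = [], [], 0
    for item in results:
        cost = estimate_tokens(item)
        # Close the current page when the next item would overflow it.
        if current and used + cost > budget:
            pages.append((current, used))
            current, used = [], 0
        current.append(item)
        used += cost
    if current:
        pages.append((current, used))
    return pages


# Made-up caller-lookup results, one line per caller.
results = [
    f"caller_{i} -> billing.charge  src/service_{i}.py:{10 + i}"
    for i in range(300)
]

pages = page_results(results, BUDGETS["fast"])
for number, (items, used) in enumerate(pages, start=1):
    print(f"page {number}: {len(items)} results, budget: {used}/{BUDGETS['fast']}")
```

The point of the sketch is that the budget caps each page rather than truncating the result set: a smaller budget yields more, smaller pages, and the used/total counter tells the consumer how much room is left.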
Why CoreGraph