rag

Agentic RAG pipeline using Claude agents for query analysis, hybrid retrieval, evaluation, and answer synthesis with CRAG validation.

npx claudepluginhub flashwade03/fablers-rag


v2.0.1

flashwade03

Stats

Plugins: 1

Stars: 1

Updated: Mar 23, 2026


fablers-agentic-rag

Ask your documents. Get a cited answer.

A Claude Code plugin that runs an agentic RAG pipeline — query analysis, hybrid retrieval, evaluation with CRAG validation, and cited answer synthesis — all orchestrated by Claude agents. Supports PDF, plain text, and Markdown.


What is this?

You have documents. You have questions. But keyword search is fragile and LLMs hallucinate without sources.

fablers-agentic-rag bridges the gap: it chunks your document (PDF, TXT, or Markdown), indexes it with vector + BM25, and deploys a 3-agent pipeline that retrieves, validates, and synthesizes answers with page-level citations — all inside Claude Code.
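
To make the ingest step concrete, here is a minimal sketch of page-aware chunking. It is not the plugin's actual code: the `Chunk` fields, the `chunk_pages` name, and the window sizes are all assumptions chosen for illustration. The point is only that each chunk carries its page number, which is what makes page-level citations possible later.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    chunk_id: int
    page: int  # 1-based page the text came from
    text: str


def chunk_pages(pages: list[str], max_words: int = 200, overlap: int = 40) -> list[Chunk]:
    """Split each page into overlapping word windows, remembering the page number."""
    if not 0 <= overlap < max_words:
        raise ValueError("overlap must be non-negative and smaller than max_words")
    step = max_words - overlap
    chunks: list[Chunk] = []
    for page_no, page_text in enumerate(pages, start=1):
        words = page_text.split()
        for start in range(0, len(words), step):
            window = words[start:start + max_words]
            chunks.append(Chunk(len(chunks), page_no, " ".join(window)))
            if start + max_words >= len(words):
                break  # this window already reached the end of the page
    return chunks


# Hypothetical usage: one string per page, as a PDF text extractor might return.
chunks = chunk_pages(["a b c d e f g h i j", "x y"], max_words=4, overlap=1)
for c in chunks:
    print(c.chunk_id, c.page, c.text)
```

Chunking per page keeps the citation metadata exact at the cost of never letting a chunk span a page break; a real ingester might trade that differently.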

Why this exists

There are many ways to do RAG with Claude Code. You could wire up Obsidian + an MCP server + a vector DB + a separate AI for summarization. It works — but now you're managing four tools, each with its own setup, updates, and quirks.

Or you could just feed the whole PDF to Claude. But a 600-page book blows past the context window, and even if it fits, you'd burn tokens re-reading it with every question.

This plugin takes a different approach: one tool, one workflow. Ingest, search, validate, answer — all inside Claude Code. The only reason OpenAI is involved at all is that Claude doesn't offer an embedding API (yet). Everything else runs on the Claude you're already using.

vs. Typical RAG MCP

| | Typical RAG MCP | This Plugin |
|---|---|---|
| Workflow | Obsidian / vector DB / external AI: multiple tools to manage | Claude Code only: ingest to answer in one place |
| Brain | External LLM API calls (OpenAI, etc.) for reasoning | Claude Code agents ARE the brain; no external LLM |
| Architecture | Single retrieve → paste | Multi-agent pipeline with validation |
| Quality check | None; returns whatever vector search finds | CRAG validation scores every passage, retries with rewritten queries |
| Complex questions | Same path for all queries | Complexity routing: 1 agent for simple, 3 for multi-part |
| Citations | Raw chunk dump or none | Every claim gets `[Source N]` inline + sources section |
| Search method | Vector-only (misses exact terms) | Hybrid vector + BM25 (catches both semantics and keywords) |
| Infrastructure | Often requires Docker, vector DB server | Zero infra: pure Python files + Claude agents |
| Self-correction | One-shot, no retry | CRAG loop rewrites queries up to 2x when results are poor |
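
The self-correction row can be sketched as a small loop. This is an illustration under stated assumptions, not the plugin's implementation: in the plugin, grading and rewriting are done by Claude agents, whereas here `search`, `grade`, and `rewrite` are hypothetical callables you pass in, and the `0.5` threshold is made up. Only the "up to 2 rewrites" limit comes from the table above.

```python
from typing import Callable

Passage = dict  # hypothetical shape: {"text": str, "page": int}


def crag_retrieve(
    query: str,
    search: Callable[[str], list[Passage]],
    grade: Callable[[str, Passage], float],
    rewrite: Callable[[str], str],
    threshold: float = 0.5,
    max_rewrites: int = 2,
) -> tuple[list[Passage], str]:
    """Return passages that pass grading, rewriting the query when none do."""
    current = query
    for attempt in range(max_rewrites + 1):
        passages = search(current)
        # Grade against the ORIGINAL question: a rewrite changes how we search,
        # not what the user asked.
        kept = [p for p in passages if grade(query, p) >= threshold]
        if kept or attempt == max_rewrites:
            return kept, current
        current = rewrite(current)
    return [], current  # only reached if max_rewrites is negative


# Hypothetical stand-ins so the loop can be run end to end.
def toy_search(q: str) -> list[Passage]:
    if "definition" in q:
        return [{"text": "A lens is a question set.", "page": 12}]
    return [{"text": "Unrelated paragraph.", "page": 3}]


def toy_grade(question: str, passage: Passage) -> float:
    return 1.0 if "lens" in passage["text"].lower() else 0.0


kept, final_query = crag_retrieve(
    "what is a lens", toy_search, toy_grade, lambda q: q + " definition"
)
print(final_query, kept)
```

An empty result after the last attempt is returned as-is, so the caller can decide whether to answer "not found" rather than synthesize from weak passages.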

vs. Reading the whole PDF

| | Whole PDF in context | This Plugin |
|---|---|---|
| ~50 pages | Works fine. Just read it. | Overkill |
| ~150+ pages | Exceeds context window or costs explode | Index once, query cheaply forever |
| Repeated questions | Full re-read every time (10 questions = 10x cost) | One-time index, ~5K tokens per query |
| Citation accuracy | May hallucinate page numbers | Chunk metadata has exact pages/headings |
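
A back-of-the-envelope version of the "repeated questions" row, as a sketch. The ~5K tokens per query figure is from the table; the 500 tokens per page figure is an assumption picked only for the arithmetic, and the one-time indexing cost is left out because the README does not state it.

```python
def full_read_tokens(pages: int, tokens_per_page: int, questions: int) -> int:
    """Tokens consumed when the whole document is re-read for every question."""
    return pages * tokens_per_page * questions


def indexed_query_tokens(questions: int, tokens_per_query: int = 5_000) -> int:
    """Tokens consumed when each question only sees retrieved chunks."""
    return questions * tokens_per_query


PAGES = 600            # book size mentioned in this README
TOKENS_PER_PAGE = 500  # ASSUMPTION, not a measured value
QUESTIONS = 10

whole = full_read_tokens(PAGES, TOKENS_PER_PAGE, QUESTIONS)
indexed = indexed_query_tokens(QUESTIONS)
print(whole)             # 3000000
print(indexed)           # 50000
print(whole // indexed)  # 60
```

Under these assumptions the gap is about 60x for ten questions, and it widens linearly with every further question.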

The only external API call is OpenAI text-embedding-3-small for query embedding. Everything else — query analysis, reranking, validation, answer synthesis — runs on Claude Code's own agent system. No extra LLM costs.

TL;DR: An MCP gives you search results. This plugin gives you a validated, cited answer — powered by the Claude you're already paying for. Put those tokens to good use.

A small arc reactor, not a power grid

This plugin was born from a real need: ideating on game design theory from Jesse Schell's The Art of Game Design — a 600-page book full of interconnected concepts, lenses, and frameworks. The goal was never to index a million documents. It was to deeply understand one.

Think of it as a compact, self-contained reactor you drop into a project:

A textbook you want to study with citations
A technical manual you need to reference accurately
Research papers you want to cross-examine
Project documentation you want to query

Under the hood it is just numpy arrays plus in-memory BM25: no vector DB, no server, no Docker. If your data fits in a few files, this is all you need. For enterprise-scale knowledge bases with millions of records, use GraphRAG or a dedicated vector DB solution instead.
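
To show how little machinery "numpy arrays + in-memory BM25" needs, here is a minimal sketch of hybrid search. It is not the plugin's code. The embeddings are assumed to be precomputed (the toy 2-D vectors below stand in for real embedding vectors), the tokenizer is plain whitespace splitting, and the two rankings are merged with reciprocal rank fusion, which is a technique swapped in for illustration because the README does not say how the plugin combines scores.

```python
import math
from collections import Counter

import numpy as np


def bm25_scores(query: str, docs: list[str], k1: float = 1.5, b: float = 0.75) -> np.ndarray:
    """Okapi BM25 score of every document (docs must be a non-empty list)."""
    tokenized = [d.lower().split() for d in docs]
    avg_len = sum(len(t) for t in tokenized) / len(tokenized)
    doc_freq = Counter(term for t in tokenized for term in set(t))
    scores = np.zeros(len(docs))
    for term in query.lower().split():
        n = doc_freq.get(term, 0)
        if n == 0:
            continue  # term appears nowhere, contributes nothing
        idf = math.log((len(docs) - n + 0.5) / (n + 0.5) + 1.0)
        for i, tokens in enumerate(tokenized):
            tf = tokens.count(term)
            norm = k1 * (1 - b + b * len(tokens) / avg_len)
            scores[i] += idf * tf * (k1 + 1) / (tf + norm)
    return scores


def cosine_scores(query_vec: np.ndarray, doc_vecs: np.ndarray) -> np.ndarray:
    """Cosine similarity between one query vector and each row of doc_vecs."""
    doc_norms = np.linalg.norm(doc_vecs, axis=1)
    return doc_vecs @ query_vec / (doc_norms * np.linalg.norm(query_vec))


def hybrid_rank(vector_scores: np.ndarray, keyword_scores: np.ndarray, k: int = 60) -> list[int]:
    """Merge two score lists with reciprocal rank fusion; best document first."""
    fused = np.zeros(len(vector_scores))
    for scores in (vector_scores, keyword_scores):
        order = np.argsort(-scores, kind="stable")  # indices, best score first
        for rank, doc_idx in enumerate(order, start=1):
            fused[doc_idx] += 1.0 / (k + rank)
    return [int(i) for i in np.argsort(-fused, kind="stable")]


docs = [
    "the lens of fun",
    "balance in game mechanics",
    "lens of curiosity and the player",
]
doc_vecs = np.array([[1.0, 0.0], [0.0, 1.0], [0.8, 0.6]])  # toy embeddings
query_vec = np.array([0.6, 0.8])                            # toy query embedding

ranking = hybrid_rank(cosine_scores(query_vec, doc_vecs), bm25_scores("lens curiosity", docs))
print(ranking[0])  # 2
```

Rank fusion sidesteps the problem that cosine and BM25 scores live on different scales; a weighted sum of normalized scores would be another reasonable choice.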

What's new in v2.0.0

Faster: 3 agents instead of 5 — simple questions use only 1 agent call
Simpler structure: repo root = plugin (no nested plugin/ directory)
New commands: /search for direct search, /ingest for document indexing
Smarter routing: complexity-based branching skips unnecessary agents
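
Complexity-based routing can be pictured like this. Everything here is a hypothetical sketch: the agent names are invented placeholders, and the keyword heuristic is a crude stand-in for whatever analysis the plugin actually performs. Only the "1 agent for simple, 3 for complex" split comes from the notes above.

```python
import re

# Placeholder agent names, not the plugin's real ones.
SIMPLE_PLAN = ["answer_agent"]
COMPLEX_PLAN = ["query_analyst", "evaluator", "answer_synthesizer"]

# Words that loosely suggest a comparison or multi-part question.
MULTI_PART_HINTS = re.compile(r"\b(and|versus|vs|compare)\b")


def is_multi_part(question: str) -> bool:
    """Crude heuristic: several question marks or a comparison keyword."""
    if question.count("?") > 1:
        return True
    return MULTI_PART_HINTS.search(question.lower()) is not None


def plan_agents(question: str) -> list[str]:
    """Pick the short path for simple questions, the full pipeline otherwise."""
    return list(COMPLEX_PLAN if is_multi_part(question) else SIMPLE_PLAN)


print(plan_agents("What is the lens of fun?"))
print(plan_agents("Compare the lens of fun and the lens of curiosity."))
```

A keyword heuristic like this will misroute plenty of questions (any simple question containing "and", for example); it is here only to show where the branch sits in the flow.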
