Expert Retrieval-Augmented Generation: chunking, embeddings, vector/hybrid search, reranking, and grounded answers. Trigger keywords: RAG, retrieval, embeddings, vector database, chunking, reranking, hybrid search, BM25, grounding, citations, hallucination, context window, recall. Use to build or debug RAG pipelines and improve answer quality/faithfulness.
How this skill is triggered — by the user, by Claude, or both
Slash command
/rag-expert:rag-expertThe summary Claude sees in its skill listing — used to decide when to auto-load this skill
> Garbage retrieval in, hallucination out. Answer quality is bounded by what you retrieve, so fix retrieval before touching the prompt. Measure retrieval and generation separately.
Garbage retrieval in, hallucination out. Answer quality is bounded by what you retrieve, so fix retrieval before touching the prompt. Measure retrieval and generation separately.
prompt-engineering-expert.deep-learning-expert.llm-testing-expert.api-design-expert.| Symptom | Likely fix |
|---|---|
| Right doc never retrieved | better embeddings, hybrid search, smaller/structured chunks |
| Retrieved but answer ignores it | reranker, fewer/cleaner chunks, stronger grounding prompt |
| Misses exact codes/names | add keyword/BM25 (hybrid) |
| Confident but wrong | enforce "answer only from context" + citations + "I don't know" |
| Slow/expensive | fewer chunks, cache embeddings, smaller reranker, pre-filter |
Retrieve → hybrid → rerank → grounded prompt
dense = store.search(embed(query), top_k=20, filter={"tenant": tid, "acl": user.groups})
sparse = bm25.search(query, top_k=20)
candidates = rrf_fuse(dense, sparse) # reciprocal-rank fusion
ranked = reranker.rank(query, dedupe(candidates))[:5] # cross-encoder
context = "\n\n".join(f"[{i+1}] {c.text}\n(source: {c.meta['title']})"
for i, c in enumerate(ranked))
prompt = (
"Answer using ONLY the context. Cite sources as [n]. "
"If the answer is not in the context, say you don't know.\n\n"
f"Context:\n{context}\n\nQuestion: {query}"
)
prompt-engineering-expert — structuring the grounded generation prompt.llm-testing-expert — measuring faithfulness and retrieval metrics.sql-expert — metadata filtering / pgvector storage.api-design-expert — streaming retrieval/answer endpoints.Creates, edits, and optimizes skills for Claude Code, including drafting, evaluating with test prompts, iterating on performance, and improving skill descriptions for better triggering accuracy.
npx claudepluginhub miaoge-ge/coding-agent-skills --plugin rag-expert