From systems-design
Designs end-to-end RAG architectures for use cases like customer support chatbots, documentation Q&A, legal search, and code assistance, covering ingestion pipelines, retrieval strategies, quality metrics, and scaling.
Slash command
/systems-design:rag-design
This skill is limited to the following tools:
Design a Retrieval-Augmented Generation system for a given use case.
$ARGUMENTS - The RAG use case to design for (e.g., "customer support chatbot", "documentation Q&A", "legal document search", "code assistant")
Clarify requirements by understanding:
Load relevant skills based on the use case:
rag-architecture
vector-databases
llm-serving-patterns
ml-inference-optimization
Spawn the rag-architect agent for comprehensive design:
Design the ingestion pipeline:
Design the retrieval pipeline:
Address quality and scale:
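
To make the ingestion and retrieval steps above concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the skill's actual output: the function names (`chunk_text`, `embed`, `build_index`, `retrieve`) are hypothetical, the word-window chunk sizes are arbitrary, and the bag-of-words `embed` is a toy stand-in for a real embedding model and vector database.

```python
import math
import re
from collections import Counter


def chunk_text(text: str, chunk_size: int = 40, overlap: int = 10) -> list[str]:
    """Ingestion: split text into windows of `chunk_size` words that share `overlap` words."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last window already reached the end of the text
    return chunks


def embed(text: str) -> Counter:
    """Toy embedding (assumption): lowercase bag-of-words counts, not a learned model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_index(documents: list[str], chunk_size: int = 40, overlap: int = 10):
    """Ingestion pipeline: chunk every document and store (chunk, vector) pairs."""
    return [
        (chunk, embed(chunk))
        for doc in documents
        for chunk in chunk_text(doc, chunk_size, overlap)
    ]


def retrieve(index, query: str, top_k: int = 3) -> list[str]:
    """Retrieval pipeline: rank chunks by similarity to the query, return the top_k texts."""
    query_vec = embed(query)
    ranked = sorted(index, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:top_k]]
```

In a real design, the retrieved chunks would be assembled into the LLM prompt as context; swapping `embed` for a learned embedding model and the list-based index for a vector store would not change the overall shape of the pipeline.
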
/sd:rag-design customer support chatbot with 10K FAQ documents
/sd:rag-design internal documentation Q&A for engineering team
/sd:rag-design legal document search for contract review
/sd:rag-design code assistant for enterprise codebase
/sd:rag-design research paper Q&A with 100K papers
/sd:rag-design product catalog search with structured data
/sd:rag-design multi-lingual knowledge base
| Category | Key Considerations |
|---|---|
| Customer Support | FAQ coverage, escalation, tone consistency |
| Documentation | Technical accuracy, code examples, versioning |
| Legal/Compliance | Citation accuracy, audit trails, access control |
| Code Assistance | AST-aware chunking, context relevance, IDE integration |
| Research/Academic | Multi-document reasoning, citation, long-form answers |
| E-commerce | Product attributes, inventory awareness, personalization |
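
The "AST-aware chunking" consideration for code assistance can be illustrated with a small sketch. This one assumes the codebase is Python and uses the standard-library `ast` module; the function name `chunk_python_source` and the chunk dictionary layout are hypothetical choices for illustration. Other languages would need their own parser.

```python
import ast


def chunk_python_source(source: str) -> list[dict]:
    """Return one chunk per top-level function or class instead of fixed-size windows."""
    tree = ast.parse(source)
    chunks = []
    for node in tree.body:
        # Only top-level definitions become chunks; imports and loose statements are skipped.
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            chunks.append({
                "name": node.name,
                "kind": type(node).__name__,
                "start_line": node.lineno,
                "end_line": node.end_lineno,
                # Exact source text of the definition (decorators are not included).
                "text": ast.get_source_segment(source, node),
            })
    return chunks
```

Chunking on definition boundaries keeps each retrieved unit syntactically complete, which is the point of AST-aware chunking; the line span stored with each chunk could later support IDE navigation back to the source.
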
| Complexity | Pattern | When to Use |
|---|---|---|
| Low | Basic RAG | Simple Q&A, small corpus |
| Medium | RAG + Reranking | Higher accuracy needed |
| Medium | Hybrid Search | Mixed keyword + semantic queries |
| High | Query-Transformed | Vague or complex queries |
| High | Agentic RAG | Multi-hop reasoning, tool use |
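
As one illustration of the Hybrid Search row, the sketch below merges a keyword ranking and a semantic ranking with reciprocal rank fusion (RRF). RRF is a technique chosen here for the example; the source does not say which fusion method the skill recommends. The function name and the document IDs are hypothetical, and `k = 60` is the commonly used smoothing constant.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several ranked lists of document IDs into a single ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. Ties are broken by document ID for determinism.
    """
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


if __name__ == "__main__":
    keyword_hits = ["doc_a", "doc_b", "doc_c"]   # e.g. from a keyword (BM25-style) index
    semantic_hits = ["doc_b", "doc_d", "doc_a"]  # e.g. from a vector index
    print(reciprocal_rank_fusion([keyword_hits, semantic_hits]))
    # prints ['doc_b', 'doc_a', 'doc_d', 'doc_c']
```

Because RRF uses only ranks, it avoids having to normalize keyword scores against vector similarity scores. A RAG + Reranking design would typically apply a reranking model to the fused list afterwards.
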
A comprehensive RAG system architecture including:
npx claudepluginhub melodic-software/claude-code-plugins --plugin systems-design
Covers RAG architecture including design patterns, chunking strategies, embedding models, retrieval techniques, hybrid search, and context assembly for LLM pipelines.
Guides designing RAG systems that ground LLM responses in retrieved documents to reduce hallucination and enable knowledge updates without retraining.
Guides RAG implementation from requirements to LLM integration, covering embedding selection, vector DB setup, chunking strategies, and retrieval optimization.