This skill should be used when the user wants to "build a Cloud Index", set up a "managed RAG pipeline on LlamaCloud", configure "hybrid search / rerank / metadata filter on LlamaCloud", "ingest documents into a LlamaCloud index", or expose a "managed retrieval endpoint". Owns the end-to-end managed Cloud Index (source → parse/transform → chunk → embed → sink) and retrieval tuning; defers OSS/self-managed RAG to `llamaindex-framework`, raw parsing to `llamacloud-parse`, typed extraction to `llamacloud-extract`, and account/key setup to `llamacloud`.
Slash command: `/llamacloud:llamacloud-index`
LlamaCloud Cloud Index is a fully managed RAG pipeline: connect a data source, and LlamaCloud parses, transforms, chunks, embeds, and stores documents into a sink, then exposes a hosted retrieval endpoint. Choose it to get production retrieval without operating ingestion workers or a vector database. This skill owns the build (source → sink) and retrieval tuning; it does not teach OSS RAG, raw parsing, or extraction.
Prefer Cloud Index when the value is retrieval that works, not infrastructure:
Prefer self-managed OSS (llamaindex-framework with VectorStoreIndex + your own vector DB) when:
you need full control of chunking/embedding code, must keep data on existing local infra, want a
vector store LlamaCloud doesn't host, or are doing rapid offline experimentation. See
references/ingestion.md for the full decision table.
A Cloud Index is defined by an ordered pipeline. Decide each stage deliberately:
- See references/ingestion.md for which to pick.
- Set a sparse model (splade / bm25 / auto) to enable hybrid search at the sink. Decide this now: hybrid must be enabled at ingestion, not bolted on at query time.

Indexes are created and managed via the API/SDK as the durable default. A UI flow also exists, and the legacy "Index v1" UI is restricted to projects with pre-existing pipelines, so treat API/SDK management as the path to teach.
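The build-time constraint above (hybrid needs a sparse model chosen at ingestion) can be captured in a small planning record before any API call is made. This is a minimal sketch, not LlamaCloud code: `CloudIndexPlan`, `check_query`, and every field name and placeholder value are hypothetical and exist only for illustration. Only the stage order and the sparse options (splade / bm25 / auto) come from this guide.

```python
from dataclasses import dataclass
from typing import List, Optional

SPARSE_CHOICES = {"splade", "bm25", "auto"}  # sparse-model options named in this guide


@dataclass(frozen=True)
class CloudIndexPlan:
    """Hypothetical planning record; field names are NOT LlamaCloud API parameters."""

    source: str
    parse_tier: str
    chunking: str
    embedding_model: str
    sink: str
    sparse_model: Optional[str] = None  # None means dense-only, so no hybrid search

    def stages(self) -> List[str]:
        """List the decisions in pipeline order: source, parse, chunk, embed, sink."""
        return [self.source, self.parse_tier, self.chunking, self.embedding_model, self.sink]

    def supports_hybrid(self) -> bool:
        """Hybrid is available only when a sparse model was chosen at build time."""
        return self.sparse_model in SPARSE_CHOICES


def check_query(plan: CloudIndexPlan, wants_hybrid: bool) -> None:
    """Raise early when a query asks for hybrid search the index was not built for."""
    if wants_hybrid and not plan.supports_hybrid():
        raise ValueError("hybrid search needs a sparse model set at ingestion; re-ingest to enable it")


# Placeholder values only; real source, tier, model, and sink names come from the live docs.
dense_only = CloudIndexPlan("example-source", "example-tier", "example-chunking",
                            "example-embedding", "example-sink")
hybrid_ready = CloudIndexPlan("example-source", "example-tier", "example-chunking",
                              "example-embedding", "example-sink", sparse_model="bm25")
print(dense_only.supports_hybrid(), hybrid_ready.supports_hybrid())  # False True
```

Writing the plan down first makes the one irreversible choice (sparse model or not) visible before ingestion starts; everything else in retrieval can be tuned later per request.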
The retrieval endpoint accepts tuning per request — no re-ingestion needed (except enabling hybrid, which is set at build time). Three levers, applied in this order of impact:
- Hybrid search with an alpha blend. Shift toward keyword for exact terms/codes/names; toward semantic for paraphrased intent. Requires a sparse model configured at build time.

See references/retrieval-tuning.md for the mode-selection table, the hybrid/rerank/filter decision table, alpha-tuning intuition, and metadata-schema authoring tips.
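To build intuition for the alpha lever, the toy sketch below blends a semantic (dense) score and a keyword (sparse) score per document. It assumes a simple linear blend, `alpha * dense + (1 - alpha) * sparse`, which matches this guide's convention that 0 is keyword and 1 is semantic. Whether LlamaCloud fuses scores exactly this way is an assumption, and `blend_scores`, the document ids, and the scores are all made up for illustration.

```python
from typing import Dict, List, Tuple


def blend_scores(dense: Dict[str, float], sparse: Dict[str, float],
                 alpha: float) -> List[Tuple[str, float]]:
    """Rank documents by an assumed linear blend: alpha=0 is keyword-only, alpha=1 is semantic-only."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    doc_ids = set(dense) | set(sparse)
    # A document missing from one retriever contributes 0 from that side.
    blended = {
        doc_id: alpha * dense.get(doc_id, 0.0) + (1.0 - alpha) * sparse.get(doc_id, 0.0)
        for doc_id in doc_ids
    }
    return sorted(blended.items(), key=lambda item: item[1], reverse=True)


# Made-up scores: one document matches the paraphrased intent, the other an exact code.
dense_scores = {"doc_paraphrase": 0.9, "doc_exact_code": 0.3}
sparse_scores = {"doc_paraphrase": 0.1, "doc_exact_code": 0.95}

keyword_leaning = blend_scores(dense_scores, sparse_scores, alpha=0.2)
semantic_leaning = blend_scores(dense_scores, sparse_scores, alpha=0.8)
print(keyword_leaning[0][0])   # doc_exact_code
print(semantic_leaning[0][0])  # doc_paraphrase
```

With these numbers, a low alpha surfaces the exact-code match and a high alpha surfaces the paraphrase match, which is the trade-off to keep in mind when tuning.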
Teach the stable Cloud Index surface. Cloud Index v2 is beta and deferred — flag it as beta and do not make it the default path. Do not present v2 as GA.
Use the high-level services SDK. Confirm the exact package name and method signatures live
before pinning — the Python package has shifted between llama-cloud and llama-cloud-services,
and retrieval may run through llama-index client classes. These snippets show shape, not exact
API:
```python
# Python: query a deployed Cloud Index's retrieval endpoint (illustrative).
# Verify class/param names via the docs MCP before using.
# `index` is assumed to be an already-connected Cloud Index client object.
retriever = index.as_retriever(
    dense_similarity_top_k=10,  # widen, then rerank
    alpha=0.5,                  # 0 = keyword, 1 = semantic
    enable_reranking=True,
    rerank_top_n=3,
)
nodes = retriever.retrieve("...query...")
```
```typescript
// TypeScript: shape only; fetch the current @llamaindex client + params via MCP.
const retriever = index.asRetriever({ similarityTopK: 10, enableReranking: true });
const nodes = await retriever.retrieve("...query...");
```
For exact, current creation/retrieval signatures and connector/sink config, use the MCP — do not copy these verbatim.
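The "widen, then rerank" pattern in the Python snippet (a top-k of 10 narrowed to a rerank top-n of 3) can be sketched offline with plain Python. This is an illustration of the idea only: `widen_then_rerank` and `toy_overlap_reranker` are hypothetical names, the word-overlap scorer is a stand-in for a real reranker model, and none of it reflects how LlamaCloud implements reranking.

```python
from typing import Callable, List, Tuple


def widen_then_rerank(candidates: List[Tuple[str, float]],
                      rerank_score: Callable[[str], float],
                      top_k: int = 10, top_n: int = 3) -> List[str]:
    """Keep the top_k first-stage hits, re-score them, and return the best top_n texts."""
    shortlist = sorted(candidates, key=lambda pair: pair[1], reverse=True)[:top_k]
    reranked = sorted(shortlist, key=lambda pair: rerank_score(pair[0]), reverse=True)
    return [text for text, _ in reranked[:top_n]]


def toy_overlap_reranker(query: str) -> Callable[[str], float]:
    """Hypothetical stand-in for a reranker: counts query words that appear in the text."""
    query_words = set(query.lower().split())
    return lambda text: float(len(query_words & set(text.lower().split())))


# Made-up first-stage results as (text, retrieval score) pairs.
hits = [
    ("refund policy for annual plans", 0.50),
    ("annual refund report", 0.90),
    ("refund", 0.60),
    ("unrelated", 0.95),
]
scorer = toy_overlap_reranker("refund policy annual")

narrow = widen_then_rerank(hits, scorer, top_k=3, top_n=2)
wide = widen_then_rerank(hits, scorer, top_k=4, top_n=2)
print(narrow)  # ['annual refund report', 'refund']
print(wide)    # ['refund policy for annual plans', 'annual refund report']
```

In this toy data the best answer sits fourth in the first-stage ranking, so it only reaches the reranker when the shortlist is wide enough. That is the reason to widen the top-k before narrowing with the rerank top-n.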
Treat package names/versions, class/method signatures, connector and sink names, parameter schemas, and credit numbers as live lookups. Fetch via:
- `mcp__plugin_llamacloud_docs__search_docs`: concept / BM25 search.
- `mcp__plugin_llamacloud_docs__read_doc`: read a known doc page.
- `mcp__plugin_llamacloud_docs__grep_docs`: exact symbol / regex search.
- Append index.md to any https://developers.llamaindex.ai/... page URL and WebFetch it.

Anchor doc paths:

- `/llamaparse/cloud-index/getting_started/`: build flow, sources, sinks, deploy.
- `/llamaparse/cloud-index/guides/parsing_transformation/`: parse tier, segmentation, chunking, embedding/sparse.
- `/llamaparse/cloud-index/guides/retrieval/advanced/`: hybrid, rerank, metadata filtering & inference.

Hand-offs to other skills:

- OSS/self-managed RAG (VectorStoreIndex, own vector DB) → llamaindex-framework.
- Raw parsing → llamacloud-parse.
- Typed extraction → llamacloud-extract.
- Account/key setup → llamacloud.

Reference files:

- references/ingestion.md: managed-vs-self-managed decision table, source choice, parse-tier and segmentation/chunking decisions, embedding + sparse-model choices, sink selection, gotchas.
- references/retrieval-tuning.md: retrieval-mode selection table, hybrid/rerank/metadata-filter decision table, alpha intuition, metadata filter-inference schema tips, anti-patterns.

npx claudepluginhub jbaham2/llamacloud-plugin --plugin llamacloud