From inquisita
Core skill for using Inquisita, a document intelligence platform. Use this skill whenever the user mentions "Inquisita", wants to create a matter, upload documents, search or query documents, run analysis jobs, create document collections, or work with any Inquisita MCP tools. Also trigger when the user asks to organize documents into a knowledge base, run document analysis or classification, build document collections, or enrich collections with structured data — even if they don't mention Inquisita by name. Trigger for any document management task that goes beyond simple file reading.
How this skill is triggered — by the user, by Claude, or both
Slash command
/inquisita:inquisitaThe summary Claude sees in its skill listing — used to decide when to auto-load this skill
Inquisita is a document intelligence platform and system of record. Documents go in, structured knowledge comes out. It organizes documents into **matters**, processes them into searchable chunks with embeddings, and supports LLM analysis jobs that produce structured results you can query with SQL.
Inquisita is a document intelligence platform and system of record. Documents go in, structured knowledge comes out. It organizes documents into matters, processes them into searchable chunks with embeddings, and supports LLM analysis jobs that produce structured results you can query with SQL.
Everything you create in Inquisita — matters, collections, analysis results, enrichments — persists as shared organizational data. Other users, attorneys, paralegals, and their agents can access and build on the same work. When you create a well-organized matter with enriched collections, you're not just completing a task — you're building a durable knowledge asset for the firm. Think of collections especially as the firm's curated, structured understanding of a matter: a "Privileged Documents" collection or a "Key Evidence for Breach Claim" collection enriched with relevance scores is something the whole team can query and rely on going forward.
Matter (container)
├── Categories (organizational buckets for uploads)
├── Documents (uploaded files, auto-processed into chunks + embeddings)
├── Analysis Jobs (async LLM or vector tasks that produce structured results)
└── Collections (curated document sets, optionally enriched with analysis results)
Everything lives inside a matter. A matter has categories (like folders), documents belong to one category each, and collections are dynamic groupings built from queries.
inquisita_create_matter → inquisita_get_upload_link → [upload via curl] → inquisita_discover (verify)
Create the matter with meaningful categories. Categories are mandatory single-select buckets, not tags. Define them upfront — every uploaded document must go into one.
Example legal categories: pleadings, correspondence, contracts, discovery, work_product.
Example business categories: financials, contracts, communications, reports.
Get an upload link. This returns a session_token (valid 5 minutes) and an api_url. Upload each file via curl or via a script.
curl -H "Authorization: Bearer <session_token>" \
-F "[email protected]" \
"<api_url>/api/v1/documents/matters/<matter_id>/upload?category=<category_key>"
If your working environment doesn't support shell/python access, you can return the link with the user and ask them to use it to upload.
Wait for processing. After upload, Inquisita auto-extracts text, chunks the document, and generates embeddings. This takes 30–120 seconds per document. Poll with inquisita_discover to watch document counts increase, or query processing_status in the documents view.
Archive File Extraction Archive files like zip, mbox, pst, etc are unpacked in Inquisita and expanded into their child items, each processed separately.
inquisita_discover → inquisita_query
Always call inquisita_discover first to see what views, columns, categories, and metadata are available.
Critical: never ask the user to write SQL. When a user asks a natural-language question about their documents ("who attended the September 21, 2021 Lookout Ridge meeting?", "what are the liability caps in our vendor contracts?"), it is your job to translate that into the appropriate discover + query calls. Do not respond by asking the user "what SQL query or WHERE clause would you like to use?" — the user is not expected to know SQL or the schema. Call discover to see the available views and columns, then write the SQL yourself, combining it with semantic and keyword search as needed. If you genuinely cannot answer after a reasonable search (data not ingested, ambiguous question), explain what you tried in plain language — do not punt by asking the user for a query.
Three search modes, combinable in one query:
search_text @@ websearch_to_tsquery('term1 OR "exact phrase"')semantic_query text, reference :query_vector in SQL for vector distance orderingSemantic search is usually the best starting point. Keyword search is better for exact names, dates, or specific phrases.
Queries can also be run with the presigned_url=true field to get presigned urls with every result. While docs can also be seen in the UI, this can be useful to provide your user with direct doc viewing links, or to build your own generative UI (In general generative UI takes significant tokens and time and should be avoided unless user requests. If using presigned urls in generative UIs or scripts, its best to save the output json and import it in your scripts/UI so you don't need to hard code them into your UI.
Read references/sql-patterns.md for common query patterns and syntax.
inquisita_analyze → [poll inquisita_get_analysis_job every 10-15s] → inquisita_query on results
This is the most powerful pattern and the one most likely to trip you up. Analysis jobs are asynchronous. After submitting, you must poll until status is complete, complete_with_errors, or failed.
Two methods:
llm: Send document content to an LLM with a prompt and output schema. Returns structured JSON per document.vector_similarity: Compute cosine similarity against a reference query. Returns similarity_score and is_match.Two levels:
document (default): One result per file. Uses the document summary. Required for collection enrichments.chunk: One result per chunk (page/section). Use for finding specific passages, extracting per-page data, or when you need granular results.Analysis Can Be Pricey:
The analysis LLM is a lower intelligence model. It has NO context about the matter, parties, or case. It's ability to reason or make deductions from multi-step deductions is limits. Write explicit prompts that include all necessary context. Be specific about output constraints. Perhaps provide examples.
Good prompt:
In the construction defect case Weston Properties LLC v. Summit Ridge Builders Inc.,
Weston (property owner/plaintiff) alleges Summit Ridge (general contractor/defendant)
performed defective construction work on a commercial building at 123 Main St,
Lakewood, CO. Key claims include foundation cracking, waterproofing failure, and
HVAC system defects. Other involved parties include Pinnacle Engineering (Weston's
engineering consultant), the City of Lakewood (issued permits and inspections), and
various subcontractors.
Extract every dated event mentioned in this document. For each event, provide:
- date: in YYYY-MM-DD format. If only a month and year are given, use the 1st of
the month. If only a year is given, use January 1st.
- description: a one-sentence description of what happened, with enough context to
understand the event without reading the source document.
- party: the party primarily involved. Use one of: "Weston Properties", "Summit Ridge
Builders", "Pinnacle Engineering", "City of Lakewood", "Both", or the name of
the specific entity if it's someone else.
- source_type: the type of evidence this date comes from. Use one of: "contract",
"email", "inspection_report", "invoice", "letter", "court_filing", "other".
If no dated events are found, return an empty array. Do not infer or fabricate dates.
Only extract dates that are explicitly stated or clearly implied in the text.
Bad prompt: "What dates are mentioned?"
Output schemas define the structure of what the analysis LLM returns. The schema goes in config.output_schema and supports several formats that can be mixed:
{"vendor": "string", "amount": "number", "is_relevant": "boolean"}{"relevance": ["high", "medium", "low", "none"]} — the LLM can only return one of these values.{"keywords": "string[]", "amounts": "number[]"}{
"events": {
"type": "array",
"items": {
"type": "object",
"properties": {
"date": {"type": "string"},
"description": {"type": "string"},
"party": {"type": "string"}
}
}
}
}
Match the schema to your prompt. For classification tasks, use enums to constrain outputs. For extraction tasks that pull multiple items from a document (dates, parties, amounts), use nested object arrays. For simple yes/no questions, use booleans.
The good prompt example above would pair with this schema:
{
"events": {
"type": "array",
"items": {
"type": "object",
"properties": {
"date": {"type": "string"},
"description": {"type": "string"},
"party": {"type": "string"},
"source_type": {"type": "string"}
}
}
}
}
Read references/analysis-guide.md for more prompt templates, output schema patterns, and intelligence level guidance.
Querying results: After completion, query analysis_results_view (both levels), analysis_results_chunk (chunk-level only), or analysis_results_doc (document-level only). Results are in a results JSONB column — access fields with results->>'field_name'.
inquisita_create_collection → [optionally] inquisita_enrich_collection → inquisita_query on collection_members
Collections are named, curated document sets built from SQL queries, optionally with a semantic_query for vector filtering. They serve as the organizational layer between raw documents and structured work product — think of them as saved searches that you can then layer analysis results onto.
Importantly, collections are persistent and shared. When you create a "Key Evidence — Breach of Contract" collection enriched with relevance scores, that collection is available to every user and agent with access to the matter. Build collections with descriptive names and rich enrichments — you're creating durable reference points that the team will use long after this conversation ends.
Creating a collection:
There are two ways to populate a collection:
sql="category = 'correspondence'" grabs all documents matching the filter.semantic_query for conceptual matching. This is usually better for topic-based collections (e.g., "documents related to change orders") because it catches documents that discuss the concept without using the exact keywords.Enriching collections with analysis results:
This is where collections become powerful. After running an analysis job, you can merge its results onto collection items via inquisita_enrich_collection. Each item gains structured data under a job_name namespace — and these become queryable fields on collection_members. The enrichment fields from each job are discoverable: call inquisita_discover with collection_id to see all enrichment sources, their job names, field schemas, and whether they're document-level or chunk-level. Then use those fields in SQL queries against collection_members to filter, sort, and build views.
Two paths to get there:
inquisita_create_collection → inquisita_enrich_collectionenrichment_job_id when creating the collectionYou can stack multiple enrichments on the same collection — run a relevance job and a privilege job, enrich with both, then query for documents that are high-relevance AND non-privileged. Each job's results are namespaced independently, so enrichments never collide.
Document-level vs chunk-level enrichment:
Enrichments on collection members are always document-level — one enrichment record per document per job. This works cleanly for document-level analysis jobs (one result per file maps directly to one collection item).
For chunk-level analysis jobs, results are auto-aggregated to document level when enriching: booleans are OR'd (becomes any_X — true if ANY chunk was true), numbers are MAX'd (becomes max_X — the highest value across chunks), and enums collapse by priority order. This means you lose per-chunk granularity in the enrichment itself.
When you need chunk-level detail within a collection, don't rely on the aggregated enrichments. Instead, query the collection_analysis view, which gives you the raw per-chunk analysis results scoped to that collection's documents. This preserves full granularity — which chunk, which page, what the analysis found there.
The practical split: use collection_members with enrichments for document-level filtering and sorting ("show me the top 10 most relevant documents"). Use collection_analysis when you need passage-level precision ("show me the specific pages that mention the March inspection").
Collection types: `
also have a type specified on creation, freetext
For the most powerful results, chain analysis into collections:
job_name (e.g., relevance_scoring, privilege_review)enrichment_job_id pointing at the completed job — or create the collection first and call inquisita_enrich_collection afterwardThis is how you build things like "all documents relevant to the breach of contract claim, scored by relevance, with privileged documents flagged" — run a relevance job and a privilege job, create a collection, enrich with both, then query:
SELECT source_file_id, file_name,
enrichments->'relevance'->>'level' as relevance,
(enrichments->'privilege'->>'privileged')::boolean as privileged
FROM collection_members
WHERE collection_name = 'Breach of Contract Discovery'
AND enrichments->'relevance'->>'level' IN ('high', 'medium')
AND (enrichments->'privilege'->>'privileged')::boolean = false
ORDER BY enrichments->'relevance'->>'level'
| Tool | Purpose | Key params |
|---|---|---|
create_matter | Create a new matter with categories | name, categories[] |
get_upload_link | Get session token for uploading files | matter_id |
discover | See what's in a matter or collection | matter_id, optional collection_id |
query | SQL search across documents, chunks, results | sql, matter_id, optional semantic_query |
analyze | Run async LLM or vector analysis job | matter_id, method, config, sql |
get_analysis_job | Poll analysis job status | job_id |
create_collection | Build a named document set from a query | name, sql, matter_id |
enrich_collection | Merge analysis results into collection items | collection_id, job_ids[] |
modify_collection | Add/remove documents from a collection | collection_id, action, source_file_ids[] |
update_matter | Change matter name, description, categories | matter_id |
get_document | Get document details + download URL | source_file_id |
delete_document | Remove a document and all its data | source_file_id |
Inquisita has a web interface where users can browse their matters, documents, and collections visually. When you create or modify resources, share the relevant link so the user can see their data:
https://inquisita.ai/app/matters/{matter_id}https://inquisita.ai/app/matters/{matter_id}/collections?collection={collection_id}After creating a matter, uploading documents, or building collections, include the appropriate link in your response. The web interface lets users browse documents, view collection contents, and explore analysis results without needing to write SQL — it's the human-friendly counterpart to the API work you're doing as an agent.
get_analysis_job until complete.discover before querying to see available views, columns, and enrichment fields.discover and translate it into a query yourself. Never respond with "what WHERE clause would you like to use?" — the user does not know the schema, and from their perspective the agent is broken when you ask.Inquisita's tools provide powerful abstractions that let agents flexibly analyze documents at scale. The workflow patterns above can be combined to support a wide range of tasks. Here are some examples:
Contract risk review. Upload a set of vendor contracts to a matter. Run a chunk-level analysis job prompted to identify clauses related to liability caps, indemnification, termination rights, and auto-renewal. Create a collection of all contracts, enrich with the results, then query to surface contracts with unfavorable terms — like uncapped liability or auto-renewal without notice periods. Produce a summary report ranked by risk.
Due diligence document room. Create a matter for an acquisition target with categories for financials, IP, employment, and regulatory. Upload thousands of documents from a data room export. Run multiple analysis jobs — one for relevance classification, one for red-flag detection (litigation mentions, regulatory violations, undisclosed liabilities). Build collections per diligence workstream, enriched with both jobs, so each reviewer gets a filtered, scored set of documents relevant to their area.
Insurance claims processing. Upload a batch of claim files (police reports, medical records, photos, adjuster notes). Run analysis jobs to extract key fields: date of loss, claimed amount, injury type, and whether the claimant's account is consistent across documents. Create collections grouped by claim status (new, under review, flagged for investigation) and enrich with the extracted fields so adjusters can sort and prioritize by severity or inconsistency.
Compliance audit. Upload a company's policy documents alongside employee communications (emails, Slack exports). Run a chunk-level analysis job that checks whether specific compliance requirements (data handling, access controls, reporting obligations) are referenced or violated in the communications. Build a collection of flagged communications enriched with which policy they potentially violate, and produce an audit report.
Research literature review. Upload a corpus of academic papers or technical reports. Run an analysis job to extract key findings, methodology type, sample size, and whether the conclusions support or contradict a specific hypothesis. Create a collection enriched with these fields, then query to build a structured literature table — sorted by methodology, filtered to papers above a sample size threshold, grouped by whether they support or challenge the hypothesis.
Real estate portfolio analysis. Upload leases, inspection reports, and appraisals across a portfolio of properties. Run analysis jobs to extract lease terms (rent, expiration, renewal options), inspection findings (structural issues, code violations), and appraised values. Build collections per property, enriched with all three jobs, then query across the portfolio to identify leases expiring within 12 months, properties with unresolved code violations, or assets where appraised value has declined.
npx claudepluginhub inquisita/inquisita-plugin --plugin inquisitaGuides creation, editing, and verification of skills for AI coding agents using test-driven development with subagent scenarios. Use when authoring or debugging skills.