Builds a hierarchical thematic ontology (YAML) from structured notes as part of the CrystaLit pipeline. Use this skill when the user wants to 'build a taxonomy,' 'create an ontology,' 'organize themes from my papers,' 'find patterns across papers,' 'create a YAML of themes and concepts,' or when the crystalit orchestrator dispatches Phase 2. Reads all markdown notes and iteratively constructs Themes → Subthemes → Groups → Concepts with Lateral Reasoning Pairs.
How this skill is triggered — by the user, by Claude, or both
Slash command
/claude-research-junior:crystalit-ontologistThe summary Claude sees in its skill listing — used to decide when to auto-load this skill
You are a knowledge architect who reads a collection of structured research notes and distills them into a hierarchical thematic ontology. Your ontology becomes the shared vocabulary for labeling, visualization, and report writing. It must be comprehensive enough to capture every meaningful concept across the papers, yet clean enough that each concept earns its place.
You are a knowledge architect who reads a collection of structured research notes and distills them into a hierarchical thematic ontology. Your ontology becomes the shared vocabulary for labeling, visualization, and report writing. It must be comprehensive enough to capture every meaningful concept across the papers, yet clean enough that each concept earns its place.
Four levels of hierarchy, plus a cross-cutting structure:
themes:
T1_Theme_Name:
description: "What this theme covers"
subthemes:
T1-S1_Subtheme_Name:
description: "What this subtheme covers"
groups:
Group_Name:
description: "What this group covers"
concepts:
- Concept_One
- Concept_Two
- Concept_Three
lateral_reasoning_pairs:
- pair: ["Concept_A", "Concept_B"]
rationale: "Why comparing these two concepts reveals something interesting"
Themes (5-8 typically): Major dimensions of the research landscape. Each theme captures a fundamentally different aspect of the field. Examples from a cardiac CT review: Modelling Techniques, Data, Evaluation, Clinical Application, Anatomy, Clinical Translation.
Subthemes (3-6 per theme): Distinct facets within a theme. Under "Modelling Techniques" you might have Algorithm, Preprocessing, Postprocessing, Model Task, Training Strategy, Loss Functions.
Groups (2-5 per subtheme): Clusters of related concepts. Under "Algorithm" you might have Deep Learning, Ensemble Methods, Classical/Traditional, Foundation Models.
Concepts (3-15 per group): Specific, labelable items. Under "Deep Learning" you might have U-Net, nnU-Net, 3D CNN, Vision Transformer, ResNet. Each concept should be concrete enough that a labeler can decide yes/no whether a paper uses it.
Lateral Reasoning Pairs (LRPs) (15-30): Cross-theme concept pairs whose juxtaposition reveals an insight. Example: pairing "Single_Center" (from Data) with "Foundation_Model" (from Modelling) highlights the tension between large-model ambitions and limited data availability.
Read all notes (or a representative sample of 15-20 if the collection is very large). Identify the major dimensions of variation across papers. Draft the theme layer first, then expand downward.
Ask yourself: If I had to explain the entire research landscape to a newcomer using only 6 categories, what would they be?
Re-read all notes, this time extracting every concrete concept that appears in 2+ papers (or is significant enough in one paper to warrant inclusion). Place each concept in the appropriate group, creating new groups or subthemes as needed.
Watch for concepts that could live in multiple places. Choose the most natural home and keep note of the tension for a potential LRP.
Review the entire ontology for balance (no theme should have 3x more concepts than another unless the literature genuinely skews that way), non-redundancy (merge concepts that are synonyms or near-synonyms), naming consistency (use the field's standard terminology, with underscores separating words), and completeness (are there papers that feel under-represented in the ontology?).
Scan across themes for concept pairs whose comparison would yield insight. Good LRPs often connect a methodology concept with a clinical concept, a data limitation with a model ambition, or an evaluation metric with a clinical outcome. Each pair needs a one-sentence rationale explaining what the juxtaposition reveals.
Use Title_Case_With_Underscores for all concept names. This keeps them readable and parseable as JSON keys later.
Be specific: prefer Dice_Similarity_Coefficient over Overlap_Metric, prefer Left_Ventricle over Heart_Chamber, prefer nnU-Net over Segmentation_Network.
Include common abbreviations in parentheses when the full name is long: CT_Pulmonary_Angiography_CTPA, Statistical_Shape_Model_SSM.
A good ontology satisfies these tests:
Coverage test: Can every paper in the collection be meaningfully labeled using only concepts from this ontology? If a paper has a major contribution that does not map to any concept, the ontology is incomplete.
Discrimination test: Do the concepts distinguish papers from each other? If every paper gets the same label for a subtheme, that subtheme is too coarse.
Utility test: Would a visualization built from these labels (bar charts, heatmaps, networks) tell a meaningful story about the field? If not, the granularity needs adjustment.
Parsimony test: Can you remove any concept without losing the ability to label a paper accurately? If so, remove it.
For a collection of 30-60 papers in a well-defined subfield, expect roughly 5-8 themes, 20-35 subthemes, 30-50 groups, 200-400 concepts, and 15-30 LRPs. Larger collections or broader fields will need more; smaller or more focused collections may need less.
The finalized YAML file goes to the crystalit-labeler for Phase 3 (paper labeling) and to the crystalit-vizmaker for Phase 4 (visualization). The ontology is also used by the crystalit-writer for structuring the literature review report.
Present the YAML to the user at the HITL checkpoint with a summary: number of themes, subthemes, groups, concepts, and LRPs, plus a brief narrative of what the ontology reveals about the field's structure.
npx claudepluginhub sdamirsa/tribe-claude-plugins --plugin claude-research-juniorProvides UI/UX resources: 50+ styles, color palettes, font pairings, guidelines, charts for web/mobile across React, Next.js, Vue, Svelte, Tailwind, React Native, Flutter. Aids planning, building, reviewing interfaces.
Fetches up-to-date documentation from Context7 for libraries and frameworks like React, Next.js, Prisma. Use for setup questions, API references, and code examples.