From semlang
Use when creating an initial SemLang ontology from a data source, optional documentation, and user validation of core entities, relationships, roles, situations, measures, and sample questions.
Slash command: /semlang:initial-ontology-creation
Use this skill when a user wants to create an initial ontology for a domain, data product, warehouse schema, application database, or analytics package.
Assume the ontology is for production analytics unless the user says otherwise. The goal is not to mirror every table. The goal is to infer the domain's durable concepts, relationships, useful states, events, metrics, and question vocabulary, then validate that ontology with the user before treating it as authoritative.
Use kind for identity-bearing entities such as Customer, Event, Venue, Product, Account, or Supplier.
Use event for temporal occurrences such as Order, TicketScan, MessageSend, Incident, or Payment.
Use situation for state or measurement snapshots such as InventoryLevel, PriceSnapshot, SubscriptionStatus, or DailyBalance.
Use relator for association or bridge concepts such as EventAttraction, AccountMembership, ProductBundleItem, or ProviderFacilityAffiliation.
Use phase only for lifecycle stages of an entity, not table variants.
Load ontology files with the SemLang MCP load_ontology tool. Large ontologies should be organized into domain files and loaded in batches.
Ask for the minimum information needed to begin. Prefer concise questions and proceed with reasonable assumptions when the user cannot answer everything.
Ask about the data source:
Try to connect with the available information. Ask follow-up questions reactively only when connection attempts or source inspection reveal a concrete blocker.
Ask one optional context question:
Are there any pre-existing sources of information about this data that I should use, such as a data catalog, ERD, dbt docs, BI dashboards, metric dictionary, business glossary, README, notebook, or examples of questions people ask?
If documentation is unavailable, continue from source inspection and make the missing context visible in the validation review.
Inspect the source system before authoring ontology files.
For each relevant table, file, or source:
If the source supports metadata commands, prefer structured metadata over scraping display text. For Databricks, use current CLI shapes such as databricks tables get <catalog>.<schema>.<table> when available.
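As a rough sketch of preferring structured metadata over display text, the snippet below assumes the Databricks CLI is installed and authenticated, that it accepts the global --output json flag on tables get, and that the returned JSON carries a columns list whose entries have name and type_text fields. The helper names fetch_table_metadata and summarize_columns are hypothetical and not part of SemLang or the CLI.

```python
import json
import subprocess


def fetch_table_metadata(full_name: str) -> dict:
    """Ask the CLI for JSON metadata instead of parsing its text output.

    Assumption: the CLI is on PATH, authenticated, and supports --output json.
    """
    result = subprocess.run(
        ["databricks", "tables", "get", full_name, "--output", "json"],
        capture_output=True,
        text=True,
        check=True,  # raise if the CLI reports an error
    )
    return json.loads(result.stdout)


def summarize_columns(metadata: dict) -> list[tuple[str, str]]:
    """Extract (column name, type) pairs, tolerating missing keys."""
    return [
        (col.get("name", "?"), col.get("type_text", "?"))
        for col in metadata.get("columns", [])
    ]


# Hypothetical payload shaped like the assumed CLI output.
sample = {
    "full_name": "prod.sales.orders",
    "columns": [
        {"name": "order_id", "type_text": "bigint"},
        {"name": "submitted_at", "type_text": "timestamp"},
    ],
}
print(summarize_columns(sample))
```

Keeping the parsing step separate from the CLI call means the same summary logic can be reused if the metadata comes from another structured source.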
Analyze optional documentation and existing usage to infer business meaning.
Look for:
kind concepts.
event concepts.
situation concepts.
relator concepts.
Compare the docs against source inventory and note conflicts. When documentation and physical schema disagree, ask the user to resolve the business meaning before encoding it as canonical.
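One way to surface conflicts between documentation and the physical schema is a plain set comparison. The sketch below assumes both sides have already been reduced to a mapping from table name to column names. The function name find_conflicts and the sample tables are hypothetical.

```python
def find_conflicts(
    documented: dict[str, set[str]],
    physical: dict[str, set[str]],
) -> list[str]:
    """List tables and columns where the docs and the source disagree."""
    notes = []
    for table in sorted(documented.keys() | physical.keys()):
        doc_cols = documented.get(table)
        phys_cols = physical.get(table)
        if doc_cols is None:
            notes.append(f"{table}: in source but undocumented")
        elif phys_cols is None:
            notes.append(f"{table}: documented but not found in source")
        else:
            for col in sorted(doc_cols - phys_cols):
                notes.append(f"{table}.{col}: documented but missing from source")
            for col in sorted(phys_cols - doc_cols):
                notes.append(f"{table}.{col}: in source but undocumented")
    return notes


# Hypothetical inputs for illustration only.
documented = {"orders": {"order_id", "status"}, "refunds": {"refund_id"}}
physical = {"orders": {"order_id", "order_status"}, "customers": {"customer_id"}}

for note in find_conflicts(documented, physical):
    print(note)
```

The output is a list of open questions to raise with the user, not a decision about which side is correct.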
Start writing SemLang early. Do not create a separate long-lived inventory document unless the user asks for one.
Create domain-oriented SemLang files and use comments near the top of each file, source, or concept to hold the working notes that would otherwise live in a separate inventory. As the model becomes clearer, move more information out of comments and into real SemLang declarations.
For each candidate concept, capture in comments or declarations:
kind, event, situation, relator, or phase.
occurrence_time for events and observation_time for situations.
Use comments such as:
// Source audit 2026-05-21: prod.sales.orders had 12,431,992 rows.
// Grain: one row per submitted order.
// Open question: confirm whether cancelled orders remain in this source.
concept Order is event from databricks.table('prod.sales.orders') {
...
}
Group concepts into domain files by business area once the inventory is large enough. Use a single entry-point file that includes shared types before domain files. Avoid re-including shared files from every domain file.
After the first concept files exist, perform a dedicated relationship pass.
For each concept:
Use join_one, join_many, or join_cross only when the relationship is meaningful for analysis.
Use with when the field name and target identity make the relationship obvious.
Use on clauses when source and target field names differ.
Use ? unless referential integrity is known.
Use relator concepts when the association has its own grain, dates, roles, or measures.
Document important relationships in comments when the first SemLang draft cannot encode them yet.
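Whether referential integrity is known can often be checked on a sample of keys before the relationship is written down. The following is a minimal sketch, assuming the child foreign-key values and parent identities have been pulled into memory. The function names and the required/optional labels are hypothetical and are not SemLang syntax.

```python
def orphan_keys(child_keys, parent_keys):
    """Return child foreign-key values that match no parent identity."""
    parents = set(parent_keys)
    return sorted({k for k in child_keys if k is not None and k not in parents})


def suggest_optionality(child_keys, parent_keys):
    """Suggest 'required' only when every child row resolves to a parent."""
    has_nulls = any(k is None for k in child_keys)
    has_orphans = bool(orphan_keys(child_keys, parent_keys))
    return "optional" if has_nulls or has_orphans else "required"


# Hypothetical sample: order rows pointing at customer ids.
order_customer_ids = [1, 2, 2, None, 5]
customer_ids = [1, 2, 3]

print(orphan_keys(order_customer_ids, customer_ids))
print(suggest_optionality(order_customer_ids, customer_ids))
```

A clean sample does not prove integrity for the full source, so the result is best recorded as evidence in a comment rather than treated as a guarantee.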
Add reusable semantics after the structural model is coherent.
For each concept, identify:
Use roles only when the name carries reusable business meaning. If a filter merely narrows a source for one analysis, use a where: clause or lens instead.
Create the first SemLang draft in small, valid increments. A draft is valid once the entry-point file loads with the SemLang MCP load_ontology tool and the feedback has been used to fix parse, semantic, source, and lowering issues.
Put package first in every SemLang file.
Put include declarations immediately after package.
Put type: declarations before concepts that use them.
Run load_ontology after each coherent batch instead of waiting for the complete ontology.
Before presenting the ontology for validation, run a systematic audit. When delegation is available and permitted, have a sub-agent perform this audit independently, then iterate with that sub-agent until the issues are resolved or clearly deferred.
Check for:
occurrence_time on events.
observation_time on situations.
Fix obvious issues before the user validation session. Keep unresolved business questions visible.
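The temporal-axis part of the audit can be sketched as a simple pass over a draft inventory. This example assumes each concept is tracked as a plain dictionary with name and type keys. That in-memory shape and the function name audit_temporal_axes are hypothetical and do not reflect a SemLang API.

```python
# Which temporal axis each concept type is expected to declare.
REQUIRED_AXIS = {"event": "occurrence_time", "situation": "observation_time"}


def audit_temporal_axes(concepts):
    """Report events and situations that lack their temporal axis."""
    issues = []
    for concept in concepts:
        axis = REQUIRED_AXIS.get(concept["type"])
        if axis and not concept.get(axis):
            issues.append(f"{concept['name']}: {concept['type']} is missing {axis}")
    return issues


# Hypothetical draft inventory.
concepts = [
    {"name": "Order", "type": "event", "occurrence_time": "submitted_at"},
    {"name": "TicketScan", "type": "event"},
    {"name": "InventoryLevel", "type": "situation", "observation_time": None},
    {"name": "Customer", "type": "kind"},
]

for issue in audit_temporal_axes(concepts):
    print(issue)
```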
Review the ontology with the user in business language before treating it as complete. Present concise summaries and ask for corrections.
First, validate the core concepts and relationships:
Here are the core concepts I found: the entities, events, situations, and relationship concepts, plus the important relationships between them. Do these seem right? What is missing, misnamed, or incorrectly connected?
Show:
kind, event, situation, relator, and phase.
Then validate each entity one at a time:
For <Entity>, here are the roles and situations I found: the named states, categories, lifecycle stages, and point-in-time measurements of this concept. Are these right? Which names would your team use?
For every major kind, show:
situation concepts that measure or describe the entity at a point in time.
event concepts that happen to or because of the entity.
Then validate sample questions:
Here are sample questions this ontology should answer. Do these sound like the questions people actually ask? Which are wrong, low value, or missing?
Include a balanced set:
Finally, solicit more real questions:
What are five to ten real questions people ask about this domain that are painful, frequent, high-stakes, or currently require manual work?
For each added question, record whether the current ontology can answer it. If not, identify the missing concept, relationship, role, measure, temporal axis, validation, or source.
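Recording coverage per question could look like the sketch below. It assumes each question has been broken down by hand into the ontology elements it needs. The QuestionCoverage class, the assess function, and the sample questions are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class QuestionCoverage:
    """One user question and the ontology elements it still lacks."""
    question: str
    required: set[str]
    missing: set[str] = field(default_factory=set)

    @property
    def answerable(self) -> bool:
        return not self.missing


def assess(question: str, required: set[str], available: set[str]) -> QuestionCoverage:
    """Compare what a question needs with what the ontology defines."""
    return QuestionCoverage(question, set(required), set(required) - set(available))


# Hypothetical ontology contents and questions.
available = {"Order", "Customer", "occurrence_time"}
q1 = assess("How many orders did each customer place?", {"Order", "Customer"}, available)
q2 = assess("What share of orders were refunded?", {"Order", "Refund"}, available)

for q in (q1, q2):
    status = "answerable" if q.answerable else f"missing {sorted(q.missing)}"
    print(f"{q.question} -> {status}")
```

The missing set gives a starting point for naming the gap as a concept, relationship, role, measure, temporal axis, validation, or source.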
After validation, revise the ontology and produce a concise handoff.
The handoff should include:
Run the project's validation command before handoff when working in a repository. For SemLang projects, prefer the repository's full check command when available.
npx claudepluginhub unsupervisedcom/semlang