From ai-ml-eng-pro
Dataset creation, cleaning, augmentation, versioning, QA for ML/AI pipelines. Use when preparing or improving a training or evaluation dataset.
Slash command
/ai-ml-eng-pro:dataset-curator
Manages the full dataset lifecycle for ML/AI projects — from raw data collection through cleaning, labeling, augmentation, splitting, versioning, and quality assurance. Ensures datasets are reproducible, documented (datasheets), balanced across classes/demographics, and free from common pitfalls like data leakage, label noise, and unintended bias.
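As an illustration of the reproducible splitting and leakage checks described above, here is a minimal sketch in Python. It is not taken from the skill itself: the function names `assign_split` and `find_leakage`, the `text` field, and the 20% test fraction are all hypothetical choices made for the example.

```python
import hashlib


def assign_split(record_id: str, test_fraction: float = 0.2) -> str:
    """Deterministically assign a record to 'train' or 'test' by hashing its ID.

    Hypothetical helper: the same ID always maps to the same split, so the
    split is reproducible across runs and machines without storing a seed.
    """
    digest = hashlib.sha256(record_id.encode("utf-8")).hexdigest()
    # Map the first 32 bits of the hash to a number in [0, 1).
    bucket = int(digest[:8], 16) / 0x100000000
    return "test" if bucket < test_fraction else "train"


def find_leakage(train_rows, test_rows, key=lambda row: row["text"].strip().lower()):
    """Return the sorted normalized keys that appear in both splits.

    Assumes each row is a dict with a 'text' field; pass a different `key`
    function for other schemas.
    """
    train_keys = {key(row) for row in train_rows}
    test_keys = {key(row) for row in test_rows}
    return sorted(train_keys & test_keys)


if __name__ == "__main__":
    train = [{"text": "Hello world"}, {"text": "foo"}]
    test = [{"text": " hello WORLD "}, {"text": "bar"}]
    print(assign_split("row-1"))
    print(find_leakage(train, test))
```

Hashing the record ID keeps assignments stable when new rows are appended, whereas a shuffled split can move existing rows between train and test. The exact-match check only catches duplicates that are identical after trimming and lowercasing; near-duplicates would need a fuzzier comparison.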
model-evaluator: Curated datasets are the foundation of reliable model evaluation
prompt-engineer: Test sets for prompt evaluation are curated datasets
embedding-manager: Embedding quality depends on dataset quality
huggingface-hub: Dataset storage, versioning, and sharing infrastructure
npx claudepluginhub haj1t/senior-dev-squad-skills --plugin ai-ml-eng-pro