From dataset-splitter
Splits datasets into training, validation, and testing sets for ML model development. Use when the user requests "split dataset", "train-test split", or "data partitioning".
Slash command
/dataset-splitter:splitting-datasets
This skill is limited to the following tools:
Split datasets into training, validation, and testing sets with configurable ratios and stratification options.
This skill automates dividing a dataset into subsets for training, validating, and testing machine learning models. Holding out validation and test data that the model never trains on is what makes the later evaluation meaningful.
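This page does not show the skill's implementation. As a rough illustration of what a three-way split with configurable ratios involves, here is a minimal sketch using only the Python standard library. The function name `split_dataset`, its parameters, and the default seed are assumptions made for this example, not part of the skill.

```python
import random


def split_dataset(rows, ratios=(0.70, 0.15, 0.15), seed=42):
    """Shuffle rows and cut them into train/validation/test lists.

    Hypothetical helper for illustration; not the skill's actual code.
    """
    if len(ratios) != 3 or abs(sum(ratios) - 1.0) > 1e-9:
        raise ValueError("ratios must be three values that sum to 1.0")

    shuffled = list(rows)
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)

    n = len(shuffled)
    n_train = round(n * ratios[0])
    n_val = round(n * ratios[1])

    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # whatever remains goes to the test set
    return train, val, test


train, val, test = split_dataset(range(100))
print(len(train), len(val), len(test))  # 70 15 15
```

Shuffling before slicing matters when the source file is ordered (for example by date or by class), since otherwise each subset would cover a different region of the data.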
This skill activates when you need to:
User request: "Split the data in 'my_data.csv' into 70% training, 15% validation, and 15% testing sets."
The skill will:
User request: "Create a train-test split of 'large_dataset.csv' with an 80/20 ratio."
The skill will:
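The steps themselves are not listed on this page. As an illustration only, an 80/20 split that also uses the stratification option mentioned in the summary could look like the sketch below. It assumes each record carries a class label. The function `stratified_train_test_split`, the `label_of` callback, and the `"label"` field are hypothetical names chosen for this example.

```python
import random
from collections import defaultdict


def stratified_train_test_split(rows, label_of, test_ratio=0.20, seed=42):
    """Split rows so each label keeps roughly the same share in both subsets.

    Hypothetical helper for illustration; not the skill's actual code.
    """
    rng = random.Random(seed)

    # Group rows by their class label.
    by_label = defaultdict(list)
    for row in rows:
        by_label[label_of(row)].append(row)

    train, test = [], []
    for group in by_label.values():
        rng.shuffle(group)
        n_test = round(len(group) * test_ratio)
        test.extend(group[:n_test])
        train.extend(group[n_test:])

    # Shuffle again so rows are not ordered by label within each subset.
    rng.shuffle(train)
    rng.shuffle(test)
    return train, test


# Toy data: 20 "pos" rows and 80 "neg" rows.
rows = [{"id": i, "label": "pos" if i % 5 == 0 else "neg"} for i in range(100)]
train, test = stratified_train_test_split(rows, label_of=lambda r: r["label"])
print(len(train), len(test))  # 80 20
```

Splitting per label rather than over the whole dataset keeps a rare class from ending up entirely in one subset, which a plain random split can do on small or imbalanced data.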
This skill can be integrated with other data processing and model training tools within the Claude Code ecosystem to create a complete machine learning workflow.
The skill produces structured output relevant to the task.
npx claudepluginhub flight505/skill-forge --plugin dataset-splitter