Uploads, validates, and manages datasets for DataRobot projects. Handles file uploads, data quality checks, schema review, and prediction dataset preparation.
Slash command
/datarobot-agent-skills:datarobot-data-preparation
This skill provides guidance for preparing and managing data in DataRobot, including uploading datasets, validating data quality, and managing dataset versions.
Most common use case: Upload and validate a dataset
upload_dataset(file_path, dataset_name) to upload data
validate_dataset(dataset_id) to check data quality
get_dataset_schema(dataset_id) to review structure
Example: "Upload sales_data.csv and check if it's ready for training"
Use this skill when you need to:
User request: "Upload my sales_data.csv file and check if it's ready for training."
Agent workflow:
User request: "Prepare a prediction dataset based on the training data structure from project abc123."
Agent workflow:
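For the schema-matching part of this request, one simple check is that every training column except the target also appears in the prediction file. A minimal sketch using only the standard library, assuming both datasets are available as local CSV files with a header row; `read_header`, `compare_columns`, and the file and column names in the usage comment are illustrative and are not DataRobot SDK calls:

```python
import csv


def read_header(path):
    """Return the column names from the first row of a CSV file."""
    with open(path, newline="", encoding="utf-8") as f:
        return next(csv.reader(f), [])


def compare_columns(training_columns, prediction_columns, target=None):
    """Report columns missing from, or extra in, the prediction data.

    The target column is normally absent from prediction data, so it is
    not counted as missing.
    """
    training_set = set(training_columns)
    prediction_set = set(prediction_columns)
    missing = [c for c in training_columns
               if c != target and c not in prediction_set]
    extra = [c for c in prediction_columns if c not in training_set]
    return {"missing": missing, "extra": extra, "compatible": not missing}


# Hypothetical usage with local files and a target column named "sales":
# report = compare_columns(read_header("train.csv"),
#                          read_header("predict.csv"),
#                          target="sales")
```

Extra columns are reported but do not mark the data as incompatible here; whether they should be dropped before prediction is a separate decision.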
This skill guides you to use the DataRobot Python SDK directly. Install the SDK if needed:
pip install datarobot
Use these DataRobot SDK methods for data management:
Dataset Operations:
dr.Dataset.create_from_file(file_path, name) - Upload dataset
dr.Dataset.get(dataset_id) - Get dataset details
dr.Dataset.list() - List all datasets
dataset.row_count - Get row count
dataset.column_count - Get column count
Dataset Information:
dataset.name - Dataset name
dataset.id - Dataset ID
dataset.created_at - Creation timestamp
See the Common Patterns section below for complete examples.
This skill includes executable helper scripts that Claude can run directly:
scripts/upload_dataset.py - Upload a dataset file to DataRobot
Usage example:
# Upload dataset
python scripts/upload_dataset.py sales_data.csv "Sales Data Q4 2024"
Claude can run this script directly or use it as reference when writing code.
import datarobot as dr
import os

# Initialize client
client = dr.Client(
    token=os.getenv("DATAROBOT_API_TOKEN"),
    endpoint=os.getenv("DATAROBOT_ENDPOINT")
)

# Upload dataset
dataset = dr.Dataset.create_from_file(
    file_path="sales_data.csv",
    name="Sales Data Q4 2024"
)
print(f"Dataset ID: {dataset.id}")
print(f"Rows: {dataset.row_count}, Columns: {dataset.column_count}")

# Get dataset details
dataset_info = dr.Dataset.get(dataset.id)
print(f"Dataset name: {dataset_info.name}")
print(f"Created: {dataset_info.created_at}")
import datarobot as dr

# List all datasets
datasets = dr.Dataset.list()
print(f"Found {len(datasets)} datasets")

# Search for specific dataset
for dataset in datasets:
    if "sales" in dataset.name.lower():
        print(f"Found: {dataset.name} (ID: {dataset.id})")

# Get specific dataset
dataset = dr.Dataset.get("abc123")
print(f"Dataset: {dataset.name}")
print(f"Size: {dataset.row_count} rows x {dataset.column_count} columns")
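The name search in the loop above can be pulled into a small reusable helper. This is a minimal sketch: `find_datasets` is a hypothetical name rather than an SDK function, and it only assumes each item has a `.name` attribute, as the objects in the listing above do. The `SimpleNamespace` objects stand in for real datasets so the snippet runs without a DataRobot connection.

```python
from types import SimpleNamespace


def find_datasets(datasets, keyword):
    """Return the items whose name contains keyword, ignoring case.

    Items with no name (None) are skipped rather than raising an error.
    """
    needle = keyword.lower()
    return [d for d in datasets if needle in (d.name or "").lower()]


# Stand-in objects for illustration only; in practice the list would come
# from a call such as dr.Dataset.list().
sample = [
    SimpleNamespace(name="Sales Data Q4 2024", id="id-1"),
    SimpleNamespace(name="Customer Churn", id="id-2"),
]
for match in find_datasets(sample, "sales"):
    print(f"Found: {match.name} (ID: {match.id})")
```

Returning a list instead of printing inside the loop lets the caller decide what to do when there are zero or several matches.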
Common checks to perform:
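Some generic structural checks can be run locally before a file is uploaded: that the file is not empty, that header names are unique and non-blank, that every row has as many fields as the header, and how many empty cells each column has. A minimal sketch using only the Python standard library, assuming a comma-separated UTF-8 file with a header row; `check_csv` and the keys of its report are illustrative and not part of the DataRobot SDK:

```python
import csv
from collections import Counter


def check_csv(path):
    """Run basic structural checks on a local CSV file and return a report."""
    with open(path, newline="", encoding="utf-8") as f:
        reader = csv.reader(f)
        header = next(reader, None)
        if header is None:
            return {"ok": False, "problems": ["file is empty"]}

        problems = []
        duplicates = [name for name, n in Counter(header).items() if n > 1]
        if duplicates:
            problems.append(f"duplicate column names: {duplicates}")
        if any(not name.strip() for name in header):
            problems.append("blank column name in header")

        row_count = 0
        ragged_rows = 0
        empty_cells = Counter()
        for row in reader:
            row_count += 1
            if len(row) != len(header):
                # Row width differs from the header; skip per-cell checks.
                ragged_rows += 1
                continue
            for name, value in zip(header, row):
                if value.strip() == "":
                    empty_cells[name] += 1

    if row_count == 0:
        problems.append("no data rows")
    if ragged_rows:
        problems.append(f"{ragged_rows} row(s) with wrong number of fields")

    return {
        "ok": not problems,
        "problems": problems,
        "rows": row_count,
        "columns": len(header),
        "empty_cells": dict(empty_cells),
    }
```

Empty cells are counted but not treated as failures, since how much missing data is acceptable depends on the project.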
Common errors and solutions:
pip install datarobot
import datarobot as dr
import os
client = dr.Client(
    token=os.getenv("DATAROBOT_API_TOKEN"),
    endpoint=os.getenv("DATAROBOT_ENDPOINT", "https://app.datarobot.com/api/v2")
)
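Because `os.getenv` returns `None` when a variable is unset, a missing token may only surface later as an authentication error. A minimal sketch of an up-front check, assuming the same two environment variable names used above; `load_datarobot_settings` is a hypothetical helper, and the default endpoint with the `/api/v2` suffix is an assumption based on the managed cloud URL:

```python
import os


def load_datarobot_settings(env=None):
    """Read the API token and endpoint from environment variables.

    Raises RuntimeError when the token is missing, so the problem is
    reported before any API call is attempted.
    """
    env = os.environ if env is None else env
    token = env.get("DATAROBOT_API_TOKEN")
    if not token:
        raise RuntimeError("DATAROBOT_API_TOKEN is not set")
    # Assumed default; self-managed installations use their own URL.
    endpoint = env.get("DATAROBOT_ENDPOINT", "https://app.datarobot.com/api/v2")
    return token, endpoint


# Hypothetical usage:
# token, endpoint = load_datarobot_settings()
# client = dr.Client(token=token, endpoint=endpoint)
```

Accepting a mapping as `env` keeps the function easy to exercise without touching the real environment.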
npx claudepluginhub datarobot-oss/datarobot-agent-skills --plugin datarobot-agent-skills