From datapowers
Use before training any model more complex than a constant predictor - establishes a documented baseline that any future model must beat to be considered
Slash command: /datapowers:baseline-first-modeling
A baseline is the floor every other model must beat. Without one, "85% accuracy" is a number, not a result.
Core principle: If a trivial baseline already hits the deploy threshold, you don't need a model — you need a rule.
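The core principle can be read as a simple gate that runs before any modeling work. The sketch below is a minimal illustration, not part of the skill itself: the function name `needs_model`, the `higher_is_better` flag, and the example numbers are all hypothetical.

```python
def needs_model(trivial_score: float, deploy_threshold: float,
                higher_is_better: bool = True) -> bool:
    """Return True only if the trivial baseline misses the deploy threshold.

    Hypothetical helper: set higher_is_better=False for error metrics
    such as MAE, where a lower score is better.
    """
    if higher_is_better:
        return trivial_score < deploy_threshold
    return trivial_score > deploy_threshold


# Illustrative numbers only, not measured results.
print(needs_model(0.91, 0.90))  # False: the trivial baseline already clears 0.90
print(needs_model(0.62, 0.90))  # True: a model (or a better rule) is justified
```

When the gate returns False, the trivial baseline itself is the deployable rule.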
NO COMPLEX MODEL (XGBoost, deep net, fine-tune) WITHOUT A DOCUMENTED BASELINE FIRST
The baseline lives in code, in the experiment spec §5, and as a tracked MLflow run.
| Problem | Trivial baseline | Strong baseline |
|---|---|---|
| Binary classification | Majority class | Logistic regression on numeric features |
| Multi-class | Class prior | Logistic / linear SVM |
| Regression | Mean / median | Ridge regression on numeric features |
| Time-series forecast | Last value, seasonal naive | Exponential smoothing / ARIMA |
| Ranking / recsys | Popularity, recency | BPR / ALS / lightFM |
| NLP classification | Class prior | TF-IDF + logistic regression |
| NLP generation | Retrieve nearest training example | Small fine-tune of base model |
| Vision classification | Class prior | Linear probe on a pretrained backbone |
| Anomaly detection | Rate threshold on a single feature | IsolationForest |
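Most trivial baselines in the table take only a few lines. As one example, the time-series row ("last value, seasonal naive") might be sketched as below. The function names and the toy series are assumptions made for illustration, and only the standard library is used.

```python
from typing import List, Sequence


def last_value_forecast(history: Sequence[float], horizon: int) -> List[float]:
    """Repeat the most recent observation for every future step."""
    return [history[-1]] * horizon


def seasonal_naive_forecast(history: Sequence[float], horizon: int,
                            season: int) -> List[float]:
    """Forecast each step with the value observed one full season earlier."""
    if len(history) < season:
        raise ValueError("need at least one full season of history")
    last_season = history[-season:]
    return [last_season[h % season] for h in range(horizon)]


def mean_absolute_error(actual: Sequence[float],
                        predicted: Sequence[float]) -> float:
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Made-up series with a period of 4, for illustration only.
history = [10, 20, 30, 40, 11, 21, 31, 41]
actual = [12, 22, 32, 42]

mae_last = mean_absolute_error(actual, last_value_forecast(history, 4))
mae_seasonal = mean_absolute_error(actual, seasonal_naive_forecast(history, 4, season=4))
print(mae_last, mae_seasonal)  # 14.5 1.0
```

On this toy series the seasonal naive forecast is the floor a forecasting model would have to beat, not the last-value forecast.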
```python
from sklearn.dummy import DummyClassifier
from sklearn.linear_model import LogisticRegression
import mlflow

# Assumed to be defined elsewhere: DATA_HASH, X_train, y_train, X_val, y_val,
# and the project helper log_metrics_with_ci(model, X, y).

# Trivial baseline: always predict the majority class.
with mlflow.start_run(run_name="baseline_majority"):
    mlflow.log_param("data_hash", DATA_HASH)
    mlflow.log_param("kind", "majority")
    m = DummyClassifier(strategy="most_frequent").fit(X_train, y_train)
    log_metrics_with_ci(m, X_val, y_val)

# Strong baseline: logistic regression on the same data and split.
with mlflow.start_run(run_name="baseline_logreg"):
    mlflow.log_param("data_hash", DATA_HASH)
    mlflow.log_param("kind", "logreg")
    m = LogisticRegression(max_iter=1000).fit(X_train, y_train)
    log_metrics_with_ci(m, X_val, y_val)
```
Log: primary metric ± CI, segment metrics, calibration, prediction latency.
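The source does not show how `log_metrics_with_ci` computes the "± CI" part. One common option is a percentile bootstrap over the validation set, sketched below with the standard library only. `bootstrap_ci`, `accuracy`, and the toy labels are hypothetical and are not the skill's actual helper.

```python
import random
from typing import Callable, Sequence, Tuple


def accuracy(y_true: Sequence, y_pred: Sequence) -> float:
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def bootstrap_ci(y_true: Sequence, y_pred: Sequence,
                 metric: Callable[[Sequence, Sequence], float] = accuracy,
                 n_resamples: int = 1000, alpha: float = 0.05,
                 seed: int = 0) -> Tuple[float, float, float]:
    """Return (point estimate, lower bound, upper bound) via percentile bootstrap."""
    rng = random.Random(seed)  # fixed seed so the interval is reproducible
    n = len(y_true)
    scores = []
    for _ in range(n_resamples):
        # Resample validation rows with replacement and re-score.
        idx = [rng.randrange(n) for _ in range(n)]
        scores.append(metric([y_true[i] for i in idx], [y_pred[i] for i in idx]))
    scores.sort()
    lower = scores[int((alpha / 2) * n_resamples)]
    upper = scores[min(n_resamples - 1, int((1 - alpha / 2) * n_resamples))]
    return metric(y_true, y_pred), lower, upper


# Toy labels for illustration only: 30 of 40 predictions are correct.
y_true = [1, 0, 1, 1, 0, 1, 0, 1] * 5
y_pred = [1, 0, 1, 0, 0, 1, 1, 1] * 5
point, lower, upper = bootstrap_ci(y_true, y_pred)
print(f"accuracy={point:.3f}  95% CI=[{lower:.3f}, {upper:.3f}]")
```

The three returned values could then be logged as separate metrics on the tracked run, so that a later model is compared against an interval rather than a single number.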
| Anti-pattern | Cost |
|---|---|
| Skip to XGBoost "to save time" | No reference; cannot tell if features help |
| Baseline on different data than the model | Comparison is meaningless |
| Baseline once, never re-run | Data drifts; old baseline lies |
| Reporting "model A vs. model B" without baseline | Neither may beat trivial |
| Baseline only on global metric | Segment performance unknown |
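The last anti-pattern is easy to miss because a global metric can improve while one segment gets worse. A minimal sketch of a per-segment comparison follows. The helper names, segment labels, and predictions are made up for illustration.

```python
from collections import defaultdict
from typing import Dict, Hashable, List, Sequence


def accuracy_by_segment(y_true: Sequence, y_pred: Sequence,
                        segments: Sequence[Hashable]) -> Dict[Hashable, float]:
    """Accuracy computed separately for each segment label."""
    hits: Dict[Hashable, int] = defaultdict(int)
    counts: Dict[Hashable, int] = defaultdict(int)
    for t, p, s in zip(y_true, y_pred, segments):
        counts[s] += 1
        hits[s] += int(t == p)
    return {s: hits[s] / counts[s] for s in counts}


def segments_where_model_loses(y_true: Sequence, baseline_pred: Sequence,
                               model_pred: Sequence,
                               segments: Sequence[Hashable]) -> List[Hashable]:
    """Segments in which the candidate model scores below the baseline."""
    base = accuracy_by_segment(y_true, baseline_pred, segments)
    model = accuracy_by_segment(y_true, model_pred, segments)
    return sorted(s for s in base if model[s] < base[s])


# Toy data for illustration only.
segments      = ["new", "new", "new", "old", "old", "old"]
y_true        = [1, 0, 1, 0, 0, 0]
baseline_pred = [0, 0, 0, 0, 0, 0]   # majority-class baseline: 4/6 correct overall
model_pred    = [1, 0, 1, 0, 1, 0]   # candidate model: 5/6 correct overall

print(segments_where_model_loses(y_true, baseline_pred, model_pred, segments))
# ['old']
```

Here the candidate wins globally yet loses to the majority-class baseline on the "old" segment, which a global-only comparison would hide.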
Related skills: datapowers:test-driven-modeling, datapowers:validation-strategy-design, datapowers:experiment-tracking, datapowers:model-evaluation-rigorously
Install: npx claudepluginhub creyesp/datapowers --plugin datapowers