From qe-framework
Designs production-grade ML pipelines with experiment tracking (MLflow/W&B), orchestration DAGs (Kubeflow/Airflow), feature stores (Feast), model registries, and automated retraining.
Slash command
/qe-framework:Qml-pipeline
Senior ML pipeline engineer specializing in production-grade machine learning infrastructure, orchestration systems, and automated training workflows.
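# Pattern 1: Feature store integration
def build_feature_store(feast_repo_path: str, feature_list: list, entity_df):
    """Initialize Feast feature store and load features for training."""
    from feast import FeatureStore
    fs = FeatureStore(repo_path=feast_repo_path)
    # entity_df supplies the entity keys and event timestamps for the point-in-time join
    features = fs.get_historical_features(entity_df=entity_df, features=feature_list)
    # Returns a retrieval job; call .to_df() on it to materialize a DataFrame
    return features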
# Pattern 2: MLflow experiment logging
def log_training_run(params: dict, metrics: dict, artifacts: list, run_name: str):
    """Log complete training run: params, metrics, model, plots."""
    import mlflow
    # Bind the run so its ID is still available after the context manager ends it
    with mlflow.start_run(run_name=run_name) as run:
        mlflow.log_params(params)
        mlflow.log_metrics(metrics)
        for artifact_path in artifacts:
            mlflow.log_artifact(artifact_path)
    return run.info.run_id
# Pattern 3: Data validation checkpoint
def validate_pipeline_input(df, expected_schema: dict, min_rows: int = 100):
    """Validate data quality before pipeline execution."""
    # Raise instead of assert so the checks still run under `python -O`
    if df.shape[0] < min_rows:
        raise ValueError(f"Insufficient rows: {df.shape[0]} < {min_rows}")
    for col, dtype in expected_schema.items():
        if col not in df.columns:
            raise ValueError(f"Missing column: {col}")
        if str(df[col].dtype) != dtype:
            raise ValueError(f"Type mismatch {col}: {df[col].dtype} != {dtype}")
    return df
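
The fail-fast idea behind the validation checkpoint above can also be applied to model metrics before a model reaches the registry. The following is a minimal sketch, not part of the skill itself: the function name `check_validation_gates` and the `minimums` threshold format are hypothetical, chosen only for illustration.

```python
from typing import Dict, List


def check_validation_gates(metrics: Dict[str, float], minimums: Dict[str, float]) -> None:
    """Raise ValueError if any required metric is missing or below its minimum.

    Hypothetical helper: the name and the threshold format are assumptions.
    """
    failures: List[str] = []
    for name, minimum in minimums.items():
        value = metrics.get(name)
        if value is None:
            # A metric that was never logged counts as a failed gate
            failures.append(f"{name}: missing")
        elif value < minimum:
            failures.append(f"{name}: {value:.3f} < {minimum:.3f}")
    if failures:
        # Report every failing gate at once rather than stopping at the first
        raise ValueError("Validation gates failed: " + "; ".join(failures))
```

Calling `check_validation_gates({"accuracy": 0.92}, {"accuracy": 0.90})` returns silently, while a metric below its threshold raises `ValueError`, which a pipeline could use to stop before registration.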
def orchestrate_training_pipeline(config_path: str, experiment_name: str):
    """One-line orchestration strategy summary.

    Longer: feature engineering, parallelization, validation gates, registry.

    Args:
        config_path: Path to YAML pipeline configuration
        experiment_name: MLflow experiment identifier

    Returns:
        Registered model URI from registry

    Raises:
        FileNotFoundError: If config not found
        ValueError: If validation gates fail
    """
[tool.ruff]
line-length = 100
select = ["E", "F", "W", "UP"]
[tool.mypy]
python_version = "3.9"
disallow_untyped_defs = true
ignore_missing_imports = true
| Anti-pattern | Fix |
|---|---|
| No experiment tracking; manual CSV logs | Use MLflow, W&B, or Neptune for all runs; log params + metrics |
| Skipped validation; train on all data | Run schema checks, train/val split, log held-out test metrics |
| No versioning; "latest" model only | Use DVC for data, Git tags for code, model registry for artifacts |
| Different training & serving code paths | Single feature transform code; validate equivalence in tests |
| Single hyperparameter run; no tuning | Use Ray Tune, Optuna, or grid search; track all runs |
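
The train/serving row in the table can be illustrated with one transform function that both code paths call. This is a minimal sketch using only the standard library; the feature names (`purchase_amount`, `items_in_cart`) and all function names are hypothetical.

```python
import math
from typing import Dict, List


def transform_features(raw: Dict[str, float]) -> List[float]:
    """Single feature transform shared by training and serving (hypothetical fields)."""
    return [
        math.log1p(raw["purchase_amount"]),  # compress heavy-tailed amounts
        raw["items_in_cart"] / 10.0,         # fixed scaling, identical in both paths
    ]


def build_training_matrix(rows: List[Dict[str, float]]) -> List[List[float]]:
    """Batch (training) path: applies the shared transform to every row."""
    return [transform_features(row) for row in rows]


def build_serving_vector(request: Dict[str, float]) -> List[float]:
    """Online (serving) path: applies the same shared transform to one request."""
    return transform_features(request)
```

An equivalence test can then assert that `build_training_matrix([row])[0] == build_serving_vector(row)` for sample rows, so any divergence between the two paths fails in CI rather than in production.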
import mlflow
import mlflow.sklearn
from sklearn.metrics import accuracy_score

# Assumes `model`, X_train, y_train, X_test, and y_test are already defined
mlflow.set_experiment("my-experiment")
with mlflow.start_run():
    mlflow.log_params({"n_estimators": 100, "max_depth": 5})
    model.fit(X_train, y_train)
    preds = model.predict(X_test)
    mlflow.log_metric("accuracy", accuracy_score(y_test, preds))
    mlflow.sklearn.log_model(model, "model", registered_model_name="my-model")
MUST: Version all data/code/models (DVC, Git, registry), pin seeds, validate data, log all params, track experiments, sign artifacts
MUST NOT: Train without tracking, skip validation, hardcode credentials, ignore train-serving skew, deploy without evaluation gates
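
As one illustration of the "pin seeds" rule, a train/test split can take its seed as an explicit argument so that the same seed always yields the same partition. This is a sketch under stated assumptions: `seeded_split` is a hypothetical helper, and it covers only Python's standard `random` module; libraries such as NumPy or a deep learning framework keep their own random state and would need to be seeded separately.

```python
import random
from typing import List, Sequence, Tuple


def seeded_split(
    items: Sequence[int], test_fraction: float, seed: int
) -> Tuple[List[int], List[int]]:
    """Return (train, test) lists; identical inputs and seed give identical splits.

    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)  # local generator, leaves global random state untouched
    shuffled = list(items)
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]
```

Logging the seed alongside the other run parameters would let a later run reproduce the same split.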
npx claudepluginhub inho-team/qe-framework --plugin qe-framework