By infiniV
ultra-instinct ML engineering intern for Claude Code. Reads papers, audits datasets, ships SFT/DPO/LoRA runs to Hugging Face. Built on the procedural knowledge from huggingface/ml-intern, wired into Claude Code's native agentic harness.
Audit an HF dataset — schema, sample rows, anomalies, recommended training method.
Kick off the full ml-intern workflow on an ML task — research → audit dataset → architect training job → submit. Loads the ml-intern skill and dispatches the right subagents.
Pre-flight a training script before submitting it to HF Jobs — checks for the 8 expensive mistakes.
Deep literature crawl. 6–10 query angles, 2-hop citation graph BFS, 30–50 full-paper reads in parallel subagents, cross-paper synthesis with gap analysis.
Run a literature review for an ML task — finds landmark paper, crawls citation graph, extracts recipe.
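The "2-hop citation graph BFS" mentioned above can be pictured as a depth-limited breadth-first search. The sketch below is illustrative only: the `crawl_citations` function, the dictionary-based graph, and the paper ids are hypothetical stand-ins, not the plugin's actual implementation, which crawls real citation data.

```python
from collections import deque


def crawl_citations(graph, seed, max_hops=2):
    """Breadth-first crawl from a seed paper, stopping at max_hops.

    `graph` maps a paper id to the ids of papers that cite it
    (a hypothetical in-memory stand-in for a citation API).
    Returns a dict of paper id -> hop distance from the seed.
    """
    hops = {seed: 0}
    queue = deque([seed])
    while queue:
        paper = queue.popleft()
        if hops[paper] == max_hops:
            continue  # at the hop limit: record the paper but do not expand it
        for citing in graph.get(paper, []):
            if citing not in hops:  # visit each paper once
                hops[citing] = hops[paper] + 1
                queue.append(citing)
    return hops


# Hypothetical toy graph, for illustration only.
toy_graph = {
    "anchor": ["a", "b"],
    "a": ["c"],
    "b": ["c", "d"],
    "c": ["e"],
}
reachable = crawl_citations(toy_graph, "anchor")
```

With this toy graph, `e` sits three hops from the anchor and is therefore left out, which is the point of the hop limit: it keeps the reading list bounded before papers are handed to parallel readers.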
Dataset quality auditor for HF datasets. Use before committing to a dataset for fine-tuning. Returns schema, row counts, sample rows, distributions, anomalies (class imbalance, duplicates, missing values, format issues), and a recommended training method based on column shape. Isolates 10k+ tokens of dataset metadata + sample rows from the main thread.
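As a rough illustration of the kind of checks listed above (missing values, duplicates, class distribution, and a training method suggested by column shape), here is a minimal sketch over in-memory rows. The function names and the column-to-method heuristic are assumptions made for this example; the real auditor works against Hub datasets and may apply different rules.

```python
from collections import Counter


def audit_rows(rows, label_column=None):
    """Summarize a list of dict rows: schema, missing values, exact duplicates."""
    columns = sorted({key for row in rows for key in row})
    missing = {
        col: sum(1 for row in rows if row.get(col) in (None, ""))
        for col in columns
    }
    # Count exact duplicate rows by hashing a canonical form of each row.
    seen = Counter(
        tuple(sorted((key, repr(value)) for key, value in row.items()))
        for row in rows
    )
    report = {
        "num_rows": len(rows),
        "columns": columns,
        "missing": missing,
        "duplicates": sum(count - 1 for count in seen.values()),
    }
    if label_column is not None:
        # Label counts make class imbalance visible at a glance.
        report["label_counts"] = dict(Counter(row.get(label_column) for row in rows))
    return report


def suggest_method(columns):
    """Hypothetical heuristic: map column shape to a training method."""
    cols = set(columns)
    if {"chosen", "rejected"} <= cols:
        return "DPO"  # preference pairs
    if "messages" in cols or "text" in cols or {"prompt", "completion"} <= cols:
        return "SFT"  # plain supervised examples
    return "unknown"


# Hypothetical sample rows, for illustration only.
rows = [
    {"prompt": "2+2?", "chosen": "4", "rejected": "5"},
    {"prompt": "2+2?", "chosen": "4", "rejected": "5"},
    {"prompt": "3+3?", "chosen": "6", "rejected": None},
]
report = audit_rows(rows)
method = suggest_method(report["columns"])
```

Here the report flags one exact duplicate and one missing `rejected` value, and the column shape points to DPO.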
Single-paper deep reader. Reads ONE paper end-to-end (abstract → intro → method → experiments → results → limitations → future work) and returns a structured ~800-word digest where every factual claim is backed by a verbatim quote with §section reference. Designed for parallel fan-out from `/ml-research-ultra` — each invocation isolates 50k+ tokens of paper HTML from the main thread. Use when the orchestrator needs the full content of a paper, not just the recipe.
ML literature crawler. Use when the main task needs a methodology-grounded recipe drawn from multiple papers — e.g., "find the best recipe for math reasoning fine-tuning", "what dataset and method does the GRPO follow-up work use", "literature review for sparse-attention long-context training". Returns a structured ≤800-word report with anchor papers, extracted recipes, citation-graph descendants, and working code-example URLs. Isolates 50k+ tokens of paper text from the main thread.
Designs and reviews ML training submissions for both local execution and HF Jobs. Use after the recipe is chosen and the dataset is audited — produces a complete training script + the exact run command, sized to hardware, with all required fields (push_to_hub, hub_model_id, disable_tqdm, Trackio, timeout, package installs). Detects compute mode automatically and asks the user when both local and Jobs are viable. Catches the "model lost" / "30m timeout" / "missing flash-attn" mistakes before they cost real money.
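The required fields and mistakes named above lend themselves to a simple pre-flight check. The following is a sketch under stated assumptions: the `preflight` function, the shape of the `job` dictionary, and its key names (`gpu`, `packages`, `attn_implementation`, `report_to`) are hypothetical and chosen for illustration; they are not the plugin's actual interface.

```python
def preflight(job):
    """Return a list of human-readable problems found in a job description.

    `job` is a hypothetical dict describing a training submission.
    """
    problems = []
    if not job.get("push_to_hub"):
        problems.append("push_to_hub is off: the model may be lost when the job ends")
    elif not job.get("hub_model_id"):
        problems.append("hub_model_id is not set")
    if not job.get("disable_tqdm"):
        problems.append("disable_tqdm is off: progress bars will flood the job logs")
    if job.get("report_to") != "trackio":
        problems.append("Trackio reporting is not configured")
    if job.get("timeout") in (None, "30m"):
        problems.append("timeout is unset or left at 30m")
    if job.get("bf16") and str(job.get("gpu", "")).lower().startswith("t4"):
        problems.append("bf16 requested on a T4")
    if (
        job.get("attn_implementation") == "flash_attention_2"
        and "flash-attn" not in job.get("packages", [])
    ):
        problems.append("flash attention requested but flash-attn is not installed")
    return problems


# Hypothetical job description, for illustration only.
job = {
    "push_to_hub": True,
    "hub_model_id": "user/model",
    "disable_tqdm": True,
    "report_to": "trackio",
    "timeout": "30m",
    "gpu": "t4-small",
    "bf16": True,
    "attn_implementation": "flash_attention_2",
    "packages": ["trl"],
}
problems = preflight(job)
```

For this example job, three problems come back: the timeout, bf16 on a T4, and the missing flash-attn install. Returning a list rather than raising on the first failure lets every issue be reported in one pass, before any compute is spent.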
Use when the user asks to fine-tune, train, evaluate, audit, or ship a machine-learning model on the Hugging Face ecosystem — SFT, DPO, GRPO, RLHF, LoRA/QLoRA, post-training, dataset auditing, paper-driven research, hf jobs submission, Trackio monitoring, push-to-Hub. Triggers include "fine-tune", "train a model", "SFT", "DPO", "GRPO", "RLHF", "post-training", "audit this dataset", "literature review for X task", "submit hf job", "find a dataset for X", "best recipe for X", "hyperparameter sweep", "OOM during training", "push to Hub". Replicates the workflow of huggingface/ml-intern inside Claude Code with zero new dependencies.
Use when the user names a specific ML model (e.g. DINOv3, SAM 2, Whisper, Qwen2-VL) and wants its real/official code, training recipe, or papers found, verified, or archived locally for grounded coding. Triggers include "find the real code for this model", "harvest DINOv3", "store the model's code and papers locally", "archive the training/inference code", "set up a local source-of-truth / reference archive for a model", or any request that future coding against a model be grounded in its actual source instead of training-time memory.
External network access
Connects to servers outside your machine
Uses power tools
Uses Bash, Write, or Edit tools

ultra-instinct ML engineering intern for Claude Code. Reads papers, audits datasets, ships SFT/DPO/LoRA runs to Hugging Face.
ultra-ml-intern is a Claude Code plugin that gives Claude the workflow of an ML engineering intern. It researches ML papers, audits Hugging Face datasets, designs fine-tuning recipes (SFT, DPO, GRPO, LoRA, QLoRA, RLHF), and submits training jobs to HF Jobs with Trackio monitoring.
The procedural knowledge comes from huggingface/ml-intern, HF's standalone Python harness around the Claude API. This repo wires the same intelligence into Claude Code, Anthropic's official agentic harness for Claude. Same model, a more capable loop, and you bring your own Claude (Max subscription or API key) instead of paying for a second harness on top.
Works in any Claude Code surface: terminal CLI, IDE extensions, and the web app.
# In any Claude Code session:
/plugin marketplace add infiniV/ultra-ml-intern
/plugin install ml-intern@ultra-ml-intern
Restart Claude Code, then verify with /plugin and /agents. The slash commands (/ml-intern, /ml-research, …) keep their short names; the ultra- prefix is just the package wrapper.
What you get:
Skills: ml-intern (the workflow) and model-provenance (archive a model's real code + papers locally)
Commands: /ml-intern, /ml-research, /ml-research-ultra, /ml-audit, /ml-preflight, /ml-train
Agents: ml-paper-researcher, ml-paper-reader, dataset-auditor, training-job-architect
HF_TOKEN is set)
> "fine-tune Qwen3-0.5B for math reasoning"
The skill activates automatically and walks the 6-step research-driven workflow:
hf jobs run with Trackio monitoring

| You ask | It does |
|---|---|
| "fine-tune X for Y" | Full pipeline: literature review → dataset audit → training-job design → smoke test → full run |
| "what's the best recipe for X" | Dispatches the ml-paper-researcher subagent; returns recipe + citations |
| "do a deep literature review on X" | Runs /ml-research-ultra: 6–10 query angles, 2-hop citation BFS, 30–50 papers read in parallel ml-paper-reader subagents, gap-finding synthesis, optional local PDF/HTML archive |
| "audit dataset Y" | Dispatches the dataset-auditor; returns schema, anomalies, GO/NO-GO verdict |
| "preflight train.py" | Catches missing push_to_hub, default 30m timeout, bf16 on T4, missing flash-attn install, before you spend cluster hours |
| "submit hf jobs run" | Walks pre-flight → cost estimate → smoke test → full submission → Trackio dashboard URL |

| Skill | What it does |
|---|---|
| ml-intern | The end-to-end ML workflow: find landmark papers, crawl the citation graph, extract the recipe, audit the dataset and base model on Hub, write a TRL-grounded training script, pre-flight, smoke-test, and ship a full hf jobs run with Trackio monitoring. Activates whenever you ask to fine-tune, train, evaluate, or audit a model. |
| model-provenance | Given a specific model (DINOv3, SAM 2, Whisper, Qwen2-VL…), finds and verifies the canonical repo over forks and lookalikes, clones it, extracts the real train/model/inference files, downloads the paper PDFs with metadata, writes a synthesis report, and archives everything to a global, project-independent ~/.claude/model-provenance/<slug>/ (reusing any existing archive instead of re-fetching). Registers a mandatory-read memory so future coding against that model is grounded in its actual source, not training-time recall. Cloned code is archived, never executed. |
npx claudepluginhub infiniv/ultra-ml-intern --plugin ml-intern