Lossless conversion of nbgrader-formatted Jupyter notebooks to otter-grader instructor format with deterministic evaluation loop, content preservation verification, and multi-layer testing
This skill should be used when the user asks to "convert an nbgrader notebook", "refactor to otter-grader", "migrate from nbgrader", "create an instructor notebook", or mentions nbgrader-to-otter conversion, otter assign, or grading format migration. Operates on one notebook at a time within its containing folder.
This skill should be used when the user asks to "test an otter-grader notebook", "validate an instructor notebook", "run otter assign", "check autograder tests", "generate a test report", or mentions QA-ing otter-grader notebooks, running the testing pipeline, or checking for leaked solutions. Triggers after refactoring from nbgrader to otter-grader.
Uses power tools
Uses Bash, Write, or Edit tools
Converts nbgrader instructor notebooks to otter-grader format without dropping content.
nbgrader marks solution and test cells via notebook metadata (cell.metadata.nbgrader.solution, .grade). otter-grader uses raw cell delimiters (# BEGIN QUESTION, # BEGIN SOLUTION, etc.) with inline YAML config. The conversion has three phases:
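To make the metadata side concrete, here is a minimal sketch of how a cell's role could be read from its nbgrader flags. The function name `nbgrader_role` and the role labels are hypothetical and are not taken from the plugin's scripts; the flag combinations follow nbgrader's usual convention (solution only, grade only, or both).

```python
def nbgrader_role(cell: dict) -> str:
    """Classify one notebook cell (as parsed JSON) by its nbgrader flags.

    Hypothetical helper for illustration; role labels are our own.
    """
    meta = cell.get("metadata", {}).get("nbgrader", {})
    solution = bool(meta.get("solution"))
    grade = bool(meta.get("grade"))
    if solution and grade:
        return "manual"    # answer cell graded by hand
    if solution:
        return "solution"  # autograded answer cell
    if grade:
        return "tests"     # autograder test cell
    return "other"         # markdown, setup code, anything ungraded


# Usage: a cell dict as it appears in the .ipynb JSON
cell = {"cell_type": "code",
        "metadata": {"nbgrader": {"solution": True, "grade": False}},
        "source": "x = 1"}
role = nbgrader_role(cell)
```

Cells that come back as "other" are the ones the conversion must carry over untouched.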
1. Transform: wrap_transform.py wraps each nbgrader question group in otter delimiters, strips solution markers from solution cells, and splits test cells into visible and hidden portions. Markdown cells, plain code cells, and any other cell without a grading role are never deleted.
2. Evaluate: diff_notebooks.py compares the original and converted notebooks cell by cell. It tries a normalized exact match first, then a fuzzy match (SequenceMatcher ratio ≥ 0.85) for cells of at least 20 characters. It reports dropped cells and structural gaps (code cells stranded between question blocks).
3. Fix: fix_cells.py reinserts any dropped cells at their original relative positions and moves structurally misplaced code cells into the nearest question block.
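The two-stage matching in the Evaluate phase can be sketched with Python's standard `difflib`. This is an illustration under assumptions, not the code in diff_notebooks.py: the `normalize` rule (trim trailing whitespace, drop blank lines) and the function names are hypothetical, while the 0.85 threshold and 20-character minimum come from the description above.

```python
from difflib import SequenceMatcher


def normalize(source: str) -> str:
    # Assumed normalization: trim each line's trailing whitespace, drop blank lines.
    lines = [line.rstrip() for line in source.strip().splitlines()]
    return "\n".join(line for line in lines if line)


def find_dropped(original, converted, threshold=0.85, min_len=20):
    """Return indices of original cell sources with no counterpart in converted."""
    conv = [normalize(src) for src in converted]
    conv_set = set(conv)
    dropped = []
    for index, src in enumerate(original):
        norm = normalize(src)
        if norm in conv_set:
            continue  # stage 1: normalized exact match
        if len(norm) >= min_len and any(
            SequenceMatcher(None, norm, candidate).ratio() >= threshold
            for candidate in conv
        ):
            continue  # stage 2: fuzzy match, only for long enough cells
        dropped.append(index)
    return dropped


original = [
    "import numpy as np\n",
    "x = np.arange(10)  # build the input vector\n",
    "print('hello')",
]
converted = [
    "import numpy as np",
    "x = np.arange(10)  # build the input vector # SOLUTION",
]
dropped = find_dropped(original, converted)
```

Short cells skip the fuzzy stage because similarity ratios on a handful of characters are too noisy to trust, so a short cell counts as preserved only on an exact match.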
Steps 2–3 repeat (max 3 iterations, monotonicity check) until the diff passes.
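A bounded evaluate-and-fix loop of this kind might look like the sketch below. The callables `count_dropped` and `apply_fix` are hypothetical stand-ins for running diff_notebooks.py and fix_cells.py, and the reading of "monotonicity check" as "the number of reported problems must strictly decrease each round" is an assumption, not something the scripts are confirmed to do.

```python
def evaluate_fix_loop(count_dropped, apply_fix, max_iterations=3):
    """Alternate evaluate and fix until clean, stalled, or out of iterations.

    count_dropped: callable returning the current number of diff problems.
    apply_fix:     callable that attempts one round of repairs.
    Returns True when the diff ends up clean.
    """
    previous = count_dropped()
    for _ in range(max_iterations):
        if previous == 0:
            return True
        apply_fix()
        current = count_dropped()
        if current >= previous:
            # Assumed monotonicity rule: stop when a fix round does not improve.
            return False
        previous = current
    return previous == 0


# Usage with an in-memory stand-in for the notebook state
state = {"dropped": 3}
converged = evaluate_fix_loop(
    lambda: state["dropped"],
    lambda: state.update(dropped=max(0, state["dropped"] - 2)),
)
```

Stopping on a non-improving round keeps a fix step that oscillates or makes things worse from burning all remaining iterations.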
SCRIPTS=skills/refactoring-nbgrader-to-otter/scripts
# 1. Make a working copy
cp homework-N-nbgrader.ipynb homework-N.ipynb
# 2. Transform
python3 $SCRIPTS/wrap_transform.py homework-N.ipynb
# 3. Validate structure (skip metadata check during iteration)
python3 $SCRIPTS/validate_structure.py homework-N.ipynb --skip-cleanup
# 4. Content diff vs original
python3 $SCRIPTS/diff_notebooks.py homework-N-nbgrader.ipynb homework-N.ipynb
# 5. If diff fails, fix and re-diff (repeat until pass or max 3 iterations)
python3 $SCRIPTS/fix_cells.py homework-N-nbgrader.ipynb homework-N.ipynb report.json
python3 $SCRIPTS/diff_notebooks.py homework-N-nbgrader.ipynb homework-N.ipynb
# 6. Final cleanup (strip remaining nbgrader metadata)
python3 $SCRIPTS/cleanup_metadata.py homework-N.ipynb
# 7. Full validation (includes metadata check)
python3 $SCRIPTS/validate_structure.py homework-N.ipynb
# 8. Build with otter
otter assign homework-N.ipynb dist --no-run-tests
The first cell of the source notebook must be a raw cell with # ASSIGNMENT CONFIG and assignment-level YAML before step 8 — the transform doesn't synthesize it. Anything else otter assign needs (duplicate qN names from sub-questions, missing markers, leftover nbgrader metadata) is flagged by validate_structure in steps 3 and 7; fix what it reports and re-run.
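Since the transform does not create the config cell, a quick pre-check before step 8 can save a failed build. The sketch below works on the notebook's parsed JSON; `has_assignment_config` is a hypothetical helper, not one of the plugin's scripts, and it only checks the cell type and the marker line, not the YAML that follows.

```python
def has_assignment_config(notebook: dict) -> bool:
    """True when the first cell is a raw cell that opens with # ASSIGNMENT CONFIG."""
    cells = notebook.get("cells", [])
    if not cells or cells[0].get("cell_type") != "raw":
        return False
    source = cells[0].get("source", "")
    if isinstance(source, list):  # .ipynb files store source as a list of lines
        source = "".join(source)
    return source.lstrip().startswith("# ASSIGNMENT CONFIG")


# Usage: in practice the dict would come from json.load() on the .ipynb file
notebook = {
    "cells": [
        {"cell_type": "raw", "source": ["# ASSIGNMENT CONFIG\n", "name: homework-N\n"]},
        {"cell_type": "markdown", "source": ["# Homework N\n"]},
    ]
}
ok = has_assignment_config(notebook)
```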
Structural conversion is half the job. Once otter assign builds the dist/ artifacts, a separate pipeline checks that the autograder actually runs, that no solutions leaked into the student notebook, and that the student-facing prose still makes sense after solution stripping.
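As a rough illustration of a leak check, the sketch below scans the code cells of a student notebook for solution markers that should have been stripped. It is an assumption-laden stand-in rather than the plugin's own check: `find_leaks` and the pattern are hypothetical, and a marker scan cannot catch solution code that lost its markers.

```python
import re

# Matches "# BEGIN SOLUTION", "# END SOLUTION", a trailing "# SOLUTION",
# and nbgrader-style "### BEGIN SOLUTION" variants.
LEAK_PATTERN = re.compile(r"#\s*(BEGIN SOLUTION|END SOLUTION|SOLUTION\b)")


def find_leaks(notebook: dict):
    """Return (cell_index, line) pairs for code lines that still carry solution markers."""
    hits = []
    for index, cell in enumerate(notebook.get("cells", [])):
        if cell.get("cell_type") != "code":
            continue
        source = cell.get("source", "")
        if isinstance(source, list):
            source = "".join(source)
        for line in source.splitlines():
            if LEAK_PATTERN.search(line):
                hits.append((index, line.strip()))
    return hits


# Usage on a parsed student notebook
student = {
    "cells": [
        {"cell_type": "markdown", "source": "## Question 1"},
        {"cell_type": "code", "source": "total = ...\n"},
        {"cell_type": "code", "source": "mean = sum(xs) / len(xs)  # SOLUTION\n"},
    ]
}
leaks = find_leaks(student)
```

An empty result is necessary but not sufficient; the autograder run and the coherence check cover what a marker scan misses.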
SCRIPTS=skills/testing-otter-grader/scripts
# 1. Pre-flight: solution cells must have outputs
python3 $SCRIPTS/check_outputs.py homework-N.ipynb
# 2. Build (runs `otter assign` with a 300s timeout, captures logs)
python3 $SCRIPTS/run_otter_assign.py homework-N.ipynb dist/ > assign.log
# 3. Verify dist/ structure (autograder zip, student notebook, companion files)
python3 $SCRIPTS/validate_generated_output.py dist/ --config homework-N.ipynb > structure.log
# 4. Coherence check (LLM-as-judge) — see below
python3 $SCRIPTS/eval_student_coherence.py dist/student/homework-N.ipynb
# read the printed prompt, evaluate, write findings to coherence.json
# 5. Run autograder against instructor solutions — expect 100%
python3 $SCRIPTS/run_autograder_tests.py \
  dist/autograder/homework-N.ipynb \
  dist/autograder/homework-N-autograder_*.zip > autograder.log
# 6. Aggregate everything into a single report
python3 $SCRIPTS/generate_report.py \
  --notebook homework-N.ipynb \
  --assign-log assign.log \
  --structure-log structure.log \
  --student-notebook dist/student/homework-N.ipynb \
  --instructor-notebook homework-N.ipynb \
  --autograder-log autograder.log \
  --coherence coherence.json \
  --output report.json
If report.json shows pipeline_status: "pass", the assignment is ready to upload to Gradescope. On failure, summary.fix_actions lists the prioritized changes by cell index.
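Reading the report programmatically could look like the sketch below. Only the `pipeline_status` and `summary.fix_actions` keys come from the description above; the helper name `summarize_report` is hypothetical, and the entries of `fix_actions` are treated as opaque because their exact shape is not documented here.

```python
import json


def summarize_report(report: dict):
    """Return (ready, actions): ready is True only when pipeline_status is "pass"."""
    ready = report.get("pipeline_status") == "pass"
    actions = [] if ready else report.get("summary", {}).get("fix_actions", [])
    return ready, actions


# Usage: in practice, report = json.load(open("report.json")).
# The fix_actions entry below is a placeholder, not real pipeline output.
report = json.loads(
    '{"pipeline_status": "fail", "summary": {"fix_actions": ["placeholder action"]}}'
)
ready, actions = summarize_report(report)
```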
otter assign strips solution cells and replaces them with # YOUR CODE HERE placeholders. Anything those cells introduced — variable names, intermediate results, narrative setup — disappears with them. Later cells that reference the missing context become incoherent for the student even though the autograder still passes.
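One cheap way to surface this failure mode before asking a judge is to list names that a solution cell assigns and that later cells still read. The sketch below uses the standard `ast` module. It is a static complement to the coherence check, not what eval_student_coherence.py does, and the function names are hypothetical. It ignores function and class definitions, names that a later cell reassigns before use, and cells with IPython magics (which `ast.parse` rejects).

```python
import ast


def _names(code: str, ctx_type) -> set:
    # Collect identifiers used in a given context (Store = assigned, Load = read).
    return {
        node.id
        for node in ast.walk(ast.parse(code))
        if isinstance(node, ast.Name) and isinstance(node.ctx, ctx_type)
    }


def dangling_references(solution_code: str, later_cells: list) -> dict:
    """Map later-cell index to the solution-defined names that cell reads."""
    defined = _names(solution_code, ast.Store)
    report = {}
    for index, code in enumerate(later_cells):
        missing = sorted(defined & _names(code, ast.Load))
        if missing:
            report[index] = missing
    return report


# Usage: the solution cell is stripped from the student notebook,
# so anything it defined is gone for the cells that follow.
solution = "clean = df.dropna()\nmean_age = clean['age'].mean()"
later = ["print('summary')", "plot(clean)", "print(mean_age, clean.shape)"]
dangling = dangling_references(solution, later)
```

A non-empty result does not always mean the student notebook is broken, since the student may be expected to define those names, but it points at the cells worth a closer look.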
npx claudepluginhub nyu-tandon-tmi/nbgrader-to-otter-grader --plugin nbgrader-to-otter