Skill

concept-to-video

Creates animated explainer videos from concepts using Manim (Python) with MP4/GIF output, audio overlay, and multi-scene composition.

Python

design

Invocation

How this skill is triggered — by the user, by Claude, or both

Slash command

/armory:concept-to-video

User invocable

Model invocable

Inline context

Default effort

Context Preview

The summary Claude sees in its skill listing — used to decide when to auto-load this skill

Creates animated explainer videos from concepts using Manim (Python) as a programmatic animation engine.

Supporting Files

SKILL.md

342 lines · ~4.9k tokens

Stats

Language: Python

Stars: 252

Forks: 37

Maintenance: Excellent

Last Commit: Jun 17, 2026


Concept to Video

Creates animated explainer videos from concepts using Manim (Python) as a programmatic animation engine.

Reference Files

File	Purpose
`references/rules/pipeline-flow.md`	RAG, ETL, CI/CD — sequential stage animations with arrows
`references/rules/architecture-layers.md`	System stacks, network layers, abstraction hierarchies
`references/rules/algorithm-stepthrough.md`	Sorting, search, graph traversal — stateful step-by-step animations
`references/rules/comparison.md`	Side-by-side A vs B, before/after, trade-off visualizations
`references/rules/agent-interaction.md`	Multi-agent message passing, distributed systems, pub/sub
`references/rules/math-concept.md`	Equations, formulas, geometric proofs — LaTeX-free by default
`references/rules/training-loop.md`	Gradient descent, RL loops, cyclic iterative processes
`references/rules/transitions.md`	Fade and wipe transitions between scene sections
`references/rules/text-animation.md`	Text replacement, progressive bullet reveal, callouts, emphasis
`references/rules/layout.md`	Canvas coordinates, VGroup arrangement, spacing guidelines
`references/rules/audio-overlay.md`	ffmpeg audio overlay — background music, voiceover, multi-track mixing
`references/rules/voiceover-scaffold.md`	Timing script generation, TTS handoff, narration best practices
`references/rules/images.md`	ImageMobject usage, logo/screenshot patterns, scaling and positioning
`references/rules/subtitles.md`	SRT generation from scene timing, ffmpeg subtitle burning
`references/rules/multi-scene.md`	Multiple Scene classes, ffmpeg concat, chapter-based composition
`references/templates/data_flow_template.py`	Parametric pipeline/data flow animation (config-driven STAGES list)
`references/templates/comparison_template.py`	Parametric side-by-side comparison (config-driven LEFT/RIGHT items)
`references/templates/timeline_template.py`	Parametric timeline animation (config-driven EVENTS list)
`scripts/render_video.py`	Wrapper around Manim CLI — handles quality, format, output path cleanup
`scripts/add_audio.py`	ffmpeg wrapper — audio overlay, volume, fade-in/out, trim-to-video

Why Manim as the engine

Manim is the "SVG of video": you write Python code that describes animations declaratively, and Manim renders it to MP4 or GIF at any resolution. The Python scene file is the editable intermediate. The user can read the code, request changes ("make the arrows red", "add a third step", "slow down the transition"), and run a final high-quality render only once satisfied. This keeps the workflow iterative and controllable, in the same way that concept-to-image uses HTML as its intermediate.

Workflow

Concept → Manim scene (.py) → Preview (low-quality) → Iterate → Final render (MP4/GIF)

1. Interpret the user's concept: determine the best animation approach.
2. Design a self-contained Manim scene file: one file, one Scene class.
3. Preview by rendering at low quality (-ql) for fast iteration.
4. Iterate on the scene based on user feedback.
5. Export the final video at high quality using scripts/render_video.py.

Step 0: Ensure dependencies

Before writing any scene, ensure Manim is installed:

# System deps (usually pre-installed)
apt-get install -y libpango1.0-dev libcairo2-dev ffmpeg 2>/dev/null

# Python package
pip install manim --break-system-packages -q

Verify with: python3 -c "import manim; print(manim.__version__)"
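
The same check can be scripted so that a missing dependency is reported before any scene is written. The sketch below is an illustration, not part of the skill's scripts: `check_dependencies` and `missing_dependencies` are hypothetical helper names, and the check only covers what is visible from the current interpreter and PATH.

```python
import importlib.util
import shutil


def check_dependencies():
    """Report which render dependencies are visible from this interpreter."""
    return {
        # find_spec returns None when the package cannot be located
        "manim": importlib.util.find_spec("manim") is not None,
        # shutil.which returns None when the binary is not on PATH
        "ffmpeg": shutil.which("ffmpeg") is not None,
    }


def missing_dependencies():
    """Names of the dependencies that still need installing."""
    return [name for name, found in check_dependencies().items() if not found]


# An empty list means both Manim and ffmpeg were found.
print(missing_dependencies())
```

If the list is not empty, rerun the install commands above for the names it reports.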

Step 1: Interpret the concept

Determine the best animation pattern, then read the matching rule file before writing any code.

User intent	Rule file to read	Key Manim primitives
Explain a pipeline/flow	`references/rules/pipeline-flow.md`	Arrow, Rectangle, Text, AnimationGroup
Show architecture layers	`references/rules/architecture-layers.md`	VGroup, `.arrange()`, FadeIn with `shift=`
Algorithm step-through	`references/rules/algorithm-stepthrough.md`	Transform, ReplacementTransform, Indicate
Compare approaches	`references/rules/comparison.md`	Split screen VGroups, simultaneous animations
Mathematical concept	`references/rules/math-concept.md`	MathTex, geometric shapes, Rotate, Scale
Agent/multi-system interaction	`references/rules/agent-interaction.md`	Arrows between entities, Create/FadeOut
Training/optimization loop	`references/rules/training-loop.md`	Loop with Transform, ValueTracker, plots
Timeline/history	`references/templates/timeline_template.py`	NumberLine, sequential Indicate
Embed images or screenshots	`references/rules/images.md`	ImageMobject, SVGMobject
Add subtitles or captions	`references/rules/subtitles.md`	SRT generation, ffmpeg subtitle burn
Multiple distinct chapters	`references/rules/multi-scene.md`	Multiple Scene classes, ffmpeg concat
Add audio or voiceover	`references/rules/audio-overlay.md`	ffmpeg, scripts/add_audio.py
Transition between sections	`references/rules/transitions.md`	FadeOut all, shift off-screen
Text reveal, callouts, emphasis	`references/rules/text-animation.md`	ReplacementTransform, LaggedStart, Indicate
Positioning, spacing, layout	`references/rules/layout.md`	next_to, arrange, to_edge, move_to
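
The table above can be read as a lookup from intent to the files worth opening first. Here is a minimal sketch of that lookup, using only the paths listed in the table. The short keys, the split into "primary" and "add-on" entries, and the `files_to_read` helper are assumptions made for illustration.

```python
# Primary patterns: one of these usually drives the whole scene (assumed grouping).
PRIMARY_RULES = {
    "pipeline": "references/rules/pipeline-flow.md",
    "architecture": "references/rules/architecture-layers.md",
    "algorithm": "references/rules/algorithm-stepthrough.md",
    "comparison": "references/rules/comparison.md",
    "math": "references/rules/math-concept.md",
    "agents": "references/rules/agent-interaction.md",
    "training": "references/rules/training-loop.md",
    "timeline": "references/templates/timeline_template.py",
}

# Add-ons: cross-cutting concerns layered on top of a primary pattern (assumed grouping).
ADD_ON_RULES = {
    "images": "references/rules/images.md",
    "subtitles": "references/rules/subtitles.md",
    "multi-scene": "references/rules/multi-scene.md",
    "audio": "references/rules/audio-overlay.md",
    "transitions": "references/rules/transitions.md",
    "text": "references/rules/text-animation.md",
    "layout": "references/rules/layout.md",
}


def files_to_read(pattern, add_ons=()):
    """Return the rule files to read, primary first, without duplicates."""
    if pattern not in PRIMARY_RULES:
        raise ValueError(f"unknown pattern: {pattern!r}")
    paths = [PRIMARY_RULES[pattern]]
    for name in add_ons:
        path = ADD_ON_RULES[name]
        if path not in paths:
            paths.append(path)
    return paths


print(files_to_read("pipeline", add_ons=["layout", "audio"]))
```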

Step 2: Design the Manim scene

Template-first vs from-scratch

Check whether a parametric template covers the concept before writing a scene from scratch:

If the concept is...	Start with template
A linear pipeline (A→B→C→D)	`references/templates/data_flow_template.py` — edit `STAGES`
A two-option comparison	`references/templates/comparison_template.py` — edit `LEFT_ITEMS`, `RIGHT_ITEMS`
A chronological timeline	`references/templates/timeline_template.py` — edit `EVENTS`
Anything else	Write from scratch using the relevant rule file

When using a template: copy it to the working directory, edit the config constants at the top, and do not restructure the class.

Core rules:

Single file, single Scene class: Everything in one .py file with one class XxxScene(Scene).
Self-contained: No external assets unless absolutely necessary. Use Manim primitives for everything.
Readable code: The scene file IS the user's artifact. Use clear variable names, comments for each animation beat.
Color with intention: Use Manim's color constants (BLUE, RED, GREEN, YELLOW, etc.) or hex colors. Max 4-5 colors. Every color should encode meaning.
Pacing: Include self.wait() calls between logical sections. 0.5s for breathing room, 1-2s for major transitions.
Text legibility: Use font_size=36 minimum for body text, font_size=48+ for titles. Test at target resolution.
Scene dimensions: Default Manim canvas is 14.2 × 8 units (16:9). Keep content within ±6 horizontal, ±3.5 vertical.
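
The scene-dimension rule can be checked with simple arithmetic before rendering. The sketch below assumes an object is described by its center and its bounding-box size in Manim units. `fits_safe_area` is a hypothetical helper, not a Manim API.

```python
FRAME_HEIGHT = 8.0
FRAME_WIDTH = FRAME_HEIGHT * 16 / 9  # about 14.2 units for a 16:9 canvas

# Safe area from the rule above: keep content within ±6 by ±3.5 units.
SAFE_X = 6.0
SAFE_Y = 3.5


def fits_safe_area(center_x, center_y, width, height):
    """True when the whole bounding box stays inside the safe area."""
    right_edge = abs(center_x) + width / 2
    top_edge = abs(center_y) + height / 2
    return right_edge <= SAFE_X and top_edge <= SAFE_Y


print(round(FRAME_WIDTH, 1))          # 14.2
print(fits_safe_area(0, 0, 4, 2))     # True: a centered 4x2 box
print(fits_safe_area(5, 0, 4, 2))     # False: right edge would sit at x = 7
```

The margin between the safe area and the full frame is what keeps labels from touching the edge at low preview resolutions.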

Animation best practices

# DO: Use animation groups for simultaneous effects
self.play(FadeIn(box), Write(label), run_time=1)

# DO: Use .animate syntax for property changes
self.play(box.animate.shift(RIGHT * 2).set_color(GREEN))

# DO: Stagger related elements
self.play(LaggedStart(*[FadeIn(item) for item in items], lag_ratio=0.2))

# DON'T: Add/remove without animation (jarring)
self.add(box)  # Only for setup before first frame

# DON'T: Make animations too fast
self.play(Transform(a, b), run_time=0.3)  # Too fast to read

Structure template

from manim import *

class ConceptScene(Scene):
    def construct(self):
        # === Section 1: Title / Setup ===
        title = Text("Concept Name", font_size=56, weight=BOLD)
        self.play(Write(title))
        self.wait(1)
        self.play(FadeOut(title))

        # === Section 2: Core animation ===
        # ... main content here ...

        # === Section 3: Summary / Conclusion ===
        # ... wrap-up animation ...
        self.wait(2)
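
When planning the pacing of a scene like the one above, it can help to list the beats and add up their durations before touching the code. The sketch below is plain Python and does not import Manim. The beat list, the `rescale_beats` helper, and the 0.5-second floor for animations are all assumptions for illustration.

```python
def rescale_beats(beats, factor, min_play=0.5):
    """Scale (kind, seconds) beats, where kind is "play" or "wait".

    min_play is an assumed floor so sped-up animations stay readable.
    """
    if factor <= 0:
        raise ValueError("factor must be positive")
    scaled = []
    for kind, seconds in beats:
        new_seconds = seconds * factor
        if kind == "play":
            new_seconds = max(new_seconds, min_play)
        scaled.append((kind, new_seconds))
    return scaled


def total_duration(beats):
    """Total running time of a beat list in seconds."""
    return sum(seconds for _, seconds in beats)


# Hypothetical beat list: two animations, each followed by a pause.
beats = [("play", 1.0), ("wait", 1.0), ("play", 2.0), ("wait", 2.0)]
slower = rescale_beats(beats, 1.5)
print(total_duration(beats), total_duration(slower))  # 6.0 9.0
```

The scaled values then map onto the `run_time=` arguments and `self.wait()` durations in the scene file.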

Step 3: Preview render

Use low quality for fast iteration:

python3 scripts/render_video.py scene.py ConceptScene --quality low --format mp4

This renders at 480p/15fps — fast enough for previewing timing and layout. Present the video to the user.

Step 4: Iterate

Common refinement requests and how to handle them:

Request	Action
"Slower/faster"	Adjust `run_time=` params and `self.wait()` durations
"Change colors"	Update color constants
"Add a step"	Insert new animation block between sections
"Reorder"	Move code blocks around
"Different layout"	Adjust `.shift()`, `.next_to()`, `.arrange()` calls
"Add labels/annotations"	Add `Text` or `MathTex` objects with `.next_to()`
"Make it loop"	Add matching intro/outro states

Step 5: Final export

Once the user is satisfied:

python3 scripts/render_video.py scene.py ConceptScene --quality high --format mp4

Quality presets

Preset	Resolution	FPS	Flag	Use case
`low`	480p	15	`-ql`	Fast preview
`medium`	720p	30	`-qm`	Draft review
`high`	1080p	60	`-qh`	Final delivery
`4k`	2160p	60	`-qk`	Presentation quality

Format options

Format	Flag	Use case
`mp4`	`--format mp4`	Standard video delivery
`gif`	`--format gif`	Embeddable in docs, social
`webm`	`--format webm`	Web-optimized
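
For readers who want to see how the presets relate to the Manim CLI, here is a rough sketch of the command a wrapper might assemble. It is not the actual contents of scripts/render_video.py. `manim_command` is a hypothetical helper, and the sketch assumes the Manim Community CLI, where the quality flags and `--format` are passed before the scene file and class name.

```python
# Preset names and flags taken from the quality table above.
QUALITY_FLAGS = {"low": "-ql", "medium": "-qm", "high": "-qh", "4k": "-qk"}
FORMATS = ("mp4", "gif", "webm")


def manim_command(scene_file, scene_class, quality="low", fmt="mp4"):
    """Build the argument list for a Manim render (nothing is executed here)."""
    if quality not in QUALITY_FLAGS:
        raise ValueError(f"unknown quality preset: {quality!r}")
    if fmt not in FORMATS:
        raise ValueError(f"unknown format: {fmt!r}")
    return ["manim", QUALITY_FLAGS[quality], "--format", fmt, scene_file, scene_class]


print(" ".join(manim_command("scene.py", "ConceptScene", "high", "mp4")))
# manim -qh --format mp4 scene.py ConceptScene
```

Returning a list rather than a string makes it safe to hand the command to `subprocess.run` without shell quoting.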

Delivering the output

Present both:

The .py scene file (for future editing)
The rendered video file (final output)

Copy the final video to /mnt/user-data/outputs/ and present it.

Step 5.5: Optional audio overlay

If the user provides audio (music or voiceover), or requests it:

# Background music at 25% volume with fade-in/out
python3 scripts/add_audio.py final.mp4 music.mp3 \
    --output final_with_audio.mp4 \
    --volume 0.25 --fade-in 2 --fade-out 3 --trim-to-video

# Voiceover at full volume, trimmed to video length
python3 scripts/add_audio.py final.mp4 voiceover.mp3 \
    --output final_narrated.mp4 --trim-to-video

For voiceover scripting before recording, read references/rules/voiceover-scaffold.md. For subtitles/captions, read references/rules/subtitles.md. For advanced multi-track mixing, read references/rules/audio-overlay.md.
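
As a rough illustration of what an overlay like this involves, the sketch below builds an ffmpeg command with the standard `volume` and `afade` audio filters. It is an assumption about one way to do it, not the implementation of scripts/add_audio.py. `overlay_command` and its parameters are hypothetical, and the video duration is passed in by hand because the fade-out start time depends on it.

```python
def overlay_command(video, audio, output, volume=1.0, fade_in=0, fade_out=0,
                    video_duration=None, trim_to_video=False):
    """Build an ffmpeg argument list that replaces the video's audio track."""
    filters = [f"volume={volume}"]
    if fade_in > 0:
        filters.append(f"afade=t=in:st=0:d={fade_in}")
    if fade_out > 0:
        if video_duration is None:
            raise ValueError("video_duration is required for a fade-out")
        start = max(video_duration - fade_out, 0)
        filters.append(f"afade=t=out:st={start}:d={fade_out}")
    command = [
        "ffmpeg", "-y", "-i", video, "-i", audio,
        "-map", "0:v:0", "-map", "1:a:0",  # video from input 0, audio from input 1
        "-c:v", "copy", "-c:a", "aac",     # keep the video stream, re-encode audio
        "-af", ",".join(filters),
    ]
    if trim_to_video:
        # -shortest ends the output with the shorter stream, so this trims
        # the audio only when the audio is longer than the video.
        command.append("-shortest")
    command.append(output)
    return command


print(" ".join(overlay_command("final.mp4", "music.mp3", "final_with_audio.mp4",
                               volume=0.25, fade_in=2, fade_out=3,
                               video_duration=30, trim_to_video=True)))
```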

Error Handling

Error	Cause	Resolution
`ModuleNotFoundError: manim`	Manim not installed	Run Step 0 setup commands
`pangocairo` build error	Missing system dev headers	`apt-get install -y libpango1.0-dev`
`FileNotFoundError: ffmpeg`	ffmpeg not installed	`apt-get install -y ffmpeg`
Scene class not found	Class name mismatch	Verify class name matches CLI argument
Overlapping objects	Positions not calculated	Use `.next_to()`, `.arrange()`, explicit `.shift()` calls
Text cut off	Text too large or positioned near edge	Reduce `font_size` or adjust position within ±6,±3.5
Slow render	Too many objects or complex transformations	Reduce object count, simplify paths, use lower quality
`LaTeX Error`	LaTeX not installed (for MathTex)	Use `Text` instead, or install `texlive-latex-base`
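
The "Scene class not found" row can be caught before rendering by listing the scene classes a file actually defines. The sketch below uses the standard `ast` module and a naming heuristic: it treats any top-level class whose base name ends in `Scene` as a scene. `scene_class_names` is a hypothetical helper, and the heuristic will miss classes that inherit through an intermediate base with a different name.

```python
import ast


def _base_name(node):
    """Name of a base class expression such as Scene or manim.Scene."""
    if isinstance(node, ast.Name):
        return node.id
    if isinstance(node, ast.Attribute):
        return node.attr
    return ""


def scene_class_names(source):
    """Top-level classes whose direct base looks like a Manim scene class."""
    tree = ast.parse(source)
    names = []
    for node in tree.body:
        if isinstance(node, ast.ClassDef):
            if any(_base_name(base).endswith("Scene") for base in node.bases):
                names.append(node.name)
    return names


example = '''
from manim import *

class ConceptScene(Scene):
    def construct(self):
        pass

class Helper:
    pass
'''
print(scene_class_names(example))  # ['ConceptScene']
```

Comparing the result against the class name passed on the command line gives a clear error message instead of a failed render.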

LaTeX fallback

If LaTeX is not available, avoid MathTex and Tex. Use Text with Unicode math symbols instead:

# Instead of: MathTex(r"\frac{1}{n} \sum_{i=1}^{n} x_i")
# Use:        Text("(1/n) Σ xᵢ", font_size=36)
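
One way to make this fallback automatic is to decide up front which kind of label to build. The sketch below only checks whether a `latex` binary is on PATH, which is a heuristic rather than a guarantee that MathTex will render. `latex_available` and `mean_label_spec` are hypothetical helpers, and the returned pair is just the class name and source string to use in the scene.

```python
import shutil


def latex_available():
    """Heuristic: True when a `latex` executable is on PATH."""
    return shutil.which("latex") is not None


def mean_label_spec(has_latex):
    """Pick the Manim class name and source string for the mean formula."""
    if has_latex:
        return ("MathTex", r"\frac{1}{n} \sum_{i=1}^{n} x_i")
    # Unicode fallback from the example above, rendered with Text
    return ("Text", "(1/n) Σ xᵢ")


kind, source = mean_label_spec(latex_available())
print(kind, source)
```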

Agentic Mode (Opt-In)

Single-shot mode (default) is fast and cheap — the coder writes scene.py directly from a concept. Use agentic mode for production-quality renders where layout correctness and asset resolution matter enough to justify additional LLM and VLM calls.

Pipeline

concept
  └─► plan_storyboard.py ──► storyboard.json
            │
            ▼
      fetch_assets.py (optional)
            │
            ▼
      coder writes scene.py
            │
            ▼
      render_video.py --max-fix-attempts N
            │  ▲
            │  └─ LLM fixup loop (on failure, up to N retries)
            ▼
      critic_pass.py --critic
            │  ▲
            │  └─ VLM layout patch (1 call with M image blocks)
            ▼
       final MP4
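
The fixup loop in the diagram can be pictured as a bounded retry loop. The sketch below illustrates the control flow only and is not the code in scripts/render_video.py. `render_with_fixes` is a hypothetical function, `render` and `fix` are caller-supplied stand-ins for the render step and the LLM call, and the cap of 3 mirrors the hard cap documented for `--max-fix-attempts`.

```python
HARD_CAP = 3  # documented hard cap for --max-fix-attempts


def render_with_fixes(render, fix, source, max_fix_attempts=0):
    """Render, and on failure ask `fix` for a patched source, up to the cap.

    render(source) returns None on success or an error string on failure.
    fix(source, error) returns a new source string.
    Returns (final_source, fixes_used); raises RuntimeError when exhausted.
    """
    allowed = min(max(max_fix_attempts, 0), HARD_CAP)
    fixes_used = 0
    while True:
        error = render(source)
        if error is None:
            return source, fixes_used
        if fixes_used >= allowed:
            raise RuntimeError(f"render failed after {fixes_used} fix attempts: {error}")
        source = fix(source, error)  # one extra LLM call per retry
        fixes_used += 1


# Toy stand-ins: the "render" fails while the marker BUG is present.
def fake_render(src):
    return "NameError: BUG" if "BUG" in src else None


def fake_fix(src, error):
    return src.replace("BUG", "", 1)


print(render_with_fixes(fake_render, fake_fix, "BUGBUGok", max_fix_attempts=3))
# ('ok', 2)
```

With `max_fix_attempts=0` the first failure raises immediately, which matches the "0 = disabled" default.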

Flag Reference

Script	Flag	Default	Hard cap	Effect	Cost impact
`render_video.py`	`--max-fix-attempts`	`0`	`3`	LLM-assisted auto-fix on render failure; 0 = disabled	+1 LLM call per retry
`critic_pass.py`	`--critic`	disabled	—	Enable the VLM critic pass; no-op without this flag	+1 VLM call (M image blocks, one per sampled frame)
`critic_pass.py`	`--critic-budget`	`50000`	—	Token budget for critic call; aborts loudly if exceeded	Sets ceiling; use to prevent runaway spend
`critic_pass.py`	`--frames`	`5`	`10`	Frames sampled from the rendered video for the critic	More frames → higher token cost per critic run
`fetch_assets.py`	`--adapter`	`none`	—	Asset backend: `local`, `iconfinder`, `none`	`iconfinder` adds external API calls
`fetch_assets.py`	`--asset-dir`	—	—	Root directory for `--adapter=local`; required with local	None

Cost Tradeoffs

The fixup loop adds one LLM call per failed render attempt: with --max-fix-attempts 3 you may pay for up to 3 extra calls before the loop succeeds or is exhausted. The critic pass adds one VLM call containing one PNG image block per sampled frame (default 5, max 10). Each frame costs roughly 1 token per 800 bytes of base64-encoded PNG, so complex scenes at high resolution are materially more expensive. Setting --critic-budget to a conservative token ceiling (e.g. 20000) makes an oversized request raise BudgetExceededError before the API call is made, so you never pay for it. The error is loud and non-recoverable by design.
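
The arithmetic behind that estimate can be worked through in a few lines. The sketch below applies the rule of thumb quoted above (about 1 token per 800 bytes of base64 text) and the fact that base64 encodes every 3 raw bytes as 4 characters. The `BudgetExceededError` class here is a local stand-in with the same name, and `check_budget` is a hypothetical helper, not the code in scripts/critic_pass.py.

```python
import math

BYTES_PER_TOKEN = 800  # rule of thumb from the text above


class BudgetExceededError(RuntimeError):
    """Local stand-in for the error raised when the estimate exceeds the budget."""


def base64_length(raw_bytes):
    """Length of the base64 encoding: 4 characters per 3 raw bytes, padded."""
    return 4 * math.ceil(raw_bytes / 3)


def estimate_image_tokens(png_sizes):
    """Estimated tokens for a list of raw PNG sizes in bytes."""
    return sum(math.ceil(base64_length(size) / BYTES_PER_TOKEN) for size in png_sizes)


def check_budget(png_sizes, budget=50_000, prompt_tokens=0):
    """Return the estimated total, or raise before any API call would be made."""
    total = prompt_tokens + estimate_image_tokens(png_sizes)
    if total > budget:
        raise BudgetExceededError(f"estimated {total} tokens exceeds budget {budget}")
    return total


# Five frames of 600,000 bytes each: 800,000 base64 bytes, so 1,000 tokens per frame.
print(check_budget([600_000] * 5, budget=20_000, prompt_tokens=2_000))  # 7000
```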

Invocation Example

# 1. Plan
python3 scripts/plan_storyboard.py "explain transformer self-attention" \
    --output storyboard.json

# 2. (Optional) Fetch assets
python3 scripts/fetch_assets.py storyboard.json \
    --adapter local --asset-dir ./assets --output resolved.json

# 3. Coder writes scene.py (Claude writes this from storyboard.json)

# 4. Render with auto-fix
python3 scripts/render_video.py scene.py AttentionScene \
    --quality high --format mp4 --max-fix-attempts 3 \
    --output final.mp4

# 5. Critic pass
python3 scripts/critic_pass.py scene.py final.mp4 \
    --critic --critic-budget 40000 --frames 5

Agentic pipeline design (storyboard planner, auto-fix loop, VLM critic) is adapted from Code2Video (arXiv 2510.01174, MIT). Vendored prompt templates live in references/code2video/ alongside the upstream LICENSE. Full vendoring record, pinned commit, and re-sync policy are tracked in root ATTRIBUTIONS.md.

Limitations

Manim + ffmpeg required: nothing can be rendered without these dependencies.
Audio is post-render only: this workflow renders silent MP4s. Use scripts/add_audio.py to overlay audio after export.
LaTeX optional: MathTex requires a LaTeX installation. Fall back to Text with Unicode for math.
Render time scales with complexity: a 30-second 1080p scene with many objects can take 1-2 minutes to render.
3D scenes are risky in containers: ThreeDScene and the OpenGL renderer may not work in headless environments. Stick to the 2D Scene class.
No interactivity: output is a static video file, not an interactive widget.
GIF output is silent: audio overlay only works with the MP4 and WebM output formats.

Design anti-patterns to avoid

Walls of text on screen: keep to 3-5 words per label, max 2 lines
Everything appearing at once: use staged animations with LaggedStart
Uniform timing: vary run_time to create rhythm (fast for simple, slow for important)
No visual hierarchy: use size, color, and position to guide attention
Rainbow colors: 4-5 intentional colors max, each encoding meaning
Ignoring the grid: align objects to consistent positions using arrange/align
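
The first anti-pattern is easy to check mechanically. Below is a small sketch that flags labels exceeding the word and line limits suggested above. `label_problems` is a hypothetical helper, and the limits are parameters so they can be relaxed for titles.

```python
def label_problems(label, max_words=5, max_lines=2):
    """Return a list of human-readable problems; empty means the label is fine."""
    problems = []
    line_count = len(label.splitlines())
    word_count = len(label.split())
    if line_count > max_lines:
        problems.append(f"{line_count} lines (max {max_lines})")
    if word_count > max_words:
        problems.append(f"{word_count} words (max {max_words})")
    return problems


print(label_problems("Embed query"))                        # []
print(label_problems("This label has far too many words"))  # ['7 words (max 5)']
```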
