Skill

generate-video

This skill should be used when the user asks to "generate a video", "create a video", "make a video", "animate this", "text to video", "generate with Veo", "create a clip", "make a short film", "extend the video", "continue the video", "interpolate between", "image to video", "animate this image", "video from image", or needs AI video generation, video extension, or frame interpolation using the Gemini Veo API.

Invocation

Slash command

/gemini-media:generate-video

Context Preview

Wrap the Gemini Veo video generation REST API to produce, extend, and interpolate videos via a Python script (stdlib only, no pip dependencies). Support text-to-video, image-to-video, frame interpolation, reference images, video extension, and native audio synthesis. All output is saved to `./generated-videos/` and auto-opened on macOS.

Supporting Files

references/advanced-features.md
references/api-reference.md
scripts/generate_video.py

SKILL.md

230 lines · ~2.4k tokens

Stats

Language: Python

Parent stars: 0

Maintenance: Good

Last Commit: Mar 7, 2026

Generate Video

Prerequisites

Before any generation, verify the environment:

Confirm $GEMINI_API_KEY is set. If missing, instruct the user: export GEMINI_API_KEY='your-key-here'
Ensure python3 is available (Python 3.7+). The script uses only stdlib modules — no pip install needed.
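
The two checks above can be sketched in Python as follows. This is a minimal illustration, not code taken from generate_video.py; the helper name `check_prerequisites` and its parameters are hypothetical.

```python
import os
import sys


def check_prerequisites(env=None, version=None):
    """Return a list of human-readable problems; an empty list means ready.

    Hypothetical helper: `env` and `version` are injectable so the check
    can be exercised without touching the real environment.
    """
    env = os.environ if env is None else env
    version = sys.version_info if version is None else version
    problems = []
    if not env.get("GEMINI_API_KEY"):
        problems.append(
            "GEMINI_API_KEY is not set; run: export GEMINI_API_KEY='your-key-here'"
        )
    # Compare only (major, minor) against the documented minimum of 3.7.
    if tuple(version[:2]) < (3, 7):
        problems.append("Python 3.7+ is required")
    return problems
```

Calling `check_prerequisites()` with no arguments inspects the current process; any returned message can be relayed to the user before a generation is attempted.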

Text-to-Video Generation

When the user requests a new video from a text description:

1. Confirm the Prompt

Restate the user's request as a clear generation prompt. If the request is vague, ask for clarification before proceeding. Video prompts work best when they describe:

Camera movement (dolly, pan, tracking, aerial, handheld)
Subject action (what happens over time)
Mood/lighting (golden hour, neon, foggy)
Audio (dialogue, music, sound effects — Veo 3.1 generates native audio)

2. Choose Settings

Use API defaults unless the user explicitly requests specific settings. Only pass flags when the user asks for them.

If the user asks about available options:

Models: veo-3.1-fast-generate-preview (default, fast), veo-3.1-generate-preview (highest quality), veo-3-fast-generate-preview, veo-3-generate-preview
Aspect ratios: 16:9 (landscape, default), 9:16 (portrait/vertical)
Resolutions: 720p (default, fastest), 1080p, 4k (1080p and 4k require 8s duration)
Durations: 4, 6, or 8 seconds (default: 8)
Audio: Native audio is generated by default on Veo 3+ models; use --no-audio to disable
Multiple outputs: Generate 1-4 video variations with --sample-count
Determinism: Use --seed for more reproducible results
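
The constraints in this list can be checked before invoking the script. The sketch below is an assumption about how such a pre-check might look; `validate_settings` is a hypothetical helper and is not part of generate_video.py, which performs its own validation (exit code 11).

```python
VALID_ASPECT_RATIOS = {"16:9", "9:16"}
VALID_RESOLUTIONS = {"720p", "1080p", "4k"}
VALID_DURATIONS = {4, 6, 8}


def validate_settings(aspect_ratio="16:9", resolution="720p", duration=8, sample_count=1):
    """Return a list of constraint violations for the options listed above."""
    errors = []
    if aspect_ratio not in VALID_ASPECT_RATIOS:
        errors.append(f"unsupported aspect ratio: {aspect_ratio}")
    if resolution not in VALID_RESOLUTIONS:
        errors.append(f"unsupported resolution: {resolution}")
    if duration not in VALID_DURATIONS:
        errors.append(f"unsupported duration: {duration}")
    # 1080p and 4k are only available for 8-second clips.
    if resolution in {"1080p", "4k"} and duration != 8:
        errors.append(f"{resolution} requires an 8-second duration")
    if not 1 <= sample_count <= 4:
        errors.append(f"sample count must be between 1 and 4, got {sample_count}")
    return errors
```

For example, `validate_settings(resolution="1080p", duration=4)` reports the 8-second requirement, while the defaults produce no errors.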

3. Compose a Negative Prompt (Optional)

If the user mentions things to avoid, or the request clearly implies them (for example, a photorealistic look rules out cartoon styling), pass them with --negative-prompt. Common exclusions:

"blurry, shaky, low quality" (general quality)
"text overlays, watermark" (clean output)
"cartoon, animation" (when photorealism is desired)

Only add a negative prompt when the user requests it or when it clearly improves the result.

4. Invoke the Script

python3 "${CLAUDE_PLUGIN_ROOT}/skills/generate-video/scripts/generate_video.py" generate \
  --prompt "the final prompt" \
  --output-dir "./generated-videos"

Add optional flags only when the user explicitly requests them:

--aspect-ratio "9:16" — for portrait/vertical video
--resolution "1080p" — for higher resolution (requires 8s duration)
--duration "4" — for shorter clips
--negative-prompt "blurry, low quality" — to exclude unwanted elements
--person-generation "allow_all" — when people are needed in the video
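
The rule "pass a flag only when the user asked for it" can be sketched as a small command builder. This is an illustrative assumption, not part of the skill: `build_generate_command` is a hypothetical helper, and the fallback of `.` when `CLAUDE_PLUGIN_ROOT` is unset is chosen only to keep the example runnable.

```python
import os


def build_generate_command(prompt, output_dir="./generated-videos", **options):
    """Build the argv list for the generate subcommand.

    Only options that were explicitly provided (not None) become flags,
    so everything else falls back to the API defaults.
    """
    # Assumption: fall back to the current directory if the variable is unset.
    plugin_root = os.environ.get("CLAUDE_PLUGIN_ROOT", ".")
    script = os.path.join(
        plugin_root, "skills", "generate-video", "scripts", "generate_video.py"
    )
    argv = ["python3", script, "generate", "--prompt", prompt, "--output-dir", output_dir]
    for name, value in options.items():
        if value is None:
            continue
        # Keyword names map to flags, e.g. aspect_ratio -> --aspect-ratio.
        argv += ["--" + name.replace("_", "-"), str(value)]
    return argv
```

The resulting list can be handed to `subprocess.run`, which avoids shell quoting problems with prompts that contain quotes. Value-less switches such as --no-audio are not covered by this sketch.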

Important: Video generation is asynchronous. The script polls automatically, and a generation typically takes between 11 seconds and 6 minutes. Warn the user that this is slower than image generation.

5. Report the Result

The script outputs JSON to stdout. Parse it, then:

Report the saved video path.
Report the generation time.
Mention that the video has been opened for preview (auto-open applies on macOS).
Ask whether the user wants to extend the video, generate a new one, or adjust settings.

Image-to-Video (Animation)

When the user provides an image to animate:

Validate the image file exists (test -f).
The prompt should describe the desired motion, not the image content.

python3 "${CLAUDE_PLUGIN_ROOT}/skills/generate-video/scripts/generate_video.py" generate \
  --prompt "The camera slowly pulls back as the leaves begin to sway" \
  --image "/path/to/image.png" \
  --output-dir "./generated-videos"

Frame Interpolation

When the user provides two images and wants a transition between them:

Validate both image files exist.
The prompt should describe the desired transition.

python3 "${CLAUDE_PLUGIN_ROOT}/skills/generate-video/scripts/generate_video.py" generate \
  --prompt "Smooth cinematic transition between the two scenes" \
  --image "/path/to/first-frame.png" \
  --last-frame "/path/to/last-frame.png" \
  --output-dir "./generated-videos"

Reference Images

When the user provides reference images for style or content guidance:

Validate each file exists (up to 3 reference images).
The prompt should explicitly mention the referenced objects.

python3 "${CLAUDE_PLUGIN_ROOT}/skills/generate-video/scripts/generate_video.py" generate \
  --prompt "A woman walks through a garden wearing the red dress" \
  --reference-image "/path/to/dress.png" \
  --reference-image "/path/to/garden.png" \
  --output-dir "./generated-videos"
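
The two validation steps for reference images (each file exists, at most 3 files) can be sketched as below. `reference_image_args` is a hypothetical helper written for illustration; it is not taken from generate_video.py.

```python
import os


def reference_image_args(paths, max_images=3):
    """Turn reference image paths into repeated --reference-image flags."""
    if len(paths) > max_images:
        raise ValueError(
            f"at most {max_images} reference images are supported, got {len(paths)}"
        )
    missing = [p for p in paths if not os.path.isfile(p)]
    if missing:
        raise FileNotFoundError("missing reference image(s): " + ", ".join(missing))
    args = []
    for path in paths:
        # The flag is repeatable, so it is emitted once per image.
        args += ["--reference-image", path]
    return args
```

The returned list can be appended to the argv of a generate call.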

Video Extension

When the user wants to extend a previously generated video:

Validate the video file exists (must be an MP4).
The prompt describes what happens next in the video.
Extensions are locked to 720p and add ~7 seconds.
Up to 20 extensions per video chain.

python3 "${CLAUDE_PLUGIN_ROOT}/skills/generate-video/scripts/generate_video.py" extend \
  --prompt "The camera continues to pan right, revealing a waterfall" \
  --video "./generated-videos/previous-video.mp4" \
  --output-dir "./generated-videos"
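
As a rough worked example of the limits above: an 8-second clip extended 20 times at about 7 seconds per extension comes to roughly 148 seconds. The helper below is a hypothetical illustration of that arithmetic; because each extension adds only approximately 7 seconds, the result is an estimate.

```python
def max_chain_seconds(initial_seconds=8, extensions=20, seconds_per_extension=7):
    """Estimate the total length of a video after a chain of extensions."""
    if not 0 <= extensions <= 20:
        raise ValueError("a chain supports between 0 and 20 extensions")
    return initial_seconds + extensions * seconds_per_extension


print(max_chain_seconds())  # 8 + 20 * 7 = 148
```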

Background Submission

For long-running generations, submit without waiting:

python3 "${CLAUDE_PLUGIN_ROOT}/skills/generate-video/scripts/generate_video.py" generate \
  --prompt "the prompt" \
  --no-wait \
  --output-dir "./generated-videos"

This returns an operation name. Check later with:

python3 "${CLAUDE_PLUGIN_ROOT}/skills/generate-video/scripts/generate_video.py" poll \
  --operation "models/veo-3.1-fast-generate-preview/operations/..." \
  --wait \
  --output-dir "./generated-videos"
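
The waiting behaviour controlled by --poll-interval and --timeout can be pictured as the loop below. This is an illustrative sketch, not the script's actual implementation: `wait_for_operation` and its `check` callable are hypothetical, and `check` stands in for whatever request reports the operation status.

```python
import time


def wait_for_operation(check, poll_interval=10, timeout=600,
                       sleep=time.sleep, clock=time.monotonic):
    """Call check() until it returns a non-None result or the timeout passes.

    `check` is a hypothetical callable that returns None while the operation
    is still running and the finished result once it is done.
    """
    deadline = clock() + timeout
    while True:
        result = check()
        if result is not None:
            return result
        # Give up if the next poll would land after the deadline.
        if clock() + poll_interval > deadline:
            raise TimeoutError(f"operation did not finish within {timeout} seconds")
        sleep(poll_interval)
```

Injecting `sleep` and `clock` keeps the loop testable without real delays. A timeout here corresponds to the situation in which the operation name should be reported so the user can check again later.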

Error Handling

| Exit Code | Meaning | Action |
| --- | --- | --- |
| 0 | Success | Report video path |
| 10 | Missing `$GEMINI_API_KEY` or dependency | Tell user what to set/install |
| 11 | Invalid input (bad path, unsupported format, constraint violation) | Report the specific validation error |
| 20 | HTTP 400 (content policy or bad request) | Show API error message, suggest rephrasing |
| 21 | HTTP 401/403 (auth failure) | "API key is invalid or expired" |
| 22 | HTTP 429 (rate limited) | Wait 10 seconds, retry once automatically. If still failing, tell user to wait. |
| 23 | HTTP 500+ (server error) | Retry once automatically. If still failing, report. |
| 24 | Poll timeout | Report operation name so user can check later with the poll command |
| 30 | No video in response | "Model didn't return a video; try rephrasing the prompt" |

On exit codes 22 and 23, retry the same command once before reporting failure.
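
The retry-once rule can be sketched as a thin wrapper around the command. This is a hypothetical helper, not part of the skill; the 10-second wait for exit code 22 comes from the table, while retrying immediately on exit code 23 is an assumption, since no delay is specified for it.

```python
import subprocess
import time

# Exit code -> seconds to wait before the single retry.
# The zero delay for 23 is an assumption; the table gives no wait time.
RETRY_DELAYS = {22: 10, 23: 0}


def run_with_retry(argv, run=subprocess.run, sleep=time.sleep):
    """Run the command, retrying exactly once on exit codes 22 and 23.

    Returns the final exit code so the caller can map it to an action.
    """
    result = run(argv)
    if result.returncode in RETRY_DELAYS:
        sleep(RETRY_DELAYS[result.returncode])
        result = run(argv)
    return result.returncode
```

`run` and `sleep` are injectable so the behaviour can be checked without launching a process. A real caller would also capture stdout in order to parse the JSON result.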

Script Reference

`generate_video.py generate`

Core API caller. Flags:

--prompt (required) — generation prompt
--model — model ID (default: veo-3.1-fast-generate-preview)
--aspect-ratio — 16:9 or 9:16 (optional; API default 16:9)
--resolution — 720p, 1080p, 4k (optional; API default 720p)
--duration — 4, 6, 8 seconds (optional; API default 8)
--negative-prompt — content to exclude
--person-generation — allow_all, allow_adult, or dont_allow
--generate-audio / --no-audio — enable/disable native audio synthesis (Veo 3+)
--seed N — seed for more reproducible results (uint32); identical output is not guaranteed
--sample-count N — number of videos to generate (1-4)
--resize-mode — pad or crop (image-to-video only)
--compression-quality — optimized or lossless
--image PATH — first frame for image-to-video
--last-frame PATH — last frame for interpolation (requires --image)
--reference-image PATH — reference image (repeatable, max 3)
--no-wait — submit and return immediately
--poll-interval N — seconds between polls (default: 10)
--timeout N — max seconds to wait (default: 600)
--output-dir DIR — output directory (default: ./generated-videos)

`generate_video.py extend`

Video extension. Flags:

--prompt (required) — continuation prompt
--video PATH (required) — MP4 video to extend
--model — model ID (default: veo-3.1-fast-generate-preview)
--no-wait — submit and return immediately
--poll-interval N — seconds between polls (default: 10)
--timeout N — max seconds to wait (default: 600)
--output-dir DIR — output directory (default: ./generated-videos)

`generate_video.py poll`

Check or wait on a running operation. Flags:

--operation NAME (required) — operation name from a previous submission
--wait — wait for completion and download the video
--poll-interval N — seconds between polls (default: 10)
--timeout N — max seconds to wait (default: 600)
--output-dir DIR — output directory (default: ./generated-videos)

Additional Resources

references/api-reference.md — Full Veo REST API schema: endpoints, request/response formats, all parameters, error codes, constraints.
references/advanced-features.md — Prompt engineering, resolution guide, duration selection, reference images, person generation, extension chains, async workflow, SynthID watermark.
