From nano-banana
Generate videos using Veo 3.1 — text-to-video, image-to-video, frame interpolation, and video extension
Slash command
/nano-banana:video [description or instruction]
Generate videos using Google's Veo 3.1 models via the google-genai SDK. Supports four generation modes: text-to-video, image-to-video, frame interpolation, and video extension.
Key Features:
Use this skill when you need:
Not for: Static images (use image skill), technical diagrams (use diagram skill), or diagram source rendering (use kroki skill).
python3 ${CLAUDE_SKILL_DIR}/scripts/generate_video.py "A drone flyover of a coastal city at golden hour" -o flyover.mp4
Animate a still image with a motion prompt:
python3 ${CLAUDE_SKILL_DIR}/scripts/generate_video.py "Camera slowly zooms in while clouds drift" \
--input photo.png -o animated.mp4
Generate a smooth transition between two frames:
python3 ${CLAUDE_SKILL_DIR}/scripts/generate_video.py "Smooth morph between scenes" \
--input start.png --last-frame end.png -o transition.mp4
Continue an existing video clip:
python3 ${CLAUDE_SKILL_DIR}/scripts/generate_video.py "The camera keeps panning right revealing more of the landscape" \
--extend clip.mp4 -o extended.mp4
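The four modes above differ only in which input flags accompany the prompt. The following is a minimal Python sketch of that mapping. `infer_mode` is a hypothetical helper for illustration, not a function from `generate_video.py`, and rejecting `--extend` combined with `--input` is an assumption rather than documented behavior.

```python
from typing import Optional


def infer_mode(input_image: Optional[str] = None,
               last_frame: Optional[str] = None,
               extend: Optional[str] = None) -> str:
    """Return the generation mode implied by the supplied inputs."""
    if last_frame and not input_image:
        # Documented rule: --last-frame requires --input.
        raise ValueError("--last-frame requires --input")
    if extend:
        if input_image:
            # Assumption: extension and image inputs are mutually exclusive.
            raise ValueError("--extend cannot be combined with --input")
        return "video-extension"
    if input_image and last_frame:
        return "frame-interpolation"
    if input_image:
        return "image-to-video"
    return "text-to-video"
```

For example, `infer_mode("start.png", "end.png")` yields `"frame-interpolation"`, while calling it with no inputs yields `"text-to-video"`.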
Guide the visual style with up to 3 reference images:
python3 ${CLAUDE_SKILL_DIR}/scripts/generate_video.py "A cat walking through a garden" \
-o styled.mp4 --reference style1.png --reference style2.png
| Model | ID | Speed | Best For |
|---|---|---|---|
| Veo 3.1 Fast | veo-3.1-fast-generate-preview | Fast | Default — quick iterations, drafts |
| Veo 3.1 | veo-3.1-generate-preview | Standard | Final output, higher quality |
Use -m to select a model:
python3 ${CLAUDE_SKILL_DIR}/scripts/generate_video.py "A sunset timelapse" -o sunset.mp4 -m veo-3.1-generate-preview
| Resolution | Supported Durations | Notes |
|---|---|---|
| 720p (default) | 4, 6, 8 seconds | All durations, required for extension |
| 1080p | 8 seconds only | Higher quality, duration locked |
# 720p, 4 seconds (fast preview)
python3 ${CLAUDE_SKILL_DIR}/scripts/generate_video.py "Quick test" -o test.mp4 --resolution 720p --duration 4
# 1080p, must be 8 seconds
python3 ${CLAUDE_SKILL_DIR}/scripts/generate_video.py "Cinematic landscape" -o landscape.mp4 --resolution 1080p --duration 8
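The resolution and duration pairings in the table above can be checked before a request is sent. The sketch below encodes only what the table states. `ALLOWED_DURATIONS` and `check_settings` are hypothetical names, and the real script may validate differently.

```python
# Durations (seconds) permitted per resolution, taken from the table above.
ALLOWED_DURATIONS = {"720p": (4, 6, 8), "1080p": (8,)}


def check_settings(resolution: str = "720p", duration: int = 8,
                   extend: bool = False) -> list:
    """Return a list of constraint violations; an empty list means valid."""
    problems = []
    if resolution not in ALLOWED_DURATIONS:
        problems.append(f"unknown resolution {resolution!r}")
        return problems
    if duration not in ALLOWED_DURATIONS[resolution]:
        allowed = ", ".join(str(d) for d in ALLOWED_DURATIONS[resolution])
        problems.append(f"{resolution} supports durations: {allowed}")
    if extend and resolution != "720p":
        # The table marks 720p as required for extension.
        problems.append("extension requires 720p")
    return problems
```

Returning a list rather than raising on the first problem lets a caller report every violation at once.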
Constraint rules:
- --resolution 1080p requires --duration 8
- --extend requires --resolution 720p
- --reference images

| Ratio | Use Case |
|---|---|
| 16:9 (default) | Landscape, presentations, YouTube |
| 9:16 | Portrait, mobile reels, TikTok, Stories |
# Vertical video for social media
python3 ${CLAUDE_SKILL_DIR}/scripts/generate_video.py "A vertical reel of a coffee being poured" \
-o reel.mp4 --aspect-ratio 9:16
By default, audio is stripped from generated videos using ffmpeg. This avoids unexpected AI-generated audio.
# Default: audio stripped
python3 ${CLAUDE_SKILL_DIR}/scripts/generate_video.py "A beach scene" -o beach.mp4
# Keep generated audio
python3 ${CLAUDE_SKILL_DIR}/scripts/generate_video.py "A concert crowd cheering" -o concert.mp4 --audio
Requirements for audio stripping:
- ffmpeg must be installed (brew install ffmpeg on macOS)

| Flag | Default | Description |
|---|---|---|
| prompt (positional) | required | Text description of the video |
| -o / --output | required | Output .mp4 file path |
| -m / --model | veo-3.1-fast-generate-preview | Model ID |
| -i / --input | — | Input image (image-to-video / interpolation) |
| --last-frame | — | Last frame image (interpolation, requires --input) |
| --extend | — | Video .mp4 to extend |
| --reference | — | Reference image (repeatable, max 3) |
| --aspect-ratio | 16:9 | 16:9 or 9:16 |
| --resolution | 720p | 720p or 1080p |
| --duration | 8 | 4, 6, or 8 seconds |
| --audio | off | Keep generated audio |
| --timeout | 360 | Max wait in seconds |
| --api-key | — | Override GEMINI_API_KEY |
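When the script is driven from other Python code, the flags in the table can be assembled into an argument list instead of a hand-built shell string. The sketch below is one way to do that. `build_command` is a hypothetical wrapper, and falling back to the current directory when `CLAUDE_SKILL_DIR` is unset is an assumption.

```python
import os
from typing import List, Optional, Sequence


def build_command(prompt: str, output: str, *,
                  model: Optional[str] = None,
                  input_image: Optional[str] = None,
                  last_frame: Optional[str] = None,
                  references: Sequence[str] = (),
                  aspect_ratio: Optional[str] = None,
                  resolution: Optional[str] = None,
                  duration: Optional[int] = None,
                  keep_audio: bool = False,
                  timeout: Optional[int] = None) -> List[str]:
    """Build an argv list for generate_video.py from the documented flags."""
    if len(references) > 3:
        raise ValueError("--reference accepts at most 3 images")
    # Assumption: fall back to the current directory if the variable is unset.
    skill_dir = os.environ.get("CLAUDE_SKILL_DIR", ".")
    script = os.path.join(skill_dir, "scripts", "generate_video.py")
    cmd = ["python3", script, prompt, "-o", output]
    optional = [
        ("-m", model),
        ("--input", input_image),
        ("--last-frame", last_frame),
        ("--aspect-ratio", aspect_ratio),
        ("--resolution", resolution),
        ("--duration", duration),
        ("--timeout", timeout),
    ]
    # Flags left as None are omitted so the script's own defaults apply.
    for flag, value in optional:
        if value is not None:
            cmd += [flag, str(value)]
    for ref in references:
        cmd += ["--reference", ref]
    if keep_audio:
        cmd.append("--audio")
    return cmd
```

The resulting list can be passed to `subprocess.run(cmd, check=True)`, which avoids shell quoting problems with prompts that contain quotes or spaces.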
export GEMINI_API_KEY='your-key-here'
Get a key at https://aistudio.google.com/apikey.
Or add to a .env file in your project:
GEMINI_API_KEY=your-key-here
Run /nano-banana:setup for guided configuration.
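The three key sources described above (the `--api-key` flag, the environment variable, and a `.env` file) imply a lookup order. The sketch below shows one plausible order. `resolve_api_key` and `read_dotenv_key` are hypothetical helpers, and preferring the environment variable over the `.env` file is an assumption; only the `--api-key` override is stated in the flag table.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def read_dotenv_key(path: Path, name: str = "GEMINI_API_KEY") -> Optional[str]:
    """Return the value of `name` from a simple KEY=value file, if present."""
    if not path.is_file():
        return None
    for line in path.read_text(encoding="utf-8").splitlines():
        line = line.strip()
        # Skip blanks, comments, and lines without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        if key.strip() == name:
            # Drop optional surrounding quotes; treat an empty value as missing.
            return value.strip().strip("'\"") or None
    return None


def resolve_api_key(cli_key: Optional[str] = None,
                    environ: Optional[Mapping[str, str]] = None,
                    dotenv_path: Path = Path(".env")) -> Optional[str]:
    """Pick a key: CLI override first, then environment, then .env file."""
    env = os.environ if environ is None else environ
    return cli_key or env.get("GEMINI_API_KEY") or read_dotenv_key(dotenv_path)
```

A `None` result means no key was found in any of the three places, which is the point at which a caller would report the missing-key error.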
# Vague — model guesses camera movement
"A mountain"
# Specific — clear motion intent
"A slow aerial drone shot pulling back from a snowy mountain peak, revealing the valley below, golden hour lighting"
Use terms like: pan left/right, zoom in/out, dolly, orbit, tracking shot, crane shot, steady cam, timelapse.
# Good: describes what should move
python3 ${CLAUDE_SKILL_DIR}/scripts/generate_video.py "The clouds slowly drift across the sky while the water gently ripples" \
--input landscape.png -o animated.mp4
Set the GEMINI_API_KEY environment variable or create a .env file. Run /nano-banana:setup for help.
Check the resolution/duration constraint table above. Common issue: using --resolution 1080p with --duration 4.
Default timeout is 360 seconds (6 minutes). Video generation typically takes 1-4 minutes. Increase with --timeout 600 for complex prompts.
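The timeout behaves like a deadline on a polling loop: the job is checked repeatedly until it finishes or the time budget runs out. The sketch below illustrates that pattern in isolation. `wait_until_done` is a hypothetical helper, the 10-second polling interval is an assumption, and `is_done` stands in for whatever status check the script performs.

```python
import time
from typing import Callable


def wait_until_done(is_done: Callable[[], bool],
                    timeout: float = 360.0,
                    interval: float = 10.0,
                    clock: Callable[[], float] = time.monotonic,
                    sleep: Callable[[float], None] = time.sleep) -> bool:
    """Poll `is_done` until it returns True or `timeout` seconds elapse."""
    deadline = clock() + timeout
    while True:
        if is_done():
            return True
        remaining = deadline - clock()
        if remaining <= 0:
            return False
        # Never sleep past the deadline.
        sleep(min(interval, remaining))
```

Passing `clock` and `sleep` as parameters keeps the loop testable without real waiting. A `False` return corresponds to the timeout case that `--timeout 600` is meant to avoid.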
Install ffmpeg for automatic audio stripping: brew install ffmpeg. Without it, videos are saved with AI-generated audio.
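Stripping audio with ffmpeg comes down to copying the video stream while dropping audio streams. The sketch below builds such a command and returns `None` when ffmpeg is unavailable, mirroring the fallback described above. `strip_audio_command` is a hypothetical helper, and the exact ffmpeg arguments the script uses are not documented here, so treat these as an assumption.

```python
import shutil
from typing import List, Optional


def strip_audio_command(src: str, dst: str,
                        ffmpeg: Optional[str] = None) -> Optional[List[str]]:
    """Build an ffmpeg command that copies video and drops audio.

    Returns None when ffmpeg cannot be found, in which case a caller
    would keep the original file with its audio track.
    """
    exe = ffmpeg or shutil.which("ffmpeg")
    if exe is None:
        return None
    # -c copy avoids re-encoding, -an drops audio streams, -y overwrites dst.
    return [exe, "-y", "-i", src, "-c", "copy", "-an", dst]
```

Because the streams are copied rather than re-encoded, this step is fast and does not change video quality.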
Try a simpler prompt, or switch to the standard quality model (-m veo-3.1-generate-preview). Some complex scenes may not render well at lower quality settings.
| Aspect | video | image | diagram |
|---|---|---|---|
| Output | .mp4 video | .png image | .png diagram |
| Models | Veo 3.1 | Gemini Flash/Pro | Gemini Pro |
| Duration | 4-8 seconds | Instant | 1-2 passes |
| Editing | Extend existing | Edit existing | Edit existing |
| Best For | Animation, motion | Photos, art | Architecture, flowcharts |
npx claudepluginhub flight505/nano-banana