Skill

seedance-video

Use when creating, prompting, planning, or automating AI-generated videos with Seedance 2.0 / Jimeng / BytePlus ModelArk (doubao-seedance models): text-to-video, image-to-video, reference-to-video motion/camera/style transfer, audio-video, multi-shot and time-segmented prompts, video extension past the per-generation length cap, ModelArk API task submission and polling, preparing a source clip so it isn't rejected for resolution/duration, concatenating clips with one continuous soundtrack, and saving run metadata. Use this whenever someone wants to turn photos or an existing clip into a moving short, vertical, social, Instagram/Reels, marketing, or ad video, or to make, prompt, storyboard, or fix an AI-generated video — even if they don't say "Seedance" or "AI video" by name. The deliverable is always a moving video: a request for a single still image or product photo with no motion is image generation, not this skill. Also not for editing existing footage in an editor (e.g. DaVinci/Premiere cuts or color grading), sourcing stock clips, or web/page design.

Invocation

How this skill is triggered — by the user, by Claude, or both

Slash command

/marketing:seedance-video

User invocable

Model invocable

Inline context

Default effort

Context Preview

The summary Claude sees in its skill listing — used to decide when to auto-load this skill

Use Seedance 2.0 like a short-form video production system: define the deliverable, bind each reference asset to one clear role, write a director-style prompt, submit or prepare the run, and preserve reproducibility metadata.

Supporting Files

agents/openai.yaml
references/seedance-2-reference.md
scripts/concat_videos.py
scripts/prepare_reference_video.py

SKILL.md

170 lines · ~2.7k tokens

Stats

Language: Python

Stars: 0

Maintenance: Good

Last Commit: May 16, 2026

Seedance Video

Overview

When doing implementation, API work, or parameter selection, read references/seedance-2-reference.md. It covers Jimeng/Seedance portal choices, prompt patterns, official ModelArk API fields, current model limits, and operational notes.

Workflow

1. Clarify the output: duration, aspect ratio, platform, style, whether audio is needed, and whether generation should be prompt-only or reference-guided.
2. Inventory inputs: text concept, images for identity/style/first frame/last frame, video for motion/camera/edits, audio for rhythm/mood. Use only assets the user has rights to use.
3. Build the prompt with this structure:
   - Subject + action + camera + style + constraints.
   - Bind references explicitly: @image1/@图片1 for identity, @video1/@视频1 for motion/camera, @audio1/@音频1 for rhythm.
   - State duration, aspect ratio, resolution, audio expectation, and any negative constraints.
4. Prefer simple reference roles. If a result drifts, reduce competing references before adding more prompt text. When using Jimeng's Chinese UI, use the same @图片1 / @视频1 object names shown by the uploader.
5. For ModelArk generation, use ARK_API_KEY from the environment, public asset URLs or asset IDs, and the official asynchronous task flow. Never expose API keys in code, logs, or committed files.
6. Save run artifacts using the repo convention when working in my-ai-workspace: video-generation/outputs/YYYY-MM-DD-<short-name>/run.json, with generated media in the same ignored folder.
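
The run-artifact convention in the last step could be sketched as a small helper. This is a minimal sketch, not part of the bundled scripts: `run_dir` and `save_run` are hypothetical names, and the record keys are whatever the caller passes in. The API key must never be placed in the record.

```python
import json
from datetime import date, datetime, timezone
from pathlib import Path
from typing import Optional


def run_dir(root: Path, short_name: str, day: date) -> Path:
    """Return <root>/video-generation/outputs/YYYY-MM-DD-<short-name>."""
    return root / "video-generation" / "outputs" / f"{day.isoformat()}-{short_name}"


def save_run(root: Path, short_name: str, record: dict,
             day: Optional[date] = None) -> Path:
    """Write the record as run.json in the dated run folder and return its path."""
    folder = run_dir(root, short_name, day or date.today())
    folder.mkdir(parents=True, exist_ok=True)
    # A UTC timestamp is added unless the caller already supplied one.
    payload = {"timestamp": datetime.now(timezone.utc).isoformat(), **record}
    path = folder / "run.json"
    path.write_text(json.dumps(payload, indent=2, ensure_ascii=False),
                    encoding="utf-8")
    return path
```

Generated media would then be saved into the same folder that `run_dir` returns, so the video and its metadata stay together.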

Prompt Pattern

Use one excellent, specific prompt instead of a long mood board:

@图片1 defines the product shape and material. A premium black travel mug rotates slowly on a wet stone counter while warm morning light moves across brushed metal. Macro close-up, 85mm lens, shallow depth of field, slow dolly-in, realistic condensation, no text overlays, no extra logos. Generate an 8s 9:16 product ad, 720p, natural ambient sound only.
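
A prompt like the one above can be assembled from the same parts the workflow names (references, subject and action, camera, style, constraints, output settings). The following is a minimal sketch under that assumption; `PromptSpec` and `build_prompt` are hypothetical names, and the exact sentence order is a choice made here, not a Seedance requirement.

```python
from dataclasses import dataclass, field


@dataclass
class PromptSpec:
    subject_action: str
    camera: str
    style: str
    constraints: list[str] = field(default_factory=list)
    # Maps a reference tag such as "@图片1" to the single role it plays.
    references: dict[str, str] = field(default_factory=dict)
    duration_s: int = 8
    ratio: str = "9:16"
    resolution: str = "720p"
    audio: str = "natural ambient sound only"


def build_prompt(spec: PromptSpec) -> str:
    """Join reference roles, the shot description, and output settings."""
    sentences = [f"{tag} {role.rstrip('.')}." for tag, role in spec.references.items()]
    sentences.append(spec.subject_action.rstrip(".") + ".")
    details = [spec.camera, spec.style, *spec.constraints]
    sentences.append(", ".join(d.strip() for d in details if d.strip()) + ".")
    sentences.append(
        f"Duration {spec.duration_s}s, aspect ratio {spec.ratio}, "
        f"{spec.resolution}, audio: {spec.audio}."
    )
    return " ".join(sentences)
```

Because `references` is keyed by tag, each reference can carry only one role, which matches the advice to keep reference roles simple.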

For multi-shot ideas, reach for these in order. The ordering is about audio, which is the usual thing that breaks:

1. One time-segmented generation (0-5s: ... 5-10s: ... 10-15s: ..., up to 15s). One render carries one continuous native soundtrack, so there is nothing to align in post. This is the default for a coherent short.
2. Extension chaining (Extend @视频1 by 15s ...) when the piece is longer than one generation. The model continues the same world and the same audio bed, so the track stays unbroken across the seam.
3. Split into separate generations + stitch only when beats need different settings or exceed what extension can hold. This is the fallback, not the default, because each clip gets its own audio that restarts at every join. You then have to paper over it with one external track (see Post-production --bgm).

Asking one generation to improvise a whole story arc still fails; the win here is segmenting the prompt, not relying on the model's freeform narrative.
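
The time-segmented form can be generated and validated before submission. This is a minimal sketch: `segmented_prompt` is a hypothetical helper, and it only enforces what the text states (contiguous segments starting at 0s, with a 15s ceiling for one generation).

```python
MAX_SINGLE_GENERATION_S = 15  # per-generation ceiling stated in the text


def segmented_prompt(segments: list[tuple[int, int, str]]) -> str:
    """Format (start_s, end_s, description) tuples as '0-5s: ...' lines.

    Segments must start at 0, be contiguous, and end at or before the cap.
    """
    if not segments:
        raise ValueError("at least one segment is required")
    expected_start = 0
    lines = []
    for start, end, text in segments:
        if start != expected_start or end <= start:
            raise ValueError(f"segment {start}-{end}s is not contiguous")
        lines.append(f"{start}-{end}s: {text.strip()}")
        expected_start = end
    if expected_start > MAX_SINGLE_GENERATION_S:
        raise ValueError(
            f"total {expected_start}s exceeds {MAX_SINGLE_GENERATION_S}s; "
            "use extension chaining instead"
        )
    return "\n".join(lines)
```

A piece that fails the length check is a candidate for extension chaining rather than for a longer single prompt.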

For reference imitation, be explicit about what to copy:

Replace the dancer in @视频1 with the character from @图片1. Completely follow @视频1's dance action, camera movement, transition rhythm, and facial timing. Keep it one continuous shot; do not cut away.

API Run Shape

When using BytePlus ModelArk, keep the run declarative:

{
  "model": "dreamina-seedance-2-0-260128",
  "content": [
    { "type": "text", "text": "<director-style prompt>" },
    {
      "type": "image_url",
      "image_url": { "url": "<public_image_url_or_asset_id>" },
      "role": "reference_image"
    },
    {
      "type": "video_url",
      "video_url": { "url": "<public_video_url_or_asset_id>" },
      "role": "reference_video"
    }
  ],
  "resolution": "720p",
  "ratio": "9:16",
  "duration": 8,
  "generate_audio": false,
  "watermark": false
}

Create tasks with POST /api/v3/contents/generations/tasks, then poll GET /api/v3/contents/generations/tasks/{id} or use callback_url. Treat succeeded as success and failed / expired as terminal failures. Download content.video_url promptly. Record provider, model, prompt, parameters, seed, inputs, task id, status, output URL/path, and timestamp in run.json.
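
The polling half of that flow might look like the sketch below. Several details are assumptions and not confirmed by this document: the base URL is supplied by the caller, the `Authorization: Bearer` header scheme is assumed, and any status other than succeeded, failed, or expired (for example a hypothetical "running") is treated as still in progress. `make_fetcher` and `wait_for_task` are hypothetical names.

```python
import json
import time
import urllib.request
from typing import Callable

TERMINAL_FAILURES = {"failed", "expired"}


def make_fetcher(base_url: str, api_key: str) -> Callable[[str], dict]:
    """Return a function that GETs one task (base URL and header are assumptions)."""
    def fetch(task_id: str) -> dict:
        req = urllib.request.Request(
            f"{base_url}/api/v3/contents/generations/tasks/{task_id}",
            headers={"Authorization": f"Bearer {api_key}"},
        )
        with urllib.request.urlopen(req, timeout=30) as resp:
            return json.load(resp)
    return fetch


def wait_for_task(fetch: Callable[[str], dict], task_id: str,
                  interval_s: float = 5.0, timeout_s: float = 600.0,
                  sleep: Callable[[float], None] = time.sleep) -> str:
    """Poll until the task succeeds and return content.video_url.

    Raises RuntimeError on failed/expired and TimeoutError after timeout_s.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        task = fetch(task_id)
        status = task.get("status")
        if status == "succeeded":
            return task["content"]["video_url"]
        if status in TERMINAL_FAILURES:
            raise RuntimeError(f"task {task_id} ended with status {status}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"task {task_id} still {status} after {timeout_s}s")
        sleep(interval_s)
```

In real use the key would come from `os.environ["ARK_API_KEY"]` and be passed to `make_fetcher`, never written into run.json or logs. Injecting `fetch` and `sleep` keeps the loop testable without network access.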

Preparing reference videos

Real source footage almost never meets Seedance's reference-video rules as-is. A phone screen recording, an HD/4K export, or a portrait clip whose rotation lives only in metadata will be rejected with "The uploaded video resolution is not supported". Seedance wants a reference video that is <=15s, has its pixel count inside the accepted window (ModelArk allows up to 2,086,876 px, but the Jimeng portal is stricter at ~927,408; stay under the stricter one so one file works on every portal), has even width/height, FPS 24-60, H.264/H.265, and rotation baked into the pixels rather than carried as a display matrix.

Don't hand-transcode this each time — use the bundled script. It trims, downscales into the pixel window, de-rotates, forces even dims, and prints a pass/fail readiness check. Like concat_videos.py it self-bootstraps a static ffmpeg, so no system ffmpeg is needed:

python skills/seedance-video/scripts/prepare_reference_video.py \
  SOURCE.mov --output ref.mp4 --start 1.0 --duration 14

--start skips lead-in (e.g. a screen recording that began before playback); --duration defaults to min(source, 14)s and is hard-capped at 15. The output is ready to upload as @视频1 / a reference_video role. Remember the total across up to 3 reference videos must also be <=15s.
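
The rules above can be expressed as a small readiness check. This is an illustrative sketch, not the logic of prepare_reference_video.py: `ClipInfo`, `fit_dimensions`, and `readiness_problems` are hypothetical names, the codec strings assume ffprobe-style naming ("h264", "hevc"), and the lower bound of the pixel window is not checked because this document does not state it.

```python
import math
from dataclasses import dataclass

MAX_PIXELS = 927_408      # stricter Jimeng portal ceiling from the text
MAX_DURATION_S = 15.0
FPS_RANGE = (24.0, 60.0)
CODECS = {"h264", "hevc"}  # assumed ffprobe-style names for H.264/H.265


@dataclass
class ClipInfo:
    width: int
    height: int
    duration_s: float
    fps: float
    codec: str
    rotation_deg: int = 0  # rotation carried in metadata, 0 if baked in


def fit_dimensions(width: int, height: int, max_pixels: int = MAX_PIXELS) -> tuple[int, int]:
    """Scale down (never up) to fit max_pixels, keeping aspect and even dims."""
    scale = min(1.0, math.sqrt(max_pixels / (width * height)))
    new_w = max(2, int(width * scale) // 2 * 2)
    new_h = max(2, int(height * scale) // 2 * 2)
    return new_w, new_h


def readiness_problems(clip: ClipInfo) -> list[str]:
    """Return a list of rule violations; an empty list means the clip passes."""
    problems = []
    if clip.duration_s > MAX_DURATION_S:
        problems.append("duration over 15s")
    if clip.width * clip.height > MAX_PIXELS:
        problems.append("pixel count over ceiling")
    if clip.width % 2 or clip.height % 2:
        problems.append("odd width or height")
    if not FPS_RANGE[0] <= clip.fps <= FPS_RANGE[1]:
        problems.append("fps outside 24-60")
    if clip.codec not in CODECS:
        problems.append("codec is not H.264/H.265")
    if clip.rotation_deg % 360 != 0:
        problems.append("rotation stored in metadata")
    return problems
```

Rounding down to even values keeps the scaled frame at or under the ceiling, at the cost of a pixel or two of aspect drift.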

Post-production: concatenating clips

Seedance is strongest on single shots, so when time-segmented prompts and extension chaining cannot carry a longer sequence, generate several short clips with the same settings (resolution, ratio, fps, audio on/off) and join them afterward. Use the bundled script; it needs no system ffmpeg because it self-bootstraps a static ffmpeg into a gitignored .venv on first run:

python skills/seedance-video/scripts/concat_videos.py \
  --output video-generation/outputs/YYYY-MM-DD-<short-name>/concat.mp4 \
  --run-json clip1.mp4 clip2.mp4 [clip3.mp4 ...]

When the clips share codec/size/fps/audio (true for same-settings Seedance runs), it concatenates by stream copy, which is instant and lossless. If they differ, it normalizes and re-encodes so the join still succeeds rather than failing. --run-json writes the repo-convention run.json next to the output, recording the method, inputs, and per-clip probe data.

This is why keeping generation settings consistent across shots matters: it keeps the final stitch lossless. If you must mix settings, expect a re-encode.
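
The stream-copy decision comes down to comparing a few probed properties across clips. The sketch below illustrates that comparison only; it is not the code in concat_videos.py, and the probe dictionary keys (`vcodec`, `width`, `height`, `fps`, `acodec`) are hypothetical.

```python
# Hypothetical probe fields; acodec is None for a clip with no audio stream.
COPY_KEYS = ("vcodec", "width", "height", "fps", "acodec")


def mismatched_keys(probes: list[dict]) -> list[str]:
    """Return the properties that differ between clips, in COPY_KEYS order."""
    if not probes:
        raise ValueError("at least one clip is required")
    return [k for k in COPY_KEYS if len({p.get(k) for p in probes}) > 1]


def can_stream_copy(probes: list[dict]) -> bool:
    """True when every clip shares codec, size, fps, and audio codec."""
    return not mismatched_keys(probes)
```

Listing the mismatched properties makes it easy to see which generation setting to align before regenerating a shot.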

Continuous BGM across shots

Seedance generates audio per clip, so each shot carries its own music starting from zero. A plain concat therefore makes the soundtrack restart at every join — the single most common multi-shot complaint. Fix it by laying one continuous track over the whole sequence with --bgm:

python skills/seedance-video/scripts/concat_videos.py \
  --output video-generation/outputs/YYYY-MM-DD-<short-name>/final.mp4 \
  --bgm full-track.mp3 --bgm-offset 0 --bgm-fade 0.5 --run-json \
  clip1.mp4 clip2.mp4 [clip3.mp4 ...]

--bgm accepts any audio or video file (a music file, or a screen recording of the track playing) — only its audio is used, replacing the per-clip audio. The video stays stream-copied (lossless); only audio is re-encoded. --bgm-offset SECONDS skips lead-in in the source so the music lines up with the first frame — start at 0 and nudge up if the recording began before playback. --bgm-fade SECONDS (default 0.5) fades the end out so truncating a longer track doesn't hard-cut. Audio sync is judged by ear: generate, watch, adjust --bgm-offset, repeat.
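
The timing behind --bgm-offset and --bgm-fade is simple arithmetic, sketched here for planning purposes. `bgm_plan` is a hypothetical helper, not the script's implementation, and how the script behaves when the track is shorter than the video is not stated in this document, so the sketch only reports that case.

```python
def bgm_plan(clip_durations: list[float], bgm_duration: float,
             offset: float = 0.0, fade: float = 0.5) -> dict:
    """Compute where the fade-out starts on the joined video timeline.

    offset skips lead-in in the music source; fade is the fade-out length.
    """
    if offset < 0 or offset >= bgm_duration:
        raise ValueError("offset must fall inside the music source")
    video_total = sum(clip_durations)
    available = bgm_duration - offset          # music left after the lead-in
    audio_length = min(video_total, available)
    fade = min(fade, audio_length)
    return {
        "video_total": video_total,
        "audio_length": audio_length,
        "fade_start": audio_length - fade,
        "covers_video": available >= video_total,
    }
```

For three clips of 8s, 8s, and 5s with a 0.5s fade, the fade-out begins at 20.5s on the video timeline, provided the track has at least 21s of audio after the offset.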

Common Mistakes

Mistake	Fix
Vague prompt like "make it cinematic"	Specify subject, motion, camera, style, and constraints.
Many references all doing the same job	Assign one role per reference: identity, motion, rhythm, style.
Asking for a full narrative in one clip	Segment the prompt by time within one generation, or chain extensions; as a fallback, generate short shots and join them with `scripts/concat_videos.py`.
Generating shots with mismatched settings, then stitching	Keep resolution/ratio/fps/audio identical across shots so concat stays lossless (stream copy).
BGM restarts at every cut in a multi-shot video	Prefer one time-segmented generation or extension chaining (continuous native audio). If you already have separate clips, lay one continuous track with `concat_videos.py --bgm`, aligning with `--bgm-offset`.
Uploading a raw phone / HD / 4K / portrait clip as a reference video	Seedance rejects it ("resolution not supported") — it's over the pixel/duration ceiling and may be rotated only in metadata. Run `scripts/prepare_reference_video.py` first.
Cropping out the "AI generated" watermark afterward	Set `watermark: false` at generation. Post-hoc cropping is cosmetic, loses framing, and an embedded/labeling identifier can remain — regenerating clean is the only real fix.
Using private or local media URLs in ModelArk	Use public URLs, BytePlus TOS public-read objects, Base64 within request limits, or asset IDs.
Waiting too long to save generated output	ModelArk task data is retained for a limited window; download and persist the MP4.
Committing generated videos	Keep videos under ignored output folders; do not force-add heavy media.
Using celebrity, copyrighted character, or private-person likeness without rights	Ask for a rights-safe alternative before generating.
