From renoise
Analyzes images/videos with Gemini 3.1 Pro: product photo breakdowns to JSON, video script/dialogue extraction with timestamps, style/color/composition extraction, OCR, scene descriptions. Auto-handles large files.
Slash command: /renoise:gemini-gen
Gemini 3.1 Pro via Renoise gateway. Zero npm dependencies, native `fetch` only.
Handles files of any size automatically — small files are sent inline, large files (>20MB) are uploaded first.
```bash
# Analyze a product photo
node ${CLAUDE_SKILL_DIR}/scripts/gemini.mjs --file photo.jpg --mode product

# Extract a video script with timestamps
node ${CLAUDE_SKILL_DIR}/scripts/gemini.mjs --file clip.mp4 --mode video-script

# Extract visual style keywords from a reference
node ${CLAUDE_SKILL_DIR}/scripts/gemini.mjs --file reference.jpg --mode style

# Free-form analysis
node ${CLAUDE_SKILL_DIR}/scripts/gemini.mjs --file photo.jpg "Describe this image in detail"
```
Preset modes auto-select optimal resolution and output format. Use --mode for common tasks instead of writing custom prompts.
`--mode product`: Analyzes product photos and returns structured JSON with type, color, material, selling points, brand tone, and scene suggestions. Auto-selects high resolution for maximum detail.

```bash
node ${CLAUDE_SKILL_DIR}/scripts/gemini.mjs --file product.jpg --mode product
```

Output:

```json
{
  "type": "resistance loop bands",
  "color": "Pink 10lb, Blue 15lb, Mint green 20lb",
  "material": "TPE elastic, matte finish",
  "selling_points": ["3 resistance levels", "foldable and portable", "pastel color scheme"],
  "brand_tone": "Youthful athletic, trendy fitness",
  "scene_suggestions": ["living room workout", "hotel room fitness", "outdoor park"]
}
```
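A consumer of the product-mode output will usually want to check the JSON before using it. The following is a minimal sketch, assuming the six field names shown in the sample output; `parseProductAnalysis` and its validation rules are hypothetical and not part of the skill.

```javascript
// Field names are taken from the sample output; the checks are an assumption
// about what a caller might want, not behavior of gemini.mjs itself.
const STRING_FIELDS = ["type", "color", "material", "brand_tone"];
const LIST_FIELDS = ["selling_points", "scene_suggestions"];

function parseProductAnalysis(rawText) {
  const data = JSON.parse(rawText); // throws SyntaxError on malformed JSON
  if (data === null || typeof data !== "object" || Array.isArray(data)) {
    throw new TypeError("expected a JSON object");
  }
  for (const key of STRING_FIELDS) {
    if (typeof data[key] !== "string") {
      throw new TypeError(`"${key}" should be a string`);
    }
  }
  for (const key of LIST_FIELDS) {
    if (!Array.isArray(data[key])) {
      throw new TypeError(`"${key}" should be an array`);
    }
  }
  return data;
}

// Usage with a shortened version of the sample output.
const sampleOutput = JSON.stringify({
  type: "resistance loop bands",
  color: "Pink 10lb, Blue 15lb, Mint green 20lb",
  material: "TPE elastic, matte finish",
  selling_points: ["3 resistance levels", "foldable and portable", "pastel color scheme"],
  brand_tone: "Youthful athletic, trendy fitness",
  scene_suggestions: ["living room workout", "hotel room fitness", "outdoor park"],
});
const product = parseProductAnalysis(sampleOutput);
console.log(product.selling_points.length); // 3
```

Failing loudly on a missing field keeps a malformed model response from propagating into later prompt-building steps.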
`--mode video-script`: Watches a video and outputs timestamped dialogue, scene descriptions, and camera movements. Auto-selects low resolution to reduce token consumption.

```bash
node ${CLAUDE_SKILL_DIR}/scripts/gemini.mjs --file clip.mp4 --mode video-script
```
`--mode style`: Extracts visual style keywords from a reference image or video: color palette, lighting, camera language, composition, and mood.

```bash
node ${CLAUDE_SKILL_DIR}/scripts/gemini.mjs --file reference.jpg --mode style
```
```bash
# Text only
node ${CLAUDE_SKILL_DIR}/scripts/gemini.mjs "Explain quantum computing"

# Analyze an image (high resolution for product detail)
node ${CLAUDE_SKILL_DIR}/scripts/gemini.mjs --file photo.jpg --resolution high "Describe this product"

# Analyze a video (low resolution to save tokens)
node ${CLAUDE_SKILL_DIR}/scripts/gemini.mjs --file clip.mp4 --resolution low "Summarize this clip"

# Multiple images
node ${CLAUDE_SKILL_DIR}/scripts/gemini.mjs --file a.jpg --file b.jpg "Compare these two"

# JSON output mode
node ${CLAUDE_SKILL_DIR}/scripts/gemini.mjs --json "Return a JSON object with name and age"
```
| Flag | Default | Description |
|---|---|---|
| `--file <path>` | none | Attach a local file (repeatable). Files >20MB are uploaded automatically |
| `--file-uri <uri>` | none | Attach an already uploaded file by URI (requires `--file-mime`) |
| `--file-mime <mime>` | none | MIME type for `--file-uri` |
| `--resolution <level>` | medium | low / medium / high / ultra_high |
| `--model <name>` | gemini-3.1-pro | Model name |
| `--temperature <n>` | 1.0 | Sampling temperature |
| `--max-tokens <n>` | 8192 | Max output tokens |
| `--json` | off | Request JSON response format |
| `--mode <name>` | none | Preset analysis mode: product, video-script, style |
mediaResolution controls token allocation per image/frame:
| Level | Image Tokens | Video Frame Tokens | Best For |
|---|---|---|---|
| low | 280 | 70 | Bulk processing, video analysis |
| medium | 560 | 140 | General use (default) |
| high | 840 | 210 | Product photos, fine text |
| ultra_high | 1120 | 280 | Extreme detail |
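The per-unit costs in the table make it straightforward to estimate the media token budget before choosing a level. A rough sketch follows; `estimateMediaTokens` is a hypothetical helper, and the number of video frames the model actually samples is not stated here, so the caller supplies it as an assumption.

```javascript
// Per-unit token costs copied from the resolution table.
const TOKENS_PER_UNIT = {
  low: { image: 280, videoFrame: 70 },
  medium: { image: 560, videoFrame: 140 },
  high: { image: 840, videoFrame: 210 },
  ultra_high: { image: 1120, videoFrame: 280 },
};

// Estimate media tokens only; prompt and output tokens are not included.
function estimateMediaTokens(level, { images = 0, videoFrames = 0 } = {}) {
  if (!Object.prototype.hasOwnProperty.call(TOKENS_PER_UNIT, level)) {
    throw new RangeError(`unknown resolution level: ${level}`);
  }
  const row = TOKENS_PER_UNIT[level];
  return images * row.image + videoFrames * row.videoFrame;
}

console.log(estimateMediaTokens("high", { images: 3 }));       // 2520
console.log(estimateMediaTokens("low", { videoFrames: 60 }));  // 4200
console.log(estimateMediaTokens("high", { videoFrames: 60 })); // 12600
```

For the same 60 frames, `high` costs three times as much as `low`, which is consistent with the video-script preset defaulting to low resolution.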
Files larger than 20MB are automatically uploaded before analysis — no manual steps needed. The upload goes through the Renoise file gateway and returns a temporary URL (valid for 1 hour).
```bash
# This just works: the script detects file size and auto-uploads if needed
node ${CLAUDE_SKILL_DIR}/scripts/gemini.mjs --file large-video.mp4 "Analyze this video"
```
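The size check behind this behavior can be pictured as a single threshold comparison. The sketch below is illustrative only: `chooseTransport` and `INLINE_LIMIT_BYTES` are hypothetical names, and it assumes "20MB" means 20 MiB, which the actual script may define differently.

```javascript
// Assumption: the 20MB limit is interpreted as 20 * 1024 * 1024 bytes.
const INLINE_LIMIT_BYTES = 20 * 1024 * 1024;

// Return "inline" for files sent as base64 in the request body,
// or "upload" for files that would go through the file gateway first.
function chooseTransport(sizeBytes) {
  if (!Number.isFinite(sizeBytes) || sizeBytes < 0) {
    throw new RangeError("sizeBytes must be a non-negative number");
  }
  return sizeBytes > INLINE_LIMIT_BYTES ? "upload" : "inline";
}

console.log(chooseTransport(5 * 1024 * 1024));   // inline
console.log(chooseTransport(150 * 1024 * 1024)); // upload
```

In a real script the size would typically come from `fs.statSync(path).size`, so the decision can be made without reading the file into memory.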
For manual uploads (e.g. reusing the same large file across multiple calls):
```bash
# Upload once
FILE_URL=$(node ${CLAUDE_PLUGIN_ROOT}/skills/renoise-gen/scripts/upload.mjs large-video.mp4)

# Use the URL in multiple calls
node ${CLAUDE_SKILL_DIR}/scripts/gemini.mjs --file-uri "$FILE_URL" --file-mime video/mp4 "Summarize"
node ${CLAUDE_SKILL_DIR}/scripts/gemini.mjs --file-uri "$FILE_URL" --file-mime video/mp4 "Extract dialogue"
```
| Use gemini-gen for | Use renoise-gen for |
|---|---|
| Analyzing product photos | Generating images |
| Understanding video content | Generating videos |
| Extracting scripts from video | Text-to-video / image-to-video |
| Comparing visual assets | Product design sheets |
| OCR / text extraction | Scene backgrounds |
| Describing scenes for prompts | — |
```
POST https://renoise.ai/api/public/llm/proxy/v1beta/models/{model}:generateContent?key={RENOISE_API_KEY}
```

```json
{
  "contents": [
    {
      "role": "user",
      "parts": [
        { "inlineData": { "mimeType": "image/jpeg", "data": "<base64>" }, "mediaResolution": { "level": "media_resolution_high" } },
        { "text": "Describe this image" }
      ]
    }
  ],
  "generationConfig": {
    "temperature": 1.0,
    "maxOutputTokens": 8192
  }
}
```
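For callers who want to hit the endpoint directly with native `fetch`, the request above could be assembled roughly as follows. This is a sketch under stated assumptions: `buildGenerateContentRequest` is a hypothetical helper, the `Content-Type` header is assumed rather than documented here, and the `media_resolution_<level>` naming is extrapolated from the single `media_resolution_high` value in the example.

```javascript
// Build the URL and fetch options for one image plus a text prompt.
// Assumption: levels other than "high" follow the same "media_resolution_<level>" pattern.
function buildGenerateContentRequest({
  apiKey,
  prompt,
  imageBase64,
  mimeType,
  model = "gemini-3.1-pro",
  resolution = "medium",
}) {
  const url =
    "https://renoise.ai/api/public/llm/proxy/v1beta/models/" +
    `${encodeURIComponent(model)}:generateContent?key=${encodeURIComponent(apiKey)}`;

  const body = {
    contents: [
      {
        role: "user",
        parts: [
          {
            inlineData: { mimeType, data: imageBase64 },
            mediaResolution: { level: `media_resolution_${resolution}` },
          },
          { text: prompt },
        ],
      },
    ],
    generationConfig: { temperature: 1.0, maxOutputTokens: 8192 },
  };

  return {
    url,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" }, // assumed header
      body: JSON.stringify(body),
    },
  };
}

// Usage (performs a network call, so it is left commented out):
// const { url, init } = buildGenerateContentRequest({
//   apiKey: process.env.RENOISE_API_KEY,
//   prompt: "Describe this image",
//   imageBase64: "<base64>",
//   mimeType: "image/jpeg",
//   resolution: "high",
// });
// const response = await fetch(url, init);
// console.log(await response.json());
```

Separating request construction from the `fetch` call keeps the payload easy to inspect and test without network access.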
| Extension | MIME Type | Max Inline |
|---|---|---|
| .jpg/.jpeg | image/jpeg | 20MB |
| .png | image/png | 20MB |
| .webp | image/webp | 20MB |
| .gif | image/gif | 20MB |
| .mp4 | video/mp4 | 20MB |
| .mov | video/quicktime | 20MB |
| .webm | video/webm | 20MB |
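The extension table maps directly onto a lookup. A minimal sketch, assuming case-insensitive matching and an error for anything outside the table; `mimeTypeFor` is a hypothetical name, and the actual script may handle unknown extensions differently.

```javascript
// Extension-to-MIME mapping copied from the table above.
const MIME_BY_EXTENSION = {
  ".jpg": "image/jpeg",
  ".jpeg": "image/jpeg",
  ".png": "image/png",
  ".webp": "image/webp",
  ".gif": "image/gif",
  ".mp4": "video/mp4",
  ".mov": "video/quicktime",
  ".webm": "video/webm",
};

// Resolve a MIME type from a file path; throws for unsupported extensions.
function mimeTypeFor(filePath) {
  const dot = filePath.lastIndexOf(".");
  const ext = dot === -1 ? "" : filePath.slice(dot).toLowerCase();
  if (!Object.prototype.hasOwnProperty.call(MIME_BY_EXTENSION, ext)) {
    throw new Error(`unsupported file extension: ${ext || "(none)"}`);
  }
  return MIME_BY_EXTENSION[ext];
}

console.log(mimeTypeFor("photo.jpg")); // image/jpeg
console.log(mimeTypeFor("clip.MOV"));  // video/quicktime
```

The same value is what `--file-mime` expects when a file is attached with `--file-uri`.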
Set the RENOISE_API_KEY environment variable. Get a key at: https://www.renoise.ai
npx claudepluginhub arcocodes/renoise-plugins-official --plugin renoise