From 003-jeremy-vertex-ai-media-master
Orchestrates Google Vertex AI multimodal operations: video analysis with Gemini 2.5, image generation with Imagen 4, audio generation with Lyria, and marketing campaign automation via the Python SDK.
How this skill is triggered — by the user, by Claude, or both
Slash command
/003-jeremy-vertex-ai-media-master:vertex-ai-media-master
This skill is limited to the following tools:
The summary Claude sees in its skill listing — used to decide when to auto-load this skill
Multimodal media operations on Google Cloud Vertex AI covering video understanding, audio generation, image creation, and marketing campaign automation. This skill orchestrates Gemini 2.5 Pro/Flash, Imagen 4, and Lyria models to process, analyze, and generate rich media assets.
google-cloud-aiplatform Python SDK installed (pip install google-cloud-aiplatform[vision,audio])
GOOGLE_CLOUD_PROJECT and GOOGLE_APPLICATION_CREDENTIALS environment variables set
roles/aiplatform.user permission
us-central1 recommended for model availability).
gs:// URIs) or provide local paths for smaller assets.

| Error | Cause | Solution |
|---|---|---|
| PermissionDenied on Vertex AI API | Service account lacks aiplatform.user role | Grant the required IAM role to the service account |
| ResourceExhausted / quota exceeded | Too many concurrent requests or token limit hit | Implement request batching; switch to Gemini 2.5 Flash for lower-cost operations |
| InvalidArgument on image generation | Prompt violates safety filters or unsupported aspect ratio | Revise the prompt to remove restricted content; use a supported aspect ratio (1:1, 16:9, 9:16) |
| Video processing timeout | Source video exceeds duration or resolution limits | Use low-resolution mode for videos over 2 hours; split longer videos into segments |
| Audio generation returns empty | Prompt too vague or duration parameter missing | Specify genre, tempo, mood, and an explicit duration in seconds |
| NotFound on model ID | Incorrect model name or model not available in region | Verify the model ID against current Vertex AI documentation; try us-central1 |
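The quota and aspect-ratio rows in the table can be handled on the client side, before and around each request. The following is a minimal sketch, not part of the skill itself: `check_aspect_ratio` and `call_with_backoff` are hypothetical helper names, the ratio set is only the three values listed in the table, and the retryable exception type is supplied by the caller (with the Google client libraries this would typically be `google.api_core.exceptions.ResourceExhausted`).

```python
import random
import time

# Aspect ratios named in the error table; the model may accept others.
SUPPORTED_ASPECT_RATIOS = {"1:1", "16:9", "9:16"}


def check_aspect_ratio(ratio: str) -> str:
    """Fail fast locally instead of waiting for InvalidArgument from the API."""
    if ratio not in SUPPORTED_ASPECT_RATIOS:
        raise ValueError(
            f"Unsupported aspect ratio {ratio!r}; "
            f"expected one of {sorted(SUPPORTED_ASPECT_RATIOS)}"
        )
    return ratio


def call_with_backoff(fn, retry_on, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call fn(), retrying with exponential backoff and jitter on retry_on.

    retry_on is an exception class or a tuple of classes, for example
    google.api_core.exceptions.ResourceExhausted (an assumption about the caller's setup).
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except retry_on:
            if attempt == max_attempts:
                raise  # out of attempts: surface the original error
            # 1x, 2x, 4x ... the base delay, plus up to 25% random jitter
            delay = base_delay * 2 ** (attempt - 1)
            sleep(delay + random.uniform(0, delay * 0.25))
```

Passing `sleep` as a parameter keeps the helper easy to exercise without real waiting; in normal use the default `time.sleep` applies.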
Example 1: Analyze a competitor video ad
gs://bucket/competitor-ad.mp4.
Example 2: Generate campaign assets from a product brief
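One way this example could begin is by turning the brief into one image request per channel. This is a sketch under stated assumptions, not the skill's actual workflow: the `ProductBrief` fields, the channel-to-ratio mapping, and the prompt template are all illustrative. The Imagen call is left as a comment because the model ID and SDK surface should be checked against current Vertex AI documentation.

```python
from dataclasses import dataclass


@dataclass
class ProductBrief:
    # Hypothetical brief structure; adapt to whatever the campaign provides.
    product: str
    audience: str
    tone: str


# Illustrative mapping from placement to an aspect ratio.
CHANNEL_RATIOS = {"feed_post": "1:1", "banner": "16:9", "story": "9:16"}


def build_image_requests(brief: ProductBrief) -> list[dict]:
    """Return one {channel, prompt, aspect_ratio} request per channel."""
    requests = []
    for channel, ratio in CHANNEL_RATIOS.items():
        prompt = (
            f"Marketing image of {brief.product} for {brief.audience}, "
            f"{brief.tone} tone, composed for a {ratio} {channel.replace('_', ' ')}"
        )
        requests.append({"channel": channel, "prompt": prompt, "aspect_ratio": ratio})
    return requests


if __name__ == "__main__":
    brief = ProductBrief("a reusable water bottle", "commuters", "upbeat")
    for req in build_image_requests(brief):
        print(req["channel"], req["aspect_ratio"])
    # Each request could then be sent to Imagen, e.g. (assumed SDK surface):
    # from vertexai.preview.vision_models import ImageGenerationModel
    # model = ImageGenerationModel.from_pretrained("<imagen-model-id>")
    # images = model.generate_images(prompt=req["prompt"],
    #                                number_of_images=1,
    #                                aspect_ratio=req["aspect_ratio"])
```

Keeping request construction separate from the API call makes the prompts reviewable before any quota is spent.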
Example 3: Repurpose a long-form video into short-form clips
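A possible shape for this example, sketched under assumptions: Gemini is asked to return highlight ranges as JSON, and a local helper validates them into clip boundaries. The prompt wording, the JSON schema, the 60-second cap, and the `gemini-2.5-flash` model ID are illustrative choices rather than part of the skill. The SDK import is deferred so the parsing logic can be used without the SDK or credentials.

```python
import json

HIGHLIGHT_PROMPT = (
    "List the most engaging moments in this video as a JSON array of objects "
    'with "start" and "end" in seconds and a short "title".'
)


def parse_clips(json_text: str, max_len_s: float = 60.0) -> list[dict]:
    """Validate model-proposed ranges and cap each clip at max_len_s seconds."""
    clips = []
    for item in json.loads(json_text):
        start, end = float(item["start"]), float(item["end"])
        if start < 0 or end <= start:
            continue  # drop malformed or empty ranges
        clips.append({
            "title": item.get("title", ""),
            "start": start,
            "end": min(end, start + max_len_s),
        })
    return sorted(clips, key=lambda c: c["start"])


def find_highlights(video_uri: str, project: str, location: str = "us-central1") -> list[dict]:
    """Ask Gemini for highlight ranges in a gs:// video (needs credentials)."""
    # Deferred import: requires google-cloud-aiplatform to be installed.
    import vertexai
    from vertexai.generative_models import GenerationConfig, GenerativeModel, Part

    vertexai.init(project=project, location=location)
    model = GenerativeModel("gemini-2.5-flash")  # assumed ID; verify in the docs
    response = model.generate_content(
        [Part.from_uri(video_uri, mime_type="video/mp4"), HIGHLIGHT_PROMPT],
        generation_config=GenerationConfig(response_mime_type="application/json"),
    )
    return parse_clips(response.text)
```

The returned ranges could then be handed to a video tool of choice to cut the actual clips; that step is outside this sketch.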
${CLAUDE_SKILL_DIR}/references/core-capabilities.md
npx claudepluginhub jeremylongshore/claude-code-plugins-plus-skills --plugin 003-jeremy-vertex-ai-media-master