From hamel-tools
Analyzes PDFs, images, videos, YouTube links, and documents using Google Gemini. Generates images from text prompts with Nano Banana Pro.
Slash command
/hamel-tools:gem
Use the `ai-gem` CLI tool for multimodal AI processing and image generation via Google's Gemini API.
# Text queries
ai-gem "Write a haiku about Python programming"
# Analyze documents
ai-gem "Summarize this document" document.pdf
# Analyze images
ai-gem "What's in this image?" photo.jpg
# Process YouTube videos
ai-gem "Create a 5-point summary" "https://youtu.be/VIDEO_ID"
# Compare multiple files
ai-gem "Compare these files" file1.pdf file2.png
# Web search
ai-gem "Current AI news" --search
# Generate images (uses Nano Banana Pro by default)
ai-gem --image "A cute robot reading a book in a cozy library"
ai-gem --image "A landscape at sunset" --aspect-ratio 16:9
ai-gem --image "A cat wearing a hat" -o cat.png
ai-gem --image "Edit this to add sunglasses" reference.jpg
# Use alternative image model
ai-gem --image "A blue triangle" -m gemini-2.5-flash-image
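The document-analysis form shown above (`ai-gem "<prompt>" <file>`) can be repeated over a directory. The following is a minimal sketch, not part of the tool itself: `summarize_pdfs` and the `GEM_CMD` override are hypothetical names introduced here, and the sketch assumes `ai-gem` prints its answer to standard output.

```shell
# GEM_CMD is a hypothetical override so the loop can be dry-run (e.g. GEM_CMD=echo).
GEM_CMD="${GEM_CMD:-ai-gem}"

# Hypothetical helper: run the "Summarize this document" prompt on every PDF in a directory.
summarize_pdfs() {
  dir="${1:-.}"
  for f in "$dir"/*.pdf; do
    # Skip the unexpanded pattern when the directory contains no PDFs.
    [ -e "$f" ] || continue
    echo "== $f"
    "$GEM_CMD" "Summarize this document" "$f"
  done
}
```

Calling `summarize_pdfs ./reports` issues one request per file. Setting `GEM_CMD=echo` first prints the commands' arguments instead of calling the API, which is a cheap way to check the file list.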
- --image / -i: Generate an image instead of text
- --output / -o: Output file path (auto-generated if omitted)
- --aspect-ratio / -a: Aspect ratio (1:1, 9:16, 16:9, etc.)
- --model / -m: Override model (default: nano-banana-pro-preview)

- GEMINI_API_KEY environment variable must be set
- hamel package must be installed: pip install hamel

npx claudepluginhub hamelsmu/hamel --plugin hamel-tools
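Both requirements (the `GEMINI_API_KEY` variable and the `hamel` package) can be checked before the first call. This is a minimal sketch under two assumptions: `check_gem_prereqs` is a hypothetical helper name, and installing `hamel` is what puts `ai-gem` on the `PATH`.

```shell
# Hypothetical helper: report which stated requirement is missing, if any.
check_gem_prereqs() {
  status=0
  if [ -z "${GEMINI_API_KEY:-}" ]; then
    echo "GEMINI_API_KEY is not set" >&2
    status=1
  fi
  # Assumption: `pip install hamel` provides the ai-gem executable.
  if ! command -v ai-gem >/dev/null 2>&1; then
    echo "ai-gem not found on PATH; try: pip install hamel" >&2
    status=1
  fi
  return "$status"
}
```

A typical use is `check_gem_prereqs && ai-gem "Summarize this document" document.pdf`, so the request is only sent when both checks pass.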