By krasnoperov
Consistent image generation with Gemini - Generate images from text, edit them with natural language, and create consistent image series with the reference sheet methodology.
AI image generation skill for Claude Code - Generate consistent game sprites, character sheets, and visual assets using Gemini's spatial understanding and reference sheet methodology.
Gemini can preserve visual features across different views, poses, and compositions when given specific image references. This skill leverages that capability through precise instructions for managing your assets.
See skills/gemini-images/SKILL.md for complete methodology and techniques.
This is a Claude Code skill. Install it from the marketplace:
/plugin marketplace add krasnoperov/claude-plugins
/plugin install gemini-images@krasnoperov-plugins
Once installed, use the /gemini-images skill in your conversations:
/gemini-images generate a pixel art character sheet with front, back, and side views
/gemini-images edit character.png "add armor and sword"
/gemini-images compose character.png background.png "place character in forest scene"
You can also use this package directly via npx:
export GEMINI_API_KEY="your-key-here"
# Generate from text
npx -y @krasnoperov/gemini-images@latest generate "pixel art tree" --output tree.png
# Transform image
npx -y @krasnoperov/gemini-images@latest edit tree.png "add glowing runes" --output tree-magic.png
# Combine references
npx -y @krasnoperov/gemini-images@latest compose hero.png sword.png "character holding sword" --output hero-armed.png
Get your API key: Google AI Studio

| Command | Operation |
|---|---|
| `generate "<prompt>"` | Text → Image |
| `edit <image> "<prompt>"` | Image + Instructions → Image |
| `compose <img1> <img2> ... "<prompt>"` | Images + Instructions → Image |

These three operations compose into any workflow you need.
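As a rough illustration of chaining the three operations, the sketch below wraps the CLI in a small shell function and runs generate, edit, and compose in sequence. The wrapper `gemini_images`, the `DRY_RUN` switch, the function `build_armed_hero`, and all prompts and file names are hypothetical; only the `npx` invocation and the three subcommands come from the usage shown above.

```shell
# Hypothetical wrapper (not part of the package): with DRY_RUN=1 it prints the
# command instead of running it, so the pipeline can be inspected without an API key.
gemini_images() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf 'npx -y @krasnoperov/gemini-images@latest'
    for arg in "$@"; do printf ' "%s"' "$arg"; done
    printf '\n'
  else
    npx -y @krasnoperov/gemini-images@latest "$@"
  fi
}

# Hypothetical workflow: prompts and file names are placeholders.
# Each step feeds its output file into a later step; && stops at the first failure.
build_armed_hero() {
  gemini_images generate "pixel art hero, front view" --output hero.png &&
  gemini_images generate "pixel art sword" --output sword.png &&
  gemini_images edit hero.png "add armor" --output hero-armored.png &&
  gemini_images compose hero-armored.png sword.png "character holding sword" --output hero-armed.png
}
```

Running `DRY_RUN=1 build_armed_hero` prints the four commands without calling the API; running `build_armed_hero` with `GEMINI_API_KEY` set would execute them.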
See the skills/gemini-images/examples/ directory.
Gemini has spatial understanding - it preserves visual features when given specific references.
npx -y @krasnoperov/gemini-images@latest generate \
"Character sheet: front view, back view, side view. Character: ranger, auburn hair, green jerkin." \
--output character_sheet.png
npx -y @krasnoperov/gemini-images@latest compose character_sheet.png accessories.png \
"Image 1: Character sheet
Image 2: Accessories
Character: From image 1, front-facing
Items: Backpack from image 2
Lighting: Soft natural light, top-left
Camera: Eye level, medium shot" \
--output scene.png
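The compose prompt above numbers each input image and then refers to those numbers in the instructions. A minimal sketch of building such a prompt in the shell follows; the helper `image_legend` is hypothetical and simply emits one "Image N: label" line per argument, in the same order the image files would be passed to compose.

```shell
# Hypothetical helper: prints "Image N: <label>" for each argument, numbered from 1.
# The labels must be given in the same order as the image files passed to compose.
image_legend() {
  i=1
  for label in "$@"; do
    printf 'Image %d: %s\n' "$i" "$label"
    i=$((i + 1))
  done
}

# Assemble the legend plus the instruction lines from the example above.
prompt="$(image_legend "Character sheet" "Accessories")
Character: From image 1, front-facing
Items: Backpack from image 2
Lighting: Soft natural light, top-left
Camera: Eye level, medium shot"
```

The result can then be passed as the prompt argument, for example `npx -y @krasnoperov/gemini-images@latest compose character_sheet.png accessories.png "$prompt" --output scene.png`. Generating the legend from a list keeps the image numbers in the prompt aligned with the argument order when references are added or reordered.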

| Option | Description |
|---|---|
| `--model <model>` | gemini-3-pro-image-preview (default) or gemini-2.5-flash-image |
| `--aspect-ratio <ratio>` | 1:1 (default), 16:9, 9:16, 2:3, 3:2, 3:4, 4:3, 4:5, 5:4, 21:9 |
| `--image-size <size>` | 1K (default), 2K, 4K (Pro model required for 2K/4K) |
| `--output <path>` | Output file or directory (default: ./output/) |
| `--api-key <key>` | Gemini API key (or set GEMINI_API_KEY env var) |

| Feature | Flash | Pro |
|---|---|---|
| Max Resolution | 1K | 4K |
| Reference Images | 1 | 14 |
| Speed | Fast | Moderate |
| Best For | Quick iterations | Production assets |
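
The limits in the table can be checked before spending an API call. The sketch below is a hypothetical pre-flight check, not something the package is known to provide: `check_options` takes a model name, an image size, and the number of reference images, and it assumes the table's Reference Images figures (1 for Flash, 14 for Pro) are upper limits.

```shell
# Hypothetical pre-flight check based on the comparison table above.
# Usage: check_options <model> <image-size> <reference-image-count>
# Returns 0 if the combination fits the table's limits, 1 otherwise.
check_options() {
  model=$1
  size=$2
  refs=$3
  case "$model" in
    gemini-3-pro-image-preview) max_refs=14; allowed_sizes="1K 2K 4K" ;;
    gemini-2.5-flash-image)     max_refs=1;  allowed_sizes="1K" ;;
    *) echo "unknown model: $model" >&2; return 1 ;;
  esac
  # Match the requested size against the space-separated list of allowed sizes.
  case " $allowed_sizes " in
    *" $size "*) ;;
    *) echo "$model does not support --image-size $size" >&2; return 1 ;;
  esac
  # Assumption: the table's reference image counts are maximums.
  if [ "$refs" -gt "$max_refs" ]; then
    echo "$model accepts at most $max_refs reference image(s), got $refs" >&2
    return 1
  fi
  return 0
}
```

For example, `check_options gemini-2.5-flash-image 2K 1` fails with a message on stderr, while `check_options gemini-3-pro-image-preview 4K 2` succeeds.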
npm run build # Build TypeScript
npm run typecheck # Type checking
npm run dev # Dev mode with type stripping
Reference sheet methodology based on Towards Data Science.
MIT License - Copyright (c) 2025 Aleksei Krasnoperov
See LICENSE file for details.