From csa-ppt
AI image generation skill using Azure OpenAI GPT-image-2 model. Generates photorealistic illustrations, conceptual visuals, scene imagery, hero backgrounds, and custom icons for presentations. Use this skill when slides need visual storytelling beyond structural diagrams — cover art, industry scenario photos, abstract concept illustrations, product UI mockups, or decorative backgrounds. Complements azure-diagrams (architecture) and excalidraw-diagram (hand-drawn) by providing the "visual expression layer" that programmatic tools cannot produce. Supports Chinese/English/Japanese/Korean text rendering in images. Output: PNG files ready for embedding into slides via the pptx skill.
Slash command: /csa-ppt:gpt-image
Generate photorealistic and artistic images using Azure OpenAI GPT-image-2 for presentations.
| Use Case | Example | Why GPT-image-2 |
|---|---|---|
| Cover / Title slides | "AI赋能数字化转型" concept art | No existing tool can create photorealistic concept illustrations |
| Industry scenario imagery | Retail store, factory floor, hospital, smart city | Scene-based visuals for solution demos |
| Abstract concept visuals | "Zero Trust Security" metaphor, "Cloud Native" illustration | Visual storytelling beyond boxes and arrows |
| Product UI mockups | Dashboard preview, mobile app concept | More realistic than excalidraw hand-drawn |
| Hero backgrounds | Gradient mesh with tech elements for slide backgrounds | CSS gradients are flat; AI creates depth |
| Custom icons / logos | Branded concept icons not in Azure icon library | Fills gaps in standard icon sets |
| Multilingual infographics | Images with embedded Chinese/English text | Native CJK text rendering support |

Use a different skill for:

| Scenario | Use Instead |
|---|---|
| Cloud architecture diagrams (Azure/AWS/GCP) | azure-diagrams — has 700+ official icons |
| Hand-drawn / whiteboard style sketches | excalidraw-diagram — purpose-built for that aesthetic |
| Data charts (bar, pie, line) | pptx built-in PptxGenJS charts |
| Simple icons and gradient bars | SVG + Sharp rasterization |
Always execute inline — do not create separate .py files:
python3 << 'EOF'
import os, base64, requests

endpoint = os.environ["AZURE_OPENAI_ENDPOINT"]
api_key = os.environ["AZURE_OPENAI_API_KEY"]
deployment = os.environ.get("GPT_IMAGE_DEPLOYMENT", "gpt-image-2")
api_version = "2025-04-01-preview"

# Strip any trailing slash so the URL is well-formed with or without one
url = f"{endpoint.rstrip('/')}/openai/deployments/{deployment}/images/generations?api-version={api_version}"

response = requests.post(url, headers={
    "Api-Key": api_key,
    "Content-Type": "application/json"
}, json={
    "prompt": "YOUR PROMPT HERE",
    "n": 1,
    "size": "1536x1024",
    "quality": "high",
    "output_format": "png"
}).json()

if "error" in response:
    print(f"ERROR: {response['error']}")
else:
    # The image comes back base64-encoded in data[0].b64_json
    img_data = base64.b64decode(response["data"][0]["b64_json"])
    # Replace {project} and image-name with real values before running
    output_path = "outputs/{project}/diagrams/image-name.png"
    os.makedirs(os.path.dirname(output_path), exist_ok=True)
    with open(output_path, "wb") as f:
        f.write(img_data)
    print(f"Image saved: {output_path} ({len(img_data)} bytes)")
EOF
Recommended: edit config.json (in the skill root directory, gpt-image/config.json):
{
"azure_openai_endpoint": "https://your-resource.openai.azure.com/",
"azure_openai_api_key": "your-api-key",
"deployment_name": "gpt-image-2",
"api_version": "2025-04-01-preview"
}
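Configuration priority: CLI arguments > config.json > environment variables
Alternative: environment variables (if config.json is not filled in, the skill falls back to environment variables automatically):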
export AZURE_OPENAI_ENDPOINT="https://your-resource.openai.azure.com/"
export AZURE_OPENAI_API_KEY="your-key"
export GPT_IMAGE_DEPLOYMENT="gpt-image-2"
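The priority order described above (CLI arguments, then config.json, then environment variables) could be resolved along the following lines. This is a minimal sketch, not the skill's actual implementation: `resolve_setting`, `load_config_file`, and the `cli_args` dictionary are hypothetical names, and treating an empty value as "not filled in" is an assumption. Only the config.json keys and the environment variable names come from the examples above.

```python
import json
import os

# config.json keys mapped to their environment variable names
# (both taken from the examples above).
ENV_NAMES = {
    "azure_openai_endpoint": "AZURE_OPENAI_ENDPOINT",
    "azure_openai_api_key": "AZURE_OPENAI_API_KEY",
    "deployment_name": "GPT_IMAGE_DEPLOYMENT",
}


def load_config_file(path):
    """Return the parsed config.json, or an empty dict if the file is missing."""
    if not os.path.exists(path):
        return {}
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def resolve_setting(key, cli_args, file_config, env=None):
    """Resolve one setting: CLI argument > config.json > environment variable.

    Returns (value, source). Assumption: an empty or missing value counts
    as "not filled in" and falls through to the next source.
    """
    env = os.environ if env is None else env
    if cli_args.get(key):
        return cli_args[key], "cli"
    if file_config.get(key):
        return file_config[key], "config.json"
    value = env.get(ENV_NAMES[key])
    if value:
        return value, "env"
    return None, "unset"
```

Returning the source alongside the value makes it easy to report where each setting came from, which is the kind of information a configuration check would print.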
pip install requests pillow
# Show where the current configuration comes from
python3 scripts/generate_image.py config
# Full verification (including an API connectivity test)
python3 scripts/generate_image.py verify
| Parameter | Values | Default | Notes |
|---|---|---|---|
| size | See size table below | 1024x1024 | Both edges must be multiples of 16 |
| quality | low, medium, high | high | low = fast/cheap, high = best quality |
| n | 1-10 | 1 | Number of images per request |
| output_format | png, jpeg, webp | png | Use png for transparency support |
| background | auto, transparent | auto | Requires png format |
| output_compression | 0-100 | (none) | JPEG only |
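The constraints in the parameter table can be checked locally before a request is sent. The sketch below is illustrative only: `build_payload` is a hypothetical helper, and it encodes just the values, defaults, and notes listed in the table (it does not check the size rules).

```python
def build_payload(prompt, size="1024x1024", quality="high", n=1,
                  output_format="png", background="auto",
                  output_compression=None):
    """Validate options against the parameter table and return the JSON body."""
    if quality not in ("low", "medium", "high"):
        raise ValueError(f"quality must be low, medium, or high, got {quality!r}")
    if not 1 <= n <= 10:
        raise ValueError(f"n must be between 1 and 10, got {n}")
    if output_format not in ("png", "jpeg", "webp"):
        raise ValueError(f"output_format must be png, jpeg, or webp, got {output_format!r}")
    if background not in ("auto", "transparent"):
        raise ValueError(f"background must be auto or transparent, got {background!r}")
    if background == "transparent" and output_format != "png":
        raise ValueError("background='transparent' requires output_format='png'")

    payload = {
        "prompt": prompt,
        "size": size,
        "quality": quality,
        "n": n,
        "output_format": output_format,
    }
    # Only send optional fields when they differ from the default
    if background != "auto":
        payload["background"] = background
    if output_compression is not None:
        if output_format != "jpeg":
            raise ValueError("output_compression applies to jpeg output only")
        if not 0 <= output_compression <= 100:
            raise ValueError("output_compression must be between 0 and 100")
        payload["output_compression"] = output_compression
    return payload
```

For example, `build_payload("hero image", size="1536x1024", background="transparent")` returns a dictionary that can be passed as the `json=` argument of the request, while combining a transparent background with JPEG output raises `ValueError` before any API call is made.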
| Use Case | Size | Aspect Ratio | Notes |
|---|---|---|---|
| Full-width slide image | 1536x1024 | 3:2 | Fits 16:9 slide well |
| Right-panel diagram | 1024x1024 | 1:1 | For a text-left, image-right layout |
| Wide banner / hero | 1920x1088 | ~16:9 | Full slide background; 1080 is not a multiple of 16, so 1088 is the nearest valid height |
| Tall sidebar | 1024x1536 | 2:3 | Vertical illustration |
| 4K high-res | 3840x2160 | 16:9 | For print or zoom |
GPT-image-2 supports arbitrary resolutions: both edges must be multiples of 16px, long edge up to 3840px, pixel count 655,360–8,294,400.
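The three constraints stated above can be checked with a few lines of arithmetic. This is a minimal sketch; `validate_size` is a hypothetical helper, and the limits are copied from the sentence above rather than from the API itself.

```python
def validate_size(size):
    """Check a "WIDTHxHEIGHT" string against the constraints stated above.

    Returns a list of problems; an empty list means the size is acceptable.
    Raises ValueError if the string is not of the form "<int>x<int>".
    """
    width_text, height_text = size.lower().split("x")
    width, height = int(width_text), int(height_text)

    problems = []
    if width % 16 or height % 16:
        problems.append("both edges must be multiples of 16")
    if max(width, height) > 3840:
        problems.append("long edge must not exceed 3840")
    pixels = width * height
    if not 655_360 <= pixels <= 8_294_400:
        problems.append("pixel count must be between 655,360 and 8,294,400")
    return problems
```

Under these rules `validate_size("1536x1024")` and `validate_size("3840x2160")` both return an empty list, while `validate_size("1920x1080")` reports the multiple-of-16 problem because 1080 is not divisible by 16; 1920x1088 passes.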
Concept illustration:

A [style] illustration of [concept], featuring [specific visual elements].
The scene uses a clean, modern aesthetic with a predominantly white/light
background. Key accent colors: warm orange and cool blue tones.
Corporate presentation quality, high detail, professional.

Industry scenario:

A photorealistic view of [industry setting], showing [specific activity].
Modern, well-lit environment. Shot from [angle]. Professional quality,
suitable for a business presentation. No text overlays.

Abstract visualization:

An abstract [style] visualization representing [concept]. Use geometric
shapes, flowing lines, and a color palette of [colors]. Minimalist
composition with strong visual hierarchy. Clean background suitable
for slide overlay text.

Slide background:

A subtle, abstract background pattern for a presentation slide.
Soft gradients blending [color1] to [color2] with [geometric/organic]
elements. Low contrast, suitable for overlaying dark text. Resolution:
1920x1080, corporate and modern feel.
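The bracketed placeholders in the templates above can be filled programmatically, so that a forgotten placeholder is caught before the prompt is sent. The sketch below is one possible approach; `fill_template` and `SCENARIO_TEMPLATE` are hypothetical names, and the template text is the industry-scenario template from above.

```python
import re

SCENARIO_TEMPLATE = (
    "A photorealistic view of [industry setting], showing [specific activity]. "
    "Modern, well-lit environment. Shot from [angle]. Professional quality, "
    "suitable for a business presentation. No text overlays."
)


def fill_template(template, values):
    """Replace every [placeholder] in the template with its value.

    Raises KeyError if a placeholder has no entry in `values`, so an
    unfilled "[angle]" never reaches the image model.
    """
    def substitute(match):
        name = match.group(1)
        if name not in values:
            raise KeyError(f"no value supplied for placeholder [{name}]")
        return values[name]

    return re.sub(r"\[([^\]]+)\]", substitute, template)


prompt = fill_template(SCENARIO_TEMPLATE, {
    "industry setting": "a retail store",
    "specific activity": "a shopper using a self-checkout kiosk",
    "angle": "eye level",
})
```

The resulting `prompt` string can be used as the `"prompt"` field of the generation request.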
GPT-image-2 supports rendering text in Chinese, English, Japanese, Korean, Hindi, Bengali:
Create an infographic with the title "数字化转型路线图" in bold Chinese text
at the top. Below it, show 4 stages with labels in Chinese...
Tip: Keep embedded text short and large. Complex multi-paragraph text may render poorly.
All generated images go to {workspace_path}/diagrams/ with descriptive names:
outputs/{project}/diagrams/
├── cover-concept.png ← Title slide illustration
├── retail-scenario.png ← Industry scenario image
├── zero-trust-visual.png ← Abstract concept art
├── hero-background.png ← Slide background
└── manifest.md ← Updated with new entries
When adding to diagrams/manifest.md:
| cover-concept.png | Concept Art | gpt-image-2 | 1536x1024 | AI digital transformation concept for title slide |
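A manifest row in the format shown above could be produced and appended as follows. This is a sketch under assumptions: `manifest_row` and `append_manifest_row` are hypothetical helpers, and the column order (file, type, tool, size, description) is inferred from the single example row rather than from a documented schema.

```python
import os


def manifest_row(filename, kind, size, description, tool="gpt-image-2"):
    """Format one manifest.md table row in the order used by the example above."""
    cells = [filename, kind, tool, size, description]
    # Escape pipes so that a cell cannot break the table layout
    cells = [str(cell).replace("|", "\\|").strip() for cell in cells]
    return "| " + " | ".join(cells) + " |"


def append_manifest_row(manifest_path, row):
    """Append a row to manifest.md, creating the directory if needed."""
    os.makedirs(os.path.dirname(manifest_path) or ".", exist_ok=True)
    with open(manifest_path, "a", encoding="utf-8") as f:
        f.write(row + "\n")
```

Calling `manifest_row("cover-concept.png", "Concept Art", "1536x1024", "AI digital transformation concept for title slide")` reproduces the example row above.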
After generating each image:
1. Confirm the file exists: ls -la {path}
2. Check the dimensions: python3 -c "from PIL import Image; img=Image.open('{path}'); print(img.size)"

If generation fails:
1. Retry with quality="medium"
2. If that also fails, retry with quality="low"
3. Log the failure in progress.md as [ERROR] and suggest using SVG+Sharp as a fallback for simple visuals

GPT-image-2 also supports editing existing images (inpainting):
python3 << 'EOF'
import os, base64, requests

endpoint = os.environ["AZURE_OPENAI_ENDPOINT"]
api_key = os.environ["AZURE_OPENAI_API_KEY"]
deployment = os.environ.get("GPT_IMAGE_DEPLOYMENT", "gpt-image-2")
api_version = "2025-04-01-preview"

# Strip any trailing slash so the URL is well-formed with or without one
url = f"{endpoint.rstrip('/')}/openai/deployments/{deployment}/images/edits?api-version={api_version}"

# The edit endpoint takes multipart form data; requests sets the Content-Type
with open("source_image.png", "rb") as img_file:
    response = requests.post(url,
        headers={"Api-Key": api_key},
        data={
            "prompt": "Replace the background with a modern office setting",
            "n": 1,
            "size": "1024x1024",
            "quality": "high"
        },
        files={
            "image": ("source.png", img_file, "image/png"),
            # Optional mask for targeted editing:
            # "mask": ("mask.png", open("mask.png", "rb"), "image/png"),
        }
    ).json()

if "error" in response:
    print(f"ERROR: {response['error']}")
else:
    img_data = base64.b64decode(response["data"][0]["b64_json"])
    with open("edited_image.png", "wb") as f:
        f.write(img_data)
    print("Edited image saved")
EOF
Use cases for editing:
| Error | Cause | Resolution |
|---|---|---|
| DeploymentNotFound | Wrong deployment name | Check GPT_IMAGE_DEPLOYMENT env var |
| 401 Unauthorized | Invalid API key | Verify AZURE_OPENAI_API_KEY |
| 429 Too Many Requests | Rate limit | Wait + retry with lower quality |
| content_policy_violation | Prompt blocked by content filter | Rephrase prompt, avoid specific faces/people |
| InvalidPayload | Bad parameters | Check size is multiple of 16, valid format |
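The "wait + retry with lower quality" resolution for rate limits could be sketched as below. The helper name `generate_with_fallback` and the `send_request` callable are hypothetical: `send_request` stands for any function that posts a payload and returns the parsed JSON response, such as a thin wrapper around the `requests.post(...).json()` call used in this document. The five-second wait is an arbitrary placeholder, not a documented value.

```python
import time


def generate_with_fallback(send_request, payload,
                           qualities=("high", "medium", "low"),
                           wait_seconds=5.0, sleep=time.sleep):
    """Try each quality level in turn until one request succeeds.

    Returns (response, quality_used). Raises RuntimeError with the last
    error if every level fails.
    """
    last_error = None
    for attempt, quality in enumerate(qualities):
        if attempt > 0:
            # Pause before stepping down to the next quality level
            sleep(wait_seconds)
        response = send_request({**payload, "quality": quality})
        if "error" not in response:
            return response, quality
        last_error = response["error"]
    raise RuntimeError(f"all quality levels failed; last error: {last_error}")
```

This sketch retries on any error for brevity. In practice it only makes sense for rate limits or transient failures; a 401 or a content_policy_violation will fail the same way at every quality level and should be handled as described in the table.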
- Use quality="high" for final presentation images. Use "medium" only for drafts/iteration.
- Use 1536x1024 as the default size for most PPT use cases (good balance of quality and speed).
- Transparent backgrounds (background="transparent") are useful for overlaying on colored slide backgrounds.
- Specify exact colors (orange #E8913A, blue #4472C4, etc.) in prompts.
npx claudepluginhub huqianghui/csa-ppt-plugin --plugin csa-ppt