From azure-gpt-image
Generate or edit images with Azure OpenAI gpt-image-2 (also works with gpt-image-1/1.5/1-mini). Use this skill whenever the user wants to create, render, illustrate, mock up, or edit an image, or asks for icons, posters, hero shots, product shots, avatars, concept art, banners, thumbnails, or any visual asset — and whenever they mention gpt-image, gpt-image-2, DALL·E, Azure OpenAI image, or "generate/make/produce an image" on Azure. Also use when the user wants to inpaint, mask out, or modify part of an existing image, or composite multiple images. Always invoke this skill before reaching for any other image-generation approach in an Azure environment.
Slash command: `/azure-gpt-image:azure-gpt-image`
Use the bundled CLI at `scripts/azure-image.ts` to generate or edit images via Azure OpenAI. It uses Node's built-in `fetch`/`FormData` — no `npm install` needed. The shebang invokes `npx -y tsx`, so the first run downloads `tsx` (cached afterward).
The CLI reads credentials from environment variables:
| Variable | Required | Default | Notes |
|---|---|---|---|
| `AZURE_OPENAI_ENDPOINT` | yes | (none) | e.g. `https://my-resource.openai.azure.com` |
| `AZURE_OPENAI_API_KEY` | yes | (none) | Resource key from the Azure portal. Sent as both `Authorization: Bearer <key>` (Foundry endpoints) and `api-key: <key>` (classic Azure OpenAI) so both endpoint families work. |
| `AZURE_OPENAI_DEPLOYMENT` | no | `gpt-image-2` | Your deployment name |
| `AZURE_OPENAI_API_VERSION` | no | `preview` | The new v1 endpoint accepts `preview` |
If a variable is missing, the script exits non-zero with a clear message — surface that error to the user rather than guessing. Don't try to read keys from .env files or other locations; the user is expected to have them in their shell environment.
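The variable handling described above can be sketched roughly as follows. This is not the CLI's actual code: `loadConfig` and `AzureImageConfig` are hypothetical names, and only the variable names and defaults come from the table.

```typescript
// Hypothetical helper: the names and defaults mirror the table above,
// but this is a sketch, not the code inside scripts/azure-image.ts.
type Env = Record<string, string | undefined>;

interface AzureImageConfig {
  endpoint: string;
  apiKey: string;
  deployment: string;
  apiVersion: string;
}

function loadConfig(env: Env): AzureImageConfig {
  const required = ["AZURE_OPENAI_ENDPOINT", "AZURE_OPENAI_API_KEY"];
  // Collect every missing name so the error message lists all of them at once.
  const missing = required.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing required environment variable(s): ${missing.join(", ")}`);
  }
  return {
    endpoint: env.AZURE_OPENAI_ENDPOINT as string,
    apiKey: env.AZURE_OPENAI_API_KEY as string,
    // Optional variables fall back to the documented defaults.
    deployment: env.AZURE_OPENAI_DEPLOYMENT || "gpt-image-2",
    apiVersion: env.AZURE_OPENAI_API_VERSION || "preview",
  };
}
```

In a Node script this would typically be called as `loadConfig(process.env)`, with the thrown message printed before exiting non-zero.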
Rate limits. Azure resources have per-minute caps (commonly 5–10 requests/minute on lower tiers). Don't fan out parallel image calls — sequence them with a few seconds between each. If you get HTTP 429, wait 30–60s and retry.
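One way to sequence calls as described is a small runner that executes jobs one at a time, pauses between them, and waits before retrying on HTTP 429. This is a minimal sketch: `runSequentially`, its option names, and the `{ status }` result shape are assumptions for illustration, and the default waits are picked from the ranges given above.

```typescript
// Hypothetical helper: runs jobs strictly one after another, never in parallel.
interface JobResult {
  status: number; // HTTP status of the image call (assumed result shape)
}

interface SequenceOptions {
  gapMs?: number; // pause between consecutive jobs
  retryWaitMs?: number; // pause before retrying after HTTP 429
  maxRetries?: number; // retries per job
  sleep?: (ms: number) => Promise<void>; // injectable for testing
}

async function runSequentially<T extends JobResult>(
  jobs: Array<() => Promise<T>>,
  opts: SequenceOptions = {},
): Promise<T[]> {
  const gapMs = opts.gapMs ?? 3000; // "a few seconds between each"
  const retryWaitMs = opts.retryWaitMs ?? 45000; // within the 30-60 s window
  const maxRetries = opts.maxRetries ?? 2;
  const sleep =
    opts.sleep ?? ((ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms)));

  const results: T[] = [];
  for (let i = 0; i < jobs.length; i++) {
    if (i > 0) await sleep(gapMs);
    let result = await jobs[i]();
    // Retry only on 429; any other status is returned to the caller as-is.
    for (let attempt = 0; result.status === 429 && attempt < maxRetries; attempt++) {
      await sleep(retryWaitMs);
      result = await jobs[i]();
    }
    results.push(result);
  }
  return results;
}
```

Each job would wrap one invocation of the CLI or one API request; the runner only controls ordering and waiting.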
Defaults: 1024x1024, quality=high, n=1, output saved as ./image.png in the current working directory.
~/.claude/skills/azure-gpt-image/scripts/azure-image.ts generate \
--prompt "<carefully crafted prompt>" \
--size 1024x1024 --quality high --out ./out.png
gpt-image-2 rewards specific, visually concrete prompts. Describe:
If the user gives a brief prompt, you may expand it before sending. If they give a detailed prompt, pass it through unchanged.
For gpt-image-2 the common useful sizes:
- 1024x1024: fastest square
- 1536x1024: landscape
- 1024x1536: portrait
- 2048x2048: large square (slower, more tokens)

Arbitrary sizes are allowed if all of:
If the user asks for a non-conforming size, round to the nearest legal size and tell them what you used.
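A simple way to do the rounding is to snap the request to the closest of the four common sizes. The sketch below assumes "nearest" means smallest Euclidean distance in pixel dimensions, which is one reasonable reading rather than a documented rule; it does not model the arbitrary-size constraints, and `nearestCommonSize` is a hypothetical name.

```typescript
// The four common sizes listed above. Arbitrary-size rules are not modeled here.
const COMMON_SIZES: Array<[number, number]> = [
  [1024, 1024],
  [1536, 1024],
  [1024, 1536],
  [2048, 2048],
];

// Hypothetical helper: returns the common size closest to the requested one,
// measured as straight-line distance in (width, height) space (an assumption).
function nearestCommonSize(width: number, height: number): string {
  let best = COMMON_SIZES[0];
  let bestDist = Infinity;
  for (const size of COMMON_SIZES) {
    const dist = Math.hypot(size[0] - width, size[1] - height);
    if (dist < bestDist) {
      best = size;
      bestDist = dist;
    }
  }
  return `${best[0]}x${best[1]}`;
}
```

For a 1920x1080 request this picks 1536x1024, which keeps the landscape orientation; the chosen size should then be reported back to the user.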
low → fastest draft (good for exploration), medium → balanced, high → final assets, auto → model decides. Default to high unless the user is clearly iterating quickly. Token costs grow steeply with quality — for batch generation (--n 4+), consider medium first.
When the user gives you an existing image and asks to change it, use edit:
~/.claude/skills/azure-gpt-image/scripts/azure-image.ts edit \
--image ./photo.png --prompt "replace the sky with a sunset" \
--out ./photo-sunset.png
For precise localized edits, supply a PNG mask with the same dimensions as the image and an alpha channel in which transparent pixels (alpha=0) mark the editable region. Opaque pixels are kept unchanged. The mask can be made in Photoshop or GIMP, or programmatically (e.g. a small sharp or Pillow script).
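For the programmatic route, the pixel data of a rectangular mask can be built as a raw RGBA buffer. This is a sketch under assumptions: `buildMaskRgba` and the `Rect` type are hypothetical, and the buffer still has to be encoded as a PNG by an image library of your choice before the mask is usable.

```typescript
// Hypothetical helper types and names, for illustration only.
interface Rect {
  x: number;
  y: number;
  w: number;
  h: number;
}

// Builds raw RGBA pixels (4 bytes per pixel, row-major).
// Pixels inside `editable` get alpha=0 (editable); all others get alpha=255 (kept).
function buildMaskRgba(width: number, height: number, editable: Rect): Uint8Array {
  const pixels = new Uint8Array(width * height * 4); // RGB channels stay 0 (black)
  for (let y = 0; y < height; y++) {
    for (let x = 0; x < width; x++) {
      const inside =
        x >= editable.x &&
        x < editable.x + editable.w &&
        y >= editable.y &&
        y < editable.y + editable.h;
      pixels[(y * width + x) * 4 + 3] = inside ? 0 : 255;
    }
  }
  return pixels;
}
```

The width and height passed in must match the source image exactly, since the mask is required to have the same dimensions.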
--input-fidelity high preserves features (faces, logos, fine details) of the original more strongly. Use it for face edits and brand work; skip it for generic background changes.
To composite multiple sources into one image, repeat --image:
azure-image.ts edit --image hero.png --image product.png \
--prompt "place the product on the desk in the hero shot" --out composite.png
Input images must be PNG or JPG and under 50 MB each.
--n 2 through --n 10 generates multiple variations in one call. With --n > 1, pass --out-dir ./out/ instead of --out; the script writes image_0.png, image_1.png, ... If the user asks for "a few options" or "show me variations", default to --n 4.
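The flag choice and resulting file names can be worked out ahead of the call. A minimal sketch, assuming the naming pattern stated above and the `./image.png` default for single images; `planOutputs` is a hypothetical helper, not part of the CLI.

```typescript
// Hypothetical helper: picks --out vs --out-dir and lists the files to expect.
function planOutputs(n: number, outDir: string = "./out/"): { args: string[]; files: string[] } {
  if (!Number.isInteger(n) || n < 1 || n > 10) {
    throw new RangeError("n must be an integer from 1 to 10");
  }
  if (n === 1) {
    // Single image: a plain --out path is enough.
    return { args: ["--out", "./image.png"], files: ["./image.png"] };
  }
  // Multiple images: --out-dir, with files named image_0.png, image_1.png, ...
  const dir = outDir.replace(/\/+$/, ""); // drop trailing slashes before joining
  const files = Array.from({ length: n }, (_, i) => `${dir}/image_${i}.png`);
  return { args: ["--n", String(n), "--out-dir", outDir], files };
}
```

Knowing the expected file list up front makes it easy to show the user every variation once the call returns.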
Azure's APIM gateway closes idle connections after ~60–90 seconds. Long-running renders (--quality high, especially edit --input-fidelity high) regularly exceed that, and the request fails with HTTP 408 even though the model is still working. The documented fix is to stream: SSE events keep the connection warm, and the same final image arrives via the completed event.
The CLI handles this automatically: on any HTTP 408, it retries once with --stream enabled. You don't need to opt in. If you see warn: HTTP 408 from gateway — retrying with --stream, that's the recovery — don't downgrade quality.
You can still pass --stream up front for slow renders you expect to be long (typically quality=high edits). Add --save-partials to persist the intermediate frames (<stem>.partial_0.png, etc.) if you want to show progress; otherwise the CLI only writes the final image.
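To illustrate how a streamed response carries partial frames followed by the final image, here is a simplified parser for standard SSE framing (`event:` and `data:` lines, events separated by a blank line). It works on a fully buffered body rather than incremental chunks, and the function names are hypothetical; the exact event names and payload fields are not specified here, so check references/api.md for those.

```typescript
interface SseEvent {
  event: string;
  data: string;
}

// Simplified SSE parser: splits a buffered body into events.
// Lines other than `event:` and `data:` (comments, ids) are ignored.
function parseSse(body: string): SseEvent[] {
  const events: SseEvent[] = [];
  for (const block of body.split(/\r?\n\r?\n/)) {
    let event = "message"; // SSE default when no event name is given
    const data: string[] = [];
    for (const line of block.split(/\r?\n/)) {
      if (line.startsWith("event:")) event = line.slice(6).trim();
      else if (line.startsWith("data:")) data.push(line.slice(5).replace(/^ /, ""));
    }
    if (data.length > 0) events.push({ event, data: data.join("\n") });
  }
  return events;
}

// Picks the last event whose name ends in "completed".
// The naming scheme is an assumption, not taken from the API contract.
function findCompleted(events: SseEvent[]): SseEvent | undefined {
  return [...events].reverse().find((e) => e.event.endsWith("completed"));
}
```

Everything before the completed event is optional progress; only the final event is needed to write the output image.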
- `--output-format png` (default): for graphics; transparency-capable and lossless.
- `--output-format jpeg` with `--output-compression 80`: for photos / web use.
- WebP is not supported on Azure (only on OpenAI direct).
The script exits non-zero with the API's error body on failure. Common ones:
- `contentFilter`: prompt or output was flagged. Reword more neutrally and try again. Do not bypass safety filters or coach the user around them.
- `DeploymentNotFound`: `AZURE_OPENAI_DEPLOYMENT` doesn't match a deployment in this resource. Ask the user to confirm the name.
- HTTP 408: the CLI already retries once with `--stream`, and that almost always succeeds. If it 408s a second time, the render itself is hanging; reduce `--quality` or `--n`, or try again later.

When an error fires, show the user the raw message (don't paraphrase loosely) and propose a concrete next step.
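The mapping from error to next step can be written down as a small lookup. This is a sketch: `suggestNextStep` is a hypothetical helper, matching on substrings of the raw error text is an assumption about how the error body looks, and the fallback wording is illustrative.

```typescript
// Hypothetical helper: maps the raw error text to a suggested next step.
// The raw message should still be shown to the user unmodified.
function suggestNextStep(rawError: string): string {
  if (rawError.includes("contentFilter")) {
    return "Reword the prompt more neutrally and try again.";
  }
  if (rawError.includes("DeploymentNotFound")) {
    return "Ask the user to confirm the AZURE_OPENAI_DEPLOYMENT name.";
  }
  if (/\b408\b/.test(rawError)) {
    return "Reduce --quality or --n, or try again later.";
  }
  if (/\b429\b/.test(rawError)) {
    return "Wait 30-60 seconds, then retry.";
  }
  return "Show the raw error and ask the user how to proceed.";
}
```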
See references/api.md for the full Azure REST contract — endpoint shapes, parameter table, response schema. Read it if the user asks about the API directly, wants to call it without this CLI, or needs a feature the CLI doesn't expose.
npx claudepluginhub getcreatr/skills --plugin azure-gpt-image