Use when writing or debugging FLUX.2 image generation prompts (Black Forest Labs flux2-max, flux2-pro, flux2-flex, flux2-klein on Replicate or BFL API), including text-to-image, image editing, and reformulating verbose prompts. Symptoms include flat lighting, default 3x3 grid layouts, ignored later instructions, wrong colors, miscounted items, or vague "professional photo" output.
Slash command: /flux2-prompt-engineering:flux2-prompt-engineering
FLUX.2 is Black Forest Labs' image generation and editing model family. It uses Mistral-3 24B VLM as its text encoder — it reads natural prose, not keyword lists. The single most important constraint: **30–80 words is the sweet spot**. The model accepts up to 32K tokens but attention degrades beyond ~80 words, causing later instructions to be silently ignored.
Use this skill when:
Do NOT use this skill for:
--ar, --v, --style flags, weighting syntax differs)
The 12 rules below are FLUX.2-specific. Applying them to other models will degrade output.
STRUCTURE: Count + Surface → Per-item (position, description, color, angle) → Shadows → Camera → Size
WORD COUNT: Target 60–80. Hard ceiling 100. Recount before sending: overruns are silent quality losses.
COLORS: Hex codes per object: "wall #363636", "interior #B8B8B8"
CAMERA: "Hasselblad X2D 80mm f/5.6" (never "professional photo")
LAYOUT: Explicit positions: "upper left," "far lower right corner"
ANGLES: Per-item: "straight" or "tilted N degrees" — alternate between them
SHADOWS: Physical objects: "monstera shadow," "venetian blind stripes at 30°"
SURFACES: "Micro-cement" for smooth modern; "concrete" only if rough is desired
TEXT: In 'single quotes', keep 1–4 words, describe font style
NEGATIVE: Never. Rephrase as positive description.
FORMATTING: No em dashes, bullets, markdown. Continuous prose only.
COUNTS: Exact number, never ranges ("six items" not "5–6 items")
UPSAMPLING: On for T2I with short prompts. Off for edits.
EDITING: One instruction per pass. Can't resize. Crop+upscale instead.
CROP TRICK: 10–15% inward crop + LANCZOS upscale = reliable zoom
FLUX.2 does not support negative prompts. Negations often cause the model to generate exactly what you're avoiding.
| BAD | GOOD |
|---|---|
| "no blur" | "sharp focus" |
| "no people" | "empty scene" |
| "no text inside frames" | "solid uniform cream interiors" |
| "photo areas contain no image" | "blank polaroid frames with solid gray interiors" |
| "no bright or saturated colors" | "muted warm neutral palette" |
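A quick pre-flight check can catch negative phrasing before a prompt is sent. The sketch below is only an illustration: the word list and the function name `find_negations` are assumptions, not part of FLUX.2 or any API, and the list should be extended for your own prompts.

```python
import re

# Hypothetical list of negation words; not exhaustive.
NEGATION_PATTERN = re.compile(r"\b(no|not|never|without|don't|avoid)\b", re.IGNORECASE)

def find_negations(prompt: str) -> list[str]:
    """Return negation words found in the prompt, lowercased, in order of appearance."""
    return [m.group(0).lower() for m in NEGATION_PATTERN.finditer(prompt)]

print(find_negations("Blank polaroid frames, no text inside frames, no blur"))
print(find_negations("Blank polaroid frames with solid gray interiors, sharp focus"))
```

Any hit is a cue to rewrite that phrase as a positive description, as in the table above.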
The model needs a single number to anchor its composition. Ranges cause ambiguity.
| BAD | GOOD |
|---|---|
| "5–6 paper elements" | "Six items" |
| "several polaroids" | "Three polaroids" |
| "a few swatch cards" | "Two Pantone swatch cards" |
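Ranges and vague quantifiers are easy to spot mechanically. The following is a minimal sketch, assuming a hand-picked quantifier list; `find_vague_counts` is a hypothetical helper name. Because it matches any numeric range, it also flags angle ranges such as "2–8°".

```python
import re

# Numeric ranges written with a hyphen, an en dash (\u2013), or "to".
RANGE_PATTERN = re.compile(r"\b\d+\s*(?:-|\u2013|to)\s*\d+\b")
# Hypothetical list of vague quantifiers; extend as needed.
VAGUE_PATTERN = re.compile(r"\b(several|a few|some|many|multiple|various)\b", re.IGNORECASE)

def find_vague_counts(prompt: str) -> list[str]:
    """Return numeric ranges first, then vague quantifiers, found in the prompt."""
    hits = [m.group(0) for m in RANGE_PATTERN.finditer(prompt)]
    hits += [m.group(0).lower() for m in VAGUE_PATTERN.finditer(prompt)]
    return hits

print(find_vague_counts("5\u20136 paper elements, several polaroids, a few swatch cards"))
print(find_vague_counts("Six items on micro-cement wall #363636, 2048x1324"))
```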
Every word must carry visual information. The VLM processes meaning, not grammar. Articles, adjectives that don't change the visual output, and flowery descriptions waste tokens.
Filler patterns to cut:
Always pair hex codes with the exact object. Abstract color references produce inconsistent results.
| BAD | GOOD |
|---|---|
| "muted warm neutrals" | (assign per object below) |
| "anthracite/dark olive" | "anthracite #6B5F58" |
| "cool silver-gray" | "steel gray #8A9199" |
| "dark charcoal" | "#3A3A3A" |
| "cream-colored" | "off-white #E8E4DF" |
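To review whether every hex code sits next to the object it colors, it can help to list each code with its neighboring words. This is a rough sketch using whitespace tokenization; `hex_contexts` is a hypothetical helper, and deciding whether a neighbor really names the object is still left to the reader.

```python
import re

HEX_CODE = re.compile(r"#[0-9A-Fa-f]{6}")

def hex_contexts(prompt: str) -> list[tuple[str, str, str]]:
    """Return (word before, hex code, word after) for every 6-digit hex code."""
    # Strip common punctuation so "#363636." is still recognized as a hex code.
    tokens = [t.strip(".,;:'\"") for t in prompt.split()]
    contexts = []
    for i, tok in enumerate(tokens):
        if HEX_CODE.fullmatch(tok):
            before = tokens[i - 1] if i > 0 else ""
            after = tokens[i + 1] if i + 1 < len(tokens) else ""
            contexts.append((before, tok, after))
    return contexts

for ctx in hex_contexts("Six items on micro-cement wall #363636. Upper left: polaroid, solid #B8B8B8 interior."):
    print(ctx)
```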
Naming specific equipment produces more authentic photorealism than abstract descriptors.
| BAD | GOOD |
|---|---|
| "Shot with a medium format camera" | "Hasselblad X2D 80mm f/5.6" |
| "professional photo" | "Shot on Sony A7IV, 85mm, f/2.8" |
| "photorealistic" | (use camera spec instead) |
| "natural film-like grain" | "Kodak Portra 400 color science" |
Focal length controls zoom level:
Abstract layout descriptors ("scattered," "casual arrangement") are ignored — the model defaults to grid layouts. Assign every item a named position.
Position vocabulary: "upper left," "far upper right corner," "left middle edge," "center-right," "lower left," "far lower right corner," "bottom center near edge"
Each item gets its own sentence: position first, then description.
Upper left: large straight polaroid, white border, solid #B8B8B8 interior, washi tape top.
Angle ranges produce uniform tilt. Assign specific degrees to each item, and make some explicitly straight.
| BAD | GOOD |
|---|---|
| "rotated at subtle casual angles (2–8°)" | Item A: "straight." Item B: "tilted 12 degrees." Item C: "tilted 6 degrees." |
| "each rotated slightly" | "Large items straight, small items tilted" |
"Soft shadows" or "warm natural light" produces flat lighting. Two requirements, both mandatory:
The word "soft" is a red flag — it almost always hides a missing physical source. Replace it.
| BAD | GOOD |
|---|---|
| "warm directional sunlight" | "warm afternoon light from upper left" |
| "soft shadows from a window" | "venetian blind shadows at 30 degrees from window upper left" |
| "filtered through horizontal window blinds" | "diagonal venetian blind stripe shadows at 30 degrees" |
| "A monstera deliciosa leaf casts its organic shadow" | "monstera leaf shadow falling diagonally across upper left items" |
| "leather portfolio shadow across the corner" | "leather portfolio shadow at 45 degrees across lower right" |
"Horizontal blinds" produces flat horizontal lines. "At 30 degrees" or "diagonal" produces realistic angled stripes. Both shadow types can coexist:
Monstera leaf shadow diagonally across upper left, venetian blind stripe shadows at 30 degrees across surface.
"Concrete" produces heavily pitted, rough surfaces. Use interior finish terminology for modern aesthetics.
| BAD | GOOD |
|---|---|
| "dark charcoal concrete wall with fine porous texture" | "smooth matte dark gray micro-cement wall #363636" |
| "concrete surface" | "micro-cement surface" (real finish term) |
FLUX.2 can render legible text. Use quotation marks around the text content and describe placement.
| BAD | GOOD |
|---|---|
| reading MARCO LEDER / CREATIVE DIRECTOR | text reading 'MARCO LEDER' and 'CREATIVE DIRECTOR' in small sans-serif |
Keep text short (1–4 words per line). Describe font style ("minimal sans-serif," "bold industrial lettering") and surface ("on the business card," "above the door").
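The 1–4 word guideline can be checked automatically for text in single quotes. The sketch below assumes rendered text is always wrapped in single quotes as recommended above; `overlong_quoted_text` is a hypothetical helper, and the lookarounds are a heuristic to avoid treating apostrophes (as in "designer's") as quote marks.

```python
import re

# A single quote counts as a delimiter only when it is not glued to a word
# character on its outer side, so apostrophes inside words are skipped.
QUOTED = re.compile(r"(?<!\w)'([^']+)'(?!\w)")

def overlong_quoted_text(prompt: str, max_words: int = 4) -> list[str]:
    """Return quoted strings that exceed max_words words."""
    return [text for text in QUOTED.findall(prompt) if len(text.split()) > max_words]

print(overlong_quoted_text("text 'MARCO LEDER' and 'CREATIVE DIRECTOR' in small sans-serif"))
print(overlong_quoted_text("sign reading 'OPEN EVERY DAY UNTIL LATE' above the door"))
```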
The prompt is processed by a VLM text encoder, not rendered as markup. Formatting characters waste tokens and may confuse parsing.
Remove: em dashes (—), bullet points (•, -, *), markdown headers (#), parenthetical asides, line break formatting, ellipses (...)
Write as: continuous flowing prose, sentence after sentence.
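Stripping the formatting characters listed above can be partly automated. The sketch below is one possible approach, not a complete cleaner: `flatten_prompt` is a hypothetical helper that handles bullets, markdown headers, em dashes, ellipses, and line breaks, and it leaves parenthetical asides for manual rewriting.

```python
import re

def flatten_prompt(text: str) -> str:
    """Strip markup characters and collapse the text into continuous prose."""
    # Remove leading markdown headers and bullets (-, *, or the bullet char \u2022).
    # Hex codes such as "#363636" are untouched because no space follows the "#".
    text = re.sub(r"^\s*(?:#+|[-*\u2022])\s+", "", text, flags=re.MULTILINE)
    # Replace em dashes (\u2014) with commas and drop ellipses.
    text = text.replace("\u2014", ", ").replace("...", "")
    # Collapse line breaks and repeated spaces into single spaces.
    text = re.sub(r"\s+", " ", text)
    # Tidy any space left in front of a comma.
    text = re.sub(r"\s+,", ",", text)
    return text.strip()

raw = "- Six items on wall #363636.\n- Two swatch cards \u2014 one anthracite #6B5F58..."
print(flatten_prompt(raw))
```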
Start the prompt with the number of items and the surface. This anchors the model's composition.
Six items on smooth dark gray micro-cement wall #363636.
Priority order: Subject count + surface → Per-item descriptions → Lighting/shadows → Camera/technical → Dimensions
Word order matters — FLUX.2 weights earlier tokens more heavily. What comes first gets the most attention.
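One way to keep the priority order consistent is to assemble prompts from named parts. The sketch below is illustrative only: `build_prompt` and its parameter names are assumptions, and the example text is pieced together from phrases used elsewhere in this document.

```python
def build_prompt(count_surface, items, shadows, camera, size):
    """Join prompt parts in priority order: count + surface, items, shadows, camera, size."""
    parts = [count_surface, *items, shadows, camera, size]
    # End every non-empty part with exactly one period so the result is continuous prose.
    return " ".join(p.strip().rstrip(".") + "." for p in parts if p.strip())

prompt = build_prompt(
    count_surface="Two items on smooth dark gray micro-cement wall #363636",
    items=[
        "Upper left: large straight polaroid, white border, solid #B8B8B8 interior",
        "Lower right: off-white #E8E4DF business card tilted 5 degrees",
    ],
    shadows="Venetian blind stripe shadows at 30 degrees across surface",
    camera="Hasselblad X2D 80mm f/5.6",
    size="2048x1324",
)
print(prompt)
```

Because the count and surface are always the first argument, they cannot end up buried mid-prompt.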
| Length | Words | Use Case |
|---|---|---|
| Short | 10–30 | Quick concepts, style exploration |
| Medium | 30–80 | Sweet spot for most projects |
| Long | 80–100 | Complex multi-item scenes (hard ceiling) |
| Danger zone | 100+ | Attention degrades, later items ignored |
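The bands in the table can be turned into a simple length check. This sketch assumes words are counted by splitting on whitespace (so a hex code or a camera spec counts as one word), which the guidance does not specify. `length_band` is a hypothetical helper, and boundary values that appear in two rows of the table (30, 80, 100) are assigned to the lower band here.

```python
def length_band(prompt: str) -> tuple[int, str]:
    """Return the whitespace word count and the matching band from the table."""
    n = len(prompt.split())
    if n <= 30:
        band = "short"
    elif n <= 80:
        band = "medium"
    elif n <= 100:
        band = "long"
    else:
        band = "danger zone"
    return n, band

print(length_band("Six items on smooth dark gray micro-cement wall #363636."))
```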
When given a verbose or poorly structured prompt, follow these steps in order:
Before sending any prompt to FLUX.2, verify:
A top-down flatlay photograph of a designer's moodboard on a dark charcoal concrete wall with fine porous texture. Warm directional sunlight enters from upper left, filtered through horizontal window blinds, casting soft diagonal bands across the scene. A monstera deliciosa leaf casts its organic shadow in the upper-left corner. Scattered across the wall are 5–6 matte paper elements, each rotated at subtle casual angles (2–8°): several blank Polaroid frames in varying sizes (small, medium, large) with classic thick white borders. The photo areas contain no image, just soft gray gradients. Two Pantone swatch cards — one anthracite/dark olive with PANTONE® logo, one with two blocks of dark charcoal and cool silver-gray. One cream-colored business card reading "MARCO LEDER / CREATIVE DIRECTOR / STUDIO" in minimal sans-serif. Color palette is muted warm neutrals. No bright or saturated colors. Shot with a medium format camera. Photorealistic.
Violations: ~150 words (roughly 2× the 80-word sweet-spot limit), negative phrasing ("no image," "no bright colors"), range item count ("5–6"), abstract layout ("scattered"), angle range ("2–8°"), no hex codes, no camera specifics, em dashes, abstract shadow description, "concrete" surface.
Top-down flat lay, six items on smooth matte dark gray micro-cement wall #363636. Upper left: tall Pantone card, white stock, PANTONE label, solid #B8B8B8 top block and anthracite #6B5F58 bottom block, washi tape, straight. Upper right: large straight polaroid, white border, solid #A8B0B8 interior, washi tape top. Center: medium polaroid tilted 8 degrees, solid #CCCCCC interior, washi tape. Lower left: Pantone card, two blocks charcoal #3A3A3A and steel gray #8A9199, labeled 'PANTONE ANTHRACITE' and 'STILL LIFE 03', tilted 12 degrees. Lower right: large straight polaroid, solid #B0B0B0 interior, washi tape. Bottom center: off-white #E8E4DF business card, rounded corners, text 'MARCO LEDER' and 'CREATIVE DIRECTOR' in small sans-serif, tilted 5 degrees. Monstera leaf shadow across upper left items, diagonal venetian blind shadows at 30 degrees across surface, warm afternoon light. Hasselblad X2D 80mm f/5.6, dark minimalist editorial. 2048x1324.
Use with: "prompt_upsampling": true — the dense structure provides layout; upsampling enriches material detail.
| Aspect | Before | After |
|---|---|---|
| Words | ~150 | ~90 |
| Item count | "5–6" | "six items" |
| Layout | "Scattered across" | Per-item positions |
| Angles | "2–8° range" | Per-item: "straight," "8 degrees," "12 degrees," "5 degrees" |
| Colors | "anthracite/dark olive" | "#6B5F58" |
| Negatives | "no image," "no bright colors" | "solid #B8B8B8 interior," removed |
| Camera | "medium format camera" | "Hasselblad X2D 80mm f/5.6" |
| Surface | "dark charcoal concrete wall" | "smooth matte dark gray micro-cement wall #363636" |
| Shadows | "filtered through horizontal window blinds" | "diagonal venetian blind shadows at 30 degrees" |
| Formatting | Em dashes, bullets | Continuous prose |
| Text | No quotes | 'MARCO LEDER' in quotes with font described |
| Model | Best For | References |
|---|---|---|
| max | Highest fidelity, complex edits, character consistency | Up to 8 images |
| pro | Production at scale, good quality/cost balance | Up to 8 images |
| flex | Typography, fine-grained control (steps, guidance) | Up to 10 images |
| klein | Sub-second generation, real-time apps | Up to 5 images |
```json
{
  "width": 2048,
  "height": 1324,
  "prompt": "...",
  "resolution": "4 MP",
  "aspect_ratio": "custom",
  "output_format": "png",
  "output_quality": 80,
  "safety_tolerance": 2,
  "prompt_upsampling": true
}
```
input_images: Omit entirely for text-to-image. Do not send an empty array. For editing, provide the source image URL.
safety_tolerance: 0–6 scale. Default 2. Does not affect layout or quality.
JSON structured prompts: FLUX.2 supports JSON with fields like scene, subjects, camera. However, Replicate double-escapes inner JSON when embedded in the payload, corrupting the prompt. Always flatten JSON to natural language prose instead.
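The input_images rule is easy to enforce when the payload is built in code. The sketch below only constructs a dictionary and makes no API call. The key names are copied from the example payload above, `build_input` is a hypothetical helper, and treating input_images as a list of URLs is an assumption to check against the API schema of the model you call.

```python
import json

def build_input(prompt, width=2048, height=1324, prompt_upsampling=True, input_images=None):
    """Build an input payload; the input_images key is added only when images are given."""
    payload = {
        "width": width,
        "height": height,
        "prompt": prompt,  # plain prose, never embedded JSON
        "resolution": "4 MP",
        "aspect_ratio": "custom",
        "output_format": "png",
        "output_quality": 80,
        "safety_tolerance": 2,
        "prompt_upsampling": prompt_upsampling,
    }
    # Skips both None and an empty list, so an empty array is never sent.
    if input_images:
        payload["input_images"] = list(input_images)
    return payload

print(json.dumps(build_input("Six items on smooth dark gray micro-cement wall #363636."), indent=2))
```

For an edit, pass the source image and turn upsampling off, for example `build_input("Change the wall to #B8B8B8", prompt_upsampling=False, input_images=["https://example.com/source.png"])`, where the URL is a placeholder.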
The prompt_upsampling parameter tells the model's text encoder to automatically expand your prompt with material, lighting, and atmospheric details.
USE when: Writing a dense caveman-mode prompt (60–85 words) for text-to-image generation. The dense prompt provides structure; upsampling fills in richness.
DO NOT USE when: Editing an existing image (rewrites edit instructions), or when your prompt is already 100+ words (pushes past attention limit).
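The two conditions above reduce to a small decision function. This is a sketch under stated assumptions: `should_upsample` is a hypothetical helper, words are counted by whitespace splitting, and text-to-image prompts under 100 words are treated as eligible even though the guidance only names the 60–85 word range explicitly.

```python
def should_upsample(prompt: str, is_edit: bool) -> bool:
    """Decide whether to set prompt_upsampling for this request."""
    if is_edit:
        # Upsampling rewrites edit instructions, so keep it off for edits.
        return False
    # Prompts of 100+ words are already at the attention limit.
    return len(prompt.split()) < 100

print(should_upsample("Six items on smooth dark gray micro-cement wall #363636.", is_edit=False))
print(should_upsample("Change the wall color to #B8B8B8.", is_edit=True))
```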
Works well: color swaps, style changes, adding/removing objects, background replacement, text changes.
Does not work: geometric resizing. "Make cards 30% larger" has no meaning in diffusion space; the model thinks in semantics, not pixels.
Workaround: crop inward and upscale with LANCZOS.
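```python
from PIL import Image

img = Image.open("input.png")
w, h = img.size
pct = 15  # crop percentage per edge: 10=subtle, 15=moderate, 20=aggressive
# Trim pct% from every edge so the subject fills more of the frame.
crop_x, crop_y = int(w * pct / 100), int(h * pct / 100)
cropped = img.crop((crop_x, crop_y, w - crop_x, h - crop_y))
# Upscale back to the target output size with LANCZOS resampling.
result = cropped.resize((2048, 1324), Image.LANCZOS)
result.save("output.png")
```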
| Mistake | Symptom | Fix (Rule N) |
|---|---|---|
| Verbose 180-word prompt | Later instructions ignored, wrong items | Caveman compress to 80 words + prompt_upsampling: true (Rules 3, 12) |
| "Scattered," "casually arranged" | Default 3x3 grid layout | Per-item named positions (Rule 6) |
| Embedding JSON in Replicate payload | Corrupted prompt, garbage output | Flatten to natural language prose (API Parameters) |
| Saying "make cards 30% larger" in edit | No change to size | Crop + LANCZOS upscale (Image Editing) |
| Using "same" repeatedly in edit prompt | Requested change doesn't apply | Describe end state instead (Editing Rules) |
| "Concrete" wall | Heavily pitted, rough texture | "Micro-cement" for smooth modern (Rule 9) |
| "Soft natural light" / "warm sunlight" | Flat, lifeless lighting | Name the physical shadow source + angle (Rule 8) |
| "Multi-edit: change color and add a logo" | Only one edit applied | One instruction per pass (Editing Rules) |
| 80mm focal length, want tighter crop | Subject too small | Bump focal length to 100–120mm (Rule 5) |
| Item count buried mid-sentence | Wrong number of items rendered | Front-load count + surface in first words (Rule 12) |
npx claudepluginhub marcoleder/claude-plugins --plugin flux2-prompt-engineering