From jack-tar-deckhand
Top-level image orchestrator. Routes all slide image generation to the appropriate skill (jack-tar-ollama:image, jack-tar-ollama:icon, jack-tar-ollama:pattern, jack-tar-ollama:diagram, jack-tar-cloud:image, jack-tar-cloud:icon, render_chart). Produces ImageManifest and ChartManifest. Also reads strategy-map.json to determine per-slide rendering approach (full_render, backdrop_render, composed).
Slash command: /jack-tar-deckhand:imagegen-bridge
Orchestrate ALL image generation for a presentation deck. This skill is invoked by the Deck Conductor after the SlideOutline and StyleGuide have been produced.
You are the routing orchestrator. You NEVER generate images directly. You read the DeckContext, classify each slide's image needs, route to the appropriate generation skill, track budget and cache, post-process results, and write the ImageManifest and ChartManifest.
Consult the image-generation-expert agent for prompt translation advice when generating production-quality hero images.
Parse $ARGUMENTS for:
draft or production (default: draft)
PLUGIN_ROOT=$(python3 -c "
from pathlib import Path
import sys, os
if os.environ.get('JACK_TAR_DECKHAND_ROOT'):
    print(os.environ['JACK_TAR_DECKHAND_ROOT']); sys.exit()
home = Path.home()
for base in [home / '.claude' / 'plugins' / 'cache']:
    for p in base.rglob('jack-tar-deckhand/.claude-plugin/plugin.json'):
        print(str(p.parent.parent)); sys.exit()
dev = Path.cwd() / 'plugins' / 'jack-tar-deckhand'
if dev.exists():
    print(str(dev)); sys.exit()
print('NOT_FOUND')
" 2>/dev/null)
if [ -z "$PLUGIN_ROOT" ] || [ "$PLUGIN_ROOT" = "NOT_FOUND" ]; then echo "ERROR: jack-tar-deckhand not found" && exit 1; fi
Before any image generation, read local-config.json from the project root to get machine-specific Ollama model tags and timeouts. This file is gitignored — it contains the exact model identifiers installed on this machine (e.g., x/z-image-turbo:fp8 not x/z-image-turbo).
python3 -c "
import json
with open('local-config.json') as f:
    config = json.load(f)
print(json.dumps(config, indent=2))
"
Use config.ollama.default_image_model for hero/background/element images and config.ollama.default_diagram_model for diagrams. Never hardcode Ollama model names — always read from this file.
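The lookup described above can be sketched as a small helper. This is a minimal illustration, not the plugin's own code: the function name pick_ollama_model and the idea that only the "diagram" visual type uses the diagram model are assumptions, and the diagram tag in the example config is a made-up placeholder.

```python
from typing import Any, Dict

# Assumption: only diagrams use the diagram model; every other visual type
# (hero, background, element) uses the image model.
DIAGRAM_TYPES = {"diagram"}


def pick_ollama_model(config: Dict[str, Any], visual_type: str) -> str:
    """Return the Ollama model tag for a visual type from parsed local-config.json."""
    key = "default_diagram_model" if visual_type in DIAGRAM_TYPES else "default_image_model"
    model = config.get("ollama", {}).get(key)
    if not model:
        # Fail loudly rather than falling back to a hardcoded model name.
        raise KeyError(f"local-config.json is missing ollama.{key}")
    return model


# Hypothetical config contents, for illustration only.
example_config = {
    "ollama": {
        "default_image_model": "x/z-image-turbo:fp8",
        "default_diagram_model": "example/diagram-model:latest",
    }
}
print(pick_ollama_model(example_config, "hero"))
```

Raising on a missing key keeps the "never hardcode" rule enforceable: a misconfigured machine stops early instead of silently using a default tag.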
Call each engine plugin's verify skill to discover what's available. Extract the STATUS line from each response.
Call in sequence:
/jack-tar-ollama:verify
/jack-tar-cloud:verify
Parse each response:
Build the available_providers dict:
{
    "ollama": {
        "available": True/False,
        "models": ["x/z-image-turbo", ...]  # from MODELS section
    },
    "openai": {"available": True/False},
    "google": {"available": True/False},
    "fal": {"available": True/False},
    "recraft": {"available": True/False}
}
If jack-tar-ollama is not installed or returns NOT_AVAILABLE, set ollama.available = False.
If jack-tar-cloud is not installed or returns NOT_AVAILABLE, set all cloud providers to False.
Report the findings:
If NO providers are available, warn that all images will be placeholders but continue — the deck must always be completable.
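The availability rules above could be applied roughly as follows. This is a sketch under stated assumptions: the exact verify output format is not shown in this document, so the function takes already-extracted status strings (None meaning the plugin is not installed), and the names build_available_providers, any_available, and the "AVAILABLE" status value are hypothetical.

```python
from typing import Dict, Iterable, Optional

CLOUD_PROVIDERS = ("openai", "google", "fal", "recraft")


def _is_up(status: Optional[str]) -> bool:
    # None models "plugin not installed"; NOT_AVAILABLE comes from the verify skill.
    return status is not None and status.strip().upper() != "NOT_AVAILABLE"


def build_available_providers(
    ollama_status: Optional[str],
    ollama_models: Iterable[str],
    cloud_status: Optional[str],
    cloud_flags: Dict[str, bool],
) -> Dict[str, dict]:
    """Build the available_providers dict from extracted verify results."""
    ollama_ok = _is_up(ollama_status)
    cloud_ok = _is_up(cloud_status)
    providers: Dict[str, dict] = {
        "ollama": {"available": ollama_ok, "models": list(ollama_models) if ollama_ok else []}
    }
    for name in CLOUD_PROVIDERS:
        # If the cloud plugin is missing or unavailable, every cloud provider is False.
        providers[name] = {"available": cloud_ok and bool(cloud_flags.get(name, False))}
    return providers


def any_available(providers: Dict[str, dict]) -> bool:
    """False means every image will be a placeholder; the deck still completes."""
    return any(entry["available"] for entry in providers.values())
```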
Read the required DeckContext files:
./tmp/deck/outline.json (SlideOutline) using the Read tool
./tmp/deck/style-guide.json (StyleGuide) using the Read tool
./tmp/deck/talk-brief.json (TalkBrief) using the Read tool; needed for data_sources (charts)
./tmp/deck/strategy-map.json (StrategyMap) if it exists; determines the per-slide rendering strategy
./tmp/deck/brand-profile.json (BrandProfile) if it exists; provides the palette for prompt constraints
Verify all three required files exist. If any is missing, report the error and stop.
Parse the JSON content of each file.
Read or initialise the budget state:
python3 -c "
import json, os
budget_path = './tmp/deck/budget-state.json'
if os.path.exists(budget_path):
    with open(budget_path) as f:
        budget = json.load(f)
    print(json.dumps(budget))
else:
    print(json.dumps({'state': 'allow', 'spent': 0.0, 'total_budget': 2.0}))
"
Parse the budget state. The state field is one of: allow, allow_with_caps, degrade, typography_only.
If a strategy map exists, check each slide's strategy before routing:
prompt-engineer agent (Haiku model) with a structured brief from assemble_brief(), then render through Ollama → cloud_low → cloud_full stages.
Use the image router to determine which skill handles each slide:
PYTHONPATH="$PLUGIN_ROOT" python3 -c "
import json
from src.image_router import route_all_slides, get_chart_slides
with open('./tmp/deck/outline.json') as f:
    outline = json.load(f)
providers = $PROVIDERS_DICT
budget_state = '$BUDGET_STATE'
mode = '$MODE'
decisions = route_all_slides(outline, mode, providers, budget_state)
charts = get_chart_slides(outline)
result = {
    'image_decisions': [d._asdict() for d in decisions],
    'chart_slides': charts,
}
print(json.dumps(result, indent=2))
"
Review the routing decisions. Report a summary table:
| Slide | Visual Type | Skill | Provider | Model | Est. Cost | Fallback? |
|---|---|---|---|---|---|---|
For slides with strategy full_render or backdrop_render:
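PYTHONPATH="$PLUGIN_ROOT" python3 -c "
from src.slide_prompt_composer import assemble_brief
import json, os
with open('./tmp/deck/outline.json') as f:
    outline = json.load(f)
with open('./tmp/deck/style-guide.json') as f:
    style_guide = json.load(f)
# brand-profile.json is optional; this assumes assemble_brief accepts None when it is absent
brand_profile = None
if os.path.exists('./tmp/deck/brand-profile.json'):
    with open('./tmp/deck/brand-profile.json') as f:
        brand_profile = json.load(f)
brief = assemble_brief(outline['slides'][SLIDE_INDEX], 'STRATEGY', style_guide, brand_profile, 'FUNNEL_STAGE')
print(json.dumps(brief, indent=2))
"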
Dispatch the prompt-engineer agent with the brief to generate the image prompt.
Execute the funnel stage:
PYTHONPATH="$PLUGIN_ROOT" python3 -c "
from src.render_funnel import execute_funnel_stage
result = execute_funnel_stage(
    deck_dir='./tmp/deck',
    slide_number=N,
    strategy='STRATEGY',
    prompt='GENERATED_PROMPT',
    funnel_stage='STAGE',
    model='MODEL',
    output_path='./tmp/deck/images/slide-NN-hero.png',
)
import json; print(json.dumps(result, indent=2))
"
slide-NN-hero-vN.png.
Production mode: If production-upgrade-plan.json exists in the deck directory, skip this step and use Step 9A instead. The upgrade plan takes precedence over the routing matrix for production renders.
For each routing decision where skill is not skip and not placeholder:
PYTHONPATH="$PLUGIN_ROOT" python3 -c "
from src.cache_manager import ImageCacheManager
cache = ImageCacheManager()
cache_key = cache.compute_cache_key('$VISUAL_DIRECTION', ($WIDTH, $HEIGHT), 'presentation', '$MODEL', $PALETTE_LIST)
cached = cache.get(cache_key)
print(f'CACHE_HIT:{cache_key}' if cached is not None else f'CACHE_MISS:{cache_key}')
cache.close()
"
Track which slides have cache hits and which need generation.
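To illustrate why the cache key takes the prompt, size, purpose, model, and palette together, here is one way such a key could be derived. This is a hypothetical sketch only: it is not the implementation of ImageCacheManager.compute_cache_key, and the normalisation choices (stripping whitespace, ignoring palette order and case) are assumptions.

```python
import hashlib
import json
from typing import Sequence, Tuple


def sketch_cache_key(
    visual_direction: str,
    size: Tuple[int, int],
    purpose: str,
    model: str,
    palette: Sequence[str],
) -> str:
    """Hash every input that changes the rendered image into one stable key."""
    payload = json.dumps(
        {
            "visual_direction": visual_direction.strip(),
            "size": list(size),
            "purpose": purpose,
            "model": model,
            # Assumption: palette order and hex case do not change the image.
            "palette": sorted(colour.upper() for colour in palette),
        },
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()
```

Including the model in the key means a draft image from one engine is never served as a cache hit for a different engine.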
For each slide that needs generation (cache miss), construct the model-specific prompt:
PYTHONPATH="$PLUGIN_ROOT" python3 -c "
from src.prompt_translator import translate_prompt
import json
translated = translate_prompt(
    visual_direction='''$VISUAL_DIRECTION''',
    model='$MODEL_NAME',
    style_guide=$STYLE_GUIDE_DICT,
)
print(json.dumps(translated, indent=2))
"
For production-mode hero images, consult the image-generation-expert agent before finalising the prompt.
For pragmatic_composition slides that do NOT have a separate background image, include the target background colour in every element image prompt. Use descriptive language alongside hex values since Ollama models approximate hex colours rather than interpreting them precisely:
"on a very dark background, almost black with slight teal tint, hex #0E1513"
Use identical background description text across all element prompts for that slide. This is critical because the assembler samples the corner pixel of the first element image to set the slide background colour. If one element has a noticeably different background, it will create visible seams where the element image meets the slide background.
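One simple way to guarantee identical background text is to append a single shared string to every element prompt rather than writing it by hand each time. The helper name with_shared_background is hypothetical; the background description is the example given above.

```python
from typing import List


def with_shared_background(element_prompts: List[str], background_text: str) -> List[str]:
    """Append exactly the same background description to every element prompt."""
    suffix = background_text.strip()
    return [f"{prompt.rstrip().rstrip(',')}, {suffix}" for prompt in element_prompts]


BACKGROUND = "on a very dark background, almost black with slight teal tint, hex #0E1513"
prompts = with_shared_background(["a brass compass", "a coiled rope,"], BACKGROUND)
for prompt in prompts:
    print(prompt)
```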
For atmospheric dark backgrounds, subtle textures, and neutral surfaces (strategy: background or pragmatic_composition background), prefer Ollama over cloud providers. Cloud models (especially Nanobanana) over-generate from vague atmospheric prompts, adding unwanted complexity, objects, and even text. Ollama produces cleaner, subtler results for this use case — and it's free.
Reserve cloud providers for images that need:
For each slide that needs generation, invoke the appropriate skill. Process slides sequentially.
IMPORTANT: Store the prompt. After generating each image, you MUST include the source_prompt field in the image manifest entry. This is the translated prompt that was actually sent to the model. The production upgrade plan needs these prompts to re-render at higher quality without regenerating them. Without source_prompt, the production pipeline cannot function.
After generating EVERY image, dispatch the image-reviewer agent to assess it. This keeps images out of the main orchestration context.
Generate the image with the current prompt
Dispatch the image-reviewer agent with:
Example dispatch:
Review this generated image for quality.
Image: ./tmp/deck/images/slide-10-scene-v3.png
Visual direction: "Side profile view of two heads facing each other..."
Brand palette: #006B5E, #5CDBC0, #0E1513, #F5FBF7
Strategy: backdrop
Iteration: 3 of 10
Parse the JSON verdict returned by the agent
If verdict is "pass": proceed to next image, log the summary
If verdict is "refine": use the issues array to guide prompt refinement, regenerate, and dispatch a new agent review
Escalation: after 3 consecutive "refine" verdicts, re-dispatch the image-reviewer at Sonnet tier for a more nuanced assessment
Hard stop: after 10 iterations total, accept the best version. Set status to "accepted_with_issues" in the manifest and store the final summary in "review_summary"
Save versions as slide-NN-TYPE-vN.png so the Speaker can review alternatives if needed. The final accepted version overwrites slide-NN-TYPE.png.
Context savings: The main context keeps only the summary string (~50 chars) per review, not the image itself. A 17-slide deck with 3 iterations each accumulates ~17 short strings instead of ~51 images.
Never skip review. A broken image that reaches the assembled deck wastes the Speaker's time and undermines confidence in the pipeline.
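The review loop above can be sketched as a function that takes the generation, review, and refinement steps as callables. This is a simplified illustration, not the bridge's actual control flow: the callable signatures, the "default" and "sonnet" tier labels, and accepting the last version at the hard stop (rather than tracking which version is best) are all assumptions.

```python
from typing import Callable, Dict, List

MAX_ITERATIONS = 10  # hard stop
ESCALATE_AFTER = 3   # consecutive "refine" verdicts before the Sonnet-tier review


def review_loop(
    prompt: str,
    generate: Callable[[str, int], str],      # (prompt, iteration) -> image path
    review: Callable[[str, str], Dict],       # (path, tier) -> verdict dict
    refine: Callable[[str, List[str]], str],  # (prompt, issues) -> new prompt
) -> Dict:
    """Generate, review, and refine until a pass or the 10-iteration hard stop."""
    refines = 0
    tier = "default"
    for iteration in range(1, MAX_ITERATIONS + 1):
        path = generate(prompt, iteration)
        verdict = review(path, tier)
        if verdict["verdict"] != "pass":
            refines += 1
            if refines == ESCALATE_AFTER:
                # Re-dispatch the reviewer on the same image at the higher tier.
                tier = "sonnet"
                verdict = review(path, tier)
        if verdict["verdict"] == "pass":
            return {"status": "generated", "file_path": path, "source_prompt": prompt,
                    "review_summary": verdict["summary"], "iterations": iteration}
        if iteration < MAX_ITERATIONS:
            prompt = refine(prompt, verdict["issues"])
    # Hard stop: this sketch accepts the last version generated.
    return {"status": "accepted_with_issues", "file_path": path, "source_prompt": prompt,
            "review_summary": verdict["summary"], "iterations": MAX_ITERATIONS}
```

Returning source_prompt alongside the path keeps the manifest requirement in one place: whichever prompt produced the accepted image is the one recorded.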
For pragmatic_composition slides, calculate the target aspect ratio from the strategy map's element_layout dimensions before generating each element image. For each element: aspect_ratio = element.w / element.h (normalised coordinates). Then set --width and --height to match this ratio at the desired resolution. For example, for a 2.79:1 ratio at 1024px wide: --width 1024 --height 368. Do NOT generate square images for non-square placement boxes -- the image will be stretched or cropped by the assembler, degrading quality.
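The aspect-ratio arithmetic can be sketched as below. The helper name element_render_size is hypothetical, and rounding the height to a multiple of 8 is an assumption (it happens to reproduce the 1024 x 368 figure for a 2.79:1 box); check what the target model actually requires.

```python
from typing import Tuple


def element_render_size(w: float, h: float, target_width: int = 1024, multiple: int = 8) -> Tuple[int, int]:
    """Return (width, height) matching the element box's aspect ratio."""
    if w <= 0 or h <= 0:
        raise ValueError("element width and height must be positive")
    aspect_ratio = w / h  # normalised coordinates, so units cancel
    # Round to the nearest multiple; never go below one multiple.
    height = max(multiple, round(target_width / aspect_ratio / multiple) * multiple)
    return target_width, height


width, height = element_render_size(2.79, 1.0)
print(f"--width {width} --height {height}")
```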
/jack-tar-ollama:image "TRANSLATED_PROMPT" --output ./tmp/deck/images/slide-NN-hero.png --width 1024 --height 576 --model x/z-image-turbo
/jack-tar-ollama:pattern "TRANSLATED_PROMPT" --output ./tmp/deck/images/slide-NN-pattern.png --width 1024 --height 1024
/jack-tar-ollama:diagram "TRANSLATED_PROMPT" --type TYPE --output ./tmp/deck/images/slide-NN-diagram.png --width 1024 --height 768
/jack-tar-cloud:image "TRANSLATED_PROMPT" --output ./tmp/deck/images/slide-NN-TYPE.png --provider PROVIDER --model MODEL
When provider is google, the --model parameter selects the tier:
--model imagen-4.0-fast-generate-001 ($0.02)
--model gemini-3.1-flash-image-preview ($0.067)
--model gemini-3-pro-image-preview ($0.134)
The routing matrix and production-upgrade-plan already specify the correct model. Use the model from the plan entry directly; do NOT hardcode model names in the bridge.
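For budgeting, the per-image prices listed above can be turned into a quick estimate. This table is for cost arithmetic only; the model that is actually called should still come from the plan entry. The function name estimate_google_cost is hypothetical, and the prices are simply the ones quoted in this section.

```python
from typing import Dict

# Per-image prices in USD, as quoted in this section.
GOOGLE_MODEL_COST_USD = {
    "imagen-4.0-fast-generate-001": 0.02,
    "gemini-3.1-flash-image-preview": 0.067,
    "gemini-3-pro-image-preview": 0.134,
}


def estimate_google_cost(calls_per_model: Dict[str, int]) -> float:
    """Sum the estimated spend for a planned number of calls per model."""
    total = 0.0
    for model, calls in calls_per_model.items():
        # An unknown model raises KeyError instead of being priced at zero.
        total += GOOGLE_MODEL_COST_USD[model] * calls
    return round(total, 3)


# Two Flash iterations followed by one Pro render.
print(estimate_google_cost({"gemini-3.1-flash-image-preview": 2, "gemini-3-pro-image-preview": 1}))
```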
/jack-tar-cloud:icon "TRANSLATED_PROMPT" --output ./tmp/deck/images/slide-NN-icon --provider PROVIDER --colors PALETTE_HEX
PYTHONPATH="$PLUGIN_ROOT" python3 -c "
from src.render_chart import render_chart
result = render_chart(chart_type='$CHART_TYPE', data=$DATA, output_path='./tmp/deck/images/slide-NN-chart.png', style_guide=$STYLE_GUIDE)
import json; print(json.dumps(result))
"
PYTHONPATH="$PLUGIN_ROOT" python3 -c "
from src.process_image import generate_placeholder
generate_placeholder(width=1920, height=1080, colour='$HEX', output_path='./tmp/deck/images/slide-NN-placeholder.png')
"
If any skill invocation fails:
status: "failed" or status: "placeholder"
After each cloud generation, update the budget tracker:
PYTHONPATH="$PLUGIN_ROOT" python3 -c "
from src.budget_tracker import BudgetTracker
import json
budget_path = './tmp/deck/budget-state.json'
with open(budget_path) as f:
    budget_data = json.load(f)
bt = BudgetTracker(total_budget_usd=budget_data['total_budget'])
bt.log_api_call('$MODEL', $COST, '$IMAGE_ID')
budget_data['spent'] = bt.spent
budget_data['state'] = bt.state
with open(budget_path, 'w') as f:
    json.dump(budget_data, f, indent=2)
print(f'Budget: \${bt.spent:.3f} / \${budget_data[\"total_budget\"]:.2f} ({bt.state})')
"
If budget state changes, re-route remaining slides with the new budget state.
In production mode, the imagegen-bridge reads production-upgrade-plan.json instead of computing routing decisions. The image-generation-expert agent has already determined the optimal engine for each slide.
PYTHONPATH="$PLUGIN_ROOT" python3 -c "
from src.image_router import load_upgrade_plan, execute_upgrade_plan_entry
import json
plan = load_upgrade_plan('./tmp/deck')
for entry in plan['entries']:
    params = execute_upgrade_plan_entry(entry)
    print(f'Slide {entry[\"slide_number\"]}: {params[\"skill\"]} via {params[\"provider\"]} ({params[\"model\"]})')
"
For each entry:
For entries where image_id contains elem-, skip the refinement loop — use draft_prompt directly with a single Pro call (element images are already validated during drafting).
For all other raster_upscale entries, execute the cross-tier refinement loop:
Phase 1 — Flash draft and refinement (up to 3 iterations)
Generate a Flash draft using gemini-3.1-flash-image-preview with the plan's draft_prompt:
/jack-tar-cloud:image "DRAFT_PROMPT" --provider google --model gemini-3.1-flash-image-preview --width WIDTH --height HEIGHT --output ./tmp/deck/images/slide-NN-hero-flash-v1.png
Dispatch the image-reviewer agent on the Flash output. If verdict is pass, skip Phase 2 entirely — Flash quality is sufficient and no Pro spend is needed. Store the Flash image as the final output.
If verdict is refine, dispatch prompt-engineer in refinement mode:
{
  "mode": "refine",
  "original_prompt": "<draft_prompt>",
  "iteration": 1,
  "reviewer_feedback": {
    "strengths": ["<from reviewer strengths[]>"],
    "issues": ["<from reviewer issues[]>"],
    "composition_notes": {"<from reviewer composition_notes{}>"}
  },
  "brand_constraints": {"palette_hex": ["<from brand-profile.json>"]},
  "funnel_stage": "cloud_low"
}
Generate Flash v2 with the refined prompt → re-review. If pass, use Flash as final; skip Pro.
If still refine after v2, do a third Flash iteration (total: 3 Flash calls max). Flash iterations are cheap (~$0.067 each) — iterate freely.
If all 3 Flash iterations return refine, escalate to Speaker:
Phase 2 — Pro escalation (single shot)
If Flash passes (on any iteration), take the prompt that produced the passing Flash result and generate once with gemini-3-pro-image-preview:
/jack-tar-cloud:image "REFINED_PROMPT" --provider google --model gemini-3-pro-image-preview --width WIDTH --height HEIGHT --output ./tmp/deck/images/slide-NN-hero.png
Dispatch image-reviewer on the Pro output. Pro gets ONE shot — no iterations.
status: "flag_for_speaker" in the manifest. Include both the Pro and best Flash versions so the Speaker can choose. Do not retry Pro.
Manifest recording
Always store the source_prompt used for the final accepted image — this may be the original draft_prompt or a refined version. Record the iteration count and which tier produced the final image (flash_v1, flash_v2, flash_v3, pro) in review_summary.
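Phase 1 (the Flash draft and refinement iterations) can be sketched as a bounded loop. This is an illustration under assumptions: the callables stand in for the jack-tar-cloud:image skill, the image-reviewer agent, and the prompt-engineer agent; passing the current prompt as original_prompt on later iterations is a guess; and what happens after a pass (Phase 2) is left to the caller.

```python
from typing import Callable, Dict, List

MAX_FLASH_ITERATIONS = 3


def flash_phase(
    draft_prompt: str,
    palette_hex: List[str],
    generate_flash: Callable[[str, int], str],  # (prompt, iteration) -> image path
    review: Callable[[str], Dict],              # path -> reviewer verdict dict
    refine_prompt: Callable[[Dict], str],       # refinement request -> new prompt
) -> Dict:
    """Run up to three Flash iterations and report which tier produced the result."""
    prompt = draft_prompt
    for iteration in range(1, MAX_FLASH_ITERATIONS + 1):
        path = generate_flash(prompt, iteration)
        verdict = review(path)
        if verdict["verdict"] == "pass":
            return {"outcome": "pass", "tier": f"flash_v{iteration}", "file_path": path,
                    "source_prompt": prompt, "iterations": iteration}
        if iteration < MAX_FLASH_ITERATIONS:
            # Same shape as the refinement request shown above.
            prompt = refine_prompt({
                "mode": "refine",
                "original_prompt": prompt,
                "iteration": iteration,
                "reviewer_feedback": {
                    "strengths": verdict.get("strengths", []),
                    "issues": verdict.get("issues", []),
                    "composition_notes": verdict.get("composition_notes", {}),
                },
                "brand_constraints": {"palette_hex": palette_hex},
                "funnel_stage": "cloud_low",
            })
    # All Flash iterations returned "refine": hand the decision to the Speaker.
    return {"outcome": "escalate_to_speaker", "tier": None, "file_path": path,
            "source_prompt": prompt, "iterations": MAX_FLASH_ITERATIONS}
```

The returned tier and source_prompt are the two values the manifest needs for review_summary and re-rendering.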
Invoke jack-tar-cloud:icon with Recraft:
/jack-tar-cloud:icon "DRAFT_PROMPT" --provider recraft --output ./tmp/deck/images/slide-NN-diagram.svg
The output is SVG. After generation, rasterise it to PNG using src/process_image.py, passing the slide's background colour to fix Recraft's default white backgrounds:
from src.process_image import rasterize_svg
# Get slide background colour from the StyleGuide's slidePalette:
# title slides: slidePalette.title_slide.background (or palette.primary)
# content slides: slidePalette.content_slides.background (or palette.background)
# code slides: slidePalette.code_slides.background (or '#0E1513')
result = rasterize_svg(
    'tmp/deck/images/slide-NN-diagram.svg',
    'tmp/deck/images/slide-NN-diagram.png',
    width=1920,
    background_color=slide_bg_color,  # e.g. '#F5FBF7' or '#0E1513'
)
This replaces Recraft's near-white SVG backgrounds with the actual slide background colour, preventing visible white rectangles on assembled slides.
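The background lookup described in the comments above might look like this. The helper name slide_background and the slide kind labels ("title", "content", "code") are assumptions; the key names and fallbacks follow the comments in the snippet.

```python
from typing import Any, Dict, Optional


def slide_background(style_guide: Dict[str, Any], slide_kind: str) -> Optional[str]:
    """Pick the slide background colour from slidePalette, with the documented fallbacks."""
    slide_palette = style_guide.get("slidePalette", {})
    palette = style_guide.get("palette", {})
    # slide kind -> (slidePalette section, fallback colour)
    lookup = {
        "title": ("title_slide", palette.get("primary")),
        "content": ("content_slides", palette.get("background")),
        "code": ("code_slides", "#0E1513"),
    }
    section, fallback = lookup[slide_kind]
    return slide_palette.get(section, {}).get("background") or fallback
```

The result would be passed as background_color to rasterize_svg.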
Recraft V4 interprets prompts differently from raster models. Follow these rules:
These patterns reduced Recraft iterations from 3+ to 1-2 per diagram.
For slides with a single image, use the outline's visual_direction as the prompt (it may have been refined during drafting).
For slides with multiple element images (pragmatic_composition, three_across layouts), use the draft_prompt from the production upgrade plan entry for each element — NOT the outline's visual_direction. The outline has one visual_direction per slide but element images each need their own distinct prompt. Using the slide-level prompt for all elements produces identical images.
Rule: If image_id contains elem-, always use the production plan's draft_prompt for that entry.
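The prompt-selection rule can be expressed as a small function. The name production_prompt and the dict shapes are illustrative assumptions; draft_prompt, visual_direction, and the elem- check come from the text above.

```python
from typing import Any, Dict


def production_prompt(image_id: str, plan_entry: Dict[str, Any], outline_slide: Dict[str, Any]) -> str:
    """Element images use their own draft_prompt; single images use the slide's visual_direction."""
    if "elem-" in image_id:
        return plan_entry["draft_prompt"]
    return outline_slide["visual_direction"]
```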
Skip — the existing draft image is already production quality (matplotlib chart or similar).
After generating or regenerating any image for a slide whose strategy is backdrop or pragmatic_composition, you MUST re-run vision alignment to update detected_positions in the ImageManifest. The old coordinates are stale the moment the image changes.
backdrop or pragmatic_composition.
vision-analyst agent with:
element_layout.elements
elem_N IDs back to the element IDs from the strategy map (in left-to-right, top-to-bottom order).
detected_positions array into the slide's ImageManifest entry.
This step is not optional. Skipping it will cause text labels to misalign with the visual elements on the assembled slide.
When to trigger: Any time an image is generated, regenerated, or replaced for a position-dependent slide — including manual re-runs, prompt tuning, and production upgrades.
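Mapping detected elem_N IDs back to strategy-map element IDs in reading order could be sketched as follows. Several things here are assumptions: that each element carries normalised x and y fields, that elem_N numbering starts at 1, and that a plain sort on (y, x) is adequate; elements in the same visual row with slightly different y values would need a row tolerance.

```python
from typing import Any, Dict, List


def map_detected_ids(strategy_elements: List[Dict[str, Any]], detected_count: int) -> Dict[str, str]:
    """Map elem_1..elem_N to strategy-map IDs in left-to-right, top-to-bottom order."""
    # Sort rows first (y), then position within the row (x).
    ordered = sorted(strategy_elements, key=lambda el: (el["y"], el["x"]))
    return {f"elem_{n}": el["id"] for n, el in enumerate(ordered[:detected_count], start=1)}
```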
For each generated image (not cached, not placeholder):
PYTHONPATH="$PLUGIN_ROOT" python3 -c "
from src.process_image import resize, crop_to_aspect, compute_content_hash
resize('$PATH', $WIDTH, $HEIGHT)
crop_to_aspect('$PATH', '16:9')
content_hash = compute_content_hash('$PATH')
print(f'hash:{content_hash}')
"
PYTHONPATH="$PLUGIN_ROOT" python3 -c "
from src.cache_manager import ImageCacheManager
cache = ImageCacheManager()
with open('$IMAGE_PATH', 'rb') as f:
    cache.put('$CACHE_KEY', f.read())
cache.close()
"
Each image entry in $IMAGES_LIST MUST include source_prompt — the translated prompt that was sent to the generation model. Example entry:
{
  "slide_number": 1,
  "file_path": "./tmp/deck/images/slide-01-hero.png",
  "status": "generated",
  "content_hash": "abc123...",
  "dimensions": {"width": 1024, "height": 576},
  "alt_text": "Headline text",
  "image_id": "slide-01-hero",
  "model_used": "x/z-image-turbo",
  "source_prompt": "A dramatic teal wave cresting over..."
}
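Because the production pipeline depends on source_prompt, a quick check before writing the manifest can catch missing prompts early. This is a sketch: the function name is hypothetical, and limiting the check to entries with status "generated" is an assumption about which entries carry a prompt.

```python
from typing import Any, Dict, List


def missing_source_prompts(images: List[Dict[str, Any]]) -> List[str]:
    """Return the image_ids of generated entries that lack a usable source_prompt."""
    missing = []
    for image in images:
        if image.get("status") != "generated":
            continue  # assumption: only generated entries must carry a prompt
        if not str(image.get("source_prompt") or "").strip():
            missing.append(image.get("image_id", "<unknown>"))
    return missing
```

An empty return value means every generated entry can be re-rendered from its stored prompt.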
PYTHONPATH="$PLUGIN_ROOT" python3 -c "
import json
from datetime import datetime, timezone
from src.deckcontext import write_contract
images = $IMAGES_LIST
manifest = {
    'generated_at': datetime.now(timezone.utc).isoformat(),
    'image_backend': 'multi-model',
    'images': images,
    'summary': {
        'total_images': len(images),
        'generated_count': sum(1 for i in images if i['status'] == 'generated'),
        'cached_count': sum(1 for i in images if i['status'] == 'cached'),
        'placeholder_count': sum(1 for i in images if i['status'] == 'placeholder'),
        'failed_count': sum(1 for i in images if i['status'] == 'failed'),
        'total_generation_seconds': round(sum(i.get('generation_time_seconds', 0) for i in images), 2),
    },
}
write_contract('./tmp/deck', 'image-manifest', manifest)
print(json.dumps(manifest['summary'], indent=2))
"
PYTHONPATH="$PLUGIN_ROOT" python3 -c "
import json
from src.deckcontext import write_contract
charts = $CHARTS_LIST
write_contract('./tmp/deck', 'chart-manifest', {'charts': charts})
print(f'Charts rendered: {len(charts)}')
"
=== Image Generation Summary ===
Mode: draft|production
Provider availability: Ollama (yes/no), OpenAI (yes/no), Google (yes/no), FAL (yes/no), Recraft (yes/no)
Images:
  Total: N
  Generated: N (N via Ollama, N via cloud)
  Cached: N (saved $X.XX)
  Placeholders: N
  Failed: N
Charts:
  Total: N
  Rendered: N
Budget:
  Spent: $X.XX / $X.XX (NN%)
  Budget state: allow|allow_with_caps|degrade|typography_only
Timing:
  Total generation time: Xs
  Average per image: Xs
Do not ask follow-up questions. Report and stop.
npx claudepluginhub stevegjones/jack-tar-deckhand --plugin jack-tar-deckhand