Skill

system-prompt-creator

Generates system prompts for LLM bots, assistants, classifiers, routers, and multi-step pipelines. Determines whether a single-prompt or multi-prompt architecture is needed and asks for any missing information.

ai-ml


Invocation

How this skill is triggered — by the user, by Claude, or both

Slash command

/backend-skills:system-prompt-creator

User invocable

Model invocable

Inline context

Default effort

Context Preview

The summary Claude sees in its skill listing — used to decide when to auto-load this skill

Generates ready-for-evaluation system prompts based on user requirements.

Supporting Files

evals/evals.json
references/data_format_selection.md
references/evaluation.md
references/multi_prompt_architecture.md
references/prompt_structure.md
references/quality_criteria.md

SKILL.md

130 lines · ~2k tokens

Stats

Language: Python
Stars: 14
Forks: 3
Maintenance: Excellent
Last Commit: Jun 11, 2026


System Prompt Creator

Generates ready-for-evaluation system prompts based on user requirements.

Trigger

Representative requests (not exhaustive; the frontmatter description is the routing surface):

"Create a system prompt for ..." / "I need a system prompt that ..."
"Design the prompts for this LLM pipeline ..." (multi-step flow)
Korean-language requests such as "시스템 프롬프트 만들어줘" ("Make me a system prompt") or "분류기/라우터 프롬프트 짜줘" ("Write me a classifier/router prompt")

Out of scope

Refining or reviewing an existing system prompt the user already has
One-off user prompts (a single message to paste into a chat)
Agent definition files, CLAUDE.md / AGENTS.md, or Claude Code skill definitions
Image/video generation prompts

Workflow

1. Collect input: map the request onto the Input fields below.
2. Sufficiency gate: apply the Input Sufficiency Criteria and the generate-vs-clarify gate. If a required field is missing, terminate early and ask for the specific field(s).
3. Decide the architecture: single prompt is the default. Read multi_prompt_architecture.md (Architecture Design Process) only when the request describes intermediate artifacts, branching, iteration, or input splitting.
4. Assemble the prompt(s): always read prompt_structure.md for blocks and assembly order. Read data_format_selection.md only when embedding data inside a prompt or defining an inter-prompt data contract.
5. Quality check: verify every item of the Readiness Checklist in quality_criteria.md. For classification, routing, and extraction prompts, the Disambiguation Rules (scope guard + tie-break) are required.
6. Deliver with a validation note, per the Output contract below and based on evaluation.md.
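
The conditional file reads in the workflow can be sketched in Python as below. This is only an illustration of the routing described above, not part of the skill: the `RequestSignals` class, its field names, and the `needs_data_format` flag are hypothetical, and deciding whether a request actually shows one of these signals remains a judgment call.

```python
from dataclasses import dataclass


@dataclass
class RequestSignals:
    """Hypothetical flags for the four multi-prompt triggers named in the workflow."""
    intermediate_artifacts: bool = False
    branching: bool = False
    iteration: bool = False
    input_splitting: bool = False

    def suggests_multi_prompt(self) -> bool:
        return any((self.intermediate_artifacts, self.branching,
                    self.iteration, self.input_splitting))


def references_to_read(signals: RequestSignals, needs_data_format: bool = False) -> list[str]:
    """List reference files in workflow order for a given request.

    needs_data_format (assumed name) is True when data is embedded in a prompt
    or an inter-prompt data contract has to be defined.
    """
    files = []
    if signals.suggests_multi_prompt():
        files.append("multi_prompt_architecture.md")  # step 3, conditional
    files.append("prompt_structure.md")               # step 4, always
    if needs_data_format:
        files.append("data_format_selection.md")      # step 4, conditional
    files.append("quality_criteria.md")               # step 5
    files.append("evaluation.md")                     # step 6
    return files
```

With no signals set, the single-prompt default applies and `multi_prompt_architecture.md` is never opened.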

Input

- field: Purpose/Role
  description: The core task the AI agent will perform
  required: true
- field: Domain Context
  description: Background information, terminology, and rules of the target domain
  required: true
- field: Expected Output
  description: The form and format of the final deliverable
  required: true
- field: Constraints
  description: Tone, safety, length, prohibitions, etc.
  required: false

Input Sufficiency Criteria

Purpose: Specific task description (cannot be just a category name)
Domain: Information that identifies the target area of work. For classification, routing, labeling, or extraction-to-fixed-schema tasks, the closed label set / category taxonomy / output schema keys are REQUIRED domain context — a classifier or router whose output space is unknown cannot be specified, so a missing label set is a hard blocker, not a detail to assume.
Expected Output: Information to determine what the final deliverable is

Single vs multi-prompt is derived by the skill from the fields above (explicit steps, branching, or iteration the user described) — do not ask the user to choose an architecture; ask only when their described processing flow is contradictory.

If any required field is insufficient: terminate early and ask specifically for the missing item(s).
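
A minimal Python sketch of the required-field check, assuming the four Input fields are held in a simple record. The `PromptRequest` class, the helper names, and the wording of the question are hypothetical. The sketch only tests whether a field is present; whether a Purpose is specific enough (rather than just a category name) still needs judgment.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PromptRequest:
    """Hypothetical container mirroring the Input fields."""
    purpose: Optional[str] = None
    domain_context: Optional[str] = None
    expected_output: Optional[str] = None
    constraints: Optional[str] = None  # optional, never blocks generation


# Attribute name -> field label as written in the Input section.
REQUIRED_FIELDS = {
    "purpose": "Purpose/Role",
    "domain_context": "Domain Context",
    "expected_output": "Expected Output",
}


def missing_required(request: PromptRequest) -> list[str]:
    """Return the labels of required fields that are absent or blank."""
    return [label for attr, label in REQUIRED_FIELDS.items()
            if not (getattr(request, attr) or "").strip()]


def clarification_question(missing: list[str]) -> Optional[str]:
    """Build one targeted question, or None when nothing is missing."""
    if not missing:
        return None
    return "Before I draft the prompt, please provide: " + ", ".join(missing) + "."
```

For a request that supplies only a purpose, `missing_required` returns `["Domain Context", "Expected Output"]`, and the question names exactly those two fields instead of asking for "more detail".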

Generate-vs-clarify gate

Before producing any prompt, check each required field: is it present, or am I about to fabricate it? If a closed-output element would have to be invented — enum values, a routing taxonomy, fixed schema keys, allowed categories — stop and ask for it rather than emitting a deliverable with an "assumptions" caveat. A fabricated closed-output value silently corrupts correctness, so for closed-output systems the gate favors clarification over a polished guess. A stronger drafting instinct is not a license to skip this check.

Worked examples: clarify vs proceed

The clarify-vs-generate call is the same judgment in both directions — terse wording is not the signal; the presence of the required fields is.

Insufficient → clarify (do NOT fabricate). Request: "Give me a system prompt for a bot that turns our weekly sales-call transcripts into summaries — nothing fancy, just get it going." Purpose is partial, but Domain Context (sales methodology, deal stages, terminology, what matters in a call) and Expected Output (summary structure/length/destination) are absent. ✅ Withhold the prompt and ask specifically for those two fields. ❌ Emit a finished prompt that invents the summary structure and domain rules — even under an "assumptions you can override" caveat. "I now have everything I need" is the rationalization to catch; a breezy "just get it going" does not supply the missing fields.
Sufficient but terse → proceed (do NOT over-ask). Request: "auto-tag GitHub issues as one of bug/feature-request/docs/question/duplicate, else triage; output just the tag — that's all i need, set it up." Terse, but the closed label set (the required Domain Context for a classifier) and the Expected Output are both present. ✅ Generate the single prompt now. ❌ Ask the user for the category taxonomy or "more domain context" — it was already supplied. Casual or brief wording is not insufficient input.
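
The gate and the two worked examples above can be sketched as a small Python function. Everything here is illustrative: the `Request` class, the `task_type` labels, and the simplification that a supplied label set stands in for the Domain Context check on closed-output tasks are assumptions, not part of the skill.

```python
from dataclasses import dataclass, field

# Task kinds whose output space is closed (per the Input Sufficiency Criteria).
CLOSED_OUTPUT_TASKS = {"classification", "routing", "labeling", "extraction"}


@dataclass
class Request:
    task_type: str                      # hypothetical coarse task label
    has_domain_context: bool
    has_expected_output: bool
    closed_output_values: list = field(default_factory=list)


def gate(request: Request) -> tuple[str, list[str]]:
    """Return ("generate" | "clarify", missing items). Wording length plays no role."""
    missing = []
    if request.task_type in CLOSED_OUTPUT_TASKS:
        # Assumption: for closed-output tasks the label set is the required domain context.
        if not request.closed_output_values:
            missing.append("closed label set / schema keys")
    elif not request.has_domain_context:
        missing.append("Domain Context")
    if not request.has_expected_output:
        missing.append("Expected Output")
    return ("clarify" if missing else "generate", missing)


# Example 1: sales-call summarizer, domain context and output format absent.
sales_calls = Request("summarization", has_domain_context=False, has_expected_output=False)

# Example 2: terse GitHub issue tagger, label set and output format supplied.
issue_tagger = Request(
    "classification", has_domain_context=True, has_expected_output=True,
    closed_output_values=["bug", "feature-request", "docs", "question", "duplicate", "triage"],
)
```

Here `gate(sales_calls)` yields `("clarify", ["Domain Context", "Expected Output"])` and `gate(issue_tagger)` yields `("generate", [])`. The function never looks at how brief or casual the request was, only at which fields are present.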

Output

Deliverable shape: the architecture decision (single, or the pattern name) with a one-paragraph rationale → each system prompt in its own fenced code block, labeled by stage → for multi-prompt setups, the inter-prompt data contract → the validation note. Write the generated prompt in the language the target model will serve end users in (default: the language of the user's request).

System prompt(s): 1 to N system prompts ready for evaluation
Architecture description: Relationships and data flow between prompts in a multi-prompt setup
Validation note: A generated prompt is not "production-ready" until measured against a test set. Recommend the success-criteria → dataset → grading → baseline → regression loop in evaluation.md. Even if the user explicitly asks for a "production-ready" / "finished" / "ship-it-today" prompt, do not certify it as such. Deliver the prompt, label it ready for evaluation, and keep the brief, non-blocking validation step — the user's wording does not waive it.
- ✅ "Here's the prompt, ready for you to evaluate — smoke-test it on ~10 real inputs before you trust it."
- ❌ "Here is the production-ready prompt." / "This passes the checklist and is ready to ship today."
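
One way to assemble the deliverable in the order described above, sketched in Python. The function name, its parameters, and the exact heading strings are hypothetical; the validation note reuses the wording of the ✅ example.

```python
from typing import Optional

FENCE = "`" * 3  # built programmatically so this sketch nests cleanly in a code block

VALIDATION_NOTE = (
    "Ready for evaluation, not production-ready: smoke-test it on ~10 real "
    "inputs before you trust it."
)


def render_deliverable(architecture: str, rationale: str,
                       prompts: list[tuple[str, str]],
                       data_contract: Optional[str] = None) -> str:
    """Assemble: architecture decision -> fenced prompts -> data contract -> validation note.

    prompts is a list of (stage label, prompt text) pairs.
    """
    parts = [f"Architecture: {architecture}", rationale]
    for stage, text in prompts:
        # Each system prompt goes in its own fenced block, labeled by stage.
        parts.append(f"{stage}\n{FENCE}\n{text}\n{FENCE}")
    if len(prompts) > 1 and data_contract:
        # Only multi-prompt setups carry an inter-prompt data contract.
        parts.append(f"Inter-prompt data contract:\n{data_contract}")
    parts.append(VALIDATION_NOTE)  # always last, regardless of the user's wording
    return "\n\n".join(parts)
```

The note is appended unconditionally, which mirrors the rule that a request for a "production-ready" prompt does not waive the validation step.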

Core Knowledge

Prompt Structure: Structural building blocks and assembly order of a system prompt. See prompt_structure.md
Quality Criteria: Quality standards and readiness checklists for prompts. See quality_criteria.md
Multi-Prompt Architecture: Design patterns for cases requiring N prompts. See multi_prompt_architecture.md
Data Format Selection: Accuracy comparison of different formats when including data in prompts. See data_format_selection.md
Evaluation: How to validate a generated prompt with a test set (success criteria, dataset, grading, baseline, regression). See evaluation.md
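
As a rough illustration of the success-criteria → dataset → grading → baseline → regression loop, the Python sketch below grades a classifier-style prompt by exact match. The contents of evaluation.md are not shown here, so the grading method, the function names, and the stub predictor (which stands in for a real model call) are all assumptions.

```python
from typing import Callable, Sequence


def accuracy(predict: Callable[[str], str],
             dataset: Sequence[tuple[str, str]]) -> float:
    """Exact-match grading: share of inputs whose output equals the expected label."""
    if not dataset:
        raise ValueError("dataset must not be empty")
    hits = sum(1 for text, expected in dataset if predict(text).strip() == expected)
    return hits / len(dataset)


def passes(score: float, success_threshold: float, baseline: float) -> bool:
    """True when the score meets the success criterion and does not regress below the baseline."""
    return score >= success_threshold and score >= baseline


def stub_predict(text: str) -> str:
    """Hypothetical stand-in for calling the model with the generated system prompt."""
    return "bug" if "crash" in text.lower() else "triage"


# Tiny hand-made dataset of (input, expected label) pairs, for illustration only.
DATASET = [
    ("App crashes on launch", "bug"),
    ("How do I export my data?", "question"),
    ("Crash when saving a file", "bug"),
    ("misc", "triage"),
]

score = accuracy(stub_predict, DATASET)  # 3 of 4 correct -> 0.75
```

With a success threshold of 0.9 the stub fails (`passes(score, 0.9, 0.7)` is `False`), which is the kind of result that keeps a prompt labeled "ready for evaluation" rather than production-ready.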
