Skill

cutover-executor

Execute BOTH approved cutover checklists (control plane then data plane) against the target AWS account, one resource at a time. Reads cutover-checklist-control-plane.json, cutover-checklist-data-plane.json, data-migration-plan.json, migration-plan.json. Builds an execution-steps.json with preview/execute/verify/rollback/poll per step. Walks control-plane steps first (Terraform module applies + AWS control-plane API), then data-plane steps (snapshot share, restore, DataSync, DMS, freeze, route53 swap, validation). Mandatory per-step human approval. Halts and offers rollback on failure. Resumable via append-only JSONL journal — on resume, re-verifies any in-flight steps against AWS before continuing.

Invocation

Slash command

/aws-migration-architect:cutover-executor

Supporting Files

SPEC.md

SKILL.md

203 lines · ~4.9k tokens

Stats

Language: JavaScript

Parent stars: 0

Maintenance: Good

Last Commit: Jun 13, 2026

AWS Migration: Cutover Executor

This skill is what actually moves resources. Every prior skill produced artifacts; this one mutates AWS. The executor walks the two approved checklists in order: control plane (creating the empty target shape) first, then data plane (moving data into it, freezing writes, swapping DNS, validating). Per-step human approval is mandatory and cannot be turned off.

When to use this skill

After cutover-control-plane AND cutover-data-plane have produced their respective checklists AND a human has read both end-to-end and signed both (APPROVED BY: line near the top of each)
When the operator is ready to start the actual cutover window
On resume after a halted/aborted prior run (the journal makes this safe)

Prerequisites

cutover-checklist-control-plane.json + .md exist in the run directory
cutover-checklist-data-plane.json + .md exist
data-migration-plan.json exists (sizing, freeze windows, validation criteria)
migration-plan.json exists (rollback steps per phase)
dependency-graph.json, hardcoded-values.json, resource-ownership.json exist
Target IAM: target-cutover-control-plane.json attached for the entire run; target-cutover-data-plane.json additionally attached when the executor reaches the data-plane phases (executor prompts the operator at the handoff)
terraform/ modules generated by terraform-generator exist and terraform validate is clean
The human has read BOTH checklists end-to-end and added an APPROVED BY: <name> ON: <YYYY-MM-DD> line near the top of each markdown file

Inputs

| Input | Source | Required |
| --- | --- | --- |
| `cutover-checklist-control-plane.json` | `cutover-control-plane` | yes |
| `cutover-checklist-control-plane.md` | `cutover-control-plane` | yes (sign-off check) |
| `cutover-checklist-data-plane.json` | `cutover-data-plane` | yes |
| `cutover-checklist-data-plane.md` | `cutover-data-plane` | yes (sign-off check) |
| `data-migration-plan.json` | `data-migration-planner` | yes |
| `migration-plan.json` | `migration-planner` | yes |
| `dependency-graph.json` | `dependency-analyzer` | yes |
| `hardcoded-values.json` | `dependency-analyzer` | yes |
| `resource-ownership.json` | `inventory` | yes |
| `terraform/` directory | `terraform-generator` | yes |
| `execution-log.jsonl` (prior run) | self | only on resume |

Outputs

execution-steps.json — the compiled per-resource step list, validates against schemas/execution-step.schema.json (array of steps). Generated once before execution begins.
execution-log.jsonl — append-only journal. Every preview/approve/execute/verify/poll/rollback event is one line. Validates against schemas/execution-log.schema.json when read as { metadata, entries }.
execution/<step_id>.stdout.log, .stderr.log — full I/O of every executed command. The journal keeps tails; full output lives here.
execution-report.md — human-readable summary at end of run: per-phase outcomes, skipped/failed step IDs, links to log files.
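As a rough illustration of the append-only journal described above, here is a minimal Python sketch. The helper names `append_event` and `read_events` and the fields `ts`, `event`, and `step_id` are assumptions made for illustration; the real entry layout is whatever schemas/execution-log.schema.json defines.

```python
import json
import time
from pathlib import Path


def append_event(journal_path, event, step_id=None, **extra):
    """Append one event as a single JSON line; earlier lines are never rewritten."""
    entry = {
        "ts": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime()),  # hypothetical field
        "event": event,
        "step_id": step_id,
    }
    entry.update(extra)
    # Mode "a" only ever adds to the end of the file.
    with Path(journal_path).open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(entry, sort_keys=True) + "\n")
    return entry


def read_events(journal_path):
    """Read the journal back as a list of dicts, skipping blank lines."""
    with Path(journal_path).open(encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]
```

Opening the file in append mode for every event keeps each write independent, so a killed process leaves at most one partial trailing line.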

Workflow

Step 1 — Pre-flight gate

Before generating any execution steps, verify all of:

Concurrent-run lock. Read <root>/RUNNING.lock if it exists. If a live PID owns it, halt with the holder's identity. If the lock is stale (dead PID), require --resume to take over; otherwise halt and ask the operator to either resume or remove the stale lock. If no lock exists, continue; a fresh lock with our PID, hostname, started_at, and resolved operator identity is written only once every pre-flight check has passed.
aws sts get-caller-identity --profile $MIGRATION_SOURCE_PROFILE succeeds
aws sts get-caller-identity --profile $MIGRATION_TARGET_PROFILE succeeds
target-cutover-control-plane.json permissions are attached
terraform init && terraform validate in terraform/root/ is clean
cutover-checklist-control-plane.md has a human approval signature (look for ^APPROVED BY: .+ ON: \d{4}-\d{2}-\d{2} line within the first 50 lines — if absent, halt and tell the operator)
cutover-checklist-data-plane.md has its own approval signature (same format, separate sign-off — operator must approve both checklists individually)
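The sign-off check on the two markdown checklists could look roughly like the sketch below. The regular expression is the one quoted in the list above; the function name `has_approval_signature` is an assumption for illustration.

```python
import re

# Pattern quoted from the pre-flight check; MULTILINE makes ^ match at each line start.
APPROVAL_RE = re.compile(r"^APPROVED BY: .+ ON: \d{4}-\d{2}-\d{2}", re.MULTILINE)


def has_approval_signature(markdown_text, max_lines=50):
    """Return True if an approval line appears within the first max_lines lines."""
    head = "\n".join(markdown_text.splitlines()[:max_lines])
    return APPROVAL_RE.search(head) is not None
```

The check would run once per checklist, since each file needs its own sign-off.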

Operator identity is resolved once at pre-flight and used throughout the run for every approval-class journal event. Resolution order: $USER → git config --global user.email → "unknown".
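A minimal sketch of that resolution order, assuming a POSIX-style environment. The function names and the injectable `env` and `git_email` parameters are hypothetical and exist only to make the sketch testable.

```python
import os
import subprocess


def git_global_email():
    """Return `git config --global user.email`, or "" if git is missing or unset."""
    try:
        out = subprocess.run(
            ["git", "config", "--global", "user.email"],
            capture_output=True, text=True, check=False,
        )
    except OSError:
        return ""
    return out.stdout.strip()


def resolve_operator(env=None, git_email=git_global_email):
    """Apply the resolution order: $USER, then git email, then "unknown"."""
    env = os.environ if env is None else env
    user = env.get("USER", "").strip()
    if user:
        return user
    email = git_email()
    return email if email else "unknown"
```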

If any pre-flight check fails: write a single execution-log.jsonl entry of event step-failed with notes explaining which check failed, and stop. Do not generate execution-steps.json. (Lock file is NOT written if pre-flight fails — no cleanup needed.)

There is a SECOND IAM check at the control-plane → data-plane handoff (Step 4): the executor prompts the operator to attach target-cutover-data-plane.json before proceeding to data-plane steps, and refuses to advance until simulation confirms the attachment.

Mid-run SSO probe. SSO sessions expire (often after about 1 hour, depending on the configured session duration). The executor probes both profiles via aws sts get-caller-identity whenever 15 minutes have elapsed since the last successful probe. On failure: halt loudly with aws sso login --profile <name> instructions and the resume command. The journal records each probe outcome (sso-probe-passed / sso-probe-failed); the lock file stays so resume can take over.
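The probe timing can be sketched as below. The names `probe_due` and `probe_profiles` are assumptions, and `run_probe` stands in for a wrapper around `aws sts get-caller-identity --profile <name>` that this sketch does not implement.

```python
PROBE_INTERVAL_SECONDS = 15 * 60  # from the text: probe once 15 minutes have elapsed


def probe_due(last_success_ts, now_ts, interval=PROBE_INTERVAL_SECONDS):
    """True when at least `interval` seconds have passed since the last good probe."""
    return last_success_ts is None or (now_ts - last_success_ts) >= interval


def probe_profiles(profiles, run_probe):
    """Probe each profile and return (journal_event, failed_profiles).

    `run_probe(profile)` is an injected callable that returns True on success.
    """
    failed = [p for p in profiles if not run_probe(p)]
    event = "sso-probe-failed" if failed else "sso-probe-passed"
    return event, failed
```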

Lock lifecycle. Written at pre-flight success. Removed at clean exit (completed / aborted / halted-by-operator). Left in place on unclean exit (process killed) so the next --resume invocation can detect it as stale and take over.
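One possible way to classify an existing lock, assuming the lock file is JSON with a `pid` key (the on-disk format is not specified here) and that the holder ran on the same POSIX host. The state names returned by `classify_lock` are hypothetical.

```python
import json
import os
from pathlib import Path


def pid_alive(pid):
    """POSIX liveness check: signal 0 tests for existence without killing."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # process exists but belongs to another user
    return True


def classify_lock(lock_path, resume=False, is_alive=pid_alive):
    """Return 'free', 'held', 'takeover', or 'stale-needs-resume'."""
    path = Path(lock_path)
    if not path.exists():
        return "free"
    holder = json.loads(path.read_text(encoding="utf-8"))
    if is_alive(holder["pid"]):
        return "held"  # a live run owns the lock: halt
    return "takeover" if resume else "stale-needs-resume"
```

A PID check is only meaningful on the host that wrote the lock, which is presumably why the lock also records the hostname.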

Step 2 — Compile execution-steps.json from BOTH checklists

Walk control-plane first, then data-plane. The compiled execution-steps.json is a single flat array with steps in execution order: all 7 control-plane phases (0–6), then all 5 data-plane phases (1–5).

Step ID prefix carries the plane:

Control-plane steps: cp-phase{N}-{seq}-{slug} (N = 0..6)
Data-plane steps: dp-phase{N}-{seq}-{slug} (N = 1..5)

The handoff between control plane and data plane is itself an execution step (cp-phase6-999-handoff-to-data-plane) with action: "manual-decision". Its execute_cmd is a no-op; its approval prompt summarizes the handoff_to_data_plane.criteria[] from the control-plane checklist and asks the operator to confirm all are met PLUS that target-cutover-data-plane.json is now attached.

For each during_cutover item in either checklist, emit one or more execution-step entries. The mapping rules:

Each checklist item that names a tool (control plane) or has operation_type (data plane) produces ≥1 execution step. Items without a tool (e.g. "stakeholders notified") become action: "manual-decision" with tool: "human".
step_id is deterministic: {cp|dp}-phase{N}-{seq}-{slug} where seq is zero-padded 3-digit order within the phase and slug is kebab(resource-name + action). Stable across reruns so the journal matches.
Control-plane terraform-apply steps are module-level. One step per Terraform module (networking, storage, etc.), execute_cmd: terraform apply -target=module.<name>. NOT per-resource.
requires[] carries the dependency edges: a Phase 2 S3 sync requires the Phase 1 target-bucket terraform-apply step; an RDS snapshot-restore requires the snapshot-share step; KMS grants precede any encrypted snapshot operation.
rollback_cmd is pulled from migration-plan.phases[N].rollback.steps[]. When the plan's rollback is a list, join with && only when all steps are idempotent; otherwise emit the first step and put the remainder in notes so the human runs them manually.
approval_required is always true for risk: high, for any action containing delete/promote/route53-change, and for any step whose resource_arn appears in hardcoded-values.manual_review_required[].
long_running: true for datasync-start, dms-start-replication-task, s3-batch-replication-job, dynamodb-export, dynamodb-import. Each MUST set poll_cmd, poll_interval_seconds (default 60), and poll_terminal_states.
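Two of the mapping rules above, the deterministic step_id and the approval_required rule, can be sketched as follows. The helper names and the exact slug normalisation in `kebab` are assumptions; the step fields (`action`, `risk`, `resource_arn`) follow the names used in the rules.

```python
import re


def kebab(text):
    """Lowercase and join alphanumeric runs with single hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def make_step_id(plane, phase, seq, resource_name, action):
    """Build {cp|dp}-phase{N}-{seq}-{slug} with a zero-padded 3-digit seq."""
    if plane not in ("cp", "dp"):
        raise ValueError("plane must be 'cp' or 'dp'")
    return f"{plane}-phase{phase}-{seq:03d}-{kebab(resource_name + ' ' + action)}"


def approval_required(step, manual_review_arns):
    """Apply the three approval rules from the mapping list."""
    action = step["action"]
    return (
        step.get("risk") == "high"
        or any(word in action for word in ("delete", "promote", "route53-change"))
        or step.get("resource_arn") in manual_review_arns
    )
```

Because the ID depends only on plane, phase, sequence, and resource, recompiling the same checklists yields the same IDs, which is what lets a resumed run match journal entries to steps.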

Canonical command templates per action (the agent expands placeholders from the checklist + dependency graph):

| Action | preview_cmd | execute_cmd | verify_cmd |
| --- | --- | --- | --- |
| `terraform-apply` | `terraform plan -target={addr}` | `terraform apply -target={addr} -auto-approve` | `terraform state show {addr}` |
| `snapshot-share` | `aws rds describe-db-snapshot-attributes --db-snapshot-identifier {id} --profile $MIGRATION_SOURCE_PROFILE` | `aws rds modify-db-snapshot-attribute --db-snapshot-identifier {id} --attribute-name restore --values-to-add {target_acct} --profile $MIGRATION_SOURCE_PROFILE` | `aws rds describe-db-snapshot-attributes --db-snapshot-identifier {id} --profile $MIGRATION_SOURCE_PROFILE \| grep {target_acct}` |
| `snapshot-restore` | (none; destructive in target) | `aws rds restore-db-instance-from-db-snapshot --db-instance-identifier {new_id} --db-snapshot-identifier {arn} --profile $MIGRATION_TARGET_PROFILE` | `aws rds describe-db-instances --db-instance-identifier {new_id} --query 'DBInstances[0].DBInstanceStatus' --profile $MIGRATION_TARGET_PROFILE` |
| `ami-share` | `aws ec2 describe-image-attribute --image-id {ami} --attribute launchPermission --profile $MIGRATION_SOURCE_PROFILE` | `aws ec2 modify-image-attribute --image-id {ami} --launch-permission "Add=[{UserId={target_acct}}]" --profile $MIGRATION_SOURCE_PROFILE` | `aws ec2 describe-image-attribute --image-id {ami} --attribute launchPermission --profile $MIGRATION_SOURCE_PROFILE \| grep {target_acct}` |
| `kms-grant` | `aws kms list-grants --key-id {key} --profile $MIGRATION_SOURCE_PROFILE` | `aws kms create-grant --key-id {key} --grantee-principal arn:aws:iam::{target_acct}:root --operations Decrypt DescribeKey --profile $MIGRATION_SOURCE_PROFILE` | `aws kms list-grants --key-id {key} --profile $MIGRATION_SOURCE_PROFILE \| grep {target_acct}` |
| `datasync-start` | `aws datasync describe-task --task-arn {task_arn} --profile $MIGRATION_TARGET_PROFILE` | `aws datasync start-task-execution --task-arn {task_arn} --profile $MIGRATION_TARGET_PROFILE` | (poll-only) |
| `dms-start-replication-task` | `aws dms describe-replication-tasks --filters Name=replication-task-arn,Values={task} --profile $MIGRATION_TARGET_PROFILE` | `aws dms start-replication-task --replication-task-arn {task} --start-replication-task-type start-replication --profile $MIGRATION_TARGET_PROFILE` | (poll-only) |
| `s3-sync` | `aws s3 sync s3://{src} s3://{tgt} --dryrun --profile $MIGRATION_TARGET_PROFILE` | `aws s3 sync s3://{src} s3://{tgt} --profile $MIGRATION_TARGET_PROFILE` | `aws s3 ls s3://{tgt} --summarize --recursive --profile $MIGRATION_TARGET_PROFILE \| tail -2` |
| `s3-batch-replication-job` | `aws s3control describe-job --account-id {target_acct} --job-id {id} --profile $MIGRATION_TARGET_PROFILE` | `aws s3control create-job --account-id {target_acct} --operation '{...}' --report '{...}' --manifest '{...}' --profile $MIGRATION_TARGET_PROFILE` | (poll-only) |
| `dynamodb-export` | `aws dynamodb describe-table --table-name {src} --profile $MIGRATION_SOURCE_PROFILE` | `aws dynamodb export-table-to-point-in-time --table-arn {arn} --s3-bucket {bucket} --profile $MIGRATION_SOURCE_PROFILE` | (poll-only) |
| `dynamodb-import` | `aws s3 ls s3://{bucket}/{prefix} --profile $MIGRATION_TARGET_PROFILE` | `aws dynamodb import-table --s3-bucket-source '{...}' --input-format DYNAMODB_JSON --table-creation-parameters '{...}' --profile $MIGRATION_TARGET_PROFILE` | (poll-only) |
| `secret-put-value` | `aws secretsmanager describe-secret --secret-id {name} --profile $MIGRATION_TARGET_PROFILE` | `aws secretsmanager put-secret-value --secret-id {name} --secret-string file://{secret_file} --profile $MIGRATION_TARGET_PROFILE` | `aws secretsmanager get-secret-value --secret-id {name} --query SecretString --profile $MIGRATION_TARGET_PROFILE \| wc -c` |
| `route53-change` | `aws route53 list-resource-record-sets --hosted-zone-id {zid} --profile $MIGRATION_TARGET_PROFILE` (Route 53 has no dry-run mode; the planner emits the change-batch JSON, and the preview shows the current records beside it as a diff) | `aws route53 change-resource-record-sets --hosted-zone-id {zid} --change-batch file://{change_file} --profile $MIGRATION_TARGET_PROFILE` | `dig +short @8.8.8.8 {name}` |
| `rds-promote-read-replica` | `aws rds describe-db-instances --db-instance-identifier {id} --profile $MIGRATION_TARGET_PROFILE` | `aws rds promote-read-replica --db-instance-identifier {id} --profile $MIGRATION_TARGET_PROFILE` | `aws rds describe-db-instances --db-instance-identifier {id} --query 'DBInstances[0].StatusInfos' --profile $MIGRATION_TARGET_PROFILE` |

For secret-put-value: the executor never inlines the secret. The operator places the value in a file outside the run directory and supplies its path at prompt time.
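A sketch of how the command might be assembled so that the secret never appears on the command line or in the journal. It relies on the AWS CLI's `file://` prefix for loading a parameter value from a file; the function name `build_secret_put_cmd` and the outside-the-run-directory check are illustrative assumptions.

```python
import shlex
from pathlib import Path


def build_secret_put_cmd(secret_name, secret_file, run_dir):
    """Return the put-secret-value command, reading the value via file://."""
    secret_path = Path(secret_file).resolve()
    run_path = Path(run_dir).resolve()
    # Refuse files inside the run directory so the value never sits next to logs.
    if secret_path == run_path or run_path in secret_path.parents:
        raise ValueError("secret file must live outside the run directory")
    return (
        "aws secretsmanager put-secret-value"
        f" --secret-id {shlex.quote(secret_name)}"
        f" --secret-string {shlex.quote('file://' + str(secret_path))}"
        " --profile $MIGRATION_TARGET_PROFILE"
    )
```

Only the file path ends up in execute_cmd, so journal tails and the approval prompt show where the value comes from without showing the value itself.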

Validate execution-steps.json against schemas/execution-step.schema.json. Halt if validation fails — do not start executing.

Step 3 — Show plan, get final go/no-go

Print a one-screen summary:

Execution plan compiled.
  Total steps:        N
  Per-phase:          P1=… P2=… P3=… P4=… P5=… P6=…
  Long-running:       N (DataSync=… DMS=… S3-batch=… DynamoDB=…)
  High-risk:          N
  Approval-required:  N (all high-risk + delete/promote/route53)
  Manual-decision:    N (human-only steps, no AWS calls)
  Steps file:         <path>/execution-steps.json
  Journal:            <path>/execution-log.jsonl

Ask the operator: proceed / cancel. On cancel, write a single journal entry event: aborted with notes: "operator declined at pre-flight" and stop.

Step 4 — Walk the steps in dependency order

For each step where requires[] is satisfied (all predecessors are step-succeeded in the journal):

1. Append step-started to the journal.
2. Preview. If preview_cmd is set, run it (read-only). Append preview-shown with the stdout tail.
3. Approval. Always prompt the human with: step description, resource ARN, risk, the exact execute_cmd that will run, and the preview output. Append approval-requested. Choices: approve / skip / abort.
   - approve → append approved
   - skip → append skipped, mark step state skipped, move to next step
   - abort → append aborted, write summary with verdict: "aborted", stop
4. Execute. Run execute_cmd via Bash. Capture stdout/stderr to execution/<step_id>.stdout.log / .stderr.log. Append executed with exit_code and tails.
5. For long-running steps: instead of going to step 6 directly, enter the poll loop. Run poll_cmd every poll_interval_seconds. Append poll-tick with the parsed status. When the status matches poll_terminal_states, append poll-terminal and proceed to verify. If the terminal status is a failure value (e.g. failed, stopped), treat it as a verify failure.
6. Verify. Run verify_cmd. Match verify_success_pattern if set. Append verify-passed or verify-failed.
7. On success: append step-succeeded, move to next step.
8. On failure (verify-failed or non-zero execute exit): append step-failed, then enter the rollback dialog.
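Picking the next runnable step from the journal could be sketched like this. It assumes journal entries are dicts with `event` and `step_id` keys and that steps carry `step_id` and `requires`; the function names are hypothetical.

```python
def succeeded_ids(journal_entries):
    """Collect step_ids whose journal contains a step-succeeded event."""
    return {e["step_id"] for e in journal_entries if e.get("event") == "step-succeeded"}


def next_ready_step(steps, journal_entries):
    """Return the first pending step whose requires[] are all step-succeeded.

    Steps already succeeded or skipped are passed over; None means nothing is runnable.
    """
    done = succeeded_ids(journal_entries)
    skipped = {e["step_id"] for e in journal_entries if e.get("event") == "skipped"}
    for step in steps:  # steps are already in execution order
        sid = step["step_id"]
        if sid in done or sid in skipped:
            continue
        if all(req in done for req in step.get("requires", [])):
            return step
    return None
```

In this sketch a skipped predecessor does not satisfy requires[], so its dependents stay blocked until the operator deals with it.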

Step 5 — Rollback dialog on failure

When a step fails:

Show the operator: the step that failed, exit code + stderr tail, the rollback_cmd from the plan, and the rollback window (from migration-plan.phases[N].rollback.rollback_window_minutes).
Ask: retry / rollback / abort.
- retry → re-run execute_cmd (counts as a new attempt; same step_id, new entries). Limit 3 retries per step; after that, only rollback/abort offered.
- rollback → append rollback-approved, run rollback_cmd, append rollback-executed with result. Then ask: continue with next step / abort.
- abort → write summary with verdict: "halted", halt_reason, halt_step_id, stop.
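The options offered in the dialog, including the retry cap, can be sketched as below. The function names are assumptions; the summary keys are the ones named in the abort branch.

```python
MAX_RETRIES = 3  # from the text: limit 3 retries per step


def dialog_choices(retries_used):
    """Options offered after a failure; retry disappears once the limit is hit."""
    if retries_used < MAX_RETRIES:
        return ["retry", "rollback", "abort"]
    return ["rollback", "abort"]


def halted_summary(step_id, reason):
    """Summary written when the operator chooses abort in the rollback dialog."""
    return {"verdict": "halted", "halt_reason": reason, "halt_step_id": step_id}
```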

Some steps are inherently irreversible (e.g. kms-grant, certain route53-change situations where TTL has propagated). For these, the rollback_cmd field carries the best-effort undo and notes flags the irreversibility. The executor surfaces that flag prominently in the rollback dialog.

Step 6 — Resume semantics

If execution-log.jsonl already exists for this run:

Read it. Reconstruct per-step state from the last event for each step_id.
For any step in state executed or poll-tick (not yet terminal/verified) — these were in-flight when the prior run stopped:
- Append resume-reverified.
- Run verify_cmd (or poll_cmd + check terminal for long-running) against AWS now.
- If it succeeds: append resume-marked-done and treat as step-succeeded.
- If it does not succeed: append resume-reprompted, show the operator the situation, ask retry / skip / abort.
For steps in state step-succeeded or skipped: leave them. Do not re-prompt.
For steps in state step-failed, aborted: surface them and ask the operator whether to retry, skip, or abort.
After the in-flight reconciliation completes, continue normal execution from the next pending step.
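Reconstructing per-step state from the journal might look like the sketch below. It assumes each JSONL line is an object with `event` and, for step-level events, `step_id`; the bucket names `reverify`, `leave`, and `ask_operator` are hypothetical labels for the three cases above.

```python
import json

IN_FLIGHT = {"executed", "poll-tick"}
SETTLED = {"step-succeeded", "skipped"}
NEEDS_OPERATOR = {"step-failed", "aborted"}


def last_event_per_step(jsonl_text):
    """Map step_id -> last event seen, in journal order."""
    state = {}
    for line in jsonl_text.splitlines():
        if not line.strip():
            continue
        entry = json.loads(line)
        if entry.get("step_id"):  # run-level events such as SSO probes have none
            state[entry["step_id"]] = entry["event"]
    return state


def resume_plan(jsonl_text):
    """Bucket steps into the three resume cases; other states stay pending."""
    plan = {"reverify": [], "leave": [], "ask_operator": []}
    for step_id, event in last_event_per_step(jsonl_text).items():
        if event in IN_FLIGHT:
            plan["reverify"].append(step_id)
        elif event in SETTLED:
            plan["leave"].append(step_id)
        elif event in NEEDS_OPERATOR:
            plan["ask_operator"].append(step_id)
    return plan
```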

Step 7 — End-of-run reporting

When the walk reaches the last step (success), the operator aborts, or a rollback completes and the operator chooses not to continue:

If the run ended with the last step succeeding, append that step's step-succeeded event as the final journal entry.
Write the summary object to a sibling execution-summary.json derived from the journal. The JSONL itself stays append-only and is never rewritten.
Generate execution-report.md with: verdict, per-phase counts, list of skipped/failed step IDs with links to their .stderr.log, total elapsed time, total approvals.
If verdict is completed, remind the operator to run /aws-migration-architect:audit next.

Failure modes the executor must handle gracefully

AWS API throttling. Retry execute_cmd once with 30s backoff before treating as failure. Log the retry as notes.
SSO session expiry mid-run. Detect ExpiredTokenException. Halt with a clear message: aws sso login --profile <name> and resume.
Terraform state lock. Surface the lock holder and offer the operator the unlock command. Never force-unlock automatically.
Pre-existing target resource. If verify_cmd indicates the target already exists (e.g. snapshot already restored) before execute_cmd ran, treat as step-succeeded and log notes: "target already existed; skipped execute".
Operator walks away during long poll. Polling continues silently and the journal accumulates poll-tick entries. On poll-terminal, the executor blocks waiting for the next prompt; it does not auto-advance.
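The unattended poll behaviour can be sketched as a loop that records ticks and returns at a terminal state without moving on. The function name, the `failure_states` parameter, and the injected `poll`, `on_tick`, and `sleep` callables are assumptions for illustration; `poll` stands in for running poll_cmd and parsing its status.

```python
import time


def poll_until_terminal(poll, terminal_states, failure_states, interval_seconds=60,
                        on_tick=lambda status: None, sleep=time.sleep):
    """Call `poll()` until it returns a terminal status.

    Returns ("poll-terminal", status, ok), where ok is False for failure
    statuses. The caller still has to prompt the operator; nothing here
    advances to the next step.
    """
    while True:
        status = poll()
        on_tick(status)  # e.g. append a poll-tick journal entry
        if status in terminal_states:
            return "poll-terminal", status, status not in failure_states
        sleep(interval_seconds)
```

Injecting `sleep` keeps the loop testable without waiting for real intervals.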

Related skills

cutover-control-plane / cutover-data-plane — produce the two checklists this skill consumes
migration-planner — provides the rollback steps
post-migration-auditor — runs after a successful execution

Sub-agent

Calls cutover-executor to do the walk.
