Skill

aws-bootstrap

Use when the user wants to provision AWS GPU instances, check GPU instance status, terminate instances, manage EBS data volumes, clean up cloud resources, check GPU vCPU quotas, or request quota increases. Wraps the aws-bootstrap CLI for EC2 GPU instance lifecycle management (launch, status, terminate, cleanup, list, quota).

Popularity

Parent stars

Invocation

How this skill is triggered — by the user, by Claude, or both

Slash command

/aws-bootstrap-skill:aws-bootstrap

User invocable

Model invocable

Inline context

Default effort

Context Preview

The summary Claude sees in its skill listing — used to decide when to auto-load this skill

You have access to the `aws-bootstrap` CLI tool for provisioning and managing AWS EC2 GPU instances. Use it via the Bash tool. **Always pass `-o json` before the subcommand** (e.g., `aws-bootstrap -o json status`) **when you need to parse results programmatically.** The `--output`/`-o` flag is a global option and must come before the command name — placing it after (e.g., `aws-bootstrap status ...

Supporting Files

references/commands.md

SKILL.md

210 lines · ~2.1k tokens

Stats

LanguagePython

Parent stars1

MaintenanceExcellent

Last CommitFeb 20, 2026

Actions

View Source View Plugin View on GitHub View README

Stats

Actions

aws-bootstrap -- AWS GPU Instance Management

You have access to the aws-bootstrap CLI tool for provisioning and managing AWS EC2 GPU instances. Use it via the Bash tool. Always pass -o json before the subcommand (e.g., aws-bootstrap -o json status) when you need to parse results programmatically. The --output/-o flag is a global option and must come before the command name — placing it after (e.g., aws-bootstrap status -o json) will fail.

Prerequisites

Before running any commands, verify:

The aws-bootstrap CLI is installed (pip install aws-bootstrap-g4dn or uv pip install aws-bootstrap-g4dn)
AWS credentials are configured (AWS_PROFILE env var or --profile flag)
An SSH key pair exists at ~/.ssh/id_ed25519 (or specify via --key-path)

You can check if the CLI is installed by running: aws-bootstrap --version

Quick Reference

Command	Purpose	Key Options
`aws-bootstrap launch`	Provision a GPU instance (spot by default)	`--instance-type`, `--spot/--on-demand`, `--ebs-storage`, `--dry-run`
`aws-bootstrap status`	List running instances with IPs, pricing	`--gpu` (CUDA info), `--no-instructions`
`aws-bootstrap terminate`	Terminate instances and clean up	`[ID_OR_ALIAS...]`, `--keep-ebs`, `--yes`
`aws-bootstrap cleanup`	Remove stale SSH config + orphan EBS	`--include-ebs`, `--dry-run`
`aws-bootstrap list instance-types`	Browse GPU instance types	`--prefix` (default: g4dn)
`aws-bootstrap list amis`	Browse Deep Learning AMIs	`--filter`
`aws-bootstrap quota show`	Show GPU vCPU quotas (all families)	`--family`
`aws-bootstrap quota request`	Request a quota increase	`--family`, `--type`, `--desired-value`, `--yes`
`aws-bootstrap quota history`	Show quota increase request history	`--family`, `--type`, `--status`

Global options (before the command): --output json|yaml|table|text, --profile, --region

Structured Output

Always use --output json (aliased as -o json) when you need to process results:

# Get instance status as JSON
aws-bootstrap -o json status

# Dry-run launch to see what would happen
aws-bootstrap -o json launch --dry-run

# Terminate with --yes (required in structured output modes)
aws-bootstrap -o json terminate --yes

Commands requiring confirmation (terminate, cleanup) must include --yes when using --output json/yaml/table.

Common Workflows

Launch a GPU Instance

# Default: spot g4dn.xlarge in us-west-2
aws-bootstrap launch

# Specify instance type and region
aws-bootstrap launch --instance-type g5.xlarge --region us-east-1

# On-demand pricing (no spot interruption risk)
aws-bootstrap launch --on-demand

# With persistent EBS data volume (survives termination)
aws-bootstrap launch --ebs-storage 96

# Dry run first to validate configuration
aws-bootstrap launch --dry-run

# Custom Python version in remote venv
aws-bootstrap launch --python-version 3.13

# Non-default SSH port
aws-bootstrap launch --ssh-port 2222

After launch, the CLI:

Creates the instance (spot with auto-fallback to on-demand)
Adds an SSH alias (e.g. aws-gpu1) to ~/.ssh/config
Runs remote setup (CUDA-matched PyTorch, Jupyter, GPU benchmark)
Mounts EBS volume at /data (if requested)

Check Instance Status

# Human-readable status
aws-bootstrap status

# With GPU info (CUDA toolkit, driver version, GPU name)
aws-bootstrap status --gpu

# Machine-readable
aws-bootstrap -o json status

Connect to an Instance

After launch, use the SSH alias printed in the output:

# Direct SSH (venv auto-activates)
ssh aws-gpu1

# Jupyter tunnel
ssh -NL 8888:localhost:8888 aws-gpu1
# Then open: http://localhost:8888

# VSCode Remote SSH
code --folder-uri vscode-remote://ssh-remote+aws-gpu1/home/ubuntu/workspace

# Run GPU benchmark
ssh aws-gpu1 'python ~/gpu_benchmark.py'

Terminate and Clean Up

# Terminate by alias
aws-bootstrap terminate aws-gpu1

# Terminate all instances (with confirmation)
aws-bootstrap terminate

# Terminate but keep EBS volumes for reuse
aws-bootstrap terminate --keep-ebs

# Clean up stale SSH config entries
aws-bootstrap cleanup

# Also clean up orphan EBS volumes
aws-bootstrap cleanup --include-ebs

# Preview what would be cleaned (no changes)
aws-bootstrap cleanup --include-ebs --dry-run

Persistent Data with EBS

# Create a new volume on launch
aws-bootstrap launch --ebs-storage 96

# After terminating with --keep-ebs, reattach to a new instance
aws-bootstrap terminate --keep-ebs
# Note the volume ID from output, then:
aws-bootstrap launch --ebs-volume-id vol-0abc123def456

EBS volumes are mounted at /data, survive spot interruptions, and persist independently of instances. Use /data for large datasets, model checkpoints, and training outputs — it persists across instance lifecycles while the root volume does not. For example:

# Store training data on persistent volume
ssh aws-gpu1 'mkdir -p /data/datasets /data/checkpoints /data/outputs'

# Download a dataset to persistent storage
ssh aws-gpu1 'cd /data/datasets && wget https://example.com/dataset.tar.gz'

Remote Instance Environment

After launch and remote setup, each instance comes pre-configured with:

Python Virtual Environment (`~/venv`)

Located at ~/venv, auto-activated on SSH login (via ~/.bashrc)
PyTorch is pre-installed with the correct CUDA wheel matching the host's CUDA toolkit version (e.g. cu124, cu128) — torch.cuda.is_available() works out of the box
torchvision is also pre-installed with matching CUDA support
Additional numerical/ML libraries from requirements.txt: numpy, tqdm, and other common dependencies
Use --python-version on launch to pin a specific Python version (e.g. 3.13)
To install additional packages: ssh aws-gpu1 'pip install transformers datasets'

GPU and CUDA

NVIDIA drivers and CUDA toolkit are pre-installed via the Deep Learning AMI
nvidia-smi and nvcc are available on PATH
A GPU benchmark is pre-installed at ~/gpu_benchmark.py (CNN + Transformer workloads)
A Jupyter notebook for interactive GPU verification is at ~/gpu_smoke_test.ipynb

Jupyter

JupyterLab runs as a systemd service on port 8888
Access via SSH tunnel: ssh -NL 8888:localhost:8888 aws-gpu1, then open http://localhost:8888

EBS Data Volume (`/data`)

If launched with --ebs-storage or --ebs-volume-id, a persistent gp3 EBS volume is mounted at /data. Use this for:

Large datasets — download and store training data here so it persists across spot interruptions
Model checkpoints — save checkpoints to /data/checkpoints to avoid losing training progress
Training outputs — write logs, metrics, and results to /data/outputs

The /data volume is not lost on spot interruption — when AWS reclaims the instance, the volume detaches automatically and can be reattached to a new instance with --ebs-volume-id.

Error Handling

Spot capacity errors: The CLI auto-falls back to on-demand pricing
Quota limits (MaxSpotInstanceCountExceeded, VcpuLimitExceeded): Check with aws-bootstrap quota show and request increases with aws-bootstrap quota request --family gvt --type spot --desired-value 4. Other families: --family p (P2-P6), --family dl
SSH timeouts: Instance may still be initializing -- check aws-bootstrap status
No public IP: Check VPC settings or assign an Elastic IP
EBS mount failures: Non-fatal -- instance remains usable, may need manual mount

Detailed Command Reference

See commands.md for full option documentation and JSON output schemas.

aws-bootstrap

Popularity

Invocation

Context Preview

Supporting Files

SKILL.md

aws-bootstrap

Popularity

Invocation

Context Preview

Supporting Files

SKILL.md

aws-bootstrap -- AWS GPU Instance Management

Prerequisites

Quick Reference

Structured Output

Common Workflows

Launch a GPU Instance

Check Instance Status

Connect to an Instance

Terminate and Clean Up

Persistent Data with EBS

Remote Instance Environment

Python Virtual Environment (~/venv)

GPU and CUDA

Jupyter

EBS Data Volume (/data)

Error Handling

Detailed Command Reference

Similar Skills

aws-bootstrap -- AWS GPU Instance Management

Prerequisites

Quick Reference

Structured Output

Common Workflows

Launch a GPU Instance

Check Instance Status

Connect to an Instance

Terminate and Clean Up

Persistent Data with EBS

Remote Instance Environment

Python Virtual Environment (~/venv)

GPU and CUDA

Jupyter

EBS Data Volume (/data)

Error Handling

Detailed Command Reference

Similar Skills

Python Virtual Environment (`~/venv`)

EBS Data Volume (`/data`)

Python Virtual Environment (`~/venv`)

EBS Data Volume (`/data`)