From skillry-cloud-and-infrastructure
Use when you need to choose or review cloud architecture decisions — managed vs self-hosted services, multi-AZ high availability, scaling strategy, VPC/subnet/network design, least-privilege service roles, and cost-aware tradeoffs.
Slash command
/skillry-cloud-and-infrastructure:334-cloud-service-architecture
Guide and review cloud architecture decisions across providers — choosing managed services over self-hosted where it lowers operational risk, designing for multi-AZ availability, sizing and autoscaling, laying out VPC/subnet/routing/egress, attaching least-privilege roles to each service, and keeping the design cost-aware. The goal is an architecture that meets the stated availability and scale targets at a justified cost, with explicit tradeoffs documented so a human owner can approve it before anything is provisioned. Ground every recommendation in the project's actual requirements and existing IaC, not generic best-practice lists.
# Find stated NFRs, SLOs, and existing infra in the repo
grep -rniE "availability|uptime|SLA|SLO|RTO|RPO|latency|throughput|budget|cost" docs/ README* 2>/dev/null | head -30
# Inventory current cloud resources declared in IaC
grep -rnE "resource[[:space:]]+\"(aws|google|azurerm)_" . --include="*.tf" | sed -E 's/.*"(aws|google|azurerm)_([a-z0-9_]+)".*/\1_\2/' | sort | uniq -c
Record: availability target, expected load and growth, data durability needs (RPO/RTO), latency budget, compliance/data-residency, and cost ceiling.
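To make the recorded availability target concrete, it can help to convert it into a downtime budget. The sketch below is illustrative only: the function name `downtime_minutes_per_month` and the 30-day month are assumptions, not part of the skill.

```shell
# Convert an availability target (percent) into allowed downtime per month.
# Assumes a 30-day month; adjust the period to match the SLO window in use.
downtime_minutes_per_month() {
  LC_ALL=C awk -v pct="$1" 'BEGIN { printf "%.1f\n", (100 - pct) / 100 * 30 * 24 * 60 }'
}

downtime_minutes_per_month 99.9    # -> 43.2
downtime_minutes_per_month 99.99   # -> 4.3
```

A target of 99.99% leaves roughly four minutes a month, which usually rules out any design that needs manual failover.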
For each stateful or operationally heavy component, compare a managed service against self-hosting on the axes: operational burden, failover/backup built in, scaling model, lock-in, and total cost. Prefer managed unless a hard constraint (cost at scale, residency, custom extensions) justifies self-hosting.
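One way to keep the managed vs self-hosted comparison explicit is a small weighted score over the axes listed above. This is a sketch under assumptions: the function name `score_options`, the weights, and the 1-to-5 scores are all hypothetical and should be replaced with values the human owner agrees on. A hard constraint such as residency still overrides any score.

```shell
# Weighted comparison of two options across the comparison axes.
# Input: one line per axis: <axis> <weight> <managed_score> <self_hosted_score>
# Scores run from 1 (poor) to 5 (good); all numbers below are illustrative.
score_options() {
  LC_ALL=C awk '
    NF == 4 { m += $2 * $3; s += $2 * $4; w += $2 }
    END {
      if (w == 0) { print "no input"; exit 1 }
      printf "managed=%.2f self_hosted=%.2f\n", m / w, s / w
    }'
}

score_options <<'EOF'
operational_burden 3 5 2
failover_backup    3 5 3
scaling_model      2 4 3
lock_in            1 2 5
total_cost         2 3 4
EOF
# -> managed=4.18 self_hosted=3.09
```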
# Are subnets spread across multiple AZs? Single-AZ data tiers are a SPOF.
grep -rn "availability_zone\|multi_az\|zone" . --include="*.tf"
# Load balancer + multiple targets across AZs?
grep -rn "lb\|load_balancer\|target_group\|autoscaling_group" . --include="*.tf"
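The greps above only surface candidate lines. A slightly stricter check, sketched below, lists `aws_db_instance` resources that never set `multi_az = true`. The function name `single_az_databases` is hypothetical, and the script assumes `terraform fmt` style layout (resource header on its own line, closing brace in column 1). It is not a full HCL parser and will be fooled by commented-out attributes.

```shell
# List aws_db_instance resources that do not set multi_az = true.
# Reads Terraform text from the given files, or from stdin if none are given.
single_az_databases() {
  awk '
    /^resource[ \t]+"aws_db_instance"/ {
      name = $3; gsub(/"/, "", name)   # third field is the quoted resource name
      in_db = 1; ha = 0; next
    }
    in_db && /multi_az[ \t]*=[ \t]*true/ { ha = 1 }
    in_db && /^[}]/ { if (!ha) print name; in_db = 0 }
  ' "$@"
}

single_az_databases <<'EOF'
resource "aws_db_instance" "primary" {
  engine   = "postgres"
  multi_az = true
}
resource "aws_db_instance" "reports" {
  engine = "postgres"
}
EOF
# -> reports
```

Against a real repository, the same function can be fed file paths, for example `single_az_databases modules/db/*.tf` (path hypothetical).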
Choose horizontal autoscaling for stateless tiers (target CPU/RPS), managed read replicas / connection pooling for data tiers, and queues to absorb spikes. Define min/max bounds and a scale-in cooldown.
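The effect of a target metric plus min/max bounds can be sketched with a simple proportional rule: scale the current instance count by observed/target, round up, then clamp. This is a simplified model for reasoning about bounds, not any provider's exact algorithm (real target-tracking policies also apply cooldowns and alarm evaluation). The function name `desired_capacity` is an assumption.

```shell
# desired_capacity <current_instances> <observed_metric> <target_metric> <min> <max>
desired_capacity() {
  LC_ALL=C awk -v cur="$1" -v obs="$2" -v tgt="$3" -v lo="$4" -v hi="$5" 'BEGIN {
    want = cur * obs / tgt
    n = int(want); if (n < want) n++   # round up to a whole instance
    if (n < lo) n = lo                 # never scale in below the minimum
    if (n > hi) n = hi                 # never scale out past the maximum
    print n
  }'
}

desired_capacity 4 90 60 2 10    # 4 instances at 90% CPU, 60% target -> 6
desired_capacity 4 10 60 2 10    # low load clamps to the minimum     -> 2
desired_capacity 8 150 60 2 10   # overload clamps to the maximum     -> 10
```

The third case shows why the maximum needs a cost justification: once the clamp is hit, extra load has to be absorbed by a queue or shed.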
# Public vs private subnet placement — data tiers belong in private subnets
grep -rn "map_public_ip_on_launch\|public_subnet\|private_subnet\|nat_gateway\|igw\|internet_gateway" . --include="*.tf"
# Egress controls and security-group scope
grep -rn "0\.0\.0\.0/0\|cidr_blocks\|egress\|ingress" . --include="*.tf"
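Because `0.0.0.0/0` is often acceptable for egress but rarely for ingress, a narrower check than the grep above can report only inline `ingress` blocks that are open to the world. The sketch below assumes inline `ingress { ... }` blocks inside security group resources; it does not cover standalone rule resources, and the function name `open_ingress` is hypothetical.

```shell
# Report the starting line of every inline ingress block that allows 0.0.0.0/0.
# Reads Terraform text from the given files, or from stdin if none are given.
open_ingress() {
  awk '
    /^[ \t]*ingress[ \t]*[{]/ { in_ing = 1; start = FNR }
    in_ing && /0\.0\.0\.0\/0/ { printf "line %d: ingress open to 0.0.0.0/0\n", start }
    in_ing && /^[ \t]*[}]/    { in_ing = 0 }
  ' "$@"
}

open_ingress <<'EOF'
resource "aws_security_group" "db" {
  ingress {
    from_port   = 5432
    to_port     = 5432
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }
  egress {
    cidr_blocks = ["0.0.0.0/0"]
  }
}
EOF
# -> line 2: ingress open to 0.0.0.0/0
```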
# Each service should assume a scoped role, not a shared admin identity
grep -rn "iam_role\|service_account\|managed_identity\|assume_role" . --include="*.tf"
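Finding the role declarations is only the first half of a least-privilege review; the attached policies also need to be scoped. As a rough sketch, the function below flags statements that grant a bare `"*"` action or resource in either JSON policy documents or HCL policy blocks. The name `wildcard_grants` is an assumption, and the pattern is a heuristic: it will miss wildcards built with variables and partial wildcards such as `s3:*`.

```shell
# Flag bare "*" actions or resources in IAM policy text (JSON or HCL).
# Reads the given files, or stdin if none are given; prints matching lines.
wildcard_grants() {
  grep -nE '"?(Action|Resource|actions|resources)"?[[:space:]]*[:=][[:space:]]*\[?[[:space:]]*"\*"' "$@"
}

wildcard_grants <<'EOF'
{
  "Effect": "Allow",
  "Action": "*",
  "Resource": "arn:aws:s3:::logs/*"
}
EOF
# prints only the Action line; the scoped Resource ARN is not flagged
```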
# Cost estimate from a plan (read-only) if Infracost is available
infracost breakdown --path . 2>/dev/null || echo "infracost not installed — estimate manually"
0.0.0.0/0 to data ports.
# Multi-AZ subnets + managed DB with failover (review target)
data "aws_availability_zones" "available" { state = "available" }
resource "aws_subnet" "private" {
  count             = 2
  vpc_id            = aws_vpc.main.id
  availability_zone = data.aws_availability_zones.available.names[count.index]
  cidr_block        = cidrsubnet(aws_vpc.main.cidr_block, 4, count.index)
  # no map_public_ip_on_launch -> private
}
resource "aws_db_instance" "primary" {
  engine                  = "postgres"
  multi_az                = true # synchronous standby in another AZ
  storage_encrypted       = true
  backup_retention_period = 7 # supports point-in-time recovery (RPO)
  # credentials come from a secret manager, never inline
}
# Stateless tier: autoscaling behind a load balancer
resource "aws_autoscaling_group" "web" {
  min_size            = 2
  max_size            = 10
  vpc_zone_identifier = aws_subnet.private[*].id # spread across AZs
  target_group_arns   = [aws_lb_target_group.web.arn]
}
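The `cidrsubnet(aws_vpc.main.cidr_block, 4, count.index)` call above carves the VPC range into equal subnets by adding 4 bits to the prefix. The shell sketch below reproduces that arithmetic for IPv4 so the resulting ranges can be checked by hand. The VPC range `10.0.0.0/16` is an assumed example (the source does not state one), the function name `cidr_subnet` is hypothetical, and there is no validation of alignment or of `netnum` overflow.

```shell
# cidr_subnet <base_cidr> <newbits> <netnum>
# Mirrors the arithmetic of Terraform's cidrsubnet() for IPv4 only.
cidr_subnet() {
  base=${1%/*}
  newprefix=$(( ${1#*/} + $2 ))
  IFS=. read -r o1 o2 o3 o4 <<EOF
$base
EOF
  # Pack the four octets into one integer, then add the subnet offset.
  ip=$(( ($o1 << 24) | ($o2 << 16) | ($o3 << 8) | $o4 ))
  ip=$(( ip + ($3 << (32 - newprefix)) ))
  echo "$(( (ip >> 24) & 255 )).$(( (ip >> 16) & 255 )).$(( (ip >> 8) & 255 )).$(( ip & 255 ))/$newprefix"
}

cidr_subnet 10.0.0.0/16 4 0   # count.index = 0 -> 10.0.0.0/20
cidr_subnet 10.0.0.0/16 4 1   # count.index = 1 -> 10.0.16.0/20
```

With 4 new bits a /16 yields sixteen /20 subnets, so two private subnets leave room for public and data-tier ranges in the same VPC.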
# Read-only cost + plan review (no provisioning)
terraform plan -out=tfplan
infracost breakdown --path . # monthly cost estimate for human review
0.0.0.0/0.
Produce a structured report with:
Do not run mutating operations (terraform apply, aws/gcloud/az create|delete, console changes) without explicit human approval; produce the design and a plan/estimate for review.
Read-only inspection (describe/list/plan/infracost breakdown) is safe to run unattended.
0.0.0.0/0 exposure: require a human decision.
npx claudepluginhub fluxonlab/skillry --plugin skillry-cloud-and-infrastructure