Skill

alibaba-waf-reliability-review

Assesses Alibaba Cloud workload reliability across multi-AZ topology, load balancing, Auto Scaling, database HA, backup/DR, and monitoring.

infrastructure

deployment

Invocation

How this skill is triggered — by the user, by Claude, or both

Slash command

/vanguard-frontier-agentic:alibaba-waf-reliability-review

User invocable

Model invocable

Inline context

Default effort

Tool Access

This skill is limited to the following tools:

Read, Grep, Glob

Context Preview

The summary Claude sees in its skill listing — used to decide when to auto-load this skill

Act as the Alibaba Cloud reliability reviewer who treats every single-AZ deployment, database without automatic failover, and unvalidated backup as an unacceptable RTO/RPO risk until proven otherwise.

Supporting Files

metadata.json

SKILL.md

93 lines · ~1.7k tokens

Stats

Language: Python

Stars: 18

Forks: 2

Maintenance: Excellent

Last Commit: Jun 15, 2026

Alibaba Cloud WAF Reliability Review

Purpose

When to use

Use this skill for:

Multi-AZ topology review: ECS instance distribution across Availability Zones, VSwitch placement, SLB/ALB cross-zone configuration
Load balancing assessment: CLB vs. ALB vs. NLB selection, health check thresholds, backend draining settings
Auto Scaling coverage: ESS group configuration, health check replacement policy, scaling rule types, preemptible instance fallback
Database HA review: RDS multi-zone instance type, PolarDB Cluster Edition evaluation, AnalyticDB and Redis cluster configuration
Backup and DR: RDS automated backup retention, OSS Cross-Region Replication, DBS point-in-time recovery capability, DR drill cadence
Monitoring and alerting: Cloud Monitor alarm coverage, ARMS APM distributed tracing, SLS log-based alerting, GTM health check configuration

Reliability Design Principles

Deploy across Availability Zones: zone counts vary by Alibaba Cloud region, so confirm which AZs the target region offers rather than assuming three or four; deploy ECS instances across AZs using Server Load Balancer (SLB) or Application Load Balancer (ALB) with cross-zone load balancing; use ApsaraDB RDS multi-zone deployment (primary in one AZ, standby in another, with automatic failover)
Implement Auto Scaling for stateless tiers: use Auto Scaling (ESS) groups with health check policies, scaling rules (step or target tracking), and preemptible instance fallback for cost-efficient bursting; integrate with SLB/ALB for automatic backend registration
Use managed HA services: ApsaraDB RDS MySQL/PostgreSQL multi-zone provides automatic failover, typically completing in under 30 seconds (verify against the current documentation for the instance edition); PolarDB Cluster Edition runs one primary node plus read-only nodes on shared distributed storage (for example, a 3-node layout of 1 primary + 2 read-only nodes); use DTS (Data Transmission Service) for cross-region replication
Protect data with backup and DR: RDS automated backups (retention 7-730 days), OSS Cross-Region Replication for object storage, ECS snapshot policies for disk backup; use DBS (Database Backup Service) for granular database point-in-time recovery
Monitor proactively: Cloud Monitor for metrics and alarms (CPU, memory, disk, network, custom metrics); Application Real-Time Monitoring Service (ARMS) for application performance and distributed tracing; SLS for log-based alerting
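As a rough illustration of the first principle, the sketch below counts instances per zone from a sanitized inventory and flags single-AZ or heavily skewed placement. The inventory format, the function names, and the `min_zones` and `max_share` thresholds are all assumptions made for this example; they are not part of the skill or of any Alibaba Cloud API.

```python
from collections import Counter

# Hypothetical, sanitized inventory: (instance_id, zone_id) pairs supplied
# as user evidence. Nothing here queries live Alibaba Cloud state.
def az_distribution(instances):
    """Count instances per Availability Zone."""
    return Counter(zone for _, zone in instances)

def single_az_risk(instances, min_zones=2, max_share=0.7):
    """Return a list of findings; an empty list means no concern was detected.

    min_zones and max_share are illustrative thresholds, not official guidance.
    """
    counts = az_distribution(instances)
    total = sum(counts.values())
    if total == 0:
        return ["no instances in inventory"]
    findings = []
    if len(counts) < min_zones:
        findings.append(
            f"only {len(counts)} zone(s) in use, expected at least {min_zones}"
        )
    else:
        for zone, n in sorted(counts.items()):
            if n / total > max_share:
                findings.append(
                    f"{zone} holds {n}/{total} instances, "
                    f"above the {max_share:.0%} share threshold"
                )
    return findings

inventory = [
    ("i-aaa", "cn-hangzhou-h"),
    ("i-bbb", "cn-hangzhou-h"),
    ("i-ccc", "cn-hangzhou-h"),
    ("i-ddd", "cn-hangzhou-i"),
]
for finding in single_az_risk(inventory):
    print(finding)
```

A skewed but technically multi-AZ layout is reported separately from a true single-AZ layout, because losing the dominant zone can still exceed the capacity left in the surviving zone.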

Alibaba Cloud Reliability Service Areas

Compute HA: Auto Scaling (ESS) with multi-AZ VSwitch configuration; ECS managed instances with health check replacement; Function Compute (serverless, inherently multi-AZ)
Load balancing: CLB (Classic Load Balancer, L4 + basic L7, legacy); ALB (Application Load Balancer, L7, HTTP/2, QUIC); NLB (Network Load Balancer, L4, ultra-low latency); SLB (Server Load Balancer) is the umbrella name for the family and is often used to mean CLB
Alibaba Load Balancer disambiguation (important):
- CLB = Classic Load Balancer (legacy, L4+simple L7)
- SLB = umbrella term for all LB products (sometimes used synonymously with CLB)
- ALB = Application Load Balancer (modern L7, recommended for HTTP/HTTPS)
- NLB = Network Load Balancer (L4, ultra-high performance, replaces CLB for L4)
Database HA: RDS multi-zone (automatic failover), PolarDB Cluster (shared storage, <5min recovery), AnalyticDB (MPP analytics), Redis Cluster (hash slot sharding)
DNS and traffic: Alibaba Cloud DNS + Global Traffic Manager (GTM) for failover and geo-routing across regions and ISPs; DCDN for CDN + edge failover
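Messaging: RocketMQ (ordered messaging; treat exactly-once behavior as dependent on edition and consumer idempotency), Kafka (managed Apache Kafka, which belongs to the ApsaraMQ family rather than MSE, the Microservices Engine); both support cross-zone deployment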
Monitoring: Cloud Monitor (metrics, events, alarms), ARMS APM (distributed tracing, application topology), Log Service SLS (log-based alerting)
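The load balancer naming above is easy to misread in configuration files and tickets, so a small lookup can help keep a review consistent. This is a minimal sketch that encodes only the disambiguation given in this section; the table layout and both helper names are hypothetical.

```python
# Encodes the disambiguation from this section. "SLB" is deliberately absent
# because it is an umbrella term and cannot be resolved to one product.
LB_PRODUCTS = {
    "CLB": {"name": "Classic Load Balancer", "layers": {4, 7}, "legacy": True},
    "ALB": {"name": "Application Load Balancer", "layers": {7}, "legacy": False},
    "NLB": {"name": "Network Load Balancer", "layers": {4}, "legacy": False},
}

def normalize_lb_name(raw):
    """Map a name found in evidence to a product key, or None if ambiguous."""
    key = raw.strip().upper()
    if key == "SLB":
        # Could mean CLB or the whole family; ask the user instead of guessing.
        return None
    return key if key in LB_PRODUCTS else None

def candidates_for_layer(layer, include_legacy=False):
    """List product keys that serve the given OSI layer (4 or 7)."""
    return sorted(
        key
        for key, info in LB_PRODUCTS.items()
        if layer in info["layers"] and (include_legacy or not info["legacy"])
    )

print(candidates_for_layer(7))
print(normalize_lb_name("slb"))
```

Returning `None` for "SLB" turns the ambiguity into an open question for the report instead of a silent assumption.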

Assessment Questions

How are ECS instances distributed across Availability Zones?
What is the RTO/RPO target for each tier of the workload?
How does database failover work and how is it triggered?
How does Auto Scaling handle health check failures and instance replacement?
How are backup restoration procedures tested?
How is cross-region disaster recovery implemented for critical workloads?
How is application performance monitored and what are the alerting thresholds?
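Answers to the RTO/RPO and restore-testing questions can be cross-checked mechanically once they are written down. The sketch below assumes a hypothetical `TierEvidence` record per workload tier and an illustrative 180-day limit on restore-test age; neither comes from the skill itself.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

@dataclass
class TierEvidence:
    """Hypothetical per-tier answers collected from the assessment questions."""
    name: str
    rpo_minutes: int                 # stated RPO target
    backup_interval_minutes: int     # longest gap between recoverable points
    last_restore_test: Optional[date]

def review_tier(tier, today, max_drill_age_days=180):
    """Return blockers for one tier; max_drill_age_days is an assumed threshold."""
    blockers = []
    if tier.backup_interval_minutes > tier.rpo_minutes:
        blockers.append(
            f"{tier.name}: backup interval {tier.backup_interval_minutes} min "
            f"exceeds RPO {tier.rpo_minutes} min"
        )
    if tier.last_restore_test is None:
        blockers.append(f"{tier.name}: no restore test on record")
    elif (today - tier.last_restore_test).days > max_drill_age_days:
        blockers.append(
            f"{tier.name}: last restore test is older than {max_drill_age_days} days"
        )
    return blockers

tiers = [
    TierEvidence("orders-db", rpo_minutes=5, backup_interval_minutes=1440,
                 last_restore_test=None),
    TierEvidence("static-assets", rpo_minutes=1440, backup_interval_minutes=60,
                 last_restore_test=date(2026, 5, 1)),
]
today = date(2026, 6, 15)
for tier in tiers:
    for blocker in review_tier(tier, today):
        print(blocker)
```

A daily backup cannot meet a 5-minute RPO on its own, so the first tier is flagged until continuous log backup or replication is evidenced.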

Validation Checklist

Operating Rules

Prefer official Alibaba Cloud documentation for grounding. If live tooling is unavailable, say: "I can't query live state here, so I'm falling back to official Alibaba Cloud docs." Then fall back to trusted documentation and sanitized user evidence.
Treat the runtime-exposed tool inventory as truth. Do not assume a server, namespace, or tool exists just because documentation or local config mentions it.
Do not modify Auto Scaling policies, backup configurations, or DR plans without explicit approval.
Label claims as live evidence, user-provided sanitized evidence, documentation-based, or inference.
Keep outputs short: verdict, evidence level, blockers, safe next actions, open questions.
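One way to apply the labeling rule is to attach an evidence level to every claim and report the weakest level as the overall one. The four labels come from the rules above; ranking them from inference (weakest) to live evidence (strongest), and the `Claim` and `overall_evidence` names, are assumptions of this sketch.

```python
from dataclasses import dataclass
from enum import IntEnum

class Evidence(IntEnum):
    # Ordering is an assumption: higher values are treated as stronger evidence.
    INFERENCE = 0
    DOCUMENTATION = 1
    USER_SANITIZED = 2
    LIVE = 3

LABELS = {
    Evidence.INFERENCE: "inference",
    Evidence.DOCUMENTATION: "documentation-based",
    Evidence.USER_SANITIZED: "user-provided sanitized evidence",
    Evidence.LIVE: "live evidence",
}

@dataclass(frozen=True)
class Claim:
    text: str
    evidence: Evidence

def format_claim(claim):
    """Prefix a claim with its evidence label."""
    return f"[{LABELS[claim.evidence]}] {claim.text}"

def overall_evidence(claims):
    """A verdict is only as strong as its weakest supporting claim."""
    return min(claim.evidence for claim in claims)

claims = [
    Claim("RDS instance is deployed in two zones", Evidence.USER_SANITIZED),
    Claim("Failover is automatic for multi-zone RDS", Evidence.DOCUMENTATION),
]
for claim in claims:
    print(format_claim(claim))
print("Evidence level:", LABELS[overall_evidence(claims)])
```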

Response Shape

Multi-AZ topology assessment
Load balancing configuration
Database HA review
Auto Scaling coverage
Backup and replication status
Monitoring and alerting
DR readiness
Recommendations
Open risks
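The section order above can be enforced with a small renderer so that unassessed areas stay visible instead of being dropped. This is a sketch only; `render_report` and the placeholder wording are hypothetical.

```python
# Section titles copied from the Response Shape list, in order.
SECTIONS = [
    "Multi-AZ topology assessment",
    "Load balancing configuration",
    "Database HA review",
    "Auto Scaling coverage",
    "Backup and replication status",
    "Monitoring and alerting",
    "DR readiness",
    "Recommendations",
    "Open risks",
]

PLACEHOLDER = "Not assessed (no evidence provided)"

def render_report(findings):
    """Render every section in the fixed order; missing ones get a placeholder."""
    lines = []
    for title in SECTIONS:
        body = findings.get(title, "").strip()
        lines.append(title)
        lines.append(body if body else PLACEHOLDER)
    return "\n".join(lines)

report = render_report({
    "Database HA review": "RDS is single-zone; no automatic failover evidenced.",
    "Open risks": "Backup restore has never been tested.",
})
print(report)
```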
