From grimoire
Sets reserved concurrency, SQS dead-letter queues, recursion detection, and budget alerts to prevent financial denial-of-service from runaway serverless functions.
Slash command: /grimoire:design-serverless-cost-protection
Set reserved concurrency limits, configure SQS dead-letter queues with backoff, enable Lambda Recursion Detection, and configure AWS Budgets alerts — preventing runaway functions and financial denial-of-service where an attacker or bug can generate millions of invocations and thousands of dollars in minutes.
Adopted by: OWASP Serverless Top 10 SLS-8 (Denial of Service and Financial Resource Exhaustion). AWS Lambda reserved concurrency is documented as the primary defense against both DoS and runaway cost. The AWS re:Invent 2022 "Serverless Security" session lists SLS-8 as the most financially damaging class of serverless incident. Datadog's "State of Serverless" (2023) found that 12% of organizations reported unexpected Lambda cost spikes of more than 300% month-over-month, most of them caused by event loop bugs or abuse of public-facing endpoints.
Impact: Unlike EC2, where cost is fixed per instance, Lambda scales automatically with traffic, so the bill scales with it. In 2019 a startup founder reported that a single misconfigured SQS trigger stuck in an infinite retry loop generated $72,000 in Lambda and DynamoDB charges in one weekend. AWS documented a case where a recursive Lambda loop ran at 100,000+ invocations per second for 3 hours before the account limit was hit. Public-facing API Gateway endpoints without rate limits are directly exploitable: a botnet sending 10,000 requests/second makes 36 million invocations per hour, which at $0.0000002 per invocation is about $7.20/hour in request charges alone, before duration, API Gateway, and downstream service charges are added.
Why best: Reserved concurrency provides a hard cap. A function cannot scale beyond its reservation, which prevents both cost exhaustion and cascade failures where one function consumes all account concurrency. The alternative (an account-level concurrency limit only) allows a single function to starve all others in the account. Financial alerts provide a safety net when limits are misconfigured.
Sources: OWASP Serverless Top 10 SLS-8; AWS Lambda documentation on reserved concurrency; AWS re:Invent SVS401 "Serverless Security" (2022); Datadog State of Serverless (2023)
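The request-charge component of a flood can be checked with a few lines of Python. This is a minimal sketch that counts only the per-invocation price quoted above; the function name is hypothetical, and ignoring duration, API Gateway, and downstream charges is a simplifying assumption.

```python
def request_charge_per_hour(requests_per_second: float, price_per_request: float) -> float:
    """Return the hourly per-request charge for a sustained request rate."""
    requests_per_hour = requests_per_second * 3600
    return requests_per_hour * price_per_request


# 10,000 requests/second at $0.0000002 per invocation (request charge only)
hourly = request_charge_per_hour(10_000, 0.0000002)
print(f"${hourly:.2f} per hour")
```

The total cost of an incident is higher than this figure because every invocation also incurs duration charges and usually triggers billable work in other services.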
Set reserved concurrency on all production functions:
# SAM template: reserved concurrency per function
ProcessOrderFunction:
  Type: AWS::Serverless::Function
  Properties:
    ReservedConcurrentExecutions: 100 # hard cap: max 100 concurrent executions
    # Never leave this unset (unreserved) for functions with external triggers
PublicApiFunction:
  Type: AWS::Serverless::Function
  Properties:
    ReservedConcurrentExecutions: 50 # public endpoint: lower cap
# Set via CLI
aws lambda put-function-concurrency \
--function-name my-function \
--reserved-concurrent-executions 100
Calculate appropriate limit: max_rps × avg_duration_seconds = required_concurrency
Add 20% headroom: required_concurrency × 1.2 = reserved_limit
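The two sizing formulas above can be combined into one helper. This is a sketch under assumptions: the function name and the example workload are hypothetical, and rounding up to a whole number of executions is a choice made here, not something the formulas specify.

```python
import math


def reserved_concurrency(max_rps: float, avg_duration_seconds: float, headroom: float = 0.2) -> int:
    """Apply max_rps x avg_duration_seconds, add headroom, and round up."""
    required = max_rps * avg_duration_seconds
    # round() guards against float noise (e.g. 60.00000000000001) before ceil
    return math.ceil(round(required * (1 + headroom), 9))


# Hypothetical workload: 200 requests/second, 250 ms average duration
limit = reserved_concurrency(max_rps=200, avg_duration_seconds=0.25)
print(limit)
```

The result is the value to pass to --reserved-concurrent-executions or to set in the template.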
Prevent SQS-triggered Lambda infinite loops with DLQ and backoff:
# SQS queue with DLQ: prevents infinite retry loops
OrderQueue:
  Type: AWS::SQS::Queue
  Properties:
    VisibilityTimeout: 90 # must be ≥ Lambda timeout × 6
    RedrivePolicy:
      deadLetterTargetArn: !GetAtt OrderDLQ.Arn
      maxReceiveCount: 3 # after 3 failures, move to DLQ and stop retrying
OrderDLQ:
  Type: AWS::SQS::Queue
  Properties:
    MessageRetentionPeriod: 1209600 # 14 days, for investigation
# Lambda event source mapping: limit batch size
OrderProcessorMapping:
  Type: AWS::Lambda::EventSourceMapping
  Properties:
    FunctionName: !GetAtt ProcessOrderFunction.Arn
    EventSourceArn: !GetAtt OrderQueue.Arn
    BatchSize: 10 # process 10 at a time, not 10,000
    MaximumBatchingWindowInSeconds: 5
    FunctionResponseTypes:
      - ReportBatchItemFailures # partial batch failure: don't re-process successes
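ReportBatchItemFailures only takes effect if the handler returns the IDs of the messages that failed. The sketch below shows one way a handler might do that, returning a batchItemFailures list keyed by itemIdentifier; process_order and the sample event are hypothetical stand-ins for real business logic and a real SQS batch.

```python
import json


def process_order(order: dict) -> None:
    """Hypothetical business logic: reject orders without an order_id."""
    if "order_id" not in order:
        raise ValueError("missing order_id")


def lambda_handler(event, context):
    """Process an SQS batch and report only the failed message IDs."""
    failures = []
    for record in event.get("Records", []):
        try:
            process_order(json.loads(record["body"]))
        except Exception:
            # Only this message returns to the queue; it counts toward maxReceiveCount
            failures.append({"itemIdentifier": record["messageId"]})
    return {"batchItemFailures": failures}


# Local check with a hand-built event (no AWS calls involved)
sample_event = {
    "Records": [
        {"messageId": "m-1", "body": json.dumps({"order_id": 42})},
        {"messageId": "m-2", "body": json.dumps({"sku": "abc"})},
        {"messageId": "m-3", "body": "not json"},
    ]
}
result = lambda_handler(sample_event, None)
print(result)
```

Returning an empty batchItemFailures list marks the whole batch as successful, so an exception swallowed without recording the message ID would silently drop that message.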
Enable Lambda Recursion Detection (prevents Lambda-to-Lambda infinite loops):
# Enable recursive loop detection on all Lambda functions (AWS default since 2023)
aws lambda put-function-recursion-config \
--function-name my-function \
--recursive-loop Terminate
# Verify
aws lambda get-function-recursion-config --function-name my-function
# Expected: {"RecursiveLoop": "Terminate"}
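# Also implement an application-level recursion guard
import json
import os

import boto3

MAX_RECURSION_DEPTH = 10
lambda_client = boto3.client("lambda")

def invoke_next_function(event):
    # NEXT_FUNCTION_NAME is an assumed environment variable naming the downstream function
    lambda_client.invoke(
        FunctionName=os.environ["NEXT_FUNCTION_NAME"],
        InvocationType="Event",  # asynchronous invocation
        Payload=json.dumps(event).encode("utf-8"),
    )

def lambda_handler(event, context):
    # Read the depth counter carried in the event payload (absent on the first call)
    depth = int(event.get("_recursion_depth", 0))
    if depth >= MAX_RECURSION_DEPTH:
        raise RuntimeError(f"Maximum recursion depth {MAX_RECURSION_DEPTH} exceeded")
    # Pass the incremented depth to downstream invocations
    invoke_next_function(event={**event, "_recursion_depth": depth + 1})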
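The guard above can be exercised locally, without AWS, by factoring the depth check into a pure function. This is an illustration under assumptions: next_event and count_hops are hypothetical helper names, and the loop stands in for a chain of real invocations.

```python
MAX_RECURSION_DEPTH = 10


def next_event(event: dict) -> dict:
    """Return the event to forward downstream, or raise once the depth cap is reached."""
    depth = int(event.get("_recursion_depth", 0))
    if depth >= MAX_RECURSION_DEPTH:
        raise RuntimeError(f"Maximum recursion depth {MAX_RECURSION_DEPTH} exceeded")
    return {**event, "_recursion_depth": depth + 1}


def count_hops(event: dict) -> int:
    """Simulate a function that keeps re-invoking itself; count hops before the guard trips."""
    hops = 0
    try:
        while True:
            event = next_event(event)
            hops += 1
    except RuntimeError:
        return hops


print(count_hops({"order_id": 42}))
```

The counter only protects chains that forward the event payload; a loop that passes through a service which drops custom attributes still relies on the platform-level detection.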
Rate limit public API Gateway endpoints:
# API Gateway usage plan: rate + burst limits per API key
ApiUsagePlan:
  Type: AWS::ApiGateway::UsagePlan
  Properties:
    UsagePlanName: standard
    Throttle:
      RateLimit: 1000 # requests per second
      BurstLimit: 2000 # burst capacity
    Quota:
      Limit: 100000 # daily request quota
      Period: DAY
# Stage-level throttling (applies to all endpoints)
# REST API stages use MethodSettings; DefaultRouteSettings belongs to AWS::ApiGatewayV2::Stage
ApiStage:
  Type: AWS::ApiGateway::Stage
  Properties:
    MethodSettings:
      - ResourcePath: "/*" # all resources
        HttpMethod: "*" # all methods
        ThrottlingRateLimit: 1000
        ThrottlingBurstLimit: 500
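How the rate and burst limits interact is easiest to see in a small model. API Gateway documents its throttling as a token bucket; the sketch below is a simplified illustration of that idea, not the service's implementation, and the TokenBucket class is a hypothetical name.

```python
class TokenBucket:
    """Illustrative token bucket: refills at `rate` tokens/second up to `burst` capacity."""

    def __init__(self, rate: float, burst: int):
        self.rate = rate
        self.capacity = burst
        self.tokens = float(burst)  # start with a full bucket
        self.last = 0.0

    def allow(self, now: float) -> bool:
        """Return True if a request arriving at time `now` (seconds) is admitted."""
        elapsed = now - self.last
        self.last = now
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False  # the caller would see HTTP 429 Too Many Requests


bucket = TokenBucket(rate=1000, burst=2000)
# 5,000 requests arriving at the same instant: only the burst capacity is admitted
admitted = sum(bucket.allow(now=0.0) for _ in range(5000))
# One second later the bucket has refilled by `rate` tokens
admitted_later = sum(bucket.allow(now=1.0) for _ in range(5000))
print(admitted, admitted_later)  # 2000 1000
```

The burst limit bounds how many invocations a sudden spike can trigger at once, while the rate limit bounds the sustained invocation cost per second.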
Configure AWS Budgets alerts for Lambda cost anomalies:
# Python (boto3): programmatic budget creation
import boto3

budgets = boto3.client("budgets")
budgets.create_budget(
    AccountId="123456789012",
    Budget={
        "BudgetName": "lambda-daily-limit",
        "BudgetType": "COST",
        "TimeUnit": "DAILY",
        "BudgetLimit": {"Amount": "100", "Unit": "USD"},
        "CostFilters": {"Service": ["AWS Lambda"]},
    },
    NotificationsWithSubscribers=[
        {
            "Notification": {
                "NotificationType": "ACTUAL",
                "ComparisonOperator": "GREATER_THAN",
                "Threshold": 80,  # alert at 80% of daily limit
            },
            "Subscribers": [
                {"SubscriptionType": "SNS", "Address": "arn:aws:sns:...alert-topic"},
            ],
        },
        {
            "Notification": {
                # Daily budgets do not support FORECASTED alerts;
                # use a MONTHLY budget for forecast-based notifications
                "NotificationType": "ACTUAL",
                "ComparisonOperator": "GREATER_THAN",
                "Threshold": 100,  # alert when actual spend exceeds the limit
            },
            "Subscribers": [
                {"SubscriptionType": "SNS", "Address": "arn:aws:sns:...alert-topic"},
            ],
        },
    ],
)
Keep maxReceiveCount ≤ 5 and attach a DLQ: without a DLQ, a poisoned message is retried at full concurrency until its retention period expires.
Leaving ReservedConcurrentExecutions unset on all functions is the SAM default and leaves them uncapped within the account limit. This is acceptable during development, but a limit must be set before production deployment.