From harness-claude
Controls request throughput with token bucket, sliding window, and fixed window algorithms to protect APIs from abuse, enforce usage quotas, and prevent service overload using Redis.
Slash command
/harness-claude:resilience-rate-limiting
> Control request throughput with token bucket, sliding window, and fixed window algorithms to protect services from overload
Return a Retry-After header when the limit is exceeded. Include X-RateLimit-Limit, X-RateLimit-Remaining, and X-RateLimit-Reset on responses.
// middleware/rate-limiter.ts: sliding window with Redis
import { Redis } from 'ioredis';
interface RateLimitConfig {
windowMs: number; // Window size in milliseconds
maxRequests: number; // Max requests per window
keyPrefix: string;
}
interface RateLimitResult {
allowed: boolean;
limit: number;
remaining: number;
resetAt: number; // Unix timestamp (seconds)
retryAfter?: number; // Seconds until next allowed request
}
export class SlidingWindowRateLimiter {
constructor(
private redis: Redis,
private config: RateLimitConfig
) {}
async check(key: string): Promise<RateLimitResult> {
const now = Date.now();
const windowStart = now - this.config.windowMs;
const redisKey = `${this.config.keyPrefix}:${key}`;
// Generate the member once so a denied request can be removed by the same value
const member = `${now}:${Math.random()}`;
const pipeline = this.redis.pipeline();
pipeline.zremrangebyscore(redisKey, 0, windowStart); // Remove expired entries
pipeline.zadd(redisKey, now.toString(), member); // Add current request
pipeline.zcard(redisKey); // Count requests in window
pipeline.pexpire(redisKey, this.config.windowMs); // Set TTL
const results = await pipeline.exec();
if (!results) throw new Error('Redis pipeline returned no results');
const count = results[2][1] as number;
const allowed = count <= this.config.maxRequests;
const resetAt = Math.ceil((now + this.config.windowMs) / 1000);
if (!allowed) {
// Remove the request we just added since it's denied
await this.redis.zrem(redisKey, member);
}
return {
allowed,
limit: this.config.maxRequests,
remaining: Math.max(0, this.config.maxRequests - count),
resetAt,
retryAfter: allowed ? undefined : Math.ceil(this.config.windowMs / 1000),
};
}
}
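The limiter above reports the full window length as retryAfter, which is a safe upper bound. In a sliding window log, a slot actually frees up when the oldest entry leaves the window, so a tighter value can be derived from that entry's timestamp. A minimal sketch, assuming the oldest score has already been read from the sorted set (for example with ZRANGE key 0 0 WITHSCORES); `retryAfterSeconds` is a hypothetical helper, not part of the original class:

```typescript
// Hypothetical helper (not in the original limiter).
// oldestMs: millisecond timestamp of the oldest entry still in the window.
function retryAfterSeconds(oldestMs: number, windowMs: number, nowMs: number): number {
  // The oldest entry leaves the window at oldestMs + windowMs
  const msUntilFree = oldestMs + windowMs - nowMs;
  // Round up to whole seconds and never report a negative wait
  return Math.max(0, Math.ceil(msUntilFree / 1000));
}

// Oldest request at t = 1 s, 60 s window, checked at t = 31 s
const waitSeconds = retryAfterSeconds(1_000, 60_000, 31_000);
console.log(waitSeconds); // 30
```

Rounding up keeps the header conservative: a client that waits the advertised number of seconds should not be rejected again for the same entry.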
// Express middleware
import { Request, Response, NextFunction } from 'express';
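// Assumption: a shared ioredis client; new Redis() connects to 127.0.0.1:6379 by default
const redis = new Redis();
const apiLimiter = new SlidingWindowRateLimiter(redis, {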
windowMs: 60_000, // 1 minute
maxRequests: 100,
keyPrefix: 'rl:api',
});
export async function rateLimitMiddleware(req: Request, res: Response, next: NextFunction) {
// Prefer the API key; fall back to the client IP (req.ip can be undefined)
const key = (req.headers['x-api-key'] as string) || req.ip || 'unknown';
let result: RateLimitResult;
try {
result = await apiLimiter.check(key);
} catch (err) {
// Express 4 does not catch rejected promises, so forward Redis errors explicitly
return next(err);
}
res.setHeader('X-RateLimit-Limit', result.limit);
res.setHeader('X-RateLimit-Remaining', result.remaining);
res.setHeader('X-RateLimit-Reset', result.resetAt);
if (!result.allowed) {
res.setHeader('Retry-After', result.retryAfter!);
return res.status(429).json({ error: 'Too many requests' });
}
next();
}
Algorithm comparison:
| Algorithm | Burst handling | Memory | Accuracy |
|---|---|---|---|
| Fixed window | Allows 2x burst at boundary | Low | Low |
| Sliding window log | No burst | High | High |
| Sliding window counter | Small burst | Medium | Medium |
| Token bucket | Configurable burst | Low | High |
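The "2x burst at boundary" entry for the fixed window row can be reproduced with a small in-memory counter. This is a sketch for illustration only: `FixedWindowCounter` and its `allow` method are hypothetical names, and the timestamp is passed in explicitly so the result is deterministic.

```typescript
// Illustrative in-memory fixed window counter (hypothetical names).
class FixedWindowCounter {
  private windowMs: number;
  private maxRequests: number;
  private windowId = -1;
  private count = 0;

  constructor(windowMs: number, maxRequests: number) {
    this.windowMs = windowMs;
    this.maxRequests = maxRequests;
  }

  // nowMs is supplied by the caller so the example does not depend on real time
  allow(nowMs: number): boolean {
    const id = Math.floor(nowMs / this.windowMs);
    if (id !== this.windowId) {
      // A new window started: reset the counter
      this.windowId = id;
      this.count = 0;
    }
    if (this.count >= this.maxRequests) return false;
    this.count += 1;
    return true;
  }
}

const fixedLimiter = new FixedWindowCounter(60_000, 100);
let accepted = 0;
// 100 requests half a second before the window boundary...
for (let i = 0; i < 100; i++) if (fixedLimiter.allow(59_500)) accepted++;
// ...and 100 more half a second after it
for (let i = 0; i < 100; i++) if (fixedLimiter.allow(60_500)) accepted++;
console.log(accepted); // 200
```

All 200 requests arrive within one second yet each window only ever sees 100, which is why the sliding window and token bucket rows score better on burst handling.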
Token bucket (alternative implementation):
class TokenBucket {
private tokens: number;
private lastRefill: number;
constructor(
private capacity: number, // Maximum tokens the bucket can hold (burst size)
private refillRate: number // Tokens added per second
) {
this.tokens = capacity;
this.lastRefill = Date.now();
}
consume(count = 1): boolean {
this.refill();
if (this.tokens >= count) {
this.tokens -= count;
return true;
}
return false;
}
private refill() {
const now = Date.now();
const elapsed = (now - this.lastRefill) / 1000;
this.tokens = Math.min(this.capacity, this.tokens + elapsed * this.refillRate);
this.lastRefill = now;
}
}
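A single bucket limits one caller. To apply the same idea per client, one option is to keep a bucket per key. The sketch below assumes an in-memory Map and an injected clock so the burst and refill behaviour can be checked without waiting; `KeyedTokenBuckets` is a hypothetical name, and eviction of idle keys is deliberately left out.

```typescript
// Illustrative per-key token buckets (hypothetical class, in-memory only).
type BucketState = { tokens: number; lastRefill: number };

class KeyedTokenBuckets {
  private buckets = new Map<string, BucketState>();
  private capacity: number;
  private refillRate: number; // tokens per second
  private now: () => number;

  constructor(capacity: number, refillRate: number, now: () => number = Date.now) {
    this.capacity = capacity;
    this.refillRate = refillRate;
    this.now = now;
  }

  consume(key: string, count = 1): boolean {
    const t = this.now();
    // New keys start with a full bucket
    const b = this.buckets.get(key) ?? { tokens: this.capacity, lastRefill: t };
    // Refill in proportion to elapsed time, capped at capacity
    const elapsedSec = (t - b.lastRefill) / 1000;
    b.tokens = Math.min(this.capacity, b.tokens + elapsedSec * this.refillRate);
    b.lastRefill = t;
    this.buckets.set(key, b);
    if (b.tokens < count) return false;
    b.tokens -= count;
    return true;
  }
}

let fakeNow = 0;
const buckets = new KeyedTokenBuckets(2, 1, () => fakeNow);
// Capacity 2: the first two calls pass, the third is rejected
const burst = [buckets.consume('a'), buckets.consume('a'), buckets.consume('a')];
fakeNow += 1000; // one second later, one token has been refilled
const afterRefill = buckets.consume('a');
console.log(burst, afterRefill); // burst is [true, true, false]; afterRefill is true
```

Capacity sets how large a burst is tolerated, while the refill rate sets the sustained throughput, which is what the "Configurable burst" entry in the comparison refers to.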
Libraries: rate-limiter-flexible (Redis/in-memory, multiple algorithms), express-rate-limit (simple Express middleware), bottleneck (client-side rate limiting for API calls).
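Distributed considerations: In-memory rate limiters only work for single-instance deployments, because each instance keeps its own counters. For multi-instance deployments, use Redis-backed rate limiting. Running the check as a Lua script in Redis makes the trim, count, and add steps atomic, which a plain pipeline does not guarantee.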
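The Lua script approach mentioned above could look roughly like the following. This is a sketch under assumptions, not the original skill's script: the script body, the `checkAtomic` function, and the `EvalClient` interface are all hypothetical. The interface only requires an `eval(script, numKeys, ...args)` method, which ioredis clients provide.

```typescript
// Sliding window log as one Lua script: trim, count, and conditional add
// run as a single atomic step inside Redis.
const SLIDING_WINDOW_LUA = `
local now = tonumber(ARGV[1])
local window = tonumber(ARGV[2])
local limit = tonumber(ARGV[3])
redis.call('ZREMRANGEBYSCORE', KEYS[1], 0, now - window)
local count = redis.call('ZCARD', KEYS[1])
if count < limit then
  redis.call('ZADD', KEYS[1], now, ARGV[4])
  redis.call('PEXPIRE', KEYS[1], window)
  return {1, limit - count - 1}
end
return {0, 0}
`;

// Hypothetical minimal client shape; an ioredis instance is assumed to satisfy it.
interface EvalClient {
  eval(script: string, numKeys: number, ...args: (string | number)[]): Promise<unknown>;
}

async function checkAtomic(
  client: EvalClient,
  key: string,
  windowMs: number,
  maxRequests: number,
  nowMs: number = Date.now()
): Promise<{ allowed: boolean; remaining: number }> {
  const member = `${nowMs}:${Math.random()}`; // unique member per request
  // The script returns {allowedFlag, remaining}
  const reply = (await client.eval(
    SLIDING_WINDOW_LUA, 1, key, nowMs, windowMs, maxRequests, member
  )) as [number, number];
  return { allowed: reply[0] === 1, remaining: reply[1] };
}
```

Because the script only adds the entry when the count is under the limit, there is no denied entry to remove afterwards. A call would look like `await checkAtomic(redis, 'rl:api:some-key', 60_000, 100)`.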
https://cloud.google.com/architecture/rate-limiting-strategies-techniques
npx claudepluginhub intense-visions/harness-engineering --plugin harness-claude