From harness-claude
Prevent cascading failures in microservices with circuit breaker patterns: closed, open, half-open states, fail-fast logic, and fallback responses. Covers Node.js opossum library and manual implementation.
Slash command: /harness-claude:microservices-circuit-breaker
> Prevent cascading failures with circuit breaker, half-open state, and fallback logic.
Circuit breaker states:
CLOSED → normal operation, requests pass through
↓ (error rate exceeds threshold)
OPEN → requests fail immediately (fail fast)
↓ (after reset timeout)
HALF-OPEN → let one request through as a probe
↓ (probe succeeds) → CLOSED
↓ (probe fails) → OPEN
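The transitions above can be sketched as a pure function. This is a minimal illustration, not any library's API; the names `BreakerState`, `BreakerEvent`, and `nextState` are hypothetical.

```typescript
// Hypothetical names for illustration only.
type BreakerState = "CLOSED" | "OPEN" | "HALF_OPEN";
type BreakerEvent =
  | "THRESHOLD_EXCEEDED" // error rate crossed the configured threshold
  | "RESET_TIMEOUT_ELAPSED" // the open circuit has waited long enough
  | "PROBE_SUCCEEDED"
  | "PROBE_FAILED";

// Returns the next state, or the current state when the event
// does not apply to it (e.g. a probe result while CLOSED).
function nextState(state: BreakerState, event: BreakerEvent): BreakerState {
  switch (state) {
    case "CLOSED":
      return event === "THRESHOLD_EXCEEDED" ? "OPEN" : "CLOSED";
    case "OPEN":
      return event === "RESET_TIMEOUT_ELAPSED" ? "HALF_OPEN" : "OPEN";
    case "HALF_OPEN":
      if (event === "PROBE_SUCCEEDED") return "CLOSED";
      if (event === "PROBE_FAILED") return "OPEN";
      return "HALF_OPEN";
  }
}
```

Keeping the transition table separate from timers and I/O makes it easy to unit test each arrow of the diagram on its own.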
opossum (Node.js circuit breaker library):
import CircuitBreaker from 'opossum';

// Assumed shape for this example; substitute your own User type
interface User {
  id: string;
  name: string;
  cached?: boolean;
}

// Wrap the fragile operation
async function fetchUserFromService(userId: string): Promise<User> {
  const response = await fetch(`${process.env.USER_SERVICE_URL}/users/${userId}`, {
    signal: AbortSignal.timeout(3_000), // abort the HTTP request after 3s
  });
  if (!response.ok) throw new Error(`HTTP ${response.status}`);
  return response.json();
}

const breaker = new CircuitBreaker(fetchUserFromService, {
  timeout: 3_000, // treat calls slower than 3s as failures
  errorThresholdPercentage: 50, // open after 50% errors in rolling window
  resetTimeout: 30_000, // try half-open after 30s
  rollingCountTimeout: 10_000, // rolling window for error rate (10s)
  rollingCountBuckets: 10, // 10 × 1s buckets
  volumeThreshold: 5, // min requests before circuit can open
});
// Fallback: returned when the circuit is open, or when the call fails or times out
breaker.fallback((userId: string) => ({
  id: userId,
  name: 'Unknown User',
  cached: true,
}));

// Events for monitoring (`metrics` is assumed to be your own metrics client)
breaker.on('open', () => {
  console.error('Circuit OPENED: user service is unavailable');
  metrics.increment('circuit_breaker.user_service.open');
});
breaker.on('halfOpen', () => {
  console.log('Circuit HALF-OPEN: probing user service');
});
breaker.on('close', () => {
  console.log('Circuit CLOSED: user service recovered');
  metrics.increment('circuit_breaker.user_service.closed');
});
breaker.on('fallback', (result) => {
  console.warn('Fallback triggered:', result);
  metrics.increment('circuit_breaker.user_service.fallback');
});

// Usage: call fire() with the same arguments as the raw function
const user = await breaker.fire(userId);
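To make the interaction between `errorThresholdPercentage`, `volumeThreshold`, and the rolling buckets concrete, here is a rough sketch of the trip decision. It illustrates the idea only and is not opossum's internal code; `Bucket`, `TripOptions`, and `shouldOpen` are hypothetical names, and the strict `>` comparison is an assumption based on the "exceeds threshold" wording above.

```typescript
// Hypothetical types; one Bucket per slice of the rolling window.
interface Bucket {
  successes: number;
  failures: number;
}

interface TripOptions {
  errorThresholdPercentage: number; // e.g. 50
  volumeThreshold: number; // minimum requests in the window
}

// Decide whether a closed circuit should open, given the current window.
function shouldOpen(buckets: Bucket[], opts: TripOptions): boolean {
  const failures = buckets.reduce((sum, b) => sum + b.failures, 0);
  const total = buckets.reduce((sum, b) => sum + b.successes + b.failures, 0);
  // Too little traffic to judge: never open on a handful of calls.
  if (total === 0 || total < opts.volumeThreshold) return false;
  const errorPercentage = (failures / total) * 100;
  return errorPercentage > opts.errorThresholdPercentage;
}
```

With the settings above, 3 failures out of 5 calls (60%) would trip the circuit, while 2 failures out of 4 calls would not, because the window has not yet reached the volume threshold.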
Manual circuit breaker (without library):
// Note: this class shares its name with the opossum import; keep the two in separate modules
enum CircuitState {
  CLOSED,
  OPEN,
  HALF_OPEN,
}

class CircuitBreaker<T, Args extends unknown[]> {
  private state = CircuitState.CLOSED;
  private failureCount = 0; // consecutive failures since the last success
  private lastFailureTime = 0;

  constructor(
    private readonly operation: (...args: Args) => Promise<T>,
    private readonly options: {
      failureThreshold: number;
      resetTimeoutMs: number;
      fallback?: (...args: Args) => T;
    }
  ) {}

  async execute(...args: Args): Promise<T> {
    if (this.state === CircuitState.OPEN) {
      if (Date.now() - this.lastFailureTime >= this.options.resetTimeoutMs) {
        // Reset timeout elapsed: let this call through as a probe
        this.state = CircuitState.HALF_OPEN;
      } else {
        // Fail fast without calling the operation
        if (this.options.fallback) return this.options.fallback(...args);
        throw new Error('Circuit is OPEN');
      }
    }
    try {
      const result = await this.operation(...args);
      this.onSuccess();
      return result;
    } catch (err) {
      this.onFailure();
      if (this.options.fallback) return this.options.fallback(...args);
      throw err;
    }
  }

  // Any success clears the failure count and closes the circuit
  private onSuccess(): void {
    this.failureCount = 0;
    this.state = CircuitState.CLOSED;
  }

  // A failed probe reopens the circuit because the count is already at the threshold
  private onFailure(): void {
    this.failureCount++;
    this.lastFailureTime = Date.now();
    if (this.failureCount >= this.options.failureThreshold) {
      this.state = CircuitState.OPEN;
      console.error(`Circuit opened after ${this.failureCount} failures`);
    }
  }

  getState(): CircuitState {
    return this.state;
  }
}
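One caveat with the manual class: once the state is HALF_OPEN, every call that arrives before the first probe settles also reaches the operation, so more than one request can get through. A small gate like the one below is one possible way to admit a single probe. `ProbeGate` is a hypothetical helper, not part of the class above or of any library.

```typescript
// Hypothetical helper: admits one probe at a time while half-open.
class ProbeGate {
  private probeInFlight = false;

  // Returns true for the first caller only; later callers get false
  // until the probe outcome is reported via settle().
  tryAcquire(): boolean {
    if (this.probeInFlight) return false;
    this.probeInFlight = true;
    return true;
  }

  // Call once the probe request has succeeded or failed.
  settle(): void {
    this.probeInFlight = false;
  }
}
```

In `execute`, a half-open call would first check `tryAcquire()`, fail fast (or use the fallback) when it returns false, and call `settle()` in a `finally` block after the probe completes.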
Combining with retry:
// Retry: handle transient failures (network glitch)
// Circuit breaker: handle systemic failures (service is down)
// Order matters: retries run inside the breaker, so the breaker records
// one failure only after every attempt has failed
const withRetry = async <T>(fn: () => Promise<T>, attempts = 3): Promise<T> => {
  for (let i = 1; i <= attempts; i++) {
    try {
      return await fn();
    } catch (err) {
      if (i === attempts) throw err;
      await new Promise((r) => setTimeout(r, 200 * i)); // waits 200ms, then 400ms
    }
  }
  throw new Error('Unreachable');
};

// Circuit breaker wraps the retry-enabled call
// (this is the manual CircuitBreaker class above, not the opossum import)
const robustFetchUser = new CircuitBreaker(
  (userId: string) => withRetry(() => fetchUserFromService(userId), 2),
  { failureThreshold: 5, resetTimeoutMs: 30_000 }
);
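Because retries sit inside the breaker, each breaker-level failure can represent several upstream calls. The arithmetic can be sketched as follows; `retryDelays` and `upstreamCallsBeforeOpen` are hypothetical helpers that mirror the `200 * i` delay and the consecutive-failure threshold used above.

```typescript
// Delays slept between attempts: baseMs * i after failed attempt i,
// with no wait after the final attempt.
function retryDelays(attempts: number, baseMs = 200): number[] {
  const delays: number[] = [];
  for (let i = 1; i < attempts; i++) delays.push(baseMs * i);
  return delays;
}

// Worst-case upstream calls before a consecutive-failure breaker opens,
// assuming every attempt fails and each breaker failure is one
// exhausted retry sequence.
function upstreamCallsBeforeOpen(failureThreshold: number, attempts: number): number {
  return failureThreshold * attempts;
}
```

With `failureThreshold: 5` and 2 attempts per call, a fully failing service would receive up to 10 requests before the circuit opens, which is worth keeping in mind when choosing both numbers.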
Fallback strategies:
Threshold tuning:
errorThresholdPercentage: 50% is a good default. Lower for critical dependencies, higher for noisy but non-critical ones.
resetTimeout: 30s is a good starting point. If recovery takes longer (e.g., DB restart), increase it.
volumeThreshold: Prevents opening on first few calls during startup.
Anti-patterns:
Bulkhead + circuit breaker: Use both. Bulkhead limits concurrency (prevents a slow service from consuming all threads). Circuit breaker detects failure and stops sending requests. They complement each other.
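A bulkhead can be as simple as a counter that caps in-flight calls. The sketch below is a minimal, hypothetical `Bulkhead` class (not from any library) that rejects immediately when the limit is reached instead of queueing.

```typescript
// Hypothetical Bulkhead: caps concurrent calls to one dependency.
class Bulkhead {
  private inFlight = 0;
  private readonly maxConcurrent: number;

  constructor(maxConcurrent: number) {
    this.maxConcurrent = maxConcurrent;
  }

  // Runs the operation if a slot is free, otherwise rejects right away.
  async run<T>(operation: () => Promise<T>): Promise<T> {
    if (this.inFlight >= this.maxConcurrent) {
      throw new Error("Bulkhead full");
    }
    this.inFlight++;
    try {
      return await operation();
    } finally {
      this.inFlight--; // free the slot on success or failure
    }
  }

  // Number of calls currently holding a slot.
  get active(): number {
    return this.inFlight;
  }
}
```

One way to combine the two is `bulkhead.run(() => breaker.fire(userId))`. With the bulkhead on the outside, "Bulkhead full" rejections never reach the breaker, so local saturation is not counted as a failure of the remote service.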
microservices.io/patterns/reliability/circuit-breaker.html
npx claudepluginhub intense-visions/harness-engineering --plugin harness-claude