From harness-claude
Manages feature flag lifecycle: detect flags in codebases, classify categories, and design rollout strategies with naming conventions. For auditing stale flags or planning new flag architecture.
How this skill is triggered — by the user, by Claude, or both
Slash command
/harness-claude:harness-feature-flagsThe summary Claude sees in its skill listing — used to decide when to auto-load this skill
> Flag lifecycle management, A/B testing infrastructure, and gradual rollout design. Ship features safely with controlled exposure and clean retirement.
Flag lifecycle management, A/B testing infrastructure, and gradual rollout design. Ship features safely with controlled exposure and clean retirement.
Identify flag provider. Scan the project for feature flag infrastructure:
launchdarkly-node-server-sdk), Unleash (unleash-client), Flagsmith (flagsmith-nodejs), Split (@splitsoftware/splitio)*feature-flag*, *toggle*, *flags*flags.json, features.json, feature-flags.yamlFEATURE_*, FF_*)Catalog all flag definitions. For each flag found, record:
Map flag usage in code. For each flag, find all evaluation points:
Detect flag categories. Classify each flag:
Present detection summary:
Feature Flag Detection:
Provider: LaunchDarkly (SDK v7.x)
Flags defined: 23
Flags evaluated in code: 19 (4 defined but unused)
Categories: 12 release, 5 experiment, 4 ops, 2 permission
Stale candidates: 6 (release flags older than 90 days)
Evaluate flag naming convention. Check for consistency:
{team}.{feature}.{variant} (e.g., payments.stripe-v2.enabled)flag_123 or test_flag)ops. for operational flags)Design flag evaluation architecture. Recommend patterns for:
Design rollout strategy. For release flags:
Design A/B testing infrastructure. For experiment flags:
Design fallback behavior. Ensure resilience:
Check test coverage per flag. For each flag, verify:
Validate flag consistency. Check for common issues:
Check for flag coupling. Identify problematic patterns:
Validate rollout configuration. For active rollouts:
Generate validation report:
Flag Validation: [PASS/WARN/FAIL]
Test coverage: WARN (3 flags missing flag-off tests)
Consistency: PASS (all code references match definitions)
Coupling: WARN (2 flags have interdependencies)
Rollout config: PASS (active rollouts correctly configured)
Issues:
1. payments.stripe-v2 -- no test for flag-off fallback path
2. search.new-algorithm + search.reranking -- coupled flags
3. checkout.express -- missing kill switch test
Identify stale flags. Flag candidates for removal:
Generate cleanup plan. For each stale flag:
Assess technical debt. Calculate flag burden:
Recommend lifecycle policies. Design governance:
Generate lifecycle report:
Flag Lifecycle Report:
Active flags: 23
Stale flags: 6 (candidates for removal)
Technical debt: ~340 lines of dead code across 6 stale flags
Cleanup plan:
1. payments.old-checkout -- at 100% for 45 days, 4 files, ~80 LOC to remove
2. search.v1-algorithm -- experiment ended 60 days ago, 2 files, ~40 LOC
3. onboarding.legacy-flow -- at 100% for 90 days, 6 files, ~120 LOC
4. notifications.email-v1 -- unused for 120 days, 3 files, ~50 LOC
5. dashboard.old-charts -- at 100% for 35 days, 2 files, ~30 LOC
6. auth.password-reset-v1 -- at 100% for 60 days, 1 file, ~20 LOC
Recommendations:
- Adopt 90-day maximum lifetime policy for release flags
- Add flag expiry dates to LaunchDarkly metadata
- Schedule monthly flag cleanup sprint
- Target: reduce active flag count from 23 to 17
harness skill run harness-feature-flags -- Primary invocation for flag analysis and lifecycle management.harness validate -- Run after flag cleanup to verify project health.harness check-deps -- Verify flag provider SDK dependencies are installed.emit_interaction -- Present lifecycle report and gather decisions on cleanup priorities.| Rationalization | Why It Is Wrong |
|---|---|
| "This release flag has been at 100% for a while, but removing it is risky" | Release flags at 100% for more than 30 days are stale candidates. Every stale flag adds dead code branches and test matrix complexity. |
| "We only need to test the flag-on path since that is the path we ship" | No flags without test coverage for both paths. The flag-off path IS the fallback when the flag provider is unreachable. |
| "These two flags depend on each other, but they work fine together" | No coupled flag dependencies is a blocking finding. Flags that require other flags creates combinatorial complexity. |
| "Setting the flag default to true makes the rollout easier" | Every flag must default to safe (feature disabled). A default of true means a provider outage enables the feature for everyone. |
| "We do not need a naming convention -- our flag count is small" | Inconsistent naming becomes unmanageable as flag count grows. The skill flags inconsistency as a warning even at small scale. |
Phase 1: DETECT
Provider: LaunchDarkly (React SDK v3.x)
Flags defined: 15 (in LaunchDarkly dashboard)
Flags in code: 13 (2 unused in dashboard)
Categories: 8 release, 3 experiment, 2 ops
Phase 2: DESIGN
Naming: WARN -- inconsistent (mix of camelCase and kebab-case)
Recommended convention: team.feature.variant (kebab-case)
Evaluation: Client-side (React SDK), 200ms init timeout
Default values: WARN -- 2 flags default to true (should default to false)
Rollout: checkout.express-pay at 25% (targeting premium users)
Phase 3: VALIDATE
Test coverage: WARN -- 4 flags missing useFlags mock in component tests
Consistency: PASS
Coupling: PASS (no interdependent flags)
Kill switch: PASS (all release flags can be instantly disabled)
Phase 4: LIFECYCLE
Stale: 3 flags at 100% for 30+ days
Cleanup estimate: 180 LOC across 8 files
Recommendation: Remove search.v2-results (at 100% for 65 days, 3 components)
Result: WARN -- 3 stale flags, 4 missing tests, 2 unsafe defaults
Phase 1: DETECT
Provider: Custom implementation (FeatureFlagService.java)
Flag store: PostgreSQL table (feature_flags)
Flags defined: 28
Flags in code: 24 (4 in database but unreferenced)
Categories: 15 release, 6 experiment, 5 ops, 2 permission
Phase 2: DESIGN
Architecture: Server-side evaluation with 60s cache refresh
Concern: No SDK resilience -- database outage disables all flags
Recommend: Add in-memory cache with stale-while-revalidate pattern
Recommend: Add health check endpoint for flag service status
Rollout: Using percentage field in database -- no segment targeting
Recommend: Add user segment support for targeted rollouts
Phase 3: VALIDATE
Test coverage: WARN -- @FeatureFlag annotation used but no test helper
to toggle flags in test context (tests rely on database state)
Consistency: WARN -- 4 database entries have no code references
Coupling: FAIL -- 3 flags form a dependency chain
(billing.new-pricing requires billing.tax-calc-v2 requires billing.currency-v3)
Recommend: Consolidate into single billing.pricing-v3 flag
Phase 4: LIFECYCLE
Stale: 8 flags enabled for all users for 60+ days
Technical debt: ~520 LOC of dead code
Flag count: 28 (above recommended 20 per service)
Recommendation: Immediate cleanup sprint targeting 8 stale flags
Recommendation: Enforce 60-day lifetime policy going forward
Result: FAIL -- flag coupling detected, high stale flag count
npx claudepluginhub intense-visions/harness-engineering --plugin harness-claudeOperational discipline for feature flags as production infrastructure: flag types, naming, targeting, rollout strategy, lifecycle, governance, stale flag management, and technical debt patterns.
Guides feature flag design, gradual rollouts, A/B testing, kill switches. Recommends LaunchDarkly, Unleash, Flagsmith based on flag count and needs for safer deployments.
Guides gradual feature flag rollout from creation to cleanup with kill switches and A/B testing. Useful for safely launching features and managing risk.