Post-load data quality validation patterns for Qlik development. Provides query templates for null rate analysis, referential integrity checks, value distribution analysis, row count validation, orphaned record detection, sparse field identification, and duplicate detection. Usable by the qa-reviewer when MCP or post-load data access is available. Also provides patterns for embedding validation checks directly into load scripts. Load when performing data quality validation or writing diagnostic scripts.
Slash command: /qlik-agents:data-quality-validator
Post-load data quality validation catches issues that a successful reload can mask. A script can reload without errors and still produce incorrect data: synthetic keys, orphaned records, unexpected nulls, duplicate keys, or value anomalies. This skill covers two usage contexts:
- Embedded validation: checks written into the load script that run during reload.
- Post-load validation: queries run against the loaded data model or source databases when MCP or post-load data access is available.
The skill provides actionable query templates and patterns for both contexts.
Null rate analysis. What it catches: Fields with unexpected null rates. Key fields should have 0% nulls. Dimension fields may have expected nulls (handled by NullAsValue) or unexpected ones that indicate missing source data.
Referential integrity checks. What it catches: Foreign keys in fact tables that don't match any primary key in the corresponding dimension, that is, orphaned records that reference non-existent dimension members.
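The orphan check described above can also run inside the load script. The following is a minimal sketch, not the skill's own template. It assumes a fact table [Orders] with a foreign key [Customer.Key] and a dimension table [Customers] that holds the matching primary key. The [Customers] table name and all underscore-prefixed temporary names are hypothetical.

```qlik
// Copy the dimension keys into a temporary, uniquely named field so that
// Exists() tests membership in the dimension only, not in the fact table.
[_DimKeys]:
LOAD DISTINCT [Customer.Key] AS _dim_key
RESIDENT [Customers];

// Count fact rows whose non-null foreign key has no match in the dimension.
// Count() ignores nulls, so rows with a null key are not reported here.
[_OrphanCheck]:
LOAD Count([Customer.Key]) AS _orphan_count
RESIDENT [Orders]
WHERE NOT Exists(_dim_key, [Customer.Key]);

LET vOrphanCount = Alt(Peek('_orphan_count', 0, '_OrphanCheck'), 0);
DROP TABLES [_DimKeys], [_OrphanCheck];

IF $(vOrphanCount) > 0 THEN
    TRACE [WARNING] Orders has $(vOrphanCount) rows with an orphaned Customer.Key;
END IF
```

The temporary field matters because Exists() looks at every value loaded so far into a field, regardless of table. Testing [Customer.Key] against itself would always succeed once the fact table is loaded.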
Value distribution analysis. What it catches: Unexpected values ('test', 'TBD', encoding artifacts), values outside expected ranges, and categorical fields with unexpected cardinality (too many or too few unique values).
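A value distribution check can be sketched in script form as follows. The field [Order.Status], the list of placeholder values, and the cardinality ceiling are illustrative assumptions, not values from the skill.

```qlik
// Count rows whose status looks like a placeholder. Match() is case-sensitive,
// so the value is trimmed and lowercased before comparison.
[_ValueCheck]:
LOAD Count([Order.Status]) AS _suspect_count
RESIDENT [Orders]
WHERE Match(Lower(Trim([Order.Status])), 'test', 'tbd') > 0;

LET vSuspectCount = Alt(Peek('_suspect_count', 0, '_ValueCheck'), 0);
DROP TABLE [_ValueCheck];

// FieldValueCount() returns the number of distinct values loaded into a field.
LET vDistinctStatus = Alt(FieldValueCount('Order.Status'), 0);
LET vMaxDistinctStatus = 10; // Hypothetical ceiling taken from a source profile

IF $(vSuspectCount) > 0 THEN
    TRACE [WARNING] Orders.[Order.Status] has $(vSuspectCount) placeholder values;
END IF

IF $(vDistinctStatus) > $(vMaxDistinctStatus) THEN
    TRACE [WARNING] Orders.[Order.Status] has $(vDistinctStatus) distinct values, expected at most $(vMaxDistinctStatus);
END IF
```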
Row count validation. What it catches: Actual row counts that differ significantly from the expected counts in the source profile. A mismatch may indicate incomplete loads, accidental filtering, or incremental load logic errors.
Duplicate detection. What it catches: Records with duplicate primary keys (a key uniqueness violation) and full-row duplicates (all field values identical), both of which corrupt dimensional relationships.
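Full-row duplicates can be estimated by hashing each row and comparing the number of rows with the number of distinct hashes. This is a sketch under stated assumptions: the field list passed to Hash128() is hypothetical ([Order.Date] in particular) and must name every field of the table for the result to reflect true full-row duplicates.

```qlik
// One hash per row, built from every field that defines row identity.
[_RowHash]:
LOAD Hash128([Order.Key], [Order.Date], [Order.Amount]) AS _row_hash
RESIDENT [Orders];

// Rows loaded versus distinct hash values; the difference is the number
// of rows that repeat an earlier row exactly.
LET vTotalRows = Alt(NoOfRows('_RowHash'), 0);
LET vDistinctRows = Alt(FieldValueCount('_row_hash'), 0);
LET vFullDups = $(vTotalRows) - $(vDistinctRows);
DROP TABLE [_RowHash];

IF $(vFullDups) > 0 THEN
    TRACE [WARNING] Orders has $(vFullDups) full-row duplicate records;
END IF
```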
Sparse field identification. What it catches: Fields populated for fewer than 10% of records (configurable threshold). These are candidates for removal or NullAsValue handling.
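The sparse-field threshold can be applied during reload along these lines. The field [Order.DiscountCode] is a hypothetical example, and the threshold variable name is an assumption.

```qlik
LET vSparseThresholdPct = 10; // Configurable threshold, in percent

// Count() ignores nulls, so this is the number of populated rows.
[_SparseCheck]:
LOAD Count([Order.DiscountCode]) AS _populated
RESIDENT [Orders];

LET vPopulated = Alt(Peek('_populated', 0, '_SparseCheck'), 0);
LET vTotalRows = Alt(NoOfRows('Orders'), 0);
DROP TABLE [_SparseCheck];

// Percentage of rows with a non-null value; null when the table is empty.
LET vFillPct = Round(100 * vPopulated / vTotalRows, 0.1);

// Variables are referenced by name here so the comparison does not depend
// on how the decimal separator is expanded.
IF vFillPct < vSparseThresholdPct THEN
    TRACE [WARNING] Orders.[Order.DiscountCode] is populated for only $(vFillPct)% of rows;
END IF
```

Empty strings count as populated in this sketch. If the source uses empty strings for missing data, convert them to null before running the check.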
Type mismatch detection. What it catches: Fields where Qlik inferred a different type than expected (numbers loaded as text, or vice versa), which indicates data quality issues or mapping errors.
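A type check for a field that is expected to be numeric might look like the sketch below. It uses [Order.Amount] as the example field, and the temporary table and variable names are hypothetical.

```qlik
// IsText() is true for values that have no numeric interpretation.
// If() without an else branch returns null, which Count() ignores.
[_TypeCheck]:
LOAD Count(If(IsText([Order.Amount]), 1)) AS _text_count
RESIDENT [Orders];

LET vTextCount = Alt(Peek('_text_count', 0, '_TypeCheck'), 0);
DROP TABLE [_TypeCheck];

IF $(vTextCount) > 0 THEN
    TRACE [WARNING] Orders.[Order.Amount] has $(vTextCount) values loaded as text;
END IF
```

For a field that should be text, the same pattern applies with IsNum() in place of IsText().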
Validation checks embedded in load scripts run during reload, catching issues before the app is stored.
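```qlik
// Row count check: warn when the table is more than 10% below the expected size.
LET vExpectedRows = 50000; // From source profile
// NoOfRows() returns null for a missing table; Alt() falls back to 0.
LET vActualRows = Alt(NoOfRows('TableName'), 0);
IF $(vActualRows) < $(vExpectedRows) * 0.9 THEN
    TRACE [WARNING] TableName row count $(vActualRows) is more than 10% below expected $(vExpectedRows);
END IF
```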
```qlik
// Duplicate key check: count key values that occur more than once in Orders.
[_DupCheck]:
LOAD [Order.Key], Count([Order.Key]) AS _dup_count
RESIDENT [Orders]
GROUP BY [Order.Key];

[_DupSummary]:
LOAD Count([Order.Key]) AS _total_dups
RESIDENT [_DupCheck]
WHERE _dup_count > 1;

// Alt() falls back to 0 so the IF below never expands to an empty value.
LET vDupCount = Alt(Peek('_total_dups', 0, '_DupSummary'), 0);
DROP TABLES [_DupCheck], [_DupSummary];

IF $(vDupCount) > 0 THEN
    TRACE [WARNING] Orders has $(vDupCount) duplicate key values;
END IF
```
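```qlik
// Null key check: Count(*) counts all rows, Count([Order.Key]) only non-null keys.
[_NullCheck]:
LOAD Count(*) - Count([Order.Key]) AS _null_count
RESIDENT [Orders];

// Alt() falls back to 0 so the IF below never expands to an empty value.
LET vNullCount = Alt(Peek('_null_count', 0, '_NullCheck'), 0);
DROP TABLE [_NullCheck];

IF $(vNullCount) > 0 THEN
    TRACE [CRITICAL] Orders.[Order.Key] has $(vNullCount) null values;
END IF
```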
Note: Count(*) appears above only because the Resident LOAD needs a total row count; the load aggregates the whole table as a single group. In other LOAD contexts, use Count(field_name), and never use Count(*) when aggregating a specific field.
If error-handling.qvs is loaded, use the LogMessage subroutine instead of raw TRACE:
```qlik
IF $(vDupCount) > 0 THEN
    CALL LogMessage('WARNING', 'Data Quality', 'Orders has ' & $(vDupCount) & ' duplicate keys');
END IF
```
When MCP database connectivity or post-load data access is available, deeper analysis is possible. See validation-queries.md for complete query templates.
Queries run against the loaded Qlik data model (via engine API or diagnostic scripts) or against source databases (via MCP) to compare source vs. loaded data. Output is a Data Quality Validation Report consumed by the qa-reviewer.
This output format is the contract between qa-reviewer and data validators:
# Data Quality Validation Report
**Date:** [date]
**App:** [app name]
**Validation Type:** [Embedded Script | Post-Load | MCP Source Comparison]
## Summary
- Tables Validated: [N]
- Critical Issues: [N]
- Warnings: [N]
- Clean: [N]
## Findings
### [Table Name]
| Check | Result | Details |
|-------|--------|---------|
| Row Count | [PASS/WARN/FAIL] | Expected: N, Actual: N |
| Key Uniqueness | [PASS/WARN/FAIL] | [N] duplicate keys found |
| Null Rate ([Order.Key]) | [PASS/WARN/FAIL] | [N]% null |
| Null Rate ([Order.Amount]) | [PASS/WARN/FAIL] | [N]% null |
| Value Distribution | [PASS/WARN/FAIL] | [N] unexpected values ('test', 'TBD') found |
| Referential Integrity | [PASS/WARN/FAIL] | [N] orphaned [Customer.Key] references |
| Sparse Fields | [PASS/WARN/FAIL] | [field_list] populated <10% |
### Recommendations
- [List any data quality issues requiring script fixes or downstream handling]
- [List any expected anomalies to document as known limitations]
The qlik-load-script skill includes diagnostic-patterns.md, which documents TRACE-based logging and basic row count checks during reload. The data-quality-validator extends beyond those basic patterns with deeper analysis queries.
Distinction: cross-reference diagnostic-patterns.md when embedding simple row count checks; use data-quality-validator when performing detailed QA analysis.
All queries in validation-queries.md follow these conventions:
[Customer.Key], [Order.Amount]
See validation-queries.md for complete query templates organized by validation category.
npx claudepluginhub pupfish-llc/claude-plugins --plugin qlik-agents