Count Exact#
Check: row-count-exact-check
Purpose: Verifies that the dataset contains exactly the specified number of rows. Useful for enforcing strict expectations on data volume.
Python Configuration#
from sparkdq.checks import RowCountExactCheckConfig
from sparkdq.core import Severity
RowCountExactCheckConfig(
check_id="validate_snapshot_size",
expected_count=500,
severity=Severity.ERROR
)
Declarative Configuration#
- check: row-count-exact-check
check-id: validate_snapshot_size
expected-count: 500
severity: error
Typical Use Cases#
✅ Validate fixed-size imports (e.g., daily exports with exactly 1,000 records).
✅ Ensure integrity in snapshot-based pipelines with exact record counts.
✅ Identify silent load failures or unintended duplications early in the process.