Null Check#
Check: null-check
Purpose: Verifies that the specified columns do not contain null values. This check is typically used to ensure the completeness of mandatory fields and to prevent incomplete or invalid records from entering downstream processes.
Python Configuration#
from sparkdq.checks import NullCheckConfig
from sparkdq.core import Severity
NullCheckConfig(
check_id="require_email",
columns=["email"],
severity=Severity.ERROR
)
Declarative Configuration#
- check: null-check
check-id: my-null-check
columns:
- email
severity: error
Typical Use Cases#
✅ Ensure that mandatory fields (e.g., IDs, keys, or business-critical attributes) are fully populated.
✅ Validate that primary key columns do not contain null values to guarantee referential integrity.
✅ Prevent incomplete records from entering downstream processes, reports, or analytics.
✅ Detect data gaps early to avoid issues in data transformations or aggregations.