Unique Ratio
Check: unique-ratio-check
Purpose: Validates that the ratio of unique non-null values in a column meets or exceeds a defined threshold. The check fails if the proportion of unique values falls below min_ratio.
Note
Null values are not counted as unique values, but the denominator is the total number of rows, including nulls. A column with many nulls therefore yields a lower ratio.
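The calculation described above can be sketched in plain Python. This is only an illustration, not the check's actual implementation: the helper names `unique_ratio` and `passes_unique_ratio` are hypothetical, "unique" is assumed to mean the number of distinct non-null values, and returning 0.0 for an empty column is also an assumption.

```python
from typing import Any, Optional, Sequence


def unique_ratio(values: Sequence[Optional[Any]]) -> float:
    """Distinct non-null values divided by the total row count (nulls included)."""
    total_rows = len(values)
    if total_rows == 0:
        # Assumption: treat an empty column as ratio 0.0; the real check may differ.
        return 0.0
    # Nulls (None) are left out of the set of distinct values...
    distinct_non_null = {v for v in values if v is not None}
    # ...but still count toward the denominator.
    return len(distinct_non_null) / total_rows


def passes_unique_ratio(values: Sequence[Optional[Any]], min_ratio: float) -> bool:
    """The check passes when the ratio meets or exceeds min_ratio."""
    return unique_ratio(values) >= min_ratio


column = ["a", "b", "b", None, "c"]
ratio = unique_ratio(column)  # 3 distinct non-null values over 5 rows
print(ratio, passes_unique_ratio(column, min_ratio=0.5))
```

In this sketch the null row lowers the ratio even though it is never counted as a unique value, which matches the behavior described in the note.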
Typical Use Cases
- Ensure that columns expected to be mostly unique (e.g., IDs, hashes, transaction codes) behave as intended.
- Detect violations of expected high cardinality, where a small set of values is unexpectedly repeated at scale.
- Support feature quality validation in ML preprocessing pipelines.