sparkdq.checks
- class ColumnLessThanCheckConfig(*, check_id: str, severity: Severity = Severity.CRITICAL, smaller_column: str, greater_column: str, inclusive: bool = False)[source]
Bases:
BaseRowCheckConfig
Declarative configuration model for the ColumnLessThanCheck.
This config defines a row-level comparison between two columns, ensuring that values in smaller_column are strictly less than (or less than or equal to, if inclusive=True) the values in greater_column.
Null values in either column are treated as invalid and will fail the check.
Example
ColumnLessThanCheckConfig(
    check_id="start-before-end",
    smaller_column="start_time",
    greater_column="end_time",
    inclusive=True,
)
- smaller_column
Column expected to contain smaller (or equal) values.
- Type:
str
- greater_column
Column expected to contain greater values.
- Type:
str
- inclusive
If True, validates smaller_column <= greater_column. If False, requires strict inequality (<). Defaults to False.
- Type:
bool
- check_class
alias of
ColumnLessThanCheck
- model_config: ClassVar[ConfigDict] = {'populate_by_name': True, 'validate_by_alias': True, 'validate_by_name': True}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
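The comparison rule described for ColumnLessThanCheckConfig can be illustrated without Spark. The following is a minimal pure-Python sketch of the documented semantics, not sparkdq's implementation; the helper `row_passes` and the dict-based rows are hypothetical names introduced for illustration.

```python
from typing import Any, Mapping


def row_passes(row: Mapping[str, Any], smaller_column: str,
               greater_column: str, inclusive: bool = False) -> bool:
    """Return True if smaller < greater (or <= when inclusive is True)."""
    smaller = row.get(smaller_column)
    greater = row.get(greater_column)
    # Nulls in either column are treated as invalid, as the config describes.
    if smaller is None or greater is None:
        return False
    return smaller <= greater if inclusive else smaller < greater


rows = [
    {"start_time": 1, "end_time": 2},     # strictly less
    {"start_time": 3, "end_time": 3},     # equal values
    {"start_time": 5, "end_time": None},  # null in greater_column
]
strict = [row_passes(r, "start_time", "end_time") for r in rows]
relaxed = [row_passes(r, "start_time", "end_time", inclusive=True) for r in rows]
```

Only the row with equal values changes its outcome when inclusive is switched on; the row with a null fails in both modes.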
- class ColumnPresenceCheckConfig(*, check_id: str, severity: Severity = Severity.CRITICAL, required_columns: list[str])[source]
Bases:
BaseAggregateCheckConfig
Declarative configuration model for the ColumnPresenceCheck.
This config defines a set of required column names that must exist in the DataFrame.
- required_columns
The list of required column names.
- Type:
list[str]
- check_class
alias of
ColumnPresenceCheck
- model_config: ClassVar[ConfigDict] = {'populate_by_name': True, 'validate_by_alias': True, 'validate_by_name': True}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
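As a rough illustration of what "required columns must exist" means, the sketch below compares a list of column names against required_columns. It is plain Python under stated assumptions: `missing_columns` is a hypothetical helper, and the column list stands in for a DataFrame's column names.

```python
from typing import Iterable, List


def missing_columns(df_columns: Iterable[str], required_columns: List[str]) -> List[str]:
    """Return the required columns that are absent, in the order they were requested."""
    present = set(df_columns)
    return [column for column in required_columns if column not in present]


# Hypothetical DataFrame with columns "id" and "name".
missing = missing_columns(["id", "name"], ["id", "email"])
check_passes = not missing
```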
- class ColumnsAreCompleteCheckConfig(*, check_id: str, severity: Severity = Severity.CRITICAL, columns: List[str])[source]
Bases:
BaseAggregateCheckConfig
Declarative configuration model for the ColumnsAreCompleteCheck.
This configuration defines a completeness requirement for multiple columns. The check fails if any of the specified columns contain null values.
- columns
List of required columns that must be fully populated.
- Type:
List[str]
- check_class
alias of
ColumnsAreCompleteCheck
- model_config: ClassVar[ConfigDict] = {'populate_by_name': True, 'validate_by_alias': True, 'validate_by_name': True}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- class CompletenessRatioCheckConfig(*, check_id: str, severity: Severity = Severity.CRITICAL, column: str, min_ratio: float)[source]
Bases:
BaseAggregateCheckConfig
Declarative configuration for CompletenessRatioCheck.
- column
Column name to assess.
- Type:
str
- min_ratio
Minimum allowed non-null ratio (between 0.0 and 1.0).
- Type:
float
- check_class
alias of
CompletenessRatioCheck
- model_config: ClassVar[ConfigDict] = {'populate_by_name': True, 'validate_by_alias': True, 'validate_by_name': True}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- validate_threshold() CompletenessRatioCheckConfig [source]
Ensures the min_ratio is between 0 and 1 (inclusive).
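The threshold rule and the ratio it is compared against can be sketched in plain Python. This is an illustration only: `validate_min_ratio` and `completeness_ratio` are hypothetical helpers, a `ValueError` stands in for the library's own exception, and treating an empty column as fully complete is an assumption of this sketch.

```python
from typing import Any, Optional, Sequence


def validate_min_ratio(min_ratio: float) -> float:
    """Reject thresholds outside the closed interval [0, 1]."""
    if not 0.0 <= min_ratio <= 1.0:
        raise ValueError("min_ratio must be between 0 and 1 (inclusive)")
    return min_ratio


def completeness_ratio(values: Sequence[Optional[Any]]) -> float:
    """Share of non-null entries in a column."""
    if not values:
        return 1.0  # assumption: an empty column counts as complete
    non_null = sum(1 for value in values if value is not None)
    return non_null / len(values)


ratio = completeness_ratio(["a", None, "b", "c"])  # 3 of 4 values are non-null
passes = ratio >= validate_min_ratio(0.7)
```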
- class DateBetweenCheckConfig(*, check_id: str, severity: Severity = Severity.CRITICAL, columns: List[str], min_value: str, max_value: str, inclusive: tuple[bool, bool] = (False, False))[source]
Bases:
BaseRowCheckConfig
Declarative configuration model for DateBetweenCheck.
- columns
Date columns to validate.
- Type:
List[str]
- min_value
Minimum allowed date in ‘YYYY-MM-DD’ format.
- Type:
str
- max_value
Maximum allowed date in ‘YYYY-MM-DD’ format.
- Type:
str
- inclusive
Inclusion flags for min and max boundaries.
- Type:
tuple[bool, bool]
- check_class
alias of
DateBetweenCheck
- model_config: ClassVar[ConfigDict] = {'populate_by_name': True, 'validate_by_alias': True, 'validate_by_name': True}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- validate_between_values() DateBetweenCheckConfig [source]
Validates that min_value and max_value are properly configured and that min_value is not greater than max_value.
- Returns:
The validated configuration object.
- Return type:
DateBetweenCheckConfig
- Raises:
InvalidCheckConfigurationError – If min_value or max_value are not set or if min_value > max_value.
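The interaction between the two inclusive flags and the bounds can be sketched with the standard library. This is a minimal illustration, not the library's code: `date_in_range` is a hypothetical helper, a `ValueError` stands in for InvalidCheckConfigurationError, and null handling is deliberately left out.

```python
from datetime import date
from typing import Tuple


def date_in_range(value: date, min_value: str, max_value: str,
                  inclusive: Tuple[bool, bool] = (False, False)) -> bool:
    """Check one date against bounds given as 'YYYY-MM-DD' strings."""
    lower = date.fromisoformat(min_value)
    upper = date.fromisoformat(max_value)
    if lower > upper:
        raise ValueError("min_value must not be greater than max_value")
    # inclusive[0] governs the lower bound, inclusive[1] the upper bound.
    lower_ok = value >= lower if inclusive[0] else value > lower
    upper_ok = value <= upper if inclusive[1] else value < upper
    return lower_ok and upper_ok


on_lower_bound = date(2024, 1, 1)
exclusive_result = date_in_range(on_lower_bound, "2024-01-01", "2024-12-31")
inclusive_result = date_in_range(on_lower_bound, "2024-01-01", "2024-12-31",
                                 inclusive=(True, False))
```

With the default (False, False), a value sitting exactly on a bound fails, which is easy to overlook when migrating rules written with closed intervals in mind.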
- class DateMaxCheckConfig(*, check_id: str, severity: Severity = Severity.CRITICAL, columns: List[str], max_value: str, inclusive: bool = False)[source]
Bases:
BaseRowCheckConfig
Declarative configuration model for DateMaxCheck.
- columns
Date columns to validate.
- Type:
List[str]
- max_value
The maximum allowed date in ISO format.
- Type:
str
- inclusive
Whether to include the maximum date.
- Type:
bool
- check_class
alias of
DateMaxCheck
- model_config: ClassVar[ConfigDict] = {'populate_by_name': True, 'validate_by_alias': True, 'validate_by_name': True}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- class DateMinCheckConfig(*, check_id: str, severity: Severity = Severity.CRITICAL, columns: List[str], min_value: str, inclusive: bool = False)[source]
Bases:
BaseRowCheckConfig
Declarative configuration model for the DateMinCheck.
- columns
The list of date columns to validate.
- Type:
List[str]
- min_value
The minimum allowed date in ‘YYYY-MM-DD’ format. Whether the bound itself is accepted is controlled by inclusive.
- Type:
str
- inclusive
Whether to include the minimum date.
- Type:
bool
- check_class
alias of
DateMinCheck
- model_config: ClassVar[ConfigDict] = {'populate_by_name': True, 'validate_by_alias': True, 'validate_by_name': True}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- class DistinctRatioCheckConfig(*, check_id: str, severity: Severity = Severity.CRITICAL, column: str, min_ratio: float)[source]
Bases:
BaseAggregateCheckConfig
Declarative configuration model for DistinctRatioCheck.
- column
The column to evaluate for distinctness.
- Type:
str
- min_ratio
Minimum required ratio of distinct values (between 0 and 1).
- Type:
float
- check_class
alias of
DistinctRatioCheck
- model_config: ClassVar[ConfigDict] = {'populate_by_name': True, 'validate_by_alias': True, 'validate_by_name': True}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- class ExactlyOneNotNullCheckConfig(*, check_id: str, severity: Severity = Severity.CRITICAL, columns: List[str])[source]
Bases:
BaseRowCheckConfig
Declarative configuration model for the ExactlyOneNotNullCheck.
- columns
The names of the columns where exactly one must be non-null per row.
- Type:
List[str]
- check_class
alias of
ExactlyOneNotNullCheck
- model_config: ClassVar[ConfigDict] = {'populate_by_name': True, 'validate_by_alias': True, 'validate_by_name': True}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
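The "exactly one non-null per row" rule can be sketched as a count over the configured columns. The helper `exactly_one_not_null` and the sample rows are hypothetical and only illustrate the rule in plain Python.

```python
from typing import Any, List, Mapping


def exactly_one_not_null(row: Mapping[str, Any], columns: List[str]) -> bool:
    """Return True if exactly one of the given columns is non-null in this row."""
    non_null = sum(1 for column in columns if row.get(column) is not None)
    return non_null == 1


columns = ["email", "phone"]
results = [
    exactly_one_not_null({"email": "a@example.com", "phone": None}, columns),
    exactly_one_not_null({"email": None, "phone": None}, columns),
    exactly_one_not_null({"email": "a@example.com", "phone": "123"}, columns),
]
```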
- class FreshnessCheckConfig(*, check_id: str, severity: Severity = Severity.CRITICAL, column: str, interval: int, period: Literal['year', 'month', 'week', 'day', 'hour', 'minute', 'second'])[source]
Bases:
BaseAggregateCheckConfig
Declarative configuration model for the FreshnessCheck.
Ensures that the newest value in the specified timestamp column is recent enough relative to the current time.
- column
Name of the timestamp column.
- Type:
str
- interval
Time window size (must be positive).
- Type:
int
- period
Unit of time; one of ‘year’, ‘month’, ‘week’, ‘day’, ‘hour’, ‘minute’, or ‘second’.
- Type:
str
- check_class
alias of
FreshnessCheck
- model_config: ClassVar[ConfigDict] = {'populate_by_name': True, 'validate_by_alias': True, 'validate_by_name': True}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
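The freshness rule (newest timestamp must fall within the last interval periods) can be sketched with `datetime`. This is an illustration under several assumptions: `is_fresh` is a hypothetical helper, only the fixed-length periods are covered ('year' and 'month' need calendar arithmetic and are omitted), the current time is passed in explicitly, and a timestamp exactly on the window edge is treated as fresh.

```python
from datetime import datetime, timedelta
from typing import List

# Maps the documented period names to timedelta keyword arguments.
# 'year' and 'month' are intentionally omitted in this sketch.
PERIOD_TO_KWARG = {
    "week": "weeks",
    "day": "days",
    "hour": "hours",
    "minute": "minutes",
    "second": "seconds",
}


def is_fresh(timestamps: List[datetime], interval: int, period: str,
             now: datetime) -> bool:
    """Return True if the newest timestamp lies within the last `interval` periods."""
    if interval <= 0:
        raise ValueError("interval must be positive")
    window = timedelta(**{PERIOD_TO_KWARG[period]: interval})
    return max(timestamps) >= now - window


now = datetime(2024, 1, 10, 12, 0)
loaded_at = [datetime(2024, 1, 8), datetime(2024, 1, 9, 18, 0)]
fresh_within_day = is_fresh(loaded_at, 1, "day", now)
fresh_within_12h = is_fresh(loaded_at, 12, "hour", now)
```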
- class IsContainedInCheckConfig(*, check_id: str, severity: Severity = Severity.CRITICAL, allowed_values: dict[str, list[object]])[source]
Bases:
BaseRowCheckConfig
Declarative configuration model for the IsContainedInCheck.
This config allows validation that specified columns contain only predefined values.
- allowed_values
Mapping of column names to allowed values.
- Type:
dict[str, list[object]]
- check_class
alias of
IsContainedInCheck
- model_config: ClassVar[ConfigDict] = {'populate_by_name': True, 'validate_by_alias': True, 'validate_by_name': True}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- validate_allowed_values() IsContainedInCheckConfig [source]
Validate that allowed_values is not empty and properly formed.
- Returns:
The validated configuration object.
- Return type:
IsContainedInCheckConfig
- Raises:
InvalidCheckConfigurationError – If allowed_values is invalid.
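The shape of allowed_values and the per-row membership test can be sketched as follows. This is not the library's implementation: both helpers are hypothetical, a `ValueError` stands in for InvalidCheckConfigurationError, "properly formed" is assumed to mean a non-empty list per column, and a null fails here unless `None` is listed explicitly.

```python
from typing import Any, Dict, List, Mapping


def validate_allowed_values(allowed_values: Dict[str, List[Any]]) -> Dict[str, List[Any]]:
    """Reject an empty mapping or a column without a non-empty list of values."""
    if not allowed_values:
        raise ValueError("allowed_values must not be empty")
    for column, values in allowed_values.items():
        if not isinstance(values, list) or not values:
            raise ValueError(f"allowed values for {column!r} must be a non-empty list")
    return allowed_values


def row_is_contained(row: Mapping[str, Any], allowed_values: Dict[str, List[Any]]) -> bool:
    """Return True if every configured column holds one of its allowed values."""
    return all(row.get(column) in values for column, values in allowed_values.items())


allowed = validate_allowed_values({"status": ["ACTIVE", "INACTIVE"], "country": ["DE", "FR"]})
valid_row = row_is_contained({"status": "ACTIVE", "country": "DE"}, allowed)
invalid_row = row_is_contained({"status": "DELETED", "country": "DE"}, allowed)
```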
- class IsNotContainedInCheckConfig(*, check_id: str, severity: Severity = Severity.CRITICAL, forbidden_values: dict[str, list[object]])[source]
Bases:
BaseRowCheckConfig
Declarative configuration model for the IsNotContainedInCheck.
This config allows validation that specified columns do NOT contain forbidden values.
- forbidden_values
Mapping of column names to forbidden values.
- Type:
dict[str, list[object]]
- check_class
alias of
IsNotContainedInCheck
- model_config: ClassVar[ConfigDict] = {'populate_by_name': True, 'validate_by_alias': True, 'validate_by_name': True}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- validate_forbidden_values() IsNotContainedInCheckConfig [source]
Validate that forbidden_values is not empty and properly formed.
- Returns:
The validated configuration object.
- Return type:
IsNotContainedInCheckConfig
- Raises:
InvalidCheckConfigurationError – If forbidden_values is missing or invalid.
- class NotNullCheckConfig(*, check_id: str, severity: Severity = Severity.CRITICAL, columns: List[str])[source]
Bases:
BaseRowCheckConfig
Declarative configuration model for the NotNullCheck.
- columns
The names of the columns that should remain null.
- Type:
List[str]
- check_class
alias of
NotNullCheck
- model_config: ClassVar[ConfigDict] = {'populate_by_name': True, 'validate_by_alias': True, 'validate_by_name': True}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- class NullCheckConfig(*, check_id: str, severity: Severity = Severity.CRITICAL, columns: List[str])[source]
Bases:
BaseRowCheckConfig
Declarative configuration model for the NullCheck.
- columns
The names of the columns to check for null values. This is a required field and must match existing columns in the DataFrame.
- Type:
List[str]
- check_class
alias of
NullCheck
- model_config: ClassVar[ConfigDict] = {'populate_by_name': True, 'validate_by_alias': True, 'validate_by_name': True}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- class NumericBetweenCheckConfig(*, check_id: str, severity: Severity = Severity.CRITICAL, columns: List[str], inclusive: Tuple[bool, bool] = (False, False), min_value: float | int | Decimal, max_value: float | int | Decimal)[source]
Bases:
BaseRowCheckConfig
Declarative configuration model for the NumericBetweenCheck.
This configuration defines both a lower and upper bound constraint on one or more numeric columns. It ensures that all specified columns contain only values between the configured min_value and max_value. Violations are flagged per row.
- columns
The list of numeric columns to validate.
- Type:
List[str]
- min_value
The minimum allowed value. Whether the bound itself is accepted is controlled by the first flag of inclusive.
- Type:
float | int | Decimal
- max_value
The maximum allowed value. Whether the bound itself is accepted is controlled by the second flag of inclusive.
- Type:
float | int | Decimal
- inclusive
Inclusion flags for min and max boundaries.
- Type:
tuple[bool, bool]
- check_class
alias of
NumericBetweenCheck
- model_config: ClassVar[ConfigDict] = {'populate_by_name': True, 'validate_by_alias': True, 'validate_by_name': True}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- validate_between_values() NumericBetweenCheckConfig [source]
Validates that min_value and max_value are properly configured and that min_value is not greater than max_value.
- Returns:
The validated configuration object.
- Return type:
NumericBetweenCheckConfig
- Raises:
InvalidCheckConfigurationError – If min_value or max_value are not set or if min_value > max_value.
- class NumericMaxCheckConfig(*, check_id: str, severity: Severity = Severity.CRITICAL, columns: List[str], max_value: float | int | Decimal, inclusive: bool = False)[source]
Bases:
BaseRowCheckConfig
Declarative configuration model for the NumericMaxCheck.
- columns
The list of numeric columns to validate.
- Type:
List[str]
- max_value
The maximum allowed value. Whether the bound itself is accepted is controlled by inclusive.
- Type:
float | int | Decimal
- inclusive
Whether to include the maximum value.
- Type:
bool
- check_class
alias of
NumericMaxCheck
- model_config: ClassVar[ConfigDict] = {'populate_by_name': True, 'validate_by_alias': True, 'validate_by_name': True}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- class NumericMinCheckConfig(*, check_id: str, severity: Severity = Severity.CRITICAL, columns: List[str], min_value: float | int | Decimal, inclusive: bool = False)[source]
Bases:
BaseRowCheckConfig
Declarative configuration model for the NumericMinCheck.
- columns
The list of numeric columns to validate.
- Type:
List[str]
- min_value
The minimum allowed value. Whether the bound itself is accepted is controlled by inclusive.
- Type:
float | int | Decimal
- inclusive
Whether to include the minimum value.
- Type:
bool
- check_class
alias of
NumericMinCheck
- model_config: ClassVar[ConfigDict] = {'populate_by_name': True, 'validate_by_alias': True, 'validate_by_name': True}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- class RegexMatchCheckConfig(*, check_id: str, severity: Severity = Severity.CRITICAL, column: str, pattern: str, ignore_case: bool = False, treat_null_as_failure: bool = False)[source]
Bases:
BaseRowCheckConfig
Configuration for RegexMatchCheck.
Validates that a string column matches a given regex pattern.
- column
Column to validate.
- Type:
str
- pattern
Regex pattern to use for matching.
- Type:
str
- ignore_case
If True, regex is case-insensitive (default: False).
- Type:
bool
- treat_null_as_failure
If True, null values are marked as failed (default: False).
- Type:
bool
- check_class
alias of
RegexMatchCheck
- model_config: ClassVar[ConfigDict] = {'populate_by_name': True, 'validate_by_alias': True, 'validate_by_name': True}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
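How ignore_case and treat_null_as_failure combine can be sketched with Python's `re` module. This is an approximation only: `regex_passes` is a hypothetical helper, Python's regex dialect differs from the one Spark uses, and matching anywhere in the string (rather than requiring a full match) is an assumption, which is why the example pattern is anchored explicitly.

```python
import re
from typing import Optional


def regex_passes(value: Optional[str], pattern: str, ignore_case: bool = False,
                 treat_null_as_failure: bool = False) -> bool:
    """Return True if the value is accepted under the given regex settings."""
    if value is None:
        # Nulls pass by default and fail only when explicitly requested.
        return not treat_null_as_failure
    flags = re.IGNORECASE if ignore_case else 0
    return re.search(pattern, value, flags) is not None


pattern = r"^[a-z]+@example\.com$"
lower_ok = regex_passes("alice@example.com", pattern)
upper_strict = regex_passes("ALICE@EXAMPLE.COM", pattern)
upper_relaxed = regex_passes("ALICE@EXAMPLE.COM", pattern, ignore_case=True)
null_default = regex_passes(None, pattern)
null_strict = regex_passes(None, pattern, treat_null_as_failure=True)
```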
- class RowCountBetweenCheckConfig(*, check_id: str, severity: Severity = Severity.CRITICAL, min_count: int, max_count: int)[source]
Bases:
BaseAggregateCheckConfig
Declarative configuration model for the RowCountBetweenCheck.
This config is used to define acceptable row count bounds in a data validation pipeline. It ensures that both min_count and max_count are provided and that min_count <= max_count.
It is typically used when defining checks via JSON, YAML, or dict-based configs.
- min_count
Minimum number of rows expected in the dataset.
- Type:
int
- max_count
Maximum number of rows allowed in the dataset.
- Type:
int
- check_class
alias of
RowCountBetweenCheck
- model_config: ClassVar[ConfigDict] = {'populate_by_name': True, 'validate_by_alias': True, 'validate_by_name': True}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- validate_range() RowCountBetweenCheckConfig [source]
Validate the logical consistency of the configured bounds.
This method ensures that min_count is not greater than max_count. If violated, a configuration-level exception is raised immediately to prevent runtime failures.
- Returns:
The validated configuration object.
- Return type:
RowCountBetweenCheckConfig
- Raises:
InvalidCheckConfigurationError – If min_count > max_count or min_count < 0.
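The bound validation and the resulting row count test can be sketched in plain Python, starting from a dict-based definition. This is an illustration under assumptions: both helpers are hypothetical, a `ValueError` stands in for InvalidCheckConfigurationError, the dict carries only fields that appear in the signature, and both bounds are treated as inclusive.

```python
from typing import Any, Dict


def validate_range(min_count: int, max_count: int) -> None:
    """Reject negative minimums and inverted bounds at configuration time."""
    if min_count < 0:
        raise ValueError("min_count must not be negative")
    if min_count > max_count:
        raise ValueError("min_count must not be greater than max_count")


def row_count_in_bounds(row_count: int, config: Dict[str, Any]) -> bool:
    """Return True if row_count lies within the configured bounds."""
    validate_range(config["min_count"], config["max_count"])
    # Assumption: both bounds are inclusive.
    return config["min_count"] <= row_count <= config["max_count"]


config = {"check_id": "daily-volume", "min_count": 100, "max_count": 5000}
normal_load = row_count_in_bounds(1200, config)
tiny_load = row_count_in_bounds(12, config)
```

Validating the bounds when the configuration is built, rather than when the data arrives, surfaces a mistyped range before any job runs.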
- class RowCountExactCheckConfig(*, check_id: str, severity: Severity = Severity.CRITICAL, expected_count: int)[source]
Bases:
BaseAggregateCheckConfig
Declarative configuration model for the RowCountExactCheck.
This configuration defines an exact row count requirement for a dataset. It ensures that the expected_count parameter is provided and is non-negative.
- expected_count
The exact number of rows expected in the dataset.
- Type:
int
- check_class
alias of
RowCountExactCheck
- model_config: ClassVar[ConfigDict] = {'populate_by_name': True, 'validate_by_alias': True, 'validate_by_name': True}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- validate_expected() RowCountExactCheckConfig [source]
Validate that the configured expected_count is non-negative.
- Returns:
The validated configuration object.
- Return type:
RowCountExactCheckConfig
- Raises:
InvalidCheckConfigurationError – If expected_count is negative.
- class RowCountMaxCheckConfig(*, check_id: str, severity: Severity = Severity.CRITICAL, max_count: int)[source]
Bases:
BaseAggregateCheckConfig
Declarative configuration model for the RowCountMaxCheck.
This configuration defines a maximum row count requirement for a dataset. It ensures that the max_count parameter is provided and has a positive value.
- max_count
Maximum number of rows allowed in the dataset.
- Type:
int
- check_class
alias of
RowCountMaxCheck
- model_config: ClassVar[ConfigDict] = {'populate_by_name': True, 'validate_by_alias': True, 'validate_by_name': True}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- validate_max() RowCountMaxCheckConfig [source]
Validate that the configured max_count is greater than 0.
- Returns:
The validated configuration object.
- Return type:
RowCountMaxCheckConfig
- Raises:
InvalidCheckConfigurationError – If max_count is not greater than 0.
- class RowCountMinCheckConfig(*, check_id: str, severity: Severity = Severity.CRITICAL, min_count: int)[source]
Bases:
BaseAggregateCheckConfig
Declarative configuration model for the RowCountMinCheck.
This configuration defines a minimum row count requirement for a dataset. It ensures that the min_count parameter is provided and is non-negative.
- min_count
Minimum number of rows expected in the dataset.
- Type:
int
- check_class
alias of
RowCountMinCheck
- model_config: ClassVar[ConfigDict] = {'populate_by_name': True, 'validate_by_alias': True, 'validate_by_name': True}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- validate_min() RowCountMinCheckConfig [source]
Validate that the configured min_count is non-negative.
- Returns:
The validated configuration object.
- Return type:
RowCountMinCheckConfig
- Raises:
InvalidCheckConfigurationError – If min_count is negative.
- class SchemaCheckConfig(*, check_id: str, severity: Severity = Severity.CRITICAL, expected_schema: dict[str, str], strict: bool = True)[source]
Bases:
BaseAggregateCheckConfig
Declarative configuration model for the SchemaCheck.
Ensures the DataFrame matches the expected schema, with optional strict mode. Validates all specified types, including support for decimal(p,s) types.
- expected_schema
Required column names and Spark types.
- Type:
dict[str, str]
- strict
Whether to disallow unexpected columns.
- Type:
bool
- check_class
alias of
SchemaCheck
- model_config: ClassVar[ConfigDict] = {'populate_by_name': True, 'validate_by_alias': True, 'validate_by_name': True}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- validate_schema() SchemaCheckConfig [source]
Validates that expected_schema is not empty and all types are valid.
- Raises:
InvalidCheckConfigurationError – If any type is invalid.
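The effect of strict mode can be sketched by comparing two name-to-type mappings. This is plain Python rather than the library's logic: `schema_problems` is a hypothetical helper, the actual schema is represented as a dict of type strings, and types are compared by exact string equality, which is an assumption.

```python
from typing import Dict, List


def schema_problems(actual: Dict[str, str], expected_schema: Dict[str, str],
                    strict: bool = True) -> List[str]:
    """List missing columns, type mismatches and, in strict mode, unexpected columns."""
    problems = []
    for column, expected_type in expected_schema.items():
        if column not in actual:
            problems.append(f"missing column: {column}")
        elif actual[column] != expected_type:
            problems.append(
                f"type mismatch: {column} is {actual[column]}, expected {expected_type}"
            )
    if strict:
        for column in actual:
            if column not in expected_schema:
                problems.append(f"unexpected column: {column}")
    return problems


actual = {"id": "int", "amount": "decimal(10,2)", "note": "string"}
expected = {"id": "int", "amount": "decimal(10,2)"}
strict_problems = schema_problems(actual, expected, strict=True)
lenient_problems = schema_problems(actual, expected, strict=False)
```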
- class StringLengthBetweenCheckConfig(*, check_id: str, severity: Severity = Severity.CRITICAL, column: str, min_length: int, max_length: int, inclusive: tuple[bool, bool] = (True, True))[source]
Bases:
BaseRowCheckConfig
Configuration for StringLengthBetweenCheck.
Validates that string values in the given column fall between a minimum and maximum length.
- column
The string column to validate.
- Type:
str
- min_length
Minimum valid length (must be > 0).
- Type:
int
- max_length
Maximum valid length (must be >= min_length).
- Type:
int
- inclusive
Tuple indicating inclusiveness of min and max bounds.
- Type:
tuple[bool, bool]
- check_class
alias of
StringLengthBetweenCheck
- model_config: ClassVar[ConfigDict] = {'populate_by_name': True, 'validate_by_alias': True, 'validate_by_name': True}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- validate_range() StringLengthBetweenCheckConfig [source]
Validates that the min/max configuration is logically sound.
- Returns:
Validated instance.
- Return type:
StringLengthBetweenCheckConfig
- Raises:
InvalidCheckConfigurationError – If min_length > max_length or values are invalid.
- class StringMaxLengthCheckConfig(*, check_id: str, severity: Severity = Severity.CRITICAL, column: str, max_length: int, inclusive: bool = True)[source]
Bases:
BaseRowCheckConfig
Configuration for StringMaxLengthCheck.
Ensures that string values do not exceed the specified maximum length.
- column
Column to validate.
- Type:
str
- max_length
Maximum allowed length (must be > 0).
- Type:
int
- inclusive
If True, length must be <= max_length. If False, strictly < max_length.
- Type:
bool
- check_class
alias of
StringMaxLengthCheck
- model_config: ClassVar[ConfigDict] = {'populate_by_name': True, 'validate_by_alias': True, 'validate_by_name': True}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- validate_max_length() StringMaxLengthCheckConfig [source]
Validate that max_length is greater than 0.
- Returns:
The validated object.
- Return type:
StringMaxLengthCheckConfig
- Raises:
InvalidCheckConfigurationError – If max_length <= 0.
- class StringMinLengthCheckConfig(*, check_id: str, severity: Severity = Severity.CRITICAL, column: str, min_length: int, inclusive: bool = True)[source]
Bases:
BaseRowCheckConfig
Configuration for StringMinLengthCheck.
Ensures that all non-null values in the specified column have a minimum length.
- column
Name of the string column to validate.
- Type:
str
- min_length
Minimum allowed string length (must be > 0).
- Type:
int
- inclusive
If True, allows equality (>=). If False, requires strictly greater length (>).
- Type:
bool
- check_class
alias of
StringMinLengthCheck
- model_config: ClassVar[ConfigDict] = {'populate_by_name': True, 'validate_by_alias': True, 'validate_by_name': True}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- validate_min_length() StringMinLengthCheckConfig [source]
Validate that the configured min_length is greater than 0.
- Returns:
The validated configuration object.
- Return type:
StringMinLengthCheckConfig
- Raises:
InvalidCheckConfigurationError – If min_length is not greater than 0.
- class TimestampBetweenCheck(check_id: str, columns: List[str], min_value: str, max_value: str, inclusive: tuple[bool, bool], severity: Severity = Severity.CRITICAL)[source]
Bases:
BaseBetweenCheck
Row-level data quality check that verifies timestamp values are within a defined range.
A row fails the check if any of the specified columns contain a timestamp value that is less than min_value or greater than max_value. Boundary inclusiveness is configurable.
- class TimestampMaxCheckConfig(*, check_id: str, severity: Severity = Severity.CRITICAL, columns: List[str], max_value: str, inclusive: bool = False)[source]
Bases:
BaseRowCheckConfig
Declarative configuration model for TimestampMaxCheck.
- columns
The timestamp columns to validate.
- Type:
List[str]
- max_value
The maximum allowed timestamp in ISO 8601 format.
- Type:
str
- inclusive
Whether to include the upper bound timestamp.
- Type:
bool
- check_class
alias of
TimestampMaxCheck
- model_config: ClassVar[ConfigDict] = {'populate_by_name': True, 'validate_by_alias': True, 'validate_by_name': True}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- class TimestampMinCheckConfig(*, check_id: str, severity: Severity = Severity.CRITICAL, columns: List[str], min_value: str, inclusive: bool = False)[source]
Bases:
BaseRowCheckConfig
Declarative configuration model for the TimestampMinCheck.
- columns
The list of timestamp columns to validate.
- Type:
List[str]
- min_value
The minimum allowed timestamp in ISO 8601 format (e.g. ‘2023-01-01T00:00:00’).
- Type:
str
- inclusive
Whether the minimum value is inclusive.
- Type:
bool
- check_class
alias of
TimestampMinCheck
- model_config: ClassVar[ConfigDict] = {'populate_by_name': True, 'validate_by_alias': True, 'validate_by_name': True}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- class UniqueRatioCheckConfig(*, check_id: str, severity: Severity = Severity.CRITICAL, column: str, min_ratio: float)[source]
Bases:
BaseAggregateCheckConfig
Declarative configuration model for the UniqueRatioCheck.
- column
The column to check for uniqueness.
- Type:
str
- min_ratio
The minimum acceptable ratio of distinct values (0.0 - 1.0).
- Type:
float
- check_class
alias of
UniqueRatioCheck
- model_config: ClassVar[ConfigDict] = {'populate_by_name': True, 'validate_by_alias': True, 'validate_by_name': True}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- class UniqueRowsCheckConfig(*, check_id: str, severity: Severity = Severity.CRITICAL, subset_columns: List[str] | None = None)[source]
Bases:
BaseAggregateCheckConfig
Declarative configuration for the UniqueRowsCheck.
This check verifies that no duplicate row combinations exist in the dataset. Uniqueness can be enforced across all columns or a selected subset.
- subset_columns
List of columns to define uniqueness. If not provided, all columns are used.
- Type:
Optional[List[str]]
- check_class
alias of
UniqueRowsCheck
- model_config: ClassVar[ConfigDict] = {'populate_by_name': True, 'validate_by_alias': True, 'validate_by_name': True}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
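The difference between checking uniqueness across all columns and across a subset can be sketched with a counter over row keys. The helper `duplicate_keys` and the dict-based rows are hypothetical and only illustrate the idea in plain Python.

```python
from collections import Counter
from typing import Any, Dict, List, Optional, Tuple


def duplicate_keys(rows: List[Dict[str, Any]],
                   subset_columns: Optional[List[str]] = None) -> List[Tuple[Any, ...]]:
    """Return the column-value combinations that occur more than once."""
    if not rows:
        return []
    # Fall back to all columns when no subset is given.
    columns = subset_columns if subset_columns else sorted(rows[0])
    counts = Counter(tuple(row.get(column) for column in columns) for row in rows)
    return [key for key, count in counts.items() if count > 1]


rows = [
    {"id": 1, "email": "a@example.com"},
    {"id": 2, "email": "a@example.com"},
    {"id": 1, "email": "b@example.com"},
]
all_columns = duplicate_keys(rows)
by_email = duplicate_keys(rows, subset_columns=["email"])
by_id = duplicate_keys(rows, subset_columns=["id"])
```

The same data is unique when all columns are considered but contains duplicates once the key is narrowed to a single column, so the choice of subset_columns decides the outcome.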