validatex.core package
Submodules
validatex.core.expectation module
Expectation base classes and registry.
This module defines the base Expectation class and the global registry that maps expectation type names to their implementation classes.
- class validatex.core.expectation.Expectation(column: str | None = None, kwargs: Dict[str, Any] = <factory>, meta: Dict[str, Any] = <factory>)
Bases: ABC
Abstract base class for all expectations.
Subclasses must:
- Set the class attribute expectation_type (a unique string id).
- Implement _validate_pandas() and/or _validate_spark().
- column: str | None = None
- expectation_type: str = 'base_expectation'
- classmethod from_dict(d: Dict[str, Any]) -> Expectation
Deserialize from a dictionary.
- kwargs: Dict[str, Any]
- meta: Dict[str, Any]
- validate(data: Any, engine: str = 'pandas') -> ExpectationResult
Run this expectation against data using the specified engine.
- Parameters:
data (Any) – The dataset (pd.DataFrame or pyspark.sql.DataFrame).
engine (str) – "pandas" or "spark".
- Return type:
ExpectationResult
- validatex.core.expectation.get_expectation_class(name: str) -> Type[Expectation]
Look up an expectation class by its registered type name.
- validatex.core.expectation.list_expectations() -> List[str]
Return a sorted list of all registered expectation type names.
- validatex.core.expectation.register_expectation(cls: Type[Expectation]) -> Type[Expectation]
Decorator that registers an expectation class under its expectation_type name.
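The three registry functions above follow a common decorator-registry pattern. The sketch below is a self-contained illustration of that pattern, not ValidateX's actual implementation: the _REGISTRY dictionary, the stand-in Expectation class, the two example subclasses, and the KeyError raised for unknown names are all assumptions made for the example.

```python
from abc import ABC
from typing import Dict, List, Type


class Expectation(ABC):
    """Stand-in base class (assumption); only the type name matters here."""

    expectation_type: str = "base_expectation"


# Hypothetical module-level registry: type name -> implementation class.
_REGISTRY: Dict[str, Type[Expectation]] = {}


def register_expectation(cls: Type[Expectation]) -> Type[Expectation]:
    """Class decorator: store cls under its expectation_type, return it unchanged."""
    _REGISTRY[cls.expectation_type] = cls
    return cls


def get_expectation_class(name: str) -> Type[Expectation]:
    """Look up a class by registered name (error behavior is an assumption)."""
    try:
        return _REGISTRY[name]
    except KeyError:
        raise KeyError(f"Unknown expectation type: {name!r}") from None


def list_expectations() -> List[str]:
    """Return all registered type names in sorted order."""
    return sorted(_REGISTRY)


@register_expectation
class ExpectColumnToNotBeNull(Expectation):
    expectation_type = "expect_column_to_not_be_null"


@register_expectation
class ExpectColumnValuesToBeBetween(Expectation):
    expectation_type = "expect_column_values_to_be_between"
```

Because the decorator returns the class unchanged, a decorated class can still be imported and instantiated directly; registration only adds the name-based lookup.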
validatex.core.result module
Validation result data models.
Every expectation run produces an ExpectationResult.
A full validation run aggregates them into a ValidationResult.
- class validatex.core.result.ColumnHealthSummary(column: str, checks: int = 0, passed: int = 0, failed: int = 0, errors: int = 0, null_count: int | None = None, null_percent: float | None = None, unique_count: int | None = None, unique_percent: float | None = None, total_rows: int | None = None)
Bases: object
Aggregated health metrics for a single column.
- checks: int = 0
- column: str
- errors: int = 0
- failed: int = 0
- property health_score: float
- null_count: int | None = None
- null_percent: float | None = None
- passed: int = 0
- total_rows: int | None = None
- unique_count: int | None = None
- unique_percent: float | None = None
- class validatex.core.result.ExpectationResult(expectation_type: str, success: bool, column: str | None = None, observed_value: Any = None, element_count: int = 0, unexpected_count: int = 0, unexpected_percent: float = 0.0, unexpected_values: List[Any] = <factory>, details: Dict[str, Any] = <factory>, exception_info: str | None = None, meta: Dict[str, Any] = <factory>)
Bases: object
Result of a single expectation evaluation.
- column: str | None = None
- details: Dict[str, Any]
- element_count: int = 0
- exception_info: str | None = None
- expectation_type: str
- property human_observed: str
Return a human-readable string for the observed value.
Converts raw dicts / technical strings into executive-friendly text.
- meta: Dict[str, Any]
- observed_value: Any = None
- property severity: str
Return severity level for this expectation.
- property severity_icon: str
- property status: str
- property status_icon: str
- success: bool
- unexpected_count: int = 0
- unexpected_percent: float = 0.0
- unexpected_values: List[Any]
- class validatex.core.result.ValidationResult(suite_name: str, results: List[ExpectationResult] = <factory>, run_time: datetime | None = None, run_duration_seconds: float = 0.0, data_source: str | None = None, engine: str = 'pandas', statistics: Dict[str, Any] = <factory>)
Bases: object
Aggregate result of running an entire expectation suite.
- column_health() -> List[ColumnHealthSummary]
Aggregate expectation results by column.
Extracts null % and unique % from specific expectation types when present.
- compute_quality_score() -> float
Compute a weighted data quality score (0–100).
- Severity weights:
Critical: ×3
Warning: ×2
Info: ×1
Score = 100 × (weighted_passed / weighted_total)
- data_source: str | None = None
- engine: str = 'pandas'
- property errored_expectations: int
- property failed_expectations: int
- results: List[ExpectationResult]
- run_duration_seconds: float = 0.0
- run_time: datetime | None = None
- statistics: Dict[str, Any]
- property success: bool
True only if every expectation passed.
- property success_percent: float
- property successful_expectations: int
- suite_name: str
- property total_expectations: int
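The weighted formula documented for compute_quality_score() can be worked through by hand. The following is a minimal standalone sketch of that arithmetic, not the library's code: the lowercase severity labels, the (severity, success) tuple input, and the score returned for an empty run are assumptions made for illustration.

```python
from typing import Iterable, Tuple

# Weights as documented: Critical x3, Warning x2, Info x1.
# The lowercase keys are an assumption about how severities are spelled.
SEVERITY_WEIGHTS = {"critical": 3, "warning": 2, "info": 1}


def quality_score(results: Iterable[Tuple[str, bool]]) -> float:
    """Return 100 * weighted_passed / weighted_total for (severity, success) pairs."""
    weighted_passed = 0
    weighted_total = 0
    for severity, success in results:
        weight = SEVERITY_WEIGHTS[severity]
        weighted_total += weight
        if success:
            weighted_passed += weight
    if weighted_total == 0:
        # Assumption: a run with no expectations scores 100.
        return 100.0
    return 100.0 * weighted_passed / weighted_total


# One failed critical check out of four checks in total:
# weighted_total = 3 + 3 + 2 + 1 = 9, weighted_passed = 3 + 2 + 1 = 6.
results = [
    ("critical", True),
    ("critical", False),
    ("warning", True),
    ("info", True),
]
score = quality_score(results)  # 100 * 6 / 9
```

Here three of four checks pass (75% unweighted), yet the weighted score is about 66.7, because the single failure carries critical weight.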
validatex.core.suite module
Expectation Suite — a named, ordered collection of expectations.
Suites can be built programmatically or loaded from YAML / JSON configs.
- class validatex.core.suite.ExpectationSuite(name: str, expectations: List[Expectation] = <factory>, meta: Dict[str, Any] = <factory>)
Bases: object
A named collection of expectations.
Examples
>>> suite = ExpectationSuite("user_data_quality")
>>> suite.add("expect_column_to_not_be_null", column="user_id")
>>> suite.add("expect_column_values_to_be_between",
...           column="age", min_value=0, max_value=150)
- add(expectation_type: str, column: str | None = None, meta: Dict[str, Any] | None = None, **kwargs: Any) -> ExpectationSuite
Add an expectation to this suite.
- Parameters:
expectation_type (str) – The registered name of the expectation (e.g. "expect_column_to_not_be_null").
column (str, optional) – Target column name.
meta (dict, optional) – Arbitrary metadata to attach.
**kwargs – Additional arguments forwarded to the expectation (e.g. min_value, regex).
- Returns:
self, for fluent chaining.
- Return type:
ExpectationSuite
- add_expectation(expectation: Expectation) -> ExpectationSuite
Add a pre-built Expectation instance.
- clear() -> ExpectationSuite
Remove all expectations.
- expectations: List[Expectation]
- classmethod from_dict(data: Dict[str, Any]) -> ExpectationSuite
Create a suite from a plain dictionary.
- classmethod load(filepath: str) -> ExpectationSuite
Load a suite from a YAML or JSON file.
- meta: Dict[str, Any]
- name: str
- remove(index: int) -> ExpectationSuite
Remove an expectation by index.
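Since add(), remove(), and clear() are all documented as returning the suite itself, calls can be chained. The sketch below illustrates that fluent style with a hypothetical stand-in class, MiniSuite, which stores expectations as plain dictionaries; it is an assumption-laden illustration of the pattern, not ExpectationSuite's real internals.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class MiniSuite:
    """Hypothetical stand-in for ExpectationSuite, for illustration only."""

    name: str
    expectations: List[Dict[str, Any]] = field(default_factory=list)

    def add(
        self,
        expectation_type: str,
        column: Optional[str] = None,
        **kwargs: Any,
    ) -> "MiniSuite":
        # Record the expectation and return self so calls can be chained.
        self.expectations.append(
            {"expectation_type": expectation_type, "column": column, "kwargs": kwargs}
        )
        return self

    def remove(self, index: int) -> "MiniSuite":
        # Delete by position, again returning self.
        del self.expectations[index]
        return self

    def clear(self) -> "MiniSuite":
        self.expectations.clear()
        return self


suite = (
    MiniSuite("user_data_quality")
    .add("expect_column_to_not_be_null", column="user_id")
    .add("expect_column_values_to_be_between", column="age", min_value=0, max_value=150)
    .remove(0)  # drops the not-null check added first
)
```

Returning self from every mutating method is what makes the parenthesized chain possible; each call still mutates the same suite object rather than creating a copy.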
validatex.core.validator module
Validator — orchestrates expectation suite execution against a dataset.
The validate() convenience function is the primary public entry point.
- class validatex.core.validator.Validator(suite: ExpectationSuite, engine: str = 'pandas')
Bases: object
Runs an ExpectationSuite against a dataset.
- Parameters:
suite (ExpectationSuite) – The suite of expectations to evaluate.
engine (str) – "pandas" or "spark".
- run(data: Any, data_source: str | None = None) -> ValidationResult
Execute every expectation in the suite against data.
- Parameters:
data (pd.DataFrame | pyspark.sql.DataFrame) – The dataset to validate.
data_source (str, optional) – A label describing where the data came from.
- Return type:
ValidationResult
- validatex.core.validator.validate(data: Any, suite: ExpectationSuite, engine: str = 'pandas', data_source: str | None = None) -> ValidationResult
Convenience function to validate data against a suite.
- Parameters:
data (pd.DataFrame | pyspark.sql.DataFrame)
suite (ExpectationSuite)
engine (str) – "pandas" or "spark".
data_source (str, optional)
- Return type:
ValidationResult
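The orchestration that Validator.run() is described as performing (evaluate every expectation, collect one result each, report the overall outcome) can be sketched independently of the library. Everything below is a hypothetical illustration: MiniResult, MiniRun, and run_checks are invented names, plain callables over a list of dicts stand in for expectations and DataFrames, and catching exceptions into exception_info is an assumption suggested by the documented exception_info and errored_expectations fields.

```python
import time
from dataclasses import dataclass, field
from typing import Any, Callable, List, Optional, Tuple


@dataclass
class MiniResult:
    """Hypothetical, reduced stand-in for ExpectationResult."""

    expectation_type: str
    success: bool
    exception_info: Optional[str] = None


@dataclass
class MiniRun:
    """Hypothetical, reduced stand-in for ValidationResult."""

    suite_name: str
    results: List[MiniResult] = field(default_factory=list)
    run_duration_seconds: float = 0.0

    @property
    def success(self) -> bool:
        # True only if every check passed.
        return all(r.success for r in self.results)


def run_checks(
    suite_name: str,
    checks: List[Tuple[str, Callable[[Any], bool]]],
    data: Any,
) -> MiniRun:
    """Run each named check against data, recording failures and errors."""
    start = time.perf_counter()
    run = MiniRun(suite_name)
    for name, check in checks:
        try:
            run.results.append(MiniResult(name, bool(check(data))))
        except Exception as exc:
            # Assumption: a crashing check becomes a failed result, not a crash.
            run.results.append(MiniResult(name, False, exception_info=repr(exc)))
    run.run_duration_seconds = time.perf_counter() - start
    return run


rows = [{"user_id": 1, "age": 34}, {"user_id": 2, "age": 151}]
checks = [
    ("user_id_not_null", lambda d: all(r["user_id"] is not None for r in d)),
    ("age_between_0_150", lambda d: all(0 <= r["age"] <= 150 for r in d)),
    # The rows have no "email" key, so this check raises KeyError.
    ("email_not_null", lambda d: all(r["email"] is not None for r in d)),
]
run = run_checks("user_data_quality", checks, rows)
```

In this sketch the second check fails on the out-of-range age and the third is recorded as an error rather than aborting the run, so run.success is False while all three results remain available for reporting.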
Module contents
Core module for ValidateX - contains the fundamental building blocks.
The package namespace exposes classes from the submodules directly, including the following. Their signatures and members are identical to the entries documented under each submodule above.
- class validatex.core.Expectation (see validatex.core.expectation.Expectation)
- class validatex.core.ExpectationResult (see validatex.core.result.ExpectationResult)
- class validatex.core.ExpectationSuite (see validatex.core.suite.ExpectationSuite)
- class validatex.core.ValidationResult (see validatex.core.result.ValidationResult)
- class validatex.core.Validator (see validatex.core.validator.Validator)