feat: SQL-backed feature validation framework #136
base: main
Conversation
…idations

- Add validation interfaces (FeatureValidationError, FeatureValidator, SqlFeatureCorrector)
- Implement several validations (LawAndOrderValidator)
- Retrofit the existing checks in SqlBackedFeature.compute and in feature_from_query to raise the new FeatureValidationError
- Add an algorithm (validator_loop) that tries to fix validation problems using a callback

Fixes #376.
…eded for categorical features to work correctly
Force-pushed from 59489f2 to a7291d5.
…p schemas in it, so that they are definitely deleted when the process exits.
@leonidb I fixed the categorical feature bug, and made the change I wanted to make to DuckdbManager. This PR is now ready for review.
```
- No errors are raised
- Feature does not always return null, or always NaN, or always the same value
- Feature does not access input or secondary columns that it does not declare
- Output is the same per row regarding of the (natural) ordering of the input table
```
typo: regardless
```
The index_column_name MAY NOT shadow a column that appears in the 'real' primary input table, even if the query
doesn't use that column. This restriction may be lifted in the future.

Note that `primary_table` is a string, not a DuckdbName, because it is register()ed with the connection
```
NIT: registered*
```
Does not check if the feature depends on the natural ordering of any of the secondary tables.
Reordering the secondary tables without making a full copy of each or defining a query that reads and fully sorts each
is an unsolved problem. And making a copy of each of them is too expensive in the general case. So we don't validate it
for now.
```
is this even a requirement?
```python
async def _test_row_by_row(self, feature: Feature, input: Dataset, conn: DuckDBPyConnection,
                           all_rows_result: pl.Series) -> None:
    row_indexes = set(random.Random(42).sample(range(input.data.height),
                                               min(self.rows_to_compute_individually, input.data.height)))
```
should we have a minimum input.data.height for the validation? We don't want validation to pass just because it was called on an input of height 1.
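A minimal sketch of the kind of guard being suggested (the threshold and message are assumptions, not from this PR):

```python
# Hypothetical guard: refuse to validate inputs too small for the
# row-by-row check to be meaningful. The threshold is an assumption.
MIN_ROWS_FOR_VALIDATION = 10

if input.data.height < MIN_ROWS_FOR_VALIDATION:
    raise FeatureValidationError(
        f"need at least {MIN_ROWS_FOR_VALIDATION} rows to validate, "
        f"got {input.data.height}")
```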
```python
args = tuple(args_by_name[name] for name in feature.params.names)
row_result = await self._acompute(feature, args, conn)
if row_result != all_rows_result[index] and not \
        (isinstance(row_result, float) and math.isnan(row_result) and math.isnan(all_rows_result[index])):
```
why the special case? Is there a different representation of NaN between acompute and acompute_batch?
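For context, the branch is most likely there not because the two code paths represent NaN differently, but because NaN never compares equal to itself; a minimal illustration:

```python
import math

# IEEE-754 NaN compares unequal to everything, including itself, so the
# plain `row_result != all_rows_result[index]` check would report a
# spurious mismatch whenever both sides are NaN.
nan = float("nan")
assert nan != nan       # equality alone can never detect "both are NaN"
assert math.isnan(nan)  # hence the explicit math.isnan() branch
```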
```python
for index in row_indexes:
    args_by_name = input.data.row(index, named=True)
    args = tuple(args_by_name[name] for name in feature.params.names)
    row_result = await self._acompute(feature, args, conn)
```
why not in parallel?
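One hedged sketch of how the loop could be parallelized; it assumes _acompute is safe to call concurrently on the same connection, which may not hold for a single DuckDB connection (and could be why the original is sequential):

```python
import asyncio

# Hypothetical parallel variant of the loop above; all names besides
# asyncio.gather come from the diff.
async def compute_one(index: int):
    args_by_name = input.data.row(index, named=True)
    args = tuple(args_by_name[name] for name in feature.params.names)
    return index, await self._acompute(feature, args, conn)

row_results = await asyncio.gather(*(compute_one(i) for i in row_indexes))
```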
```python
max_global_retries: int
max_local_retries: int
corrector: SqlFeatureCorrector
validators: Sequence[FeatureValidator]
```
apparently research uses a different corrector per validator - maybe we can put the corrector inside the FeatureValidator class? What do you think?

This design also works, but the corrector needs to support all errors - I do not know which is better...
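A sketch of the alternative being discussed, with hypothetical shapes (the real interfaces are the ones in this PR):

```python
from typing import Protocol

# Hypothetical shape of the proposal: each validator carries the corrector
# that can fix the errors it raises, instead of the loop holding one
# corrector that must handle every validator's errors.
class FeatureValidator(Protocol):
    corrector: "SqlFeatureCorrector"

    async def validate(self, feature: "Feature") -> None:
        """Raises FeatureValidationError if the feature is invalid."""
```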
```python
    PSEUDOCOLUMN = 2
    NONE = 3


def test_rowid_nature(conn: DuckDBPyConnection, name: str) -> RowidNature:
```
where is this used?
What does this PR do?
Add a framework and implementation for validating both black-box and SQL-query-based features.
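As a rough sketch of the loop's shape (hypothetical signatures inferred from the names in this PR; the real implementation is in validator_loop.py and also has a global retry budget):

```python
# Hypothetical sketch: run each validator and, on FeatureValidationError,
# ask the corrector callback for a repaired feature, then re-validate.
async def validator_loop(feature, validators, corrector, max_local_retries):
    for validator in validators:
        for attempt in range(max_local_retries + 1):
            try:
                await validator.validate(feature)
                break  # this validator passed; move to the next one
            except FeatureValidationError as error:
                if attempt == max_local_retries:
                    raise  # local retry budget exhausted
                feature = await corrector.correct(feature, error)
    return feature
```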
Changes
- Add validation interfaces (`FeatureValidationError`, `FeatureValidator`, `SqlFeatureCorrector`)
- Implement several validations (`LawAndOrderValidator`)
- Retrofit the existing checks in `SqlBackedFeature.compute` and in `feature_from_query` to raise the new `FeatureValidationError`
- Add an algorithm (`validator_loop.py`) that tries to fix validation problems using a callback

Remaining work and other notes
- A new wrapper function conforming to the `FeatureFromQueryCtor` protocol that would, I assume, call `categorical_feature_from_query` and then (if it succeeded) call the resulting feature on some dataset to collect the values it returns and `attrs.evolve()` the feature to have a correct categories list. Without this, subsequent validations will fail because `categorical_feature_from_query` by itself returns a feature with a deliberately wrong categories list, since it doesn't compute it on any data. Any errors raised in this new wrapper function would need to be surfaced as FeatureValidationErrors. A sketch of what I have in mind follows below.

Please let me know if this is a reasonable plan or if something else is needed.
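Roughly (hypothetical wrapper name and compute API; only `attrs.evolve` and `categorical_feature_from_query` are from this PR):

```python
import attrs

# Hypothetical wrapper for the plan above: build the categorical feature,
# run it once to discover the categories it actually produces, then evolve
# the feature so its categories list is correct before validation.
async def categorical_feature_with_real_categories(query, dataset):
    feature = categorical_feature_from_query(query)  # categories wrong by design
    values = await feature.compute(dataset)          # assumed compute API
    categories = sorted({v for v in values if v is not None})
    return attrs.evolve(feature, categories=categories)
```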
- (The validator takes a `rows_to_compute_individually` argument because doing it for all rows is likely to be expensive.)

Related Issues
Fixes SparkBeyond/aoa#376