Query transformer infrastructure & example query transformer implementations #1
Based on visualfabriq#27 but can be rebased on master.
This introduces the infrastructure for plug-in query transformers. Included are three sample query transformers:
- `InOperatorTransformer`: `my_col in ['ABC', 'DEF']` is transformed into `(my_col == 'ABC') | (my_col == 'DEF')`. The operation `not in` is similarly transformed.
- `TrivialBooleanExpressionsOptimizer`: `(my_col == 'ABC') & (False)` is transformed into `False` (limited usefulness without an intelligent query optimizer).
- `CachedFactorOptimizer`: converts comparisons containing columns with cached factors into comparisons using the factor instead. (Naive implementation, currently only useful for edge cases.)

By default, this PR does not change the behaviour or dependencies of bquery. Query transformers have to be explicitly enabled by configuring them, e.g.:
For convenience, a shortcut is provided for these (currently) most useful transformers with `transformers.standard_transformers`:

The overhead for queries is negligible for reasonably sized databases: for the query `db["my_col=='AB1234567890'"]`, bquery without query transformers requires 362 ms; with all query transformers configured (including `CachedFactorOptimizer`) it requires 367 ms.

With a non-compressed database, the `CachedFactorOptimizer` shows some minor positive effects: 547 ms vs. 296 ms.
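
To illustrate the kind of rewrite `InOperatorTransformer` performs, here is a minimal, standalone sketch built on Python's `ast` module (Python 3.9+ for `ast.unparse`). It is not the code in this PR: the names `InOperatorSketch` and `expand_in` are hypothetical, and the `not in` expansion into `!=` terms joined by `&` is an assumption about what "similarly transformed" means.

```python
import ast


class InOperatorSketch(ast.NodeTransformer):
    """Hypothetical helper: rewrite `x in [a, b]` as `(x == a) | (x == b)`."""

    def visit_Compare(self, node):
        self.generic_visit(node)
        # Only handle a single `in` / `not in` comparison.
        if len(node.ops) != 1 or not isinstance(node.ops[0], (ast.In, ast.NotIn)):
            return node
        container = node.comparators[0]
        # Only a non-empty literal list or tuple can be expanded.
        if not isinstance(container, (ast.List, ast.Tuple)) or not container.elts:
            return node
        negated = isinstance(node.ops[0], ast.NotIn)
        # Assumption: `not in` becomes `!=` terms joined by `&`.
        cmp_op, join_op = (ast.NotEq, ast.BitAnd) if negated else (ast.Eq, ast.BitOr)
        terms = [
            ast.Compare(left=node.left, ops=[cmp_op()], comparators=[elt])
            for elt in container.elts
        ]
        # Chain the per-value comparisons left to right.
        result = terms[0]
        for term in terms[1:]:
            result = ast.BinOp(left=result, op=join_op(), right=term)
        return result


def expand_in(query):
    """Return the query string with `in` / `not in` expanded (sketch only)."""
    tree = ast.parse(query, mode="eval")
    tree = ast.fix_missing_locations(InOperatorSketch().visit(tree))
    return ast.unparse(tree)


if __name__ == "__main__":
    print(expand_in("my_col in ['ABC', 'DEF']"))
    print(expand_in("my_col not in ['ABC', 'DEF']"))
```

Queries without an `in` / `not in` comparison against a literal list pass through unchanged, which mirrors the opt-in, non-intrusive behaviour described above.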