forked from duckdb/duckdb
[WIP] Variant shredded storage #147
Open

Tishj wants to merge 1,163 commits into parquet_variant_intermediate_conversion_merged from variant_shredded_storage
Conversation
Tishj commented on Nov 5, 2025
```cpp
namespace duckdb {

//! Struct column data represents a struct
```
Struct -> Variant
… clarity and consistency, fixed a bug when the input is a constant vector
…ing index selection logic
…ts and bind logic.
…nt vector handling logic
…d test case with additional keys (so the vector_size tests will cover this pitfall)
…rror instead of returning NULL, and revise tests accordingly.
- Added `SupportsAliasReference` and `TryResolveAliasReference` in `ExpressionBinder`.
- Implemented `TryResolveAliasReference` in `SelectBinder`.
- Created tests for alias_ref functionality, including both valid cases and error scenarios.
* Initialize WindowExpression Booleans
…nts_shredding_auto
### Summary of Changes

- Added support for `alias.name` in SELECT projection:
  - Introduced `SupportsAliasReference` and `TryResolveAliasReference` in `ExpressionBinder`.
  - Implemented `TryResolveAliasReference` specifically in `SelectBinder`.
  - Included robust tests for both valid aliases and edge cases.

Related issue: duckdb#13991 (comment)

![image](https://github.com/user-attachments/assets/31b673a3-0835-4618-add3-e4196abc1e2f)

Works with:
- `SELECT`
- `GROUP BY` (works also with inside-expression aliases, `GROUP BY alias.x + 1`, which was impossible in legacy aliasing)
- `WHERE` / `HAVING` / `QUALIFY`
- `ORDER BY`
…uckdb#19878) This PR fixes duckdb#19875

During flushing of the segment we use `AlignValue` before writing the metadata, because it has to be 8-byte aligned. The existing bug is that we were not accounting for this extra space when determining whether we have enough room to store a container. In rare cases this resulted in a segment size that exceeded the size of a block, thankfully causing an InternalException.

The other bug is in the RoaringBoolean `ScanPartial` / `FetchRow` implementations. They were reading at an offset that was out of bounds of the dummy vector that was created, and the scanned data was written to the result Vector at the wrong offset.

Thankfully none of these were bugs in the compression logic, so no borked files were created as a result.
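A minimal, self-contained sketch of the space check described above, assuming an 8-byte alignment requirement for the metadata. The helper and its parameters are stand-ins, not the actual Roaring compression code:

```cpp
#include <cstdint>

// Rounds n up to the next multiple of 8, mirroring an AlignValue-style helper.
static uint64_t AlignTo8(uint64_t n) {
	return (n + 7) / 8 * 8;
}

// Decide whether another container still fits in the segment. The fix amounts to
// accounting for the alignment padding here, when *deciding*, rather than only
// applying the alignment later while flushing and writing the metadata.
static bool ContainerFits(uint64_t data_bytes, uint64_t container_bytes, uint64_t metadata_bytes,
                          uint64_t segment_size) {
	// the metadata must start at an 8-byte aligned offset after the container data
	uint64_t metadata_start = AlignTo8(data_bytes + container_bytes);
	return metadata_start + metadata_bytes <= segment_size;
}
```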
Part of epic: duckdblabs/duckdb-internal#6380

Hi there,

This PR introduces a script `generate_peg_transformer.py` that looks at all the grammar files in `autocomplete/statement` and tries to find the corresponding transformer rule. It outputs, per `*.gram` file, which rules are still missing a transformer rule. In most cases, every grammar rule corresponds to a transformer rule for the PEG parser. Rules can be excluded if we don't require a transformer rule. For `alter.gram`, for instance, the output will be as follows to indicate that these transformer rules are still missing:

```
--- File: alter.gram ---
[ MISSING ] AlterSchemaStmt
[ MISSING ] SetData
```

Additionally, when using the `-g` argument, the script prints a template for every missing transformer rule (in alphabetical order). It will output three parts:

1. The method declaration that should be pasted into `peg_transformer.hpp`
2. The method implementation
3. The transformer rule registration

The standard type that is used is `unique_ptr<SQLStatement>`. This is most likely not correct, but at the moment it is not possible to know the type in this template script. We could add a file where we can specify the return type for every rule if this is important enough. For now I added a `// TODO` to remind users to check the return type. I decided against letting this script modify the `transform_*.cpp` files directly to avoid making the script overly complex. Perhaps in a future update this could be looked into.

For example, for `WithinGroupClause`:

```
--- Generation for rule: WithinGroupClause ---
1. Add DECLARATION to: ../extension/autocomplete/include/transformer/peg_transformer.hpp

// TODO: Verify this return type is correct
static unique_ptr<SQLStatement> TransformWithinGroupClause(PEGTransformer &transformer, optional_ptr<ParseResult> parse_result);

3. Add REGISTRATION to: ../extension/autocomplete/transformer/peg_transformer_factory.cpp
Inside the appropriate Register...() function:
REGISTER_TRANSFORM(TransformWithinGroupClause);

4. Add IMPLEMENTATION to: ../extension/autocomplete/transformer/transform_expression.cpp

// TODO: Verify this return type is correct
unique_ptr<SQLStatement> PEGTransformerFactory::TransformWithinGroupClause(PEGTransformer &transformer, optional_ptr<ParseResult> parse_result) {
	throw NotImplementedException("TransformWithinGroupClause has not yet been implemented");
}

--- End of WithinGroupClause ---
```

In a future PR, when the transformer is close to completion, I would like to use this file in a workflow to check for missing transformer rules. Finally, it outputs orphan rules: transformer rules that don't have a corresponding grammar rule, so we can avoid keeping transformer rules around that don't do anything.

There are currently two arguments:
- `-g` or `--generate`:
- `-s` or `--skip-found`: Skips the rules that have the status [ FOUND ] or [ ENUM ] to keep the output a bit cleaner.
…st config (duckdb#19860)

Also replace the `load` paths when we're running the `storage_compatibility` test config. As can be seen by the updated list of skipped tests, we now also properly test the Roaring boolean tests, and have to skip them because they are (correctly) not forwards-compatible.

I've had to add a new list of skipped tests because of various problems with doing this path replacement for `load`, such as:
- `load` can be used multiple times in a test, with different paths
- The paths used in `load` can be referenced by the test (detach, attach, export, import, etc.)

Parts of these could be fixed by turning the loaded path into a variable, so we could reference `${ACTIVE_DATABASE_PATH}` for example.

### Misc changes

Fixed an uncaught error if the provided version isn't valid (the download fails).
Print the errors to stderr rather than stdout, for easier filtering of the script output.
Added prints for various reasons that a test was already being skipped without being marked as `[SKIPPED]`, for better debugging of the script.
Decode the boolean primitive type in VectorElementDecoder to avoid an infinite loop over the Decodable version of the decode function, which caused a crash.
… TO parquet (duckdb#19336)

This PR is a follow-up to duckdb#19219, which added the ability to provide a `SHREDDING` copy option to manually set the shredded type of a VARIANT column. That implemented shredding and made it easy to control, but it requires manual configuration by the user.

With this PR, VARIANT columns will always be shredded based on the analysis of the first rowgroup. (TODO: this should be queryable with `parquet_metadata` / `parquet_schema`)

This required some changes to the Parquet writing path:
- `SchemaElement` items are no longer created during bind.
- `ParquetColumnSchema` `schema_index` is now an `optional_idx`, as this index refers to the `SchemaElements` vector, which is populated later.
- `ColumnWriter` no longer takes a reference to a `ParquetColumnSchema` object; instead it now owns the schema.
- `FillParquetSchema` is removed, consolidated into `CreateWriterRecursive`.
- [1] `ParquetWriteTransformData` is introduced; this struct holds the reusable state needed to perform a pre-processing step to transform the input columns into the shape required by the ColumnWriter (relevant for VARIANT).
- `FinalizeSchema` has been added to create the `SchemaElements` and populate the `schema_index` in the schema objects.
- `HasTransform`, `TransformedType` and `TransformExpression` are added to perform the pre-processing step, using the `ParquetWriteTransformData`.
- `AnalyzeSchemaInit`, `AnalyzeSchema` and `AnalyzeSchemaFinalize` are added to auto-detect the shredding for VARIANT columns and apply this change to the schema, to be used before `FinalizeSchema` is called.

[1] This is not as clean as I want it to be; the `ParquetWriteTransformData` can't only live in the `ParquetWriteLocalState`, there also needs to be a copy in the `ParquetWriteGlobalState` as well as in the `ParquetWritePrepareBatch` method.

### Limitations

`DECIMAL` types are never automatically shredded. This is because the DECIMAL type is somewhat special: not only does the underlying type have to be the same, the width and scale also have to match, so I've opted not to collect data for it.
This PR introduces basic row group pruning using the query's `LIMIT` and `OFFSET` for pagination-style queries like `SELECT * FROM t ORDER BY a LIMIT M OFFSET N`. Currently, these queries are processed with a min-heap of size `M + N` up to a threshold of 0.7% of the table cardinality. For queries beyond that threshold, the table is fully sorted and then limited. Especially for larger tables, where we require out-of-core sorting, this results in a hefty performance cliff.

Once again, we use the TSBS `cpu` table with 100M rows and a query `SELECT * FROM cpu ORDER BY time LIMIT 100 OFFSET N`. To illustrate the performance cliff, we set `N` once to 720000 and once to 730000 and compare the nightly build to this PR.

| Offset | Nightly | This PR | Improvement |
| ------ | ------- | ------- | ----------- |
| 720K   | 0.09s   | 0.05s   | **1.8x**    |
| 730K   | 11s     | 0.05s   | **220x**    |
| 50M    | 11s     | 0.05s   | **220x**    |

The PR uses a similar algorithm to duckdb#19655 to exclude row groups. Note that this is only possible if there are no predicates, and only for row groups that do not contain null values in the sort column. Moreover, this method only provides a benefit if the order column is already sorted/partitioned to some degree, i.e., pruning is not possible if the sort values are uniformly distributed.

Finally, this PR moves the pruning optimizations into their own optimizer rule and adds the ability to fetch row group statistics from partition stats in the optimization phase.
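A hedged, self-contained sketch of how min/max-based row-group pruning for `ORDER BY col LIMIT m OFFSET n` can work under the stated conditions (ascending order, no predicates, no NULLs in the sort column). The types and function are illustrative, not the actual DuckDB optimizer code:

```cpp
#include <cstdint>
#include <vector>

// Min/max statistics for a single row group on the ORDER BY column.
struct RowGroupStats {
	int64_t min_value;
	int64_t max_value;
	uint64_t count;
};

// A row group can be skipped if at least (limit + offset) rows in other row
// groups are guaranteed to sort strictly before all of its rows, i.e. their
// max is below this row group's min.
std::vector<bool> PruneRowGroups(const std::vector<RowGroupStats> &groups, uint64_t limit, uint64_t offset) {
	const uint64_t needed = limit + offset;
	std::vector<bool> keep(groups.size(), true);
	for (size_t i = 0; i < groups.size(); i++) {
		uint64_t rows_before = 0;
		for (size_t j = 0; j < groups.size(); j++) {
			if (j != i && groups[j].max_value < groups[i].min_value) {
				rows_before += groups[j].count;
			}
		}
		if (rows_before >= needed) {
			keep[i] = false; // every needed row comes from row groups that sort before this one
		}
	}
	return keep;
}
```

This is also why uniformly distributed sort values defeat the pruning: when every row group's [min, max] range overlaps every other, `rows_before` never reaches `needed`.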
* Extract SortStrategy, NaturalSort and FullSort from HashedSort
* Fix HashedSort ContextClient usage.
* Create SortStrategy::Factory method.
* Clean up inherited instance variables.
* Test AsOf usage.
* Remove non-partitioned code from HashedSort.
* Fix NaturalSort logic errors
* Fix FullSort unused variable.

fixes: duckdblabs/duckdb-internal#6568
Added `JoinHashTable::ScanKeyColumn`, a shared helper that fills a pointer vector and gathers a build column into a Vector, so full hash-table scans live in one place. Updated `PerfectHashJoinExecutor::FullScanHashTable` and `JoinFilterPushdownInfo::PushInFilter` to call this helper instead of duplicating the scan logic, and kept their early-exit semantics. All tests have been run (`make allunit`).
…tract` expressions (duckdb#19829) In the `GetChildColumnBinding` method we currently iterate through child expressions and overwrite the returned `ExpressionBinding` for every child that sets `found_expression` to true. This results in overwriting a found `BoundColumnRefExpression` with a `BoundConstantExpression`, in the case of `struct_extract(my_struct, 'my_field')`
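A heavily simplified, hedged sketch of the "first found binding wins" shape of such a fix, using stand-in types. This is not the actual `GetChildColumnBinding` code or its real fix; it only illustrates not letting a later child overwrite a binding that was already found:

```cpp
#include <memory>
#include <vector>

// Stand-in expression tree: a node is either a column reference or an operator
// with children, e.g. struct_extract(my_struct, 'my_field').
struct Expr {
	bool is_column_ref = false;
	int column_index = -1;
	std::vector<std::unique_ptr<Expr>> children;
};

struct ExprBinding {
	bool found_expression = false;
	int column_index = -1;
};

ExprBinding GetChildColumnBindingSketch(const Expr &expr) {
	if (expr.is_column_ref) {
		return ExprBinding {true, expr.column_index};
	}
	ExprBinding result;
	for (auto &child : expr.children) {
		auto child_binding = GetChildColumnBindingSketch(*child);
		if (child_binding.found_expression) {
			// Return the first binding that was found (the column ref `my_struct`)
			// instead of continuing the loop, where a later child (in the real bug,
			// the constant 'my_field') could overwrite it.
			return child_binding;
		}
	}
	return result;
}
```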
This PR fixes duckdblabs/duckdb-internal#6577 and fixes duckdblabs/duckdb-internal#6578. There's a new assertion in `ColumnScanState::Initialize` that asserts the state is only initialized once, which was broken by AlterType. We fix that by performing the initialization once, not once for every row group.
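A minimal sketch of the pattern behind that fix, using stand-in types rather than the actual DuckDB scan code: the scan state is initialized exactly once up front and then reused across row groups, instead of being re-initialized per row group (which now trips the new assertion):

```cpp
#include <cassert>
#include <vector>

struct ScanStateSketch {
	bool initialized = false;
	void Initialize() {
		assert(!initialized); // mirrors the new "only initialized once" assertion
		initialized = true;
	}
};

struct RowGroupSketch {
	void Scan(ScanStateSketch &state) {
		assert(state.initialized);
		// ... read this row group's data using the shared state ...
	}
};

void AlterTypeScanSketch(std::vector<RowGroupSketch> &row_groups) {
	ScanStateSketch state;
	state.Initialize(); // once, up front
	for (auto &row_group : row_groups) {
		row_group.Scan(state); // previously the state was (incorrectly) initialized here, per row group
	}
}
```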
* Initialize WindowExpression Booleans
This PR fixes nested sleep_ms calls and adds comprehensive tests.

## Changes
- Fix nested sleep_ms calls
- Add tests for nested sleep functionality

## Testing
- Added test coverage for nested sleep scenarios
Changes summary:
- Moved `VariantValue` from `extension/parquet/...` to core
- Moved `ShreddingState` to core, renamed as `VariantShreddingState`, made it an abstract class
- Moved `VariantShredding` to core, made it an abstract class with pure virtual method `WriteVariantValues`
- Moved the `VariantVisitor` struct created for `ConvertVariantToValue` to its own header, renamed the struct to `ValueConverter`
- Moved the `VariantVisitor` struct created for `variant_normalize` to its own header, renamed the struct to `VariantNormalizer`
- Added `VariantStats` for `BaseStatistics`
- Added `LogicalType logical_type;` to `PersistentColumnData` [1]
- Added `VariantColumnData`

[1] This is needed to inform the de/serialization of `shredded` for `VariantColumnData`.

The internal structure of the `VariantColumnData` pairs an `unshredded` field with a shredded part (see the sketch below). The shape of the shredded type mimics the VariantShredding of Parquet's VARIANT type, only we replace the repeated binary encoding with an index (`untyped_value_index`) that refers to a value in the `unshredded` field at the root.

At `Checkpoint` we determine what the shredded type of the Variant should be, with an additional scan. Then we scan the existing data, transforming it to the shredded representation, and write new `ColumnData` for it, which we then `Checkpoint` (this is why `unshredded` is of type STRUCT).
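As a rough illustration of that layout, here is a hedged sketch in DuckDB `LogicalType` terms. Only `unshredded`, `shredded`, and `untyped_value_index` are names taken from the summary above; the remaining field names, types, and the function itself are illustrative and not the PR's actual code:

```cpp
#include "duckdb/common/types.hpp"

using duckdb::LogicalType;

// Illustrative sketch of the Variant storage layout described above.
LogicalType SketchVariantStorageType(const LogicalType &shredded_type) {
	// The shredded part mirrors Parquet's VariantShredding, but the repeated
	// binary "value" encoding is replaced by an index into "unshredded".
	auto shredded = LogicalType::STRUCT({
	    {"untyped_value_index", LogicalType::UINTEGER},
	    {"typed_value", shredded_type},
	});
	// The unshredded part holds the values that were not shredded; it is a
	// STRUCT so that it can be written and checkpointed as regular ColumnData.
	auto unshredded = LogicalType::STRUCT({
	    {"value", LogicalType::BLOB},
	});
	return LogicalType::STRUCT({
	    {"unshredded", unshredded},
	    {"shredded", shredded},
	});
}
```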