forked from duckdb/duckdb
Variant extract pushdown #150
Open
Tishj wants to merge 657 commits into struct_extract_cast_pushdown from variant_extract_pushdown
Three minor fixes:
* one piece of test code that was wrong
* one detail of Window's iterator interface that was off
* one that just cleans up the code a bit, with clearer (to me) iteration
* Rename the window self-join optimizer files to match their contents
* Add support for less than 2 or more, equal to 1 filter conditions.
* Replace some switch statements with state machine operations
* Fix the max threads to use the maximum number of tasks.
* Rename variable with legacy name
Follow-up on duckdb#19906 This PR allows eagerly executing ungrouped min/max aggregates with Parquet row group statistics analogous to the DuckDB file format.
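As a rough illustration of the idea (not the code in this PR), an ungrouped min/max can be answered from per-row-group statistics alone whenever every row group that contains non-NULL values carries both bounds. The types `RowGroupStats`, `StatsAnswer`, and `MinMaxResult` and the function `MinMaxFromStats` below are hypothetical names invented for this sketch; they are not DuckDB or Parquet APIs.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical per-row-group statistics (illustrative only).
struct RowGroupStats {
    std::optional<int64_t> min; // absent when the writer recorded no minimum
    std::optional<int64_t> max; // absent when the writer recorded no maximum
    int64_t null_count;
    int64_t row_count;
};

enum class StatsAnswer { FALLBACK_TO_SCAN, ALL_NULL, VALUE };

struct MinMaxResult {
    StatsAnswer kind;
    int64_t min;
    int64_t max;
};

// Fold the row-group bounds into a global min/max without reading any data.
// If a row group with non-NULL rows lacks statistics, report that a regular
// scan is needed instead of guessing.
MinMaxResult MinMaxFromStats(const std::vector<RowGroupStats> &groups) {
    MinMaxResult result{StatsAnswer::ALL_NULL, 0, 0};
    for (const auto &g : groups) {
        assert(g.null_count <= g.row_count);
        if (g.null_count == g.row_count) {
            continue; // empty or all-NULL row group contributes nothing
        }
        if (!g.min || !g.max) {
            return {StatsAnswer::FALLBACK_TO_SCAN, 0, 0};
        }
        if (result.kind != StatsAnswer::VALUE) {
            result = {StatsAnswer::VALUE, *g.min, *g.max};
        } else {
            result.min = std::min(result.min, *g.min);
            result.max = std::max(result.max, *g.max);
        }
    }
    return result;
}
```

The sketch assumes the statistics are exact bounds; if a writer stored truncated or otherwise inexact bounds, the fallback path would have to be taken for that column as well.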
When the `duckdb.exe` shell is used in the default Windows terminal and the `odbc_scanner` extension is used to connect to an Oracle DB, Unicode output in the console gets broken (for all subsequent queries). Example:

```sql
SELECT 'Здравейте' AS hello;
```

```
UÄÄÄÄÄÄÄÄÄÄÄ¿
3   hello   3
3  varchar  3
AÄÄÄÄÄÄÄÄÄÄÄ'
3 ????????? 3
AÄÄÄÄÄÄÄÄÄÄÄU
```

Expected:

```
┌───────────┐
│   hello   │
│  varchar  │
├───────────┤
│ Здравейте │
└───────────┘
```

The problem was originally reported in duckdb/odbc-scanner#86. It turned out that, when the Oracle ODBC driver is loaded, it changes the system locale, as returned by `setlocale(LC_ALL, NULL)`, from `C` to:

```
LC_COLLATE=C;LC_CTYPE=English_United States.1252;LC_MONETARY=C;LC_NUMERIC=C;LC_TIME=C
```

The original idea was, in `odbc_scanner`, to save the locale value before loading new ODBC drivers and restore the locale after the `odbc_connect` call returns. But it turned out that `setlocale` on Windows is not process-wide, but CRT-wide, that is, scoped to one copy of the MSVC C runtime library ([ref](https://learn.microsoft.com/en-us/cpp/c-runtime-library/global-state?view=msvc-170)). And because, after duckdb/odbc-scanner#87, `odbc_scanner` uses its own copy of the C runtime library, it cannot access or change the locale set by the Oracle driver. For the same reason `main` branch builds of DuckDB are not affected, but the problem is present in `v1.4-andium`.

While the problem affects only a small number of users and is largely non-blocking (only Unicode data is broken, ASCII data is displayed correctly), it breaks the UX for Oracle users. Oracle (along with MSSQL) is the main target for `odbc_scanner`, so the intention is to fix this for `v1.4-andium`.
It was found that changing the translation mode for stdout from `_O_TEXT` to `_O_BINARY` ([ref](https://learn.microsoft.com/en-us/cpp/c-runtime-library/reference/setmode?view=msvc-170)) can be used as a workaround:

```c++
_setmode(_fileno(stdout), _O_BINARY);
```

But this call also turned out to be CRT-wide, so it cannot be applied selectively from `odbc_scanner`.

It was also observed that, historically, `v1.4-andium` uses the `fputs` call to print Unicode to the console. Incoming UTF-8 text is first converted to UTF-16 (with `MultiByteToWideChar`) and then converted back to UTF-8 (in some cases this may be a different multibyte encoding, not UTF-8) with `WideCharToMultiByte` before being passed to `fputs`. On `main` this was changed to use `WriteConsoleW`, passing it UTF-16 directly, and `WriteConsoleW` turned out not to be affected by this problem.

It is understood that recent shell enhancements in `main` are not intended for `v1.4-andium`, so this PR makes a minimal backport, only changing the relevant part of `utf8_printf()` to use `WriteConsoleW` instead of `fputs`.

Testing: with manual smoke checks I cannot see any differences in console output for ordinary queries. However, I have limited experience with the DuckDB shell (I mostly use other clients), so I may have missed some use cases.

Fixes: duckdb/odbc-scanner#86
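The approach described above can be sketched roughly as follows. This is not the actual `utf8_printf()` change; `Utf8ToUtf16` and `PrintUtf8` are hypothetical names, and the hand-written decoder stands in for `MultiByteToWideChar` only so that the conversion step also compiles outside Windows. The console branch uses the documented `GetStdHandle`, `GetConsoleMode`, and `WriteConsoleW` calls and falls back to `fputs` when stdout is not a console (for example, when redirected to a file).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>
#ifdef _WIN32
#include <windows.h>
#endif

// Decode UTF-8 into UTF-16 code units. Assumes mostly well-formed input:
// invalid lead bytes and truncated sequences become U+FFFD, but continuation
// bytes and overlong forms are not validated.
std::u16string Utf8ToUtf16(const std::string &in) {
    std::u16string out;
    size_t i = 0;
    while (i < in.size()) {
        unsigned char c = static_cast<unsigned char>(in[i]);
        uint32_t cp;
        size_t len;
        if (c < 0x80) { cp = c; len = 1; }
        else if ((c >> 5) == 0x06) { cp = c & 0x1F; len = 2; }
        else if ((c >> 4) == 0x0E) { cp = c & 0x0F; len = 3; }
        else if ((c >> 3) == 0x1E) { cp = c & 0x07; len = 4; }
        else { cp = 0xFFFD; len = 1; } // invalid lead byte
        if (i + len > in.size()) {
            out.push_back(static_cast<char16_t>(0xFFFD)); // truncated sequence
            break;
        }
        assert(i + len <= in.size()); // the whole sequence is in bounds
        for (size_t k = 1; k < len; k++) {
            cp = (cp << 6) | (static_cast<unsigned char>(in[i + k]) & 0x3F);
        }
        if (cp > 0x10FFFF) {
            cp = 0xFFFD; // outside the Unicode range
        }
        if (cp >= 0x10000) { // encode as a surrogate pair
            cp -= 0x10000;
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        } else {
            out.push_back(static_cast<char16_t>(cp));
        }
        i += len;
    }
    return out;
}

// Print UTF-8 text; on a Windows console, hand UTF-16 directly to the console
// API so the CRT locale and translation mode are not involved.
void PrintUtf8(const std::string &text) {
#ifdef _WIN32
    HANDLE console = GetStdHandle(STD_OUTPUT_HANDLE);
    DWORD mode = 0;
    if (GetConsoleMode(console, &mode)) { // succeeds only for a real console
        std::u16string wide = Utf8ToUtf16(text);
        DWORD written = 0;
        WriteConsoleW(console, wide.data(), static_cast<DWORD>(wide.size()), &written, nullptr);
        return;
    }
#endif
    std::fputs(text.c_str(), stdout);
}
```

Calling `PrintUtf8("Здравейте\n")` in a Windows console would go through `WriteConsoleW`; everywhere else it behaves like a plain `fputs`.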
The pointer was used incorrectly, but it was checked against NULL. Fixes: d4f7b54 ("add support for opening duckdb filesystem from c-api")
…nish reading to throw, stop reading the current chunk instead of waiting in a busy loop
…r if the rows are not present in the index (duckdb#20430)

This PR cleans up the RowGroup scan code, in particular `ScanCommitted`. This method was originally intended to scan only committed rows, but had a bunch of options bolted onto it (e.g. scanning all rows including any deleted rows, only including deletes that are no longer referenced by any transactions). This also led to a bunch of code duplication and added complexity.

This PR refactors this and cleans up these methods, also removing a bunch of unnecessary / unused methods. These scans can now be performed by passing in `ScanOptions` that determine how the data should be scanned.

This was all just yak shaving when trying to fix a bug uncovered by making `BoundIndex::Delete` throw an error when attempting to delete entries from an index that were not present in the index. This PR also introduces that and fixes two issues uncovered by that change.
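To illustrate the general shape of such a refactor, the sketch below replaces several special-purpose scan methods with a single scan routine driven by an options value. Everything here is hypothetical: the `ScanMode` values, the `Row` fields, and the functions `IsVisible` and `Scan` are invented for illustration, and the real `ScanOptions` in DuckDB may look quite different.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical scan modes, loosely modelled on the cases named above.
enum class ScanMode {
    COMMITTED_ONLY,           // skip every deleted row
    ALL_ROWS,                 // return rows whether or not they are deleted
    SKIP_UNREFERENCED_DELETES // skip only deletes no transaction references
};

struct ScanOptions {
    ScanMode mode;
};

// Simplified row model (illustrative only).
struct Row {
    int64_t id;
    bool deleted;
    bool delete_still_referenced; // some transaction can still see the row
};

// One visibility rule, selected by the options, instead of one method per case.
bool IsVisible(const Row &row, const ScanOptions &options) {
    switch (options.mode) {
    case ScanMode::COMMITTED_ONLY:
        return !row.deleted;
    case ScanMode::ALL_ROWS:
        return true;
    case ScanMode::SKIP_UNREFERENCED_DELETES:
        return !row.deleted || row.delete_still_referenced;
    }
    assert(false && "unhandled ScanMode");
    return false;
}

std::vector<int64_t> Scan(const std::vector<Row> &rows, const ScanOptions &options) {
    std::vector<int64_t> result;
    for (const auto &row : rows) {
        if (IsVisible(row, options)) {
            result.push_back(row.id);
        }
    }
    return result;
}
```

Keeping the mode in one value means a new scan variant only adds a case to `IsVisible` rather than another near-duplicate scan method.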
…et throw (duckdb#20434)

We cannot always immediately throw errors in the JSON reader, as we might need to wait for previous reads to finish to (1) ensure we throw the first error in the file, and (2) know the exact line number where the error occurs. However, in the current implementation, when an error is found that we cannot throw yet, we keep on looping and re-processing the error. This PR instead breaks out of the loop. When the reader of the previous chunk is finished, it will then actually throw the error.
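A minimal single-threaded sketch of this control flow, under simplifying assumptions: `ReaderState`, `PendingError`, `LooksValid`, `ReadChunk`, and `FinishPreviousChunk` are hypothetical names, and `LooksValid` is only a stand-in for real JSON parsing. The point is the `break`: when an error cannot be reported yet, the chunk records it and stops, and the error is raised once the preceding chunk has finished and the line offset is known.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <stdexcept>
#include <string>
#include <vector>

struct PendingError {
    size_t line_in_chunk; // zero-based position of the bad line in its chunk
};

struct ReaderState {
    bool previous_chunk_done = false; // set once the preceding chunk is read
    size_t lines_before_chunk = 0;    // only known when previous_chunk_done
    std::optional<PendingError> pending;
};

// Stand-in for JSON parsing: accept lines that look like an object.
bool LooksValid(const std::string &line) {
    return line.size() >= 2 && line.front() == '{' && line.back() == '}';
}

// Returns the number of lines parsed before stopping.
size_t ReadChunk(const std::vector<std::string> &lines, ReaderState &state) {
    assert(!state.pending); // a chunk with a deferred error is not read again
    size_t parsed = 0;
    for (size_t i = 0; i < lines.size(); i++) {
        if (LooksValid(lines[i])) {
            parsed++;
            continue;
        }
        if (state.previous_chunk_done) {
            throw std::runtime_error("malformed JSON at line " +
                                     std::to_string(state.lines_before_chunk + i + 1));
        }
        state.pending = PendingError{i};
        break; // stop reading this chunk instead of re-processing the error
    }
    return parsed;
}

// Called when the preceding chunk has been fully read.
void FinishPreviousChunk(ReaderState &state, size_t lines_before_chunk) {
    state.previous_chunk_done = true;
    state.lines_before_chunk = lines_before_chunk;
    if (state.pending) {
        throw std::runtime_error("malformed JSON at line " +
                                 std::to_string(lines_before_chunk + state.pending->line_in_chunk + 1));
    }
}
```

In a real multi-threaded reader the state would need synchronization; that is left out here to keep the deferral logic visible.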
* Add the RHS bindings when we are doing SEMI or ANTI ASOF joins with a predicate.
* Disallow using arbitrary predicates in AsOf with RIGHT/FULL/SEMI joins.
* Convert the semi-join to an inner join and import the count directly
* Disallow using arbitrary predicates in AsOf with ANTI joins.
* Remove the predicate test and relocation (join predicate push-down will take care of it)
* Update test plans and add correctness tests for new cases.
* Enforce ordering in test
Follow-up to duckdb#20348. Related issue: duckdblabs/duckdb-internal#7002
Bumped into this while building duckdb-wasm. I would expect other clients or packagers of duckdb to hit this as well, and the fix is simple.
bd03819 to b5d59e3
No description provided.