Conversation
za-arthur left a comment
Thank you, @brendanyounger, for the new feature. Could you also add tests and fix the README?
        return TIMESTAMPOID;
    case arrow::Type::DATE32:
        return DATEOID;
    case arrow::Type::DECIMAL:
Shouldn't DECIMAL128 and DECIMAL256 be used here, similar to bytes_to_postgres_type?
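For illustration, the suggestion above could look roughly like the following standalone sketch. The enum `ArrowTypeId` and the function `to_postgres_type_sketch` are hypothetical stand-ins rather than the PR's actual code, and mapping both decimal widths to the NUMERIC OID is an assumption, since the diff does not show what the DECIMAL case returns. The OID constants are the values from PostgreSQL's pg_type catalog.

```cpp
#include <cstdint>

// Hypothetical stand-in for arrow::Type::type, limited to the cases in the diff.
enum class ArrowTypeId { TIMESTAMP, DATE32, DECIMAL128, DECIMAL256, OTHER };

// Type OIDs as listed in PostgreSQL's pg_type catalog.
constexpr std::uint32_t kInvalidOid = 0;
constexpr std::uint32_t kDateOid = 1082;       // DATEOID
constexpr std::uint32_t kTimestampOid = 1114;  // TIMESTAMPOID
constexpr std::uint32_t kNumericOid = 1700;    // NUMERICOID

// Maps an Arrow type id to a PostgreSQL type OID.
// Assumption: both decimal widths map to NUMERIC.
std::uint32_t to_postgres_type_sketch(ArrowTypeId id) {
    switch (id) {
        case ArrowTypeId::TIMESTAMP:
            return kTimestampOid;
        case ArrowTypeId::DATE32:
            return kDateOid;
        case ArrowTypeId::DECIMAL128:
        case ArrowTypeId::DECIMAL256:
            return kNumericOid;
        default:
            return kInvalidOid;
    }
}
```

Listing both widths explicitly keeps the type mapping and the value conversion in step, so a column is not accepted as NUMERIC by one function and then rejected by the other.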
@brendanyounger Hi, I tested this in PG 15 but couldn't get it to work fully. I tried creating the foreign table with the price column as type NUMERIC, and later as type FLOAT. Regardless of the type used, price always shows up as 0. Example:

However, if I cast it from NUMERIC to FLOAT (or vice versa), every price EXCEPT the one in the first row shows up correctly:

The cast obviously triggers something (after the first row), but I have no idea where to start looking for this bug. Any ideas?
NB: If I count the number of orders with price == 0, this happens:

But with a cast, it finds 105 cases where the price wrongly shows up as 0:

This table is constructed from exactly 105 files, so it's the first price in each file that gets misinterpreted when using a cast (or every row if not using a cast). I guess this might help in finding the bug.
And a bit more testing shows that if you add another cast, the results become random (different results every time):
            (UNIX_EPOCH_JDATE - POSTGRES_EPOCH_JDATE));
    case arrow::Type::DECIMAL128: {
        auto dectype = (arrow::Decimal128Type *)arrow_type;
        std::string val = arrow::Decimal128(bytes).ToString(dectype->scale());
Converting the value to a string and then parsing it again with numeric_in is relatively inefficient, right? At least in our tests, it is significantly slower than Spark's parsing. Are there better solutions for this kind of conversion?
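For context, the text round trip in question amounts to rendering the unscaled integer with a decimal point placed `scale` digits from the right, after which numeric_in parses those characters back into a number. The sketch below shows only that rendering step. `format_decimal_sketch` is a hypothetical helper, not Arrow's implementation; it uses `int64_t` in place of a 128-bit value for portability and assumes a non-negative scale.

```cpp
#include <cstdint>
#include <string>

// Renders unscaled * 10^(-scale) as plain decimal text.
// Assumptions: scale >= 0, and int64_t stands in for the 128-bit value.
std::string format_decimal_sketch(std::int64_t unscaled, int scale) {
    const bool negative = unscaled < 0;
    // Take the magnitude in unsigned arithmetic so INT64_MIN does not overflow.
    const std::uint64_t magnitude =
        negative ? ~static_cast<std::uint64_t>(unscaled) + 1
                 : static_cast<std::uint64_t>(unscaled);
    std::string digits = std::to_string(magnitude);
    if (scale > 0) {
        const std::size_t frac = static_cast<std::size_t>(scale);
        // Left-pad with zeros so at least one digit precedes the decimal point.
        if (digits.size() <= frac) {
            digits = std::string(frac - digits.size() + 1, '0') + digits;
        }
        digits.insert(digits.size() - frac, 1, '.');
    }
    return negative ? "-" + digits : digits;
}
```

Every value therefore costs one string allocation plus a full re-parse on the PostgreSQL side, which is the overhead the question refers to.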
Can now read Parquet files with decimal types.