Conversation
There was a problem hiding this comment.
Pull request overview
This PR removes attempts to read from the psql pipe (FILE* p) after writes, which is invalid because the pipe is opened write-only via popen(..., "w").
Changes:
- Commented out
while (fgetc(p) != EOF) ;synchronization/drain loops that attempted to read from a write-only pipe. - Added a clarifying comment in the base loader explaining why reading is not possible.
Reviewed changes
Copilot reviewed 34 out of 34 changed files in this pull request and generated 1 comment.
Show a summary per file
| File | Description |
|---|---|
| src/include/custom/pgsql/pgloader.h | Stops attempting to read from the write-only psql pipe during connect/copy/finish steps. |
| src/include/custom/pgsql/PGSQLZipCodeLoad.h | Removes per-row pipe read attempt after writing a record. |
| src/include/custom/pgsql/PGSQLWatchListLoad.h | Removes per-row pipe read attempt after writing a record. |
| src/include/custom/pgsql/PGSQLWatchItemLoad.h | Removes per-row pipe read attempt after writing a record. |
| src/include/custom/pgsql/PGSQLTradeTypeLoad.h | Removes per-row pipe read attempt after writing a record. |
| src/include/custom/pgsql/PGSQLTradeRequestLoad.h | Removes per-row pipe read attempt after writing a record. |
| src/include/custom/pgsql/PGSQLTradeLoad.h | Removes per-row pipe read attempt after writing a record. |
| src/include/custom/pgsql/PGSQLTradeHistoryLoad.h | Removes per-row pipe read attempt after writing a record. |
| src/include/custom/pgsql/PGSQLTaxrateLoad.h | Removes per-row pipe read attempt after writing a record. |
| src/include/custom/pgsql/PGSQLStatusTypeLoad.h | Removes per-row pipe read attempt after writing a record. |
| src/include/custom/pgsql/PGSQLSettlementLoad.h | Removes per-row pipe read attempt after writing a record. |
| src/include/custom/pgsql/PGSQLSecurityLoad.h | Removes per-row pipe read attempt after writing a record. |
| src/include/custom/pgsql/PGSQLSectorLoad.h | Removes per-row pipe read attempt after writing a record. |
| src/include/custom/pgsql/PGSQLNewsXRefLoad.h | Removes per-row pipe read attempt after writing a record. |
| src/include/custom/pgsql/PGSQLNewsItemLoad.h | Removes per-row pipe read attempt after writing a record. |
| src/include/custom/pgsql/PGSQLLastTradeLoad.h | Removes per-row pipe read attempt after writing a record. |
| src/include/custom/pgsql/PGSQLIndustryLoad.h | Removes per-row pipe read attempt after writing a record. |
| src/include/custom/pgsql/PGSQLHoldingSummaryLoad.h | Removes per-row pipe read attempt after writing a record. |
| src/include/custom/pgsql/PGSQLHoldingLoad.h | Removes per-row pipe read attempt after writing a record. |
| src/include/custom/pgsql/PGSQLHoldingHistoryLoad.h | Removes per-row pipe read attempt after writing a record. |
| src/include/custom/pgsql/PGSQLFinancialLoad.h | Removes per-row pipe read attempt after writing a record. |
| src/include/custom/pgsql/PGSQLExchangeLoad.h | Removes per-row pipe read attempt after writing a record. |
| src/include/custom/pgsql/PGSQLDailyMarketLoad.h | Removes per-row pipe read attempt after writing a record. |
| src/include/custom/pgsql/PGSQLCustomerTaxrateLoad.h | Removes per-row pipe read attempt after writing a record. |
| src/include/custom/pgsql/PGSQLCustomerLoad.h | Removes per-row pipe read attempt after writing a record. |
| src/include/custom/pgsql/PGSQLCustomerAccountLoad.h | Removes per-row pipe read attempt after writing a record. |
| src/include/custom/pgsql/PGSQLCompanyLoad.h | Removes per-row pipe read attempt after writing a record. |
| src/include/custom/pgsql/PGSQLCompanyCompetitorLoad.h | Removes per-row pipe read attempt after writing a record. |
| src/include/custom/pgsql/PGSQLCommissionRateLoad.h | Removes per-row pipe read attempt after writing a record. |
| src/include/custom/pgsql/PGSQLChargeLoad.h | Removes per-row pipe read attempt after writing a record. |
| src/include/custom/pgsql/PGSQLCashTransactionLoad.h | Removes per-row pipe read attempt after writing a record. |
| src/include/custom/pgsql/PGSQLBrokerLoad.h | Removes per-row pipe read attempt after writing a record. |
| src/include/custom/pgsql/PGSQLAddressLoad.h | Removes per-row pipe read attempt after writing a record. |
| src/include/custom/pgsql/PGSQLAccountPermissionLoad.h | Removes per-row pipe read attempt after writing a record. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
In an ideal world we would hope we could eventually add proper code to catch errors. :) I still haven't put enough though into it, but I wonder if the piping should eventually be redone... I'm going to try to make time for this soon to double check. :) |
|
Thanks for the feedback, @markwkm! I completely agree. Removing the loops fixes the immediate data loss/corruption issue, but it definitely leaves us without a way to catch errors if the pipe fails. |
|
@markwkm I tried making it more efficient .Could you please review it? |
I'll hopefully get to try it out at a customer size of 100,000 soon... I think the rename of the class to CFlatLoader isn't quite complete, the compiler is complaining, but nevertheless let's keep the changes to a minimum for now. "FlatLoader" isn't quite accurate too, it's really a "direct load", but in any event, I'm not sure a rename is necessary at this time. :) |
|
I don't think the fgetc() calls are the root cause of the problem. Building for 100,000 customers still causes issues like this when creating the foreign key constraints: I'm actually worried it's the piping to psql. I think the path worth trying is to actually use libpq. |
|
This patch might help more easily identify when errors happen with the build script. markwkm@04b7d85 |
Title: Fix inconsistent data loading (Fixes #24)
Description:
This PR fixes issue #24 where data was missing at high customer counts, causing foreign key failures.
The custom loader was incorrectly trying to read from a write-only pipe connected to psql. This caused stream errors that resulted in dropped rows. I have removed the invalid fgetc calls from the loader headers to ensure all data is flushed correctly.