@jm-rivera
Collaborator

This pull request extends bulk downloads to support .csv files in addition to .txt files. The code now automatically detects both file types in a downloaded archive, detects each file's delimiter, and converts the contents to parquet format when saving. The tests have been updated to cover .csv handling.

Bulk download enhancements:

  • Updated _save_or_return_parquet_files_from_content and _save_or_return_parquet_files_from_txt_in_zip in src/oda_reader/download/download_tools.py to auto-detect and process both .csv and .txt files, including proper delimiter detection and conversion to parquet. [1] [2] [3] [4] [5] [6] [7]
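For readers who do not have the diff open, the detection step described above can be sketched roughly as follows. This is a minimal, standard-library illustration, not the code in download_tools.py: the names `SUPPORTED_SUFFIXES`, `sniff_delimiter` and `read_tables_from_zip` are hypothetical, the candidate delimiters and UTF-8 encoding are assumptions, and the parquet conversion is left out.

```python
import csv
import io
import zipfile

# Assumption: the two extensions named in this PR, matched case-insensitively.
SUPPORTED_SUFFIXES = (".csv", ".txt")


def sniff_delimiter(sample: str, candidates: str = ",;|\t") -> str:
    """Guess the delimiter from a text sample, falling back to a comma."""
    try:
        return csv.Sniffer().sniff(sample, delimiters=candidates).delimiter
    except csv.Error:
        # The sniffer could not decide; a comma is an assumed default.
        return ","


def read_tables_from_zip(content: bytes) -> dict[str, list[dict[str, str]]]:
    """Parse every .csv/.txt member of a zip archive into a list of row dicts."""
    tables = {}
    with zipfile.ZipFile(io.BytesIO(content)) as archive:
        for name in archive.namelist():
            if not name.lower().endswith(SUPPORTED_SUFFIXES):
                continue  # skip unsupported members such as README files
            text = archive.read(name).decode("utf-8")  # encoding is an assumption
            delimiter = sniff_delimiter(text[:4096])
            reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
            tables[name] = list(reader)
    return tables
```

In a sketch like this, each parsed table could then be handed to a DataFrame library and written out as parquet; how the package itself performs that step is not shown here.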

Testing improvements:

  • Added new tests in tests/download/unit/test_download_tools.py to verify auto-detection, reading, and saving of .csv files, including different delimiters and conversion to parquet. [1] [2]

Documentation and metadata updates:

  • Updated the changelog in CHANGELOG.md to record .csv file support and clarified descriptions.
  • Bumped the package version to 1.4.1 in pyproject.toml.

Error handling:

  • Improved error messages to reflect the inclusion of .csv files when no supported files are found in the archive. [1] [2]
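The kind of check this bullet refers to might look like the sketch below. It is illustrative only: the function name `supported_members`, the exception type, and the message wording are assumptions and may differ from what the package raises.

```python
import io
import zipfile

# Assumption: the two extensions named in this PR.
SUPPORTED_SUFFIXES = (".csv", ".txt")


def supported_members(content: bytes) -> list[str]:
    """Return the .csv/.txt member names of a zip archive, or raise if none exist."""
    with zipfile.ZipFile(io.BytesIO(content)) as archive:
        names = [
            name
            for name in archive.namelist()
            if name.lower().endswith(SUPPORTED_SUFFIXES)
        ]
    if not names:
        # Hypothetical wording; the point is that both extensions are named.
        raise ValueError("No .csv or .txt files found in the archive.")
    return names
```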

@jm-rivera jm-rivera merged commit 68a203e into main Dec 19, 2025
9 checks passed
@jm-rivera jm-rivera deleted the enhance-parser branch December 19, 2025 16:36