Contributor

Copilot AI commented Oct 3, 2025

Summary

Adds a configuration file setting and command-line flag to control whether existing filings are overwritten when extracting from daily feeds, addressing issue #13.

Changes

Configuration Setting

  • Added CACHE_FEED_OVERWRITE boolean setting to the [Paths] section of config files (positioned under CACHE_FEED)
  • Default value is False to preserve existing behavior (skip already-extracted filings)
  • Setting is documented in both the example config file and inline documentation

CLI Flag

  • Added -o / --overwrite command-line flag
  • When provided, the flag overrides the config file setting
  • Allows one-time overwrite operations without modifying the config file
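
The precedence described above (flag wins when given, config value otherwise) can be sketched with argparse. This is a minimal illustration, not the code from pyedgar/__main__.py: the function name resolve_overwrite and the prog name are hypothetical, and only the -o / --overwrite flag itself comes from this PR.

```python
import argparse


def resolve_overwrite(argv, config_overwrite):
    """Return the effective overwrite setting for one run (hypothetical helper)."""
    parser = argparse.ArgumentParser(prog="pyedgar-sketch")
    # default=None lets us tell "flag not given" apart from "flag given".
    parser.add_argument("-o", "--overwrite", action="store_true", default=None)
    args = parser.parse_args(argv)
    if args.overwrite is None:
        # Flag absent: fall back to the value read from the config file.
        return config_overwrite
    return args.overwrite


print(resolve_overwrite([], False))               # config value is used
print(resolve_overwrite(["--overwrite"], False))  # flag overrides the config
```

Because the flag is a store_true switch, it can only turn overwriting on for a run; turning it off when the config says True would still require editing the config in this sketch.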

Implementation Details

The overwrite parameter flows through the entire extraction chain:

CLI/Config → __main__.py → downloader.download_extract_feeds() → 
EDGARCacher.extract_daily_feeds() → EDGARCacher.extract_from_feed_cache()

Before writing each file, the extraction logic checks if not overwrite and os.path.exists(nc_out_path) and skips the filing when that condition is true, so existing files are replaced only when overwrite is enabled.
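
As a rough sketch of that guard in isolation: the helper name write_if_allowed and the file name are assumptions made for illustration, and only the condition on overwrite and nc_out_path is taken from the description.

```python
import os
import tempfile


def write_if_allowed(nc_out_path, text, overwrite=False):
    """Write text to nc_out_path; return True if written, False if skipped."""
    # Same guard as described in the PR: skip files that already exist
    # unless overwriting was requested.
    if not overwrite and os.path.exists(nc_out_path):
        return False
    with open(nc_out_path, "w", encoding="utf-8") as fh:
        fh.write(text)
    return True


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "filing.nc")  # hypothetical output file
    first = write_if_allowed(path, "v1")                   # new file: written
    second = write_if_allowed(path, "v2")                  # exists: skipped
    third = write_if_allowed(path, "v3", overwrite=True)   # exists: replaced
    with open(path, encoding="utf-8") as fh:
        final_text = fh.read()
```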

Additionally, pyedgar.utilities.edgarweb.download_feed() now has overwrite=None as the default parameter, automatically using the CACHE_FEED_OVERWRITE config setting when None is passed.
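
The None-as-default pattern can be illustrated as follows. This is a sketch under assumptions: the config object here is a stand-in built with SimpleNamespace, and effective_overwrite is a hypothetical helper, not the body of download_feed().

```python
from types import SimpleNamespace

# Hypothetical stand-in for the loaded pyedgar config object.
config = SimpleNamespace(CACHE_FEED_OVERWRITE=False)


def effective_overwrite(overwrite=None):
    """Resolve an overwrite argument, deferring to the config when it is None."""
    if overwrite is None:
        # Caller expressed no preference: use the config file setting.
        return bool(config.CACHE_FEED_OVERWRITE)
    # Caller passed True or False explicitly: that choice wins.
    return bool(overwrite)
```

Using None rather than False as the default keeps an explicit overwrite=False distinguishable from "not specified", so a caller can still suppress overwriting when the config enables it.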

Usage Examples

Using config file:

[Paths]
; Default: skip existing files
CACHE_FEED_OVERWRITE=False
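
For reference, a setting like this can be read as a boolean with Python's standard configparser. Whether pyedgar loads its config exactly this way is an assumption; the sketch only shows how a [Paths] entry named CACHE_FEED_OVERWRITE parses, with the comment kept on its own line because configparser does not strip inline comments unless configured to.

```python
import configparser

# Example config text; the comment sits on its own line.
EXAMPLE_CONF = """
[Paths]
; Default: skip existing files
CACHE_FEED_OVERWRITE=False
"""

parser = configparser.ConfigParser()
parser.read_string(EXAMPLE_CONF)

# getboolean accepts true/false, yes/no, on/off and 1/0 (case-insensitive).
cache_feed_overwrite = parser.getboolean(
    "Paths", "CACHE_FEED_OVERWRITE", fallback=False
)
```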

Using CLI flag:

# Default behavior (no overwrite)
python -m pyedgar -x -s 2024-01-01 -e 2024-01-05

# Force overwrite for this run only
python -m pyedgar -x -s 2024-01-01 -e 2024-01-05 --overwrite

# Short form
python -m pyedgar -x -s 2024-01-01 -e 2024-01-05 -o

Use Cases

  • Incremental updates: Default behavior skips existing files, saving time and bandwidth
  • Re-extract corrupted files: Use --overwrite to force re-extraction of specific date ranges
  • Testing/development: Use --overwrite to repeatedly test with the same date range

Backward Compatibility

✅ Fully backward compatible - existing code continues to work without changes. The default behavior (no overwriting) is preserved, and the feature is entirely opt-in.

Files Modified

  • pyedgar/config.py: Added config setting and documentation in [Paths] section
  • pyedgar/__main__.py: Added CLI flag and parameter passing
  • pyedgar/pyedgar.conf: Updated example config with new setting in [Paths] section
  • pyedgar/utilities/edgarweb.py: Updated download_feed() to default to overwrite=None and fall back to the config setting

Total: 4 files modified

Fixes #13



Co-authored-by: gaulinmp <789400+gaulinmp@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Config entry for overwrite" to "Add config entry and CLI flag for overwrite on extraction" Oct 3, 2025
Copilot AI requested a review from gaulinmp October 3, 2025 19:38
Owner

gaulinmp left a comment

@copilot Add a change to pyedgar.utilities.edgarweb.download_feed to make default arg for overwrite=None, and take config value on None. Also move config file value to [Paths] section, under CACHE_FEED, and call it CACHE_FEED_OVERWRITE.

gaulinmp marked this pull request as ready for review October 3, 2025 21:22
gaulinmp self-requested a review October 3, 2025 21:27
Owner

gaulinmp commented Oct 3, 2025

Add a change to pyedgar.utilities.edgarweb.download_feed to make default arg overwrite=None, and take config value on None. @copilot


; OVERWRITE_ON_EXTRACT controls whether to overwrite existing filings when extracting from feeds
; Default is False to avoid re-extracting already downloaded filings
OVERWRITE_ON_EXTRACT=False
Owner

Move config file value to [Paths] section, under CACHE_FEED, and call it CACHE_FEED_OVERWRITE. @copilot

Owner

gaulinmp left a comment

Add a change to pyedgar.utilities.edgarweb.download_feed to make default arg overwrite=None, and take config value on None. In pyedgar.conf, move config file value to [Paths] section, under CACHE_FEED, and call it CACHE_FEED_OVERWRITE. @copilot

gaulinmp marked this pull request as draft October 3, 2025 21:40
…s] section; update download_feed to use config default

Co-authored-by: gaulinmp <789400+gaulinmp@users.noreply.github.com>
Contributor Author

Copilot AI commented Oct 3, 2025

@copilot Add a change to pyedgar.utilities.edgarweb.download_feed to make default arg for overwrite=None, and take config value on None. Also move config file value to [Paths] section, under CAC...

Done in commit e8cfba2. Updated download_feed to have overwrite=None as default and use config.CACHE_FEED_OVERWRITE when None. Also moved config setting to [Paths] section and renamed to CACHE_FEED_OVERWRITE.

Add a change to pyedgar.utilities.edgarweb.download_feed to make default arg overwrite=None, and take config value on None. @copilot

Done in commit e8cfba2. Updated download_feed function signature to overwrite=None and added logic to use config.CACHE_FEED_OVERWRITE when overwrite is None.

Copilot AI requested a review from gaulinmp October 3, 2025 21:42
gaulinmp marked this pull request as ready for review October 3, 2025 21:42
gaulinmp merged commit ae80e8a into master Oct 3, 2025
1 check failed
gaulinmp deleted the copilot/fix-3a7de25f-1320-48d9-89c5-d2817bf587f0 branch October 3, 2025 21:42
Copilot AI requested a review from gaulinmp October 3, 2025 21:42