Fix KeyError on API failures and pickle error in Python 3.13 (#9) #10

Open
parthashirolkar wants to merge 1 commit into addypy:master from parthashirolkar:master
@parthashirolkar

Summary

This PR fixes two critical bugs reported in issue #9:

Bug #1: KeyError when API times out or returns errors

  • Problem: When the data.gov.in API returns errors (e.g., 504 Gateway Timeout), get_data() crashes with KeyError: 'total' because get_resource_info() returns an empty dict {} but the code tries to access ["total"] directly.
  • Fix: get_data() now checks whether resource_info is empty or lacks the "total" key before accessing it. In that case it raises a ValueError with a descriptive message instead of failing with a KeyError.
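
The guard described above can be sketched roughly as follows. This is an illustration only, not the code in the diff: the helper name `get_total_records` and the wording of the error message are hypothetical, and the sketch assumes only what the description states, namely that `get_resource_info()` may return an empty dict `{}` on failure.

```python
def get_total_records(resource_info: dict) -> int:
    """Return the record count from resource metadata.

    Hypothetical helper: raises ValueError when the metadata is empty or
    has no "total" key, instead of letting a KeyError escape.
    """
    if not resource_info or "total" not in resource_info:
        raise ValueError(
            "Could not retrieve resource info; the API may have timed out "
            "or returned an error. Please try again later."
        )
    return int(resource_info["total"])
```

A caller such as `get_data()` would pass in whatever `get_resource_info()` returned, so an empty dict from a 504 response surfaces as a readable `ValueError` rather than `KeyError: 'total'`.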

Bug #2: Pickle error in Python 3.13 with multiprocessing

  • Problem: The @retry decorator on make_request_with_retry() creates unpicklable state (an RLock), causing multiprocessing.pool.MaybeEncodingError: cannot pickle '_thread.RLock' object when pool.map() tries to serialize results from worker processes.
  • Fix: Created _make_request_for_pool(), a non-decorated version of the request function specifically for multiprocessing workers. This avoids the pickle error while maintaining the retry logic for non-multiprocessing use cases.

Changes

  • Added _make_request_for_pool() function for multiprocessing workers
  • Modified get_api_records() to use the non-decorated function
  • Added error handling in get_data() to check for missing "total" key
  • Raises clear ValueError instead of crashing with KeyError

Testing

Both fixes have been tested:

  • get_resource_info() now properly handles API failures without crashing
  • get_data() works with multiprocessing in Python 3.13 without pickle errors
  • All README examples work correctly

Fixes #9

- Add proper error handling in get_data() when get_resource_info() returns empty dict
- Create _make_request_for_pool() for multiprocessing workers to avoid pickle errors
- get_api_records() now uses non-decorated function to prevent multiprocessing issues
- Fixes addypy#9
Development

Successfully merging this pull request may close these issues.

Bug: get_data() crashes on API errors and Python 3.13
