Skip to content

refactor: Extract geo+vectors fetch and merge into dedicated functions#7

Open
dshkol wants to merge 1 commit intorefactor/dataframe-processingfrom
refactor/extract-geo-vectors-merge
Open

refactor: Extract geo+vectors fetch and merge into dedicated functions#7
dshkol wants to merge 1 commit intorefactor/dataframe-processingfrom
refactor/extract-geo-vectors-merge

Conversation

@dshkol
Copy link
Owner

@dshkol dshkol commented Jan 15, 2026

Summary

Extract the complex hybrid geo+vectors logic (~70 lines) from get_census() into two dedicated helper functions.

Changes

Added two new functions:

_fetch_census_with_geometry_and_vectors()

  • Handles the CensusMapper API quirk where geo.geojson doesn't return vector data properly
  • Makes separate calls to geo.geojson and data.csv endpoints
  • Orchestrates the merge of geometry and vector data

_merge_geo_and_csv_results()

  • Merges GeoDataFrame with CSV DataFrame on geographic identifier
  • Detects common merge keys (GeoUID, id, rgid) automatically
  • Falls back to index-based merge if no common key found
  • Handles duplicate key cleanup after merge

Simplified get_census()

# Before: 70+ lines of inline merge logic
# After:
if geo_format == "geopandas" and vectors:
    result = _fetch_census_with_geometry_and_vectors(
        base_url, request_data, resolution, vectors, labels
    )

Why

  • Readability: get_census() main flow is now much simpler
  • Testability: Merge logic can be unit tested in isolation
  • Single responsibility: Each function has one clear purpose
  • Documentation: Detailed docstrings explain the API quirk

Test plan

  • All existing tests pass (17 total)
  • Added test_merge_on_geoid_key: verifies key-based merge
  • Added test_merge_fallback_by_index: verifies index fallback

🤖 Generated with Claude Code

@dshkol dshkol force-pushed the refactor/dataframe-processing branch from f7955c0 to 13ef466 Compare January 17, 2026 23:47
@dshkol dshkol force-pushed the refactor/extract-geo-vectors-merge branch from d04cbe0 to ae612c1 Compare January 17, 2026 23:47
@dshkol dshkol force-pushed the refactor/dataframe-processing branch from 13ef466 to 1a67981 Compare January 17, 2026 23:53
Extract the complex hybrid geo+vectors logic (~70 lines) from get_census()
into two dedicated helper functions:

1. _fetch_census_with_geometry_and_vectors():
   - Handles the CensusMapper API quirk where geo.geojson doesn't return
     vector data properly
   - Makes separate calls to geo.geojson and data.csv endpoints
   - Orchestrates the merge of geometry and vector data

2. _merge_geo_and_csv_results():
   - Merges GeoDataFrame with CSV DataFrame on geographic identifier
   - Detects common merge keys (GeoUID, id, rgid) automatically
   - Falls back to index-based merge if no common key found
   - Handles duplicate key cleanup after merge

This improves:
- Readability: get_census() main flow is now much simpler
- Testability: merge logic can be unit tested in isolation
- Single responsibility: each function has one clear purpose
- Documentation: detailed docstrings explain the API quirk

Added 2 new unit tests:
- test_merge_on_geoid_key: verifies key-based merge
- test_merge_fallback_by_index: verifies index fallback

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@dshkol dshkol force-pushed the refactor/extract-geo-vectors-merge branch from 58530e2 to 44e62ec Compare January 17, 2026 23:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant