Performance Optimization: Switch to orjson and Critical Bug Fixes
#141
+143
−125
Summary
This PR optimizes backend performance by replacing the standard Python `json` library with `orjson` in critical paths (large API responses and heavy serialization tasks). `orjson` is significantly faster and serializes `datetime` objects natively (and `numpy` arrays when the `OPT_SERIALIZE_NUMPY` option is passed), removing the need for manual serialization logic. Additionally, this PR fixes a critical bug in `eventbridge_service.py` where a variable was undefined, and refactors validation schemas to reduce memory churn.
Changes Made
Performance Optimizations
- Replaced `json` with `orjson`: Switched to `orjson` for JSON serialization/deserialization in the following key areas:
  - `backend/app/routers/newsletters.py`: Removed manual recursive datetime serialization logic in `get_parsed_trips`. `orjson` now handles the serialization of nested `ParsedDiveTrip` models natively, significantly reducing CPU overhead for large lists.
  - `backend/app/routers/users.py`: Refactored `list_all_users` to bypass FastAPI's slower `jsonable_encoder` and `JSONResponse`. It now dumps Pydantic models directly to JSON bytes using `orjson` and returns a raw `Response`, which is much faster for large datasets.
  - `backend/app/routers/privacy.py`: Refactored `export_user_data` to remove manual `.isoformat()` calls on datetime objects. The raw dictionary with native datetime objects is now serialized directly by `orjson`, simplifying the code and improving export speed.
  - `backend/app/routers/settings.py`: Updated to use `orjson` for storing and retrieving JSON blobs.
  - `backend/app/routers/dive_sites.py`, `backend/app/routers/diving_centers.py`: Updated JSON dumping for geolocation and logging to use `orjson`.
- In `backend/app/schemas.py`, moved constant lists (e.g., `ALLOWED_SOCIAL_PLATFORMS`, `SOCIAL_PLATFORM_DOMAINS`) to the module level. This prevents them from being re-allocated on every validation call, reducing memory churn.
Bug Fixes
- `backend/app/services/eventbridge_service.py`: Fixed a critical `NameError` where the `targets` variable was referenced in `target_params` but was undefined in the `create_scheduled_rule` method. The variable is now defined before use.
Documentation
- `GEMINI.md`: Added a new "Performance Tuning Guidelines" section detailing the usage of `orjson` and best practices for memory management in the project.
Testing
- Ran `./docker-test-github-actions.sh`.
- Verified `orjson`'s behavior with a standalone script to confirm it correctly serializes mixed `date`, `time`, and `datetime` objects into ISO 8601 strings (including `None` values, which become `null`), matching the API's previous output format.
- `backend/tests/test_users.py` and `backend/tests/test_privacy.py` passed, confirming that the optimized endpoints return the correct structure and data types.
Related Issues
Additional Notes
- The `backend/lambda/email_processor.py` and `utils/import_subsurface_dives.py` scripts intentionally retain standard `json` usage. This ensures portability (no C-extension dependencies such as `orjson`) for AWS Lambda environments and standalone CLI usage.
- `orjson` has been added to `backend/requirements.txt`.
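
To illustrate the datetime behavior this PR relies on, here is a minimal sketch (not code from the PR). It serializes a record holding `date`, `time`, `datetime`, and `None` values without any manual `.isoformat()` calls. The `dumps_bytes` helper, the `_iso_default` fallback, and the record's field names are all hypothetical. The stdlib fallback is an assumption added so the sketch also runs where `orjson` is not installed, as in the portable scripts mentioned above.

```python
import json
from datetime import date, datetime, time

try:
    import orjson  # C extension; treated as optional in this sketch only
except ImportError:
    orjson = None


def _iso_default(obj):
    """Stdlib-path encoder: emit ISO 8601 strings for temporal types."""
    if isinstance(obj, (datetime, date, time)):
        return obj.isoformat()
    raise TypeError(f"Type is not JSON serializable: {type(obj).__name__}")


def dumps_bytes(payload) -> bytes:
    """Serialize payload to UTF-8 JSON bytes, preferring orjson when present."""
    if orjson is not None:
        # orjson serializes naive date/time/datetime objects natively.
        return orjson.dumps(payload)
    return json.dumps(
        payload, default=_iso_default, ensure_ascii=False, separators=(",", ":")
    ).encode("utf-8")


# Hypothetical record; the field names are illustrative, not from the PR.
record = {
    "trip_date": date(2024, 5, 17),
    "trip_time": time(8, 30),
    "created_at": datetime(2024, 5, 1, 12, 0, 0),
    "updated_at": None,
}

encoded = dumps_bytes(record)
decoded = json.loads(encoded)  # temporal values come back as plain strings
```

Both paths return `bytes`, so a caller that hands the result to a raw response object does not need to know which serializer produced it.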
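
The schema change described under Performance Optimizations can be sketched as follows. This is an illustration under stated assumptions, not the code in `backend/app/schemas.py`: the constant names come from the PR, but their contents, both validator functions, and the use of a `frozenset` are hypothetical.

```python
# Hypothetical contents: the PR names these constants but does not list them.
# Defined once at import time and shared by every validation call.
ALLOWED_SOCIAL_PLATFORMS = frozenset({"facebook", "instagram", "youtube"})
SOCIAL_PLATFORM_DOMAINS = {
    "facebook": ("facebook.com",),
    "instagram": ("instagram.com",),
    "youtube": ("youtube.com", "youtu.be"),
}


def validate_platform(value: str) -> str:
    """Hypothetical validator: checks against the module-level constant."""
    platform = value.strip().lower()
    if platform not in ALLOWED_SOCIAL_PLATFORMS:
        raise ValueError(f"Unsupported platform: {value!r}")
    return platform


def validate_platform_per_call(value: str) -> str:
    """The pattern being replaced: a new list object is built on each call."""
    allowed = ["facebook", "instagram", "youtube"]  # allocated every time
    platform = value.strip().lower()
    if platform not in allowed:
        raise ValueError(f"Unsupported platform: {value!r}")
    return platform
```

Both functions accept and reject the same inputs. The difference is only that the first reuses one shared object for the lifetime of the process.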