Conversation
# Conflicts: # nzgmdb/data_retrieval/sites.py # requirements.txt
# Conflicts: # nzgmdb/calculation/snr.py # nzgmdb/data_processing/process_observed.py # nzgmdb/data_processing/waveform_manipulation.py # nzgmdb/data_retrieval/sites.py # nzgmdb/data_retrieval/waveform_extraction.py # nzgmdb/mseed_management/reading.py # nzgmdb/phase_arrival/run_phasenet.py
Summary of ChangesHello @joelridden, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request introduces significant improvements across the data processing pipeline, focusing on enhancing the accuracy and completeness of seismic data. Key updates include the automated computation of site-specific Vs30 and Z values, a more sophisticated approach to determining earthquake focal mechanisms, and the integration of a novel multi-event detection system. Furthermore, instrument response correction has been modernized, and the range of calculated intensity measures expanded to include PGD, all supported by comprehensive updates to data handling, configuration, and reporting tools. Highlights
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Code Review
This pull request introduces significant enhancements by incorporating new data sources for Vs30 and Z values, which improves the accuracy of site characterization. The changes are extensive, including updates to focal mechanism determination, performance optimizations through caching and batching, and a move to using remove_response which is a good practice. The addition of multi-event detection logic and comprehensive testing is also a valuable contribution.
I've identified a few issues, mainly related to data consistency and code duplication, which I've detailed in the comments. Once these are addressed, this will be a very strong PR.
# Conflicts: # requirements.txt
# Conflicts: # requirements.txt
lispandfound
left a comment
There was a problem hiding this comment.
Just some minor type checking changes. If you fix those I'm happy for you to override the PR check and merge without another round of review.
nzgmdb/data_retrieval/sites.py
Outdated
| def fill_gaps_with_nearest( | ||
| coords: np.ndarray, | ||
| values: np.ndarray, | ||
| invalid_mask: np.ndarray = None, |
There was a problem hiding this comment.
| invalid_mask: np.ndarray = None, | |
| invalid_mask: np.ndarray | None = None, |
nzgmdb/data_retrieval/sites.py
Outdated
| values: np.ndarray, | ||
| invalid_mask: np.ndarray = None, | ||
| k: int = 8, | ||
| ): |
There was a problem hiding this comment.
| ): | |
| ) -> np.ndarray: |
nzgmdb/data_retrieval/sites.py
Outdated
| tect_merged_df.loc[vs30_mask, "Q_Vs30"] = "Q3" | ||
|
|
||
| except (FileNotFoundError, ValueError, RuntimeError) as e: | ||
| print(f"Warning: Could not compute thresholds for missing Z1.0 values: {e}") |
There was a problem hiding this comment.
Rather than printing a warning here, perhaps raise a UserWarning? Advantage of this is that such warnings can be supressed and are visually distinct from other printed messages and so are easier to identify in the console.
nzgmdb/data_retrieval/sites.py
Outdated
| file_path: Path, | ||
| latlon_points: np.ndarray, | ||
| band: int = 1, | ||
| ): |
There was a problem hiding this comment.
| ): | |
| ) -> np.ndarray: |
| assert (site_df.loc[vs30_non_nan, "Vs30_Ref"] == "Foster et al. (2019)").all() | ||
| assert (site_df.loc[vs30_non_nan, "Q_Vs30"] == "Q3").all() | ||
| assert site_df.loc[~vs30_non_nan, "Vs30_Ref"].isna().all() | ||
| assert site_df.loc[~vs30_non_nan, "Q_Vs30"].isna().all() |
There was a problem hiding this comment.
My only concern with this test is that it is patching a lot of internals of the code it is testing, which makes it difficult to refactor that code because it will also break a lot of tests. This probably suggests that one of your functions is doing "too much" if it can't be tested in isolation of the NZCVM, numerical code, pooch fetching data, FDSN client and fiona. I would consider if this test is actually necessary or if the functions it tests can be unit tested in isolation without the need of an invasive integration test. Having said that, this is your codebase and if you feel the complexity of the test is not going to impose a refactor burden then I wouldn't hold up the PR for it.
f148282
Adds Vs30 and Z values for sites that do not have data from Liam site db.
Fetches Z values from velocity_modelling and Vs30 from Jaehwi v1.0 new map.
Adds testing for these added functions.
Few merge issues from other PR's