Multi event#85

Merged
joelridden merged 16 commits into 4p4 from multi_event on Jan 23, 2026

@joelridden joelridden commented Dec 9, 2025

This pull request introduces a new multi-event detection feature based on STA/LTA (Short-Term Average/Long-Term Average) triggers, integrates it into the data processing and quality control pipeline, and updates related configuration and data handling. The changes add a new module for multi-event scoring, update the merging and filtering logic to include these scores, and revise configuration and preprocessing details to support the new workflow.

Multi-Event Detection and Scoring:

  • Added multi_event.py module implementing multi-event detection using STA/LTA triggers, including functions for computing scores and synchronizing events from waveform data.
  • Updated configuration (config.yaml) to include parameters for STA/LTA-based multi-event detection, such as window lengths, thresholds, and component weights.
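
As a rough illustration of the STA/LTA idea described above, here is a minimal pure-Python sketch. It is not the code in multi_event.py: the function names, the trailing-window definition, and the on/off threshold scheme are assumptions made for the example, and the PR's actual window lengths, thresholds, and component weights live in config.yaml and are not reproduced here.

```python
from itertools import accumulate


def classic_sta_lta(samples, nsta, nlta):
    """Return the STA/LTA ratio of signal energy for each sample.

    Both averages are trailing windows ending at the current sample.
    The first nlta - 1 values stay at 0.0 because the long window is
    not yet full. (Hypothetical helper, not the PR's implementation.)
    """
    if not 0 < nsta < nlta:
        raise ValueError("expected 0 < nsta < nlta")
    energy = [x * x for x in samples]
    csum = [0.0] + list(accumulate(energy))  # csum[i] == sum(energy[:i])
    ratio = [0.0] * len(energy)
    for i in range(nlta - 1, len(energy)):
        sta = (csum[i + 1] - csum[i + 1 - nsta]) / nsta
        lta = (csum[i + 1] - csum[i + 1 - nlta]) / nlta
        ratio[i] = sta / lta if lta > 0 else 0.0
    return ratio


def find_triggers(ratio, on_threshold, off_threshold):
    """Return (start, end) sample-index pairs of triggered intervals.

    A trigger opens when the ratio rises above on_threshold and closes
    at the first later sample where it falls below off_threshold.
    """
    triggers = []
    start = None
    for i, value in enumerate(ratio):
        if start is None and value > on_threshold:
            start = i
        elif start is not None and value < off_threshold:
            triggers.append((start, i))
            start = None
    if start is not None:  # still triggered at the end of the record
        triggers.append((start, len(ratio) - 1))
    return triggers
```

A record with more than one trigger interval is a candidate multi-event record. How the PR turns triggers on the H1, H2, and Z components into a single weighted score is not shown in this conversation, so that step is left out of the sketch.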

Data Processing Pipeline Updates:

  • Modified merge_flatfiles.py to read, filter, and merge multi-event scores into the main ground motion flatfile, and to output a multi-event table as part of the merged results.
  • Updated quality filtering logic in quality_db.py to use the new multi-event score and sync event flag for filtering records, replacing the previous multi-mean approach.

Waveform Preprocessing Improvements:

  • Improved waveform preprocessing to select inventory channels more robustly and to use remove_response instead of remove_sensitivity, ensuring correct handling of channel and response information.

Site Table and Metadata Handling:

  • Enhanced site table creation to deduplicate on both network and station, and to merge elevation and date information more comprehensively.

Miscellaneous:

  • Adjusted the computation of ds_std_multiplier in waveform extraction to use a dynamic formula based on hypocentral distance.
  • Minor configuration cleanup (removal of unused ds_std_multiplier in config).
  • Updated imports to include the new multi-event module.

These changes collectively enhance the detection and filtering of multi-event records, improving the reliability of the ground motion database.

@joelridden joelridden marked this pull request as draft December 9, 2025 01:16
@gemini-code-assist

Summary of Changes

Hello @joelridden, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces a significant enhancement to the seismic data processing pipeline by implementing a multi-event detection mechanism using STA/LTA algorithms. It also includes a substantial refactoring effort to optimize FDSN inventory fetching, reducing redundant API calls and improving overall performance. The new multi-event scores are integrated into the data merging and quality control processes, allowing for more robust filtering of seismic records. Additionally, the changes improve the system's resilience to FDSN API rate limits and expand configuration options.

Highlights

  • Multi-Event Detection: Introduced a new module multi_event.py to implement Short-Term Average/Long-Term Average (STA/LTA) based multi-event detection and scoring for seismic waveforms. This includes functions to synchronize events from streams, compute STA/LTA triggers, and generate a weighted multi-trigger score.
  • Inventory Management Optimization: Refactored several functions across various modules (distances.py, snr.py, process_observed.py, waveform_manipulation.py, geonet.py, sites.py, mseed_to_gmprocess.py, and reading.py) to pass Inventory objects explicitly or fetch them once at a higher level. This significantly reduces redundant FDSN client calls, improving overall performance and efficiency.
  • Configuration Updates: Added new configuration parameters in config.yaml specifically for STA/LTA window lengths, trigger thresholds, minimum duration, gap merging, edge skipping, and component weights (H1, H2, Z) for multi-event detection. The channel_codes parameter was also updated from a list to a single comma-separated string format.
  • Data Integration and Quality Control: Integrated the newly computed multi-event scores (stalta_score, sync_event) into the main flatfile merging process (merge_flatfiles.py) and added corresponding columns to the output. The quality_db.py module was updated to filter records based on these new multi-event scores, replacing the previous multi_mean filter. Additionally, BH channels were added to the priority filtering logic.
  • FDSN Client Robustness: Implemented retry logic with a delay for FDSNTooManyRequestsException in waveform_extraction.py to handle FDSN API rate limits more gracefully, ensuring more reliable data retrieval.
  • Machine Configuration Expansion: A new machine configuration named rch was added to machine_config.yaml, alongside an adjustment to the hypocentre.extraction value, expanding the system's deployment options.
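
The retry behaviour mentioned in the FDSN Client Robustness item can be sketched as follows. This is a generic pattern, not the code in waveform_extraction.py: TooManyRequestsError is a hypothetical stand-in for FDSNTooManyRequestsException so the example runs without ObsPy, and fetch_with_retry, max_attempts, and delay_seconds are names invented for the illustration.

```python
import time


class TooManyRequestsError(Exception):
    """Hypothetical stand-in for FDSNTooManyRequestsException."""


def fetch_with_retry(fetch, max_attempts=3, delay_seconds=1.0, sleep=time.sleep):
    """Call fetch() until it succeeds or max_attempts is used up.

    Only the rate-limit error is retried; any other exception
    propagates immediately. The sleep callable is injectable so the
    delay can be skipped in tests.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return fetch()
        except TooManyRequestsError:
            if attempt == max_attempts:
                raise  # out of attempts: surface the rate-limit error
            sleep(delay_seconds)
```

Retrying only the rate-limit exception keeps genuine failures, such as a bad request, visible instead of hiding them behind repeated attempts.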

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces a new multi-event detection feature, refactors inventory handling for performance, and updates configuration and data processing workflows. Key changes:

  • A new multi_event.py module for STA/LTA trigger detection and event synchronization checks, integrated into the waveform extraction and quality filtering processes.
  • The nzgmdb/config/config.yaml file now includes new parameters for STA/LTA windows and weights, and channel_codes is updated to include 'BH?'.
  • Inventory fetching across several modules (distances.py, snr.py, process_observed.py, waveform_manipulation.py, geonet.py, sites.py, mseed_to_gmprocess.py, reading.py) has been optimized by passing pre-fetched Inventory objects or specifying level='station' where appropriate, reducing redundant FDSN calls.
  • The waveform_extraction.py module now handles FDSNTooManyRequestsException with retries, passes event catalogs and extraction tables more efficiently, and collects multi-event scores.
  • The merge_flatfiles.py module is updated to incorporate the new multi-event data into the final flatfile.
  • In quality_db.py, the multi-event filtering logic now uses the new stalta_score and sync_event fields, and the filter_duplicate_channels function now includes 'BH' channels.

Review comments highlighted:

  • an incorrect syntax for catching multiple exceptions
  • inverted logic for filtering multi-events based on stalta_score
  • a missing configuration key (multi_score_min)
  • an unhandled case for selecting the 'Z' component in sync_event_from_stream
  • an incorrect type hint in a docstring
  • a potential side effect from in-place DataFrame modification

Copilot AI left a comment

Pull request overview

This pull request implements multi-event detection functionality to identify seismic waveforms that may contain multiple seismic events. The implementation uses STA/LTA (Short-Term Average/Long-Term Average) trigger detection combined with a synchronous event check to flag potentially problematic records.

Changes:

  • Adds new multi_event.py module with STA/LTA trigger detection and synchronous event checking
  • Renames filter_multi_mean to filter_multi_event with updated filtering logic based on multi-event scores
  • Updates waveform extraction to compute and store multi-event scores for each record
  • Modifies response removal to use remove_response() instead of remove_sensitivity() with improved inventory handling
  • Adds dynamic ds_std_multiplier calculation based on hypocentral distance

Reviewed changes

Copilot reviewed 11 out of 11 changed files in this pull request and generated 5 comments.

| File | Description |
| --- | --- |
| nzgmdb/data_processing/multi_event.py | New module implementing STA/LTA trigger detection and synchronous event checking |
| nzgmdb/data_processing/quality_db.py | Renames filter function and updates logic to use multi-event scores |
| nzgmdb/data_retrieval/waveform_extraction.py | Integrates multi-event scoring during extraction, adds retry logic for catalog fetching |
| nzgmdb/data_processing/waveform_manipulation.py | Updates response removal to use remove_response() with better inventory handling |
| nzgmdb/phase_arrival/run_phasenet.py | Updates response removal consistent with waveform_manipulation changes |
| nzgmdb/data_retrieval/sites.py | Changes merge from left to outer join and improves duplicate handling |
| nzgmdb/data_processing/merge_flatfiles.py | Adds multi-event data to flatfile merging process |
| nzgmdb/management/file_structure.py | Adds MULTI_EVENT_TABLE to file structure enums |
| nzgmdb/config/config.yaml | Adds multi-event configuration parameters |
| tests/test_quality_filters.py | Updates test to use new filter_multi_event function |
| tests/quality_db_testing.csv | Adds stalta_score and sync_event columns to test data |

@joelridden joelridden marked this pull request as ready for review January 19, 2026 00:25
ds_std_multiplier = config.get_value("ds_std_multiplier")

# Compute the ds multiplier time
ds_std_multiplier = 0.8 / (1 + np.exp(-0.035 * (r_hyp - 140))) + 2.2
Contributor

Would be good to see a reference for this equation, ideally extracted as a function.

Contributor Author

I tried to get an answer to this from Aaron, as he developed this himself by looking at data, but have got nothing. I imagine it will come out in the paper he will write, but that will be a while away.

Member

Just add a placeholder like [Publication under way: Aaron Rampersad et al.]?

Contributor

I'm happy with the comment that is there for now, provided we update it later.
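
For reference, the extraction suggested in this thread could look roughly like the sketch below. The function name and the docstring wording are assumptions, and the version here uses math.exp on a scalar rather than the np.exp call in the diff. The constants are copied from the diff as they are, with no claim about their derivation, and the units of r_hyp are not stated in this thread.

```python
import math


def compute_ds_std_multiplier(r_hyp: float) -> float:
    """Return the ds standard-deviation multiplier for a hypocentral distance.

    Logistic curve taken from the diff above: it rises from about 2.2
    at short distances to 3.0 at long distances, passing 2.6 at
    r_hyp = 140. Reference: publication under way (placeholder, as
    suggested in the review thread).
    """
    return 0.8 / (1 + math.exp(-0.035 * (r_hyp - 140))) + 2.2
```

Keeping the formula in one named function gives the eventual citation a single place to live once the reference is available.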

# Conflicts:
#	nzgmdb/data_processing/waveform_manipulation.py
#	nzgmdb/data_retrieval/waveform_extraction.py
#	nzgmdb/phase_arrival/run_phasenet.py
@joelridden joelridden merged commit c2fbef6 into 4p4 Jan 23, 2026
5 checks passed
@joelridden joelridden deleted the multi_event branch January 23, 2026 01:28