Implemented NISAR adaptation and provided template for future project adaptation #1
- Move json, logging, requests, sys, getpass, datetime, and urllib imports to top
- Remove duplicate imports from main block
- Enables functions to be imported and used by other scripts

- Move NISAR-specific scripts to dedicated nisar/ directory
- Create comprehensive project organization with hierarchical breakdown
- Add job execution time extractor with wall_time analysis
- Create PROJECT_TEMPLATE.md for future project adaptations
- Add QUICK_START.md for easy reference
- Clean up obsolete debugging and testing scripts
- Update all documentation to reflect new structure

- Add sys.path.append to correctly import from parent metrics_extractor directory
- Fixes ModuleNotFoundError when running scripts from nisar/ directory
- Both hysds_metrics_es_extractor_enhanced.py and job_execution_time_extractor.py updated
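
The import fix in this commit can be sketched roughly as follows. This is a minimal illustration, assuming the scripts sit in nisar/ and the shared module sits in a sibling metrics_extractor/ directory under the same repository root; the helper name `add_metrics_extractor_to_path` and the example path are hypothetical and not taken from the PR.

```python
import sys
from pathlib import Path


def add_metrics_extractor_to_path(script_file: str) -> str:
    """Append the sibling metrics_extractor/ directory to sys.path.

    Hypothetical helper: assumes <repo>/nisar/<script>.py and
    <repo>/metrics_extractor/ sit side by side.
    """
    repo_root = Path(script_file).resolve().parent.parent
    target = str(repo_root / "metrics_extractor")
    if target not in sys.path:  # avoid duplicate entries on repeated calls
        sys.path.append(target)
    return target


# A script under nisar/ would pass __file__ here before importing the
# shared extractor module; the path below is only a placeholder.
extractor_dir = add_metrics_extractor_to_path("/tmp/repo/nisar/example_script.py")
```

Appending (rather than inserting at position 0) keeps the standard library and installed packages ahead of the repository directory during module lookup.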

- CSV files are created in CWD when scripts are run
- Empty generated_csv_files directory was not needed
- Simplifies directory structure

…ctory
- CSV files are created in CWD when scripts run
- Remove unnecessary directory references from README.md
- Clean up directory structure documentation

- Ignore macOS .DS_Store files
- Ignore Python __pycache__ directories and compiled files
- Ignore CSV files generated by scripts
- Ignore common IDE, backup, and temporary files
- Ignore Python virtual environments and build artifacts
- Comprehensive coverage of commonly ignored files/directories

- Remove trailing underscore from regex pattern to handle job IDs that continue after beam name
- Pattern now matches: _full_individual_L_20_QP_05_QP_0_state-config-...
- Previously only matched: _full_individual_L_20_QP_05_QP_
- Fixes issue where INSAR jobs were not being processed due to strict regex
- Tested with example INSAR job ID: SCIFLO_INSAR__pcm_r4.0.7_pge_r4.1.0-network_pair_006_040_012_full_individual_L_20_QP_05_QP_0_state-config-20251010T201535.245443Z
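
As a quick check of the relaxed pattern, the sketch below applies the regex without a trailing underscore to the INSAR job ID quoted in the commit message. The constant name `JOB_ID_PATTERN` is an assumption for illustration; the PR's scripts may organize the pattern differently.

```python
import re

# Pattern from the PR with the trailing underscore removed.
# JOB_ID_PATTERN is a hypothetical name used only in this sketch.
JOB_ID_PATTERN = re.compile(
    r"_(?P<coverage>full|partial)"
    r"_(?P<acquisition_mode>individual|mixed)"
    r"_(?P<beam_name>L_\d{2}_\w{2}_\d{2}_\w{2})"
)

# Example INSAR job ID quoted in the commit message.
job_id = (
    "SCIFLO_INSAR__pcm_r4.0.7_pge_r4.1.0-network_pair_006_040_012"
    "_full_individual_L_20_QP_05_QP_0_state-config-20251010T201535.245443Z"
)

match = JOB_ID_PATTERN.search(job_id)
fields = match.groupdict() if match else {}
print(fields)
```

Using `search` rather than `match` lets the pattern locate the coverage, acquisition mode, and beam name anywhere inside the job ID, regardless of what follows the beam name.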
```bash
cp /Users/gmanipon/dev/metrics_extractor/nisar/hysds_metrics_es_extractor_enhanced.py /Users/gmanipon/dev/metrics_extractor/your_project/
cp /Users/gmanipon/dev/metrics_extractor/nisar/job_execution_time_extractor.py /Users/gmanipon/dev/metrics_extractor/your_project/
```
Bug: Template Contains Hardcoded Developer Paths
The PROJECT_TEMPLATE.md includes hardcoded personal development paths, like /Users/gmanipon/dev/metrics_extractor/. These paths make the template non-portable and expose specific developer environment details, which could hinder its general usability for other developers.
patterns = {
    "beam_name": r"_(?P<coverage>full|partial)_(?P<acquisition_mode>individual|mixed)_(?P<beam_name>L_\d{2}_\w{2}_\d{2}_\w{2})_",  # Primary: beam_name
    "coverage": r"_(?P<coverage>full|partial)_(?P<acquisition_mode>individual|mixed)_(?P<beam_name>L_\d{2}_\w{2}_\d{2}_\w{2})_",  # Secondary: coverage
    "acquisition_mode": r"_(?P<coverage>full|partial)_(?P<acquisition_mode>individual|mixed)_(?P<beam_name>L_\d{2}_\w{2}_\d{2}_\w{2})_",  # Tertiary: acquisition_mode
Bug: Regex excludes S-band jobs, only matches L-band
The beam_name regex pattern uses L_\d{2}_\w{2}_\d{2}_\w{2} which only matches L-band modes (e.g., L_20_DH_05_DH), but the NISAR mission includes both L-band and S-band modes. The documentation in README_ENHANCED.md correctly states the regex should be [LS]_\d{2}_\w{2}_\d{2}_\w{2} to match both bands, and the load_nisar_config function also correctly uses this pattern. However, the parse_job_id_patterns function only matches L-band, causing all S-band job metrics (e.g., S_37_QP_00_NA) to be silently skipped. This affects both extractor scripts.
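
The difference between the two character classes can be seen in a small sketch. The job IDs below are hypothetical fragments built around the beam names quoted in this comment (L_20_DH_05_DH and S_37_QP_00_NA), the helper `beam_name` is illustrative only, and the trailing underscore is omitted from both patterns.

```python
import re

PREFIX = r"_(?P<coverage>full|partial)_(?P<acquisition_mode>individual|mixed)_"

# L-band only, as in parse_job_id_patterns according to this comment.
L_ONLY = re.compile(PREFIX + r"(?P<beam_name>L_\d{2}_\w{2}_\d{2}_\w{2})")
# L-band or S-band, as documented in README_ENHANCED.md.
L_OR_S = re.compile(PREFIX + r"(?P<beam_name>[LS]_\d{2}_\w{2}_\d{2}_\w{2})")

# Hypothetical job IDs; only the beam names come from the review comment.
l_band_id = "SCIFLO_RSLC__example_full_individual_L_20_DH_05_DH_0_state-config"
s_band_id = "SCIFLO_RSLC__example_full_individual_S_37_QP_00_NA_0_state-config"


def beam_name(pattern, job_id):
    """Return the captured beam name, or None when the job ID is skipped."""
    m = pattern.search(job_id)
    return m.group("beam_name") if m else None


print(beam_name(L_ONLY, s_band_id))  # S-band job is not matched
print(beam_name(L_OR_S, s_band_id))  # S-band job is matched
print(beam_name(L_OR_S, l_band_id))  # L-band job still matches
```

Because an unmatched job ID simply yields no match object, an L-only pattern drops S-band jobs without raising an error, which is why the omission is easy to miss.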
🚀 Pull Request: NISAR Project Adaptation and Enhanced Metrics Extraction
📋 Overview
This PR introduces a comprehensive NISAR-specific adaptation of the HySDS metrics extractor, providing enhanced hierarchical job breakdown capabilities and specialized execution time analysis. The changes organize the codebase for scalability while maintaining backward compatibility with the original functionality.
🎯 Key Features
1. NISAR-Specific Enhancements
- Beam name (L_40_DH_05_DH, L_20_QP_05_QP)
- Coverage (full, partial)
- Acquisition mode (individual, mixed)

2. Project Organization
3. Code Quality Improvements
📁 File Changes
- .gitignore
- PROJECT_TEMPLATE.md
- QUICK_START.md
- nisar/
- nisar/hysds_metrics_es_extractor_enhanced.py
- nisar/job_execution_time_extractor.py
- nisar/NISAR_MIXED_MODES_CONFIG_20200101T000000_01.json
- nisar/README.md
- nisar/README_ENHANCED.md
- metrics_extractor/hysds_metrics_es_extractor.py

🔧 Technical Details
Enhanced Metrics Extractor
- Breakdown hierarchy: beam_name → coverage → acquisition_mode
- Job ID regex: _(?P<coverage>full|partial)_(?P<acquisition_mode>individual|mixed)_(?P<beam_name>L_\d{2}_\w{2}_\d{2}_\w{2})_

Execution Time Extractor
- wall_time from job.job_info.metrics.usage_stats
- execution_time_minutes: Lesser of two wall_time values
- pcm_container_runtime_m: Larger of two wall_time values

Security & Best Practices
🚀 Usage Examples
NISAR Enhanced Metrics
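
As a rough illustration of the three-level breakdown (beam_name → coverage → acquisition_mode), the sketch below counts jobs per combination and flattens the result into CSV-ready rows. It is a minimal sketch under stated assumptions: the input records are hypothetical, already-parsed jobs, and the function names are not taken from hysds_metrics_es_extractor_enhanced.py.

```python
from collections import defaultdict

# Hypothetical pre-parsed job records; keys mirror the regex group names.
jobs = [
    {"beam_name": "L_20_QP_05_QP", "coverage": "full", "acquisition_mode": "individual"},
    {"beam_name": "L_20_QP_05_QP", "coverage": "full", "acquisition_mode": "individual"},
    {"beam_name": "L_20_QP_05_QP", "coverage": "partial", "acquisition_mode": "mixed"},
    {"beam_name": "L_40_DH_05_DH", "coverage": "full", "acquisition_mode": "individual"},
]


def three_level_counts(records):
    """Nest job counts as beam_name -> coverage -> acquisition_mode."""
    tree = defaultdict(lambda: defaultdict(lambda: defaultdict(int)))
    for rec in records:
        tree[rec["beam_name"]][rec["coverage"]][rec["acquisition_mode"]] += 1
    return tree


def flatten(tree):
    """Turn the nested counts into sorted (beam, coverage, mode, count) rows."""
    rows = []
    for beam in sorted(tree):
        for coverage in sorted(tree[beam]):
            for mode in sorted(tree[beam][coverage]):
                rows.append((beam, coverage, mode, tree[beam][coverage][mode]))
    return rows


rows = flatten(three_level_counts(jobs))
for row in rows:
    print(row)
```

The flattened rows map directly onto one CSV line per combination, which is one plausible shape for a breakdown file.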
NISAR Execution Time Analysis
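
The two derived fields, execution_time_minutes (the lesser of two wall_time values) and pcm_container_runtime_m (the larger), could be computed along these lines. This is a sketch only: it assumes both wall_time values are in seconds, which the PR description does not state, and the function name `derive_runtime_fields` is hypothetical.

```python
def derive_runtime_fields(wall_time_a_s: float, wall_time_b_s: float) -> dict:
    """Split two wall_time readings into the two reported fields.

    Assumption: inputs are in seconds, so dividing by 60 yields minutes.
    """
    shorter, longer = sorted((wall_time_a_s, wall_time_b_s))
    return {
        "execution_time_minutes": shorter / 60.0,   # lesser wall_time
        "pcm_container_runtime_m": longer / 60.0,   # larger wall_time
    }


# Illustrative values only; the order of the arguments does not matter.
fields = derive_runtime_fields(1950.0, 1800.0)
print(fields)
```

Sorting the pair first makes the result independent of which usage_stats entry is read first.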
📊 Output Files
- job_three_level_breakdown_job_SCIFLO_RSLC_*.csv
- job_execution_times_job_SCIFLO_RSLC_*.csv

🔮 Future Adaptations
The project template (PROJECT_TEMPLATE.md) provides a clear pattern for adapting this structure to other projects:
- your_project/

✅ Testing
- nisar/ directory

📈 Impact
🎯 Ready for Review
This PR provides a solid foundation for NISAR-specific analysis while establishing patterns for future project adaptations. All security concerns have been addressed, documentation is comprehensive, and the codebase maintains backward compatibility.
Branch: feature/nisar-adaptation
Target: main
Author: pymonger pymonger@gmail.com
Note
Introduces NISAR-specific metrics suite (3-level breakdown, execution-time extractors, PGE version comparison) with docs/template, while refactoring the core extractor for reuse and adding a .gitignore.
- nisar/:
  - hysds_metrics_es_extractor_enhanced.py adds 3-level job breakdown by beam_name → coverage → acquisition_mode, aggregates metrics, and exports CSV.
  - job_execution_time_extractor.py (hierarchical wall_time stats) and pge_execution_time_extractor.py (data-day filter, credential caching, CSV output).
  - compare_pge_versions.py generates Excel report comparing PGE versions.
  - NISAR_MIXED_MODES_CONFIG_*.json and README files.
- PROJECT_TEMPLATE.md and QUICK_START.md for adapting to new projects and quick usage.
- metrics_extractor/hysds_metrics_es_extractor.py for reuse (imports/formatting), preserves existing behavior.
- .gitignore.

Written by Cursor Bugbot for commit e08beba.