# The PHANGS-ALMA Pipeline

## Preface

### Contents

This is the [PHANGS](https://sites.google.com/view/phangs/home) post-processing and science-ready data product pipeline.
This pipeline processes data from calibrated visibilities to science-ready spectral cubes and maps.
The procedures and background for key parts of the pipeline are discussed in the Astrophysical Journal Supplement Series
paper [PHANGS-ALMA Data Processing and Pipeline](https://ui.adsabs.harvard.edu/abs/2021ApJS..255...19L/abstract).
Please consult that paper for more background and details.

### What this pipeline is for

This pipeline is designed to process data from radio interferometer observations (from, e.g., ALMA or the VLA).
It is applied to calibrated visibilities, such as those generated by the CASA software, and delivers science-ready
spectral cubes and moment maps, along with associated uncertainty maps. In this regard, the PHANGS-ALMA pipeline offers
a flexible alternative to the `scriptForImaging` script distributed by ALMA.
A detailed list of the derived data products can be found in Section 7 of the paper mentioned above. The pipeline can
also process Total Power data from ALMA.

### Pipeline and configuration files

This repository contains the scripts that comprise the PHANGS-ALMA pipeline.
Configuration files for a large set of PHANGS projects, including the live version of the files for the
PHANGS-ALMA CO survey, exist in a [separate repository](https://github.com/PhangsTeam/phangs_pipeline_configs).
As examples, we include here a frozen set of files that can be used to reduce PHANGS-ALMA.
If you need access to those other repositories, please request it.

### Contact

For issues, the preferred method is to open an issue on the
[GitHub issues page](https://github.com/akleroy/phangs_imaging_scripts/issues).

## Installation

We recommend installing the pipeline in a separate [Conda](https://www.anaconda.com/) environment.

The pipeline works with Python 3.12 and is pip installable:

```bash
pip install git+https://github.com/akleroy/phangs_imaging_scripts.git
```

To check that the installation worked, import the pipeline in Python:
```python
import phangsPipeline as ppl
```

You will also need to download [analysisUtils](https://doi.org/10.5281/zenodo.7502159). Make sure to grab the latest
version, and add the location of these scripts to your Python path.
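
One way to do this at runtime is a small sketch like the following (the directory below is a hypothetical unpack location; point it at wherever you placed the analysisUtils scripts):

```python
import sys

# Hypothetical location where the analysisUtils scripts were unpacked;
# adjust this to your own system.
analysis_utils_dir = "/opt/analysis_scripts"

# Make the scripts importable before running the pipeline.
if analysis_utils_dir not in sys.path:
    sys.path.append(analysis_utils_dir)
```

Alternatively, set `PYTHONPATH` in your shell profile so the scripts are found in every session.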

On the first run, you may get an error about downloading CASA data. In this case, ensure that the directory it lists
exists and rerun. You can change this data path by editing `config.py` in `~/.casa`.
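
For reference, a minimal hypothetical `~/.casa/config.py` pointing CASA at a custom data directory might look like the following (the path is an example; check the CASA documentation for the options your version supports):

```python
# ~/.casa/config.py -- hypothetical example; adjust the path to your system.
datapath = ["/home/user/.casa/data"]
```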

## Running the pipeline

There are two ways that this pipeline might be useful. First, it provides an end-to-end path to process calibrated
ALMA data (or VLA data) of the sort produced by the `scriptForPI` script distributed by ALMA into spectral cubes and
maps. That end-to-end approach is described in "Workflow for most users." Second, the `phangsPipeline` directory
contains a number of modules for use inside and outside CASA that should have general utility. These are written
without requiring any broader awareness of the pipeline infrastructure and should just be generally useful. These are
files named `casaSOMENAME.py` and `scSOMEOTHERNAME.py` and, to a lesser extent, `utilsYETANOTHERNAME.py`.

## Workflow for most users

If you just want to *use* the pipeline then you will need to do three things:

0. Run `scriptForPI.py` to apply the observatory-provided calibration to your data (this is outside the pipeline remit).
   The pipeline picks up from there; it does not replace the ALMA observatory calibration and flagging pipeline.
1. Make configuration files ("key files") that describe your project.
   Usually you can copy and modify an existing project to get a good start. We provide PHANGS-ALMA as an example.
2. Run the pipeline scripts.

**The Easiest Way** This release includes the full PHANGS-ALMA set of keys and the scripts we use to run the pipeline
for PHANGS-ALMA. These are *heavily documented*: copy them to make your own script and configuration and follow the
documentation in those scripts to get started. To be specific:

- The PHANGS-ALMA keys to reduce the data end-to-end from the archive are in: `phangs-alma_keys/`
- The script to run the pipeline is: `run_pipeline_phangs-alma.py`

These can run the actual PHANGS-ALMA reduction, though in practice we used slightly more complex versions of a few
programs to manage the workflow. Copying and modifying these is your best bet, especially following the patterns in
the key files.

## A few details on procedure

The full procedure is described in our ApJS paper, and the programs themselves are all in this repository,
so we do not provide extremely detailed documentation here. Many individual routines are documented, and we intend
to improve the documentation in the future. Broadly, the pipeline runs in four stages:

1. **Staging** Stage and process uv-data. This step includes continuum subtraction, line extraction, and spectral
   regridding.
2. **Imaging** Image and deconvolve the uv-data. This runs in several steps: dirty imaging, clean mask alignment,
   multi-scale deconvolution, re-masking, and single-scale deconvolution.
3. **Post-Process** Process deconvolved data into science-ready data cubes. This stage includes merging with the
   Total Power data and mosaicking.
4. **Derived Products** Convolution, noise estimation, masking, and calculation of science-ready data products.

The simplest way to run these is to edit `run_pipeline_phangs-alma.py` to point at your key files, and run.
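
As a rough illustration of that control flow, here is a hypothetical sketch (illustrative stage names and a stand-in function, not the real `phangsPipeline` handler API):

```python
# Hypothetical sketch of the four-stage control flow described above.
# The real driver (run_pipeline_phangs-alma.py) delegates each stage to a
# handler object configured from the key files.
STAGES = ["staging", "imaging", "postprocess", "derived"]

def run_stage(stage, target):
    # Stand-in for a handler call; the real handlers read and write data files.
    return f"{stage}:{target}"

log = [run_stage(stage, "ngc0628") for stage in STAGES]
```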

## Chunked imaging

For large cubes, it may be beneficial to farm out each cube chunk to a different machine (within some HPC environment)
and process the chunks independently. For this, the `ImagingChunkedHandler` exists. There are example scripts showing
how to use this in the `examples_on_clusters/` directory, and a script to run end-to-end in
`run_casa_imaging_chunked_example.py` in the base directory.

## Contents of the pipeline in more detail

**Architecture**: The pipeline is organized and run by a series of "handler" objects. These handlers organize the lists
of targets, array configurations, spectral products, and derived moments, and execute loops over them.

The routines that process individual data sets live in individual modules, grouped by theme (e.g., `casaImagingRoutines`
or `scNoiseRoutines`). These routines do not know about the larger infrastructure of arrays, targets, etc. They
generally take an input file, an output file, and various keyword arguments.
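
To illustrate that calling convention, here is a hypothetical stand-in (not an actual `phangsPipeline` routine; the name and keywords are invented for illustration):

```python
# Hypothetical stand-in illustrating the routine calling convention
# described above: an input file, an output file, and keyword arguments.
def convolve_cube(infile, outfile, target_res_arcsec=None, overwrite=False):
    # A real routine would read `infile`, operate on the data,
    # and write the result to `outfile`.
    return {"infile": infile, "outfile": outfile,
            "target_res_arcsec": target_res_arcsec, "overwrite": overwrite}

result = convolve_cube("ngc0628_co21.fits", "ngc0628_co21_10p0.fits",
                       target_res_arcsec=10.0)
```

Because the routines share this shape, the handler objects can loop over targets and products and call them without any special-case logic.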

A project is defined by a series of text key files in a "key_directory". These define the measurement set inputs,
configurations, spectral line products, moments, and derived products.

**User Control**: For the most part, the user's job is to *define the key files* and to run some scripts.