cgm_ml

Predicting the physical conditions in the CGM from absorption observables using random forests.

Data

The CGM absorber dataset was generated for Appleby+2022 (see cgm_physical_conditions) using a sample of galaxies across a range of stellar masses and star formation rates, and with lines of sight for a range of impact parameters. For this machine learning project we increased the data volume by (roughly) doubling the galaxy sample, so I needed to combine the original absorber dataset with the extra dataset:

python make_dataset.py <Simba model> <Simba wind> <Simba snapshot> <pygad line>

where the Simba model is m100n1024, the simba wind is s50, and the Simba snapshot is 151. The pygad line can be any of the following: H1215, MgII2796, CII1334, SiIII1206, CIV1548 or OVI1031.

Random forests

I use the Scikit-Learn implementation of the random forest regressor algorithm to produce a mapping between the underlying physical conditions of the CGM absorbers and the observable absorber parameters.

python random_forest.py <simba model> <simba wind> <simba snapshot> <pygad line> <target feature>

where the target feature can be delta_rho, T or Z (overdensity, temperature or metallicity).

Models

Trained models are saved in the `models' directory; see plotting routines for examples of how to read these.

Distribution scaling

Since the random forest models make predictions that are in general too concentrated towards the mean, we have two methods of post-processsing the predictions to reproduce the intrinsic scatter in the truth data.

python scale_predictions.py <simba model> <simba wind> <simba snapshot> <pygad line>

Plotting

For joint plots of predictions against truth data:

python plot_joint_lines.py <Simba model> <Simba wind> <Simba snapshot> <pygad line>

For feature importance matrices:

python plot_feature_importance.py <Simba model> <Simba wind> <Simba snapshot> <pygad line>

For feature accuracy importance:

python plot_delta_scatter.py <Simba model> <Simba wind> <Simba snapshot>

For plotting predictions in phase space:

python plot_phase_space.py <Simba model> <Simba wind> <Simba snapshot> <mode>

where mode can be orig, scatter, or trans and refers to the prediction data version.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

cgm_ml

Data

Random forests

Models

Distribution scaling

Plotting

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 8 Commits
data		data
models		models
LICENSE		LICENSE
README.md		README.md
make_dataset.py		make_dataset.py
plot_delta_scatter.py		plot_delta_scatter.py
plot_feature_importance.py		plot_feature_importance.py
plot_joint_lines.py		plot_joint_lines.py
plot_phase_space.py		plot_phase_space.py
random_forest.py		random_forest.py
scale_predictions.py		scale_predictions.py
sub_random_forest.sh		sub_random_forest.sh

Folders and files

Latest commit

History

Repository files navigation

cgm_ml

Data

Random forests

Models

Distribution scaling

Plotting

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages