Sophia Flury edited this page Jan 18, 2024 · 10 revisions

LyCsurv

Module containing methods for training statistical survival models using Lyman continuum observations to predict Lyman continuum escape fractions

AFT

LyCsurv.AFT(dat, resp='f_esc(LyC)', verbose=False, intercept=True, StatsVerbose=False)
Name:

AFT

Purpose:

Perform parametric survival regression using the accelerated failure time (AFT) method, assuming a generic Weibull distribution.

Arguments:
dat (pandas.DataFrame):

pandas DataFrame containing columns named according to conventions in params.lis with values corresponding to the galaxy sample which the user desires to fit

Keyword Arguments:
resp (str):

string indicating the desired response variable. Options are ‘f_esc(LyC)’, ‘f_esc(LyA)’, ‘f(LyC)’, and ‘f(LyA)’. Default is ‘f_esc(LyC)’.

verbose (bool):

boolean indicating whether to print out details of the accelerated failure regression. Default is False.

intercept (bool):

boolean indicating whether to include an intercept in the parametric hazard function. Default is True.

StatsVerbose (bool):

boolean indicating whether to perform and return statistical assessments of the model using the training set. Default is False.

Returns:
aft_fit (np.ndarray):

n by 3 array containing the median and the lower and upper uncertainties, which correspond to the 0.16 and 0.84 quantiles

ModAssess (tuple):

(_optional_) 4-element tuple of model assessments containing the R^2, adjusted R^2, RMS, and concordance index. Only returned if StatsVerbose is set to True.
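To make the three returned columns concrete, here is a minimal sketch of how a median and 0.16/0.84 quantile uncertainties can be read off a Weibull distribution. It is not the LyCsurv implementation: the function names and the `scale` and `shape` values are hypothetical, and it assumes the uncertainties are reported as distances from the median, which the text above does not state explicitly.

```python
import math

def weibull_quantile(p, scale, shape):
    """Return the p-th quantile of a Weibull(scale, shape) distribution."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    # Invert the Weibull CDF F(t) = 1 - exp(-(t / scale) ** shape).
    return scale * (-math.log(1.0 - p)) ** (1.0 / shape)

def summarize(scale, shape):
    """Return (median, lower uncertainty, upper uncertainty).

    Assumption: uncertainties are distances from the median to the
    0.16 and 0.84 quantiles.
    """
    med = weibull_quantile(0.50, scale, shape)
    lower = weibull_quantile(0.16, scale, shape)
    upper = weibull_quantile(0.84, scale, shape)
    return med, med - lower, upper - med

# Hypothetical Weibull parameters for one galaxy, for illustration only.
row = summarize(scale=0.1, shape=1.5)
```

Stacking one such row per galaxy would give an n by 3 array with the same column order as described for aft_fit.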

CoxPH

LyCsurv.CoxPH(dat, resp='f_esc(LyC)', verbose=False, StatsVerbose=False)
Name:

CoxPH

Purpose:

Perform a Cox proportional hazards regression on the input reference data.

Arguments:
dat (pandas.DataFrame):

pandas DataFrame containing columns named according to conventions in params.lis with values corresponding to the galaxy sample which the user desires to fit

Keyword Arguments:
resp (str):

string indicating the desired response variable. Options are ‘f_esc(LyC)’, ‘f_esc(LyA)’, ‘f(LyC)’, and ‘f(LyA)’. Default is ‘f_esc(LyC)’.

verbose (bool):

boolean indicating whether to print out details of the Cox proportional hazards regression. Default is False.

StatsVerbose (bool):

boolean indicating whether to perform and return statistical assessments of the model using the training set. Default is False.

Returns:
cph_fit (numpy.ndarray):

n by 4 array containing the median, the lower and upper uncertainties (corresponding to the 0.16 and 0.84 quantiles), and an indicator of whether the survival function is always below (-1) or above (+1) the median of the predicted distribution

ModAssess (tuple):

(_optional_) 4-element tuple of model assessments containing the R^2, adjusted R^2, RMS, and concordance index. Only returned if StatsVerbose is set to True.
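For readers unfamiliar with the Cox model, the sketch below shows the standard proportional hazards relation S(t | x) = S0(t) ** exp(beta . x), which turns a baseline survival curve into a curve for one object. The function names, coefficients, and baseline values are hypothetical and are not taken from LyCsurv.

```python
import math

def partial_hazard(coeffs, covariates):
    """Return exp(beta . x), the multiplicative factor on the baseline hazard."""
    return math.exp(sum(b * x for b, x in zip(coeffs, covariates)))

def survival_curve(baseline_survival, coeffs, covariates):
    """Scale a baseline survival curve S0 to one object: S = S0 ** exp(beta . x)."""
    ph = partial_hazard(coeffs, covariates)
    return [s0 ** ph for s0 in baseline_survival]

# Hypothetical baseline curve and fitted coefficients, for illustration only.
baseline = [1.0, 0.8, 0.5, 0.2]
beta = [0.5, -1.0]

# A partial hazard above 1 pulls the survival curve below the baseline.
curve = survival_curve(baseline, beta, [1.0, 0.0])
```

An object whose covariates give beta . x = 0 simply recovers the baseline curve.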

InterpPH

LyCsurv.InterpPH(dat, part, base)
Name:

InterpPH

Purpose:

Interpolate over the survival function for each input observation to predict the appropriate escape fraction.

Arguments:
dat (pandas.DataFrame):

pandas DataFrame of observables used to train model

part (**):

Cox partial proportional hazard values for survival function

base (**):

predicted response corresponding to full range of observations

Returns:
predict (np.ndarray):

Nx3 array of response values predicted by the Cox PH model. First row is the lower uncertainty. Middle row is the median. Last row is the upper uncertainty.
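The interpolation step can be sketched as follows, assuming a survival curve that is non-increasing along a grid of response values. The helper `interp_quantile` and the numbers are hypothetical and only illustrate the idea. The median sits where the survival function equals 0.5, and the 0.16 and 0.84 quantiles of the predicted distribution sit where it equals 0.84 and 0.16, respectively.

```python
def interp_quantile(grid, survival, level):
    """Linearly interpolate the grid value at which the survival curve equals `level`.

    Assumes `survival` is non-increasing along `grid`. Returns None when
    the curve never reaches `level` within the grid.
    """
    for i in range(1, len(grid)):
        s_prev, s_next = survival[i - 1], survival[i]
        if s_next <= level <= s_prev:
            if s_prev == s_next:
                # Flat segment: any point matches, so take the left edge.
                return grid[i - 1]
            frac = (s_prev - level) / (s_prev - s_next)
            return grid[i - 1] + frac * (grid[i] - grid[i - 1])
    return None

# Hypothetical response grid and survival curve for one object.
grid = [0.0, 0.1, 0.2, 0.3]
survival = [1.0, 0.8, 0.4, 0.1]

median = interp_quantile(grid, survival, 0.50)
lower = interp_quantile(grid, survival, 0.84)  # 0.16 quantile of the distribution
upper = interp_quantile(grid, survival, 0.16)  # 0.84 quantile of the distribution
```

Returning None for a curve that never crosses the requested level is a choice made for this sketch; how LyCsurv handles that case is not specified here.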

ModAssess

LyCsurv.ModAssess(trn, mod, cens, concord='harrell')
Name:

ModAssess

Purpose:

Assess the quality of a survival model by testing it against the training data set (in this case, the LzLCS).

Arguments:
trn (np.ndarray):

Nx1 array of observed values

mod (np.ndarray):

Nx1 array of predicted values

cens (np.ndarray):

Nx1 array of censors as booleans

Keyword Arguments:
concord (str):

string indicating the method to use for concordance calculation. Currently supports Harrell+ 1996 and Uno+ 2011 methods. Default is ‘harrell’.

Returns:
R2 (float):

R^2 metric of the residuals

R2a (float):

adjusted R^2 metric of the residuals

RMS (float):

root-mean-square of the residuals

C (float):

concordance index
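The four returned metrics can be illustrated with a minimal pure-Python sketch. This is not the LyCsurv code. It makes several assumptions: `n_pred` (the number of predictors used for the adjusted R^2) is a hypothetical extra input, `detected[i]` is True for uncensored values, and the concordance follows the common Harrell convention in which a pair is comparable only when the smaller observed value is uncensored. LyCsurv may treat its censored values differently.

```python
import math

def r_squared(obs, pred):
    """Coefficient of determination of the residuals."""
    mean_obs = sum(obs) / len(obs)
    ss_res = sum((o - p) ** 2 for o, p in zip(obs, pred))
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - ss_res / ss_tot

def adjusted_r_squared(r2, n_obs, n_pred):
    """R^2 penalized for the number of predictors (n_pred is an assumption)."""
    return 1.0 - (1.0 - r2) * (n_obs - 1) / (n_obs - n_pred - 1)

def rms(obs, pred):
    """Root-mean-square of the residuals."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))

def harrell_c(obs, pred, detected):
    """Harrell-style concordance: fraction of comparable pairs ranked correctly."""
    concordant = 0.0
    comparable = 0
    for i in range(len(obs)):
        for j in range(len(obs)):
            # Comparable only if the smaller observed value is uncensored.
            if obs[i] < obs[j] and detected[i]:
                comparable += 1
                if pred[i] < pred[j]:
                    concordant += 1.0
                elif pred[i] == pred[j]:
                    concordant += 0.5  # ties in prediction count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable

# Hypothetical observed and predicted values, for illustration only.
obs = [0.1, 0.2, 0.3, 0.4]
pred = [0.12, 0.18, 0.33, 0.37]
detected = [True, True, False, True]
r2 = r_squared(obs, pred)
stats = (r2, adjusted_r_squared(r2, len(obs), 1), rms(obs, pred),
         harrell_c(obs, pred, detected))
```

A concordance index of 1 means every comparable pair is ordered correctly, 0.5 is what random predictions would give, and 0 means every pair is reversed.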

Train

class LyCsurv.Train(resp='f_esc(LyC)', method='CoxPH', intercept=True, verbose=False)

Bases: object

Name:

Train

Purpose:

Train a specified survival model on reference data and assess the results.

Keyword Arguments:
resp (str):

string indicating the desired response variable. Options are ‘f_esc(LyC)’, ‘f_esc(LyA)’, ‘f(LyC)’, and ‘f(LyA)’. Default is ‘f_esc(LyC)’.

method (str):

string indicating the method of survival analysis to be used in the training run: ‘CoxPH’ or ‘AFT’. Default is ‘CoxPH’.

Attributes:
train (numpy.ndarray):

88x2 array containing the observed and predicted response variable

stats (tuple):

(_optional_) 4-element tuple of model assessments containing the R^2, adjusted R^2, RMS, and concordance index. Only returned if StatsVerbose is set to True.

resp (str):

response variable (corresponds to resp input)

meth (str):

method used for training (corresponds to method input)
