I think I have a workaround for this, so this is more of a heads up than an issue that needs solving at this point.
I have a data frame holding beta values. When I put that into methylize.diff_meth_pos(df, phenotype) it breaks, giving me a math domain error.
/data/projects/classifiers/src/exploration/methylation/differentialMethylation.ipynb Cell 17 in ()
----> 1 methylize.diff_meth_pos(meth_data, phenotype)
File /data/projects/classifiers/bin/envs/classifiers/lib/python3.10/site-packages/methylize/diff_meth_pos.py:283, in diff_meth_pos(meth_data, pheno_data, regression_method, impute, **kwargs)
281 def beta2m(val):
282 return math.log2(val/(1-val))
--> 283 meth_data = meth_data.apply(np.vectorize(beta2m))
284 if verbose: LOGGER.info(f"Converted your beta values into M-values; {meth_data.shape}")
286 # Check that the methylation and phenotype data correspond to the same number of samples; flip if necessary
File /data/projects/classifiers/bin/envs/classifiers/lib/python3.10/site-packages/pandas/core/frame.py:9568, in DataFrame.apply(self, func, axis, raw, result_type, args, **kwargs)
9557 from pandas.core.apply import frame_apply
9559 op = frame_apply(
9560 self,
9561 func=func,
(...)
9566 kwargs=kwargs,
9567 )
-> 9568 return op.apply().__finalize__(self, method="apply")
File /data/projects/classifiers/bin/envs/classifiers/lib/python3.10/site-packages/pandas/core/apply.py:764, in FrameApply.apply(self)
761 elif self.raw:
762 return self.apply_raw()
...
File /data/projects/classifiers/bin/envs/classifiers/lib/python3.10/site-packages/methylize/diff_meth_pos.py:282, in diff_meth_pos.<locals>.beta2m(val)
281 def beta2m(val):
--> 282 return math.log2(val/(1-val))
ValueError: math domain error
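For context, `math.log2` raises `ValueError` for any argument that is zero or negative, so a single beta value of exactly 0, or one outside the 0 to 1 range, would be enough to trigger this. The following is a minimal sketch using made-up toy values (not taken from the data in this report) and a standalone copy of the `beta2m()` formula shown in the traceback:

```python
import math


def beta2m(val):
    # Same formula as the beta2m() shown in the traceback above.
    return math.log2(val / (1 - val))


# A well-behaved beta value: the ratio is 1, so the M-value is 0.0.
mid = beta2m(0.5)

# Toy edge cases (hypothetical values): each one makes val / (1 - val)
# zero or negative, which math.log2 rejects with ValueError.
failures = {}
for beta in (0.0, -0.1, 1.2):
    try:
        beta2m(beta)
    except ValueError as err:
        failures[beta] = str(err)

print(mid)
print(failures)
```

If something like this is the cause, the exception comes from the input values rather than from `DataFrame.apply` or `np.vectorize`.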
There doesn't appear to be anything wrong with the data. If I do the following (which replicates the beta2m() function):
df = meth_data.apply(lambda x: x / (1 - x))
df = df.applymap(lambda x: np.log2(x))
and plug the resulting df into diff_meth_pos, then it processes that df quite happily, though it does give me two "Warning: invalid value encountered in subtract" warnings and then tells me that "No DMPs were found within the q < 1 (the significance cutoff level specified)".
I am assuming that's something else though.
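One possible explanation, offered as a guess rather than a confirmed diagnosis: unlike `math.log2`, `np.log2` does not raise on zero or negative input. It returns `-inf` or `nan` (with a runtime warning), so the manual conversion can complete even when some beta values sit at 0 or 1, and subtracting two infinities later on yields `nan` with an "invalid value encountered in subtract" warning. The sketch below counts such values and clips them before converting. The helper names `count_out_of_range` and `beta_to_m`, the `eps` offset, and the toy frame are all hypothetical and not part of methylize:

```python
import numpy as np
import pandas as pd


def count_out_of_range(betas):
    """Count beta values that are <= 0 or >= 1 (NaN is not counted)."""
    return int(((betas <= 0) | (betas >= 1)).to_numpy().sum())


def beta_to_m(betas, eps=1e-6):
    """Clip betas into [eps, 1 - eps], then apply log2(beta / (1 - beta)).

    eps is an arbitrary illustrative offset, not a methylize default.
    """
    clipped = betas.clip(lower=eps, upper=1 - eps)
    return np.log2(clipped / (1 - clipped))


# Toy data (hypothetical): one probe has betas of exactly 0 and 1.
toy = pd.DataFrame(
    {"s1": [0.0, 0.5], "s2": [1.0, 0.25]},
    index=["cg_a", "cg_b"],
)

# The unclipped conversion does not raise; it silently produces infinities.
with np.errstate(divide="ignore", invalid="ignore"):
    raw_m = np.log2(toy / (1 - toy))

n_bad = count_out_of_range(toy)
safe_m = beta_to_m(toy)

print(n_bad)
print(raw_m)
print(safe_m)
```

Running `count_out_of_range(meth_data)` on the real frame would show whether this applies here; if it returns 0, the cause lies elsewhere.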