Rodin is a Python library for processing and analyzing metabolomics and other omics data. It is a class-based toolkit that covers tasks from basic data manipulation to statistical evaluation, visualization, and metabolic pathway analysis.
Most of its functionality is now also available in the web app at https://rodin-meta.com.
- Efficient Data Handling: Streamlined manipulation and transformation of metabolomics and other omics data.
- Robust Statistical Analysis: Includes ANOVA, t-tests, and more.
- Machine Learning Methods: Random Forest, Logistic and Linear regressions.
- Advanced Dimensionality Reduction: Techniques like PCA, t-SNE, UMAP.
- Interactive Data Visualization: Dimensionality-reduction scatter plots, volcano plots, box plots, and clustergrams.
- Pathway Analysis: Features for metabolic pathway analysis.
We recommend installing Rodin in a separate environment for effective dependency management.
- Python (3.10 or higher)
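Before installing, it can help to confirm that the interpreter in the active environment meets the stated minimum. The snippet below is a minimal sketch using only the standard library; `python_is_supported` and `MIN_PYTHON` are hypothetical names introduced for illustration and are not part of Rodin.

```python
import sys

# Minimum version taken from the requirement listed above.
MIN_PYTHON = (3, 10)


def python_is_supported(version_info=sys.version_info, minimum=MIN_PYTHON):
    """Return True when the (major, minor) version meets the stated minimum."""
    return tuple(version_info[:2]) >= minimum


if __name__ == "__main__":
    if python_is_supported():
        print("Python version OK for Rodin")
    else:
        print(f"Python {MIN_PYTHON[0]}.{MIN_PYTHON[1]} or higher is required")
```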
pip install rodin
or install Rodin directly from GitHub:
pip install git+https://github.com/BM-Boris/rodin.git
Here's a basic example demonstrating the usage of Rodin for data analysis. Comprehensive Jupyter notebook guides can be found in the 'guides' folder.
import rodin
# Assume 'features.csv' and 'class_labels.csv' are your datasets
features_path = 'path/to/features.csv'
classes_path = 'path/to/class_labels.csv'
# Creating an instance of Rodin_Class
rodin_instance = rodin.create(features_path, classes_path)
# Transform the data (imputation, normalization, and log-transformation steps)
rodin_instance.transform()
# Run t-test comparing two groups based on 'age'
rodin_instance.ttest('age')
# Run a two-way ANOVA comparing groups based on 'age' and 'region'
rodin_instance.twoway_anova(['age','region'])
# Run logistic and linear regressions for each feature to get per-feature p-values
rodin_instance.sf_lg('sex')
rodin_instance.sf_lr('age')
# Run a random forest classifier and regressor: each reports metrics of the trained model
# from k-fold validation and assigns a feature importance score to each variable
rodin_instance.rf_class('region')
rodin_instance.rf_regress('age')
# Slice the whole object using the same indexing pattern as pandas
rodin_instance = rodin_instance[rodin_instance.features[rodin_instance.features['imp(rf) age']>0]]
# Perform PCA with 2 principal components (UMAP and t-SNE are available as well)
rodin_instance.run_pca(n_components=2)
# Plotting the PCA results
# 'region' column in the 'samples' DataFrame is used for coloring the points
rodin_instance.plot(dr_name='pca', hue='region', title='PCA Plot')
# Volcano plot (the p-value and effect-size columns named below must already exist in rodin_instance.features)
rodin_instance.volcano(p='p_adj(owa) region', effect_size='lfc (New York vs Georgia)', sign_line=0.01)
# Box Plot
rodin_instance.boxplot(rows=[9999,4561], hue='region')
# Clustergram
rodin_instance.clustergram(hue='sex',standardize='row')
# Pathway analysis
rodin_instance.analyze_pathways(pvals='p_value', stats='statistic',mode='positive')
# Replace 'p_value' and 'statistic' with the actual column names in your 'features' DataFrame (rodin_instance.features) and provide the mass spectrometry analysis mode.
The updated guide can be accessed here: https://bm-boris.github.io/rodin_guide/basics.html. Test data from the guide can be found at https://github.com/BM-Boris/rodin_guide/tree/main/data.
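The volcano call in the example combines an adjusted p-value column with a log fold change column and a significance line. The sketch below shows, in plain Python, one way such quantities can be computed and thresholded. It is an illustration of the idea, not Rodin's implementation: the Benjamini-Hochberg correction is an assumption (the text above does not say which adjustment Rodin applies), and `bh_adjust`, `log2_fold_change`, `volcano_hits`, and the `min_abs_lfc` cutoff are hypothetical names introduced here.

```python
import math


def bh_adjust(pvals):
    """Benjamini-Hochberg adjusted p-values, returned in the input order.

    Assumption: BH is used only as an example of a multiple-testing correction.
    """
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def log2_fold_change(group_a, group_b):
    """log2 of the ratio of group means (a over b); assumes positive raw intensities."""
    mean_a = sum(group_a) / len(group_a)
    mean_b = sum(group_b) / len(group_b)
    return math.log2(mean_a / mean_b)


def volcano_hits(pvals, lfcs, sign_line=0.01, min_abs_lfc=1.0):
    """Indices of features below the significance line with a large enough effect.

    `min_abs_lfc` is a hypothetical effect-size cutoff added for illustration.
    """
    adjusted = bh_adjust(pvals)
    return [
        i
        for i, (p, lfc) in enumerate(zip(adjusted, lfcs))
        if p < sign_line and abs(lfc) >= min_abs_lfc
    ]


if __name__ == "__main__":
    raw_p = [0.001, 0.04, 0.03, 0.5]
    effects = [2.0, 0.5, -3.0, 1.5]
    print(bh_adjust(raw_p))
    print(volcano_hits(raw_p, effects, sign_line=0.01))
```

With a strict significance line, a feature with a large fold change can still be excluded once its p-value is adjusted for the number of features tested, which is why the volcano plot is usually drawn from adjusted rather than raw p-values.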
For questions, suggestions, or feedback, please contact boris.minasenko@emory.edu
If you use Rodin in your research, please cite the following paper:
Minasenko B, Wang D, Cirillo P, Krigbaum N, Cohn B, Jones DP, Collins JM, Hu X.
Rodin: a streamlined metabolomics data analysis and visualization tool. Bioinformatics Advances. 2025; 5(1): vbaf088.
https://doi.org/10.1093/bioadv/vbaf088