CRC Radiotherapy

This repository contains the code for the CRC Radiotherapy project. The workflow covers data preprocessing and model building, including a MOFA (Multi-Omics Factor Analysis) model and a Random Forest model.

Overview of Scripts

The scripts are organised to follow the workflow of the project:

1. Data Preparation

data_split.Rmd
Splits the dataset into two subsets:
- One subset for training the MOFA model.
- One subset for training the Random Forest model.
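As a rough illustration of this kind of split, the sketch below divides a list of sample IDs into two disjoint groups. It is not the code in data_split.Rmd: the function name, the 60/40 ratio, the seed, and the patient IDs are all hypothetical. The mapping of "unsupervised" to the MOFA subset and "supervised" to the Random Forest subset is also an assumption, based only on the dataset names.

```python
import random


def split_samples(sample_ids, mofa_fraction=0.5, seed=42):
    """Split sample IDs into two disjoint subsets (hypothetical helper).

    Returns (unsupervised_ids, supervised_ids): the first list is meant for
    MOFA training, the second for Random Forest training.
    """
    ids = sorted(set(sample_ids))      # de-duplicate and fix the starting order
    rng = random.Random(seed)          # local RNG so the split is reproducible
    rng.shuffle(ids)
    cut = round(len(ids) * mofa_fraction)
    return ids[:cut], ids[cut:]


# Made-up patient IDs, for illustration only.
patients = [f"P{i:03d}" for i in range(1, 11)]
unsupervised_ids, supervised_ids = split_samples(patients, mofa_fraction=0.6)
print(len(unsupervised_ids), len(supervised_ids))
```

Keeping the two subsets disjoint means the samples used to learn the MOFA factors are never reused when the Random Forest is trained and evaluated.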

2. Data Preprocessing

RNA_preprocessing.Rmd
Preprocesses the RNA dataset.

mutation_preprocessing.Rmd
Preprocesses the mutation dataset.

methylation_processing.Rmd
Preprocesses the methylation dataset.

cna_preprocessing.Rmd
Preprocesses the Copy Number Alteration (CNA) dataset.

3. MOFA Model Development

MOFA_models.Rmd
- Identify the optimal MOFA model.
- Build the optimal MOFA model.
- Characterize the factors in the model.

4. Feature Extraction

extract_features.Rmd
Extract informative features from the MOFA model, particularly in relation to the treatment response covariate.

5. Functional Analysis

GSEA.Rmd
Performs Gene Set Enrichment Analysis (GSEA) on the important factors identified by the MOFA model.

pathway_enrichment.Rmd
Conducts pathway enrichment analysis on the informative factors identified by the MOFA model.
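For readers unfamiliar with pathway enrichment, the sketch below shows one common formulation, an over-representation test based on the hypergeometric distribution. It is an assumption that this resembles what the scripts compute; they may use a different test or a dedicated package. The function name and the gene counts are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(universe_size, pathway_size, selected_size, overlap):
    """Upper-tail hypergeometric p-value, P(X >= overlap).

    X is the number of pathway genes found among `selected_size` genes drawn
    without replacement from a universe of `universe_size` genes, of which
    `pathway_size` belong to the pathway.
    """
    upper = min(pathway_size, selected_size)
    total = comb(universe_size, selected_size)
    tail = sum(
        comb(pathway_size, k) * comb(universe_size - pathway_size, selected_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Made-up counts: 20,000 genes in the universe, 200 in the pathway,
# 100 selected genes, 8 of which fall in the pathway.
p_value = hypergeom_enrichment_p(20000, 200, 100, 8)
print(f"p = {p_value:.3g}")
```

A small p-value suggests that the selected genes overlap the pathway more than chance would predict. When many pathways are tested, the p-values would normally be corrected for multiple testing.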

6. Machine Learning

Random_Forest.ipynb
- Build and evaluate a Random Forest model using the features extracted from the MOFA model.
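The build-and-evaluate step can be sketched with scikit-learn as follows. This is a minimal illustration on synthetic data, not the contents of Random_Forest.ipynb: the feature count, hyperparameters, train/test split, and the ROC AUC metric are all assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split


def train_and_evaluate(X, y, seed=0):
    """Fit a Random Forest and report ROC AUC on a held-out split."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, stratify=y, random_state=seed
    )
    model = RandomForestClassifier(n_estimators=200, random_state=seed)
    model.fit(X_train, y_train)
    # Probability of the positive class, used to compute the ROC AUC.
    proba = model.predict_proba(X_test)[:, 1]
    return model, roc_auc_score(y_test, proba)


# Synthetic stand-in for a (samples x MOFA-derived features) matrix and a
# binary treatment-response label; none of these values are real data.
X, y = make_classification(
    n_samples=120, n_features=4, n_informative=3, n_redundant=1, random_state=0
)
model, auc = train_and_evaluate(X, y)
print(f"Held-out ROC AUC: {auc:.2f}")
print("Feature importances:", model.feature_importances_.round(3))
```

With a small cohort, repeated or cross-validated evaluation is usually more stable than a single held-out split.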

Data availability

Preprocessed datasets and a constructed MOFA model (MOFA_model_15.hdf5) are provided in this repository so that the later scripts can be run.

Preprocessed Datasets

1. RNA Datasets

RNA_supervised_preprocessed
RNA_unsupervised_preprocessed

2. CNA Datasets

cna_supervised_preprocessed
cna_unsupervised_preprocessed

3. Mutation Datasets

mutation_supervised_preprocessed
mutation_unsupervised_preprocessed

4. Methylation Datasets

methylation_supervised_preprocessed
methylation_unsupervised_preprocessed

