unclbc/nrf2_nsclc_diff_paper

RNAseq Analysis

Environment set up

Assumes a UNIX-based environment. On a SLURM cluster, the shell scripts can be submitted with sbatch, provided nodes with the needed resources exist.

  1. Clone repo from GitHub.
  2. From a terminal, make sure the current directory is the root of the cloned repo.
  3. Set up a self-contained Conda installation (Micromamba) and an environment (analysis) in Scripts/mm, based on the YAML file Scripts/base.mm.yaml, by running mmSetup.sbatch:

To run locally (from the base dir of the cloned repo):

./Scripts/mmSetup.sbatch ./Scripts/base.mm.yaml

To submit to a SLURM cluster (specifying the partition to run on as <PART>):

sbatch --partition <PART> ./Scripts/mmSetup.sbatch ./Scripts/base.mm.yaml

Note: This is not a minimal environment; it is a generic environment that includes packages not needed here.

The ./Scripts/mm directory now contains a bin/micromamba executable and an envs/analysis environment.
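Before moving on, it can help to confirm that the setup step produced the two items named above. The following is a minimal sketch, not part of the repo: the function name check_mm_setup is hypothetical, and it only tests that the paths exist, not that the environment is complete.

```shell
#!/usr/bin/env bash
# Sanity check for the self-contained Micromamba install.
# Assumption: run from the root of the cloned repo; check_mm_setup is a
# hypothetical helper, not a script shipped with the repository.

# check_mm_setup [ROOT]: succeed only if ROOT/bin/micromamba is executable
# and ROOT/envs/analysis is a directory. ROOT defaults to ./Scripts/mm.
check_mm_setup() {
    local root="${1:-./Scripts/mm}"
    if [ ! -x "$root/bin/micromamba" ]; then
        echo "missing executable: $root/bin/micromamba" >&2
        return 1
    fi
    if [ ! -d "$root/envs/analysis" ]; then
        echo "missing environment: $root/envs/analysis" >&2
        return 1
    fi
    echo "ok: $root"
}
```

Calling check_mm_setup with no arguments checks ./Scripts/mm; a non-zero exit status suggests mmSetup.sbatch did not finish.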

Get needed data

The directory ./RawData/ contains a sampleMeta.tsv file that describes the samples used in the analysis, which correspond to the samples uploaded to GEO as GSE289043.

The gene expression matrix file GSE289043_salmon_gene_matrix.tsv.gz must be downloaded, un-gzipped, and saved to the ./RawData/ directory.

The TPM (transcripts per million) version of the gene expression matrix file GSE289043_salmon_gene_matrix_tpm.tsv.gz must be downloaded, un-gzipped, and saved to the ./RawData/ directory.
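The un-gzip and placement steps for the two expression matrices can be sketched as below. This is an illustration under assumptions: unpack_gz and check_raw_data are hypothetical helper names, the archives are assumed to have been downloaded already, and the un-gzipped file names are assumed to be the archive names with the .gz suffix removed.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for placing the GEO matrices in ./RawData.
# Assumption: the .tsv.gz archives were already downloaded by hand.

# unpack_gz SRC.gz DEST_DIR: write the decompressed file into DEST_DIR,
# leaving the original archive untouched, and print the output path.
unpack_gz() {
    local src="$1" dest_dir="$2"
    local out="$dest_dir/$(basename "${src%.gz}")"
    mkdir -p "$dest_dir"
    gzip -dc "$src" > "$out"
    echo "$out"
}

# check_raw_data [DIR]: report any expected file that is missing or empty.
# DIR defaults to ./RawData. File names other than sampleMeta.tsv are
# assumed from the archive names.
check_raw_data() {
    local dir="${1:-./RawData}" missing=0 f
    for f in sampleMeta.tsv \
             GSE289043_salmon_gene_matrix.tsv \
             GSE289043_salmon_gene_matrix_tpm.tsv; do
        if [ ! -s "$dir/$f" ]; then
            echo "missing: $dir/$f" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

For example, unpack_gz ~/Downloads/GSE289043_salmon_gene_matrix.tsv.gz ./RawData followed by check_raw_data would place one matrix and then list whatever is still missing (the download location here is only an example).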

The c2 (curated), c5 (ontology, including GO) and h (hallmark) MSigDB gene set files with HUGO symbols must be downloaded and saved to the ./RawData/ directory. The v2023.1 versions of these files can be downloaded from the MSigDB archive.

Run the analysis

Scripts can be knit individually in order, or all of them can be run by executing the analysis script ./Scripts/runAnalysis.sbatch.

To run locally (from the base dir of the cloned repo):

./Scripts/runAnalysis.sbatch

To submit to a SLURM cluster (specifying the partition to run on as <PART>):

sbatch --partition <PART> ./Scripts/runAnalysis.sbatch

Note: To avoid overwriting previous runs, the working directory (./Working) and the results directory (./Results) must be missing or empty. Rename or delete them to rerun.
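The missing-or-empty requirement can be checked before launching a run. The sketch below is an assumption-labeled illustration, not the check that runAnalysis.sbatch itself performs; dir_missing_or_empty and preflight are hypothetical names.

```shell
#!/usr/bin/env bash
# Pre-flight check for the output directories.
# Assumption: run from the root of the cloned repo; these helpers are
# illustrative and are not shipped with the repository.

# dir_missing_or_empty DIR: succeed if DIR does not exist, or is a
# directory with no entries (hidden files count as entries).
dir_missing_or_empty() {
    local dir="$1"
    if [ ! -e "$dir" ]; then
        return 0
    fi
    if [ ! -d "$dir" ]; then
        return 1
    fi
    [ -z "$(ls -A "$dir")" ]
}

# preflight [BASE]: check BASE/Working and BASE/Results; BASE defaults to
# the current directory. Returns non-zero if either holds previous output.
preflight() {
    local base="${1:-.}" d rc=0
    for d in Working Results; do
        if ! dir_missing_or_empty "$base/$d"; then
            echo "not empty: $base/$d (rename or delete it before rerunning)" >&2
            rc=1
        fi
    done
    return "$rc"
}
```

One way to use it is preflight && ./Scripts/runAnalysis.sbatch, so the analysis only starts when both directories are safe to write to.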

Results

By default, the *.html output of each Rmd script is written to the same directory as the script. Output data files and important plots are saved to subdirectories of the ./Results directory. Intermediate files are saved to subdirectories of the ./Working directory.

About

Code used with the paper "Activating NRF2E79Q mutation alters the differentiation of human non-small cell lung cancer".
