flexycode/BIOF-102

🧬 BIOF-102: Bioinformatics for Beginners - RNA-Seq

License: MIT

📖 Course Description

Bioinformatics integrates biology, statistics, and computer science to develop and apply theory, methods, and tools for the collection, storage, and analysis of biological and related data. Some key application areas in bioinformatics include:

  • 🧬 Genomic and molecular analysis
  • 💊 Drug discovery and development
  • 🩺 Medical diagnosis and treatment
  • 🌾 Agricultural biotechnology
  • 🌍 Environmental monitoring

The National Cancer Institute (NCI) uses bioinformatics extensively in its research efforts to combat cancer, including research on the "origin, evolution, progression, and treatment of cancer".

This course teaches the basic skills needed for bioinformatics, including working on the Unix command line, and focuses primarily on RNA-Seq analysis. All steps of the RNA-Seq workflow, from raw data to differential expression and gene ontology analysis, are covered. Many of the skills learned are foundational to most bioinformatics analyses and can be applied to other types of next-generation sequencing experiments.


🚀 Why learn bioinformatics?

Here are a few compelling reasons to explore the world of bioinformatics:

  • 📊 Analyze your data: Empower yourself to delve into your own biological data, gaining valuable insights.
  • 🔬 Enhancing Scientific Skills: Broaden your scientific knowledge and skills by mastering bioinformatics tools and techniques. By understanding the principles involved with data collection and analysis, you'll be better equipped to design robust experiments and interpret their results effectively.
  • 💼 Career Opportunities: Open doors to exciting career paths in the rapidly growing field of bioinformatics.
  • 🤝 Understand the Data Landscape: Gain a deeper appreciation for how others analyze biological data, fostering collaboration and critical thinking.

📚 Modules Overview

Module 1 lessons focus on developing command line skills, getting started and working on Biowulf (the NIH HPC cluster), and downloading data from NCBI.

  • Lesson 1 - What is Biowulf?
  • Lesson 2 - Navigating file systems with Unix
  • Lesson 3 - Useful Unix
  • Lesson 4 - Working on Biowulf
  • Lesson 5 - Downloading data from the SRA
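
As a rough illustration of the kind of file system navigation covered in Lessons 2 and 3, the sketch below creates a throwaway directory tree and moves around in it. The directory and file names (`project`, `data`, `results`, `sample_1.fastq`, `sample_2.fastq`) are hypothetical and are not part of the course data; the lessons themselves may use different commands and layouts.

```shell
# Work in a temporary directory so no real files are touched.
workdir="$(mktemp -d)"
cd "$workdir"

# Create a small, hypothetical project layout.
mkdir -p project/data project/results

# Move into the data directory and create two empty placeholder files.
cd project/data
touch sample_1.fastq sample_2.fastq

pwd   # print the absolute path of the current directory
ls    # list the files in the current directory

# Move up one level and count the files in data/.
cd ..
n_files="$(ls data | wc -l | tr -d ' ')"
echo "files in data: $n_files"
```

Relative paths such as `..` and `data` are resolved from the current directory, which is why `pwd` is a useful check before running commands that create or delete files.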

Module 2 lessons focus on RNA-Seq analysis, including experimental design and best practices, quality control, trimming, alignment-based methods, feature counts, differential expression analysis, and biological interpretation.

  • Lesson 6 - Introduction to RNA-Seq
  • Lesson 7 - Introduction to NGS Data and Quality Control
  • Lesson 8 - Cleaning and Preparing NGS Data for Downstream Analysis
  • Lesson 9 - Aligning NGS Data to Genome
  • Lesson 10 - Quantifying Gene Expression from Bulk RNA Sequencing Data
  • Lesson 11 - Visualizing Genomic Data: Preparing Files
  • Lesson 12 - Visualizing Genomic Data with the IGV
  • Lesson 13 - Differential Expression Analysis for Bulk RNA Sequencing: QC
  • Lesson 14 - Differential Expression Analysis for Bulk RNA Sequencing: The Actual Analysis
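
As a small taste of working with raw NGS data on the command line, the sketch below counts the reads in a FASTQ file. It assumes the common four-lines-per-read FASTQ layout (header, sequence, `+` separator, quality string) with no wrapped sequence lines. The two reads are invented toy data, and `count_reads` is a hypothetical helper name, not something taken from the course materials.

```shell
# Write a tiny, made-up FASTQ file with two reads to a temporary location.
toy_fastq="$(mktemp)"
cat > "$toy_fastq" <<'EOF'
@read_1
ACGTACGT
+
IIIIIIII
@read_2
TTGGCCAA
+
IIIIHHHH
EOF

# Hypothetical helper: each read occupies exactly four lines,
# so the read count is the total line count divided by four.
count_reads() {
  awk 'END { print NR / 4 }' "$1"
}

count_reads "$toy_fastq"   # prints 2
```

A line count that is not a multiple of four would give a fractional result here, which is a quick hint that a file is truncated or not in the assumed layout.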

Module 3 lessons focus on gene ontology and pathway analysis.

  • Lesson 15 - Introduction to gene ontology and pathway analysis
  • Lesson 16 - Functional enrichment with DAVID
  • Lesson 17 - Pathway Analysis with Reactome

✅ Course Requirements & Prerequisites

Who can take this course? There are no prerequisites to take this course. This course is open to NCI-CCR researchers interested in learning bioinformatics skills, especially those relevant to analyzing bulk RNA sequencing data.

How will we work through lesson content? For the hands-on sessions, participants will use Biowulf student accounts. To sign up for a student account, click here. Student accounts are only available to course registrants.

Class Usage:


💾 Class Data

Below are the links for the class data in case participants would like to practice outside of and after this course series. There is no need to download these for this course as the instructors have made them available on Biowulf.

Module 1

You can find compressed Module 1 data here. Download the data and unzip.

unzip module_1.zip
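
If you prefer to inspect the archive before extracting it, one option is a small wrapper like the sketch below. `extract_module` is a hypothetical helper name, and extracting into a directory named after the archive is an assumption made here for tidiness; the plain `unzip module_1.zip` command above works just as well.

```shell
# Hypothetical helper: list an archive's contents, then extract it
# into its own directory instead of the current one.
extract_module() {
  archive="$1"                    # e.g. module_1.zip
  dest="${2:-${archive%.zip}}"    # default: archive name without .zip

  # Stop early with a clear message if the archive is missing.
  if [ ! -f "$archive" ]; then
    echo "archive not found: $archive" >&2
    return 1
  fi

  unzip -l "$archive"             # list the contents without extracting
  unzip -q "$archive" -d "$dest"  # extract quietly into $dest
}

# Example usage (assumes module_1.zip has already been downloaded):
#   extract_module module_1.zip
```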

Module 2

All Module 2 data were obtained from the Griffith lab RNA sequencing tutorial and renamed for this course series.


🔗 Reference

https://bioinformatics.ccr.cancer.gov/docs/bioinformatics-for-beginners-2025/


📂 Folder Structure

BIOF-102
├── Module 1 - Unix and Biowulf
│   ├── Lesson 1 - What is Biowulf
│   ├── Lesson 2 - Navigating file systems with Unix
│   ├── Lesson 3 - Useful Unix
│   ├── Lesson 4 - Working on Biowulf
│   └── Lesson 5 - Downloading data from the SRA
├── Module 2 - RNA-Seq Analysis
│   ├── Lesson 6 - Introduction to RNA-Seq
│   ├── Lesson 7 - Introduction to Next Generation Sequencing (NGS) Data and Quality Control
│   ├── Lesson 8 - Cleaning and Preparing Next Generation Sequencing (NGS) Data for Downstream Analysis
│   ├── Lesson 9 - Aligning Next Generation Sequencing (NGS) Data to Genome
│   ├── Lesson 10 - Quantifying Gene Expression from Bulk RNA Sequencing Data
│   ├── Lesson 11 - Visualizing Genomic Data - Preparing Files
│   ├── Lesson 12 - Visualizing Genomic Data with the Integrative Genomics Viewer
│   ├── Lesson 13 - Differential Expression Analysis for Bulk RNA Sequencing - QC
│   └── Lesson 14 - Differential Expression Analysis for Bulk RNA Sequencing - The Actual Analysis
└── Module 3 - Pathway Analysis
    ├── Lesson 15 - Introduction to gene ontology and pathway analysis
    ├── Lesson 16 - Functional enrichment with DAVID
    └── Lesson 17 - Pathway Analysis with Reactome

👀 Author

Flexycode

Thanks for visiting! ❤️
