A course in the theory and practice of phylogenetic inference from DNA sequence data. Students will learn all the necessary components of state-of-the-art phylogenomic analyses and apply that knowledge to the analysis of data from their own organisms.
- Spring 2022: Wednesday and Friday 2:30-3:45pm (Russell A228)
- Instructor: Claudia Solis-Lemus, PhD
- Email: solislemus@wisc.edu
- Website: https://solislemuslab.github.io/
- Office hours: Wednesday 3:45-4:30pm, or by appointment
By the end of the course, you will be able to:
- Explain in detail all the steps in the pipeline for phylogenetic inference and how different data and model choices affect the inference outcomes
- Plan and produce reproducible scripts for the analysis of your own biological data
- Justify the data and model choices in your own data analysis
- Interpret the results of the most widely used phylogenetic methods in biological terms
- Orally present the results of your own phylogenomic data analyses based on the best scientific and reproducibility practices
- Phylogenetics in the Genomic Era (open access book) by Celine Scornavacca, Frederic Delsuc and Nicolas Galtier (denoted HAL in the schedule)
- Tree thinking: an introduction to phylogenetic biology by David Baum and Stacey Smith (optional: denoted Baum in the schedule)
- The Phylogenetic Handbook by Philippe Lemey, Marco Salemi and Anne-Mieke Vandamme (optional: denoted HB in the schedule)
- The full list of papers used in this class can be found in this link
| Session | Topic | Pre-class work | At the end of the session | Lecture notes | Homework | HW due |
|---|---|---|---|---|---|---|
| 01/26 | Introduction | | You will know the structure of the class, the learning outcomes and the grading | lecture1.md | Go over ready-for-class checklist | |
| 01/28 | Motivation: why learn phylogenomics? | Read HAL 2.1 | You will identify the different components in phylogenomic analyses | lecture2.md | Read HAL 2.1, do canvas quiz, and read Jermiin2020 | 01/28 |
| 02/02 | Reproducibility crash course | Review shell resources and do canvas quiz | You will prioritize reproducibility and good computing practices throughout the semester (and beyond) | lecture3.md | ||
| 02/04 | Continue with reproducibility | Have git installed | | | Reproducibility HW | 02/09 |
| 02/09 | Introduction to sequences | Watch video1, video2, and do canvas quiz | You will be able to describe the next-generation sequencing pipeline (and UCE pipeline) as well as the post-processing bioinformatics steps for quality control | lecture4.md | Sequencing HW | 02/18 |
| 02/11 | Alignment | | You will be able to explain the most widely used algorithms for multiple sequence alignment | lecture5.md | Needleman-Wunsch HW and canvas quiz | 02/23 |
| 02/16 | Continue with alignment | One paper assigned per student: 1) ClustalW, 2) MUSCLE, 3) T-Coffee | | lecture5-2.md | 1) Read Alignathon paper; 2) Choose and run an alignment method on your data (github commit) | 03/02 |
| 02/18 | Filtering and Orthology detection | HAL 2.2; Optional HAL 2.4 | You will know about the different filtering and orthology inference methods | lecture6.md | 1) Read Nichio2017; 2) Choose one orthology detection method, read its paper and run it on your data (git commit) | 03/09 |
| 02/23 | Overview of phylogenetic inference | | You will be able to explain the overall methodology of phylogenetic inference as well as the main weaknesses | lecture7.pdf | ||
| 02/25 | Distance and parsimony methods | Install R and optional readings: HB Ch 5-6, Baum Ch 7-8 | You will be able to explain both algorithms to reconstruct trees: 1) based on distances and 2) based on parsimony | lecture8.md | ||
| 03/02 | Continue with distance and parsimony methods | | | | Run distance and parsimony methods on your own data | 03/23 |
| 03/04 | Models of evolution | HAL 1.1 and canvas quiz | You will be able to explain the main characteristics and assumptions of the substitution models | lecture9.pdf | ||
| 03/09 | Continue with models of evolution | |||||
| 03/11 | Maximum likelihood | HAL 1.2 and canvas quiz | You will be able to explain the main steps in maximum likelihood inference and the strengths and weaknesses of the approach | lecture10.pdf | ||
| 03/16 | Spring break | |||||
| 03/18 | Spring break | |||||
| 03/23 | Continue maximum likelihood | Two papers assigned per student: 1) IQ-Tree papers: one, two; 2) RAxML papers: one, two | | | Choose an ML method to run on your own data | 04/06 |
| 03/25 | Bayesian inference | HAL 1.4 and canvas quiz | You will be able to explain the main components of Bayesian inference and their effect on the inference performance | lecture12.pdf | Read Nascimento et al, 2017 and quiz | |
| 03/30 | Continue Bayesian inference | Read YangRannala1997 | ||||
| 04/01 | Continue Bayesian inference | Read MrBayes papers: one, two, three | | | Run MrBayes on your own data | 04/15 |
| 04/06 | Model selection: Guest lecture by Rob Lanfear | |||||
| 04/08 | The coalescent | HAL 3.1 and quiz, 3.3 and quiz | You will be able to explain the coalescent model for species trees and networks | lecture14.pdf | ||
| 04/13 | Continue with the coalescent | One paper per student: ASTRAL or BUCKy | | | Run ASTRAL or BUCKy on your own data | 04/27 |
| 04/15 | Continue with the coalescent | SNaQ chapter and quiz | ||||
| 04/20 | Co-estimation methods | Optional reading: HB 18 | You will be able to explain the main components of co-estimation methods and follow the BEAST tutorial | lecture15.md | ||
| 04/22 | Continue with co-estimation methods | Read BEAST papers: one, two | ||||
| 04/27 | Discussion: Measures of support | One per group: 1) Stenz2015, 2) Lemoine2018, 3) Anisimova2006, 4) Sayyari2016 | You will be able to compare and contrast the different ways in which we can measure confidence in our phylogenetic estimates | Slides | ||
| 04/29 | Discussion: Coalescent vs concatenation | All: HAL 3.4. One per group: 1) Springer2018, 2) Mendes2018, 3) Philippe2017, 4) Springer2016, 5) Edwards2016 | You will be able to justify the choice of concatenation vs coalescent in specific scenarios | Slides | ||
| 05/04 | Discussion: Phylogenomics pitfalls | One per group: 1) Bravo2019, 2) Shen2017, 3) Young2020, 4) Steel2005 | You will be able to describe and analyze some of the main pitfalls of phylogenomic analysis of big data | Slides | ||
| 05/06 | What else is out there? | Read Jermiin2020 | You will hear a brief overview of topics not covered in this class and will have access to resources to learn more | lecture16.md | ||
| 05/09 | Final project due | |||||
| 05/11 | Project presentations | |||||
| 05/13 | Project presentations | |||||
See list of topics, grading and academic policies in the syllabus