Lessons and exercises for the STEMinist R Workshop. This ~10-hour introductory R course was developed for UC Berkeley STEMinist, a free data science mentorship and mini-bootcamp program for women and students of color. The STEMinist R Workshop has been taught 8+ times at UC Berkeley since 2017 and was brought to the UC Davis campus in 2018. These R lessons have also been taught annually since 2018 as part of the Evolution and Ecology Graduate Admission Pathways (EEGAP) program, a summer research experience at UC Davis for Howard University undergraduate students.
Erin Calfee*, Serena Caplins, Melissa Kardish*, Kristin Lee*, Kelsey Lyberger, Michelle Stitzer
*Course developers
This course is designed as a 3-day introduction to R and most of the content focuses on manipulating data and plotting, with more advanced topics (apply, for loops) covered on the 3rd day. All lessons are taught in base R, but exercises could be completed using other common packages (e.g. tidyverse).
Files:
- Lessons (blank lessons as we'd distribute to students and example annotated copies)
- Short Exercises (completed by students between short lessons)
- Challenge Problems (mini-projects introducing more advanced skills and packages for plotting, simulations, and geospatial analysis)
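As a rough illustration of the day-3 topics named in the course description (for loops and the apply family), here is a minimal base R sketch. It is not taken from the lesson files. It uses the built-in `ChickWeight` data, and the variable names (`diets`, `mean_weight_loop`, `mean_weight_apply`) are made up for this example.

```r
# Illustrative sketch only; not copied from the course lessons.
# ChickWeight ships with base R in the 'datasets' package.
diets <- levels(ChickWeight$Diet)

# 1) for loop: create a named result vector, then fill it one diet at a time
mean_weight_loop <- numeric(length(diets))
names(mean_weight_loop) <- diets
for (d in diets) {
  mean_weight_loop[d] <- mean(ChickWeight$weight[ChickWeight$Diet == d])
}

# 2) apply family: the same calculation written with sapply()
mean_weight_apply <- sapply(diets, function(d) {
  mean(ChickWeight$weight[ChickWeight$Diet == d])
})

# Both approaches should give the same named numeric vector
all.equal(mean_weight_loop, mean_weight_apply)
```

Writing the loop first and then the `sapply()` version side by side is one way to show that both compute the same result.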
We recommend instructors 'live-code' in front of the class along with students during the lessons. This way, instructors can demonstrate how to identify and quickly recover from typos and small coding mistakes, an important part of learning how to code. Short 10-15 min lesson chunks are interspersed with exercises, and half or more of the lesson time should be reserved for students to complete these exercise sets, ideally with multiple teaching assistants to help students while they work. Annotated files show the completed file after a full lesson. No homework was assigned, and for the second half of the last day students chose one of 3 challenge problems to work on.
Participants were all interested in data science and started the course with a range of coding experience (from undergraduate students with no coding experience to graduate students using Python). We structured the course for mixed abilities by including additional optional harder exercises and by giving students optional 'hints' files for the harder challenge projects at the end of the course.
Thank you to all of the amazing STEMinist participants for your enthusiasm and hard work! Diana Lizarraga and CalNERDS created a stellar supportive learning environment and recruited students for this workshop. Kyle Christie and Marshall McMunn generously shared R exercises from their own graduate student R workshop at UC Davis for re-use in these lessons.
Sleep Dataset. Allison, Truett and Cicchetti, Domenic V. (1976), "Sleep in Mammals: Ecological and Constitutional Correlates", Science, November 12, vol. 194, pp. 732-734. http://lib.stat.cmu.edu/datasets/sleep (sleep.csv)
Crabs Dataset from R MASS package. Campbell, N.A. and Mahon, R.J. (1974) "A multivariate study of variation in two species of rock crab of genus Leptograpsus", Australian Journal of Zoology, 22, 417–425. https://stat.ethz.ch/R-manual/R-devel/library/MASS/html/crabs.html (crabs.csv)
Chick Weight Dataset from R. https://www.rdocumentation.org/packages/datasets/versions/3.6.2/topics/ChickWeight (ChickWeight.csv)
States Dataset from R. U.S. Department of Commerce, Bureau of the Census (1977) Statistical Abstract of the United States. https://stat.ethz.ch/R-manual/R-devel/library/datasets/html/state.html (datasets::state, e.g. state.x77)
Cereals Dataset from R MASS package. https://stat.ethz.ch/R-manual/R-devel/library/MASS/html/UScereal.html. Original Data http://lib.stat.cmu.edu/datasets/1993.expo/. (MASS::UScereal)
Earthquake Data from USGS. https://earthquake.usgs.gov/earthquakes/search/ (Earthquake Challenge)
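Several of the datasets listed above ship with R itself, so they can be inspected without downloading anything. The sketch below assumes the MASS package is installed (it is bundled with standard R distributions) and that, if you want to read the CSV copy, `sleep.csv` sits in your working directory. The name `sleep_data` is a placeholder chosen for this example, since base R already has an unrelated built-in dataset called `sleep`.

```r
# Built-in datasets: 'datasets' is attached by default; MASS is accessed with ::
crabs  <- MASS::crabs       # Leptograpsus crab morphology
cereal <- MASS::UScereal    # note the capitalization: UScereal

str(crabs)
head(ChickWeight)           # chick weights over time, by diet
head(state.x77)             # a numeric matrix, one row per U.S. state
head(cereal)

# CSV copy distributed with the course (assumed path: working directory)
if (file.exists("sleep.csv")) {
  sleep_data <- read.csv("sleep.csv")
  str(sleep_data)
}
```

Note that `state.x77` is a matrix, not a data frame, so convert it with `as.data.frame(state.x77)` if you want to use `$` to pull out columns.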
Attribution-NonCommercial 4.0 International (CC BY-NC 4.0). You may teach, build and share from this course content for non-commercial purposes as long as you give appropriate credit (e.g. a link to this GitHub repository!).