Welcome to my FishellLab Research Intern Application repository. It contains a Jupyter notebook that identifies cell clusters and their marker genes in a biological dataset. The project was submitted in October 2023 as part of an application to join the Fishell Laboratory.
The notebook implements an analysis pipeline for single-cell RNA-seq data that combines bioinformatics tools with machine learning algorithms. It focuses on unsupervised clustering of cell populations, followed by identification of the marker genes that characterize each cluster.
- Data Preprocessing: Clean and prepare your single-cell RNA-seq data for analysis.
- Clustering Analysis: Implement clustering algorithms to detect distinct cell populations.
- Marker Gene Identification: Discover marker genes for each cluster using differential expression analysis.
- Visualization Tools: Explore the data with interactive plots and graphs to interpret the results visually.
- Reproducibility: All steps from raw data processing to final analysis are contained within the Jupyter notebook for full reproducibility.
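
The steps listed above can be illustrated with a minimal sketch. This is not the notebook's code: it runs on simulated counts, swaps in k-means for whatever clustering method the notebook actually uses, and ranks marker genes with a one-sided Mann-Whitney U test as a stand-in for the notebook's differential expression analysis. All function names (`make_toy_counts`, `preprocess`, `cluster_cells`, `top_markers`) and parameter values are hypothetical.

```python
import numpy as np
from scipy.stats import mannwhitneyu
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA


def make_toy_counts(n_cells_per_group=50, n_genes=30, seed=0):
    """Simulate a cells x genes count matrix with two groups (hypothetical data)."""
    rng = np.random.default_rng(seed)
    counts = rng.poisson(5.0, size=(2 * n_cells_per_group, n_genes)).astype(float)
    # Genes 0-4 are strongly up in the first group, genes 5-9 in the second.
    counts[:n_cells_per_group, 0:5] += rng.poisson(30.0, size=(n_cells_per_group, 5))
    counts[n_cells_per_group:, 5:10] += rng.poisson(30.0, size=(n_cells_per_group, 5))
    return counts


def preprocess(counts, target_sum=1e4):
    """Scale each cell to the same total count, then log-transform."""
    totals = counts.sum(axis=1, keepdims=True)
    return np.log1p(counts / totals * target_sum)


def cluster_cells(log_expr, n_clusters=2, n_components=10, seed=0):
    """Reduce dimensionality with PCA, then cluster cells with k-means."""
    embedding = PCA(n_components=n_components, random_state=seed).fit_transform(log_expr)
    model = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed)
    return model.fit_predict(embedding)


def top_markers(log_expr, labels, n_top=5):
    """Rank genes per cluster by a one-sided Mann-Whitney U test (cluster vs. rest)."""
    markers = {}
    for cluster in np.unique(labels):
        in_cluster = log_expr[labels == cluster]
        rest = log_expr[labels != cluster]
        pvals = np.array([
            mannwhitneyu(in_cluster[:, g], rest[:, g], alternative="greater").pvalue
            for g in range(log_expr.shape[1])
        ])
        # Smallest p-values first: genes most clearly higher in this cluster.
        markers[int(cluster)] = np.argsort(pvals)[:n_top].tolist()
    return markers


counts = make_toy_counts()
log_expr = preprocess(counts)
labels = cluster_cells(log_expr)
markers = top_markers(log_expr, labels)
print(markers)
```

On real data, a dedicated single-cell library such as scanpy would typically replace these hand-written steps, and the number of clusters would not be known in advance.
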
To get started, you need a Python environment that can run Jupyter notebooks.
Before you run the notebook, make sure you have the following packages installed:
- numpy
- pandas
- scikit-learn
- scipy
- matplotlib
- seaborn
- scanpy (specifically for single-cell analysis)
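
One way to confirm that the packages above are available before opening the notebook is a short check with the standard library. This is a sketch, not part of the repository: the `REQUIRED` mapping and the `missing_packages` helper are hypothetical names. Note that scikit-learn is installed as `scikit-learn` but imported as `sklearn`.

```python
import importlib.util

# Maps each pip package name to the module name used in `import` statements.
REQUIRED = {
    "numpy": "numpy",
    "pandas": "pandas",
    "scikit-learn": "sklearn",
    "scipy": "scipy",
    "matplotlib": "matplotlib",
    "seaborn": "seaborn",
    "scanpy": "scanpy",
}


def missing_packages(required=REQUIRED):
    """Return the pip names of packages whose modules cannot be found."""
    return [
        pip_name
        for pip_name, module in required.items()
        if importlib.util.find_spec(module) is None
    ]


missing = missing_packages()
if missing:
    print("Missing packages; install with: pip install " + " ".join(missing))
else:
    print("All required packages are importable.")
```
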