BioSeq/Metgenomics-Analysis

Metagenomics Analysis

BioSeq Group, Tufts University

BioSeq Program Administrator: Matt Fierman (bioseq@tufts.edu)
Author: Philip Braunstein (pbraunstein12@gmail.com)

Introduction

This pipeline analyzes the results from the Metagenomics Workflow on the Illumina MiSeq.

Setup

Put all of the Classification files generated by the MiSeq in the data folder. The program uses every file in this folder whose name starts with "Classification", so make sure no unrelated files in the data folder share that prefix.
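
To check which files would be picked up before a run, a minimal sketch like the following may help. It is not taken from run.py: the function name `find_classification_files` and its defaults are hypothetical, and it only assumes the prefix rule described above.

```python
from pathlib import Path


def find_classification_files(data_dir="data", prefix="Classification"):
    """Return the sorted regular files in data_dir whose names start with prefix.

    Hypothetical helper: it mirrors the selection rule described in this
    README, not necessarily the exact logic inside run.py.
    """
    return sorted(
        path
        for path in Path(data_dir).iterdir()
        if path.is_file() and path.name.startswith(prefix)
    )
```

Calling `find_classification_files()` from the repository root and printing each `path.name` lists the inputs the prefix rule would select, which makes stray files easy to spot.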

Use

./run.py (very simple)

Output

For each run, the run.py script creates a UUID that serves as the runId. This runId is included in the name of every output file so that you know which output files were generated by which run. All of the output files are placed in the output directory. These are the output files:

  • otutable-[runId].txt: OTU Table from all the classification files
  • otutable-random-[runId].txt: OTU table from above with the sample-Ids randomized to protect the identity of the participants
  • secretMap-[runId].txt: Mapping between the original sample-Ids and the randomly generated Ids
  • aggregateTable_Genus-[runId].txt: Amount of each microbe in each sample - used in generation of distance matrix
  • distMatrix-Genus-[runId].txt: Distance matrix of samples from one another
  • clustered-Genus-[runId].pdf: Clustered tree of samples showing similarity between them
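
The naming scheme and the sample-Id randomization listed above can be sketched roughly as follows. This is an illustration under assumptions, not the code in run.py: the helper names are hypothetical, `uuid.uuid4` is assumed as the UUID variant, and the format of the random Ids (8 hex characters from `secrets.token_hex`) is invented for the example.

```python
import secrets
import uuid


def make_run_id():
    """Create a unique identifier for one run (assumes a version-4 UUID)."""
    return str(uuid.uuid4())


def output_name(stem, run_id, ext="txt"):
    """Build an output file name of the form stem-[runId].ext."""
    return f"{stem}-{run_id}.{ext}"


def randomize_sample_ids(sample_ids):
    """Map each original sample Id to a distinct random Id.

    The returned dict plays the role of the secretMap file; the random Id
    format is an assumption made for this sketch.
    """
    secret_map = {}
    used = set()
    for sample_id in sample_ids:
        new_id = secrets.token_hex(4)
        while new_id in used:  # retry on the rare collision
            new_id = secrets.token_hex(4)
        used.add(new_id)
        secret_map[sample_id] = new_id
    return secret_map
```

For example, `output_name("otutable", make_run_id())` gives a name shaped like otutable-[runId].txt, and `randomize_sample_ids(["sampleA", "sampleB"])` returns a mapping that would need to be stored separately from the randomized table for the anonymization to be meaningful.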

Further Analysis

We recommend that you use the OTU table generated by this pipeline in PICRUSt and then LEfSe to learn which functions differ between the metagenomic communities. Both of these programs can be accessed on Galaxy.

Questions? Comments? Suggestions? Don't hesitate to contact us!
