BioSeq Program Administrator: Matt Fierman (bioseq@tufts.edu)
Author: Philip Braunstein (pbraunstein12@gmail.com)
This pipeline analyses the results of the Metagenomics Workflow from the Illumina MiSeq.
Put all of the Classification files generated by the MiSeq in the data folder. The program uses every file in this folder whose name starts with "Classification", so make sure no unrelated files with that prefix are in the data folder.
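The prefix rule above can be illustrated with a minimal sketch. This is not the code in run.py; the function name, the default folder name "data", and the choice to skip subdirectories are assumptions made for illustration.

```python
import os

def find_classification_files(data_dir="data", prefix="Classification"):
    """Return sorted paths of regular files in data_dir whose names start with prefix.

    Hypothetical helper: the real pipeline may select its inputs differently.
    """
    return sorted(
        os.path.join(data_dir, name)
        for name in os.listdir(data_dir)
        # Keep only regular files so a stray directory with the prefix is ignored
        if name.startswith(prefix) and os.path.isfile(os.path.join(data_dir, name))
    )
```

Listing the matches this way before a run is a quick check that nothing unexpected in the data folder shares the "Classification" prefix.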
To run the pipeline, execute ./run.py (very simple).
For each run, the run.py script creates a UUID that serves as the runId. This runId is included in the name of every output file so that you know which output files were generated by which run. All of the output files are placed in the output directory. The output files are:
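As a rough sketch of the naming scheme, the snippet below builds output paths of the form [name]-[runId].[extension]. The helper names, the use of uuid4, and the folder name "output" are assumptions; run.py may generate and format its runId differently.

```python
import os
import uuid

def new_run_id():
    """Create a fresh identifier for one pipeline run (uuid4 is an assumption)."""
    return str(uuid.uuid4())

def output_path(stem, run_id, ext="txt", output_dir="output"):
    """Build a path such as output/otutable-<runId>.txt (hypothetical helper)."""
    return os.path.join(output_dir, "{}-{}.{}".format(stem, run_id, ext))

if __name__ == "__main__":
    run_id = new_run_id()
    # Every file from the same run shares the same runId in its name
    print(output_path("otutable", run_id))
    print(output_path("clustered-Genus", run_id, ext="pdf"))
```

Because the runId is shared by every file from one run, sorting or filtering the output directory by runId groups each run's results together.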
- otutable-[runId].txt: OTU table from all the classification files
- otutable-random-[runId].txt: OTU table from above with the sample IDs randomized to protect the identity of the participants
- secretMap-[runId].txt: Mapping between the original sample IDs and the randomly generated IDs
- aggregateTable_Genus-[runId].txt: Amount of each microbe in each sample, used to generate the distance matrix
- distMatrix-Genus-[runId].txt: Distance matrix of the samples from one another
- clustered-Genus-[runId].pdf: Clustered tree of the samples showing the similarity between them
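The relationship between the randomized OTU table and the secret map can be sketched as follows. This is an illustration only: the function names and the format of the random IDs are assumptions, not the pipeline's actual implementation.

```python
import uuid

def randomize_sample_ids(sample_ids):
    """Map each original sample ID to a freshly generated random ID.

    Hypothetical helper; the ID format ("S-" plus a uuid4 hex string) is an assumption.
    """
    return {sample_id: "S-" + uuid.uuid4().hex for sample_id in sample_ids}

def relabel_header(header, secret_map):
    """Replace sample IDs in a table header, leaving other columns unchanged."""
    return [secret_map.get(column, column) for column in header]

if __name__ == "__main__":
    # Example header with made-up sample names
    header = ["OTU", "sampleA", "sampleB"]
    secret_map = randomize_sample_ids(header[1:])
    print(relabel_header(header, secret_map))
```

Only the mapping links the random IDs back to the participants, so a file like secretMap-[runId].txt should be stored separately from any table that is shared.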
We recommend that you use the OTU table generated by this pipeline in PICRUSt and then LEfSe to learn which functions differ between the metagenomic communities. Both of these programs can be accessed on Galaxy.
Questions? Comments? Suggestions? Don't hesitate to contact us!