Quick Start

hollybik edited this page Oct 17, 2016 · 30 revisions

Phinch currently supports downstream analyses of BIOM 1.0 files in the "sparse" table format only. BIOM (Biological Observation Matrix) 1.0 is a JSON-based file format (.biom) used to represent diverse types of genomic data. The most typical user applications are environmental rRNA amplicon or shotgun metagenomic data, although any type of sample/observation data can be represented as .biom files (RNA-seq, gene variants, morphological character matrices, etc.). See below for file conversion instructions.

Note: Phinch does NOT support BIOM 2.0 files or higher (HDF5 format, the new default output format for QIIME 1.9 and higher); these files MUST be converted to JSON-formatted BIOM 1.0 files before they can be loaded into Phinch.
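
A quick way to tell which kind of .biom file you have is to inspect the file itself before loading it. The sketch below is not part of Phinch or QIIME; the function name `sniff_biom` is hypothetical. It assumes the HDF5 signature sits at byte 0 of a BIOM 2.x file, and that a JSON BIOM 1.0 file declares its layout in the top-level `matrix_type` field.

```python
import json

# Assumption: the 8-byte HDF5 signature is at the very start of the file.
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"


def sniff_biom(path):
    """Return 'hdf5', 'json-sparse', 'json-dense', or 'unknown' for a .biom file."""
    with open(path, "rb") as handle:
        head = handle.read(8)
        if head == HDF5_SIGNATURE:
            return "hdf5"  # BIOM 2.x: convert to JSON BIOM 1.0 first
        raw = head + handle.read()
    try:
        table = json.loads(raw.decode("utf-8"))
    except (UnicodeDecodeError, ValueError):
        return "unknown"
    if not isinstance(table, dict):
        return "unknown"
    matrix_type = table.get("matrix_type")
    if matrix_type in ("sparse", "dense"):
        return "json-" + matrix_type
    return "unknown"
```

Only a result of `json-sparse` matches what this page describes as loadable; `hdf5` and `json-dense` files need conversion first.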

For users having trouble with QIIME 1.9 file conversion, please troubleshoot according to this thread: https://github.com/PitchInteractiveInc/Phinch/issues/46

If your BIOM 1.0 file is correctly formatted and still not working, try the suggestions in this thread (converting to a classic OTU table and then back to BIOM with the metadata re-added): https://groups.google.com/forum/#!topic/phinch/rsIk8DCQ0VM


To prepare your files for visualization, follow these steps:

Step 1: Prepare a QIIME-style mapping file for sample metadata

Sample metadata is defined as any descriptive information about your biological samples or the environment where they were collected; you should include any type of metadata that may be useful for interpreting and analyzing patterns in your data. Some common types of sample metadata include geographic coordinates (latitude/longitude), collection date, state/country, sampling matrix (water, air, soil, sediment), etc. Mapping files can contain as much or as little sample metadata as is useful or necessary. For example, sample metadata for a human microbiome study might also include information about patient gender, body site where samples were collected, or patient age. Mapping files should be prepared according to these QIIME guidelines: http://qiime.org/documentation/file_formats.html

To label your samples in Phinch and export graphics with human-readable IDs, include a column in your metadata mapping file with the header phinchID (these entries can be the same as, or different from, the entries in the first SampleID column). The phinchID values will be pulled through into the visualizations to populate graph axes. If this column is not included, an arbitrary numerical ID will be assigned to each sample. For optimal visualization, phinchIDs should be no longer than 15 characters.

An example sample mapping file might look like this:

#SampleID BarcodeSequence LinkerPrimerSequence CollectionDate Material phinchID Description
0.SandCoralPond1.1 ACTGAAGT TATGGTAATTGTGTGCCAGCMGCCGCGGTAA 2012-11-30T11:00:00 Sand CP.Day0.Sand aquarium
0.WaterCoralPond1.1 ACTGGGG TATGGTAATTGTGTGCCAGCMGCCGCGGTAA 2012-11-30T11:00:00 Water CP.Day0.Water aquarium
0.WipesCoralPond1.1 ACTGAAAA TATGGTAATTGTGTGCCAGCMGCCGCGGTAA 2012-11-30T11:00:00 Wipes CP.Day0.Wipes aquarium
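
The phinchID guidance above can be checked before the metadata is embedded. This is a minimal sketch, not a QIIME or Phinch tool: the function name `check_phinch_ids` is hypothetical, and it assumes the mapping file is tab-delimited with the header on the first non-empty line.

```python
import csv
import io

MAX_PHINCH_ID_LENGTH = 15  # limit recommended in this guide


def check_phinch_ids(mapping_text):
    """Return a list of problems found in the phinchID column of a mapping file."""
    reader = csv.reader(io.StringIO(mapping_text), delimiter="\t")
    rows = [row for row in reader if any(cell.strip() for cell in row)]
    if not rows:
        return ["mapping file is empty"]
    header = [cell.strip() for cell in rows[0]]
    if "phinchID" not in header:
        return ["no phinchID column; numerical IDs will be assigned"]
    col = header.index("phinchID")
    problems = []
    for row in rows[1:]:
        if row[0].startswith("#"):
            continue  # skip comment lines after the header
        value = row[col].strip() if col < len(row) else ""
        if not value:
            problems.append(f"{row[0]}: phinchID is empty")
        elif len(value) > MAX_PHINCH_ID_LENGTH:
            problems.append(
                f"{row[0]}: phinchID '{value}' is longer than "
                f"{MAX_PHINCH_ID_LENGTH} characters"
            )
    return problems
```

An empty list means every sample has a phinchID within the recommended length.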

Some notes on metadata formatting:

To be detected properly, all date/time metadata must follow the MIxS standard format (more information at the Genomic Standards Consortium wiki) and be entered in a single column of your original sample metadata mapping file, as follows:

[YYYY]-[MM]-[DD]T[hh]:[mm]:[ss]-[Z]

This date format lists the year, month, and day, followed by a 24-hour timestamp with a UTC offset (Z). The timestamp and UTC offset are both optional; metadata columns can include the date only. For example, metadata for a sample collected at 2:30pm EST on May 4, 2007 would be entered as: 2007-05-04T14:30:00-05:00
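
Date columns can be checked against this pattern before upload. The following is a sketch only; the function name `is_valid_timestamp` is hypothetical, and it assumes that a UTC offset is only written when a time is also present and that the offset takes the form +hh:mm or -hh:mm.

```python
import re
from datetime import datetime

# Date, then an optional time, then an optional UTC offset (assumed +hh:mm / -hh:mm).
TIMESTAMP = re.compile(
    r"(\d{4}-\d{2}-\d{2})(?:T(\d{2}:\d{2}:\d{2})([+-]\d{2}:\d{2})?)?"
)


def is_valid_timestamp(value):
    """Return True if value looks like YYYY-MM-DD[Thh:mm:ss[+/-hh:mm]]."""
    match = TIMESTAMP.fullmatch(value)
    if match is None:
        return False
    date_part, time_part, _offset = match.groups()
    try:
        # strptime rejects impossible values such as month 13 or hour 25.
        datetime.strptime(date_part, "%Y-%m-%d")
        if time_part is not None:
            datetime.strptime(time_part, "%H:%M:%S")
    except ValueError:
        return False
    return True
```

For instance, `is_valid_timestamp("2007-05-04T14:30:00-05:00")` and `is_valid_timestamp("2007-05-04")` both pass, while a value such as `05/04/2007` does not.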

Similarly, any geographic coordinates or GPS data must be entered as decimal degrees (the format used by Google Maps, e.g. -90.017926). We recommend using separate columns labeled “Latitude” and “Longitude” in your original sample metadata mapping file, to ensure that GPS metadata is correctly detected.
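
A simple range check catches coordinates that were left in degrees/minutes/seconds or entered in the wrong column. This is an illustrative sketch with a hypothetical function name; the accepted ranges (latitude from -90 to 90, longitude from -180 to 180) are the usual decimal-degree bounds and are not taken from Phinch itself.

```python
def is_decimal_degrees(latitude, longitude):
    """Return True if both values parse as numbers within valid coordinate ranges."""
    try:
        lat = float(latitude)
        lon = float(longitude)
    except (TypeError, ValueError):
        return False  # e.g. 41°24'12.2"N is not a decimal number
    return -90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0
```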


Step 2: Prepare biological matrix data as a .biom file

BIOM files are now the default output for rRNA amplicon workflows (OTU picking) and analysis of shotgun metagenome data in the QIIME software package. Users wanting to visualize other data types should follow the File Conversion Instructions at the end of this page.

In QIIME (version 1.7 or later), users can prepare a .biom file for visualization by executing the following commands.

First, construct an OTU table:

make_otu_table.py -i final_otu_map_mc2.txt -o otu_table_mc2_w_tax.biom -t rep_set_tax_assignments.txt

Here, the input file (-i) is your OTU map (defining clusters of raw sequence reads), and the taxonomy file (-t) contains the taxonomy or gene ontology strings that correspond to each OTU.

Second, add your sample metadata to your .biom file.

All sample metadata and taxonomy/ontology information MUST be embedded in the .biom file before being uploaded into Phinch.

In QIIME version 1.8 and above this can be done using the following command:

biom add-metadata -i otu_table_mc2_w_tax.biom -o otu_table_mc2_w_tax_and_metadata.biom -m sample_metadata_mapping_file.txt

In QIIME version 1.7 or below, you can add metadata with the following command:

add_metadata.py -i otu_table_mc2_w_tax.biom -o otu_table_mc2_w_tax_and_metadata.biom -m sample_metadata_mapping_file.txt

Where your input file (-i) is your .biom file from the previous step, and your mapping file (-m) is the tab-delimited mapping file that you prepared in Step 1 (formatted according to QIIME instructions).
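
Before uploading, it can help to confirm that the metadata really was embedded. The sketch below assumes a JSON BIOM 1.0 file, in which samples are listed under `columns` and observations (OTUs) under `rows`, each entry carrying an `id` and a `metadata` field. The function name `missing_metadata` is hypothetical.

```python
import json


def missing_metadata(biom_json_text):
    """Return sample and observation IDs whose metadata is null or empty."""
    table = json.loads(biom_json_text)
    return {
        # Samples: metadata comes from the mapping file (-m).
        "samples": [c["id"] for c in table["columns"] if not c.get("metadata")],
        # Observations: metadata comes from the taxonomy file (-t).
        "observations": [r["id"] for r in table["rows"] if not r.get("metadata")],
    }
```

If either list is non-empty, the corresponding metadata step probably needs to be rerun before the file is loaded into Phinch.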


Step 3: Upload .biom file with embedded observation/sample data to http://phinch.org

For best results, we recommend using Phinch in the Google Chrome browser.


File Conversion Instructions

If you want to visualize biological data currently formatted as a tab-delimited text file (e.g. the style of OTU tables produced by older versions of QIIME, or any other type of genomic/morphological data that can be represented in matrix format), please refer to the BIOM documentation for conversion instructions. Phinch currently supports only "sparse" BIOM formats ("dense" files are much larger in size and will not load properly in the current visualization framework). Full documentation for the BIOM file format can be found at http://biom-format.org
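
To illustrate why sparse files are smaller: a dense table stores every cell, while a sparse BIOM 1.0 table stores only the non-zero cells as [row, column, value] triples in its `data` field. The sketch below shows that idea only; it is not a replacement for the BIOM conversion tools, and the function name `dense_to_sparse` is hypothetical.

```python
def dense_to_sparse(matrix):
    """Convert rows of counts into [row_index, column_index, value] triples."""
    return [
        [r, c, value]
        for r, row in enumerate(matrix)
        for c, value in enumerate(row)
        if value != 0  # zero counts are simply left out
    ]


# Two observations (rows) by three samples (columns), mostly zeros.
dense = [
    [0, 0, 5],
    [3, 0, 0],
]
print(dense_to_sparse(dense))  # [[0, 2, 5], [1, 0, 3]]
```

OTU tables are typically dominated by zeros, so the list of triples is usually much shorter than the full matrix.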
