Plot
BinaRena is an interactive scatter plot of contigs, with five display items: x-axis, y-axis, size (radius of contig), opacity (alpha value), and color.
Each display item can be changed and tweaked in the display panel. When the user moves the mouse over an item, two buttons appear: one lets the user select a data transformation method, and the other lets the user display a legend.
What data should be assigned to these five display items? Well, be creative. Many pieces of information are useful for exploring metagenomes and binning contigs. Examples include GC content and coverage, abundance profile by sample, k-mer frequency-based dimensionality reduction, taxonomic assignment, and functional capacity. BinaRena ships with multiple scripts for generating some of these. Toggle these properties to gain a better understanding of your data.
BinaRena provides interactive legends to inform the user of the (numeric) data. Move the mouse over the legend to see the original value in real time. Meanwhile, two brackets will show up at the edges of the legend. Drag them to modify the displayed range of data. The lower limit is clickable: it lets the user toggle between zero and the minimum value in the data.
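To illustrate what a displayed range means, here is a minimal Python sketch (not BinaRena's own code) of how a value might be mapped onto a legend with adjustable limits. The function names `lower_limit` and `scale_to_range` are hypothetical, and clamping values that fall outside the range is an assumption.

```python
def lower_limit(values, from_zero):
    """Pick the lower limit: zero, or the minimum value in the data."""
    return 0.0 if from_zero else min(values)


def scale_to_range(value, lower, upper):
    """Map a value to a 0..1 position within the displayed range.

    Assumption: values outside the range are clamped to its edges.
    """
    if upper == lower:
        return 0.0
    fraction = (value - lower) / (upper - lower)
    return min(1.0, max(0.0, fraction))


coverage = [2.0, 4.0, 6.0]
upper = max(coverage)

# Same contig, two choices of lower limit.
from_min = scale_to_range(4.0, lower_limit(coverage, from_zero=False), upper)
from_zero = scale_to_range(4.0, lower_limit(coverage, from_zero=True), upper)
```

Starting the range at zero compresses the visible differences between contigs, while starting at the minimum spreads them across the full scale.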
Biological data are usually highly skewed. To effectively display them, proper transformation is usually necessary. BinaRena provides various transformation methods that can be easily selected from a dropdown menu next to the field selection box.
Specifically, BinaRena supports the following transformations:
- Square, cube, 4th power.
- Square root, cube root, 4th root.
- Logarithm and exponential.
- Logit and arcsine (for proportion data).
- Ranking.
Note: Certain values may become invalid after certain transformations. For example, zero and negative numbers cannot be log-transformed. In such cases, the contigs will be displayed using the default setting (e.g., color is gray).
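The following Python sketch shows one way such transformations and their invalid cases could be handled. It is an illustration, not BinaRena's implementation: the `TRANSFORMS` table and `transform` function are hypothetical names, the arcsine entry assumes the common arcsine-square-root form for proportions, and ranking is left out because it needs the whole column rather than a single value.

```python
import math

# Hypothetical lookup table; keys mirror the transformations listed above.
TRANSFORMS = {
    "square": lambda v: v ** 2,
    "cube": lambda v: v ** 3,
    "4th power": lambda v: v ** 4,
    "square root": math.sqrt,
    "cube root": lambda v: math.copysign(abs(v) ** (1 / 3), v),
    "4th root": lambda v: math.sqrt(math.sqrt(v)),
    "log": math.log,
    "exp": math.exp,
    "logit": lambda v: math.log(v / (1 - v)),
    # Assumption: arcsine of the square root, as is common for proportions.
    "arcsine": lambda v: math.asin(math.sqrt(v)),
}


def transform(value, method):
    """Return the transformed value, or None if the result is undefined."""
    try:
        result = TRANSFORMS[method](value)
    except (ValueError, ZeroDivisionError, OverflowError):
        return None
    return result if math.isfinite(result) else None


# A None result stands for "use the default display setting".
color_value = transform(0, "log")
color = "gray" if color_value is None else "from palette"
```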
The color panel has an additional dropdown menu to let the user choose from multiple standard color palettes.
For categorical data, BinaRena automatically identifies and colors the most frequent categories in the dataset, while leaving all remaining categories in black. One may use the floating + and − buttons to increase / decrease the number of colored categories.
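As a rough illustration of this behavior, the Python sketch below colors the most frequent categories and leaves the rest black. The function name `color_categories`, the palette values, and the handling of missing values are assumptions made for the example.

```python
from collections import Counter


def color_categories(values, palette, n_colored):
    """Assign palette colors to the n_colored most frequent categories.

    All other categories, and missing values (None), are left black.
    """
    counts = Counter(v for v in values if v is not None)
    top = [category for category, _ in counts.most_common(n_colored)]
    color_of = {cat: palette[i % len(palette)] for i, cat in enumerate(top)}
    return [color_of.get(v, "black") for v in values]


phyla = ["Firmicutes", "Proteobacteria", "Firmicutes",
         "Chloroflexi", "Firmicutes", "Proteobacteria"]
palette = ["red", "blue", "green"]

# Raising n_colored plays the role of the floating + button.
two_colored = color_categories(phyla, palette, 2)
three_colored = color_categories(phyla, palette, 3)
```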
When a dataset is loaded, BinaRena attempts to make the best guess of appropriate display items based on the names and types of individual fields. Specifically, it will look for these keywords in the column names:
- X-axis (numeric): x, xaxis, x1, axis1, dim1, pc1, tsne1, umap1, etc.
- Y-axis (numeric): y, yaxis, x2, axis2, dim2, pc2, tsne2, umap2, etc.
- Length (bp) (numeric): length, size, len, bp, etc.
- Coverage (x) (numeric): coverage, depth, cov, etc.
- GC content (%) (numeric): gc, g+c, gc%, gc-content, etc.
- Taxonomy (categorical): phylum, class, order, family, genus, etc.
The matching process is case-insensitive. Suffixes after common delimiters (" ", "/", "_", ".") are stripped. For example, Length (bp) and size_of_scaffold will be recognized as length.
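The matching rule can be sketched in Python as follows. This is an approximation built from the description above, not BinaRena's actual code: `KEYWORDS` holds only the keywords listed here (the real lists are longer, per the "etc."), and `guess_item` is a hypothetical name.

```python
import re

# Keywords copied from the list above; the real lists contain more entries.
KEYWORDS = {
    "x": {"x", "xaxis", "x1", "axis1", "dim1", "pc1", "tsne1", "umap1"},
    "y": {"y", "yaxis", "x2", "axis2", "dim2", "pc2", "tsne2", "umap2"},
    "length": {"length", "size", "len", "bp"},
    "coverage": {"coverage", "depth", "cov"},
    "gc": {"gc", "g+c", "gc%", "gc-content"},
    "taxonomy": {"phylum", "class", "order", "family", "genus"},
}


def guess_item(column_name):
    """Guess which display item a column name refers to, or return None."""
    # Lowercase, then drop everything after the first common delimiter.
    key = re.split(r"[ /_.]", column_name.strip().lower(), maxsplit=1)[0]
    for item, words in KEYWORDS.items():
        if key in words:
            return item
    return None


guesses = {name: guess_item(name)
           for name in ["Length (bp)", "size_of_scaffold", "PC1", "GC%", "contig_id"]}
```

In this sketch both `Length (bp)` and `size_of_scaffold` map to `length`, while `contig_id` matches nothing and is left for the user to assign.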
If all columns are found, BinaRena will render x- and y-axes in linear scale, length as marker size (radius) in cube root scale (because a sphere's volume is proportional to the cube of radius), coverage as marker opacity in square root scale, and highest taxonomic group as marker color.
If x- and/or y-axes are not found, the program will render a length by coverage plot.
If none is found, the program will take the first two numeric columns as x- and y-axes.
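The fallback rules above can be read as a small decision procedure. The Python sketch below is one interpretation under stated assumptions: `default_view` is a hypothetical function, `found` is a hypothetical mapping from display item to the matched column name, scales that the text does not specify are left as `None`, and size, opacity, and color are assumed to be assigned whenever their column was found.

```python
def default_view(found, numeric_columns):
    """Choose default display items; each entry is (column, scale)."""
    view = {}
    if "x" in found and "y" in found:
        view["x"] = (found["x"], "linear")
        view["y"] = (found["y"], "linear")
    elif "length" in found and "coverage" in found:
        # Length-by-coverage plot; its scales are not stated in the text.
        view["x"] = (found["length"], None)
        view["y"] = (found["coverage"], None)
    elif len(numeric_columns) >= 2:
        # Last resort: the first two numeric columns.
        view["x"] = (numeric_columns[0], None)
        view["y"] = (numeric_columns[1], None)

    # Assumption: these apply whenever the matching column exists.
    if "length" in found:
        view["size"] = (found["length"], "cube root")
    if "coverage" in found:
        view["opacity"] = (found["coverage"], "square root")
    if "taxonomy" in found:
        view["color"] = (found["taxonomy"], None)
    return view


full = default_view(
    {"x": "PC1", "y": "PC2", "length": "Length (bp)",
     "coverage": "cov", "taxonomy": "phylum"},
    ["PC1", "PC2", "Length (bp)", "cov"],
)
no_axes = default_view({"length": "len", "coverage": "depth"}, ["len", "depth"])
nothing = default_view({}, ["a", "b", "c"])
```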
It's very likely that BinaRena is not smart enough to hit the desired view. You will need to tweak the display items as needed.
See also how to export views and images.
Contact: Dr. Qiyun Zhu (qiyunzhu@gmail.com).