🔬 ♌ Bacterial ribosomal RNA predictor

tseemann/barrnap

Barrnap

Annotate all the bacterial RNA in your genome

Description

Barrnap is an annotation tool for identifying RNA features in microbial genomes. It can find:

  • rRNA - ribosomal RNA (5S,16S,23S)
  • tRNA - transfer RNA
  • tmRNA - transfer messenger RNA
  • ncRNA - non-coding RNA
  • mRNA - messenger RNA, inc. RBS, CDS, terminator
  • operon - specifically the rRNA/tRNA operon

You provide a FASTA file, you get a GFF3 file. Too easy.
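To make the "FASTA in, GFF3 out" contract concrete, here is a minimal sketch of reading the output downstream. It is not part of Barrnap; the `Feature` type and `parse_gff3` function are hypothetical names chosen for illustration, and the sketch assumes standard tab-separated, nine-column GFF3 without decoding percent-escapes in attributes.

```python
from typing import Dict, Iterable, Iterator, NamedTuple


class Feature(NamedTuple):
    """One GFF3 feature row (hypothetical helper type, not a Barrnap API)."""
    seqid: str
    source: str
    type: str
    start: int
    end: int
    score: str
    strand: str
    phase: str
    attributes: Dict[str, str]


def parse_gff3(lines: Iterable[str]) -> Iterator[Feature]:
    """Yield features from GFF3 lines, skipping directives and comments."""
    for line in lines:
        line = line.rstrip("\n")
        if line.strip() == "##FASTA":
            break  # everything after this directive is sequence, not features
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        if len(cols) != 9:
            continue  # not a well-formed feature row
        # Column 9 is a ';'-separated list of key=value pairs.
        attrs = dict(
            item.split("=", 1) for item in cols[8].split(";") if "=" in item
        )
        yield Feature(cols[0], cols[1], cols[2], int(cols[3]), int(cols[4]),
                      cols[5], cols[6], cols[7], attrs)
```

Keeping the score column as a string is deliberate: it may hold an e-value, a tool-specific score, or `.` depending on which tool produced the row.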

Installation

conda install -c bioconda -c conda-forge barrnap

Quick start

# Backward compatible with the old versions - just rRNA

% barrnap --legacy test/small.fna
##gff-version 3
small	infernal:1.1.5	rRNA	293312	294796	1.7e-49	+	.	Name=16S_rRNA;Alias=SSU_rRNA_bacteria;Dbxref=Rfam:RF00177;product=16S ribosomal RNA
small	infernal:1.1.5	rRNA	295463	298336	4.8e-07	+	.	Name=23S_rRNA;Alias=LSU_rRNA_bacteria;Dbxref=Rfam:RF02541;product=23S ribosomal RNA
small	infernal:1.1.5	rRNA	298432	298548	1.1e-13	+	.	Name=5S_rRNA;Alias=5S_rRNA;Dbxref=Rfam:RF00001;product=5S ribosomal RNA

# By default we find all the RNA

% barrnap --threads 8 test/small.fna
##gff-version 3
small  infernal:1.1.5    ncRNA          128     274  5.4e-05  +  .  Name=Cobalamin;Dbxref=Rfam:RF00174;product=Cobalamin riboswitch aptamer
small  aragorn:1.2.41    tmRNA        15305   15616  .        -  .  Name=tmRNA;product=transfer-messenger RNA (non-canonical) ANKIVSFSRQTAPVAA*
small  aragorn:1.2.41    tRNA         86968   87039  .        +  .  Name=tRNA-Asn;product=transfer RNA (gtt)
small  barrnap:1.6.0     mRNA        188710  189808  .        +  .  product=messenger RNA
small  pyrodigal:3.7.0   RBS         188710  188715  119.0    +  .  product=ribosome binding site AGGAG
small  pyrodigal:3.7.0   CDS         188726  189808  85.6     +  0  product=hypothetical protein
small  TransTermHP:2.09  terminator  189857  189880  100      +  .  product=Rho-independent terminator
small  barrnap:1.6.0     operon      295463  298548  .        +  .  Name=rRNA operon;product=rRNA operon: rRNA-rRNA
small  infernal:1.1.5    rRNA        295463  298336  4.8e-07  +  .  Name=23S_rRNA;Alias=LSU_rRNA_bacteria;Dbxref=Rfam:RF02541;product=23S ribosomal RNA
small  infernal:1.1.5    rRNA        298432  298548  1.1e-13  +  .  Name=5S_rRNA;Alias=5S_rRNA;Dbxref=Rfam:RF00001;product=5S ribosomal RNA
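Because the default run mixes features from several tools, a quick tally by feature type can be a useful sanity check. The following is a rough sketch rather than anything shipped with Barrnap; `count_feature_types` is a hypothetical helper, and it assumes the real output is tab-separated as GFF3 requires (the listing above is shown with aligned spacing).

```python
from collections import Counter
from typing import Iterable


def count_feature_types(lines: Iterable[str]) -> Counter:
    """Count GFF3 rows by their type column (column 3)."""
    counts: Counter = Counter()
    for line in lines:
        if line.startswith("##FASTA"):
            break  # sequence section follows; no more feature rows
        if line.startswith("#") or not line.strip():
            continue  # directive, comment, or blank line
        cols = line.rstrip("\n").split("\t")
        if len(cols) >= 3:
            counts[cols[2]] += 1
    return counts
```

For example, `count_feature_types(open("out.gff"))` would return a mapping such as feature type to row count, where `out.gff` is a placeholder for wherever you redirected the output.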

# You can make full GFFs with header and sequence

% barrnap --incseq --incseqreg test/fake.fna
##gff-version 3
##sequence-region contig001 1 733412
##sequence-region contig002 1 542170
##sequence-region contig003 1 31088
...
##FASTA
>contig001
CCGATTAGACCACTTTGCTGATAACAGTATTCATATCAATTGATTAGAAAGATTTCTTTT
TTGGTCACATTTTGATCACTTTTGAAGAAAACAATTTTTCTTCTAGGTTTTCCTTATGAG
AAGGAATTAGAATATTGACTAGATAGGTTCTAATGGGAATCAGCCATTGGAGGTAACGGG
...
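A GFF3 file with an embedded `##FASTA` section holds two kinds of data in one stream, so consumers usually split it first. The sketch below shows one way to do that, assuming the layout in the example above (annotation lines, then `##FASTA`, then ordinary FASTA records); `split_gff3` is a hypothetical name, not a Barrnap function.

```python
from typing import Dict, List, Tuple


def split_gff3(text: str) -> Tuple[List[str], Dict[str, str]]:
    """Split GFF3 text into (annotation lines, {sequence id: sequence})."""
    annotation_lines: List[str] = []
    chunks: Dict[str, List[str]] = {}
    current = None
    in_fasta = False
    for line in text.splitlines():
        if line.strip() == "##FASTA":
            in_fasta = True  # everything after this is FASTA
            continue
        if not in_fasta:
            annotation_lines.append(line)
        elif line.startswith(">"):
            header = line[1:].split()
            current = header[0] if header else ""  # id is the first word
            chunks[current] = []
        elif current is not None:
            chunks[current].append(line.strip())
    sequences = {name: "".join(parts) for name, parts in chunks.items()}
    return annotation_lines, sequences
```

If the input has no `##FASTA` directive, the second return value is simply an empty dictionary.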

Options

General

  • --help show help and exit
  • --version print version in form barrnap X.Y and exit
  • --citation print a citation and exit
  • --debug will write all tempfiles to '.' and print debug info

Database management

  • --listdb to see what DBs are installed
  • --updatedb to update DBs from internet
  • --dbdir to use a different DB folder

Search

  • --kingdom is the database to use: Bacteria:bac, Archaea:arc, Fungi:fun
  • --legacy only does rRNA scan, like versions < 1.0 did.
  • --no-rrna disables rRNA scan
  • --no-trna disables tRNA scan
  • --no-ncrna disables ncRNA scan
  • --no-mrna disables mRNA scan (inc CDS,RBS,terminator)
  • --no-operon disables RNA-operon annotation
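When Barrnap is driven from a pipeline, it can help to assemble the argument list in one place. This is only a sketch: `barrnap_command` is a hypothetical wrapper, it uses only flags listed in this README, and it assumes `--kingdom` takes its short code (`bac`, `arc`, `fun`) as a separate argument in the same way `--threads 8` does in the quick start. It builds the argument list and does not run anything.

```python
from typing import Iterable, List, Optional

# Scans that the README says can be disabled with a --no-<name> flag.
_SKIPPABLE = {"rrna", "trna", "ncrna", "mrna", "operon"}


def barrnap_command(fasta: str,
                    kingdom: Optional[str] = None,
                    threads: Optional[int] = None,
                    legacy: bool = False,
                    skip: Iterable[str] = ()) -> List[str]:
    """Build an argv list for barrnap (hypothetical helper, runs nothing)."""
    skip = set(skip)
    unknown = skip - _SKIPPABLE
    if unknown:
        raise ValueError(f"cannot skip unknown scans: {sorted(unknown)}")
    cmd = ["barrnap"]
    if kingdom is not None:
        cmd += ["--kingdom", kingdom]  # assumed form: flag, then short code
    if threads is not None:
        cmd += ["--threads", str(threads)]
    if legacy:
        cmd.append("--legacy")
    cmd += [f"--no-{name}" for name in sorted(skip)]
    cmd.append(fasta)
    return cmd
```

The resulting list can be passed to `subprocess.run(cmd, capture_output=True, text=True)` so that no shell quoting is needed.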

Speed

  • --threads is how many CPUs to use
  • --fast uses simpler HMMs instead of CMs and is less accurate

Filtering

  • --evalue is the cut-off for hits to keep
  • --lencutoff is the proportion of the full length that qualifies as partial match (IGNORED)
  • --reject will not include hits below this proportion of the expected length (IGNORED)

Output

  • --quiet will not print any messages to stderr
  • --incseq will include the full input sequences in the output GFF
  • --incseqreg will include ##sequence-region headers in the GFF
  • --outseq creates a FASTA file with the hit sequences
  • --adids will add unique ID= tags to each GFF3 feature
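To illustrate what "hit sequences" means in terms of GFF3 coordinates, here is a small sketch of pulling one feature's sequence out of its contig. It is not Barrnap's implementation; `hit_sequence` is a hypothetical helper, and reverse-complementing minus-strand hits is an assumption about the desired orientation. It relies on GFF3 coordinates being 1-based and inclusive.

```python
# Complement table for unambiguous bases; other characters (such as N)
# pass through unchanged.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def hit_sequence(contig_seq: str, start: int, end: int, strand: str) -> str:
    """Return the subsequence for a GFF3 feature on the given strand."""
    if start < 1 or end > len(contig_seq) or start > end:
        raise ValueError("coordinates fall outside the contig")
    # GFF3 is 1-based inclusive; Python slices are 0-based half-open.
    sub = contig_seq[start - 1:end]
    if strand == "-":
        sub = sub.translate(_COMPLEMENT)[::-1]  # reverse complement
    return sub
```

For the 5S rRNA row in the quick start, the call would be `hit_sequence(seq, 298432, 298548, "+")`, where `seq` is the full sequence of `small`.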

FAQ

What has changed since the 0.9 version?

  • Barrnap now finds all RNA, not just rRNA. Use the --legacy option for backward compatibility.
  • I no longer use nucleotide HMMs and local alignment. To get that behaviour use --fast.
  • The mito model is gone, the fun model is in.
  • The --reject and --lencutoff parameters are now ignored, as we use global CMs.
  • SILVA is no longer used, all models are from Rfam.

Where does the name come from?

The name Barrnap was originally derived from Bacterial/Archaeal Ribosomal RNA Predictor. However, it has since been extended to support mitochondrial and eukaryotic rRNAs, and has been given the new backronym BAsic Rapid Ribosomal RNA Predictor. The project was originally spawned at CodeFest 2013 in Berlin, Germany by Torsten Seemann and Tim Booth.

References

Feedback

File questions, bugs, or ideas on the Issues page

License

GPLv3

Author

Torsten Seemann