
Align and subset genomic sequences based on ORF positions


oacar/SynORFan


SynORFan

This package is designed to find overlapping open reading frames (ORFs) in a multiple sequence alignment (MSA), given a reference alignment. Its two main scripts, bioconductor.py and analysis.py, are both designed to be used as command-line programs.
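To make the idea of subsetting an alignment by ORF position concrete, here is a minimal sketch. It is not taken from the SynORFan code: the function names (`ungapped_to_column`, `subset_alignment`), the dictionary representation of the alignment, and the toy sequences are all assumptions made for illustration. It maps ORF coordinates on the ungapped reference sequence to alignment columns and slices every aligned sequence to that window.

```python
def ungapped_to_column(aligned_ref, pos):
    """Return the alignment column holding the pos-th (0-based) non-gap residue."""
    seen = -1
    for col, ch in enumerate(aligned_ref):
        if ch != "-":
            seen += 1
            if seen == pos:
                return col
    raise ValueError("position lies beyond the end of the reference sequence")


def subset_alignment(alignment, ref_name, orf_start, orf_end):
    """Slice every row of the alignment to the columns spanned by the ORF.

    orf_start and orf_end are 0-based, end-exclusive coordinates on the
    ungapped reference sequence (a convention assumed for this sketch).
    """
    ref = alignment[ref_name]
    first = ungapped_to_column(ref, orf_start)
    last = ungapped_to_column(ref, orf_end - 1)
    return {name: seq[first:last + 1] for name, seq in alignment.items()}


# Toy alignment with made-up sequences; "ref" carries the known ORF.
toy = {
    "ref":  "AC--ATGAAATAGGT",
    "sp_1": "ACGGATGA-ATAGGT",
    "sp_2": "AC--ATGCAATGAGT",
}

# On the ungapped reference "ACATGAAATAGGT" the ORF ATGAAATAG spans [2, 11).
region = subset_alignment(toy, "ref", 2, 11)
for name, seq in region.items():
    print(name, seq)
```

Working in alignment columns rather than raw sequence positions keeps the window consistent across species even when gaps shift the coordinates of individual sequences.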

To list the options and input files the program needs, run:

python bioconductor.py --help

This prints:

usage: bioconductor.py [-h] -p PATH -n ORF_NAME [-a] -y YEAST [-alg ALGORITHM]

optional arguments:
  -h, --help      show this help message and exit
  -p PATH         Directory path for alignment and output folder
  -n ORF_NAME     ORF name for output names
  -a              Is the sequence is annotated?
  -y YEAST        Fasta file containing dna sequence for annotated yeast genes
  -alg ALGORITHM  Select alignment algorithm. Default is mafft
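For readers who want to see how a command line with this shape fits together, the following is a sketch of an `argparse` parser that accepts the same flags. It is a reconstruction from the help text above, not the actual code in bioconductor.py; the `build_parser` function, the `dest` names, and the reworded help strings are assumptions.

```python
import argparse


def build_parser():
    """Build a parser that mirrors the usage line shown in the help output."""
    parser = argparse.ArgumentParser(prog="bioconductor.py")
    parser.add_argument("-p", dest="path", required=True,
                        help="Directory path for alignment and output folder")
    parser.add_argument("-n", dest="orf_name", required=True,
                        help="ORF name for output names")
    # The attribute name "annotated" is hypothetical; the real script may differ.
    parser.add_argument("-a", dest="annotated", action="store_true",
                        help="Treat the sequence as annotated")
    parser.add_argument("-y", dest="yeast", required=True,
                        help="FASTA file with DNA sequences of annotated yeast genes")
    parser.add_argument("-alg", dest="algorithm", default="mafft",
                        help="Alignment algorithm (default: mafft)")
    return parser


# Parse the same arguments as the example invocation in this README.
args = build_parser().parse_args([
    "-p", "input_data/",
    "-n", "YBR196C-A",
    "-y", "input_data/orf_genomic_all.fasta",
    "-a",
])
print(args.path, args.orf_name, args.annotated, args.yeast, args.algorithm)
```

Note that `-p`, `-n` and `-y` appear without square brackets in the usage line, which is how `argparse` displays options declared with `required=True`.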

Example Usage:

python bioconductor.py -p input_data/ -n YBR196C-A -y input_data/orf_genomic_all.fasta -a

Requirements:

Python requirements are listed in the requirements.txt file. You also need mafft on your system path and a tmp folder in your home directory (i.e. $HOME/tmp/ must exist). If -a is specified, the -y argument needs the orf_genomic_all.fasta file, which can be downloaded from SGD for yeast and is used to get the ORF sequence. Otherwise, the ORF sequence can be given directly to -y.
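The non-Python requirements above can be checked before a run. The sketch below is not part of SynORFan; the `check_environment` helper is hypothetical and uses only the standard library. It verifies that the aligner is on the system path, that `$HOME/tmp` exists, and that the FASTA file passed to `-y` is present.

```python
import shutil
from pathlib import Path


def check_environment(fasta_path, aligner="mafft"):
    """Return a list of problems found; an empty list means every check passed."""
    problems = []

    # The aligner executable must be discoverable on the system path.
    if shutil.which(aligner) is None:
        problems.append(f"'{aligner}' was not found on the system path")

    # The README asks for a tmp folder directly under the home directory.
    tmp_dir = Path.home() / "tmp"
    if not tmp_dir.is_dir():
        problems.append(f"{tmp_dir} does not exist")

    # The file given to -y must exist.
    if not Path(fasta_path).is_file():
        problems.append(f"FASTA file {fasta_path} does not exist")

    return problems


for problem in check_environment("input_data/orf_genomic_all.fasta"):
    print("missing requirement:", problem)
```

Returning the problems as a list, rather than raising on the first failure, lets you see everything that needs fixing in a single pass.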
