@@ -45,29 +45,34 @@ cp yaha /usr/local/bin/.
4545
4646** COMMON USAGE SCENARIOS:**
4747
48- To create an index. NOTE: The genome file can be a FASTA file, or a nib2 file (created by a previous * yaha * index operation) :
48+ To create an index for a reference genome :
4949```
50- yaha -g genomeFilename [ -H maxHits (65525)] [ -L wordLen (15)] [ -S Skip-distance (1)]
50+ yaha -g <genomeFilename> -H <maxHits> -L <seedLength> -S <Skipdistance>
5151```
5252
53- To align queries. NOTE: The query file can be either a FASTA file or a FASTQ file.
53+ To align sequencing data:
5454```
55- yaha -x yahaIndexFile [ -q queryFile|(stdin)] [( -osh)|-oss outputFile|(stdout)][AdditionalOptions ]
55+ yaha -x <yahaIndexFile> -q <queryFile> -osh <outputFile> [Additional Options...]
5656```
5757
5858---
5959** OPTIONS:**
6060Default values enclosed in square brackets [ ]
6161```
62- Output Options:
63- -g FILE input genome file to use during index creation
64- -q FILE input file of sequence reads to align [STDIN]
62+ Input/Output Options:
63+ -g FILE input genome file to use during index creation (FASTA or nib2)
64+ -q FILE input file of sequence reads to align (FASTA or FASTQ) [STDIN]
6565-osh FILE output file for alignment output in SAM format with hard clipping (default) [STDOUT]
6666-oss FILE output file for alignment output in SAM format with soft clipping [STDOUT]
6767-x FILE reference index file to use during alignment
68+ NOTE: At most one of -osh or -oss should be specified.
6869
70+ Index Creation Options:
71+ -H INT maxHits: During index creation, seeds occurring more than maxHits times will be sampled [65565]
72+ -L INT seedLength: Length of seed to use. During alignment, seed length is taken from index file [15]
73+ -S INT Skipdistance: Number of bases to skip ahead before forming next seed [1]
6974
70- Additional General Alignment Options:
75+ General Alignment Options:
7176-BW INT BandWidth: band size on each side of the diagonal of banded Smith Waterman [5]
7277-G INT maxGap: maximum indel size allowed with a single alignment [50]
7378-H INT maxHits: maximum times a seed is in the reference before it is ignored as too repetitive [650]
@@ -77,26 +82,28 @@ Additional General Alignment Options:
7782-X INT Xdropoff: maximum score dropoff before terminating alignment extensions [25]
7883-t INT numThreads: number of threads used to parallel process reads [1]
7984
80- Affine Gap Scoring Parameters:
81- -AGS BOOL (Y|N) controls use of Affine Gap Scoring [Y].
85+ Affine Gap Scoring Options:
8286If -AGS is off, a simple edit distance calculation is done.
83- If on, the following are used:
87+ If on, the remaining options are used:
88+ -AGS BOOL (Y|N) controls use of Affine Gap Scoring [Y].
8489-GEC INT GapExtensionCost: cost for extending a gap (indel) [2]
8590-GOC INT GapOpenCost: cost for starting a new gap (indel) [5]
8691-MS INT MatchScore: score added for each matching base [1]
8792-RC INT ReplacementCost: score subtracted for each mismatched base [3]
8893
89- -OQC BOOL (Y|N) controls use of the Optimal Query Coverage Algorithm.
94+ Optimal Query Coverage Options:
9095If -OQC is off, all alignments meeting above criteria are output.
91- If on, a set of alignments are found that optimally cover the query, using the following options:
96+ If -OQC is on, a set of alignments is found that optimally covers the query, using the remaining options.
97+ -OQC BOOL (Y|N) controls use of the Optimal Query Coverage Algorithm.
9298-BP INT BreakpointPenalty: penalty for inserting a breakpoint in split-read alignment [5]
9399-MGDP INT MaxGenomicDistancePenalty [5]
94100-MNO INT MinNonOverlap: minimum number of unshared bases required in each split alignment [minMatch]
95- NOTE: The total cost of inserting a breakpoint in a split-read is:
101+ NOTE: The total cost of adding a breakpoint in a split-read mapping is:
96102 BP*MIN(MGDP, Log10(genomic distance between reference loci))
97103
98- -FBS BOOL (Y|N) controls inclusion of alignments similar to best alignment found using OQC.
99- If -FBS is on, the following are used. An alignment must satisfy BOTH criteria to be "similar".
104+ Filter By Similarity Options:
105+ If -FBS is on, the remaining options are used. An alignment must satisfy BOTH criteria to be "similar".
106+ -FBS BOOL (Y|N) controls output of alignments similar to best alignment found using OQC.
100107-PRL REAL PercentReciprocalLength: minimum ratio of overlapping length between similar alignments [0.9]
101108-PSS REAL PercentSimilarScore: minimum ratio of scores between similar alignments [0.9]
102109```
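
The breakpoint cost formula in the OQC options, `BP*MIN(MGDP, Log10(genomic distance between reference loci))`, can be illustrated with a small Python sketch. This is not code from yaha; the function name `breakpoint_cost` and its argument names are made up for illustration, and the handling of non-positive distances is an assumption, since the README does not say how such cases are treated.

```python
import math


def breakpoint_cost(bp, mgdp, genomic_distance):
    """Evaluate BP * MIN(MGDP, Log10(genomic distance)) as written in the README.

    bp               -- value of -BP (BreakpointPenalty)
    mgdp             -- value of -MGDP (MaxGenomicDistancePenalty)
    genomic_distance -- distance in bases between the two reference loci
    """
    # Assumption: only positive distances are meaningful here; how yaha treats
    # other cases (for example, loci on different chromosomes) is not documented.
    if genomic_distance < 1:
        raise ValueError("genomic_distance must be at least 1")
    return bp * min(mgdp, math.log10(genomic_distance))


# With the documented defaults (-BP 5, -MGDP 5):
near = breakpoint_cost(5, 5, 1_000)        # log10 term (3) is below the cap
far = breakpoint_cost(5, 5, 10_000_000)    # log10 term (7) is capped at MGDP
print(near, far)
```

With the default values, the log term reaches the MGDP cap once the loci are 100,000 bases apart, so under this formula every breakpoint spanning a larger distance costs the same.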