Command-line tool designed to process and analyze r/GFA (Graphical Fragment Assembly) files
akmami/akhal
AKHAL: Assembly Graph Analysis Tool

Overview

akhal is a command-line tool for processing and analyzing r/GFA (Graphical Fragment Assembly) files. It validates graphs, computes statistics, extracts sequences, and converts GAF (Graph Alignment Format) alignments to SAM (Sequence Alignment/Map).

Installation

You can build this tool by running:

make

Usage

akhal provides five commands, each invoked as:

./akhal <PROGRAM> [...ARGS]

Commands

1. parse

Validates an r/GFA file. It checks that segments and links are consistent with each other, and it also checks the link overlaps when they are present.

Usage:

./akhal parse <r/GFA file>
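
As an illustration of the kind of consistency check described above, here is a minimal Python sketch. It is not akhal's implementation: the function name `validate_gfa` and the exact rules (unique segment names, links that refer to defined segments, `+`/`-` orientations, overlaps given as `*` or a CIGAR string) are assumptions based on the GFA format.

```python
import re

# Overlap field: "*" (unspecified) or a CIGAR string such as "5M" or "3M1I2M".
CIGAR_RE = re.compile(r"^(\d+[MIDNSHPX=])+$")


def validate_gfa(lines):
    """Return a list of problems found in GFA text lines (hypothetical helper)."""
    segments = set()  # names of all S lines seen
    links = []        # (line number, first five fields after the "L")
    problems = []

    for number, line in enumerate(lines, start=1):
        fields = line.rstrip("\n").split("\t")
        if fields[0] == "S":
            if len(fields) < 3:
                problems.append(f"line {number}: S line needs a name and a sequence")
                continue
            if fields[1] in segments:
                problems.append(f"line {number}: duplicate segment {fields[1]}")
            segments.add(fields[1])
        elif fields[0] == "L":
            if len(fields) < 6:
                problems.append(f"line {number}: L line needs five fields")
                continue
            links.append((number, fields[1:6]))

    # Links are checked after all segments are known, so order does not matter.
    for number, (src, src_strand, dst, dst_strand, overlap) in links:
        for name in (src, dst):
            if name not in segments:
                problems.append(f"line {number}: undefined segment {name}")
        for strand in (src_strand, dst_strand):
            if strand not in ("+", "-"):
                problems.append(f"line {number}: invalid orientation {strand!r}")
        if overlap != "*" and not CIGAR_RE.match(overlap):
            problems.append(f"line {number}: invalid overlap {overlap!r}")

    return problems
```

An empty result means the sketch found nothing wrong; the real command may apply additional or different checks.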

2. stats

Computes and outputs statistics about an r/GFA file.

Usage:

./akhal stats <r/GFA file>

The statistics include:

  • Segment count: Number of segments in the graph.
  • Segment avg length: Average segment length.
  • Segment std length: Standard deviation of segment lengths.
  • Segment min length: Minimum segment length.
  • Segment max length: Maximum segment length.
  • Link count: Number of links between segments.
  • Link overlapping avg length: Average overlap length across links.
  • Link overlapping std length: Standard deviation of link overlap lengths.
  • Minimum in degree: Minimum number of incoming links.
  • Maximum in degree: Maximum number of incoming links.
  • Minimum out degree: Minimum number of outgoing links.
  • Maximum out degree: Maximum number of outgoing links.
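
The statistics listed above can be sketched in Python as follows. This is an illustration under stated assumptions, not akhal's code: `gfa_stats` is a hypothetical name, sequences are assumed to be present on the S lines, the overlap length is taken as the sum of the `M`, `=` and `X` operations of the link CIGAR, degrees ignore orientation, and the standard deviation is the population form (the tool may use the sample form).

```python
import re
import statistics
from collections import Counter


def gfa_stats(lines):
    """Compute segment, link and degree statistics (hypothetical helper)."""
    seg_lengths = {}     # segment name -> sequence length
    overlaps = []        # overlap length of every link with a CIGAR overlap
    in_deg = Counter()   # incoming links per segment
    out_deg = Counter()  # outgoing links per segment
    link_count = 0

    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if fields[0] == "S" and len(fields) >= 3:
            seg_lengths[fields[1]] = len(fields[2])
        elif fields[0] == "L" and len(fields) >= 6:
            link_count += 1
            out_deg[fields[1]] += 1
            in_deg[fields[3]] += 1
            if fields[5] != "*":
                # Assumption: only aligned columns count toward the overlap.
                ops = re.findall(r"(\d+)[M=X]", fields[5])
                overlaps.append(sum(int(n) for n in ops))

    # Assumes at least one segment; an empty graph would raise here.
    lengths = list(seg_lengths.values())
    return {
        "segment_count": len(lengths),
        "segment_avg_length": statistics.fmean(lengths),
        "segment_std_length": statistics.pstdev(lengths),
        "segment_min_length": min(lengths),
        "segment_max_length": max(lengths),
        "link_count": link_count,
        "link_overlap_avg_length": statistics.fmean(overlaps) if overlaps else 0.0,
        "link_overlap_std_length": statistics.pstdev(overlaps) if overlaps else 0.0,
        # Segments without links count as degree 0.
        "min_in_degree": min(in_deg[s] for s in seg_lengths),
        "max_in_degree": max(in_deg[s] for s in seg_lengths),
        "min_out_degree": min(out_deg[s] for s in seg_lengths),
        "max_out_degree": max(out_deg[s] for s in seg_lengths),
    }
```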

3. extract

Extracts information from an r/GFA file.

Usage:

./akhal extract [OPTION] <r/GFA file> <OUTPUT file>

Options:

  • fa: Extracts the reference genome. The output file name should end with .fa or .fasta
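
One way a reference genome can be recovered from an rGFA file is sketched below in Python. This is an assumption about the approach, not akhal's code: it relies on the rGFA segment tags `SN` (stable sequence name), `SO` (offset on that sequence) and `SR` (rank, 0 for the reference), and it assumes the rank-0 segments of each sequence are contiguous. The names `extract_reference` and `to_fasta` are hypothetical.

```python
from collections import defaultdict


def extract_reference(lines):
    """Rebuild each rank-0 sequence from rGFA S lines (hypothetical helper)."""
    pieces = defaultdict(list)  # SN -> list of (SO offset, segment sequence)
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if fields[0] != "S" or len(fields) < 3:
            continue
        # Tags look like "SN:Z:chr1": two-letter key, type, value.
        tags = {f[:2]: f[5:] for f in fields[3:] if len(f) >= 5}
        if tags.get("SR") == "0" and "SN" in tags and "SO" in tags:
            pieces[tags["SN"]].append((int(tags["SO"]), fields[2]))
    # Concatenate the segments of each sequence in offset order.
    return {
        name: "".join(seq for _, seq in sorted(parts))
        for name, parts in pieces.items()
    }


def to_fasta(records, width=60):
    """Format a name -> sequence mapping as FASTA text with wrapped lines."""
    out = []
    for name, seq in records.items():
        out.append(f">{name}")
        out.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "\n".join(out) + "\n"
```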

4. gaf2sam

Converts a GAF file to a SAM file.

Usage:

./akhal gaf2sam <r/GFA file> <GAF file> <FASTA file> <OUTPUT file> [--simple]

Note: The reads must be provided to the program in FASTA format. A GAF file does not store sequences, so the reads are needed to fill in the SAM records.

Note: The --simple flag is optional. If provided, CIGAR sequence matches (=) and mismatches (X) are replaced with alignment matches (M).
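
The effect of --simple on a CIGAR string can be sketched in Python as follows. This is an illustration, not akhal's code: `simplify_cigar` is a hypothetical name, and merging adjacent runs that become the same operation is an assumption (the tool may leave them as separate `M` runs).

```python
import re


def simplify_cigar(cigar):
    """Replace '=' and 'X' operations with 'M' and merge adjacent equal runs."""
    if cigar == "*":
        return cigar  # "*" means the CIGAR is unavailable; nothing to rewrite

    merged = []  # list of [length, operation]
    for length, op in re.findall(r"(\d+)([MIDNSHP=X])", cigar):
        if op in "=X":
            op = "M"
        if merged and merged[-1][1] == op:
            merged[-1][0] += int(length)  # extend the previous run
        else:
            merged.append([int(length), op])
    return "".join(f"{n}{op}" for n, op in merged)
```

For example, under these assumptions `5=1X3=2I4=` becomes `9M2I4M`.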

5. sampoke

Validates a SAM file (for example, one converted from GAF). It takes the reference file and the SAM file and checks the CIGAR strings. Optionally, it writes a filtered SAM file that contains only the valid lines.

Usage:

./akhal sampoke <FASTA file> <SAM file> [OUTPUT file]

Note: The output file is optional.
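
A CIGAR check of the kind described above might look like the following Python sketch. It is not akhal's implementation: `cigar_is_consistent` is a hypothetical name, and the rules (the query-consuming operations must add up to the read length, the alignment must stay inside the reference, `=` columns must match and `X` columns must differ) are assumptions drawn from the SAM format.

```python
import re

QUERY_OPS = set("MIS=X")  # operations that consume read bases
REF_OPS = set("MDN=X")    # operations that consume reference bases


def cigar_is_consistent(pos, cigar, seq, ref):
    """Check one alignment; pos is the 1-based SAM POS on the reference."""
    if pos < 1 or not re.fullmatch(r"(\d+[MIDNSHP=X])+", cigar):
        return False
    ops = [(int(n), op) for n, op in re.findall(r"(\d+)([MIDNSHP=X])", cigar)]

    q, r = 0, pos - 1  # 0-based cursors into the read and the reference
    for n, op in ops:
        if op in QUERY_OPS and q + n > len(seq):
            return False  # CIGAR consumes more bases than the read has
        if op in REF_OPS and r + n > len(ref):
            return False  # alignment runs past the end of the reference
        if op in "=X":
            for a, b in zip(seq[q:q + n], ref[r:r + n]):
                if (a == b) != (op == "="):
                    return False  # '=' column differs or 'X' column matches
        if op in QUERY_OPS:
            q += n
        if op in REF_OPS:
            r += n
    return q == len(seq)  # every read base must be accounted for
```

A filtering pass could then keep only the SAM lines for which this function returns True.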

License

akhal is released under the BSD 3-Clause License, which allows redistribution and use in source and binary forms, with or without modification, under certain conditions. For the detailed terms, please refer to the license file.

Author

Developed by Akmuhammet Ashyralyyev.

Note from the author:

This tool is named after one of the most elegant horses, the Akhal-Teke. It is one of the oldest domesticated horse breeds and is considered one of the most beautiful and intelligent in the world.
