Panclade alignment#89

Merged
aineniamh merged 4 commits into main from updating-cI-ref
Jan 27, 2026

@aineniamh
Owner

  • Adds a pan clade composite consensus reference, generated from an alignment of PX667572 and NC_063383.
  • The gene boundaries file for pan clade contains genes from clade I and clade II, with coordinates mapped from the original reference sequence to the pan clade coordinate system.
  • The ITR is trimmed from position 193610, which is the clade II ITR mask site mapped onto the new coordinate system.
  • The mask file is currently empty.
  • Some translation functions that can generate the coordinate system map are included (in utils/panclade_map.py). To access the coordinate map, run the following:
    import squirrel.utils.panclade_map as pm
    coordinates = pm.get_coordinate_map()
  • get_coordinate_map() extracts coordinate mappings from predefined CIGAR strings.
    It retrieves a dictionary of CIGAR strings for the reference sequences and
    converts each CIGAR string into bidirectional coordinate mappings
    between genomic and alignment positions.
    Returns:
    dict: Dictionary mapping sequence IDs to coordinate information dictionaries,
    each containing:
    - cigar (str): CIGAR string representation.
    - genome_to_aln (dict): Maps genome positions to alignment positions.
    - aln_to_genome (dict): Maps alignment positions to genome positions,
      with None for gap positions.
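
For readers unfamiliar with this kind of mapping, here is a minimal sketch of how a CIGAR string could be turned into the two dictionaries described above, and how a gene interval might then be lifted into alignment coordinates. It is not the code in utils/panclade_map.py: the names cigar_to_maps and lift_interval are hypothetical, the CIGAR is a toy value, and the sketch assumes 1-based positions with M/=/X meaning "genome base occupies an alignment column" and D meaning "alignment column where this genome has a gap".

```python
import re

# Hypothetical helpers for illustration only; not the squirrel implementation.
CIGAR_OP = re.compile(r"(\d+)([MDX=])")


def cigar_to_maps(cigar):
    """Build 1-based genome<->alignment maps from a CIGAR string.

    Assumed convention: M, = and X consume one genome base and one
    alignment column; D consumes an alignment column only (a gap).
    """
    genome_to_aln = {}
    aln_to_genome = {}
    genome_pos = 0
    aln_pos = 0
    consumed = 0
    for match in CIGAR_OP.finditer(cigar):
        length, op = int(match.group(1)), match.group(2)
        consumed += len(match.group(0))
        for _ in range(length):
            aln_pos += 1
            if op == "D":
                # Gap in this genome: no genome position at this column.
                aln_to_genome[aln_pos] = None
            else:
                genome_pos += 1
                genome_to_aln[genome_pos] = aln_pos
                aln_to_genome[aln_pos] = genome_pos
    if consumed != len(cigar):
        # Some characters were not part of a supported operation.
        raise ValueError(f"unsupported or malformed CIGAR: {cigar!r}")
    return {
        "cigar": cigar,
        "genome_to_aln": genome_to_aln,
        "aln_to_genome": aln_to_genome,
    }


def lift_interval(start, end, genome_to_aln):
    """Map an inclusive genome interval onto alignment coordinates."""
    return genome_to_aln[start], genome_to_aln[end]


# Toy example: 3 aligned bases, a 2-column gap, then 4 aligned bases.
maps = cigar_to_maps("3M2D4M")
gene_in_aln = lift_interval(3, 5, maps["genome_to_aln"])
```

With the toy CIGAR, genome positions 4 to 7 shift right by two columns, so a gene spanning genome positions 3 to 5 covers alignment columns 3 to 7. The same lookup is presumably what places clade I and clade II gene boundaries onto the shared pan clade coordinate system, though the actual file may have been produced differently.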

@aineniamh aineniamh merged commit 1375e00 into main Jan 27, 2026
4 checks passed