Feature request – add support for CSI (*.csi) BAM indices #24

@jantusan

Hello,

Thank you for developing BAMscale; it has become my go-to tool for generating bigWig files. While processing a large wheat ChIP-seq dataset I ran into a limitation that I hope can be addressed (or perhaps you already have a workaround):

Summary

When a BAM is indexed with CSI (needed for large chromosomes or many contigs), BAMscale does not pick up the .csi file and fails with "cannot find *.bai".

Steps to reproduce

samtools index -c sample.bam    # creates sample.bam.csi
BAMscale scale --bam sample.bam --binsize 10
# ERROR: cannot find sample.bam.bai

Expected behaviour

  • Automatically load sample.bam.csi, or
  • Allow specifying the index path (e.g. --index sample.bam.csi).
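
To make the two options concrete, here is a minimal sketch in plain C of the lookup order I have in mind. It is not BAMscale code: resolve_index_path and file_readable are hypothetical helper names, and trying .csi before .bai is just my assumption about a sensible order. In practice HTSlib may already handle the lookup itself, so this only illustrates the behaviour being requested.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Return 1 if `path` can be opened for reading, 0 otherwise. */
int file_readable(const char *path)
{
    FILE *fp = fopen(path, "rb");
    if (fp == NULL)
        return 0;
    fclose(fp);
    return 1;
}

/*
 * Hypothetical helper (illustration only, not part of BAMscale or HTSlib).
 *
 * If `explicit_idx` is non-NULL (e.g. from a proposed --index option), use it.
 * Otherwise try "<bam>.csi" and then "<bam>.bai"; this order is an assumption.
 *
 * Returns 0 and stores the chosen path in `out` on success,
 * -1 if no readable index was found or `out` is too small.
 */
int resolve_index_path(const char *bam, const char *explicit_idx,
                       char *out, size_t out_size)
{
    static const char *const suffixes[] = { ".csi", ".bai" };
    size_t i;
    int n;

    assert(bam != NULL && out != NULL);

    if (explicit_idx != NULL) {
        n = snprintf(out, out_size, "%s", explicit_idx);
        if (n < 0 || (size_t)n >= out_size)
            return -1;
        return file_readable(out) ? 0 : -1;
    }

    for (i = 0; i < sizeof suffixes / sizeof suffixes[0]; i++) {
        n = snprintf(out, out_size, "%s%s", bam, suffixes[i]);
        if (n < 0 || (size_t)n >= out_size)
            continue;               /* candidate does not fit, skip it */
        if (file_readable(out))
            return 0;
    }
    return -1;
}
```

Called as resolve_index_path("sample.bam", NULL, buf, sizeof buf), this would select sample.bam.csi when it exists and fall back to sample.bam.bai otherwise, so existing BAI-based workflows would keep working.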

Feasibility notes (from HTSlib docs)

  • HTSlib loads BAI or CSI transparently via sam_index_load() after opening with hts_open/sam_open. See the sam.h API docs.
  • Explicit index path is supported in HTSlib ≥1.10 using the ##idx## syntax (e.g. sample.bam##idx##/path/to/sample.bam.csi) or via hts_idx_load2(fn, fnidx). See the 1.10 release notes and hts.h.
  • Background: BAI indexes are limited to chromosomes of at most 2^29 bp (512 Mi bases, roughly 537 Mbp), hence CSI for large genomes.
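
The notes above can be illustrated with a small standalone C sketch that does not need HTSlib. It shows where the BAI limit comes from (a binning index with a 2^min_shift smallest bin and depth levels addresses 2^(min_shift + 3*depth) bases; BAI fixes min_shift = 14 and depth = 5) and how the ##idx## filename from the example above could be assembled. The function names are hypothetical, and the 800,000,000 bp length mentioned afterwards is only an illustrative value, not a measurement from my data.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Largest coordinate span a binning index can address:
 * 2^(min_shift + 3 * depth). BAI corresponds to min_shift = 14, depth = 5. */
int64_t binning_index_max_span(int min_shift, int depth)
{
    assert(min_shift >= 0 && depth >= 0 && min_shift + 3 * depth < 63);
    return (int64_t)1 << (min_shift + 3 * depth);
}

/* Smallest depth whose span covers a reference of `max_len` bases.
 * Each extra level multiplies the addressable span by 8. */
int levels_needed(int min_shift, int64_t max_len)
{
    int depth = 0;
    int64_t span;

    assert(min_shift >= 0 && min_shift < 32);
    assert(max_len > 0 && max_len < ((int64_t)1 << 59));

    span = (int64_t)1 << min_shift;
    while (max_len > span) {
        span <<= 3;
        depth++;
    }
    return depth;
}

/* Hypothetical helper: join a BAM path and an explicit index path with the
 * "##idx##" separator. Returns 0 on success, -1 if `out` is too small. */
int build_idx_filename(const char *bam, const char *idx,
                       char *out, size_t out_size)
{
    int n;

    assert(bam != NULL && idx != NULL && out != NULL);
    n = snprintf(out, out_size, "%s##idx##%s", bam, idx);
    return (n < 0 || (size_t)n >= out_size) ? -1 : 0;
}
```

With these definitions, binning_index_max_span(14, 5) is 536870912, which is the BAI ceiling, while a hypothetical 800,000,000 bp chromosome needs levels_needed(14, 800000000) = 6 levels, one more than BAI provides.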

Environment

  • BAMscale v0.0.9
  • samtools 1.21

Thanks for considering this.
