flair transcriptome: crashing with just one (of 20) alignments #623

@sklages

command used

I ran flair just like this:

flair transcriptome \
  --genomealignedbam "$SOUT/${BAM_PFX}.bam" \
  --genome "$REF_FA" \
  --gtf "$GTF" \
  --output "$SOUT/$SAMPLE" \
  --threads 1

I usually run with 20 threads; I reduced this to 1 just for debugging.

How did you install Flair?

  1. conda env create -f misc/flair_basic_conda_env.yaml

The YAML looks like:

name: Flair_v3.0.0

channels:
  - conda-forge
  - bioconda
  - conda

# not yet tested with 3.13
dependencies:
  - python=3.12
  - minimap2=2.30
  - bedtools=2.31.1
  - samtools=1.22.1
  - R>=4.0,<5.0.0
  - r-ggplot2
  - r-qqman
  - bioconductor-deseq2
  - bioconductor-drimseq
  - bioconductor-stager
  - pip
  - pip:
      - flair-brookslab==v3.0.0b1
      - matplotlib>=3.10.0,<4.0.0

What happened?

I have 20 FASTQ files from ONT direct RNA samples. I aligned them with flair align to hg38 + GENCODE/basic, which worked fine.

Then I ran flair transcriptome as described above.
All samples ran through flair transcriptome successfully except one. I reduced the number of threads stepwise (20, 12, 8, 1), but the thread count does not seem to be the source of the problem.

error message / traceback

loading genome
making temp dir
Getting regions
Number of regions 1703
Generating splice site database
Extracting annotation from GTF
splitting by chunk

done running chunk 1 of 1703
<..>
done running chunk 1651 of 1703

done running chunk 1652 of 1703
failure: minimap2 -a -t 1 -N 4 --MD /path/to/rna_ont/10_flair/8289db29-92b9-48f9-9da9-b5026a482322/chrX-47422039-49301457.annotated_transcripts.fa /path/to/rna_ont/10_flair/8289db29-92b9-48f9-9da9-b5026a482322/chrX-47422039-49301457.reads.fasta | filter_transcriptome_align.py --sam - -o /path/to/rna_ont/10_flair/8289db29-92b9-48f9-9da9-b5026a482322/chrX-47422039-49301457.matchannot.counts.tsv -t 1 --quality 0 -w 100 --generate_map /path/to/rna_ont/10_flair/8289db29-92b9-48f9-9da9-b5026a482322/chrX-47422039-49301457.matchannot.read.map.txt --stringent -i /path/to/rna_ont/10_flair/8289db29-92b9-48f9-9da9-b5026a482322/chrX-47422039-49301457.annotated_transcripts.bed 2>[DataReader]
Traceback (most recent call last):
  File "/path/to/fs_links/software/virtual/Flair_v3.0.0/lib/python3.12/site-packages/pipettor/processes.py", line 412, in _raise_if_failed
    raise p.procExcept
pipettor.exceptions.ProcessException: process exited 1: filter_transcriptome_align.py --sam - -o /path/to/rna_ont/10_flair/8289db29-92b9-48f9-9da9-b5026a482322/chrX-47422039-49301457.matchannot.counts.tsv -t 1 --quality 0 -w 100 --generate_map /path/to/rna_ont/10_flair/8289db29-92b9-48f9-9da9-b5026a482322/chrX-47422039-49301457.matchannot.read.map.txt --stringent -i /path/to/rna_ont/10_flair/8289db29-92b9-48f9-9da9-b5026a482322/chrX-47422039-49301457.annotated_transcripts.bed:
multiprocessing.pool.RemoteTraceback: 
"""
Traceback (most recent call last):
  File "/path/to/virtual/Flair_v3.0.0/lib/python3.12/multiprocessing/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
                    ^^^^^^^^^^^^^^^^^^^
  File "/path/to/virtual/Flair_v3.0.0/lib/python3.12/site-packages/flair/filter_transcriptome_align.py", line 62, in process_read_chunk
    assignedts = getbesttranscript(filteredtranscriptaligns, args, transcripttoexons, transcripttobpssindex)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/path/to/virtual/Flair_v3.0.0/lib/python3.12/site-packages/flair/count_sam_transcripts.py", line 236, in getbesttranscript
    indel_detected, coveredpos, queryclipping, blockstarts, blocksizes, tendpos = process_cigar(args, matchvals, thist.cigar, thist.startpos)
                                                                                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/path/to/virtual/Flair_v3.0.0/lib/python3.12/site-packages/flair/count_sam_transcripts.py", line 194, in process_cigar
    coveredpos[-1] += blen
    ~~~~~~~~~~^^^^
IndexError: list index out of range
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/path/to/virtual/Flair_v3.0.0/lib/python3.12/site-packages/flair/filter_transcriptome_align.py", line 177, in <module>
    process_alignments(args, transcripttoexons, transcripttobpssindex)
  File "/path/to/virtual/Flair_v3.0.0/lib/python3.12/site-packages/flair/filter_transcriptome_align.py", line 144, in process_alignments
    for r in p.imap_unordered(process_read_chunk, bam_to_read_aligns(samfile, chunksize, tempDir, transcripttoexons,
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/path/to/virtual/Flair_v3.0.0/lib/python3.12/multiprocessing/pool.py", line 873, in next
    raise value
IndexError: list index out of range

multiprocessing.pool.RemoteTraceback: 
"""
Traceback (most recent call last):
  File "/path/to/fs_links/software/virtual/Flair_v3.0.0/lib/python3.12/multiprocessing/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
                    ^^^^^^^^^^^^^^^^^^^
  File "/path/to/fs_links/software/virtual/Flair_v3.0.0/lib/python3.12/site-packages/flair/flair_transcriptome.py", line 1061, in runcollapsebychrom
    goodaligntoannot, firstpasssingleexons, supannottranscripttojuncs = identify_good_match_to_annot(args, tempprefix,
                                                                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/path/to/fs_links/software/virtual/Flair_v3.0.0/lib/python3.12/site-packages/flair/flair_transcriptome.py", line 593, in identify_good_match_to_annot
    transcriptomealignandcount(args, tempprefix + '.reads.fasta',
  File "/path/to/fs_links/software/virtual/Flair_v3.0.0/lib/python3.12/site-packages/flair/flair_transcriptome.py", line 488, in transcriptomealignandcount
    pipettor.run([mm2_cmd, count_cmd])
  File "/path/to/fs_links/software/virtual/Flair_v3.0.0/lib/python3.12/site-packages/pipettor/__init__.py", line 17, in run
    Pipeline(cmds, stdin=stdin, stdout=stdout, stderr=stderr, logger=logger, logLevel=logLevel).wait()
  File "/path/to/fs_links/software/virtual/Flair_v3.0.0/lib/python3.12/site-packages/pipettor/processes.py", line 459, in wait
    self._wait_guts()
  File "/path/to/fs_links/software/virtual/Flair_v3.0.0/lib/python3.12/site-packages/pipettor/processes.py", line 452, in _wait_guts
    self._raise_if_failed()
  File "/path/to/fs_links/software/virtual/Flair_v3.0.0/lib/python3.12/site-packages/pipettor/processes.py", line 415, in _raise_if_failed
    raise ex
  File "/path/to/fs_links/software/virtual/Flair_v3.0.0/lib/python3.12/site-packages/pipettor/processes.py", line 412, in _raise_if_failed
    raise p.procExcept
pipettor.exceptions.ProcessException: process exited 1: filter_transcriptome_align.py --sam - -o /path/to/rna_ont/10_flair/8289db29-92b9-48f9-9da9-b5026a482322/chrX-47422039-49301457.matchannot.counts.tsv -t 1 --quality 0 -w 100 --generate_map /path/to/rna_ont/10_flair/8289db29-92b9-48f9-9da9-b5026a482322/chrX-47422039-49301457.matchannot.read.map.txt --stringent -i /path/to/rna_ont/10_flair/8289db29-92b9-48f9-9da9-b5026a482322/chrX-47422039-49301457.annotated_transcripts.bed:
multiprocessing.pool.RemoteTraceback: 
"""
Traceback (most recent call last):
  File "/path/to/virtual/Flair_v3.0.0/lib/python3.12/multiprocessing/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
                    ^^^^^^^^^^^^^^^^^^^
  File "/path/to/virtual/Flair_v3.0.0/lib/python3.12/site-packages/flair/filter_transcriptome_align.py", line 62, in process_read_chunk
    assignedts = getbesttranscript(filteredtranscriptaligns, args, transcripttoexons, transcripttobpssindex)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/path/to/virtual/Flair_v3.0.0/lib/python3.12/site-packages/flair/count_sam_transcripts.py", line 236, in getbesttranscript
    indel_detected, coveredpos, queryclipping, blockstarts, blocksizes, tendpos = process_cigar(args, matchvals, thist.cigar, thist.startpos)
                                                                                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/path/to/virtual/Flair_v3.0.0/lib/python3.12/site-packages/flair/count_sam_transcripts.py", line 194, in process_cigar
    coveredpos[-1] += blen
    ~~~~~~~~~~^^^^
IndexError: list index out of range
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/path/to/virtual/Flair_v3.0.0/lib/python3.12/site-packages/flair/filter_transcriptome_align.py", line 177, in <module>
    process_alignments(args, transcripttoexons, transcripttobpssindex)
  File "/path/to/virtual/Flair_v3.0.0/lib/python3.12/site-packages/flair/filter_transcriptome_align.py", line 144, in process_alignments
    for r in p.imap_unordered(process_read_chunk, bam_to_read_aligns(samfile, chunksize, tempDir, transcripttoexons,
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/path/to/virtual/Flair_v3.0.0/lib/python3.12/multiprocessing/pool.py", line 873, in next
    raise value
IndexError: list index out of range

"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/path/to/fs_links/software/virtual/Flair_v3.0.0/lib/python3.12/site-packages/flair/flair_cli.py", line 85, in main
    flair_module_run(opts, args.module, args.module_args)
  File "/path/to/fs_links/software/virtual/Flair_v3.0.0/lib/python3.12/site-packages/flair/flair_cli.py", line 55, in flair_module_run
    collapsefrombam()
  File "/path/to/fs_links/software/virtual/Flair_v3.0.0/lib/python3.12/site-packages/flair/flair_transcriptome.py", line 1151, in collapsefrombam
    for i in p.imap(runcollapsebychrom, chunkcmds):
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/path/to/fs_links/software/virtual/Flair_v3.0.0/lib/python3.12/multiprocessing/pool.py", line 873, in next
    raise value
pipettor.exceptions.ProcessException: process exited 1: filter_transcriptome_align.py --sam - -o /path/to/rna_ont/10_flair/8289db29-92b9-48f9-9da9-b5026a482322/chrX-47422039-49301457.matchannot.counts.tsv -t 1 --quality 0 -w 100 --generate_map /path/to/rna_ont/10_flair/8289db29-92b9-48f9-9da9-b5026a482322/chrX-47422039-49301457.matchannot.read.map.txt --stringent -i /path/to/rna_ont/10_flair/8289db29-92b9-48f9-9da9-b5026a482322/chrX-47422039-49301457.annotated_transcripts.bed:
multiprocessing.pool.RemoteTraceback: 
"""
Traceback (most recent call last):
  File "/path/to/virtual/Flair_v3.0.0/lib/python3.12/multiprocessing/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
                    ^^^^^^^^^^^^^^^^^^^
  File "/path/to/virtual/Flair_v3.0.0/lib/python3.12/site-packages/flair/filter_transcriptome_align.py", line 62, in process_read_chunk
    assignedts = getbesttranscript(filteredtranscriptaligns, args, transcripttoexons, transcripttobpssindex)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/path/to/virtual/Flair_v3.0.0/lib/python3.12/site-packages/flair/count_sam_transcripts.py", line 236, in getbesttranscript
    indel_detected, coveredpos, queryclipping, blockstarts, blocksizes, tendpos = process_cigar(args, matchvals, thist.cigar, thist.startpos)
                                                                                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/path/to/virtual/Flair_v3.0.0/lib/python3.12/site-packages/flair/count_sam_transcripts.py", line 194, in process_cigar
    coveredpos[-1] += blen
    ~~~~~~~~~~^^^^
IndexError: list index out of range
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/path/to/virtual/Flair_v3.0.0/lib/python3.12/site-packages/flair/filter_transcriptome_align.py", line 177, in <module>
    process_alignments(args, transcripttoexons, transcripttobpssindex)
  File "/path/to/virtual/Flair_v3.0.0/lib/python3.12/site-packages/flair/filter_transcriptome_align.py", line 144, in process_alignments
    for r in p.imap_unordered(process_read_chunk, bam_to_read_aligns(samfile, chunksize, tempDir, transcripttoexons,
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/path/to/virtual/Flair_v3.0.0/lib/python3.12/multiprocessing/pool.py", line 873, in next
    raise value
IndexError: list index out of range
Exception ignored in: <function Pool.__del__ at 0x7bac13eb4ae0>
Traceback (most recent call last):
  File "/path/to/fs_links/software/virtual/Flair_v3.0.0/lib/python3.12/multiprocessing/pool.py", line 271, in __del__
    self._change_notifier.put(None)
  File "/path/to/fs_links/software/virtual/Flair_v3.0.0/lib/python3.12/multiprocessing/queues.py", line 399, in put
    self._writer.send_bytes(obj)
  File "/path/to/fs_links/software/virtual/Flair_v3.0.0/lib/python3.12/multiprocessing/connection.py", line 200, in send_bytes
    self._send_bytes(m[offset:offset + size])
  File "/path/to/fs_links/software/virtual/Flair_v3.0.0/lib/python3.12/multiprocessing/connection.py", line 427, in _send_bytes
    self._send(header + buf)
  File "/path/to/fs_links/software/virtual/Flair_v3.0.0/lib/python3.12/multiprocessing/connection.py", line 384, in _send
    n = write(self._handle, buf)
        ^^^^^^^^^^^^^^^^^^^^^^^^
OSError: [Errno 9] Bad file descriptor
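
The innermost frame, `coveredpos[-1] += blen`, can only raise `IndexError: list index out of range` if `coveredpos` is still empty, i.e. an operation that extends the current block is processed before any block has been opened. The sketch below is a toy reconstruction of that pattern, not FLAIR's actual `process_cigar`; the function name `toy_covered_blocks` and its rules for each CIGAR operation are assumptions made purely for illustration. Whether minimap2 actually produced such a CIGAR in this chunk is not known.

```python
import re

# Standard SAM CIGAR operations: a length followed by one op character.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def toy_covered_blocks(cigar):
    """Toy block builder (hypothetical, NOT FLAIR's process_cigar).

    M/=/X opens a new block or extends the current one; D extends the
    most recent block in place, which fails if no block exists yet;
    N closes the current block.
    """
    coveredpos = []
    in_block = False
    for length, op in CIGAR_OP.findall(cigar):
        blen = int(length)
        if op in "M=X":
            if in_block:
                coveredpos[-1] += blen
            else:
                coveredpos.append(blen)
                in_block = True
        elif op == "D":
            # Raises IndexError when coveredpos is still empty,
            # e.g. for a CIGAR such as "5S3D10M".
            coveredpos[-1] += blen
        elif op == "N":
            in_block = False
        # S, H, I and P consume no reference bases and are ignored here.
    return coveredpos
```

In this toy version, a CIGAR whose first reference-consuming operation is a deletion triggers the same exception as in the traceback, while CIGARs that start with a match are processed normally.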

What else do we need to know?

I ran these jobs on a Slurm cluster. This is probably irrelevant, as 19 of 20 jobs finished successfully (and reproducibly).

Let me know if you need any more information. I cannot share any sequence or alignment data, but I can provide metadata.
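
One way to collect such metadata without exposing sequence data is to scan the transcriptome alignments of the failing chunk for unusual CIGAR strings. The following is a minimal sketch under a stated assumption: that the crash is triggered by an alignment whose first non-clipping CIGAR operation is not a match. That hypothesis is unconfirmed, and the helper names `first_aligned_op` and `suspicious_alignments` are hypothetical. Only the read name, reference name, position and CIGAR are reported.

```python
import re

# Standard SAM CIGAR operations: a length followed by one op character.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def first_aligned_op(cigar):
    """Return the first CIGAR op that is not clipping (S/H), or None."""
    for _length, op in CIGAR_OP.findall(cigar):
        if op not in "SH":
            return op
    return None


def suspicious_alignments(sam_lines):
    """Yield (qname, rname, pos, cigar) for mapped SAM records whose
    first non-clipping op is not a match (M, = or X)."""
    for line in sam_lines:
        if line.startswith("@"):
            continue  # header line
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 6:
            continue  # not a complete alignment record
        qname, flag, rname, pos, _mapq, cigar = fields[:6]
        if int(flag) & 0x4 or cigar == "*":
            continue  # unmapped read or no CIGAR available
        if first_aligned_op(cigar) not in ("M", "=", "X"):
            yield qname, rname, int(pos), cigar
```

If the temporary chunk files are still on disk, the minimap2 command from the `failure:` line could be rerun with its SAM output written to a file, and that file passed to `suspicious_alignments` as an open file handle. Any records it yields would be candidates for the alignment that breaks the CIGAR processing.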

Thank you.
