Assertion failure about scoreEnd vs. scoreHere in GraphAligner 1.0.20 #111

@adamnovak

The GraphAligner 1.0.20 release fixed several assertion failure errors (warnings) and a crash that I was getting with the previous release, but I can still get it to fail an assertion. Because the previous release crashed after assertion failures, I'm not sure it's really safe to continue when they come up, so I'm interested in getting this fixed.

I ran:

/usr/bin/time -v singularity run -B /private:/private docker://quay.io/biocontainers/graphaligner:1.0.20--h06902ac_0 GraphAligner -t 62 -g /private/groups/patenlab/anovak/projects/hprc/lr-giraffe/graphs/hprc-v2.0-mc-chm13-eval.d46.gfa -f /private/groups/patenlab/anovak/projects/hprc/lr-giraffe/reads/real/r10y2025/HG002/HG002_PAW70337.full.fq.gz --seeds-mxm-length 30 --seeds-mem-count 10000 --bandwidth 15 --multimap-score-fraction 0.99 --precise-clipping 0.85 --min-alignment-score 100 --clip-ambiguous-ends 100 --overlap-incompatible-cutoff 0.15 --max-trace-count 5 --mem-index-no-wavelet-tree -a ./output/graphaligner.gam 2>&1 | tee ./output/graphaligner.log

(Most of the parameters came from the suggestion to my collaborator @xchang1 in #106 (comment). I'm using Singularity and Biocontainers here because I'm not certain about having the required licences to use Conda.)

I let it run overnight and I got this:

INFO:    Using cached SIF image
GraphAligner bioconda 1.0.20-
GraphAligner bioconda 1.0.20-
Load graph from /private/groups/patenlab/anovak/projects/hprc/lr-giraffe/graphs/hprc-v2.0-mc-chm13-eval.d46.gfa
Build MUM/MEM seeder from the graph
Build alignment graph
MEM seeds, min length 30, max count 10000
Seed cluster size 1
Extend up to 5 seed clusters
Alignment bandwidth 15
Clip alignment ends with identity < 85%
X-drop DP score cutoff 33333
Backtrace from 5 highest scoring local maxima per cluster
write alignments to ./output/graphaligner.gam
Align
src/GraphAlignerBitvectorCommon.h:1134: Assertion 'previous.node(neighbor).endSlice.scoreEnd >= scoreHere-(eq?0:1)' failed. Read: 14e0a73d-736e-4c42-abac-dbfd965f07d6. Seed: 0+,0,0,0

At that point I stopped the run to report the issue.
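
Since the assertion message names the read and the seed, the failing reads can be pulled out of a log like the one above. Here is a minimal Python sketch of that. The regular expression is an assumption modelled on the single assertion line shown in this report, and `parse_assertion_failures` is a hypothetical helper name, not part of GraphAligner. Other GraphAligner messages may be formatted differently.

```python
import re

# Assumed format, modelled on the one assertion line in the log above:
#   <file>:<line>: Assertion '<condition>' failed. Read: <id>. Seed: <seed>
ASSERT_RE = re.compile(
    r"^(?P<file>[^:\s]+):(?P<line>\d+): Assertion '(?P<cond>.*)' failed\. "
    r"Read: (?P<read>\S+)\. Seed: (?P<seed>\S+)$"
)


def parse_assertion_failures(log_lines):
    """Return one dict per log line that matches the assumed assertion format."""
    failures = []
    for raw in log_lines:
        match = ASSERT_RE.match(raw.strip())
        if match:
            record = match.groupdict()
            record["line"] = int(record["line"])  # source line number as int
            failures.append(record)
    return failures


if __name__ == "__main__":
    # Two lines copied from the log in this report.
    sample = [
        "Align",
        "src/GraphAlignerBitvectorCommon.h:1134: Assertion "
        "'previous.node(neighbor).endSlice.scoreEnd >= scoreHere-(eq?0:1)' "
        "failed. Read: 14e0a73d-736e-4c42-abac-dbfd965f07d6. Seed: 0+,0,0,0",
    ]
    for failure in parse_assertion_failures(sample):
        print(failure["read"], failure["file"], failure["line"])
```

Passing an open handle on `./output/graphaligner.log` instead of `sample` would list every read that tripped an assertion during the run, as long as the read IDs contain no whitespace.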

I've uploaded the GFA file (gzip-compressed) and the offending read as a single-read FASTQ to:

https://public.gi.ucsc.edu/~anovak/outbox/tracks/big/graphaligner_assert/

Our security certificate expired last week, but it should hopefully be renewed soon.
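
For anyone who wants to produce a single-read FASTQ like this from their own data, here is a minimal Python sketch. It assumes four-line FASTQ records (no wrapped sequence lines) and takes the read ID to be the first whitespace-delimited token after `@`. The function names and any paths passed in are hypothetical, and this is not necessarily how the uploaded file was made.

```python
import gzip


def open_text(path, mode="rt"):
    """Open a plain or gzip-compressed file in text mode, based on its suffix."""
    if str(path).endswith(".gz"):
        return gzip.open(path, mode)
    return open(path, mode)


def extract_read(fastq_in, read_id, fastq_out):
    """Copy the first record whose ID equals read_id to fastq_out.

    Returns True if the read was found and written, False otherwise.
    Assumes each record is exactly four lines.
    """
    with open_text(fastq_in) as src:
        while True:
            header = src.readline()
            if not header:
                return False  # reached end of file without a match
            seq = src.readline()
            plus = src.readline()
            qual = src.readline()
            # The ID is the first token after the leading '@'.
            fields = header[1:].split()
            if fields and fields[0] == read_id:
                with open_text(fastq_out, "wt") as dst:
                    dst.writelines([header, seq, plus, qual])
                return True
```

For example, `extract_read("reads.fq.gz", "14e0a73d-736e-4c42-abac-dbfd965f07d6", "offending_read.fq")` would write that one record if it is present. Both file names here are placeholders.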

I'm working with 2025-era R10 reads, and a prototype Human Pangenome Reference Consortium v2.0 graph.
