Zero position in VCF #343

@drtconway

Hi Noodles,

I think this might be a bug in Noodles. At first I thought it was a bug in the upstream SV caller, but on reflection, I think they might be doing the right thing, more or less.

I have an SV record that begins:

chr4_GL000008v2_random  0   cuteSV.INS.8444 G   GGTACTTTTTGTTTCTCAG...

It's a corner case that the VCF specification doesn't handle gracefully.

It seems clear that the intent is for the ALT sequence to be inserted before the beginning of the contig.

Now there is obviously the problem of the context base in the REF column: putting the first base G there is kind of misleading, and I posit that using N would be preferable.

However, my main concern is with the 0 position. The VCF parser gives a None for rec.variant_start(), which is kind of reasonable given that positions are 1-based, except for this specific edge case, where 0 would actually be more accurate.

The result of rec.variant_start() is easy to deal with - I can map None onto 0 and get on with things.
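
A minimal sketch of that workaround, using only the standard library. It assumes a 1-based position behaves like a non-zero integer, so a POS of 0 has no representation and comes back as None. `parse_pos` and `start_or_zero` are hypothetical helpers written for illustration; they are not part of Noodles.

```rust
use std::num::{NonZeroUsize, ParseIntError};

/// Hypothetical stand-in for parsing a 1-based POS field.
/// A value of 0 parses as an integer but is not a valid 1-based
/// position, so it is reported as `None` rather than as an error.
fn parse_pos(field: &str) -> Result<Option<NonZeroUsize>, ParseIntError> {
    field.parse::<usize>().map(NonZeroUsize::new)
}

/// Maps the "no position" case back onto 0, keeping real parse
/// failures (for example non-numeric text) as errors.
fn start_or_zero(field: &str) -> Result<usize, ParseIntError> {
    Ok(parse_pos(field)?.map_or(0, NonZeroUsize::get))
}
```

With this, `start_or_zero("0")` gives `Ok(0)`, while a malformed field such as `"x"` is still an error rather than being silently turned into 0.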

Where things run into trouble is when I call rec.variant_end(), which produces an error:

invalid INFO END position

If rec.variant_start() returns None for this case, would it be more consistent for rec.variant_end() to do the same here?
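
To make the suggestion concrete, here is a rough sketch of the behaviour I have in mind. It assumes that, when no INFO END is given, the end is derived as start + len(REF) - 1; I have not checked that this is how Noodles computes it. `variant_end` below is a hypothetical free function, not the Noodles method.

```rust
use std::num::NonZeroUsize;

/// Hypothetical end-position calculation: if there is no start
/// position (for example POS = 0), there is no end position either,
/// instead of an error.
fn variant_end(start: Option<NonZeroUsize>, ref_len: usize) -> Option<NonZeroUsize> {
    // Propagate the missing start as a missing end.
    let start = start?;
    // Assumed rule: end = start + len(REF) - 1.
    // `start` is at least 1, so the subtraction cannot underflow.
    let end = start.get().checked_add(ref_len)? - 1;
    // An end of 0 (only possible for an empty REF at start 1) is
    // not a valid 1-based position, so it is also reported as None.
    NonZeroUsize::new(end)
}
```

For the record above, the start is None and REF is a single base, so `variant_end(None, 1)` would be None, matching what rec.variant_start() returns.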

It's a bit of an annoying corner case!

Cheers,
Tom.