@ainefairbrother ainefairbrother commented Nov 6, 2025

Description

This PR extends the VEP plugin EVE.pm to handle popEVE data. The plugin can now annotate with EVE data, popEVE data, or both.

Main changes

  • Added a new parameter: popeve_file
    • Backwards compatibility is preserved: the existing file parameter is unchanged and still refers to the EVE file
  • Plugin now outputs popEVE_* fields when a popEVE record matches the user's input, and continues to emit EVE_* fields when an EVE record matches
  • Added handling of the popEVE file and handling of the matching results due to popEVE/EVE differences:
    • popEVE = genomic SNV scores, so POS is the variant position
    • EVE = codon-level substitutions (3-bp alleles), so POS is the codon start and the user's variant might be at POS, POS+1, POS+2
    • As such, the EVE logic is retained:
      • A codon-sized window starting 2 nt upstream of the input variant is defined, and tabix returns all records in that range (typically 0 or 1 popEVE rows and 0-3 EVE rows, 1 per possible codon start)
      • Every returned row is tested against the user's input allele with get_matched_variant_alleles
      • EVE rows match only if the user's change can produce the ALT codon from the REF codon (a 1bp SNV won't match codons requiring 2-3 base changes)
    • New logic added for popEVE and EVE/popEVE merging
      • For popEVE, rows match on the exact SNV at POS
      • All matches found for popEVE & EVE are then merged into a single output row
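The matching rules above can be sketched in a few lines of Python. This is an illustration only, not the plugin's Perl code: `Row`, `query_window`, `row_matches` and `annotate` are hypothetical names, the `SCORE` key and all values are invented, only forward-strand SNVs are considered, and the real plugin delegates allele comparison to get_matched_variant_alleles.

```python
from dataclasses import dataclass


@dataclass
class Row:
    """One record returned by the tabix query (hypothetical structure)."""
    source: str  # "EVE" or "popEVE"
    pos: int     # 1-based POS column of the VCF record
    ref: str     # REF allele (3 bp codon for EVE, 1 bp for popEVE)
    alt: str     # ALT allele
    info: dict   # INFO key/value pairs to report


def query_window(var_pos):
    """Range covering every codon start whose codon could contain var_pos."""
    return var_pos - 2, var_pos


def row_matches(row, var_pos, var_ref, var_alt):
    """Apply the source-specific rule for a single-base input variant."""
    if row.source == "popEVE":
        # popEVE: exact SNV at POS.
        return (row.pos, row.ref, row.alt) == (var_pos, var_ref, var_alt)
    # EVE: the input SNV alone must turn the REF codon into the ALT codon.
    offset = var_pos - row.pos
    if len(row.ref) != 3 or len(row.alt) != 3 or not 0 <= offset <= 2:
        return False
    if row.ref[offset] != var_ref:
        return False
    mutated = row.ref[:offset] + var_alt + row.ref[offset + 1:]
    return mutated == row.alt


def annotate(rows, var_pos, var_ref, var_alt):
    """Merge all matching EVE and popEVE rows into one output record."""
    start, end = query_window(var_pos)
    merged = {}
    for row in rows:
        if start <= row.pos <= end and row_matches(row, var_pos, var_ref, var_alt):
            for key, value in row.info.items():
                merged[f"{row.source}_{key}"] = value
    return merged


# Toy records with invented coordinates and scores (not real EVE/popEVE data).
rows = [
    Row("EVE", 99, "AGT", "ACT", {"SCORE": 0.5}),    # one-base change: matches
    Row("EVE", 99, "AGT", "CCT", {"SCORE": 0.9}),    # needs two changes: no match
    Row("popEVE", 100, "G", "C", {"SCORE": -1.0}),   # exact SNV: matches
    Row("popEVE", 100, "G", "T", {"SCORE": -2.0}),   # different ALT: no match
]
print(annotate(rows, 100, "G", "C"))
```

In this toy case the G>C variant at position 100 picks up one EVE row and one popEVE row, which end up in a single merged record.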

Testing

  • It would be beneficial if the reviewer would test with a range of input variants
  • See the header of EVE.pm for EVE/popEVE input file download and prep instructions. The EVE file is also available on nfs at ensembl/variation/data/EVE

in.vcf:

##fileformat=VCFv4.2
##source=test
#CHROM	POS	ID	REF	ALT	QUAL	FILTER	INFO
chr1	961387	.	G	C	.	PASS	.

VEP command:

EVE_file="eve_merged.vcf.gz"
POPEVE_file="grch38_popEVE_ukbb_20250715.vcf.gz"

./vep \
-i in.vcf \
-o out.tsv \
--dir_cache /nfs/production/flicek/ensembl/variation/data/VEP/tabixconverted \
--cache_version 115 \
--format vcf --symbol --tab --offline \
--cache \
--fasta Homo_sapiens.GRCh38.dna.toplevel.fa.gz \
--assembly GRCh38 \
--force_overwrite \
--no_escape \
--show_ref_allele \
--no_stats \
--plugin EVE,file=$EVE_file,popeve_file=$POPEVE_file

Output:

  • 76-row TSV file with 2 EVE_ columns and 9 popEVE_ columns
  • 7 rows should be annotated with both EVE and popEVE values
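One way to check these numbers is a small script over out.tsv. The sketch below assumes the usual VEP `--tab` layout (`##` metadata lines, one `#`-prefixed column header line, `-` for empty values) and that the new columns are prefixed `EVE_` and `popEVE_` as described above; `count_annotated` is a hypothetical helper, and the sample text is invented.

```python
import io


def count_annotated(handle):
    """Return (data_rows, rows_with_both_EVE_and_popEVE_values)."""
    header = None
    total = 0
    both = 0
    for raw in handle:
        line = raw.rstrip("\n")
        if not line or line.startswith("##"):
            continue  # skip blank lines and metadata
        if line.startswith("#"):
            header = line.lstrip("#").split("\t")  # column header line
            continue
        if header is None:
            raise ValueError("data row found before the column header")
        fields = dict(zip(header, line.split("\t")))
        total += 1
        has_eve = any(v != "-" for k, v in fields.items() if k.startswith("EVE_"))
        has_pop = any(v != "-" for k, v in fields.items() if k.startswith("popEVE_"))
        if has_eve and has_pop:
            both += 1
    return total, both


# Invented three-row sample; column names other than the prefixes are assumptions.
sample = (
    "## VEP output (toy)\n"
    "#Uploaded_variation\tLocation\tEVE_SCORE\tpopEVE_SCORE\n"
    "v1\t1:100\t0.5\t-1.0\n"
    "v1\t1:100\t-\t-1.0\n"
    "v1\t1:100\t-\t-\n"
)
print(count_annotated(io.StringIO(sample)))
```

For the real file, pass `open("out.tsv")` instead of the in-memory sample.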

@ainefairbrother ainefairbrother changed the title from "Make EVE plugin able to handle popEVE data" to "Extend EVE plugin to handle popEVE data" Nov 6, 2025
@ainefairbrother ainefairbrother marked this pull request as ready for review November 14, 2025 11:40
@nakib103 nakib103 self-requested a review November 17, 2025 08:05
@nakib103 nakib103 self-assigned this Dec 3, 2025
my $class_key = "Class" . $self->{class_number};

# prefer INFO descriptions pulled from VCF but fall back to hard-coded descriptions
my $h = {

If only the popEVE file is given, they will still be in the header.

alts => [$variant->{alt}],
pos => $variant->{start},
}
{ ref => $ref_allele, alts => $alt_alleles, pos => $vf->{start}, strand => $vf->strand },

It would be better to keep the formatting the same as before, as other plugins use the same format.

@nakib103 nakib103 left a comment

@ainefairbrother,

Besides the other comments, two further points:

  • Please add an example command for popEVE to the synopsis.
  • Instead of using Bio::DB::HTS, can we hardcode the header? Some users might not have the library installed, and requiring it would prevent them from using the plugin.
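The header-related comments in this review (only advertise fields for the files actually supplied; do not depend on reading the VCF header) could be combined roughly as follows. This is a Python sketch of the idea rather than a patch for the Perl plugin: `header_fields` is a hypothetical function, the field names `EVE_SCORE`, `EVE_CLASS` and `popEVE_SCORE` are assumed for illustration, and the description strings are placeholders.

```python
# Hard-coded fallback descriptions (placeholder text, assumed field names).
EVE_DEFAULTS = {
    "EVE_SCORE": "EVE score for the matched codon substitution",
    "EVE_CLASS": "EVE classification for the matched codon substitution",
}
POPEVE_DEFAULTS = {
    "popEVE_SCORE": "popEVE score for the matched SNV",
}


def header_fields(eve_file=None, popeve_file=None, vcf_descriptions=None):
    """Build the output header description map.

    Only fields belonging to a supplied file are included. Descriptions read
    from a VCF header (if any were obtained) take precedence over the
    hard-coded defaults, so the plugin still works when none are available.
    """
    vcf_descriptions = vcf_descriptions or {}
    fields = {}
    if eve_file:
        fields.update(EVE_DEFAULTS)
    if popeve_file:
        fields.update(POPEVE_DEFAULTS)
    for name in fields:
        if name in vcf_descriptions:
            fields[name] = vcf_descriptions[name]
    return fields


# With only a popEVE file configured, no EVE_* fields appear in the header.
print(sorted(header_fields(popeve_file="popeve.vcf.gz")))
```

Keeping the defaults in plain dictionaries means the optional VCF-header lookup can fail or be skipped without affecting the output columns.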
