Error while working on hg38 vcf #4

@vodkatad

Hello, I've just pulled this image with Docker:
pp535/metamutationalsigs latest a09b904af41e 15 months ago 10.3GB

I then ran it with this command:
egrassi@godot:/scratch/trcanmed/AF_spectra/datasetV2$ docker run --rm -v /scratch/trcanmed/AF_spectra/datasetV2/stash/:/app/input_vcf_dir pp535/metamutationalsigs --mutationalSignatures
The stash directory contains several VCF files that I have already analysed with MutationalSignatures without any issues, so the format should be fine.
I get this output:

/app/input_vcf_dir GRCh37 FALSE TRUE TRUE TRUE
Starting matrix generation for SNVs and DINUCs...

Looking in the logs I see:

egrassi@godot:/scratch/trcanmed/AF_spectra/datasetV2/stash/logs$ cat SigProfilerMatrixGenerator_MetaMutationalSigs_GRCh372023-03-31.err 
Traceback (most recent call last):
  File "metaMutatationalSignatures.py", line 105, in <module>
    matGen.SigProfilerMatrixGeneratorFunc("MetaMutationalSigs", genome_ref , input_dir)
  File "/usr/local/lib/python3.8/dist-packages/SigProfilerMatrixGenerator/scripts/SigProfilerMatrixGeneratorFunc.py", line 444, in SigProfilerMatrixGeneratorFunc
    mutation_pd, skipped_mut, total, total_DINUC, mutation_dinuc_pd_all = matGen.catalogue_generator_single (lines, chrom, mutation_pd, mutation_dinuc_pd_all, mutation_types_tsb_context, vcf_path, vcf_path_original, vcf_files, bed_file_path, chrom_path, project, output_matrix, context, exome, genome, ncbi_chrom, functionFlag, bed, bed_ranges, chrom_based, plot, tsb_ref, transcript_path, tsb_stat, seqInfo, gs, log_file)
  File "/usr/local/lib/python3.8/dist-packages/SigProfilerMatrixGenerator/scripts/SigProfilerMatrixGenerator.py", line 359, in catalogue_generator_single
    mnv_seq += tsb_ref[chrom_string[l-1]][1]
IndexError: index out of range

Can you help me troubleshoot this error?
My VCF files contain some contigs other than chr1-22. I have filtered them out of the mutation records, but they are still listed in the header. Could this be the issue?
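
To make the filtering concrete, here is a minimal Python sketch of the kind of cleanup described above, extended to drop the leftover `##contig` header lines as well as the records. The function name `strip_extra_contigs`, the `CANONICAL` set, and the sample lines are illustrative assumptions, and I don't know whether removing the header entries actually avoids the IndexError.

```python
import re

# Assumption: only chr1..chr22 are kept, as in the post; add chrX/chrY if needed.
CANONICAL = {f"chr{i}" for i in range(1, 23)}

# Matches header lines such as ##contig=<ID=chr1,length=...> and captures the ID.
CONTIG_ID = re.compile(r"^##contig=<ID=([^,>]+)")


def strip_extra_contigs(lines, keep=CANONICAL):
    """Yield VCF lines, dropping ##contig headers and records for contigs not in keep."""
    for line in lines:
        match = CONTIG_ID.match(line)
        if match:
            # Contig declaration in the header: keep it only for wanted contigs.
            if match.group(1) in keep:
                yield line
        elif line.startswith("#"):
            # Any other header line is passed through unchanged.
            yield line
        elif line.split("\t", 1)[0] in keep:
            # Data record: the first tab-separated column is CHROM.
            yield line


# Made-up sample lines, for illustration only.
sample = [
    "##fileformat=VCFv4.2\n",
    "##contig=<ID=chr1>\n",
    "##contig=<ID=chrUn_example>\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n",
    "chr1\t12345\t.\tA\tT\t.\tPASS\t.\n",
    "chrUn_example\t100\t.\tG\tC\t.\tPASS\t.\n",
]
cleaned = list(strip_extra_contigs(sample))
```

The generator works on any iterable of lines, so it can be fed an open file handle and its output written to a new file with `writelines`, leaving the original VCF untouched.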

Other info:

THIS FILE CONTAINS THE METADATA ABOUT SYSTEM AND RUNTIME


-------System Info-------
Operating System Name: Linux
Nodename: e95904a92b3b
Release: 4.19.0-6-amd64
Version: #1 SMP Debian 4.19.67-2+deb10u2 (2019-11-11)

-------Python and Package Versions------- 
Python Version: 3.8.5
SigProfilerMatrixGenerator Version: 1.1.27
SigProfilerPlotting version: 1.1.15
matplotlib version: 3.4.1
statsmodels version: 0.12.2
scipy version: 1.6.2
pandas version: 1.2.3
numpy version: 1.20.2

-------Vital Parameters Used for the execution -------
Project: MetaMutationalSigs
Genome: GRCh37
Input File Path: /app/input_vcf_dir/
exome: False
bed_file: None
chrom_based: False
plot: False
tsb_stat: False
seqInfo: True

-------Date and Time Data------- 
Date and Clock time when the execution started: 2023-03-31 15:26:10.928826

Thanks!
