polars.exceptions.ComputeError: could not parse `-` as dtype `i64` at column 'score' (column number 6) #35

@jazsakr

Hello! I am running the latest version of the FIRE pipeline with ONT data. I have not had any issues before, but after the update I got the following error when running a new sample:

localrule fdr_track_chromosome:
    input: temp/260Fiber/v0.1.1-GL456354.1.pileup.bed.gz, results/260Fiber/additional-outputs-v0.1.1/fire-peaks/260Fiber-v0.1.1-fire-score-to-fdr.tbl
    output: temp/260Fiber/fire-peaks/v0.1.1-GL456354.1-FDR.track.bed
    jobid: 0
    reason: Forced execution
    wildcards: sm=260Fiber, v=v0.1.1, chrom=GL456354.1
    threads: 4
    resources: mem_mb=94208, mem_mib=89844, disk_mb=4096, disk_mib=3907, tmpdir=/tmp/jsakr/36408502, runtime=120, slurm_partition=standard, slurm_account=seyedam_lab

Activating conda environment: ../../../../../../../data/homezvol3/jsakr/snakemake-conda-envs/fc5018ac87025e0ae5ebc0630d3241a7_
[INFO][Time elapsed (ms) 874]: Read FDR table:
      threshold       FDR  shuffled_bp      real_bp
0         -1.00  1.000000          1.0          1.0
1          4.69  0.996092  192991831.0  193748972.0
2          4.70  0.996064  192981454.0  193744101.0
3          4.71  0.991812  191478713.0  193059441.0
4          4.72  0.991793  191464091.0  193048458.0
...         ...       ...          ...          ...
2147      45.11  0.000611        336.0     550139.0
2148      45.76  0.000419        219.0     522586.0
2149      47.61  0.000379        168.0     443844.0
2150      54.92  0.000219         46.0     209682.0
2151      55.13  0.000000          0.0     205122.0

[2152 rows x 4 columns]
Traceback (most recent call last):
  File "/data/homezvol3/jsakr/.cache/snakemake/snakemake/source-cache/runtime-cache/tmp5ngbjd72/file/data/homezvol3/jsakr/packages/FIREv0.1.1/workflow/rules/../scripts/fdr-table.py", line 333, in <module>
    defopt.run(main, show_types=True, version="0.0.1")
  File "/data/homezvol3/jsakr/snakemake-conda-envs/fc5018ac87025e0ae5ebc0630d3241a7_/lib/python3.10/site-packages/defopt.py", line 356, in run
    return call()
  File "/data/homezvol3/jsakr/.cache/snakemake/snakemake/source-cache/runtime-cache/tmp5ngbjd72/file/data/homezvol3/jsakr/packages/FIREv0.1.1/workflow/rules/../scripts/fdr-table.py", line 324, in main
    apply_fdr_table(infile, outfile, fdr_table, nrows)
  File "/data/homezvol3/jsakr/.cache/snakemake/snakemake/source-cache/runtime-cache/tmp5ngbjd72/file/data/homezvol3/jsakr/packages/FIREv0.1.1/workflow/rules/../scripts/fdr-table.py", line 205, in apply_fdr_table
    pileup = read_pileup_file(infile, nrows)
  File "/data/homezvol3/jsakr/.cache/snakemake/snakemake/source-cache/runtime-cache/tmp5ngbjd72/file/data/homezvol3/jsakr/packages/FIREv0.1.1/workflow/rules/../scripts/fdr-table.py", line 48, in read_pileup_file
    header = my_read_csv(infile, separator="\t", n_rows=1).columns
  File "/data/homezvol3/jsakr/.cache/snakemake/snakemake/source-cache/runtime-cache/tmp5ngbjd72/file/data/homezvol3/jsakr/packages/FIREv0.1.1/workflow/rules/../scripts/fdr-table.py", line 31, in my_read_csv
    result = pl.read_csv(*args, **kwargs)
  File "/data/homezvol3/jsakr/.local/lib/python3.10/site-packages/polars/_utils/deprecation.py", line 135, in wrapper
    return function(*args, **kwargs)
  File "/data/homezvol3/jsakr/.local/lib/python3.10/site-packages/polars/_utils/deprecation.py", line 135, in wrapper
    return function(*args, **kwargs)
  File "/data/homezvol3/jsakr/.local/lib/python3.10/site-packages/polars/_utils/deprecation.py", line 135, in wrapper
    return function(*args, **kwargs)
  [Previous line repeated 1 more time]
  File "/data/homezvol3/jsakr/.local/lib/python3.10/site-packages/polars/io/csv/functions.py", line 422, in read_csv
    df = _read_csv_impl(
  File "/data/homezvol3/jsakr/.local/lib/python3.10/site-packages/polars/io/csv/functions.py", line 568, in _read_csv_impl
    pydf = PyDataFrame.read_csv(
polars.exceptions.ComputeError: could not parse `-` as dtype `i64` at column 'score' (column number 6)

The current offset in the file is 8191 bytes.

You might want to try:
- increasing `infer_schema_length` (e.g. `infer_schema_length=10000`),
- specifying correct dtype with the `dtypes` argument
- setting `ignore_errors` to `True`,
- adding `-` to the `null_values` list.

Original error: ```remaining bytes non-empty```
[Tue Mar 11 18:22:31 2025]
Error in rule fdr_track_chromosome:
    jobid: 0
    input: temp/260Fiber/v0.1.1-GL456354.1.pileup.bed.gz, results/260Fiber/additional-outputs-v0.1.1/fire-peaks/260Fiber-v0.1.1-fire-score-to-fdr.tbl
    output: temp/260Fiber/fire-peaks/v0.1.1-GL456354.1-FDR.track.bed
    conda-env: /data/homezvol3/jsakr/snakemake-conda-envs/fc5018ac87025e0ae5ebc0630d3241a7_
    shell:
        
        python /data/homezvol3/jsakr/.cache/snakemake/snakemake/source-cache/runtime-cache/tmp5ngbjd72/file/data/homezvol3/jsakr/packages/FIREv0.1.1/workflow/rules/../scripts/fdr-table.py -v 1             --fdr-table results/260Fiber/additional-outputs-v0.1.1/fire-peaks/260Fiber-v0.1.1-fire-score-to-fdr.tbl             temp/260Fiber/v0.1.1-GL456354.1.pileup.bed.gz temp/260Fiber/fire-peaks/v0.1.1-GL456354.1-FDR.track.bed
        
        (one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)

Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message

Any suggestions? Thank you!
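
In case it helps with triage, here is a minimal sketch (standard library only) for checking which values in the `score` column of the pileup fail to parse as integers. The function name `non_integer_values` is hypothetical, and the sketch assumes the first line of the file is a tab-separated header that names a `score` column, possibly with a leading `#`. It is not part of the FIRE pipeline.

```python
import gzip
from collections import Counter


def non_integer_values(path, column="score", sep="\t"):
    """Count the values in `column` that do not parse as an integer."""
    # Pick the opener from the file extension (the pileup here is .bed.gz).
    opener = gzip.open if str(path).endswith(".gz") else open
    bad = Counter()
    with opener(path, "rt") as handle:
        # Assumption: the first line is a header naming the columns.
        header = handle.readline().rstrip("\n").lstrip("#").split(sep)
        idx = header.index(column)
        for line in handle:
            fields = line.rstrip("\n").split(sep)
            if idx >= len(fields):
                # Row is shorter than the header; record it separately.
                bad["<missing field>"] += 1
                continue
            try:
                int(fields[idx])
            except ValueError:
                bad[fields[idx]] += 1
    return bad
```

Calling `non_integer_values("temp/260Fiber/v0.1.1-GL456354.1.pileup.bed.gz")` would show whether `-` is the only offending value and how often it occurs. If `-` turns out to be a missing-value marker, adding it to `null_values` in the `pl.read_csv` call, as the polars message suggests, may be one option.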
