Processing SRA data. #37

@TCLamnidis

Example entry: https://www.ebi.ac.uk/ena/browser/view/PRJNA883375

When downloading data from the ENA that was originally uploaded to the SRA, it seems 3 FastQ files are made available.

I had assumed these are {merged, R1, R2}, as with ENA-uploaded data, but further inspection suggests otherwise. SRR21678398.fastq.gz contains only ~15M reads out of the total ~580M. The rest are apparently in SRR21678398_{1,2}.fastq.gz, which are presumably paired.
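One way to check how the reads are split across the three files is to count records per file. This is a minimal sketch, assuming standard four-line FASTQ records; `count_fastq_reads` is a hypothetical helper written for illustration, not something provided by eager or the ENA.

```python
import gzip


def count_fastq_reads(path):
    """Count records in a gzipped FASTQ file.

    Assumes every record spans exactly four lines (header, sequence,
    separator, qualities), which holds for typical ENA/SRA downloads.
    """
    with gzip.open(path, "rt") as handle:
        n_lines = sum(1 for _ in handle)
    if n_lines % 4 != 0:
        raise ValueError(f"{path}: line count {n_lines} is not a multiple of 4")
    return n_lines // 4
```

Running it on `SRR21678398.fastq.gz`, `SRR21678398_1.fastq.gz` and `SRR21678398_2.fastq.gz` would show whether the two suffixed files hold equal read counts, which is what properly paired files should have.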

Need to look deeper into it, but it seems like I will need to pick up all three files for processing with eager in cases where the data was uploaded to the SRA. Or perhaps skip the unsuffixed file (SRR21678398.fastq.gz), as it might contain only unpaired reads?
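Whichever option is chosen, the three files first have to be told apart by name. A rough sketch, assuming the naming seen in this entry (`<run>.fastq.gz`, `<run>_1.fastq.gz`, `<run>_2.fastq.gz`) and assuming the unsuffixed file holds the unpaired reads, which is not yet confirmed. `group_fastq_files` and the `R1`/`R2`/`unpaired` keys are hypothetical names.

```python
import re
from collections import defaultdict

# Run accession (SRR/ERR/DRR + digits), optional _1/_2 mate suffix.
FASTQ_NAME = re.compile(r"^(?P<run>[SED]RR\d+)(?:_(?P<mate>[12]))?\.fastq\.gz$")


def group_fastq_files(filenames):
    """Group FASTQ file names by run accession into R1, R2 and unpaired slots.

    Assumption: a file without a _1/_2 suffix contains unpaired reads.
    Names that do not match the expected pattern are ignored.
    """
    runs = defaultdict(lambda: {"R1": None, "R2": None, "unpaired": None})
    for name in filenames:
        match = FASTQ_NAME.match(name)
        if match is None:
            continue
        slot = {"1": "R1", "2": "R2", None: "unpaired"}[match.group("mate")]
        runs[match.group("run")][slot] = name
    return dict(runs)
```

With the files grouped this way, the decision between passing all three to eager and dropping the `unpaired` entry becomes a single switch rather than something baked into the file handling.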
