
Better handling for compressed downloads #1891

@Dav1dde

Description


Currently Symbolicator tries to download multiple parts of a file concurrently using range requests. An HTTP server which implements range requests incorrectly, e.g. by yielding Content-Encoding: gzip for ranges taken from the original compressed file, will fail the download (S3 may do that). But even if the parts were correctly re-compressed chunks, the download would still break, because Symbolicator needs to know the exact file size and chunk size to assemble the final file.

#1890 disables transparent decompression at the client level for S3, but Symbolicator can handle this more robustly: always disable transparent decompression (for all clients), manually add Accept-Encoding headers, and only deal with decompression after the file is fully downloaded.

For the most part fetch_file already does that, and presumably the "maybe decompressed" logic was added exactly because of S3. We can do better and use the actual Content-Encoding header given to us instead of guessing (guessing may still be a good fallback, though).
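The header-first approach with magic-byte sniffing as a fallback could look like this sketch (the `Compression` enum and `detect_compression` function are hypothetical names, and the format list is an illustrative subset):

```rust
/// Compression formats we might encounter (illustrative subset).
#[derive(Debug, PartialEq, Eq)]
enum Compression {
    None,
    Gzip,
    Zstd,
}

/// Prefer the server-provided Content-Encoding header; fall back to
/// sniffing magic bytes only when the header is absent or unrecognized.
fn detect_compression(content_encoding: Option<&str>, body: &[u8]) -> Compression {
    match content_encoding.map(str::trim) {
        Some("gzip") | Some("x-gzip") => Compression::Gzip,
        Some("zstd") => Compression::Zstd,
        Some("identity") => Compression::None,
        // Header missing or unknown: guess from the payload.
        _ => match body {
            [0x1f, 0x8b, ..] => Compression::Gzip,             // gzip magic
            [0x28, 0xb5, 0x2f, 0xfd, ..] => Compression::Zstd, // zstd magic
            _ => Compression::None,
        },
    }
}
```

Note that an explicit `Content-Encoding: identity` short-circuits the sniffing, so a file that merely starts with gzip magic bytes is not misclassified.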

The only remaining problem is the other code paths which don't go through fetch_file, but we can turn maybe_decompress_file into a Read or Stream implementation that handles decompression transparently again.
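A sketch of that Read-based shape (the `maybe_decompress` name is hypothetical): sniff the buffered stream without consuming it, and either pass it through untouched or wrap it in a decoder. Only the pass-through path is implemented here; the gzip branch marks where a real decoder such as flate2's `GzDecoder` would plug in.

```rust
use std::io::{BufRead, BufReader, Read, Result};

/// Wrap `source` in a reader that transparently decompresses it.
/// Sketch: only gzip is sniffed; a real version would plug in
/// `flate2::read::GzDecoder` (and decoders for zstd, etc.) where noted.
fn maybe_decompress(source: impl Read + 'static) -> Result<Box<dyn Read>> {
    let mut reader = BufReader::new(source);
    // Peek at the buffered bytes without consuming them.
    let magic = reader.fill_buf()?.to_vec();
    if magic.starts_with(&[0x1f, 0x8b]) {
        // Compressed: here we would return something like
        // Box::new(flate2::read::GzDecoder::new(reader)).
        unimplemented!("wrap `reader` in a gzip decoder here");
    }
    // Not compressed: hand the stream through untouched.
    Ok(Box::new(reader))
}
```

Because the wrapper is just another `Read`, callers that bypass fetch_file can use it without knowing whether decompression happened.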
