created xz file has many streams, making it necessary to fully scan file to know details #26

@peterdk

I am using pxz to back up large databases. The resulting xz file has to be fully scanned by 7-Zip before it can show basic information such as the compression ratio and the original file size. xz -l also takes a very long time on huge files created by pxz. The same data (a 40 GB xz file) opens instantaneously when compressed by xz itself, whether single-threaded with xz 5.1 or multithreaded with xz 5.2.

Difference in output of xz -l (example data)

xz, single threaded compression:

Strms  Blocks   Compressed Uncompressed  Ratio  Check   Filename
    1       1     39,8 GiB    258,5 GiB  0,154  CRC64   backup-2016_12_28.sql.xz

pxz, multithreaded compression:

Strms  Blocks   Compressed Uncompressed  Ratio  Check   Filename
  340     348     39,8 GiB    258,5 GiB  0,154  CRC64   backup-2016_12_28.sql.xz

xz 5.2 (multithreaded) compression:

Strms  Blocks   Compressed Uncompressed  Ratio  Check   Filename
    1     348     39,8 GiB    258,5 GiB  0,154  CRC64   backup-2016_12_28.sql.xz

Based on this data, I assume that having multiple streams results in a full file scan when opening the file or retrieving its info. It would be great if this could be fixed in pxz. Since development seems to have halted, I would advise people to use the new xz 5.2 instead.
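
For anyone who wants to check a file themselves, here is a rough Python sketch that counts streams and blocks by walking the file backward from its end. It is based on my reading of the .xz file format (12-byte stream header and footer, an index in front of each footer, optional zero padding between streams) and is not how xz or 7-Zip actually implement this. The names count_streams and read_vli are made up for this example, and it does not verify any CRC32 fields. It shows that every stream carries its own footer and index, so a reader has to visit each stream separately to add up the totals.

```python
import io
import struct

HEADER_MAGIC = b"\xfd7zXZ\x00"  # first 6 bytes of every stream header
FOOTER_MAGIC = b"YZ"            # last 2 bytes of every stream footer


def read_vli(buf, pos):
    """Decode one variable-length integer; return (value, next_pos)."""
    value = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        if byte & 0x80 == 0:
            return value, pos
        shift += 7


def count_streams(f):
    """Return (streams, blocks) for a seekable binary file object."""
    f.seek(0, io.SEEK_END)
    end = f.tell()
    streams = blocks = 0
    while end > 0:
        if end < 24:
            raise ValueError("truncated or not an .xz file")
        # Stream padding is zero bytes in multiples of four; skip it.
        f.seek(end - 4)
        if f.read(4) == b"\x00\x00\x00\x00":
            end -= 4
            continue
        # Footer layout: CRC32 (4), backward size (4), flags (2), magic (2).
        f.seek(end - 12)
        footer = f.read(12)
        if footer[10:12] != FOOTER_MAGIC:
            raise ValueError("stream footer not found")
        index_size = (struct.unpack("<I", footer[4:8])[0] + 1) * 4
        # The index sits directly in front of the footer.
        f.seek(end - 12 - index_size)
        index = f.read(index_size)
        if index[0] != 0x00:
            raise ValueError("index indicator not found")
        records, pos = read_vli(index, 1)
        block_bytes = 0
        for _ in range(records):
            unpadded, pos = read_vli(index, pos)
            _uncompressed, pos = read_vli(index, pos)
            # Each block is padded to a multiple of four bytes in the file.
            block_bytes += (unpadded + 3) & ~3
        # Jump to where this stream starts: header + blocks + index + footer.
        start = end - (12 + block_bytes + index_size + 12)
        if start < 0:
            raise ValueError("index does not match file size")
        f.seek(start)
        if f.read(6) != HEADER_MAGIC:
            raise ValueError("stream header not found")
        streams += 1
        blocks += records
        end = start
    return streams, blocks
```

Usage would be something like opening the backup with open("backup-2016_12_28.sql.xz", "rb") and passing the file object to count_streams. Only the footer, index and header of each stream are read, never the compressed data, so the cost grows with the number of streams rather than with the file size.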
