Conversation
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #75 +/- ##
==========================================
- Coverage 86.33% 86.22% -0.12%
==========================================
Files 20 20
Lines 1303 1307 +4
==========================================
+ Hits 1125 1127 +2
- Misses 178 180 +2
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
| @@ -147,11 +148,15 @@ def decompress(src: bytes | BinaryIO) -> bytes: | |||
| bitstring = BitString() | |||
|
|
|||
| while src.tell() - start_offset < size: | |||
There was a problem hiding this comment.
Can use a walrus operator here.
| bitstring = BitString() | ||
|
|
||
| while src.tell() - start_offset < size: | ||
| if size - (src.tell() - start_offset) <= 256: |
There was a problem hiding this comment.
Is there a spec you can link to for this?
There was a problem hiding this comment.
MS-XCA does not really talk about this case. Some other resources do take it into account and kind of mention it (linked below). Its not part of the Microsoft spec, so this edge-case might be a better fit in CompressedStream, or somewhere in the WOF code. Thoughts?
- https://github.com/wbenny/woftool - here something is mentioned about a block being stored uncompressed.
- https://wimlib.net/git/?p=wimlib;a=blob;f=src/xpress-compress.c;hb=d66b5c805c4e9a660bac6f979d88c1820cb031f2#l170 - here a value of 261 is mentioned.
- https://github.com/jborean93/pyxca/blob/main/src/xca/_xpress_huffman/xpress.c#L2505C37-L2505C64 - here a value of 260 is mentioned.
There was a problem hiding this comment.
None of the code or README you link actually are relevant to what you're proposing to do here.
- This is just common data compression behaviour, don't compress if the compressed data is larger than the uncompressed - this is almost universal in any application (not algorithm) that utilizes compression.
- This is for trying to compress data that is smaller than 261 bytes due to it being technically impossible to produce a smaller output. It returns
1(error). Likely the application will take this error as a hint to store the data uncompressed (see point 1.). - This indeed checks if there's enough remaining space in the input buffer, but it only returns successfully if there's already the expected amount of data decompressed. If not, it also throws an error.
Are you actually fixing an algorithm bug or are you trying to work around an application bug (in the wrong place)?
|
Superseded by #76, fox-it/dissect.archive#16, and fox-it/dissect.ntfs#42 |
This PR addresses an edge-case found in LZXPRESS+huffman compressed files when uncompressed (original) data is appended to the end of the stream. This is commonly encountered in WOF compressed files.
This is in preparation for contributing the POC code mentioned in fox-it/dissect.ntfs#41.