V2 (extended flag) by BrianPugh · Pull Request #293 · BrianPugh/tamp

BrianPugh · 2026-02-03T19:43:00Z

No description provided.

Copilot

Pull request overview

Copilot reviewed 51 out of 53 changed files in this pull request and generated 5 comments.

Copilot · 2026-02-09T03:12:08Z

tamp/decompressor.py

+                                index = self._bit_reader.read(self.window_bits)
+
+                                string = self._window_buffer.get(index, match_size)
+
+                                # Write up to end of buffer (no wrap)
+                                remaining = self._window_buffer.size - self._window_buffer.pos
+                                window_write = min(match_size, remaining)
+                                self._window_buffer.write_bytes(string[:window_write])


Extended-match tokens are specified as “no wrap-around” (source must not cross the window boundary). The implementation currently uses _window_buffer.get(index, match_size), which wraps modulo the window size, so a malformed stream could silently read across the boundary instead of failing. Consider validating index + match_size <= window_size for extended matches and raising an error if it would wrap, aligning behavior with the C implementation’s bounds checks.
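
A minimal sketch of the suggested check, assuming the window is addressed as `[0, window_size)`. The helper name `check_extended_match_bounds` and its parameters are hypothetical and are not taken from the tamp codebase:

```python
def check_extended_match_bounds(index: int, match_size: int, window_size: int) -> None:
    """Reject an extended match whose source range would wrap around the window.

    Hypothetical helper: the name and signature are illustrative only.
    """
    if index < 0 or match_size < 0:
        raise ValueError("index and match_size must be non-negative")
    if index + match_size > window_size:
        # A well-formed stream never emits a wrapping extended match,
        # so treat this as corrupted input instead of reading modulo the window.
        raise ValueError(
            f"extended match wraps window: index={index}, "
            f"match_size={match_size}, window_size={window_size}"
        )
```

Calling this before `_window_buffer.get(index, match_size)` would turn a silent wrapped read into an explicit error.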

Copilot · 2026-02-09T03:12:09Z

tools/print_compressed_sizes.py

+        print(f"{file_path.name}: {compressed_size:,} (**{ratio:.3f}**)")
+
+    avg = sum(ratios) / len(ratios)
+    print(f"Average Ratio: {avg}")


avg = sum(ratios) / len(ratios) will raise ZeroDivisionError if all files are empty/missing or otherwise skipped. Consider guarding for len(ratios) == 0 and exiting with a clearer message. Also, the per-file print includes Markdown formatting (**...**), which is surprising for a CLI script unless the output is intended to be pasted into Markdown.
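
One possible shape for the guard, as a sketch only. The function name `average_ratio` is hypothetical; the script in this PR computes the average inline:

```python
from typing import Optional, Sequence


def average_ratio(ratios: Sequence[float]) -> Optional[float]:
    """Return the mean of ratios, or None when nothing was measured."""
    if not ratios:
        # Avoids ZeroDivisionError when every file was empty, missing, or skipped.
        return None
    return sum(ratios) / len(ratios)


if __name__ == "__main__":
    avg = average_ratio([])
    if avg is None:
        print("No files were compressed; nothing to average.")
    else:
        print(f"Average Ratio: {avg:.3f}")
```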

Copilot · 2026-02-09T03:12:09Z

tamp/decompressor.py

+                                rle_count = self._bit_reader.read_huffman()
+                                rle_count <<= _LEADING_RLE_HUFFMAN_BITS
+                                rle_count += self._bit_reader.read(_LEADING_RLE_HUFFMAN_BITS)
+                                rle_count += 1 + 1


In the extended RLE path, read_huffman() can return the _FLUSH sentinel on malformed/corrupted input, which would raise a TypeError when shifting/adding. Consider explicitly rejecting _FLUSH here (e.g., raise ValueError / EOFError) before doing bit operations, so invalid streams fail deterministically and with a clearer error.

Copilot · 2026-02-09T03:12:09Z

tamp/decompressor.py

+                                match_size = self._bit_reader.read_huffman()
+                                match_size <<= _LEADING_EXTENDED_MATCH_HUFFMAN_BITS
+                                match_size += self._bit_reader.read(_LEADING_EXTENDED_MATCH_HUFFMAN_BITS)
+                                match_size += self.min_pattern_size + 11 + 1


In the extended-match path, the secondary read_huffman() used for the size payload can also return _FLUSH on malformed input, which would break the arithmetic and produce a confusing exception. Add an explicit check to disallow _FLUSH (and potentially validate the decoded size range) before computing match_size.

Copilot · 2026-02-09T03:12:09Z

tests/test_decompressor.py

-    decompresses.append(viper_decompress)
+        Decompressors.append(NativeDecompressor)
+        decompresses.append(native_decompress)
+    except ImportError:


The 'except' clause does nothing but pass, and there is no explanatory comment saying why the ImportError is ignored.
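
A small sketch of how an intentionally ignored ImportError can be documented. The helper `load_optional` is hypothetical and is not part of the test suite in this PR:

```python
import importlib


def load_optional(module_name: str):
    """Import a module if it is available, otherwise return None."""
    try:
        return importlib.import_module(module_name)
    except ImportError:
        # The implementation is optional (e.g. a native extension that was
        # not built), so its absence is expected and its tests are skipped.
        return None
```

A test module could then append a decompressor to its list only when `load_optional(...)` returns a module, which keeps the reason for the skip visible next to the code.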

BrianPugh added 30 commits January 31, 2026 11:56

v2 python prototype.

a2507d5

update pi pico firmware download to datasets.

33bc24a

remove wrap-around logic; adds additional complications for minimal gains.

2c46184

Prepare cython bindings for v2 flag.

e503cdd

add missing extended-match-count flush.

8dba220

common.h: add TampConf.v2 attribute and associated macros

bd7762d

wip c decompressor

0b208ff

swap window/extended-match length in encoding

5dbcefb

more cleanup

b377d18

remove rle_last_written check; provides very small benefits, but unnecessarily bloats c-decompressor

667e3cf

rename pending_symbol to token_state.

aea413d

Make decompression 1% slower to save 200 bytes in firmware

1846270

narrow variable scope

ce3bd94

remove extended-match wrapping logic.

34ad99c

move TAMP_OUTPUT_FULL logic to top of loop

4cdb50f

further reduce binary size by 56 bytes via a goto.

70fd739

use some math instead of if/else

6b59265

no need for v2_res

59e4090

simplify while-loop check with a union. reduces binary by 56 bytes

f59fd08

unified huffman decode.

a925f62

Add comment about HUFFMAN_TABLE being pretty optimized.

e69b826

Make some datatypes smaller; reduces binary by 36 bytes.

589a9a2

reduce some dtypes to uint8

e081f53

prep cython bindings for c-compressor-v2

66bdc09

don't wrap extended match

bfe9631

more robust window_copy

37c3608

simplify rle criteria

19baefb

simplify window_copy call

5ae0282

move window_copy to a better location.

983532c

more comments

131c25a

BrianPugh added 24 commits February 5, 2026 13:41

combine if-statement; saving 8 bytes.

538b8c0

save 48 bytes in tamp_compressor_flush using some gotos

66401ac

save 12 more bytes

8e710df

save 44 more bytes

e55710d

cleanup flush_done

65f9812

update expected javascript hash

9aefc6e

consolidate extended compression functions

f47b547

simplify find_extended_match, deduplicate checks from caller

ea38c62

don't need to reset extended_match_position.

178bf19

combine if-statements

330e5f4

consolidate write_extended_match_token output size checks.

83d3d8e

consolidate output arithmetic to partial_flush.

d0c2b36

consolidate output arithmetic to write_extended_match_token.

7abf720

update docstrings

750b002

get rid of useless brackets

f68aa6f

update CLAUDE.md

f8b1bfc

note: always inline refill_bit_buffer.

238ec2d

further flushing optimization.

1ab913e

TAMP_OPTIMIZE_SIZE macro

627cb38

extract out extended bits to its own private polling function.

95c770a

more TAMP_OPTIMIZE_SIZE attributes

b31ce73

some more gcc pragmas to shrink implementation

cde27cf

update readme binary-size table

1d82747

avoid memset

c52f165

BrianPugh force-pushed the v2-develop-c2 branch from 4f2fe1b to c52f165 February 8, 2026 15:31

BrianPugh added 2 commits February 8, 2026 21:08

TAMP_USE_MEMSET macro

bf1c377

xtensa-specific nonsense

9a66abf

BrianPugh requested a review from Copilot February 9, 2026 03:04

Copilot started reviewing on behalf of BrianPugh February 9, 2026 03:05

BrianPugh wants to merge 109 commits into main from v2-develop-c2
