
Corrected minor issues with raw versus interpreted patterns in lib. #3

Open
ryancmoon wants to merge 1 commit into EmergingThreats:main from ryancmoon:pcre_raw_correction


ryancmoon commented Nov 3, 2025

pdf_lib.py, lines 56, 63, and 112: it looks like the patterns here are invalid as regular bytes literals (b) but are interpreted correctly as raw bytes literals (rb). Other patterns in the list may also fail, but these three caused issues on my Python 3.12 and Python 3.15 installs using the default re library.
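
For illustration only, here is a minimal sketch of how a raw and a non-raw bytes pattern can diverge. The pattern and sample data are hypothetical and are not copied from pdf_lib.py; they only show the general mechanism.

```python
import re

# Hypothetical sample data and pattern; not taken from pdf_lib.py.
data = b"1 0 obj"

# In a regular bytes literal, Python converts \b into a backspace byte (0x08)
# before the re module ever sees the pattern.
interpreted = b"\bobj\b"

# In a raw bytes literal the backslash survives, so re reads \b as a
# word-boundary assertion.
raw = rb"\bobj\b"

interpreted_match = re.search(interpreted, data)
raw_match = re.search(raw, data)

print(len(interpreted), len(raw))  # the two literals differ in length
print(interpreted_match)           # no backspace bytes in data, so no match
print(raw_match)                   # matches b"obj" at a word boundary
```

Escapes that bytes literals do not recognize, such as `\d`, are passed through unchanged, but Python 3.12 and later report them with a SyntaxWarning in non-raw literals, which may be related to what I am seeing.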

Feel free to disregard if this does not reproduce elsewhere, but based on testing so far I believe the change can be implemented without affecting matches.

(py312) <x>:~/Tools/pdf_object_hashing$ python ./pdf_obj_hash.py -f ./newattcfidocs20250701426787.pdf
053f5d39f2e131080053e8fd81cebf15de01ccc163cb4b826c6906f715ab0aa5,b1be82f770e07aa0e00a8769d3b14db8,0.012109041213989258
(py312) <x>:~/Tools/pdf_object_hashing$ python ./pdf_obj_hash.py -f ./9692613a9bae252b259625f7949f697a50f3bb9a3692f53a6cbb91ca069e29b5.pdf
9692613a9bae252b259625f7949f697a50f3bb9a3692f53a6cbb91ca069e29b5,3e3c675a2b42dcae3db1b3b11b7f4f57,0.01910257339477539
