fix(builtins): preserve raw bytes from /dev/urandom through pipeline #870
Merged
Three fixes for binary data handling:

1. read_text_file: encode /dev/urandom bytes as Latin-1 (each byte 0x00-0xFF maps to one char) instead of UTF-8 lossy conversion
2. head -c: use char-level truncation so Latin-1 encoded bytes are counted correctly (each char = one original byte)
3. tr -c/-C: expand complement set to full 0-255 range so non-ASCII bytes from /dev/urandom are properly filtered

This makes `tr -dc 'a-z0-9' < /dev/urandom | head -c N` produce exactly N alphanumeric characters.

Closes #811
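The Latin-1 encoding and the char-level truncation can be illustrated with a minimal standalone sketch. It is not the code from this PR: the helper names `latin1_encode`, `latin1_decode`, and `head_chars` are hypothetical and chosen only for illustration.

```rust
/// Hypothetical helper: map each byte to the code point with the same value
/// (Latin-1), so no byte is ever replaced or merged.
fn latin1_encode(bytes: &[u8]) -> String {
    bytes.iter().map(|&b| b as char).collect()
}

/// Hypothetical helper: recover the original bytes.
/// Returns None if any char lies above U+00FF.
fn latin1_decode(text: &str) -> Option<Vec<u8>> {
    text.chars()
        .map(|c| if (c as u32) <= 0xFF { Some(c as u8) } else { None })
        .collect()
}

/// Hypothetical helper: keep the first `n` chars (not bytes), so each kept
/// char corresponds to exactly one original byte.
fn head_chars(text: &str, n: usize) -> &str {
    match text.char_indices().nth(n) {
        Some((idx, _)) => &text[..idx],
        None => text,
    }
}

fn main() {
    let raw = [0x61u8, 0xFF, 0x00, 0x80, 0x7A];

    // Lossy UTF-8 conversion replaces the invalid bytes with U+FFFD.
    assert!(String::from_utf8_lossy(&raw).contains('\u{FFFD}'));

    // Latin-1 keeps a one-to-one mapping between bytes and chars.
    let text = latin1_encode(&raw);
    assert_eq!(text.chars().count(), raw.len());

    // The UTF-8 storage is longer than the char count, which is why
    // truncating by stored bytes would miscount.
    assert!(text.len() > raw.len());

    // Truncating by chars yields exactly the first three original bytes.
    let first3 = head_chars(&text, 3);
    assert_eq!(latin1_decode(first3), Some(vec![0x61, 0xFF, 0x00]));
}
```

Because a Rust `String` is stored as UTF-8, every char from U+0080 to U+00FF takes two bytes in memory. A byte-level truncation of the encoded string would therefore return fewer original bytes than requested, and could split a char.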
Add exemptions for cmake 0.1.57, console 0.15.11, insta 1.46.3, simd-adler32 0.3.8, and unicode-segmentation 1.13.1 alongside the existing exemptions for their newer versions. These older versions are in Cargo.lock and need exemptions for cargo-vet to pass in CI.
Summary
- `read_text_file`: encode `/dev/urandom` bytes as Latin-1 (each byte 0x00-0xFF maps to one char) instead of UTF-8 lossy conversion
- `head -c`: use char-level truncation so Latin-1 encoded bytes are counted correctly
- `tr -c/-C`: expand complement set to full 0-255 range so non-ASCII bytes are properly filtered

Makes `tr -dc 'a-z0-9' < /dev/urandom | head -c N` produce exactly N alphanumeric characters.

Closes #811
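The complement-set change for `tr -c/-C` can be pictured as a 256-entry membership table. The sketch below is a hedged illustration only, not the PR's implementation: `expand_set`, `complement`, and `delete_complement` are hypothetical names, only plain characters and simple `x-y` ranges are handled, and the input is assumed to be Latin-1 encoded (chars above U+00FF are passed through untouched).

```rust
/// Hypothetical helper: expand a simple tr-style set such as "a-z0-9"
/// into a membership table covering every byte value 0-255.
fn expand_set(spec: &str) -> [bool; 256] {
    let bytes = spec.as_bytes();
    let mut table = [false; 256];
    let mut i = 0;
    while i < bytes.len() {
        if i + 2 < bytes.len() && bytes[i + 1] == b'-' && bytes[i] <= bytes[i + 2] {
            // Range form "x-y": mark every byte from x to y inclusive.
            for b in bytes[i]..=bytes[i + 2] {
                table[b as usize] = true;
            }
            i += 3;
        } else {
            table[bytes[i] as usize] = true;
            i += 1;
        }
    }
    table
}

/// Complement over the full 0-255 range, not only over ASCII.
fn complement(table: &[bool; 256]) -> [bool; 256] {
    let mut out = [false; 256];
    for (i, member) in table.iter().enumerate() {
        out[i] = !*member;
    }
    out
}

/// Model of `tr -dc SPEC` on Latin-1 encoded text (assumption: every
/// char is at most U+00FF; anything above is left as-is in this sketch).
fn delete_complement(input: &str, spec: &str) -> String {
    let delete = complement(&expand_set(spec));
    input
        .chars()
        .filter(|&c| {
            let cp = c as u32;
            cp > 0xFF || !delete[cp as usize]
        })
        .collect()
}

fn main() {
    // Mixed ASCII and high (0x80-0xFF) chars, as Latin-1 text would contain.
    let input = "a\u{E9}Z7\u{FF}-q";
    assert_eq!(delete_complement(input, "a-z0-9"), "a7q");
}
```

If the complement covered only 0-127, the chars U+00E9 and U+00FF in this example would survive the filter. That matches the kind of stray non-ASCII output the summary describes.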
Test plan
- `urandom_no_replacement_chars`: no UTF-8 replacement chars in output
- `urandom_head_char_count`: `head -c N` produces exactly N chars
- `urandom_tr_filter_alphanumeric`: `tr -dc 'a-z0-9'` produces clean alphanumeric output
- `cargo fmt` and `cargo clippy` clean
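The three properties in the test plan can be checked against a small self-contained model of the pipeline. This is a rough sketch, not the project's test suite: `FakeUrandom` (a deterministic xorshift32 generator standing in for `/dev/urandom`) and `pipeline` are hypothetical names, and the real tests presumably run the actual builtins.

```rust
/// Hypothetical deterministic stand-in for /dev/urandom (xorshift32),
/// so the checks are repeatable. The seed must be non-zero.
struct FakeUrandom(u32);

impl Iterator for FakeUrandom {
    type Item = u8;

    fn next(&mut self) -> Option<u8> {
        self.0 ^= self.0 << 13;
        self.0 ^= self.0 >> 17;
        self.0 ^= self.0 << 5;
        Some((self.0 >> 24) as u8)
    }
}

/// Model of `tr -dc 'a-z0-9' < source | head -c n` over Latin-1 chars.
fn pipeline(source: impl Iterator<Item = u8>, n: usize) -> String {
    source
        .map(|b| b as char) // Latin-1: one char per byte
        .filter(|c| c.is_ascii_lowercase() || c.is_ascii_digit()) // tr -dc 'a-z0-9'
        .take(n) // head -c n, counted in chars
        .collect()
}

fn main() {
    let out = pipeline(FakeUrandom(0x9E37_79B9), 32);

    // Property 1: no U+FFFD replacement characters.
    assert!(!out.contains('\u{FFFD}'));

    // Property 2: exactly N chars.
    assert_eq!(out.chars().count(), 32);

    // Property 3: only lowercase letters and digits remain.
    assert!(out.chars().all(|c| c.is_ascii_lowercase() || c.is_ascii_digit()));
}
```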