feat(bam): add raw byte access for bam::Record #373
Draft
nh13 wants to merge 1 commit into zaeleus:master from
Add AsRef<[u8]>, TryFrom<Vec<u8>>, and into_inner() to bam::Record, enabling direct access to the underlying BAM record bytes without the leading 4-byte block size. This supports tools like fgumi that need efficient raw byte manipulation while retaining the alignment::Record trait implementation.
Summary
Add AsRef<[u8]>, TryFrom<Vec<u8>>, and into_inner() to bam::Record, enabling direct access to the underlying BAM record buffer (without the leading 4-byte block size).
- AsRef<[u8]>: borrow the raw BAM bytes
- TryFrom<Vec<u8>>: construct a Record from raw bytes (validates and indexes the buffer)
- into_inner(self) -> Vec<u8>: consume the record and extract the byte buffer
Motivation
Tools like fgumi need to work with raw BAM record bytes for performance-critical pipelines (e.g. parallel BGZF block processing, custom serialization) while still being able to use the alignment::Record trait for field access.
Currently there is no public way to get bytes out of or put bytes into a bam::Record.
This pairs well with the codec encode/decode functions (#364) to give a complete raw-bytes workflow:
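As a rough illustration of what such a round trip could look like, here is a minimal, self-contained sketch. It does not use noodles: `RawRecord`, `InvalidRecord`, and `set_flag_bits` are hypothetical stand-ins that mirror the three proposed entry points on a plain byte buffer, and the validation shown (a length check only) is an assumption, not the validation this PR performs. The field offsets follow the BAM record layout in the SAM specification, where the record (without block_size) starts with 32 bytes of fixed-width fields and the 16-bit flag sits at byte offset 14.

```rust
use std::convert::TryFrom;

/// Length of the fixed-width fields at the start of a BAM record
/// (excluding the leading 4-byte block_size).
const FIXED_LEN: usize = 32;
/// Byte offset of the little-endian u16 `flag` field.
const FLAG_OFFSET: usize = 14;

/// Hypothetical stand-in for `bam::Record`: a validated byte buffer.
#[derive(Debug, PartialEq, Eq)]
pub struct RawRecord(Vec<u8>);

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq, Eq)]
pub enum InvalidRecord {
    TooShort { len: usize },
}

impl TryFrom<Vec<u8>> for RawRecord {
    type Error = InvalidRecord;

    /// Takes ownership of the buffer after a minimal length check.
    fn try_from(buf: Vec<u8>) -> Result<Self, Self::Error> {
        if buf.len() < FIXED_LEN {
            return Err(InvalidRecord::TooShort { len: buf.len() });
        }
        Ok(Self(buf))
    }
}

impl AsRef<[u8]> for RawRecord {
    /// Borrows the raw record bytes.
    fn as_ref(&self) -> &[u8] {
        &self.0
    }
}

impl RawRecord {
    /// Consumes the record and returns the underlying buffer.
    pub fn into_inner(self) -> Vec<u8> {
        self.0
    }

    /// Reads the flag field directly from the buffer.
    pub fn flags(&self) -> u16 {
        u16::from_le_bytes([self.0[FLAG_OFFSET], self.0[FLAG_OFFSET + 1]])
    }
}

/// Sets additional flag bits by editing the buffer in place, then
/// re-validates it: record -> bytes -> mutate -> record.
pub fn set_flag_bits(record: RawRecord, bits: u16) -> Result<RawRecord, InvalidRecord> {
    let mut buf = record.into_inner();
    let flags = u16::from_le_bytes([buf[FLAG_OFFSET], buf[FLAG_OFFSET + 1]]) | bits;
    buf[FLAG_OFFSET..FLAG_OFFSET + 2].copy_from_slice(&flags.to_le_bytes());
    RawRecord::try_from(buf)
}
```

With the real API, the same shape would presumably apply: call `into_inner()` on a `bam::Record`, edit the bytes, and pass the buffer back through `TryFrom<Vec<u8>>` so it is validated again before field access.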
Why raw byte access rather than upstreaming all operations?
Tools like fgumi perform three categories of operations on BAM record bytes that go beyond noodles' current (and appropriate) scope:
- flag updates): bam::Record is intentionally read-only, and a full mutation API would be a significant design change
- sorting, virtual hard clipping): domain logic that doesn't belong in a format library
- zero-allocation comparators): performance-tuned for specific workflows
Raw byte access via AsRef/TryFrom/into_inner is the minimal API that lets tools like fgumi leverage noodles for I/O and the alignment::Record trait while performing these operations directly on the buffer.
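To make the "directly on the buffer" point concrete, the sketch below compares two records by coordinate using only the borrowed bytes, such as those `AsRef<[u8]>` would expose. It is an illustrative assumption, not fgumi's comparator: `coordinate_key` and `compare_by_coordinate` are hypothetical names. The offsets come from the BAM record layout in the SAM specification (refID at bytes 0..4, pos at bytes 4..8, both little-endian i32). Reinterpreting refID as unsigned is a choice made here so that unmapped records (refID = -1) sort last.

```rust
use std::cmp::Ordering;
use std::convert::TryInto;

/// Extracts a (refID, pos) sort key from raw record bytes without allocating.
/// Returns `None` if the slice is too short to hold both fields.
fn coordinate_key(record: &[u8]) -> Option<(u32, i32)> {
    let ref_id = i32::from_le_bytes(record.get(0..4)?.try_into().ok()?);
    let pos = i32::from_le_bytes(record.get(4..8)?.try_into().ok()?);
    // Casting -1 to u32 yields u32::MAX, so unmapped records order last.
    Some((ref_id as u32, pos))
}

/// Compares two raw records by reference sequence, then by position.
/// Returns `None` if either slice is too short to contain the key fields.
pub fn compare_by_coordinate(a: &[u8], b: &[u8]) -> Option<Ordering> {
    Some(coordinate_key(a)?.cmp(&coordinate_key(b)?))
}
```

Because the key is read straight from fixed offsets, a comparator like this can be called on record bytes in a sort without decoding any other field.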
Test plan
- into_inner() and TryFrom<Vec<u8>> round-trip