A blueprint for seeking feature

I'm planning to add a seeking feature to this library, as I mentioned in #28. This issue tracks the design plan for the new feature.



## The methods to add

First, we are going to add new constructors `new_seekable` and `new_seekable_ext` in `impl <R: Read + Seek> FlacReader {`.  The reason for adding separate constructors is that we need one extra step right after reading all the metadata blocks: saving the position of the reader.  This lets the reader determine the range to binary-search on every seek.  The position is going to be stored in a new field of `FlacReader`, `audio_block_start_position: Option<u64>`.
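
As a rough illustration of the extra step, the sketch below skips the `fLaC` marker and the metadata blocks by hand and returns the offset at which the audio frames begin. The function name `audio_start_position` is hypothetical and not part of claxon; a real `new_seekable` would presumably reuse the existing metadata parsing and only query the reader position afterwards.

```rust
use std::io::{self, Read, Seek};

/// Hypothetical helper (not part of claxon): skips the `fLaC` marker and all
/// metadata blocks, then returns the byte offset of the first audio frame.
pub fn audio_start_position<R: Read + Seek>(input: &mut R) -> io::Result<u64> {
    let mut marker = [0u8; 4];
    input.read_exact(&mut marker)?;
    if &marker != b"fLaC" {
        return Err(io::Error::new(io::ErrorKind::InvalidData, "missing fLaC marker"));
    }
    loop {
        // Metadata block header: 1 bit "is last", 7 bits type, 24 bits length.
        let mut header = [0u8; 4];
        input.read_exact(&mut header)?;
        let is_last = (header[0] & 0x80) != 0;
        let length = u32::from_be_bytes([0, header[1], header[2], header[3]]);
        // Skip the block body without interpreting it.
        input.seek(io::SeekFrom::Current(i64::from(length)))?;
        if is_last {
            // The reader now sits on the first frame header.
            return input.stream_position();
        }
    }
}
```

The returned value is what would be stored as `Some(position)` in `audio_block_start_position`.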

Next, we are going to add a new function `seek` to `FrameReader` and `FlacSamples`, both of which take the sample number (0-indexed).
- `FrameReader::seek` seeks the reader to the beginning of the desired frame.  Here, the "desired frame" is the last frame whose coded frame/sample number is less than or equal to the given sample number (for fixed-block-size streams, where the header stores a frame number, the value is converted appropriately).  Therefore, the next frame (this is equivalent to `claxon::frame::Block`, right?) yielded from `FrameReader` will include the desired sample, provided the given sample number is smaller than the total number of samples in the stream.
  - By default, this function uses binary search, assuming that no frame header is broken (that is, each contains a correct sample/frame number).
  - The search range lies between the beginning of the audio frames (obtained in the constructor described above) and the end of the stream, `std::io::SeekFrom::End(0)`.  However, where possible, this function uses the `SEEKTABLE` metadata to narrow the range and thus speed up seeking.
    - By the way, I found that `claxon::metadata::SeekTable` does not expose any API to access the data.  Shall I add some as well?
    - Also, I think the data should be stored in something like a `BTreeMap` rather than a `Vec`.
  - At every step of the binary search, we find the first occurrence of a `FRAME_HEADER` by matching both the sync code and the 8-bit CRC at the end of the header.  Checking the CRC-16 of the entire frame would be even more robust, but I think that is excessive work, and the probability that the CRC-8 matches by accident is very low.
  - During seeking, once the search range has been narrowed down enough, we switch to linear search.  This avoids repeatedly scanning for a frame header from almost the same place in the last phase of the binary search (it is hard to explain in words; a diagram would be better, but I'm reluctant to draw one.  Tell me if you want a more detailed explanation).
- `FlacSamples::seek` seeks the reader to the desired sample.  That is, the next sample yielded from the iterator will have exactly the sample index given as the argument.  Internally, it calls `FrameReader::seek` and then discards the first few samples of the first block obtained.
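
The header matching and the bisection described in the list above can be sketched roughly as follows. This is only an illustration under several assumptions: it works on an in-memory byte slice instead of a `Read + Seek` reader, all function names are hypothetical, reserved header values are not rejected, `target` must already be expressed in the unit the headers use (a frame number for fixed-block-size streams), and the final switch to linear search is left out.

```rust
/// CRC-8 used by FLAC frame headers: polynomial x^8 + x^2 + x + 1, initial value 0.
pub fn crc8(bytes: &[u8]) -> u8 {
    let mut crc = 0u8;
    for &byte in bytes {
        crc ^= byte;
        for _ in 0..8 {
            crc = if (crc & 0x80) != 0 { (crc << 1) ^ 0x07 } else { crc << 1 };
        }
    }
    crc
}

/// Tries to parse a frame header at the start of `buf`. Returns the coded
/// frame/sample number when both the sync code and the CRC-8 match.
pub fn parse_header(buf: &[u8]) -> Option<u64> {
    // 14-bit sync code followed by a reserved zero bit.
    if buf.len() < 6 || buf[0] != 0xFF || (buf[1] & 0xFE) != 0xF8 {
        return None;
    }
    // The number is coded like UTF-8: leading one bits give the byte count.
    let ones = buf[4].leading_ones() as usize;
    if ones == 1 || ones == 8 {
        return None;
    }
    let mut number = u64::from(buf[4]) & (0xFFu64 >> (ones + 1));
    let mut pos = 5;
    for _ in 0..ones.saturating_sub(1) {
        let byte = *buf.get(pos)?;
        if (byte & 0xC0) != 0x80 {
            return None;
        }
        number = (number << 6) | u64::from(byte & 0x3F);
        pos += 1;
    }
    // Optional explicit block size and sample rate fields precede the CRC.
    pos += match buf[2] >> 4 { 0b0110 => 1, 0b0111 => 2, _ => 0 };
    pos += match buf[2] & 0x0F { 0b1100 => 1, 0b1101 | 0b1110 => 2, _ => 0 };
    let expected = *buf.get(pos)?;
    if crc8(&buf[..pos]) == expected { Some(number) } else { None }
}

/// Scans forward from `from` for the first offset holding a valid header.
pub fn find_header(data: &[u8], from: usize) -> Option<(usize, u64)> {
    (from..data.len()).find_map(|i| parse_header(&data[i..]).map(|n| (i, n)))
}

/// Bisects byte offsets in `[start, data.len())` for the last frame whose
/// coded number is <= `target`. `start` is the offset of the first frame.
pub fn seek_frame(data: &[u8], start: usize, target: u64) -> Option<usize> {
    let (mut best, first) = find_header(data, start)?;
    if first > target {
        return None;
    }
    let (mut lo, mut hi) = (best + 1, data.len());
    while lo < hi {
        let mid = lo + (hi - lo) / 2;
        match find_header(data, mid) {
            Some((offset, number)) if offset < hi && number <= target => {
                best = offset;
                lo = offset + 1;
            }
            // No header in `[mid, hi)`, or the first one is past the target.
            _ => hi = mid,
        }
    }
    Some(best)
}
```

Near the end of the bisection, `find_header` keeps rescanning the same stretch of bytes and landing on the same header, which is the inefficiency that switching to linear search is meant to avoid.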

## Concerns and future tasks
- As explained above, `audio_block_start_position` will be stored as an `Option<u64>`, since it is empty for a non-seekable reader and filled for a seekable one.  But this is independent of whether the reader is *actually* seekable.  That is, one can use the original constructor `FrameReader::new` with a seekable reader and then call `seek`.  So the field may be `None` the first time `seek` is called, in which case we have to go back to the beginning of the stream and fill it with the appropriate value.  This may not be a big problem, but it can be avoided with a "type puzzle": for example, we can define a marker trait `IsSeekable`, create two marker structs `Seekable` and `NonSeekable`, accept `S: IsSeekable` as an additional type parameter for `FlacReader`, `FrameReader` and `FlacSamples`, and define an associated type `IsSeekable::AudioBlockStartPosition`, which is set to `()` for `NonSeekable` and `u64` for `Seekable`.  Whew.  Personally I love this kind of type puzzle, but I doubt it's suitable for a public library.  Alternatively, we can define seekable variants of `FlacReader`, `FrameReader` and `FlacSamples`, which may make the API harder to understand and the code confusing.
- As a future task, we can provide another function that scans the entire audio stream and builds a "complete" seek table.  This lets us avoid binary search altogether, because we can jump straight to the desired frame.  However, it is not a necessary feature, so I'll leave it for later and not implement it at first.
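
To make the "type puzzle" from the first concern more concrete, here is a minimal sketch of the marker-trait variant. `Reader` is a hypothetical stand-in for `FlacReader` with only the fields needed for the illustration, and `audio_start` and `into_inner` are made-up accessors; the trait and marker names follow the ones proposed above.

```rust
use std::io::{self, Read, Seek};

/// Marker trait selecting how the audio start position is stored.
pub trait IsSeekable {
    type AudioBlockStartPosition;
}

pub struct Seekable;
pub struct NonSeekable;

impl IsSeekable for Seekable {
    type AudioBlockStartPosition = u64;
}

impl IsSeekable for NonSeekable {
    type AudioBlockStartPosition = ();
}

/// Hypothetical stand-in for `FlacReader`; the real struct has more fields.
pub struct Reader<R, S: IsSeekable> {
    input: R,
    audio_block_start_position: S::AudioBlockStartPosition,
}

impl<R, S: IsSeekable> Reader<R, S> {
    /// Gives the wrapped reader back.
    pub fn into_inner(self) -> R {
        self.input
    }
}

impl<R: Read> Reader<R, NonSeekable> {
    pub fn new(input: R) -> Self {
        Reader { input, audio_block_start_position: () }
    }
}

impl<R: Read + Seek> Reader<R, Seekable> {
    pub fn new_seekable(mut input: R) -> io::Result<Self> {
        // A real constructor would parse the metadata blocks before this point.
        let position = input.stream_position()?;
        Ok(Reader { input, audio_block_start_position: position })
    }

    /// Only exists on the `Seekable` variant, so calling it on a reader
    /// built with `new` is a compile error rather than a `None` at run time.
    pub fn audio_start(&self) -> u64 {
        self.audio_block_start_position
    }
}
```

The cost is the extra type parameter that shows up in every signature mentioning the reader, which is the usability concern raised above.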



Thanks for reading this lengthy draft!  Feel free to ask questions or raise objections.  I'll get started in a few days anyway.
