A blueprint for seeking feature #29

@TonalidadeHidrica

I'm planning to add a seeking feature to this library, as I mentioned in #28. I have opened this issue to track the design of the new feature.

The methods to add

First, we are going to add new constructors new_seekable and new_seekable_ext in impl <R: Read + Seek> FlacReader {. The reason for adding new constructors is that there is one additional step to perform right after reading all the metadata blocks: saving the position of the reader. This position gives the lower bound of the range that every seek binary-searches over. It is going to be stored in a new field of FlacReader, audio_block_start_position: Option<u64>.
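
To make the idea concrete, here is a minimal sketch of a constructor that records the audio start position. It is not claxon's code: SeekableReader, the metadata_len parameter (which stands in for real metadata parsing) and search_range are all hypothetical names used only for illustration.

```rust
use std::io::{self, Read, Seek, SeekFrom};

/// Hypothetical stand-in for the reader described above (not claxon's `FlacReader`).
struct SeekableReader<R> {
    input: R,
    /// Byte offset of the first audio frame, if it has been recorded.
    audio_block_start_position: Option<u64>,
}

impl<R: Read + Seek> SeekableReader<R> {
    /// Skips `metadata_len` bytes (a placeholder for parsing the metadata
    /// blocks), then remembers where the audio frames begin.
    fn new_seekable(mut input: R, metadata_len: u64) -> io::Result<Self> {
        // Assumption: real code would parse the stream marker and metadata here.
        io::copy(&mut Read::by_ref(&mut input).take(metadata_len), &mut io::sink())?;
        // The reader now sits at the first frame header; save that offset.
        let start = input.stream_position()?;
        Ok(SeekableReader { input, audio_block_start_position: Some(start) })
    }

    /// Byte range `(start, end)` that a later binary search would cover,
    /// or `None` if the start position was never recorded.
    fn search_range(&mut self) -> io::Result<Option<(u64, u64)>> {
        let start = match self.audio_block_start_position {
            Some(start) => start,
            None => return Ok(None),
        };
        let current = self.input.stream_position()?;
        let end = self.input.seek(SeekFrom::End(0))?;
        // Restore the position so that probing the end has no side effect.
        self.input.seek(SeekFrom::Start(current))?;
        Ok(Some((start, end)))
    }
}
```

With a std::io::Cursor over 100 bytes and a pretend metadata length of 42, the recorded start would be 42 and the search range would run from 42 to 100.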

Next, we are going to add a new function seek to FrameReader and FlacSamples. Both take a sample number (0-indexed).

  • FrameReader::seek seeks the reader to the beginning of the desired frame. Here, the "desired frame" is the last frame whose first sample number is less than or equal to the given sample number (for fixed-block-size streams, whose headers store a frame number instead, the number is converted to a sample number automatically). Therefore, the next frame (this is equivalent to claxon::frame::Block, right?) yielded from FrameReader will contain the desired sample, provided that the given sample number is less than the total number of samples in the stream.
    • By default, this function uses binary search, assuming that no frame header is broken (that is, every header contains a correct sample/frame number).
    • The search range lies between the beginning of the audio frames (obtained in the constructor described above) and the end of the stream, std::io::SeekFrom::End(0). When possible, this function uses the SEEKTABLE metadata to narrow the range and thus speed up seeking.
      • By the way, I found that claxon::metadata::SeekTable does not expose any API to access its data. Shall I add one as well?
      • Also, I think the data should be stored in something like a BTreeMap rather than a Vec.
    • At every step of the binary search, we find the first occurrence of a FRAME_HEADER by matching both the sync code and the 8-bit CRC at the end of the header. Verifying the CRC-16 of the entire frame would be even more robust, but I think it is excessive, and the probability that the CRC-8 matches by accident is very low.
    • In the course of seeking, once the search range has been narrowed down enough, we switch to linear search. This avoids repeatedly searching for a frame header in almost the same place during the last phase of the binary search (it is hard to explain in words; a diagram would be better, but I'm reluctant to draw one. Tell me if you want a more detailed explanation).
  • FlacSamples::seek seeks the reader to the desired sample. That is, the next sample yielded from the iterator will have exactly the sample index given as the argument. Internally, it calls FrameReader::seek and then discards the first few samples of the first block obtained.
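
The search strategy for FrameReader::seek described above can be sketched roughly as follows. This is only an illustration under stated assumptions: FrameInfo, seek_frame, linear_threshold and the find_frame callback are hypothetical names, and find_frame stands in for the header scan (sync code plus CRC-8 check) that the real implementation would perform on the stream.

```rust
/// Where a frame starts: byte offset of its header and index of its first sample.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct FrameInfo {
    offset: u64,
    first_sample: u64,
}

/// Returns the last frame whose first sample is <= `target`.
///
/// `find_frame(pos)` is a hypothetical callback returning the first valid
/// frame header at byte offset >= `pos`, or `None` if there is none.
/// `lo` is the offset of the first audio frame, `hi` the end of the stream.
fn seek_frame<F>(
    mut lo: u64,
    mut hi: u64,
    target: u64,
    linear_threshold: u64,
    mut find_frame: F,
) -> Option<FrameInfo>
where
    F: FnMut(u64) -> Option<FrameInfo>,
{
    let mut best = find_frame(lo)?;
    if best.first_sample > target {
        return None;
    }
    // Invariant: `best` (located at `lo`) starts at or before `target`, and
    // every frame at offset >= `hi` starts after `target`.
    while hi.saturating_sub(lo) > linear_threshold.max(1) {
        let mid = lo + (hi - lo) / 2;
        match find_frame(mid) {
            // The frame found after `mid` is still at or before the target:
            // move the lower bound up to it.
            Some(f) if f.offset < hi && f.first_sample <= target => {
                best = f;
                lo = f.offset;
            }
            // Otherwise everything from `mid` onward is past the target.
            _ => hi = mid,
        }
    }
    // Linear phase: walk forward frame by frame instead of probing nearly
    // the same header again and again.
    while let Some(f) = find_frame(best.offset + 1) {
        if f.first_sample > target {
            break;
        }
        best = f;
    }
    Some(best)
}
```

FlacSamples::seek could then discard target - first_sample samples from the block that follows. For example, if the returned frame starts at sample 8192 and the target is 9000, the first 808 samples of that block would be skipped. Note that the sketch returns the last frame for an out-of-range target, so the caller still has to compare the target against the stream length.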

Concerns and future tasks

  • As I explained above, audio_block_start_position will be stored as Option<u64>, since it is empty for a non-seekable reader and filled for a seekable one. However, which constructor was used is independent of whether the reader is actually seekable. That is, one can pass a seekable reader to the original constructor FrameReader::new and then call seek. So the field may be None the first time seek is called, in which case we have to go back to the beginning of the stream and fill it with the appropriate value. This may not be a big problem, but it can be avoided with a "type puzzle". For example, we can define a marker trait IsSeekable, create two marker structs Seekable and NonSeekable, accept S: IsSeekable as an additional type parameter of FlacReader, FrameReader and FlacSamples, and define an associated type IsSeekable::AudioBlockStartPosition that is () for NonSeekable and u64 for Seekable. Whew. Personally I love this kind of type puzzle, but I doubt it is suitable for a public library. Alternatively, we can define seekable variants of FlacReader, FrameReader and FlacSamples, which may make the API harder to understand and the code more confusing.
  • As a future task, we can provide another function that scans the entire audio stream and builds a "complete" seek table. This lets us avoid binary search altogether, because we can jump directly to the desired frame. However, it is not a necessary feature, so I'll leave it for later and not implement it at first.
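
For reference, the "type puzzle" from the first concern could look roughly like this. It is a sketch only: Reader is a hypothetical stand-in for FlacReader, metadata parsing is omitted, and audio_start and rewind_to_audio_start are illustrative helper names.

```rust
use std::io::{self, Read, Seek, SeekFrom};

/// Marker trait that selects how the audio start position is stored.
trait IsSeekable {
    type AudioBlockStartPosition;
}

struct Seekable;
struct NonSeekable;

impl IsSeekable for Seekable {
    type AudioBlockStartPosition = u64;
}

impl IsSeekable for NonSeekable {
    type AudioBlockStartPosition = ();
}

/// Hypothetical reader with the extra type parameter `S`.
struct Reader<R, S: IsSeekable> {
    input: R,
    audio_block_start_position: S::AudioBlockStartPosition,
}

impl<R: Read> Reader<R, NonSeekable> {
    /// The non-seekable variant stores `()`, which takes no space.
    fn new(input: R) -> Self {
        Reader { input, audio_block_start_position: () }
    }
}

impl<R: Read + Seek> Reader<R, Seekable> {
    /// Assumption: real code would read the metadata blocks first; here the
    /// audio is taken to start wherever the reader currently is.
    fn new_seekable(mut input: R) -> io::Result<Self> {
        let start = input.stream_position()?;
        Ok(Reader { input, audio_block_start_position: start })
    }

    /// Available only on the seekable variant, so there is no `Option` to unwrap.
    fn audio_start(&self) -> u64 {
        self.audio_block_start_position
    }

    /// Moves the underlying reader back to the first audio frame.
    fn rewind_to_audio_start(&mut self) -> io::Result<u64> {
        self.input.seek(SeekFrom::Start(self.audio_block_start_position))
    }
}
```

With this shape, calling a seek-related method on a reader built with new is a compile error rather than a runtime None check. The cost is the extra type parameter on every public type.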

Thanks for reading this lengthy draft! Feel free to ask questions or raise objections. I'll get started in a few days anyway.
