This repository was archived by the owner on May 20, 2024. It is now read-only.

Switch filter and verify function #4

@lmas

Currently using a scalable Bloom filter from github.com/tylertreat/BoomFilters, but I suspect it might not be performing optimally or correctly:

  • Number of hash functions doesn't seem to match the optimal number
  • Number of cells doesn't seem to match the optimal number (see above)
  • Default hash function results in an inaccurate false positive rate
  • Repo seemed a little messy in general: some models undocumented, some models missing common features found in other models, unhandled errors, and use of unsafe (impressions from when I wrote feedloggr v3 and first found BoomFilters)
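For reference when checking the first two points, here is a minimal sketch of the textbook sizing formulas for a single, classic Bloom filter: m = -n·ln(p) / (ln 2)² cells and k = (m/n)·ln 2 hash functions. The helper name `optimalParams` is made up for this note and is not part of BoomFilters. A scalable filter tightens the error rate per slice, so its numbers may legitimately differ from these.

```go
package main

import (
	"fmt"
	"math"
)

// optimalParams returns the textbook-optimal number of cells (bits) m and
// hash functions k for a plain Bloom filter holding n items at a target
// false positive rate p. Hypothetical helper, not a BoomFilters API.
func optimalParams(n uint, p float64) (m, k uint) {
	// m = -n * ln(p) / (ln 2)^2, rounded up to a whole cell.
	mf := math.Ceil(-float64(n) * math.Log(p) / (math.Ln2 * math.Ln2))
	// k = (m / n) * ln 2, rounded to the nearest whole hash function.
	kf := math.Round(mf / float64(n) * math.Ln2)
	if kf < 1 {
		kf = 1
	}
	return uint(mf), uint(kf)
}

func main() {
	m, k := optimalParams(1000, 0.01)
	fmt.Println(m, k) // 9586 7
}
```

Comparing these values against what the library actually allocates for the same n and p should show whether the mismatch is real or just a consequence of the scalable variant's per-slice parameters.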

Might want to try to implement a Stable Scalable Bloom filter myself instead:

  • A Stable Bloom filter won't keep growing forever (it gradually evicts old items to stay at a fixed size)
  • Easier to track, verify and document correctness
  • Performance matters less than correctness
  • One less unknown dependency
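To get a feel for the scope of a home-grown version, here is a rough sketch of the Stable Bloom idea: a fixed array of small counters, where each insert first decrements a few cells and then sets the item's k cells to a maximum value. Everything here (the `StableBloom` type, its parameters, FNV-1a with double hashing, decrementing a run of consecutive cells from a random start) is an assumption made for illustration, not the BoomFilters implementation and not tuned for any particular false positive rate.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"math/rand"
)

// StableBloom is a hypothetical minimal sketch of a Stable Bloom filter.
type StableBloom struct {
	cells []uint8 // fixed-size array of small counters
	max   uint8   // value a cell is set to when an item is added
	k     int     // cells set per added item
	p     int     // cells decremented per added item
	rng   *rand.Rand
}

func NewStableBloom(m, k, p int, max uint8, seed int64) *StableBloom {
	return &StableBloom{
		cells: make([]uint8, m), max: max, k: k, p: p,
		rng: rand.New(rand.NewSource(seed)),
	}
}

// indexes derives k cell positions from one 64-bit FNV-1a hash
// using double hashing (h1 + i*h2).
func (s *StableBloom) indexes(item []byte) []int {
	h := fnv.New64a()
	h.Write(item)
	sum := h.Sum64()
	h1, h2 := uint32(sum), uint32(sum>>32)|1
	out := make([]int, s.k)
	for i := range out {
		out[i] = int((uint64(h1) + uint64(i)*uint64(h2)) % uint64(len(s.cells)))
	}
	return out
}

// Test reports whether all of the item's cells are non-zero.
func (s *StableBloom) Test(item []byte) bool {
	for _, i := range s.indexes(item) {
		if s.cells[i] == 0 {
			return false
		}
	}
	return true
}

// Add decrements p consecutive cells from a random start (this is what
// slowly evicts old items), then sets the item's k cells to max.
func (s *StableBloom) Add(item []byte) {
	start := s.rng.Intn(len(s.cells))
	for j := 0; j < s.p; j++ {
		i := (start + j) % len(s.cells)
		if s.cells[i] > 0 {
			s.cells[i]--
		}
	}
	for _, i := range s.indexes(item) {
		s.cells[i] = s.max
	}
}

func main() {
	f := NewStableBloom(1024, 3, 10, 3, 1)
	fmt.Println(f.Test([]byte("a"))) // false: the filter starts empty
	f.Add([]byte("a"))
	fmt.Println(f.Test([]byte("a"))) // true: just added
	for i := 0; i < 100000; i++ {
		f.Add([]byte(fmt.Sprintf("item-%d", i)))
	}
	fmt.Println(len(f.cells)) // 1024: memory use never grows
}
```

The trade-off is that, unlike a classic Bloom filter, this structure can return false negatives for items added long ago, so whether it fits depends on how long feed items need to be remembered.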

Or there might be another filter model that fits better, perhaps some kind of Cuckoo filter?
I think I found the original paper for the scalable Bloom filter, though: doi:10.1016/j.ipl.2006.10.007

Investigate:

References:
