Skip to content

Is the uniform histogram type wrong? #81

@jlouis

Description

@jlouis

Please take a look at the code at

https://github.com/boundary/folsom/blob/master/src/folsom_sample_uniform.erl#L50

which updates a sample uniformly in the histogram reservoir. The L46 clause is hit whenever we have fewer than 1028 samples and we insert a new sample in the table. Once we have 1028 samples, we look at N. Suppose N is 2056 since we have taken that many samples. We take a random value, which could be 1768 and then maybe update the reservoir. In half the cases, we won't be bumping the reservoir here, depending ont he random outcome.

I have reservoir's with N > 1000_1000_1000. They will almost never update the reservoir. Is this intended behavior of the uniform sample type? I am afraid some of the logic is wrong and we never ever replace entries in the reservoir for large N.

I could change to slide_uniform to fix this, but I want to make sure I understand how this is supposed to work.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions