Skip to content

Implement low-variance sampling algorithm#482

Open
larodriguez22 wants to merge 6 commits intomainfrom
larodriguez22/48-low-variance-sampling
Open

Implement low-variance sampling algorithm#482
larodriguez22 wants to merge 6 commits intomainfrom
larodriguez22/48-low-variance-sampling

Conversation

@larodriguez22
Copy link
Collaborator

@larodriguez22 larodriguez22 commented Apr 30, 2025

Proposed changes

This PR adds the strategy of low-variance sampling for the resampling step. Adds a new range adaptor called beluga::views::low_variance_sample. Related to: #48

Type of change

  • 🐛 Bugfix (change which fixes an issue)
  • 🚀 Feature (change which adds functionality)
  • 📚 Documentation (change which fixes or extends documentation)

Checklist

Put an x in the boxes that apply. This is simply a reminder of what we will require before merging your code.

  • Lint and unit tests (if any) pass locally with my changes
  • I have added tests that prove my fix is effective or that my feature works
  • I have added necessary documentation (if appropriate)
  • All commits have been signed for DCO

Additional comments

How to run it?

  • Run tests:
colcon build --packages-up-to beluga && ./build/beluga/test/beluga/test_beluga --gtest_filter="*LowVarianceSampleView*"
  • Run microbenchmarks
colcon build --packages-up-to beluga && ./build/beluga/test/benchmark/benchmark_beluga --benchmark_filter=Low

@larodriguez22 larodriguez22 requested a review from hidmic April 30, 2025 03:52
@larodriguez22 larodriguez22 force-pushed the larodriguez22/48-low-variance-sampling branch from 4a84380 to 614ed8f Compare April 30, 2025 04:09
Copy link
Collaborator

@hidmic hidmic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great first take @larodriguez22! I think adding some unit tests will help a lot.

@larodriguez22 larodriguez22 force-pushed the larodriguez22/48-low-variance-sampling branch from bd2804c to 3c63caf Compare May 12, 2025 19:51
Copy link
Collaborator

@hidmic hidmic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Second pass. Great job @larodriguez22 !


/// Position the current iterator.
constexpr void next() {
++m_;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@larodriguez22 meta: can we use complete names for variables? Or document the algorithm to put u, c, m, M in context?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I used those names because of the book. But I can change them

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the algorithm itself I don't mind using single letter variables so long as we have a comment block right next to it that explains what those variables mean.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

Copy link
Collaborator

@glpuga glpuga left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good work @larodriguez22 .

Which of the versions discussed in #48 is targetted here? this looks like the version in Prob Robotics, but that version is known not to work with KLD. Is this meant to be used with fixed resampling?

c_{*weights_begin_} {}

/// Access the current iterator.
[[nodiscard]] constexpr decltype(auto) read() const noexcept(noexcept(*range_begin_)) { return *it_; }
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

noexcept(noexcept(*range_begin_))

C++ needs to burn in hell.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was like that in the example of SampleView, isn't?

Comment on lines 176 to 177
template <class T, class U, class V>
constexpr auto operator()(T&& t, U&& u, V& v) const {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's not use single letter variables and template arguments, and even less if they are unrelated to their role in the code. TUV are just three consecutive letters. Names should be convey meaning. Use whole words, multiple words if needed, we are not billed for the characters we use.

https://google.github.io/styleguide/cppguide.html#General_Naming_Rules

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I used those names, because of conventions to the book. But I'll change them. Thanks for the link

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change is pending.

Comment on lines 190 to 192
static_assert(ranges::range<T>);
static_assert(std::is_lvalue_reference_v<U&&>); // Assume U is a URNG
return low_variance_sample_from_range(std::forward<T>(t), u);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this needs more explaining.

@hidmic hidmic requested review from glpuga and hidmic June 3, 2025 14:17
@larodriguez22
Copy link
Collaborator Author

I created the tests, but @hidmic or @glpuga. Do you have any additional scenarios I could make?

Copy link
Collaborator

@hidmic hidmic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work @larodriguez22. Second pass.

double r_;
uint64_t m_;
uint64_t i_;
double c_;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@larodriguez22 meta: why do we force this to use double types, instead of using the Range value type?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed c_ to ranges::range_value_t<Weights>, but I think the other ones must be fixed to be double or int numbers


/// Position the current iterator.
constexpr void next() {
++m_;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the algorithm itself I don't mind using single letter variables so long as we have a comment block right next to it that explains what those variables mean.

log/latest_test Outdated
@@ -0,0 +1 @@
test_2025-06-01_21-06-40 No newline at end of file
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@larodriguez22 let's remove this log directory. I wonder why it was picked up to begin with 🤔

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

@larodriguez22 larodriguez22 requested a review from hidmic June 23, 2025 21:43
Copy link
Collaborator

@hidmic hidmic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Third pass. Great work @larodriguez22.

- Multinomial resampling from a particle range
- [Adaptive KLD resampling][fox2001]
- [Selective resampling][grisetti2007], on-motion resampling, and interval resampling policies
- Low Variance sampling
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@larodriguez22 nit: would be nice to add a citation like we do for KLD and selective resampling.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Comment on lines 40 to 41
/// selecting each element is proportional to its weight. It works by computing a single random number r is generated in
/// the range [0, 1/M), where M is the total number of elements in the range. Then, starting from this random number r,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@larodriguez22 nit:

Suggested change
/// selecting each element is proportional to its weight. It works by computing a single random number r is generated in
/// the range [0, 1/M), where M is the total number of elements in the range. Then, starting from this random number r,
/// selecting each element is proportional to its weight. It works with a single random number r in
/// the [0, 1/M) interval, where M is the total number of elements in the range. Then, starting from this random number r,

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

c_ += *weights_begin_;
}
++m_;
it_ = std::next(range_begin_, i_);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@larodriguez22 do we need i_? Why not increment it_ in the loop, checking it against the end iterator?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right, fixed

Comment on lines 176 to 177
template <class T, class U, class V>
constexpr auto operator()(T&& t, U&& u, V& v) const {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change is pending.

Comment on lines 223 to 231
auto lv_count_1 = ranges::count(lv_output, 1);
auto mult_count_1 = ranges::count(mult_output, 1);

double lv_ratio = static_cast<double>(lv_count_1) / num_samples;
double mult_ratio = static_cast<double>(mult_count_1) / num_samples;

// Store squared deviation from expected (0.1)
lv_variances.push_back(std::pow(lv_ratio - 0.1, 2));
multinomial_variances.push_back(std::pow(mult_ratio - 0.1, 2));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@larodriguez22 this is peculiar. We are computing variances over probabilities. I was expecting variances over distribution statistics (mean, variance, etc.). I wonder if results are equivalent (they should be, right?).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is not necessarily the same


std::vector<double> lv_total_var, mult_total_var;

for (int trial = 0; trial < std::min(50, num_trials); ++trial) { // Subset for performance
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@larodriguez22 this should perhaps be a separate test.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Comment on lines +93 to +111
auto first = ranges::begin(new_container);
auto last = ranges::copy(samples, first).out;
auto result = ranges::make_subrange(first, last);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@larodriguez22 why do we do this? Is it to force computation?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it is to force computation

const auto particle_count = state.range(0);
state.SetComplexityN(particle_count);
const auto container_size = static_cast<std::size_t>(particle_count);
const auto sample_size = std::max(static_cast<std::size_t>(1), container_size / 10); // Sample 10% of particles
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@larodriguez22 nit:

Suggested change
const auto sample_size = std::max(static_cast<std::size_t>(1), container_size / 10); // Sample 10% of particles
const auto sample_size = std::max(1UL, container_size / 10UL); // Sample 10% of particles

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

for (auto&& [state, weight] : container) {
weight = dist(gen);
}
return container;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@larodriguez22 meta: don't we have to normalize weights? Same elsewhere.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True, fixed

@hidmic
Copy link
Collaborator

hidmic commented Aug 26, 2025

@larodriguez22 may I take this to completion or are you still working on it?

Signed-off-by: ADEGA <la.rodriguez@uniandes.edu.co>
… sample

Signed-off-by: ADEGA <la.rodriguez@uniandes.edu.co>
Signed-off-by: ADEGA <la.rodriguez@uniandes.edu.co>
Signed-off-by: ADEGA <la.rodriguez@uniandes.edu.co>
Signed-off-by: ADEGA <la.rodriguez@uniandes.edu.co>
Signed-off-by: ADEGA <la.rodriguez@uniandes.edu.co>
@larodriguez22 larodriguez22 force-pushed the larodriguez22/48-low-variance-sampling branch from 323a77b to 60d5ebc Compare September 1, 2025 23:16
@larodriguez22
Copy link
Collaborator Author

@larodriguez22 may I take this to completion or are you still working on it?

I took longer than expected. I'm sorry, ready for another review

@larodriguez22 larodriguez22 requested a review from hidmic September 1, 2025 23:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants