Skip to content

Comments

[7] feat: add step size estimator#17

Draft
WojtAcht wants to merge 2 commits intomainfrom
7-compute-step-sizes
Draft

[7] feat: add step size estimator#17
WojtAcht wants to merge 2 commits intomainfrom
7-compute-step-sizes

Conversation

@WojtAcht
Copy link
Member

No description provided.

Copilot AI review requested due to automatic review settings February 22, 2026 18:01
@WojtAcht WojtAcht marked this pull request as draft February 22, 2026 18:01
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR adds a step size estimation feature to help users determine optimal step sizes for basin-hopping sampling. The estimator searches for a step size that produces a target escape rate (default 50%), where the escape rate represents the proportion of perturbations that lead to different local optima.

Changes:

  • Added StepSizeEstimator class with configuration and result dataclasses to estimate optimal step sizes through a decimal refinement search algorithm
  • Exported new classes in the public API via __init__.py

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 8 comments.

File Description
src/lonpy/step_size.py Implements step size estimation using a search algorithm that progressively refines step sizes to achieve target escape rates
src/lonpy/init.py Exports new StepSizeEstimator, StepSizeEstimatorConfig, and StepSizeResult classes to the public API

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +149 to +152
domain_array = np.array(domain)
sampler = self._make_sampler()

step = 0.1
Copy link

Copilot AI Feb 22, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The estimate method doesn't validate the domain parameter. If an empty domain list is passed, domain_array will be empty and could lead to errors in _compute_escape_rate when computing p (line 97) or generating random initial points (line 102). Consider adding validation to ensure domain is non-empty and that each bound tuple has valid lower < upper values.

Copilot uses AI. Check for mistakes.
n_samples: int = 100
n_perturbations: int = 30
target_escape_rate: float = 0.5
search_precision: int = 4
Copy link

Copilot AI Feb 22, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The search_precision parameter controls the number of refinement iterations in the search algorithm (line 163), but this name is misleading. It's described in the docstring as "Decimal digits of precision for step size search" (line 19), which suggests it should control decimal precision. However, it's used as an iteration count, not precision digits. The actual decimal precision is applied on line 191 where round() is called with search_precision. This creates confusion because search_precision=4 means 4 iterations AND rounds to 4 decimal places, but these are different concepts. Consider renaming to search_iterations or clarifying the documentation to explain this dual purpose.

Copilot uses AI. Check for mistakes.

n_samples: int = 100
n_perturbations: int = 30
target_escape_rate: float = 0.5
Copy link

Copilot AI Feb 22, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The target_escape_rate configuration parameter has no validation. Values outside [0, 1] would not make sense for an escape rate (which is a probability/ratio), but there's no check to prevent invalid values. Consider adding validation in init or post_init to ensure 0 <= target_escape_rate <= 1.

Copilot uses AI. Check for mistakes.
Comment on lines +88 to +132
def _compute_escape_rate(
self,
func: Callable[[np.ndarray], float],
domain_array: np.ndarray,
step_size: float,
sampler: BasinHoppingSampler,
) -> float:
"""Compute the average escape rate for a given step size."""
bounds_array = domain_array if self.config.bounded else None
p = step_size * np.abs(domain_array[:, 1] - domain_array[:, 0])

escape_rates = []

for _ in range(self.config.n_samples):
x0 = np.random.uniform(domain_array[:, 0], domain_array[:, 1])
res = minimize(
func,
x0,
method=self.config.minimizer_method,
options=self.config.minimizer_options,
bounds=bounds_array,
)
optimum = res.x
optimum_hash = sampler._hash_solution(optimum)

escapes = 0
for _ in range(self.config.n_perturbations):
x_perturbed = sampler._perturbation(optimum, p, bounds_array)
res_perturbed = minimize(
func,
x_perturbed,
method=self.config.minimizer_method,
options=self.config.minimizer_options,
bounds=bounds_array,
)
new_hash = sampler._hash_solution(res_perturbed.x)
if new_hash != optimum_hash:
escapes += 1

escape_rates.append(escapes / self.config.n_perturbations)

return float(np.mean(escape_rates))

def estimate(
self,
Copy link

Copilot AI Feb 22, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The _compute_escape_rate method is called multiple times during the search (once per step size tested), and each call runs n_samples * n_perturbations minimizations. With defaults (n_samples=100, n_perturbations=30), that's 3000 minimizations per step size tested. With search_precision=4, this could be ~4-40+ step sizes tested (depending on when the upper bound is found), resulting in 12,000-120,000+ minimizations. For expensive objective functions, this could be very slow. Consider documenting the computational cost in the class or method docstring, or providing guidance on adjusting n_samples/n_perturbations for expensive functions.

Copilot uses AI. Check for mistakes.
last_tested: tuple[float, float] | None = None

for _ in range(self.config.search_precision):
while step <= 1.0 + 1e-12:
Copy link

Copilot AI Feb 22, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The condition allows step to exceed 1.0 (step <= 1.0 + 1e-12), but step_size is meant to represent a percentage of domain range. A step_size > 1.0 means perturbations larger than the entire domain range, which may not be meaningful for bounded problems. While this might be intentional to explore edge cases, it could lead to unexpected behavior. Consider documenting why values > 1.0 are allowed, or constraining the search to step <= 1.0 if that's the intended maximum.

Suggested change
while step <= 1.0 + 1e-12:
while step <= 1.0:

Copilot uses AI. Check for mistakes.
n_samples: int = 100
n_perturbations: int = 30
target_escape_rate: float = 0.5
search_precision: int = 4
Copy link

Copilot AI Feb 22, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The search_precision parameter has no validation. If set to 0 or negative, the outer loop (line 163) won't execute any iterations, leading to an empty candidates list and a ValueError when calling min() on line 188. Consider adding validation to ensure search_precision >= 1.

Copilot uses AI. Check for mistakes.
Comment on lines +110 to +126
optimum = res.x
optimum_hash = sampler._hash_solution(optimum)

escapes = 0
for _ in range(self.config.n_perturbations):
x_perturbed = sampler._perturbation(optimum, p, bounds_array)
res_perturbed = minimize(
func,
x_perturbed,
method=self.config.minimizer_method,
options=self.config.minimizer_options,
bounds=bounds_array,
)
new_hash = sampler._hash_solution(res_perturbed.x)
if new_hash != optimum_hash:
escapes += 1

Copy link

Copilot AI Feb 22, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The step size estimator accesses private methods (_perturbation, _round_value, _hash_solution) of BasinHoppingSampler. While this works, it creates tight coupling and fragility. If these private methods change signatures or behavior, this code will break. Consider either making these methods public (removing the underscore prefix), or creating public helper methods in BasinHoppingSampler that expose this functionality for reuse.

Copilot uses AI. Check for mistakes.
Comment on lines +172 to +181
# Refine: reduce increment and resume from lower bound
increment /= 10
if best_lower is not None:
step = best_lower[0] + increment
else:
step = increment
break
else:
# Reached step > 1.0 without finding upper bound
break
Copy link

Copilot AI Feb 22, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the search algorithm, when rate >= target (line 172-181), the code breaks out of the inner while loop and continues with the next iteration of the outer for loop. However, the step variable may not be reset appropriately if best_lower is None. In this case, step is set to increment (line 180), but increment was just divided by 10 (line 176). On subsequent outer loop iterations, if rate < target on the first test, step will be incremented by the small increment value and could remain below the previously tested values, causing redundant testing or inefficient search. Consider whether the initialization of step should account for previously explored ranges.

Copilot uses AI. Check for mistakes.
@WojtAcht WojtAcht linked an issue Feb 24, 2026 that may be closed by this pull request
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add a new module to compute step sizes for continuous problems

1 participant