Skip to content

Does increasing the expansion factor ($\gamma$) guarantee monotonic improvement, or is there a risk of "search pathology"? #2

@lrw890123

Description

@lrw890123

Hi everyone,

First, thank you for sharing the code and this excellent paper. As a PhD student researching reinforcement learning and routing problems, I found the integration of simulation into a beam search framework very insightful.

I have a theoretical question regarding the expansion factor ($\gamma$) in the Simulation-guided Beam Search (SGBS) algorithm.

Intuitively, evaluating more candidate branches by increasing $\gamma$ should yield better, or at least equal, results. However, SGBS evaluates these expanded child nodes using single greedy rollouts, which act as imperfect heuristic estimates. Because the beam width ($\beta$) strictly limits the number of nodes that survive the pruning phase, I am wondering if increasing $\gamma$ might mathematically break the monotonic improvement of the search.

To illustrate this potential "search pathology" or "crowding out" effect, consider a simple toy example:

  • Suppose the beam width is $\beta = 2$.
  • Scenario A ($\gamma = 2$): The algorithm evaluates 2 child nodes, $a_1$ (greedy rollout score: 10) and $a_2$ (score: 15). Both survive the pruning phase. Deep in the tree, $a_1$ actually leads to the global optimum (score: 100). The search successfully finds it.
  • Scenario B ($\gamma = 3$): The algorithm evaluates 3 child nodes: $a_1$ (10), $a_2$ (15), and a new node $a_3$ (score: 20). Because $\beta = 2$, the algorithm keeps the top two rollout scorers ($a_3$ and $a_2$) and prunes $a_1$.
  • Result: The strictly larger $\gamma$ introduces a "deceptive" node ($a_3$) that crowds out the globally optimal path ($a_1$), resulting in a worse final solution.

In Appendix D, the empirical results show that the performance gain quickly becomes marginal as $\gamma$ increases.

My question for the authors:

During your experiments, did you ever observe specific instances where using a strictly larger $\gamma$ (with a fixed $\beta$) actually resulted in a worse final solution due to this "crowding out" effect?

Thank you for your time and for any insights you can share!

Best regards,

Ruowen Li

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions