@MagellaX MagellaX commented Oct 6, 2025

This implements a meta-controller that dynamically predicts the optimal number of turns per problem based on complexity signals, and enables confidence-based early stopping to improve efficiency.

Key changes:

  • Created AdaptiveTurnPredictor class with heuristic complexity estimation using prompt length, token diversity, entropy, and historical performance
  • Integrated per-sample turn prediction into AgentHelper.run_llm_loop
  • Added confidence-based early stopping logic with convergence detection
  • Extended meta_info to track predicted turns, early stopping, and confidence scores
  • Added configuration section in simpletir_trainer.yaml for adaptive turns (disabled by default)
  • Scales to 50+ turns for hard problems while maintaining backward compatibility
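
For readers who want a feel for the approach, here is a minimal sketch of how a heuristic turn predictor and a confidence-based stop check could fit together. It is not the code in this PR: the function names (`predict_turns`, `should_stop_early`), the equal weighting of signals, the 512-token length normalizer, the whitespace tokenization, and the threshold, patience, and tolerance defaults are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def token_entropy(tokens):
    """Shannon entropy (in bits) of the token frequency distribution."""
    if not tokens:
        return 0.0
    counts = Counter(tokens)
    total = len(tokens)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def predict_turns(prompt, min_turns=1, max_turns=50, history_solve_rate=None):
    """Map heuristic complexity signals to a turn budget in [min_turns, max_turns].

    All weights and normalizers are illustrative assumptions.
    """
    tokens = prompt.split()  # whitespace split stands in for a real tokenizer
    if not tokens:
        return min_turns

    unique = set(tokens)
    length_score = min(len(tokens) / 512.0, 1.0)   # assumed length normalizer
    diversity_score = len(unique) / len(tokens)    # type/token ratio
    max_entropy = math.log2(len(unique)) if len(unique) > 1 else 1.0
    entropy_score = token_entropy(tokens) / max_entropy

    # Equal weighting of the three prompt signals (an assumption).
    complexity = (length_score + diversity_score + entropy_score) / 3.0

    if history_solve_rate is not None:
        # A low historical solve rate is treated as a sign of a harder problem.
        complexity = 0.5 * complexity + 0.5 * (1.0 - history_solve_rate)

    complexity = min(max(complexity, 0.0), 1.0)
    return min_turns + round(complexity * (max_turns - min_turns))


def should_stop_early(confidences, threshold=0.9, patience=2, tol=0.01):
    """Decide whether to stop, given the per-turn confidence scores so far.

    Stops when the latest confidence reaches `threshold`, or when the last
    `patience + 1` scores differ by less than `tol` (convergence).
    """
    if not confidences:
        return False
    if confidences[-1] >= threshold:
        return True
    if len(confidences) > patience:
        recent = confidences[-(patience + 1):]
        return max(recent) - min(recent) < tol
    return False
```

In a loop like `AgentHelper.run_llm_loop`, the predicted budget would cap the number of turns for each sample, and the stop check would run after every turn so that easy problems finish well before the cap.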

MagellaX force-pushed the feature/adaptive-turn-budget branch from c97e061 to 74f3802 on October 6, 2025 15:04

MagellaX commented Oct 6, 2025

Hey @ltzheng @AIDefender, you can merge this. Any thoughts from your side?

Let me know your thoughts. I will be deleting the branch later, so please merge this; it has been thoroughly tested.
