Hi team,
Thanks for the great work on verl.
I have a question regarding the cost of rollout sampling during training.
In my training setup:
Dataset size: 80,000 prompts
Each prompt generates 16 rollouts
Each rollout calls an external API (e.g., Serper)
The total cost comes to more than $6,000 for the rollout stage alone, mainly because of the large number of API queries: 80,000 × 16 × n = 1.28n million, where n is the number of API queries per rollout.
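For reference, here is a rough sketch of how I am estimating this. It assumes `n` is the average number of API queries per rollout, and `usd_per_1k_queries` is a placeholder price that I made up for illustration, not the actual Serper rate; the function names are also just mine.

```python
def rollout_queries(num_prompts: int, rollouts_per_prompt: int,
                    queries_per_rollout: float) -> float:
    """Total API queries for one pass over the dataset."""
    return num_prompts * rollouts_per_prompt * queries_per_rollout


def rollout_cost(total_queries: float, usd_per_1k_queries: float) -> float:
    """Estimated API cost in USD, given a price per 1,000 queries."""
    return total_queries / 1000 * usd_per_1k_queries


if __name__ == "__main__":
    # Hypothetical price, for illustration only; substitute your real rate.
    usd_per_1k_queries = 1.0
    for n in (1, 2, 5):  # assumed average queries per rollout
        queries = rollout_queries(80_000, 16, n)
        cost = rollout_cost(queries, usd_per_1k_queries)
        print(f"n={n}: {queries:,.0f} queries, about ${cost:,.2f}")
```

The estimate scales linearly in all three factors, so reducing the rollouts per prompt or the queries per rollout lowers the bill proportionally.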
Could you share roughly how much your own training cost?
Thank you!