Thank you so much for your excellent work. I noticed that the user simulator in the TAU2 benchmark uses GPT-4.1, but you used GPT-5 for both training and evaluation. Have you run any additional experiments on how the choice of user model affects the results? It seems to me that the effect is significant on more difficult tasks. Could you explain the considerations behind choosing GPT-5, and give a rough estimate of the training cost? This is very important for us, given our limited resources. Thank you again for your outstanding open-source work!