Hello, thank you for the great work on RLVMR. I'm particularly interested in the ReAct results reported in the experiments (Qwen-1.5B/7B ReAct).
It seems that your experimental setup and evaluation environment differ from those of the official ReAct implementation. Could you please clarify whether the code for the prompting-only ReAct baseline (without any fine-tuning) and its evaluation pipeline are included in this repository?