Some tasks, for example [LiveCodeBench](https://github.com/mlfoundations/evalchemy/blob/ce5cea94f9f0f61388d2234afb01d811ff4357f4/eval/chat_benchmarks/LiveCodeBench/eval_instruct.py#L69), use `self.max_new_tokens = max_tokens`.

Other tasks, for example [HumanEval](https://github.com/mlfoundations/evalchemy/blob/ce5cea94f9f0f61388d2234afb01d811ff4357f4/eval/chat_benchmarks/HumanEval/eval_instruct.py#L48), use `self.max_token = max_tokens`.

Is there a particular reason for the different attribute names, or should they be unified?
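
To make the difference concrete, here is a minimal sketch of the two naming patterns and one possible unified form. The class names `LiveCodeBenchLike`, `HumanEvalLike`, and `UnifiedBenchmark` are hypothetical stand-ins, not the actual evalchemy classes; only the attribute names come from the linked lines.

```python
# Hypothetical stand-ins for illustration; not the real evalchemy classes.

class LiveCodeBenchLike:
    def __init__(self, max_tokens: int):
        # Pattern seen in the LiveCodeBench link above.
        self.max_new_tokens = max_tokens


class HumanEvalLike:
    def __init__(self, max_tokens: int):
        # Pattern seen in the HumanEval link above.
        self.max_token = max_tokens


class UnifiedBenchmark:
    def __init__(self, max_tokens: int):
        # One possible unified name (an assumption, not a decided convention).
        self.max_new_tokens = max_tokens


# Code that reads the limit generically has to know both names today.
for bench in (LiveCodeBenchLike(512), HumanEvalLike(512)):
    limit = getattr(bench, "max_new_tokens", None) or getattr(bench, "max_token", None)
    print(type(bench).__name__, limit)
```

With a single attribute name, the `getattr` fallback in the loop would no longer be needed.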