Conversation
Does this branch test the "evaluate" code? I tested with the `eval_flickr30` flag, and it reported an error. My script is: I printed the two corresponding length values via `# print(x.shape[1], media_locations.shape[1])` in helpers.py before line 240. On the last call, 48 > 47. And if I set `batchsize=2`, it reports another error.
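For context, the mismatch above is between the text-sequence dimension of `x` and the `media_locations` mask. A minimal sketch (the helper and shapes here are hypothetical, dependency-free stand-ins for the tensors in helpers.py) of the kind of consistency check that would surface this 48-vs-47 error explicitly:

```python
# Hypothetical shapes standing in for x.shape and media_locations.shape
def check_media_alignment(x_shape, media_locations_shape):
    """Raise if the media mask does not cover every text position (dim 1)."""
    if x_shape[1] != media_locations_shape[1]:
        raise ValueError(
            f"sequence length mismatch: x has {x_shape[1]} positions "
            f"but media_locations has {media_locations_shape[1]}"
        )

check_media_alignment((1, 47, 512), (1, 47))      # aligned: passes silently

try:
    check_media_alignment((1, 48, 512), (1, 47))  # the 48 > 47 case above
except ValueError as e:
    print(e)
```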
In evaluate.py line 747, the code should be revised from to because `min_new_tokens` and `max_new_tokens` are accepted arguments for the LLM's `generate()`.
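For reference, in Hugging Face's `generate()` these two arguments bound the number of *newly generated* tokens (excluding the prompt): EOS is not honored before `min_new_tokens`, and decoding stops by `max_new_tokens`. A toy, dependency-free stand-in (not the real `transformers` API) illustrating those semantics:

```python
import random

def toy_generate(prompt, min_new_tokens=1, max_new_tokens=20, eos_token=0):
    """Toy sketch of min_new_tokens / max_new_tokens semantics:
    ignore EOS until at least min_new_tokens have been produced,
    then stop at EOS or once max_new_tokens is reached."""
    out = list(prompt)
    new = 0
    while new < max_new_tokens:
        tok = random.randint(0, 9)
        # EOS is only honored once the minimum new-token count is reached.
        if tok == eos_token and new >= min_new_tokens:
            break
        out.append(tok)
        new += 1
    return out

seq = toy_generate([5, 5], min_new_tokens=3, max_new_tokens=8)
# number of new tokens always falls in [3, 8]
print(len(seq) - 2)
```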
Hi @liyongqi67, thanks for pointing out these issues! Sorry, I have not finished cleaning up this gnarly merge yet -- will get to it in the next few days. |
Many thanks for your effort. |
Fix for `None` `past_key_values`; getting supported tasks
Modular eval code
TODOs: