When I evaluate on the TüBa-D/Z test set from the SemEval-2010 shared task, the scorer stops with the following error: "Found too many repeated mentions (> 10) in the response, so refusing to score. Please fix the output." Does anyone know how to fix it? Thanks.