[Feature request] - Add metadata check to ensure a single benchmark version across results

At any given time, all results for a given benchmark should refer to the same benchmark version. We need a test that reads the results metadata and verifies that all results refer to the same version of the benchmark input. Different codes may of course use different versions; the check applies to the results of each code separately.
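
As a starting point for discussion, here is a minimal sketch of such a check. It assumes each result's metadata has already been loaded into a dict with the keys `benchmark`, `code`, and `version`. These key names, and the function names `find_version_conflicts` and `check_single_benchmark_version`, are hypothetical and would need to be adapted to the actual metadata layout.

```python
from collections import defaultdict


def find_version_conflicts(results):
    """Group results by (benchmark, code) and report groups with mixed versions.

    `results` is an iterable of metadata dicts. The keys "benchmark", "code"
    and "version" are assumed names, not the project's actual schema.
    Returns a dict mapping (benchmark, code) to the sorted list of versions
    found, containing only the groups that have more than one version.
    """
    versions = defaultdict(set)
    for meta in results:
        key = (meta["benchmark"], meta["code"])
        versions[key].add(meta["version"])
    return {key: sorted(found) for key, found in versions.items() if len(found) > 1}


def check_single_benchmark_version(results):
    """Raise AssertionError if any (benchmark, code) group mixes versions."""
    conflicts = find_version_conflicts(results)
    assert not conflicts, f"Mixed benchmark versions found: {conflicts}"
```

Grouping by `(benchmark, code)` rather than by benchmark alone is what allows different codes to use different versions while still flagging a single code whose results mix versions. The second function could be called from a test-suite test once the metadata loading is in place.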