Currently the display can be very misleading: it shows a score like 100% for vllm/Qwen/Qwen3-Coder-30B-A3B-Instruct, but only because that model got 100% in a single category.
Two things need to happen: #22, so that folks can filter, and a change to the default display so that it shows only models that have completed all 23 categories, even when "unofficial" is checked.
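
A rough sketch of the default-display rule I have in mind, assuming the intent is to hide models with fewer than 23 categories unless someone opts in. The names here (`ModelResult`, `visible_models`, `include_incomplete`) are hypothetical and not taken from the codebase; only the count of 23 comes from this issue.

```python
from dataclasses import dataclass, field

# Number of categories a model must finish to count as complete (from this issue).
TOTAL_CATEGORIES = 23


@dataclass
class ModelResult:
    """Hypothetical record for one leaderboard entry."""
    name: str
    official: bool
    # Maps category name to score; only completed categories are present.
    category_scores: dict[str, float] = field(default_factory=dict)

    @property
    def is_complete(self) -> bool:
        return len(self.category_scores) >= TOTAL_CATEGORIES


def visible_models(results, include_unofficial=False, include_incomplete=False):
    """Return the entries to display.

    Incomplete models stay hidden by default, even when unofficial
    results are included, so a single-category 100% cannot top the list.
    """
    rows = []
    for result in results:
        if not result.official and not include_unofficial:
            continue
        if not result.is_complete and not include_incomplete:
            continue
        rows.append(result)
    return rows
```

With this shape, checking "unofficial" would map to `include_unofficial=True`, and the filter from #22 could map to something like `include_incomplete=True` for folks who want to see partial runs.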
