I see a lot of exciting updates on X (https://x.com/METR_Evals/status/1950740117020389870) and in the top chart here: https://metr.org/blog/2025-03-19-measuring-ai-ability-to-complete-long-tasks/

Would love to see all those juicy details in [`data/external/all_runs.jsonl`](https://github.com/METR/eval-analysis-public/blob/main/data/external/all_runs.jsonl)! Thanks for this incredibly important project.