About WOSAC evaluation (#236)

[wosac_rollout_pkl.zip](https://github.com/user-attachments/files/24452428/wosac_rollout_pkl.zip)

Hi, thanks for releasing this system.

Is the WOSAC evaluation pipeline here still consistent with the official WOSAC API?

In my experiments, I noticed that GPUDrive seems to discard scenarios that involve traffic lights.

The road topology used during evaluation (2D road edges) also appears to differ from the original WOSAC setup (3D map features from the TFRecord).

As a result, the metric scores for the same scenarios differ considerably from those obtained with the official WOSAC API.
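To illustrate why I would expect the 2D vs. 3D difference to matter, here is a minimal sketch that assumes a metric built on point-to-road-edge distance. The function name and coordinates are hypothetical and are not taken from either codebase. It only shows that dropping the z coordinate can change a distance whenever the edge sits at a different elevation, for example on an overpass.

```python
import math

def point_to_segment_distance(p, a, b):
    """Distance from point p to segment a-b; works for 2D or 3D tuples."""
    ab = [bi - ai for ai, bi in zip(a, b)]
    ap = [pi - ai for ai, pi in zip(a, p)]
    denom = sum(c * c for c in ab)
    # Clamp the projection parameter so the closest point stays on the segment.
    t = 0.0 if denom == 0 else max(0.0, min(1.0, sum(x * y for x, y in zip(ap, ab)) / denom))
    closest = tuple(ai + t * c for ai, c in zip(a, ab))
    return math.dist(tuple(p), closest)

# Hypothetical coordinates, for illustration only: an agent at ground level
# and a road-edge segment 4 m above it.
agent = (0.0, 3.0, 0.0)
edge_start = (-5.0, 0.0, 4.0)
edge_end = (5.0, 0.0, 4.0)

d3 = point_to_segment_distance(agent, edge_start, edge_end)              # uses x, y, z
d2 = point_to_segment_distance(agent[:2], edge_start[:2], edge_end[:2])  # z dropped

print(f"3D distance: {d3:.2f} m, 2D distance: {d2:.2f} m")
```

In this toy case the 2D distance is smaller than the 3D one, so any threshold applied to it could give a different outcome for the same agent.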

Could you please clarify whether this behavior is expected, and whether the evaluation results are intended to be directly comparable to the original WOSAC metrics?
 
Below are sample results for the same 2 pkl files, with the same agent IDs, evaluated with WOSAC and PufferDrive:
  - Scenario ff8b5c45b9b38bcc
      - Agent IDs: 1725, 1731
  - Scenario ff8d03ceda158ac6
      - Agent IDs: 2592, 2611, 2616, 2617, 2773

<img width="577" height="380" alt="Image" src="https://github.com/user-attachments/assets/1073edfc-7dfd-4c5a-9ba5-f967901a237b" />
