[STATSBOMB] Two Incorrect Transformations #483

@UnravelSports

Hi all,

I just noticed two pretty significant issues when transforming StatsBomb data. This extends some of the issues pointed out by @jan-swiatek in #464.

  1. StatsBomb does not have a pitch_length and pitch_width in the PitchDimensions?

When I try

from kloppy import statsbomb

API_URL = "https://raw.githubusercontent.com/statsbomb/open-data/master/data/"
dataset = statsbomb.load(
    event_data=f"{API_URL}/events/3794687.json",
    lineup_data=f"{API_URL}/lineups/3794687.json",
    three_sixty_data=f"{API_URL}/three-sixty/3794687.json",
    coordinates="statsbomb",
).transform(
    to_coordinate_system="sportec",
    to_orientation="ACTION_EXECUTING_TEAM",
)

I get MissingDimensionError: The pitch boundaries need to be fully specified to convert coordinates.

This is a misleading error. It ultimately seems to stem from coordinates="statsbomb" leaving the underlying pitch length / width unspecified, because the variant below works fine. There we set coordinates="sportec" at load time and then still apply to_orientation="ACTION_EXECUTING_TEAM".

API_URL = "https://raw.githubusercontent.com/statsbomb/open-data/master/data/"

"""Load StatsBomb data for Belgium - Portugal at Euro 2020"""
dataset = statsbomb.load(
    event_data=f"{API_URL}/events/3794687.json",
    lineup_data=f"{API_URL}/lineups/3794687.json",
    three_sixty_data=f"{API_URL}/three-sixty/3794687.json",
    coordinates="sportec",  # <-- NOTE THIS
).transform(
    to_orientation="ACTION_EXECUTING_TEAM",
)
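
To illustrate why a missing pitch length or width would surface as this kind of error, here is a minimal toy sketch. It is not kloppy's implementation: `ToyPitchDimensions`, `x_to_metres`, and the local `MissingDimensionError` class are hypothetical stand-ins. The assumption is that a coordinate range in provider units (0 to 120 here) can only be converted to a metric system when a real-world length is attached to it.

```python
from dataclasses import dataclass
from typing import Optional


class MissingDimensionError(ValueError):
    """Toy stand-in for the error quoted above."""


@dataclass
class ToyPitchDimensions:
    x_min: float
    x_max: float
    pitch_length: Optional[float] = None  # real-world length in metres, if known

    def x_to_metres(self, x: float) -> float:
        # Without a real-world length there is no scale factor to apply.
        if self.pitch_length is None:
            raise MissingDimensionError(
                "The pitch boundaries need to be fully specified to convert coordinates."
            )
        return (x - self.x_min) / (self.x_max - self.x_min) * self.pitch_length


# Provider-style units (0..120) with no real-world length attached.
provider_units = ToyPitchDimensions(x_min=0.0, x_max=120.0)
# The same range once a real-world length is known.
with_length = ToyPitchDimensions(x_min=0.0, x_max=120.0, pitch_length=105.0)

print(with_length.x_to_metres(60.0))  # halfway along the pitch

try:
    provider_units.x_to_metres(60.0)
except MissingDimensionError as exc:
    print(f"conversion failed: {exc}")
```

If something like this is what happens internally, it would explain why loading with a metric system such as "sportec" avoids the error while transforming from "statsbomb" afterwards does not.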

  2. Freeze frames get double converted in the StatsBomb deserializer.

Let's say we now do:

dataset_kl = statsbomb.load(
    event_data=f"{API_URL}/events/3794687.json",
    lineup_data=f"{API_URL}/lineups/3794687.json",
    three_sixty_data=f"{API_URL}/three-sixty/3794687.json",
    coordinates="kloppy",
)
post_transform_pass = dataset_kl.transform(
    to_coordinate_system="secondspectrum",
    to_orientation="ACTION_EXECUTING_TEAM"
).get_event_by_id(
    "8022c113-e349-4b0b-b4a7-a3bb662535f8"
)
assert post_transform_pass.coordinates.x == post_transform_pass.freeze_frame.ball_coordinates.x
assert post_transform_pass.coordinates.y == post_transform_pass.freeze_frame.ball_coordinates.y

We would expect these coordinates to be the same, since the freeze frame ball coordinates are taken directly from the event coordinates. However, in the StatsBomb deserializer we first transform the events to whatever coordinate system we have set, and then the freeze frame coordinates are converted a second time in the special freeze frames loop.
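
The effect of converting twice can be shown with a toy example. This is only a sketch under assumptions: `unit_to_centred_metres` is a hypothetical conversion from a 0..1 scale to metres centred on the pitch, not a kloppy function, and the 105 m pitch length is an illustrative value.

```python
def unit_to_centred_metres(x: float, pitch_length: float = 105.0) -> float:
    """Hypothetical conversion from a 0..1 scale to metres centred on the pitch."""
    return (x - 0.5) * pitch_length


event_x = 0.75  # event coordinate on the 0..1 scale

# What the event itself receives: one conversion.
converted_once = unit_to_centred_metres(event_x)

# What a freeze frame receives if an already converted value is converted again.
converted_twice = unit_to_centred_metres(converted_once)

print(converted_once)
print(converted_twice)

# The two values diverge, which is the kind of mismatch the asserts above catch.
assert converted_once != converted_twice
```

Any value that has already been converted and is then fed through the same conversion ends up far outside the pitch, so the event coordinates and the freeze frame ball coordinates can no longer match.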

self.freeze_frame_transformer = self.get_transformer(
    pitch_length=metadata.pitch_dimensions.pitch_length,
    pitch_width=metadata.pitch_dimensions.pitch_width,
    provider=metadata.coordinate_system.provider,
)
for event in dataset:
    if "freeze_frame" in event.raw_event.get("shot", {}):
        event.freeze_frame = self.freeze_frame_transformer.transform_frame(
            parse_freeze_frame(
                freeze_frame=event.raw_event["shot"]["freeze_frame"],
                home_team=teams[0],
                away_team=teams[1],
                event=event,
                fidelity_version=data_version.shot_fidelity_version,
            )
        )
    if not event.freeze_frame and event.event_id in three_sixty_data:
        freeze_frame = three_sixty_data[event.event_id]
        event.freeze_frame = self.freeze_frame_transformer.transform_frame(
            parse_freeze_frame(
                freeze_frame=freeze_frame["freeze_frame"],
                home_team=teams[0],
                away_team=teams[1],
                event=event,
                fidelity_version=data_version.xy_fidelity_version,
                visible_area=freeze_frame["visible_area"],
            )
        )

It seems a bit tricky to fix. I tried adding an independent freeze frame transformer (as below), but this doesn't actually do the trick. Not quite sure how to fix this one.
