Hi all,
Just noticed two pretty significant issues when transforming StatsBomb data. This extends the issues pointed out by @jan-swiatek in #464.
- StatsBomb does not have a pitch_length and pitch_width in the PitchDimensions?
When I try:

from kloppy import statsbomb

# No trailing slash here: the f-strings below add their own "/"
API_URL = "https://raw.githubusercontent.com/statsbomb/open-data/master/data"
dataset = statsbomb.load(
    event_data=f"{API_URL}/events/3794687.json",
    lineup_data=f"{API_URL}/lineups/3794687.json",
    three_sixty_data=f"{API_URL}/three-sixty/3794687.json",
    coordinates="statsbomb",
).transform(
    to_coordinate_system="sportec",
    to_orientation="ACTION_EXECUTING_TEAM",
)

I get MissingDimensionError: The pitch boundaries need to be fully specified to convert coordinates.
The error message is misleading. It seems to stem from coordinates="statsbomb" leaving the underlying pitch length / width in a state the transformer cannot use, because the variant below works fine. There we set coordinates="sportec" at load time and still apply to_orientation="ACTION_EXECUTING_TEAM".

from kloppy import statsbomb

API_URL = "https://raw.githubusercontent.com/statsbomb/open-data/master/data"
# Load StatsBomb data for Belgium - Portugal at Euro 2020
dataset = statsbomb.load(
    event_data=f"{API_URL}/events/3794687.json",
    lineup_data=f"{API_URL}/lineups/3794687.json",
    three_sixty_data=f"{API_URL}/three-sixty/3794687.json",
    coordinates="sportec",  # <! NOTE THIS
).transform(
    to_orientation="ACTION_EXECUTING_TEAM",
)

- Freeze frames get double converted in the StatsBomb deserializer.
Let's say we now do:

dataset_kl = statsbomb.load(
    event_data=f"{API_URL}/events/3794687.json",
    lineup_data=f"{API_URL}/lineups/3794687.json",
    three_sixty_data=f"{API_URL}/three-sixty/3794687.json",
    coordinates="kloppy",
)
post_transform_pass = dataset_kl.transform(
    to_coordinate_system="secondspectrum",
    to_orientation="ACTION_EXECUTING_TEAM",
).get_event_by_id(
    "8022c113-e349-4b0b-b4a7-a3bb662535f8"
)
assert post_transform_pass.coordinates.x == post_transform_pass.freeze_frame.ball_coordinates.x
assert post_transform_pass.coordinates.y == post_transform_pass.freeze_frame.ball_coordinates.y

We would expect these coordinates to be the same, since the freeze frame ball coordinates are taken directly from the event coordinates. However, in the StatsBomb deserializer we first transform the events to whatever coordinate system we have set, and then the ball coordinates are converted a second time in the special freeze frames loop.
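
To make the double conversion concrete, here is a minimal, self-contained sketch. It does not use kloppy: `Point` and `statsbomb_to_normalized` are hypothetical stand-ins for the deserializer's coordinate transformer, the 120 x 80 grid is StatsBomb's nominal coordinate range, and axis flips are ignored. It only illustrates the suspected mechanism, not the actual implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Point:
    x: float
    y: float


def statsbomb_to_normalized(p: Point) -> Point:
    """Hypothetical transformer: StatsBomb 120 x 80 units -> normalized [0, 1]."""
    return Point(x=p.x / 120.0, y=p.y / 80.0)


# Raw event location as it would appear in the StatsBomb JSON (illustrative values)
raw_location = Point(90.0, 40.0)

# Step 1: the event itself is converted once, which is correct
event_coordinates = statsbomb_to_normalized(raw_location)

# Step 2 (suspected bug): the freeze frame takes its ball position from the
# already-converted event, and the whole frame is then converted again
ball_in_frame = event_coordinates
buggy_ball = statsbomb_to_normalized(ball_in_frame)

# What we would expect instead: ball coordinates identical to the event's
expected_ball = event_coordinates

print("event:", event_coordinates)
print("freeze frame ball (double converted):", buggy_ball)
print("match:", buggy_ball == expected_ball)
```

If this is indeed what happens, any fix has to make sure the ball position entering the freeze frame is either still in raw provider coordinates before the frame is transformed, or is excluded from the second transformation.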

self.freeze_frame_transformer = self.get_transformer(
    pitch_length=metadata.pitch_dimensions.pitch_length,
    pitch_width=metadata.pitch_dimensions.pitch_width,
    provider=metadata.coordinate_system.provider,
)
for event in dataset:
    if "freeze_frame" in event.raw_event.get("shot", {}):
        event.freeze_frame = self.freeze_frame_transformer.transform_frame(
            parse_freeze_frame(
                freeze_frame=event.raw_event["shot"]["freeze_frame"],
                home_team=teams[0],
                away_team=teams[1],
                event=event,
                fidelity_version=data_version.shot_fidelity_version,
            )
        )
    if not event.freeze_frame and event.event_id in three_sixty_data:
        freeze_frame = three_sixty_data[event.event_id]
        event.freeze_frame = self.freeze_frame_transformer.transform_frame(
            parse_freeze_frame(
                freeze_frame=freeze_frame["freeze_frame"],
                home_team=teams[0],
                away_team=teams[1],
                event=event,
                fidelity_version=data_version.xy_fidelity_version,
                visible_area=freeze_frame["visible_area"],
            )
        )

It seems a bit tricky to fix. I tried adding an independent freeze frame transformer (as below), but this doesn't actually do the trick. Not quite sure how to fix this one.