-
Notifications
You must be signed in to change notification settings - Fork 82
New/to cdf #513
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
New/to cdf #513
Conversation
| dataset.metadata.date + frame.timestamp | ||
| ) | ||
| # Period | ||
| frame_data["period"] = periods.get(frame.period.id, "unknownn") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Change "unknownn" to "MISSING_MANDATORY_PERIOD_ID"
| Orientation, | ||
| BallState, | ||
| ) | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should add a check to warn the user if they are progressing with only_alive=True in their kloppy dataset, because CDF expects all frames.
if all([True for x in tracking_dataset if x.ball_state == BallState.ALIVE]):
warnings.warn(
"All frames in 'tracking_dataset' are 'ALIVE', the Commond Data Format expects 'DEAD' frames as well. Set `only_alive=False` in your kloppy `.load_tracking()` call to include 'DEAD' frames.",
UserWarning,
)|
|
||
| # Teams and players | ||
| home_players = [] | ||
| for player, coordinates in frame.players_coordinates.items(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This duplicated code (for home and away players) to loop over players needs to be a function.
| } | ||
| ) | ||
| except KeyError: | ||
| continue |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What KeyError are we handling here? Should be handled without exception
|
|
||
| # teams within the tracking data. | ||
|
|
||
| home_players_id = [] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is this loop for?
| "formation": ( | ||
| home_team.formations.at_start() | ||
| if home_team.formations.items | ||
| else self.get_starting_formation( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a small mistake in the common-data-format-validator, only "id" and "players" are mandatory here. So, we can remove "jersey_color" and "formation" as keys on the frame by frame level. If "home_team.name" is not None, we can add it too, but it's also not required.
| away_team.formations.at_start() | ||
| if away_team.formations.items | ||
| else self.get_starting_formation( | ||
| [ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same
| starters_ids.append(player.player_id) | ||
|
|
||
| for player in home_team.players: | ||
| try: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are we catching KeyErrors here?
|
|
||
| meta_away_players = [] | ||
| for player in away_team.players: | ||
| try: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same, again, these two operations for home and away are the same, we can make a function out of this
| # Add to tracking list | ||
| outputs.tracking_data.append(frame_file) | ||
|
|
||
| ###################### build now the metadata. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Create two separate functions in the serializer.py, one for generating meta data and one for tracking data to clean it up.
|
@koenvo how do we want to make this functionality ultimately available to the end user? It should be captured in some type of |
Yes, or we can start with a helper like: |
Oh, yeah I thought you were against doing the |
I believe the initial idea was to put the actual implementation into the |
This adds comprehensive write support to the open_as_file() function with efficient memory management and streaming capabilities. Key features: - BufferedStream: SpooledTemporaryFile wrapper with chunked I/O (5MB memory threshold) - Write modes: 'wb' (write), 'ab' (append) - binary only - Adapter pattern: write_from_stream() method (opt-in for adapters) - Compression support: .gz, .bz2, .xz files handled automatically - Local files and S3 URIs supported via FSSpecAdapter - Protocols for type safety: SupportsRead, SupportsWrite Implementation details: - read_from()/write_to() methods use shutil.copyfileobj for chunked copying - Context manager pattern buffers writes and flushes on exit - No breaking changes to existing read functionality
Cleaned up CDF Serializer
No description provided.