@DriesDeprest
Contributor

Summary

This PR adds support for loading SciSports EPTS (Electronic Performance Tracking System) tracking data into Kloppy.

Changes

  • New Provider: Added Provider.SCISPORTS enum value
  • New Serializer: Created scisports_epts serializer module with deserializer and metadata parser
  • Coordinate System: Added SciSportsCoordinateSystem for proper coordinate transformation
  • Code Reuse: Extracted common EPTS logic into epts_common.py module for reuse between Metrica and SciSports
  • Tests: Added comprehensive test suite with sanitized fixtures
  • Documentation: Added user guide and updated provider index
  • Public API: Added scisports.load_tracking() function

Technical Details

  • Handles the SciSports XML metadata format, which is structured differently from Metrica's
  • Parses raw tracking data using regex patterns from metadata specifications
  • Supports meter-based coordinates with proper pitch dimensions
  • Gracefully handles missing player data in frames
  • Sanitized test fixtures to remove real team/player names
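
For illustration, regex-based parsing driven by the channel list in the metadata could look roughly like the sketch below. The line layout used here (`:` after the frame number, `;` between players, `,` between channel values) and the helper names `build_frame_regex` and `parse_line` are assumptions made for the example; they are not the actual SciSports format or the code in this PR.

```python
import re
from typing import Dict, List, Optional


def build_frame_regex(player_channels: Dict[str, List[str]]) -> re.Pattern:
    """Build a regex with one named group per (player, channel) pair.

    Assumed (hypothetical) layout: "<frame>:<p1 values>;<p2 values>;..."
    with channel values separated by commas. An empty player section
    means the player has no data in that frame.
    """
    number = r"-?\d+(?:\.\d+)?"
    player_parts = []
    for player_id, channels in player_channels.items():
        values = ",".join(f"(?P<{player_id}_{ch}>{number})" for ch in channels)
        # The whole player section is optional to tolerate missing players.
        player_parts.append(f"(?:{values})?")
    return re.compile(r"(?P<frame>\d+):" + ";".join(player_parts) + r"$")


def parse_line(pattern: re.Pattern, line: str) -> Dict[str, Optional[float]]:
    """Parse one raw line; channels of missing players come back as None."""
    match = pattern.match(line.strip())
    if match is None:
        raise ValueError(f"Line does not match the expected layout: {line!r}")
    return {
        name: (float(value) if value is not None else None)
        for name, value in match.groupdict().items()
    }


pattern = build_frame_regex({"p1": ["x", "y"], "p2": ["x", "y"]})
frame = parse_line(pattern, "10:1.5,-2.0;")  # p2 has no data in this frame
```

Making each player section optional is one way to handle frames where a player is missing: the named groups still exist, they just resolve to `None`.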

Testing

All tests pass:

  • Basic deserialization functionality
  • Provider identification
  • Coordinate system transformation
  • Goalkeeper position verification for both halves

Files Changed

  • New serializer: kloppy/infra/serializers/tracking/scisports_epts/
  • Common logic: kloppy/infra/serializers/tracking/epts_common.py
  • Provider enum: kloppy/domain/models/common.py
  • Public API: kloppy/scisports.py, kloppy/_providers/scisports.py
  • Tests: kloppy/tests/test_scisports_epts.py
  • Documentation: docs/user-guide/loading-data/scisports.ipynb
  • Test fixtures: kloppy/tests/files/scisports_epts_*.xml/txt

- Fix ball_channel_map in reader.py to include all ball-related sensors (position, height-estimator, state)
- Update epts_common.py to use correct ball_z_estimate key instead of ball_z
- Ball Z coordinate now parsed from height-estimator sensor (z-estimate channel)
- Z coordinate range: 0.11m - 19.89m (realistic for football)
- All ball coordinates (X, Y, Z) now working correctly with coordinate swapping
- Backward compatible: all existing Metrica EPTS tests still pass
- Tests: 11/11 Metrica EPTS, 1/1 Metrica CSV, 5/5 Metrica Events, 2/2 SciSports EPTS
- Extract 100 frames before/after each key moment:
  - First half start (frames 89218-89418)
  - First half end (frames 113960-114160)
  - Second half start (frames 139305-139505)
  - Match end (frames 217463-217663)
- Reduced file size from 47MB to 223KB (213x smaller)
- Updated test expectations for reduced dataset
- All tests passing with 501 frames (301 period 1, 200 period 2)
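
A fixture-trimming step like the one described above could be scripted along these lines. This is only a sketch: it assumes one frame per line, with the frame number before the first `:`, and it treats the window bounds as inclusive. `keep_frame` and `trim_lines` are hypothetical names, not part of this PR.

```python
from typing import Iterable, Iterator, List, Tuple

# Frame windows taken from the list above; inclusive bounds are an assumption.
WINDOWS: List[Tuple[int, int]] = [
    (89218, 89418),    # first half start
    (113960, 114160),  # first half end
    (139305, 139505),  # second half start
    (217463, 217663),  # match end
]


def keep_frame(frame_id: int, windows: List[Tuple[int, int]] = WINDOWS) -> bool:
    """Return True if the frame falls inside any of the windows."""
    return any(start <= frame_id <= end for start, end in windows)


def trim_lines(
    lines: Iterable[str], windows: List[Tuple[int, int]] = WINDOWS
) -> Iterator[str]:
    """Yield only the raw lines whose leading frame number is in a window."""
    for line in lines:
        frame_text, sep, _ = line.partition(":")
        frame_text = frame_text.strip()
        # Lines without a numeric frame prefix are dropped.
        if sep and frame_text.isdigit() and keep_frame(int(frame_text), windows):
            yield line


sample = ["89217:a", "89218:b", "100000:c", "217663:d", "malformed"]
trimmed = list(trim_lines(sample))
```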
@UnravelSports
Contributor

UnravelSports commented Sep 1, 2025

@DriesDeprest just some initial observations:

  • We're grabbing the raw timestamp from the tracking data. This should be a timedelta object that resets with every period.
  • The same goes for setting the start and end timestamps for the periods inside _load_periods(). I do see a timedelta there, but I don't think it resets.
  • Should we make an attempt to parse the player type to PositionType.Goalkeeper and PositionType.Unknown? I see they provide "Goalkeeper" and "Field player" player types in the metadata. For tracking data applications this might be quite useful.

Additionally, do we know what these values mean in the metadata? They seem to indicate at least ball status, but since they are in the metadata they confuse me.

<BallChannelRef channelId="air"/>
<BallChannelRef channelId="alive"/>

Is there a way we can use this to set ball_status for each frame? It is currently just set to None.

@UnravelSports
Contributor

@DriesDeprest a completely different question. I just sourced some SciSports tracking data, but it's in JSON format. Do you happen to know which formats they provide?

@DriesDeprest
Contributor Author

Thanks @UnravelSports! All feedback has been addressed:

✅ 1. Timestamp Reset per Period

Already working correctly in shared EPTS reader - timestamps reset to
0:00:00 at each period start. Added test
test_timestamp_reset_per_period() to verify Period 2 starts at
timedelta(0).
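
The per-period reset can be illustrated without kloppy. The sketch below assumes raw timestamps are absolute seconds and that each period is described by absolute start and end values; `Period` and `to_period_timestamp` are hypothetical stand-ins, not kloppy's classes or the shared EPTS reader.

```python
from dataclasses import dataclass
from datetime import timedelta
from typing import List, Tuple


@dataclass(frozen=True)
class Period:
    """Stand-in period with absolute start/end in seconds (assumed unit)."""
    id: int
    start: float
    end: float


def to_period_timestamp(
    raw_seconds: float, periods: List[Period]
) -> Tuple[int, timedelta]:
    """Return (period id, time elapsed since that period started)."""
    for period in periods:
        if period.start <= raw_seconds <= period.end:
            return period.id, timedelta(seconds=raw_seconds - period.start)
    raise ValueError(f"Timestamp {raw_seconds} is outside all periods")


periods = [Period(1, 100.0, 2800.0), Period(2, 3700.0, 6500.0)]
second_half_kickoff = to_period_timestamp(3700.0, periods)
```

With this convention the first frame of every period maps to `timedelta(0)`, which is the property the new test checks for Period 2.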

✅ 2. Player Position Type Mapping

Implemented _map_player_type() in metadata.py:

  • "Goalkeeper" → PositionType.Goalkeeper
  • "Field player" → PositionType.Unknown

Added test confirming 2 goalkeepers correctly identified.
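
A minimal sketch of such a mapping is shown below. The enum is a local stand-in reduced to the two members mentioned above so the example runs on its own, and the case-insensitive comparison is an assumption; the actual `_map_player_type()` in metadata.py may differ.

```python
from enum import Enum
from typing import Optional


class PositionType(Enum):
    """Local stand-in for kloppy's PositionType (two members only)."""
    Goalkeeper = "Goalkeeper"
    Unknown = "Unknown"


def map_player_type(player_type: Optional[str]) -> PositionType:
    """Map the metadata player type string to a position type.

    Only "Goalkeeper" carries positional information; "Field player"
    (and anything else, including a missing value) becomes Unknown.
    """
    if player_type is not None and player_type.strip().lower() == "goalkeeper":
        return PositionType.Goalkeeper
    return PositionType.Unknown


positions = [map_player_type(t) for t in ["Goalkeeper", "Field player", None]]
```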

✅ 3. Ball Status from BallChannelRef

Implemented ball state parsing using "alive" channel:

  • ball_alive=1 → BallState.ALIVE
  • ball_alive=0 → BallState.DEAD

Added test confirming 435 alive + 66 dead = 501 total frames.
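
The mapping above can be sketched as follows. `BallState` here is a local stand-in enum so the example is self-contained, and returning `None` for missing or unexpected values is an assumption rather than the behaviour confirmed in this PR.

```python
from collections import Counter
from enum import Enum
from typing import Optional


class BallState(Enum):
    """Local stand-in for kloppy's BallState."""
    ALIVE = "alive"
    DEAD = "dead"


def map_ball_state(alive_value: Optional[str]) -> Optional[BallState]:
    """Translate the raw "alive" channel value into a ball state."""
    if alive_value is None or alive_value.strip() == "":
        return None  # channel missing in this frame
    try:
        flag = float(alive_value)
    except ValueError:
        return None  # non-numeric value: leave the state unset
    if flag == 1:
        return BallState.ALIVE
    if flag == 0:
        return BallState.DEAD
    return None  # unexpected value: leave the state unset


# Tally states over a few made-up raw values.
counts = Counter(map_ball_state(v) for v in ["1", "1", "0", "", "1"])
```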

🏗️ Architecture Improvement

Moved common EPTS functions to epts_common.py for better code reuse
between Metrica and SciSports.

📋 Different Data Formats

I have no knowledge of the SciSports JSON format. The current implementation handles
EPTS XML. If someone needs another data format, I would suggest they generalize the loader so it supports multiple formats.

🧪 Tests: 18/18 EPTS tests passed

All changes include comprehensive test verification.

@UnravelSports UnravelSports added this to the 3.19.0 milestone Oct 22, 2025