@copybara-service copybara-service bot commented Dec 15, 2025

Use the same data source protocol for Dataset and DataLoader.

We used to support `SupportsIndex` so that a NumPy `int64` could be passed as an index into the source. That feature is not used in any example or use case we have visibility into. It also requires an extra indirection, converting the index to `int` by calling `__index__()`, and it forced us to maintain two separate protocols for `DataLoader` and `Dataset`. This change reduces API confusion and simplifies internal source handling.
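To illustrate the indirection described above, here is a minimal sketch that uses only the standard library. `FakeInt64` and both `get_record_*` helpers are hypothetical names made up for this example; they are not part of Grain, and `FakeInt64` merely stands in for an integer-like scalar such as a NumPy `int64`.

```python
import operator
from typing import SupportsIndex


class FakeInt64:
    """Hypothetical stand-in for an integer-like scalar (e.g. a NumPy int64)."""

    def __init__(self, value: int) -> None:
        self._value = value

    def __index__(self) -> int:
        # Called by operator.index() to obtain a plain Python int.
        return self._value


def get_record_supports_index(records: list[str], index: SupportsIndex) -> str:
    # Old style (assumed shape): accept any integer-like object and
    # normalize it to int first, which is the extra indirection.
    return records[operator.index(index)]


def get_record_int(records: list[str], index: int) -> str:
    # New style (assumed shape): the index is annotated as a plain int,
    # so no conversion step is needed.
    return records[index]


records = ["a", "b", "c"]
first = get_record_supports_index(records, FakeInt64(1))
second = get_record_int(records, 1)
```

With an `int`-only signature, a caller holding an integer-like scalar would convert it explicitly (for example with `int(...)`) before indexing.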

This is a breaking change, but it should affect only the small number of use cases that both enable full type checking and define a custom source by inheriting from `grain.RandomAccessDataSource`. Note that `grain.RandomAccessDataSource` is a protocol, so inheritance is not necessary. An `isinstance` check against a protocol does not check argument names or types, so old sources that implement the protocol without inheriting from it will continue to work.
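The point about `isinstance` can be sketched with a standalone protocol. `RandomAccessSource` and `LegacySource` below are hypothetical and are not Grain's actual definitions; the sketch only assumes a runtime-checkable protocol with `__len__` and `__getitem__`.

```python
from typing import Protocol, SupportsIndex, runtime_checkable


@runtime_checkable
class RandomAccessSource(Protocol):
    """Hypothetical protocol for illustration, not Grain's definition."""

    def __len__(self) -> int: ...

    def __getitem__(self, index: int): ...


class LegacySource:
    """Matches the protocol structurally, without inheriting from it.

    The parameter name and annotation deliberately differ from the protocol.
    """

    def __init__(self, records):
        self._records = list(records)

    def __len__(self) -> int:
        return len(self._records)

    def __getitem__(self, record_key: SupportsIndex):
        return self._records[record_key]


source = LegacySource(["a", "b"])
# The runtime check only verifies that the required methods exist;
# it does not compare argument names or type annotations.
is_source = isinstance(source, RandomAccessSource)
```

A static type checker is stricter than this runtime check, which is why the breakage is limited to code that is fully type checked.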

@copybara-service copybara-service bot force-pushed the test_844877160 branch 13 times, most recently from 3e1fd89 to e417a60 Compare December 22, 2025 18:57
@copybara-service copybara-service bot force-pushed the test_844877160 branch 2 times, most recently from 1b3593a to 7e0a565 Compare January 12, 2026 16:20

PiperOrigin-RevId: 844877160
@copybara-service copybara-service bot changed the title Use the same source protocol for Dataset and DataLoader. Use the same data source protocol for Dataset and DataLoader. Jan 12, 2026