Use the same data source protocol for Dataset and DataLoader. #1163
We used to support `SupportsIndex` to allow passing numpy int64 as an index into the source. This feature is not used in any examples or use cases that we have visibility into. At the same time, it requires an indirection of first converting to `int` by calling `__index__()`. It also required us to maintain two separate protocols for `DataLoader` and `Dataset`. This change allows us to reduce API confusion and simplify internal source handling.

This is a breaking change, but it should only affect a small number of use cases that fully enable type checking and define a custom source by inheriting from `grain.RandomAccessDataSource`. Note that it is a protocol, so inheritance is not necessary. An `isinstance` check on a protocol does not check argument names or types, so old sources that just implement the protocol without inheritance will continue to work.
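To illustrate the compatibility claim, here is a minimal sketch using only the standard library. `RandomAccessSource` is a hypothetical stand-in for a protocol with an `int`-typed `__getitem__`; it is not the actual grain definition, and `LegacySource`, `NoLength`, and `Int64Like` are likewise made up for illustration. It shows that a runtime `isinstance` check against a protocol only looks for the presence of the required methods, so a source that was written against `SupportsIndex` and does not inherit from the protocol still passes.

```python
import operator
from typing import Any, Protocol, SupportsIndex, runtime_checkable


@runtime_checkable
class RandomAccessSource(Protocol):
    """Hypothetical stand-in protocol: int-only indexing (not grain's code)."""

    def __len__(self) -> int: ...

    def __getitem__(self, index: int) -> Any: ...


class LegacySource:
    """Old-style source: no inheritance, different argument name and type."""

    def __init__(self, items):
        self._items = list(items)

    def __len__(self):
        return len(self._items)

    def __getitem__(self, record_key: SupportsIndex):
        # The indirection mentioned above: convert to int via __index__().
        return self._items[operator.index(record_key)]


class NoLength:
    """Lacks __len__, so it does not satisfy the protocol at runtime."""

    def __getitem__(self, index: int):
        return None


class Int64Like:
    """Stand-in for an integer-like scalar such as a numpy int64."""

    def __init__(self, value):
        self._value = value

    def __index__(self):
        return self._value


legacy = LegacySource(["a", "b", "c"])

# Only method presence is checked; argument names and types are ignored.
is_source = isinstance(legacy, RandomAccessSource)
missing_len = isinstance(NoLength(), RandomAccessSource)

# With an int-only protocol, callers holding an integer-like key would
# convert it themselves before indexing.
key = operator.index(Int64Like(2))
record = legacy[key]
```

Under these assumptions, only static type checkers would flag a subclass whose `__getitem__` signature no longer matches the protocol; the runtime check is unaffected.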