@copybara-service

Allow for passing pipelines to xbeam.Dataset constructors.

Associating a `beam.Pipeline` with an `xbeam.Dataset` means that a pipeline doesn't need to be applied later (e.g., to the result of `to_zarr`). This is a little cleaner and potentially a significant optimization, because Beam then understands that it can reuse the output of a PTransform rather than recomputing it.

This includes a new `_LazyPCollection` class to ensure that our optimizations for transforms applied directly after `xbeam.DatasetToChunks` still work.
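
As a rough illustration of why a lazy wrapper helps here, the following is a toy model in plain Python, not the actual `_LazyPCollection` implementation and not Beam code. `ToyPipeline`, `LazyCollection`, `replace_root`, and `resolve` are hypothetical names invented for this sketch. It assumes the idea is to hold the pipeline together with a not-yet-applied root transform, so that a directly following step can still swap in a cheaper root before anything is applied, and so that the result is applied once and then reused.

```python
from typing import Callable, List, Optional

# All names below are hypothetical stand-ins, not xarray-beam or Beam APIs.


class ToyPipeline:
    """Minimal stand-in for a pipeline: records which transforms were applied."""

    def __init__(self) -> None:
        self.applied: List[str] = []

    def apply(self, name: str, fn: Callable[[], list]) -> list:
        self.applied.append(name)
        return fn()


class LazyCollection:
    """Holds a pipeline and a root transform, applying it only on first use."""

    def __init__(self, pipeline: ToyPipeline, name: str, fn: Callable[[], list]) -> None:
        self.pipeline = pipeline
        self.name = name
        self.fn = fn
        self._result: Optional[list] = None

    def replace_root(self, name: str, fn: Callable[[], list]) -> "LazyCollection":
        """Swap in a different root transform; only valid before resolution."""
        if self._result is not None:
            raise RuntimeError("root transform was already applied")
        return LazyCollection(self.pipeline, name, fn)

    def resolve(self) -> list:
        """Apply the root transform once and cache the result for reuse."""
        if self._result is None:
            self._result = self.pipeline.apply(self.name, self.fn)
        return self._result


pipeline = ToyPipeline()
chunks = LazyCollection(pipeline, "read_small_chunks", lambda: [[1], [2], [3], [4]])
# A directly following "rechunk" step is folded into the read itself,
# which is only possible because nothing has been applied yet.
chunks = chunks.replace_root("read_big_chunks", lambda: [[1, 2], [3, 4]])
first = chunks.resolve()
second = chunks.resolve()  # reuses the cached result, no second apply
```

In this toy model only the replacement root is ever applied to the pipeline, and both consumers share one result. Real Beam pipelines build a deferred graph rather than computing eagerly, so this mirrors only the bookkeeping, not the execution model.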

@copybara-service copybara-service bot force-pushed the test_824054995 branch 2 times, most recently from 43aabc9 to e481ce5 Compare October 26, 2025 23:48
PiperOrigin-RevId: 824281688
@copybara-service copybara-service bot merged commit 7db84e4 into main Oct 27, 2025
@copybara-service copybara-service bot deleted the test_824054995 branch October 27, 2025 00:01