Core, Data: File Format API interfaces #12774
Open: pvary wants to merge 50 commits into apache:main from pvary:file_format_api_only
+1,306
−0
50 commits
b33aec4 Core, Data: File Format API interfaces (pvary)
5b79dbd Renjie's comments (pvary)
a1daced registerObjectModel exception handling fix (pvary)
79d7703 Removing the need for data.AppenderBuilder (pvary)
8404aa7 Fixed Renjie findings (pvary)
a00540d Cosmetic changes (pvary)
2a4816a Steven's comments (pvary)
ba16b59 ObjectModelRegistry->FileAccessFactory, and AppenderBuilder->WriteBu… (pvary)
9976bfb Review comments by Steven and Russell (some javadoc, and a few method… (pvary)
62ea041 Added generic for the input/output of the reader/writer - like 'FileA… (pvary)
22ca3d7 New classes for implementation allows for absolutely new interfaces (pvary)
598f46d Default methods for setting multiple config/meta values (pvary)
5ce49dc Return FileReader instead of CloseableIterable from the ReadBuilder (pvary)
46a1742 Revert "Return FileReader instead of CloseableIterable from the ReadB… (pvary)
026e5e9 Ryan's comments (pvary)
3149874 Move interface classes to core (pvary)
4659adf Rename FileAccessFactory to ObjectModelFactory (pvary)
40ec8bb Ryan's next round of comments (pvary)
f20bb4e Separate input conversion from writers (pvary)
7e91a40 Eduard's comments (pvary)
a957c47 Fixing parameter names (pvary)
8ce2f2e Ryan's comments (pvary)
2b1b10b Remove builder parameter from data file writers (pvary)
26e03b7 Remove parametrization as much as possible (pvary)
ef41daa ContentFileWriteBuilder needs a generic parameter (pvary)
acb2254 Revert transformers, and used engine specific type setting for writer… (pvary)
3efd188 Steven's and Russell's comments (pvary)
e5611f4 Move to writeBuilder/positionDeleteWriteBuilder solution, and depreca… (pvary)
3a6a5ed Use a specific FormatModel to write PositionDeletes (pvary)
fad5e07 Changes discussed on the sync (pvary)
790e282 Steven's comments (pvary)
224070e Remove the FormatModelRegistry.writeBuilder method (pvary)
0c439c1 Eduard's comment (pvary)
ddcf866 Addressing Amogh's, Steven's and Gabor's comments (pvary)
2fa0c5f Removing ReadBuilder.outputSchema per Steven's and Renjie's comments (pvary)
8f58a9e Aihua's comments (pvary)
b22a91a Move back to parametrized writers (pvary)
8085a4b Encryption handling (pvary)
af4f8a3 Revert back to OutputFile in the FormatModel (pvary)
c17c9ca Synchronized register method, and only provide writeBuilder, location… (pvary)
7c175ea constantValues -> idToConstant (pvary)
7ed56a5 Remove new from the API parameter names (pvary)
69461a2 Move to EncryptedOutputFile from OutputFile FormatModel.writeBuilder (pvary)
fdbb6c0 Ryan's comments (pvary)
4ae6ba0 Fix typo in comment (pvary)
fe743e1 Revert back to prevent re-registering models (pvary)
b66e891 Ryan's comments (pvary)
d450ec6 Steven's comment to change the log message for failed registration (pvary)
564ba8c Create a marker class for the Comet reader and a few extra nits (pvary)
17f030f Added back the ReadBuilder.outputSchema method as the attribute is us… (pvary)
122 changes: 122 additions & 0 deletions
core/src/main/java/org/apache/iceberg/formats/CommonWriteBuilder.java
@@ -0,0 +1,122 @@

```java
/*
 * Licensed to the Apache Software Foundation (ASF) under one
 * or more contributor license agreements.  See the NOTICE file
 * distributed with this work for additional information
 * regarding copyright ownership.  The ASF licenses this file
 * to you under the Apache License, Version 2.0 (the
 * "License"); you may not use this file except in compliance
 * with the License.  You may obtain a copy of the License at
 *
 *   http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing,
 * software distributed under the License is distributed on an
 * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
 * KIND, either express or implied.  See the License for the
 * specific language governing permissions and limitations
 * under the License.
 */
package org.apache.iceberg.formats;

import java.nio.ByteBuffer;
import java.util.Map;
import org.apache.iceberg.MetricsConfig;
import org.apache.iceberg.PartitionSpec;
import org.apache.iceberg.SortOrder;
import org.apache.iceberg.StructLike;
import org.apache.iceberg.deletes.EqualityDeleteWriter;
import org.apache.iceberg.deletes.PositionDeleteWriter;
import org.apache.iceberg.encryption.EncryptionKeyMetadata;
import org.apache.iceberg.io.DataWriter;

/**
 * A generic builder interface for creating specialized file writers in the Iceberg ecosystem.
 *
 * <p>This builder provides a unified configuration API for generating various types of content
 * writers:
 *
 * <ul>
 *   <li>{@link DataWriter} for creating data files with table records
 *   <li>{@link EqualityDeleteWriter} for creating files with equality-based delete records
 *   <li>{@link PositionDeleteWriter} for creating files with position-based delete records
 * </ul>
 *
 * <p>Each concrete implementation configures the underlying file format writer while adding
 * content-specific metadata and behaviors.
 *
 * @param <B> the concrete builder type for method chaining
 */
interface CommonWriteBuilder<B extends CommonWriteBuilder<B>> {

  /**
   * Set a writer configuration property which affects the writer behavior.
   *
   * @param property a writer config property name
   * @param value config value
   * @return this for method chaining
   */
  B set(String property, String value);

  /**
   * Adds the new properties to the writer configuration.
   *
   * @param properties a map of writer config properties
   * @return this for method chaining
   */
  default B setAll(Map<String, String> properties) {
    properties.forEach(this::set);
    return self();
  }

  /**
   * Set a file metadata property in the created file.
   *
   * @param property a file metadata property name
   * @param value config value
   * @return this for method chaining
   */
  B meta(String property, String value);

  /**
   * Add the new properties to file metadata for the created file.
   *
   * @param properties a map of file metadata properties
   * @return this for method chaining
   */
  default B meta(Map<String, String> properties) {
    properties.forEach(this::meta);
    return self();
  }

  /** Sets the metrics configuration used for collecting column metrics for the created file. */
  B metricsConfig(MetricsConfig metricsConfig);

  /** Overwrite the file if it already exists. By default, overwrite is disabled. */
  B overwrite();

  /**
   * Sets the encryption key used for writing the file. If the writer does not support encryption,
   * then an exception should be thrown.
   */
  B withFileEncryptionKey(ByteBuffer encryptionKey);

  /**
   * Sets the additional authentication data (AAD) prefix used for writing the file. If the writer
   * does not support encryption, then an exception should be thrown.
   */
  B withAADPrefix(ByteBuffer aadPrefix);

  /** Sets the partition specification for the Iceberg metadata. */
  B spec(PartitionSpec newSpec);

  /** Sets the partition value for the Iceberg metadata. */
  B partition(StructLike partition);

  /** Sets the encryption key metadata for Iceberg metadata. */
  B keyMetadata(EncryptionKeyMetadata keyMetadata);

  /** Sets the sort order for the Iceberg metadata. */
  B sortOrder(SortOrder sortOrder);

  B self();
}
```
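For readers unfamiliar with the self-typed builder pattern used by `CommonWriteBuilder<B extends CommonWriteBuilder<B>>`, the following is a minimal sketch of how a concrete builder might plug into it. It is not code from this PR: `SketchWriteBuilder`, `SketchDataWriteBuilder`, `schemaName`, and the property names are hypothetical stand-ins, and the interface is cut down to a few methods so that it compiles without Iceberg on the classpath. It also assumes that an `UnsupportedOperationException` is an acceptable way to satisfy the "an exception should be thrown" wording for writers without encryption support; the diff does not name a specific exception type.

```java
import java.nio.ByteBuffer;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical, cut-down stand-in for CommonWriteBuilder; not the Iceberg interface.
interface SketchWriteBuilder<B extends SketchWriteBuilder<B>> {
  B set(String property, String value);

  // The bulk variant delegates to the single-property method and returns the concrete type.
  default B setAll(Map<String, String> properties) {
    properties.forEach(this::set);
    return self();
  }

  B meta(String property, String value);

  default B meta(Map<String, String> properties) {
    properties.forEach(this::meta);
    return self();
  }

  B withFileEncryptionKey(ByteBuffer encryptionKey);

  // Lets default methods return the concrete builder without an unchecked cast.
  B self();
}

// Hypothetical concrete builder that adds one content-specific option (schemaName).
final class SketchDataWriteBuilder implements SketchWriteBuilder<SketchDataWriteBuilder> {
  final Map<String, String> config = new LinkedHashMap<>();
  final Map<String, String> metadata = new LinkedHashMap<>();
  String schemaName;

  @Override
  public SketchDataWriteBuilder set(String property, String value) {
    config.put(property, value);
    return this;
  }

  @Override
  public SketchDataWriteBuilder meta(String property, String value) {
    metadata.put(property, value);
    return this;
  }

  @Override
  public SketchDataWriteBuilder withFileEncryptionKey(ByteBuffer encryptionKey) {
    // Assumption: this sketch models a format without encryption support, so it fails fast.
    throw new UnsupportedOperationException("Encryption is not supported by this sketch");
  }

  // Only exists on the concrete type; still reachable after calling inherited default methods.
  public SketchDataWriteBuilder schemaName(String name) {
    this.schemaName = name;
    return this;
  }

  @Override
  public SketchDataWriteBuilder self() {
    return this;
  }
}

class BuilderSketchDemo {
  public static void main(String[] args) {
    SketchDataWriteBuilder builder =
        new SketchDataWriteBuilder()
            .setAll(Map.of("compression", "zstd")) // inherited default method
            .meta(Map.of("created-by", "sketch")) // inherited default method
            .schemaName("events"); // concrete-only method still chains
    System.out.println(builder.config + " " + builder.metadata + " " + builder.schemaName);
  }
}
```

Because `setAll` and `meta(Map)` return `B` rather than the interface type, the chain in `main` can continue with `schemaName`, which only the concrete class declares. That appears to be the reason the interface exposes `self()`.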
What API stability do we guarantee for the new read/write interfaces?
1 minor release