feat: Support customizing S3 endpoints #1913
Conversation
I took a quick look -- pooling the STS clients seems like a reasonable thing to do, and this approach to doing so looks pretty correct to me. However, it's not totally clear to me why this is needed to implement #32. Looking at #389, this part of the code wasn't touched at all. Could we instead start more top-down with S3-compatible storage support and figure out whether these changes are actually necessary?
Why are we copying such generic code from another project?
Because it is convenient as opposed to writing it from scratch.
I'm curious what will happen if anyone changes this class later. Should we keep CODE_COPIED_TO_POLARIS forever?
To reduce the maintenance burden, I'd suggest rewriting it to avoid copying from another project.
OSS is meant to be reused. Rewriting to avoid attribution is against the grain of OSS concepts.
Re: forever: I'd guess it's very likely, as I do not personally see a reason for rewriting this code in any substantial way in the foreseeable future.
@eric-maynard @flyrain We have a ton of code taken from other projects already in the code base. Most of that was already part of the initial contribution to Apache. A lot of that was also "generic" and literally copies of rather trivial code. So I seriously do not understand what you're trying to say here. The goal should really not be to reinvent "all the wheels".
What’s an example of similarly generic code that’s been copied?
It's always good practice to reduce code dependencies. In this case, we have to add additional information to the LICENSE. If rewriting isn't a big task, which I believe it isn't, we might avoid extra maintenance burden. That's also just a suggestion; I'm NOT gonna block this PR over it.
License checks for CODE_COPIED_TO_POLARIS marks are automated - there's minimal cognitive effort involved in maintaining these license references (plus they are fixed and do not need to change over time).
There is, as mentioned, a lot of other projects' code in Polaris (OpenAPI generator files, the web site, code copied from Iceberg, build scripts, etc.).
There is no ongoing maintenance burden, but there is a huge plus in taking code that's been tested and known to work in production for years. Newly developed code is way more work.
Is this really related to S3-compatible storage support?
Stats are not directly related to supporting S3-compatible storage. However, the STS client pool is. Given that we have a pool in the system, it is good to expose metrics about it for observability.
If we are using the AWS SDK, it has metrics built in: https://docs.aws.amazon.com/sdk-for-java/latest/developer-guide/metrics-list.html. Is the current mechanism more comprehensive than that?
These stats are about the cache of STS clients we keep in Polaris code.
I see, we are just plotting a gauge for the max clients. Wouldn't it be better to have stats for the HTTP connection pool? For example, say I have 2 entries in the cache and one entry hogs 49 of my connections while the other holds only 1; I would want to know which one hogs the 49 connections.
Stats for the HTTP client would be nice to have for sure, but let's defer it to another PR. I do not have that code handy ATM and I would not want to bloat the scope of this PR even more :)
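For context, here's a minimal sketch of how such a pool gauge can be wired up with Micrometer and Caffeine; the metric names are illustrative, not the PR's actual names:

```java
import com.github.benmanes.caffeine.cache.Cache;
import io.micrometer.core.instrument.Gauge;
import io.micrometer.core.instrument.MeterRegistry;

class StsPoolMetrics {
  static void register(MeterRegistry registry, Cache<?, ?> stsClients, int maxSize) {
    // Current number of cached STS clients (an estimate, by Caffeine's design).
    Gauge.builder("polaris.sts.clients.count", stsClients, Cache::estimatedSize)
        .register(registry);
    // Configured upper bound of the pool, exposed as a constant gauge.
    Gauge.builder("polaris.sts.clients.max", () -> maxSize).register(registry);
  }
}
```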
This is the only usage of SdkHttpClient outside of QuarkusProducers itself, I wonder if we can simplify things by just injecting the StsClientBuilder
StsClientsPool is injected where it is needed. SdkHttpClient is injected into StsClientsPool. Are you suggesting to hide the SdkHttpClient from CDI? If we did that, it would not be possible to close the singleton SdkHttpClient using CDI callbacks.
If the suggestion is to make SdkHttpClient an inner (owned) field of StsClientsPool and close it together with the pool, IMHO that introduces unnecessary coupling, because the pool only needs the interface (namely SdkHttpClient), while QuarkusProducers chooses a specific implementation of that interface.
Yes, that is the suggestion. They are always coupled together, so encapsulating the client inside the pool simplifies usage of the pool.
Currently the pool and the HTTP client are not coupled. The pool needs a client. Yes, the client can be reused by other parts of Polaris (if it becomes necessary in the future).
The pool is easily injectable in the current state of the PR. Use sites need not be aware of how the client is constructed.
I'd like to keep the current approach of having distinct CDI beans for the HTTP client and the STS client pool.
Also, as @singhpk234 commented in another thread, we're going to support multiple HTTP client implementations (later), selectable at runtime (by config).
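A minimal sketch of this shape, assuming a Quarkus-style CDI producer (the class name `HttpClientProducer` is hypothetical; in the PR the producer lives in QuarkusProducers):

```java
import jakarta.enterprise.inject.Disposes;
import jakarta.enterprise.inject.Produces;
import jakarta.inject.Singleton;
import software.amazon.awssdk.http.SdkHttpClient;
import software.amazon.awssdk.http.apache.ApacheHttpClient;

class HttpClientProducer {
  // One shared "heavy" HTTP client bean; the pooled STS clients stay thin.
  @Produces
  @Singleton
  SdkHttpClient sdkHttpClient() {
    return ApacheHttpClient.builder().build();
  }

  // CDI invokes this disposer at container shutdown, which is exactly why the
  // client is exposed as a distinct bean instead of being hidden in the pool.
  void closeHttpClient(@Disposes SdkHttpClient client) {
    client.close();
  }
}
```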
singhpk234 left a comment:
My comment doesn't show up when linking here.
I can do that, but the diff is going to be larger ;)
@singhpk234: Sorry, I do not see your comments (even via the above links). Could you re-submit them?
Sure, I resubmitted!
I think I messed up last time when submitting, thank you for your patience ~
If the HTTP connection pool is shared, shouldn't we restrict the max connections per route? Let's say my pool size is 50 and route 1 is hit with 50 concurrent requests; the pool now only holds connections for route 1, and if route 2 comes along it starves. I am not sure how to do it here; HttpClient does this via PoolingConnectionManager.
Is the AWS SDK not able to deal with this situation on its side?
Did you observe this kind of starvation in practice?
This is practically possible. I think since we don't specify it, the HTTP client's default kicks in. TBH I haven't hit it in practice, since AWS EMR Spark jobs generally have one endpoint to connect to, but with a long-running service like Polaris I think the possibility increases. WDYT?
@snazy WDYT?
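For reference, a sketch of the knob that does exist on the SDK's Apache-based client: the total pool size can be capped via `maxConnections`, but this builder does not expose a per-route limit (the value 50 is illustrative):

```java
import software.amazon.awssdk.http.SdkHttpClient;
import software.amazon.awssdk.http.apache.ApacheHttpClient;

class SharedHttpClientFactory {
  static SdkHttpClient create() {
    // Caps the shared pool at 50 connections in total; whether one route can
    // hog all of them is exactly the starvation concern raised above.
    return ApacheHttpClient.builder().maxConnections(50).build();
  }
}
```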
Why only ApacheHttpClient and not the URLConnection client?
I can make it configurable at runtime if you prefer. Let's do that as a follow-up PR, though.
ApacheHttpClient is more capable in general, AFAIK.
Iceberg supports both, hence this is coming from there. It's good to have an alternative, especially considering how they behave. For example, the Apache HTTP connection will always read the entire stream on close, which results in much more data being read than needed. The URL connection client does not behave this way. For more details: apache/iceberg#7262
Certainly. I do not mind supporting both, but let's do that in a follow-up PR to avoid bloating the scope of this PR (the main idea in this PR is not the ApacheHttpClient client, but supporting multiple endpoints).
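A sketch of what that follow-up could look like, picking the client implementation from a config value (the client types come from the AWS SDK; the config key and method are hypothetical):

```java
import software.amazon.awssdk.http.SdkHttpClient;
import software.amazon.awssdk.http.apache.ApacheHttpClient;
import software.amazon.awssdk.http.urlconnection.UrlConnectionHttpClient;

class HttpClientSelector {
  // "type" would come from runtime config, e.g. a hypothetical
  // polaris.storage.http-client.type property.
  static SdkHttpClient forType(String type) {
    return switch (type) {
      case "url-connection" -> UrlConnectionHttpClient.builder().build();
      default -> ApacheHttpClient.builder().build();
    };
  }
}
```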
Marking this PR as "draft" to add end-to-end support for MinIO. Please feel free to comment anyway.

Prerequisite PR: #1955
Force-pushed 8363e38 to de9c323
Thanks for doing this. I think it's the right direction. We could also make the CLI support the new fields, which could be done as a follow-up.

This is not the final PR for non-AWS S3 :) We should certainly add CLI support, but probably in a follow-up PR.
No functional change. Introduce a dedicated interface for `StsClient` suppliers and implement it using a pool of cached clients. All clients are "thin" and share the same `SdkHttpClient`. The latter is closed when the server shuts down. This is a step towards supporting non-AWS S3 storage (apache#1530). For this reason the STS endpoint is present in the new interfaces, but is not used yet.
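A condensed, illustrative sketch of that shape (the real `StsClientsPool` also wires in config and metrics, as the diff excerpt below shows; region and credentials wiring is omitted):

```java
import com.github.benmanes.caffeine.cache.Caffeine;
import com.github.benmanes.caffeine.cache.LoadingCache;
import software.amazon.awssdk.http.SdkHttpClient;
import software.amazon.awssdk.services.sts.StsClient;

class StsClientsPoolSketch {
  // Key type modeled after the PR's StsDestination; the endpoint is carried
  // in the interface but not used yet, per the commit message above.
  record StsDestination(java.net.URI endpoint) {}

  private final LoadingCache<StsDestination, StsClient> clients;

  StsClientsPoolSketch(int maxSize, SdkHttpClient sdkHttpClient) {
    this.clients = Caffeine.newBuilder()
        .maximumSize(maxSize)
        // Each cached client is "thin": it reuses the single shared HTTP client.
        .build(dest -> StsClient.builder().httpClient(sdkHttpClient).build());
  }

  StsClient stsClient(StsDestination destination) {
    return clients.get(destination);
  }
}
```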
```java
private final Function<StsDestination, StsClient> clientBuilder;

public StsClientsPool(
    S3AccessConfig effectiveSts, SdkHttpClient sdkHttpClient, MeterRegistry meterRegistry) {
```
Can we pass effectiveSts as an integer instead of passing the whole S3AccessConfig object?
Certainly - updated.
* chore(deps): update dependency mypy to >=1.17, <=1.17.0 (apache#2114)
* Spark 3.5.6 and Iceberg 1.9.1 (apache#1960), plus cleanup
* Add `pathStyleAccess` to AwsStorageConfigInfo (apache#2012). This change allows configuring the "path-style" access mode in S3 clients (both in Polaris Servers and Iceberg REST Catalog API clients). It is applicable both to AWS storage and to non-AWS S3-compatible storage (apache#1530).
* Add TestFileIOFactory helper (apache#2105), adding a FileIOFactory.wrapExisting helper
* fix(deps): update dependency gradle.plugin.org.jetbrains.gradle.plugin.idea-ext:gradle-idea-ext to v1.2 (apache#2125)
* fix(deps): update dependency boto3 to v1.39.7 (apache#2124)
* Abstract polaris-runtime-service tests for all persistence implementations (apache#2106). The NoSQL persistence implementation has to run the Iceberg table & view catalog tests plus the Polaris-specific tests as well; reusing existing tests avoids a lot of code duplication. This change moves the actual tests to `Abstract*` classes and refactors the existing tests to extend those. The NoSQL persistence work extends the same `Abstract*` classes but runs with different Quarkus test profiles.
* Add IMPLICIT authentication support to the CLI (apache#2121). PRs apache#1925 and apache#1912 were merged around the same time; this PR connects the two changes and enables the CLI to accept the IMPLICIT authentication type. Since Hadoop federated catalogs rely purely on IMPLICIT authentication, the CLI parsing test has been updated to reflect the same.
* feat(helm): Add support for external authentication (apache#2104)
* fix(deps): update dependency org.apache.iceberg:iceberg-bom to v1.9.2 (apache#2126)
* fix(deps): update quarkus platform and group to v3.24.4 (apache#2128)
* fix(deps): update dependency boto3 to v1.39.8 (apache#2129)
* fix(deps): update dependency io.smallrye.config:smallrye-config-core to v3.13.3 (apache#2130)
* Add newIcebergCatalog helper (apache#2134). Creation of `IcebergCatalog` instances was quite redundant, as tests mostly use the same parameters; also removes an unused field in two other tests.
* Add server and client support for the new generic table `baseLocation` field (apache#2122)
* Use Makefile to simplify setup and commands (apache#2027), with targets for minikube state management, podman support, the Spark plugin build, and a version target; README.md updated for Makefile usage and its relation to the project.
* Package polaris client as python package (apache#2049); change owner to spark when copying files from local into the Dockerfile
* CI: Address failure from accessing GH API (apache#2132). CI sometimes fails with `Unable to process url: https://api.github.com/repos/apache/polaris/contributors?per_page=1000` in `generatePomFileForMavenPublication`, which fetches the list of contributors to be published in the "root" POM. Unauthorized GH API requests have an hourly(?) limit of 60 requests per source IP; authorized requests have a much higher rate limit, and a GitHub token is available in every CI run. This change adds the `Authorization` header for the failing GH API request to leverage the higher rate limit and let CI not fail (that often).
* fix(deps): update dependency com.nimbusds:nimbus-jose-jwt to v10.4 (apache#2139)
* fix(deps): update dependency com.diffplug.spotless:spotless-plugin-gradle to v7.2.0 (apache#2142)
* fix(deps): update dependency software.amazon.awssdk:bom to v2.32.4 (apache#2146)
* fix(deps): update dependency org.xerial.snappy:snappy-java to v1.1.10.8 (apache#2138)
* fix(deps): update dependency org.junit:junit-bom to v5.13.4 (apache#2147)
* fix(deps): update dependency boto3 to v1.39.9 (apache#2137)
* fix(deps): update dependency com.fasterxml.jackson:jackson-bom to v2.19.2 (apache#2136)
* Python client: add support for endpoint, sts-endpoint, path-style-access (apache#2127). This change adds support for endpoint, sts-endpoint, and path-style-access to the Polaris Python client. Amends apache#1913 and apache#2012.
* Remove PolarisEntityManager.getCredentialCache (apache#2133). `PolarisEntityManager` itself is not using the `StorageCredentialCache` but just hands it out via `getCredentialCache`; the only caller of `getCredentialCache` is `FileIOUtil.refreshAccessConfig`, which in turn is only called by `DefaultFileIOFactory` and `IcebergCatalog`. Note that a follow-up will likely be able to remove `PolarisEntityManager` usage completely from `IcebergCatalog`. Additional cleanups: use `StorageCredentialCache` injection in tests (invalidating all entries on test start) and remove the unused `UserSecretsManagerFactory` from `PolarisCallContextCatalogFactory`.
* chore(deps): update registry.access.redhat.com/ubi9/openjdk-21-runtime docker tag to v1.22-1.1752676419 (apache#2150)
* fix(deps): update dependency com.diffplug.spotless:spotless-plugin-gradle to v7.2.1 (apache#2152)
* fix(deps): update dependency boto3 to v1.39.10 (apache#2151)
* chore: fix class reference in the javadoc of TableLikeEntity (apache#2157)
* fix(deps): update dependency commons-codec:commons-codec to v1.19.0 (apache#2160)
* fix(deps): update dependency boto3 to v1.39.11 (apache#2159)
* Last merged commit 395459f

Co-authored-by: Mend Renovate <bot@renovateapp.com>
Co-authored-by: Yong Zheng <yongzheng0809@gmail.com>
Co-authored-by: Dmitri Bourlatchkov <dmitri.bourlatchkov@gmail.com>
Co-authored-by: Christopher Lambert <xn137@gmx.de>
Co-authored-by: Pooja Nilangekar <poojan@umd.edu>
Co-authored-by: Alexandre Dutra <adutra@apache.org>
Co-authored-by: Yun Zou <yunzou.colostate@gmail.com>
This change enables using non-AWS S3 implementations (#1530) when `SKIP_CREDENTIAL_SUBSCOPING_INDIRECTION` is set to `false`.

* Adds `endpoint` to AWS `AwsStorageConfigInfo` (API change)
* Pooled STS clients share the same `SdkHttpClient`. The latter is closed when the server shuts down.

Existing catalogs are not affected by this change.
Dev ML discussion: https://lists.apache.org/thread/72psyjb40fb9l73sxld2qcr69l4tf4cw
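For illustration, this is roughly how a custom endpoint reaches the SDK clients under the hood via `endpointOverride` (the URL is a placeholder, e.g. for a MinIO deployment, not a value from the PR):

```java
import java.net.URI;
import software.amazon.awssdk.regions.Region;
import software.amazon.awssdk.services.sts.StsClient;

class CustomEndpointExample {
  public static void main(String[] args) {
    // endpointOverride is the generic AWS SDK mechanism that customizable
    // S3/STS endpoint support builds on; the URI below is purely illustrative.
    try (StsClient sts = StsClient.builder()
        .region(Region.US_EAST_1)
        .endpointOverride(URI.create("http://s3.example.internal:9000"))
        .build()) {
      System.out.println(sts.serviceName());
    }
  }
}
```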