Merge branch 'apache:main' into MDB_STABLE #5
Merged
ostinru merged 8 commits into MDB_STABLE on Feb 24, 2026
This commit implements comprehensive changes to align the project with Apache Software Foundation incubation requirements and complete the migration from Greenplum to Apache Cloudberry branding.

- Add DISCLAIMER file as required for Apache incubating projects
- Update LICENSE file with comprehensive list of 245 files containing Apache License headers, organized by module (FDW, External Table, Server, Documentation, CI/Test templates)
- Add Apache License headers to GitHub workflow files
- Update CONTRIBUTING.md with Apache project contribution guidelines
- Update README.md with Apache Cloudberry branding and simplified content
- Update documentation templates in docs/content/ to use Cloudberry
- Update automation and testing documentation
- Migrate server scripts and Java components references
- Update CI/CD workflows with proper Apache licensing
- Clean up legacy CI documentation (remove ci/README.md)
- Update build system references in Makefile
- Enhance installation scripts to support both Cloudberry 2.0 and 2.1+
- Add transition guide for Cloudberry migration
- Update all user-facing documentation with correct branding
- Simplify README.md focusing on essential information
- Update book configuration for documentation generation

This change ensures full compliance with Apache incubation requirements while completing the transition to Apache Cloudberry ecosystem.
The main changes are:
* Rename package from pxf-gpX to cloudberry-pxf
* Update installation paths to /usr/local/cloudberry-pxf-[VERSION]
* Remove legacy pxf-gp7.spec and pxf-cbdb1.spec files
* Update DEBIAN package control files with new naming
* Standardize package configuration based on cloudberry-pxf.spec
### hbase-client update
Update `hbase-client` from `1.3.2` to `2.3.7` (the latest version that requires only minimal third-party library updates), with the following decisions:
* Use the Hadoop 2 version. Although the automation tests run on Hadoop 3, all other connectors still use Hadoop 2 libraries.
* Use the shaded version of a library when it relocates Java packages to new namespaces.
* Use the non-shaded version of a library when the shaded one merely bundles several jars into a single fat jar. Fat jars hide dependencies from Gradle but still put their classes on the classpath, which can lead to unpredictable issues.

Fortunately, `hbase-client:2.3.7` depends on `hadoop-2.10.0`, which we are already using, so no changes are needed here.
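The shaded versus fat-jar distinction above can be checked mechanically. The following is a minimal sketch, not the procedure used in this PR: it lists class entries in a jar that sit outside an expected namespace prefix. The helper names (`unrelocated_classes`, `make_jar`) and the jar contents are hypothetical, and the `org/apache/hbase/thirdparty/` prefix is an assumption about where relocated classes live.

```python
import io
import zipfile


def unrelocated_classes(jar_bytes, own_prefix):
    """List class entries in a jar that live outside `own_prefix`.

    An empty result suggests the jar relocates everything it bundles.
    A non-empty result suggests a fat jar that puts third-party classes
    on the classpath under their original package names.
    """
    with zipfile.ZipFile(io.BytesIO(jar_bytes)) as jar:
        return sorted(
            name
            for name in jar.namelist()
            if name.endswith(".class") and not name.startswith(own_prefix)
        )


def make_jar(entries):
    """Build a tiny in-memory jar (a zip file) for demonstration only."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as jar:
        for name in entries:
            jar.writestr(name, b"")
    return buffer.getvalue()


# Hypothetical jar contents, not taken from real artifacts.
RELOCATED = make_jar(
    ["org/apache/hbase/thirdparty/com/google/protobuf/Message.class"]
)
FAT = make_jar(
    ["org/example/Client.class", "com/google/protobuf/Message.class"]
)

if __name__ == "__main__":
    print(unrelocated_classes(RELOCATED, "org/apache/hbase/thirdparty/"))
    print(unrelocated_classes(FAT, "org/example/"))
```

To inspect a real artifact, pass the bytes of the jar file instead of the in-memory samples.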
### Update details
```
runtimeClasspath - Runtime classpath of source set 'main'.
\--- org.apache.hbase:hbase-client:2.3.7
+--- org.apache.hbase.thirdparty:hbase-shaded-protobuf:3.3.0
+--- org.apache.hbase:hbase-common:2.3.7
| +--- org.apache.hbase:hbase-logging:2.3.7
| +--- org.apache.hbase.thirdparty:hbase-shaded-miscellaneous:3.3.0
| | \--- com.google.errorprone:error_prone_annotations:2.3.4
| +--- org.apache.hbase.thirdparty:hbase-shaded-gson:3.3.0
| +--- org.apache.hbase.thirdparty:hbase-shaded-netty:3.3.0
| +--- commons-codec:commons-codec:1.13
| +--- org.apache.commons:commons-lang3:3.9
| +--- commons-io:commons-io:2.11.0
| +--- com.google.protobuf:protobuf-java:2.5.0
| +--- org.apache.htrace:htrace-core4:4.2.0-incubating
| +--- org.apache.commons:commons-crypto:1.0.0
| +--- org.apache.yetus:audience-annotations:0.5.0
| \--- org.apache.hadoop:hadoop-common:2.10.0 -> 2.10.2
+--- org.apache.hbase:hbase-hadoop-compat:2.3.7
| +--- org.apache.hbase.thirdparty:hbase-shaded-miscellaneous:3.3.0 (*)
| +--- org.apache.hbase:hbase-metrics-api:2.3.7
| | +--- org.apache.hbase:hbase-common:2.3.7 (*)
| | +--- org.apache.commons:commons-lang3:3.9
| | +--- org.apache.hbase.thirdparty:hbase-shaded-miscellaneous:3.3.0 (*)
+--- org.apache.hbase:hbase-hadoop2-compat:2.3.7
| +--- org.apache.hbase:hbase-hadoop-compat:2.3.7 (*)
| +--- org.apache.hbase:hbase-common:2.3.7 (*)
| +--- org.apache.hbase:hbase-metrics:2.3.7
| | +--- org.apache.hbase.thirdparty:hbase-shaded-miscellaneous:3.3.0 (*)
| | +--- org.apache.hbase:hbase-common:2.3.7 (*)
| | +--- org.apache.hbase:hbase-metrics-api:2.3.7 (*)
| | +--- io.dropwizard.metrics:metrics-core:3.2.6
| +--- org.apache.hbase:hbase-metrics-api:2.3.7 (*)
| +--- org.apache.hadoop:hadoop-common:2.10.0 -> 2.10.2 (*)
| +--- javax.activation:javax.activation-api:1.2.0
| +--- org.apache.commons:commons-lang3:3.9
| +--- org.apache.hbase.thirdparty:hbase-shaded-miscellaneous:3.3.0 (*)
+--- org.apache.hbase:hbase-protocol-shaded:2.3.7
| +--- org.apache.hbase.thirdparty:hbase-shaded-protobuf:3.3.0
+--- org.apache.hbase:hbase-protocol:2.3.7
| +--- com.google.protobuf:protobuf-java:2.5.0
+--- commons-codec:commons-codec:1.13
+--- commons-io:commons-io:2.11.0
+--- org.apache.commons:commons-lang3:3.9
+--- org.slf4j:slf4j-api:1.7.30 -> 1.7.36
+--- org.apache.hbase.thirdparty:hbase-shaded-miscellaneous:3.3.0 (*)
+--- com.google.protobuf:protobuf-java:2.5.0
+--- org.apache.hbase.thirdparty:hbase-shaded-netty:3.3.0
+--- org.apache.zookeeper:zookeeper:3.5.7 (*)
+--- org.apache.htrace:htrace-core4:4.2.0-incubating
+--- org.jruby.jcodings:jcodings:1.0.18
+--- org.jruby.joni:joni:2.1.11
| \--- org.jruby.jcodings:jcodings:1.0.13 -> 1.0.18
+--- io.dropwizard.metrics:metrics-core:3.2.6 (*)
+--- org.apache.commons:commons-crypto:1.0.0
+--- org.apache.yetus:audience-annotations:0.5.0
+--- org.apache.hadoop:hadoop-auth:2.10.0 -> 2.10.2 (*)
\--- org.apache.hadoop:hadoop-common:2.10.0 -> 2.10.2 (*)
(*) - dependencies omitted (listed previously)
```
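Several entries in the tree above carry a `->` marker, which Gradle prints when the requested version was replaced by a different resolved version (for example `hadoop-common:2.10.0 -> 2.10.2`). As a rough aid for reviewing such trees, the sketch below extracts those overrides from the text output. The function name `version_overrides` and the regular expression are illustrative assumptions, and the sample is a shortened excerpt of the tree above.

```python
import re

# Matches a Gradle dependency coordinate such as
# "org.slf4j:slf4j-api:1.7.30 -> 1.7.36"; the "-> resolved" part is optional.
DEP_RE = re.compile(
    r"(?P<group>[\w.\-]+):(?P<name>[\w.\-]+):(?P<requested>[\w.\-]+)"
    r"(?:\s+->\s+(?P<resolved>[\w.\-]+))?"
)


def version_overrides(tree_text):
    """Return {"group:name": (requested, resolved)} for every '->' line.

    If a coordinate appears more than once, the last occurrence wins.
    """
    overrides = {}
    for line in tree_text.splitlines():
        match = DEP_RE.search(line)
        if match and match.group("resolved"):
            key = f"{match.group('group')}:{match.group('name')}"
            overrides[key] = (match.group("requested"), match.group("resolved"))
    return overrides


# Shortened excerpt of the dependency tree shown above.
SAMPLE = """\
+--- org.slf4j:slf4j-api:1.7.30 -> 1.7.36
+--- org.jruby.joni:joni:2.1.11
|    \\--- org.jruby.jcodings:jcodings:1.0.13 -> 1.0.18
\\--- org.apache.hadoop:hadoop-common:2.10.0 -> 2.10.2 (*)
"""

if __name__ == "__main__":
    for coordinate, (requested, resolved) in sorted(version_overrides(SAMPLE).items()):
        print(f"{coordinate}: {requested} -> {resolved}")
```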
### Other changes
* Automation tests: upgrade hbase to 2.3.7
Update the name format from `cloudberry-pxf-*` to `apache-cloudberry-pxf-incubating` for the deb/rpm files.
### Useful changes in Parquet 1.12.x -> 1.15.x
* Support for the LZ4_RAW codec
* Vectored IO in the Parquet file format
* More efficient memory usage in compression codecs

Dependency tree changes are small:
```
+--- org.apache.parquet:parquet-column:1.15.1
| +--- org.apache.parquet:parquet-common:1.15.1
| | +--- org.apache.parquet:parquet-format-structures:1.15.1
| +--- org.apache.parquet:parquet-encoding:1.15.1
| | +--- org.apache.parquet:parquet-common:1.15.1 (*)
+--- org.apache.parquet:parquet-hadoop:1.15.1
| +--- org.apache.parquet:parquet-column:1.15.1 (*)
| +--- org.apache.parquet:parquet-format-structures:1.15.1 (*)
| +--- org.apache.parquet:parquet-common:1.15.1 (*)
| +--- org.xerial.snappy:snappy-java:1.1.10.7
| +--- io.airlift:aircompressor:2.0.2
| +--- commons-pool:commons-pool:1.6
| +--- com.github.luben:zstd-jni:1.5.6-6
+--- org.apache.parquet:parquet-jackson:1.15.1
+--- org.apache.parquet:parquet-generator:1.15.1
+--- org.apache.parquet:parquet-pig:1.15.1
| +--- org.apache.parquet:parquet-column:1.15.1 (*)
| +--- org.apache.parquet:parquet-hadoop:1.15.1 (*)
| +--- org.apache.parquet:parquet-common:1.15.1 (*)
\--- org.apache.parquet:parquet-format:2.10.0
```
Parquet ships its own shaded Thrift library and does not depend on protobuf. `parquet-hadoop` expects `hadoop-client`, `hadoop-common`, `hadoop-annotations` and `hadoop-mapreduce-client-core` to be provided.
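Because `parquet-hadoop` relies on those Hadoop artifacts being provided, it can be useful to confirm they are present among the runtime jars. The sketch below is only an illustration under assumptions: `missing_provided` is a hypothetical helper, the jar file names are invented for the example, and matching on the `artifact-<version>.jar` naming pattern is a heuristic.

```python
# Artifacts that, per the note above, parquet-hadoop expects to be provided.
EXPECTED_PROVIDED = (
    "hadoop-client",
    "hadoop-common",
    "hadoop-annotations",
    "hadoop-mapreduce-client-core",
)


def missing_provided(jar_names, expected=EXPECTED_PROVIDED):
    """Return the expected artifacts with no matching '<artifact>-<version>.jar'.

    A jar matches when its name starts with '<artifact>-' followed by a digit,
    so 'hadoop-client' does not match 'hadoop-client-extras-1.0.jar'.
    """
    missing = []
    for artifact in expected:
        prefix = artifact + "-"
        found = any(
            name.startswith(prefix) and name[len(prefix):len(prefix) + 1].isdigit()
            for name in jar_names
        )
        if not found:
            missing.append(artifact)
    return missing


if __name__ == "__main__":
    # Hypothetical classpath listing, for illustration only.
    jars = ["hadoop-common-2.10.2.jar", "hadoop-client-2.10.2.jar"]
    print(missing_provided(jars))
```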