ursetta-netflix commented Jan 14, 2026

DO NOT MERGE

Executive Summary

This commit removes the metacat-connector-hive module entirely, eliminating the Hive Metastore connector as a pluggable backend. The change required extracting reusable components (Iceberg handling, common view support, type converters) into shared modules while preserving Thrift protocol support for legacy compatibility.


Step-by-Step Cascading Effects

1. Module Removal

The metacat-connector-hive module was deleted, and its entry was removed from settings.gradle and all build configurations. This module contained:

  • Hive Metastore client implementations (embedded and Thrift-based)
  • Direct SQL partition/table operations
  • Iceberg table handling
  • Common view handlers
  • Type converters between Hive and Metacat types
  • 93 source files totaling ~16,500 lines

2. Extraction to metacat-common-server

The following components were extracted because they are used by the Polaris connector independently of Hive:

Iceberg Components (→ common.server.connector.iceberg):

  • IcebergTableHandler, IcebergTableWrapper, IcebergTableOps
  • IcebergTableCriteria, IcebergTableCriteriaImpl
  • IcebergTableOpWrapper, IcebergTableOpsProxy
  • IcebergMetastoreTables, IcebergRequestMetrics
  • DataMetadataMetrics, DataMetadataMetricConstants

Common View Support (→ common.server.connector.commonview):

  • CommonViewHandler

SQL Support (→ common.server.connector.sql):

  • DirectSqlTable (subset - only the interface/base needed by Polaris)

Converters (→ common.server.converter):

  • New IcebergTypeConverter - converts Iceberg types to Metacat types without Hive dependencies
  • New IcebergTableInfoConverter - converts Iceberg tables to TableInfo without Hive Thrift types
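To illustrate what a Hive-free type conversion can look like, here is a minimal sketch that maps Iceberg primitive type names, as they appear in table metadata JSON, to a small canonical enum. The names CanonicalType and IcebergTypeNameSketch are hypothetical and are not the classes added in this PR. The uuid-to-STRING and fixed-to-BINARY choices are assumptions, and nested types (struct, list, map) are out of scope.

```java
import java.util.Locale;

/** Hypothetical target types for illustration; the real Metacat type model is richer. */
enum CanonicalType { BOOLEAN, INT, BIGINT, FLOAT, DOUBLE, DECIMAL, DATE, TIME, TIMESTAMP, STRING, BINARY }

/** Hypothetical converter sketch; it has no dependency on Hive or Iceberg libraries. */
final class IcebergTypeNameSketch {
    private IcebergTypeNameSketch() { }

    /** Maps an Iceberg primitive type name to a canonical type, or throws if unsupported. */
    static CanonicalType convert(String icebergType) {
        String name = icebergType.trim().toLowerCase(Locale.ROOT);
        // Parameterized primitives carry their arguments in the name itself.
        if (name.startsWith("decimal(")) {
            return CanonicalType.DECIMAL;
        }
        if (name.startsWith("fixed[")) {
            return CanonicalType.BINARY; // assumption: fixed-length bytes map to BINARY
        }
        switch (name) {
            case "boolean": return CanonicalType.BOOLEAN;
            case "int": return CanonicalType.INT;
            case "long": return CanonicalType.BIGINT;
            case "float": return CanonicalType.FLOAT;
            case "double": return CanonicalType.DOUBLE;
            case "date": return CanonicalType.DATE;
            case "time": return CanonicalType.TIME;
            case "timestamp":
            case "timestamptz": return CanonicalType.TIMESTAMP;
            case "string":
            case "uuid": return CanonicalType.STRING; // assumption: uuid is surfaced as a string
            case "binary": return CanonicalType.BINARY;
            default:
                throw new IllegalArgumentException("Unsupported Iceberg type: " + icebergType);
        }
    }
}
```

For example, IcebergTypeNameSketch.convert("decimal(9,2)") yields CanonicalType.DECIMAL, while an unknown name fails fast instead of silently falling back to a default type.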

Utilities (→ common.server.util.hive):

  • HiveTableUtil, HiveConfigConstants, IcebergFilterGenerator
  • HiveMetrics, HiveConnectorFastServiceMetric

Build changes: Added hadoop-core, guava-retrying, and iceberg-spark-runtime dependencies to metacat-common-server.

3. Extraction to metacat-thrift

Thrift protocol support requires Hive type conversions. These converters were moved to metacat-thrift because Thrift is the only remaining consumer:

  • HiveConnectorInfoConverter - converts between Hive Thrift types (Database, Table, Partition) and Metacat models
  • HiveTypeConverter, HiveTypeMapping - Hive-to-Metacat type mapping
  • HiveTableStructUtil - new utility for Thrift struct operations

The Thrift module now has a direct dependency on Hive libraries solely for protocol compatibility.

4. Polaris Connector Changes

The Polaris connector previously depended on metacat-connector-hive. Changes:

  • Removed dependency: api(project(":metacat-connector-hive")) deleted from build.gradle
  • Import changes: All imports updated from connector.hive.* to common.server.connector.* or common.server.converter.*
  • Type converter: Switched from HiveTypeConverter to IcebergTypeConverter
  • Info converter: Switched from HiveConnectorInfoConverter to IcebergTableInfoConverter
  • Plugin: Created new PolarisConnectorInfoConverter (no-op implementation since Polaris uses internal converters)

5. Main Application Changes

ThriftConfig.java: Added @PostConstruct method to register HiveTypeConverter under the "hive" key with TypeConverterFactory. This is required because CatalogThriftEventHandler sets the data type context to "hive" for all Thrift requests, regardless of which connector module is loaded.
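The registration described above can be sketched in plain Java, without Spring. TypeConverter, ConverterRegistry, and ThriftConfigSketch are hypothetical stand-ins rather than the real Metacat or Spring types, and the lower-casing lambda is only a placeholder for HiveTypeConverter. The point is that a lookup under the "hive" key fails unless something registers a converter at startup, whichever connector modules are present.

```java
import java.util.Locale;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical stand-in for a type converter; not the real Metacat interface. */
interface TypeConverter {
    String toCanonical(String sourceType);
}

/** Hypothetical stand-in for a converter registry keyed by data type context. */
final class ConverterRegistry {
    private final Map<String, TypeConverter> converters = new ConcurrentHashMap<>();

    void register(String context, TypeConverter converter) {
        converters.put(context, converter);
    }

    TypeConverter get(String context) {
        TypeConverter converter = converters.get(context);
        if (converter == null) {
            throw new IllegalStateException("No type converter registered for context: " + context);
        }
        return converter;
    }
}

/** Mimics a configuration class whose init hook runs once at startup. */
final class ThriftConfigSketch {
    private final ConverterRegistry registry;

    ThriftConfigSketch(ConverterRegistry registry) {
        this.registry = registry;
    }

    // In a Spring application this method would carry @PostConstruct;
    // here it is called explicitly to keep the sketch dependency-free.
    void registerHiveTypeConverter() {
        // Placeholder conversion; the real converter maps Hive types to Metacat types.
        registry.register("hive", sourceType -> sourceType.toLowerCase(Locale.ROOT));
    }
}
```

Calling registry.get("hive") before registerHiveTypeConverter() throws IllegalStateException, which mirrors the failure that the startup hook is meant to prevent once the Hive connector module no longer performs the registration.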

build.gradle: Removed runtimeOnly(project(":metacat-connector-hive")).

6. Functional Test Changes

TestCatalogs.groovy:

  • Removed hive-metastore catalog from test catalog list
  • Removed validateWithHive field
  • Removed getThriftImplementersToValidateWithHive() method
  • Updated polaris-metastore and polaris-metastore-test catalogs with explicit capability flags

Docker Compose:

  • Authorization ACL strings changed from hive-metastore/fsmoke_acl to polaris-metastore/fsmoke_acl
  • Removed hive-metastore.properties catalog configs

Test Spec Files (-1,652 lines net):

  • MetacatFunctionalSpec.groovy - Removed hive-specific test cases
  • MetacatSmokeSpec.groovy - Removed hive metastore tests
  • MetacatSmokeThriftSpec.groovy - Removed hive Thrift validation tests
  • MetacatThriftFunctionalSpec.groovy - Removed hive-specific assertions

New Iceberg Metadata Files: Added pre-built metadata JSON files for functional tests (replacing Hive-created tables):

  • metacat-all-types.metadata.json
  • pig-table.metadata.json
  • pig-part-table.metadata.json
  • pig-parts-table.metadata.json

7. Dependency Lock Updates

All connector modules had their dependencies.lock files regenerated to reflect the removal of transitive Hive dependencies. The Polaris connector's dependency graph shrank notably.


Dependency Relationships After Change

metacat-main
└── metacat-thrift (Hive Thrift types for protocol compat)
└── metacat-connector-polaris
└── metacat-common-server (Iceberg handling, converters)
└── metacat-connector-jdbc

The Hive Metastore client libraries remain only in metacat-thrift for Thrift protocol support. The Polaris connector now operates independently of Hive types.


Files Deleted vs. Moved

Category                          Count        Lines
Deleted (Hive-specific)           ~70 files    ~12,000
Moved to metacat-common-server    25 files     ~2,500
Moved to metacat-thrift           5 files      ~600
Test metadata files added         4 files      ~530

@ursetta-netflix ursetta-netflix force-pushed the remove-hive branch 2 times, most recently from 4e131ec to 585c94c Compare January 20, 2026 16:47