From 33dd74510b9b60c92ee79604d98a5c693a214b2e Mon Sep 17 00:00:00 2001
From: rnblough <74331686+rnblough@users.noreply.github.com>
Date: Wed, 19 Nov 2025 18:32:37 -0500
Subject: [PATCH] Update 04-impala.md

Populated with the contents from V1 website.
---
 .../03-integrations/04-impala.md | 87 ++++++++++++++++++-
 1 file changed, 86 insertions(+), 1 deletion(-)

diff --git a/docs/04-user-guide/03-integrations/04-impala.md b/docs/04-user-guide/03-integrations/04-impala.md
index 1682a1c3fd..1d0efe89e1 100644
--- a/docs/04-user-guide/03-integrations/04-impala.md
+++ b/docs/04-user-guide/03-integrations/04-impala.md
@@ -1,3 +1,88 @@
+---
+sidebar_label: Impala
+---
+
 # Impala
 
-**TODO:** File a subtask under [HDDS-9858](https://issues.apache.org/jira/browse/HDDS-9858) and complete this page or section.
+Starting with version **4.2.0**, Apache Impala provides full support for querying data stored in Apache Ozone. To utilize this functionality, ensure that your Ozone version is **1.4.0** or later.
+
+## Supported Access Protocols
+
+Impala supports the following protocols for accessing Ozone data:
+
+ * `ofs`
+ * `s3a`
+
+> **Note:**
+> The `o3fs` protocol is **NOT** supported by Impala.
+
+## Supported Replication Types
+
+Impala is compatible with Ozone buckets configured with either:
+
+ * **RATIS** (Replication)
+ * **Erasure Coding**
+
+## Querying Ozone Data with Impala
+
+Impala provides two approaches to interact with Ozone:
+
+1. Managed Tables
+2. External Tables
+
+### Managed Tables
+
+If the Hive Warehouse Directory is located in Ozone, you can execute Impala queries without any changes, treating the Ozone file system like HDFS.
+
+**Example:**
+
+```sql
+CREATE DATABASE d1;
+
+CREATE TABLE t1 (x INT, s STRING);
+```
+
+The data will be stored under the Hive Warehouse Directory path in Ozone.
+
+#### Specifying a Custom Ozone Path
+
+You can create managed databases, tables, or partitions at a specific Ozone path using the `LOCATION` clause.
+
+**Example:**
+
+```sql
+CREATE DATABASE d1 LOCATION 'ofs://ozone1/vol1/bucket1/d1.db';
+
+CREATE TABLE t1 LOCATION 'ofs://ozone1/vol1/bucket1/table1';
+```
+
+### External Tables
+
+You can create an external table in Impala to query Ozone data.
+
+**Example:**
+
+```sql
+CREATE EXTERNAL TABLE external_table (
+  id INT,
+  name STRING
+) LOCATION 'ofs://ozone1/vol1/bucket1/table1';
+```
+
+With external tables:
+
+ * The data is expected to be created and managed by another tool.
+ * Impala queries the data as-is.
+ * The metadata is stored under the external warehouse directory.
+
+> **Note:**
+> Dropping an external table in Impala does not delete the associated data.
+
+### Using the S3A Protocol
+
+In addition to `ofs`, Impala can access Ozone via the S3 Gateway using the S3A file system. For more details, refer to:
+
+ * [The S3 Protocol](../01-client-interfaces/03-s3.md)
+ * The [Hadoop S3A documentation](https://hadoop.apache.org/docs/stable/hadoop-aws/tools/hadoop-aws/index.html)
+
+For additional information, consult the Apache Impala User Documentation on [Using Impala with Apache Ozone Storage](https://impala.apache.org/docs/build/html/topics/impala_ozone.html).