1 change: 1 addition & 0 deletions .gitignore
@@ -5,3 +5,4 @@ themes/hive/.DS_Store
themes/hive/static/.DS_Store
.hugo_build.lock
public
target
51 changes: 27 additions & 24 deletions README.md
@@ -1,20 +1,21 @@
<!---
Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied. See the License for the
specific language governing permissions and limitations
under the License. -->

# Apache Hive Documentation Site

This repository contains the code for generating the Apache Hive web site.
@@ -25,13 +26,14 @@ It's built with Hugo and hosted at https://hive.apache.org.
* Clone this repository.
* Install [hugo] on macOS:

```brew install hugo```
* For other operating systems, refer to [hugo-install].
* To verify your installation:

```hugo version```

* To build and start the Hugo server run:

```
hugo server -D

Running in Fast Render Mode. For full rebuilds on change: hugo server --disableFastRender
Web Server is available at http://localhost:1313/ (bind address 127.0.0.1)
Press Ctrl+C to stop
```
* Navigate to `http://localhost:1313/` to view the site locally.


### To Add New Content

* To add a new markdown file:
`hugo new general/Downloads.md`

* Update `themes/hive/layouts/partials/menu.html` and `config.toml` to add a navigation link to the markdown page as needed.

### Pushing to site
Commit and push the changes to the main branch. The site is automatically deployed from the site directory.


[hugo]: https://gohugo.io/getting-started/quick-start/
[hugo-install]: https://gohugo.io/installation/

1 change: 1 addition & 0 deletions content/Development/_index.md
@@ -2,3 +2,4 @@
title: "Development"
date: 2025-07-24
---

1 change: 1 addition & 0 deletions content/Development/desingdocs/_index.md
@@ -2,3 +2,4 @@
title: "Design Documents"
date: 2025-07-24
---

23 changes: 8 additions & 15 deletions content/Development/desingdocs/accessserver-design-proposal.md
@@ -46,18 +46,15 @@ Hive has a powerful data model that allows users to map logical tables and parti

HCatalog's Storage Based Authorization model is explained in more detail in the [HCatalog documentation](http://hive.apache.org/docs/hcat_r0.5.0/authorization.html), but the following set of quotes provides a good high-level overview:

> ... when a file system is used for storage, there is a directory corresponding to a database or a table. With this authorization model, **the read/write permissions a user or group has for this directory determine the permissions a user has on the database or table**.
>
> ...
>
> For example, an alter table operation would check if the user has permissions on the table directory before allowing the operation, even if it might not change anything on the file system.
>
> ...
>
> When the database or table is backed by a file system that has a Unix/POSIX-style permissions model (like HDFS), there are read(r) and write(w) permissions you can set for the owner user, group and ‘other’. The file system’s logic for determining if a user has permission **on the directory or file** will be used by Hive.

There are several problems with this approach, the first of which is actually hinted at by the inconsistency highlighted in the preceding quote. To determine whether a particular user has read permission on table `foo`, HCatalog's [HdfsAuthorizationProvider class](http://svn.apache.org/repos/asf/hive/branches/branch-0.11/hcatalog/core/src/main/java/org/apache/hcatalog/security/HdfsAuthorizationProvider.java) checks to see if the user has read permission on the corresponding HDFS directory `/hive/warehouse/foo` that contains the table's data. However, in HDFS having [read permission on a directory](http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/HdfsPermissionsGuide.html) only implies that you have the ability to list the contents of the directory – it doesn't have any effect on your ability to read the files contained in the directory.
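The mismatch can be made concrete with a small model. This is illustrative Python, not HCatalog code; the paths, permission strings, and function names are invented for the sketch. It shows a storage-based check granting table read because the user can read the warehouse *directory*, even though the data *files* inside are unreadable.

```python
# Toy model of HDFS-style POSIX permissions; layout and names are invented
# for illustration and do not mirror HdfsAuthorizationProvider internals.
fs = {
    "/hive/warehouse/foo":        {"kind": "dir",  "perms": "r-x"},  # user may list
    "/hive/warehouse/foo/part-0": {"kind": "file", "perms": "---"},  # ...but not read data
}

def can_read(path):
    return "r" in fs[path]["perms"]

def storage_based_table_read_check(table_dir):
    # The proxy check: directory read permission stands in for table
    # read permission.
    return can_read(table_dir)

def actually_readable(data_file):
    # What matters for reading the table's rows: the files themselves.
    return can_read(data_file)

# The authorization check says "yes", but the data is unreadable:
assert storage_based_table_read_check("/hive/warehouse/foo") is True
assert actually_readable("/hive/warehouse/foo/part-0") is False
```

The divergence is exactly the one the document describes: directory read permission governs listing, not the readability of the contained files.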

@@ -100,7 +97,3 @@ Finally, red is used in the preceding diagram to highlight HCatalog components w

16 changes: 6 additions & 10 deletions content/Development/desingdocs/binary-datatype-proposal.md
@@ -21,9 +21,9 @@ create table binary_table (a string, b binary);

### How is 'binary' represented internally in Hive

Binary type in Hive will map to the 'binary' data type in Thrift.

The primitive Java object for the 'binary' type is ByteArrayRef.

The PrimitiveWritableObject for the 'binary' type is BytesWritable.

@@ -41,13 +41,13 @@ As with other types, binary data will be sent to the transform script in String form
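A transform script therefore sees a binary column as just another tab-separated string field. The sketch below is a hypothetical TRANSFORM-style script, not Hive code; how Hive escapes the raw bytes is not specified here, so the script simply treats the field as an opaque string.

```python
# Hypothetical TRANSFORM-style script: each row is a tab-separated line,
# and the binary column arrives "in String form" per the proposal.
# The escaping of the bytes is an assumption; the field is treated opaquely.
def process(line: str) -> str:
    key, payload = line.rstrip("\n").split("\t", 1)
    # Report the key and the length of the binary payload as received.
    return f"{key}\t{len(payload)}"

# A row whose second column is a binary value rendered as a string:
sample = "row1\t\x00\x01\xffdata"
print(process(sample))  # -> row1\t7
```

In a real pipeline the function body would run inside the loop `for line in sys.stdin: print(process(line))`, as Hive streams rows to the script over stdin.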

### Supported Serde:

ColumnarSerde

BinarySortableSerde

LazyBinaryColumnarSerde

LazyBinarySerde

LazySimpleSerde

@@ -57,7 +57,3 @@ Group-by and unions will be supported on columns with 'binary' type

<https://issues.apache.org/jira/browse/HIVE-2380>





151 changes: 76 additions & 75 deletions content/Development/desingdocs/column-statistics-in-hive.md
@@ -30,59 +30,60 @@ To view column stats:
```
describe formatted [table_name] [column_name];
```

### **Metastore Schema**

To persist column-level statistics, we propose to add the following new tables:

```
CREATE TABLE TAB_COL_STATS
(
 CS_ID NUMBER NOT NULL,
 TBL_ID NUMBER NOT NULL,
 COLUMN_NAME VARCHAR(128) NOT NULL,
 COLUMN_TYPE VARCHAR(128) NOT NULL,
 TABLE_NAME VARCHAR(128) NOT NULL,
 DB_NAME VARCHAR(128) NOT NULL,
 LOW_VALUE RAW,
 HIGH_VALUE RAW,
 NUM_NULLS BIGINT,
 NUM_DISTINCTS BIGINT,
 BIT_VECTOR BLOB, /* introduced in HIVE-16997 in Hive 3.0.0 */
 AVG_COL_LEN DOUBLE,
 MAX_COL_LEN BIGINT,
 NUM_TRUES BIGINT,
 NUM_FALSES BIGINT,
 LAST_ANALYZED BIGINT NOT NULL)

ALTER TABLE COLUMN_STATISTICS ADD CONSTRAINT COLUMN_STATISTICS_PK PRIMARY KEY (CS_ID);

ALTER TABLE COLUMN_STATISTICS ADD CONSTRAINT COLUMN_STATISTICS_FK1 FOREIGN KEY (TBL_ID) REFERENCES TBLS (TBL_ID) INITIALLY DEFERRED;

CREATE TABLE PART_COL_STATS
(
 CS_ID NUMBER NOT NULL,
 PART_ID NUMBER NOT NULL,
 DB_NAME VARCHAR(128) NOT NULL,
 COLUMN_NAME VARCHAR(128) NOT NULL,
 COLUMN_TYPE VARCHAR(128) NOT NULL,
 TABLE_NAME VARCHAR(128) NOT NULL,
 PART_NAME VARCHAR(128) NOT NULL,
 LOW_VALUE RAW,
 HIGH_VALUE RAW,
 NUM_NULLS BIGINT,
 NUM_DISTINCTS BIGINT,
 BIT_VECTOR BLOB, /* introduced in HIVE-16997 in Hive 3.0.0 */
 AVG_COL_LEN DOUBLE,
 MAX_COL_LEN BIGINT,
 NUM_TRUES BIGINT,
 NUM_FALSES BIGINT,
 LAST_ANALYZED BIGINT NOT NULL)

ALTER TABLE COLUMN_STATISTICS ADD CONSTRAINT COLUMN_STATISTICS_PK PRIMARY KEY (CS_ID);
```
@@ -93,44 +94,44 @@ ALTER TABLE COLUMN_STATISTICS ADD CONSTRAINT COLUMN_STATISTICS_FK1 FOREIGN KEY (
We propose to add the following Thrift structs to transport column statistics:

```
struct BooleanColumnStatsData {
 1: required i64 numTrues,
 2: required i64 numFalses,
 3: required i64 numNulls
}

struct DoubleColumnStatsData {
 1: required double lowValue,
 2: required double highValue,
 3: required i64 numNulls,
 4: required i64 numDVs,
 5: optional string bitVectors
}

struct LongColumnStatsData {
 1: required i64 lowValue,
 2: required i64 highValue,
 3: required i64 numNulls,
 4: required i64 numDVs,
 5: optional string bitVectors
}

struct StringColumnStatsData {
 1: required i64 maxColLen,
 2: required double avgColLen,
 3: required i64 numNulls,
 4: required i64 numDVs,
 5: optional string bitVectors
}

struct BinaryColumnStatsData {
 1: required i64 maxColLen,
 2: required double avgColLen,
 3: required i64 numNulls
}
```

struct Decimal {
1: required binary unscaled,
@@ -168,43 +169,43 @@ union ColumnStatisticsData {
}

```
struct ColumnStatisticsObj {
 1: required string colName,
 2: required string colType,
 3: required ColumnStatisticsData statsData
}

struct ColumnStatisticsDesc {
 1: required bool isTblLevel,
 2: required string dbName,
 3: required string tableName,
 4: optional string partName,
 5: optional i64 lastAnalyzed
}

struct ColumnStatistics {
 1: required ColumnStatisticsDesc statsDesc,
 2: required list<ColumnStatisticsObj> statsObj;
}
```

We propose to add the following Thrift APIs to persist, retrieve and delete column statistics:

```
bool update_table_column_statistics(1:ColumnStatistics stats_obj) throws (1:NoSuchObjectException o1,
 2:InvalidObjectException o2, 3:MetaException o3, 4:InvalidInputException o4)
bool update_partition_column_statistics(1:ColumnStatistics stats_obj) throws (1:NoSuchObjectException o1,
 2:InvalidObjectException o2, 3:MetaException o3, 4:InvalidInputException o4)

ColumnStatistics get_table_column_statistics(1:string db_name, 2:string tbl_name, 3:string col_name) throws
 (1:NoSuchObjectException o1, 2:MetaException o2, 3:InvalidInputException o3, 4:InvalidObjectException o4)
ColumnStatistics get_partition_column_statistics(1:string db_name, 2:string tbl_name, 3:string part_name,
 4:string col_name) throws (1:NoSuchObjectException o1, 2:MetaException o2,
 3:InvalidInputException o3, 4:InvalidObjectException o4)

bool delete_partition_column_statistics(1:string db_name, 2:string tbl_name, 3:string part_name, 4:string col_name) throws
 (1:NoSuchObjectException o1, 2:MetaException o2, 3:InvalidObjectException o3,
 4:InvalidInputException o4)
bool delete_table_column_statistics(1:string db_name, 2:string tbl_name, 3:string col_name) throws
 (1:NoSuchObjectException o1, 2:MetaException o2, 3:InvalidObjectException o3,
 4:InvalidInputException o4)
```

Note that delete_column_statistics is needed to remove the entries from the metastore when a table is dropped. Also note that currently Hive doesn’t support drop column.
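The optional `bitVectors` field in the structs (and the BIT_VECTOR column from HIVE-16997) exists so that distinct-value counts can be merged across partitions: two `numDVs` values cannot be summed, but two sketches can be combined. The sketch below illustrates the idea with simple linear counting over a fixed-width bit vector; Hive's actual implementation uses HyperLogLog, and the vector width and hashing here are assumptions for illustration.

```python
import hashlib
import math

M = 1 << 12  # bit-vector width; an assumption for this sketch

def bit_vector(values):
    """Set one bit per hashed value; a stand-in for Hive's NDV sketches."""
    bits = 0
    for v in values:
        h = int.from_bytes(hashlib.sha1(str(v).encode()).digest()[:8], "big")
        bits |= 1 << (h % M)
    return bits

def estimate_ndv(bits):
    """Linear-counting estimate: M * ln(M / zero_bits)."""
    zeros = M - bin(bits).count("1")
    return round(M * math.log(M / zeros))

# Merging two partitions' sketches is just bitwise OR. This is why storing
# a bit vector (not only numDVs) lets a table-level NDV be derived from
# partition-level stats without rescanning the data.
part1 = bit_vector(range(0, 100))
part2 = bit_vector(range(50, 150))  # overlaps part1 on 50..99
merged_ndv = estimate_ndv(part1 | part2)  # close to the true NDV of 150
```

Summing the per-partition estimates would give roughly 200 here; OR-ing the vectors first keeps the overlap from being double-counted.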
