diff --git a/content/docs/latest/hcatalog/hcatalog-cli.md b/content/docs/latest/hcatalog/hcatalog-cli.md index 0cdd13a5..d0b8898e 100644 --- a/content/docs/latest/hcatalog/hcatalog-cli.md +++ b/content/docs/latest/hcatalog/hcatalog-cli.md @@ -143,11 +143,11 @@ Any command not listed above is NOT supported and throws an exception with the m ### Authentication -If a failure results in a message like "2010-11-03 16:17:28,225 WARN hive.metastore ... - Unable to connect metastore with URI thrift://..." in `/tmp/`**`/hive.log`, then make sure you have run "`kinit` **`@FOO.COM`" to get a Kerberos ticket and to be able to authenticate to the HCatalog server. +If a failure results in a message like "2010-11-03 16:17:28,225 WARN hive.metastore ... - Unable to connect metastore with URI thrift://..." in `/tmp//hive.log`, then make sure you have run "`kinit @FOO.COM`" to get a Kerberos ticket and to be able to authenticate to the HCatalog server. ### Error Log -If other errors occur while using the HCatalog CLI, more detailed messages are written to /tmp/**/hive.log. +If other errors occur while using the HCatalog CLI, more detailed messages are written to `/tmp//hive.log`. diff --git a/content/docs/latest/hcatalog/hcatalog-inputoutput.md b/content/docs/latest/hcatalog/hcatalog-inputoutput.md index 64a46f24..1b5b019d 100644 --- a/content/docs/latest/hcatalog/hcatalog-inputoutput.md +++ b/content/docs/latest/hcatalog/hcatalog-inputoutput.md @@ -196,7 +196,7 @@ hdfs:///tmp/slf4j-api-1.6.1.jar ### Authentication -If a failure results in a message like "2010-11-03 16:17:28,225 WARN hive.metastore ... - Unable to connect metastore with URI thrift://..." in `/tmp/`**`/hive.log`, then make sure you have run "`kinit` **`@FOO.COM`" to get a Kerberos ticket and to be able to authenticate to the HCatalog server. +If a failure results in a message like "2010-11-03 16:17:28,225 WARN hive.metastore ... - Unable to connect metastore with URI thrift://..." 
in `/tmp//hive.log`, then make sure you have run "`kinit @FOO.COM`" to get a Kerberos ticket and to be able to authenticate to the HCatalog server. ### Read Example diff --git a/content/docs/latest/hcatalog/hcatalog-loadstore.md b/content/docs/latest/hcatalog/hcatalog-loadstore.md index 10690c41..9b668519 100644 --- a/content/docs/latest/hcatalog/hcatalog-loadstore.md +++ b/content/docs/latest/hcatalog/hcatalog-loadstore.md @@ -164,7 +164,7 @@ The version number found in each filepath will be substituted for *. For example #### Authentication -If you are using a secure cluster and a failure results in a message like "2010-11-03 16:17:28,225 WARN hive.metastore ... - Unable to connect metastore with URI thrift://..." in `/tmp/`**`/hive.log`, then make sure you have run "`kinit` **`@FOO.COM`" to get a Kerberos ticket and to be able to authenticate to the HCatalog server. +If you are using a secure cluster and a failure results in a message like "2010-11-03 16:17:28,225 WARN hive.metastore ... - Unable to connect metastore with URI thrift://..." in `/tmp//hive.log`, then make sure you have run "`kinit @FOO.COM`" to get a Kerberos ticket and to be able to authenticate to the HCatalog server. ### Load Examples diff --git a/content/docs/latest/language/cast-format-with-sql2016-datetime-formats.md b/content/docs/latest/language/cast-format-with-sql2016-datetime-formats.md index 9fc9de0d..92cdfe5f 100644 --- a/content/docs/latest/language/cast-format-with-sql2016-datetime-formats.md +++ b/content/docs/latest/language/cast-format-with-sql2016-datetime-formats.md @@ -137,7 +137,7 @@ e.g. input=2019-01-01 20:00, format=“AM”, output=“PM”. - Retains the exact format (capitalization and length) provided in the pattern string. If p.m. is in the pattern, we expect a.m. or p.m. in the output; if AM is in the pattern, we expect AM or PM in the output. If the case is mixed (Am or aM) then the output case will match the -case of the pattern's first character (Am => AM, aM => am). 
+case of the pattern's first character (Am => AM, aM => am). - String to datetime conversion: - Conflicts with HH24 and SSSSS. - It doesn't matter which meridian indicator is in the pattern. @@ -221,10 +221,10 @@ MONTH|Month|month Name of month of year - For datetime to string conversion, will include trailing spaces up to length 9 (length of longest month of year name: "September"). Case is taken into account according to the -following example (pattern => output): -- MONTH => JANUARY -- Month => January -- month => january +following example (pattern => output): +- MONTH => JANUARY +- Month => January +- month => january - For string to datetime conversion, neither the case of the pattern nor the case of the input are taken into account. - For string to datetime conversion, conflicts with MM and MON. @@ -232,10 +232,10 @@ are taken into account. MON|Mon|mon Abbreviated name of month of year - For datetime to string conversion, case is taken into account according to the following -example (pattern => output): -- MON => JAN -- Mon => Jan -- mon => jan +example (pattern => output): +- MON => JAN +- Mon => Jan +- mon => jan - For string to datetime conversion, neither the case of the pattern nor the case of the input are taken into account. - For string to datetime conversion, conflicts with MM and MONTH. @@ -244,7 +244,7 @@ DAY|Day|day Name of day of week - For datetime to string conversion, will include trailing spaces until length is 9 (length of longest day of week name: "Wednesday"). Case is taken into account according to the following -example (pattern => output): +example (pattern => output): - DAY = SUNDAY - Day = Sunday - day = sunday @@ -255,7 +255,7 @@ are taken into account. DY|Dy|dy Abbreviated name of day of week - For datetime to string conversion, case is taken into account according to the following -example (pattern => output): +example (pattern => output): - DY = SUN - Dy = Sun - dy = sun @@ -286,11 +286,11 @@ zone agnostic. ##### C. 
Separators --|.|/|,|'|;|:| +- |.|/|,|'|;|:|<space>\ Separator - Uses loose matching. Existence of a sequence of separators in the format should match the existence of a sequence of separators in the input regardless of the types of the separator or -the length of the sequence where length > 1. E.g. input=“2019-. ;10/10”, pattern=“YYYY-MM-DD” +the length of the sequence where length > 1. E.g. input=“2019-. ;10/10”, pattern=“YYYY-MM-DD” is valid; input=“20191010”, pattern=“YYYY-MM-DD” is not valid. - If the last separator character in the separator substring is "-" and is immediately followed by a time zone hour (tzh) token, it's a negative sign and not counted as a separator, UNLESS diff --git a/content/docs/latest/language/hive-udfs.md b/content/docs/latest/language/hive-udfs.md index 3facde34..21a47f71 100644 --- a/content/docs/latest/language/hive-udfs.md +++ b/content/docs/latest/language/hive-udfs.md @@ -52,9 +52,9 @@ These functions can be used without GROUP BY as well.  | **double** | ``` covar_samp(col1, col2) ``` | Returns the sample covariance of a pair of numeric columns in the group. | [GenericUDAFCovarianceSample](https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFCovarianceSample.java) | | **double** | ``` corr(col1, col2) ``` | Returns the Pearson coefficient of correlation of a pair of numeric columns in the group. | [GenericUDAFCorrelation](https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFCorrelation.java) | | **double** | ``` percentile(bigint col, p) ``` | Returns the exact pth percentile of a column in the group (does not work with floating point types). p must be between 0 and 1. NOTE: A true percentile can only be computed for integer values. Use PERCENTILE_APPROX if your input is non-integral. 
| [UDAFPercentile](https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/udf/UDAFPercentile.java) | -| **array** | ``` percentile(bigint col, array(p1 [, p2]...)) ``` | Returns the exact percentiles p1, p2, ... of a column in the group (does not work with floating point types). pi must be between 0 and 1. NOTE: A true percentile can only be computed for integer values. Use PERCENTILE_APPROX if your input is non-integral. | +| **array\** | ``` percentile(bigint col, array(p1 [, p2]...)) ``` | Returns the exact percentiles p1, p2, ... of a column in the group (does not work with floating point types). pi must be between 0 and 1. NOTE: A true percentile can only be computed for integer values. Use PERCENTILE_APPROX if your input is non-integral. | | **double** | ``` percentile_approx(double col, p [, B]) ``` | Returns an approximate pth percentile of a numeric column (including floating point types) in the group. The B parameter controls approximation accuracy at the cost of memory. Higher values yield better approximations, and the default is 10,000. When the number of distinct values in col is smaller than B, this gives an exact percentile value. | [GenericUDAFPercentileApprox](https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFPercentileApprox.java) | -| **array** | ``` percentile_approx(double col, array(p1 [, p2]...) [, B]) ``` | Same as above, but accepts and returns an array of percentile values instead of a single one. | +| **array\** | ``` percentile_approx(double col, array(p1 [, p2]...) [, B]) ``` | Same as above, but accepts and returns an array of percentile values instead of a single one. | | **double** | ``` regr_avgx(independent, dependent) ``` | Equivalent to avg(dependent).  | | | **double** | ``` regr_avgy(independent, dependent) ``` | Equivalent to avg(independent).  
| | | **double** | ``` regr_count(independent, dependent) ``` | Returns the number of non-null pairs used to fit the linear regression line.  | | @@ -64,7 +64,7 @@ These functions can be used without GROUP BY as well.  | **double** | ``` regr_sxx(independent, dependent) ``` | Equivalent to regr_count(independent, dependent) * var_pop(dependent).  | | | **double** | ``` regr_sxy(independent, dependent) ``` | Equivalent to regr_count(independent, dependent) * covar_pop(independent, dependent).  | | | **double** | ``` regr_syy(independent, dependent) ``` | Equivalent to regr_count(independent, dependent) * var_pop(independent). | | -| **array** | ``` histogram_numeric(col, b) ``` | Computes a histogram of a numeric column in the group using b non-uniformly spaced bins. The output is an array of size b of double-valued (x,y) coordinates that represent the bin centers and heights | [GenericUDAFHistogramNumeric](https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFHistogramNumeric.java) | +| **array\** | ``` histogram_numeric(col, b) ``` | Computes a histogram of a numeric column in the group using b non-uniformly spaced bins. The output is an array of size b of double-valued (x,y) coordinates that represent the bin centers and heights | [GenericUDAFHistogramNumeric](https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFHistogramNumeric.java) | | **array** | ``` collect_set(col) ``` | Returns a set of objects with duplicate elements eliminated. | [GenericUDAFCollectSet](https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFCollectSet.java) | | **array** | ``` collect_list(col) ``` | Returns a list of objects with duplicates.  
| [GenericUDAFCollectList](https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFCollectList.java) | | **int** | ``` ntile(integer x) ``` | Divides an ordered partition into `x` groups called buckets and assign a bucket number to each row in the partition. This allows easy calculation of tertiles, quartiles, deciles, percentiles, and other common summary statistics. | [GenericUDAFNTile](https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFNTile.java) | @@ -84,7 +84,7 @@ Normal user-defined functions, such as concat(), take in a single input row and | **T1,...,Tn** | ``` inline(ARRAY> a) ``` | Explodes an array of structs to multiple rows. Returns a row-set with N columns (N = number of top level elements in the struct), one row per struct from the array. | [GenericUDTFInline](https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDTFInline.java) | | **T1,...,Tn/r** | ``` stack(int r,T1 V1,...,Tn/r Vn) ``` | Breaks up *n* values V1,...,Vninto *r* rows. Each row will have *n/r* columns. *r* must be constant. | [GenericUDTFStack](https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDTFStack.java) | | **string1,...,stringn** | ``` json_tuple(string jsonStr, string k1,...,string kn) ``` | Takes JSON string and a set of *n* keys, and returns a tuple of *n* values. This is a more efficient version of the `get_json_object` UDF because it can get multiple keys with just one call. | [GenericUDTFJSONTuple](https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDTFJSONTuple.java) | -| **string 1,...,stringn** | ``` parse_url_tuple(string urlStr, string p1,...,string pn) ``` | Takes URLstring and a set of n URL parts, and returns a tuple of n values. This is similar to the parse_url() UDF but can extract multiple parts at once out of a URL. 
Valid part names are HOST, PATH, QUERY, REF, PROTOCOL, AUTHORITY, FILE, USERINFO, QUERY:. | [GenericUDTFParseUrlTuple](https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDTFParseUrlTuple.java) | +| **string 1,...,stringn** | ``` parse_url_tuple(string urlStr, string p1,...,string pn) ``` | Takes URLstring and a set of n URL parts, and returns a tuple of n values. This is similar to the parse_url() UDF but can extract multiple parts at once out of a URL. Valid part names are HOST, PATH, QUERY, REF, PROTOCOL, AUTHORITY, FILE, USERINFO, QUERY:\. | [GenericUDTFParseUrlTuple](https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDTFParseUrlTuple.java) | ### String Functions @@ -97,7 +97,7 @@ There is no good engine without string manipulation functions. Apache Hive has r | **int** | ``` character_length(string str) ``` | Returns the number of UTF-8 characters contained in str. The function char_length is shorthand for this function. | [GenericUDFCharacterLength](https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFCharacterLength.java) | | **string** | ``` chr(bigint\|double A) ``` | Returns the ASCII character having the binary equivalent to A. If A is larger than 256 the result is equivalent to chr(A % 256). Example: select chr(88); returns "X". | [UDFChr](https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/udf/UDFChr.java) | | **string** | ``` concat(string\|binary A,string\|binary B...) ``` | Returns the string or bytes resulting from concatenating the strings or bytes passed in as parameters in order. For example, concat('foo', 'bar') results in 'foobar'. Note that this function can take any number of input strings. 
| [GenericUDFConcat](https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFConcat.java) | -| **array>** | ``` context_ngrams(array>, array, int K, int pf) ``` | Returns the top-k contextual N-grams from a set of tokenized sentences, given a string of "context". See [StatisticsAndDataMining]({{< ref "statisticsanddatamining" >}}) for more information. | [GenericUDAFContextNGrams](https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFContextNGrams.java) | +| **array\\>** | ``` context_ngrams(array>, array, int K, int pf) ``` | Returns the top-k contextual N-grams from a set of tokenized sentences, given a string of "context". See [StatisticsAndDataMining]({{< ref "statisticsanddatamining" >}}) for more information. | [GenericUDAFContextNGrams](https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFContextNGrams.java) | | **string** | ``` concat_ws(string SEP, string A, string B...) ``` | Like concat() above, but with custom separator SEP. | [GenericUDFConcatWS](https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFConcatWS.java) | | **string** | ``` concat_ws(string SEP, array) ``` | Like concat_ws() above, but taking an array ofstrings. | | **string** | ``` decode(binary bin, string charset) ``` | Decodes the first argument into a string using the provided character set (one of 'US-ASCII', 'ISO-8859-1', 'UTF-8', 'UTF-16BE', 'UTF-16LE', 'UTF-16'). If either argument is null, the result will also be null. | [GenericUDFDecode](https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFDecode.java) | @@ -114,7 +114,7 @@ There is no good engine without string manipulation functions. Apache Hive has r | **string** | ``` lower(string A) lcase(string A) ``` | Returns the string resulting from converting all characters of B to lowercase. 
For example, lower('fOoBaR') results in 'foobar'. | [GenericUDFLower](https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFLower.java) | | **string** | ``` lpad(string str,int len,string pad) ``` | Returns str, left-padded with pad to a length of len. If str is longer than len, the return value is shortened to len characters. In the case of an empty padstring, the return value is null. | [GenericUDFLpad](https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFLpad.java) | | **string** | ``` ltrim(string A) ``` | Returns the string resulting from trimming spaces from the beginning(left-hand side) of A. For example, ltrim(' foobar ') results in 'foobar '. | [GenericUDFLTrim](https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFLTrim.java) | -| **array>** | ``` ngrams(array>,int N,int K,int pf) ``` | Returns the top-k N-grams from a set of tokenized sentences, such as those returned by the sentences() UDAF. See [StatisticsAndDataMining]({{< ref "statisticsanddatamining" >}}) for more information. | [GenericUDAFnGrams](https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFnGrams.java) | +| **array\\>** | ``` ngrams(array>,int N,int K,int pf) ``` | Returns the top-k N-grams from a set of tokenized sentences, such as those returned by the sentences() UDAF. See [StatisticsAndDataMining]({{< ref "statisticsanddatamining" >}}) for more information. | [GenericUDAFnGrams](https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFnGrams.java) | | **int** | ``` octet_length(string str) ``` | Returns the number of octets required to hold the string str in UTF-8 encoding.  Note that octet_length(str) can be larger than character_length(str). 
| [GenericUDFOctetLength](https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFOctetLength.java) | | **string** | ``` parse_url(string urlString,string partToExtract [,string keyToExtract]) ``` | Returns the specified part from the URL. Valid values for partToExtract include HOST, PATH, QUERY, REF, PROTOCOL, AUTHORITY, FILE, and USERINFO. For example, parse_url('', 'HOST') returns '[facebook.com](http://facebook.com)'. Also, a value of a particular key in QUERY can be extracted by providing the key as the third argument, for example, parse_url('', 'QUERY', 'k1') returns 'v1'. | [UDFParseUrl](https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/udf/UDFParseUrl.java) | | **string** | ``` printf(string format, Obj... args) ``` | Returns the input formatted according to [printf-style](https://en.wikipedia.org/wiki/Printf) formatstrings. | [GenericUDFPrintf](https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFPrintf.java) | @@ -135,10 +135,10 @@ There is no good engine without string manipulation functions. Apache Hive has r | **string** | ``` reverse(string A) ``` | Returns the reversed string. | [UDFReverse](https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/udf/UDFReverse.java) | | **string** | ``` rpad(string str,int len,string pad) ``` | Returns str, right-padded with pad to a length of len. If str is longer than len, the return value is shortened to len characters. In the case of an empty padstring, the return value is null. | [GenericUDFRpad](https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFRpad.java) | | **string** | ``` rtrim(string A) ``` | Returns the string resulting from trimming spaces from the end(right-hand side) of A. For example, rtrim(' foobar ') results in ' foobar'. 
| [GenericUDFRTrim](https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFRTrim.java) | -| **array>** | ``` sentences(string str,string lang,string locale) ``` | Tokenizes a string of natural language text into words and sentences, where each sentence is broken at the appropriate sentence boundary and returned as an array of words. The 'lang' and 'locale' are optional arguments. For example, sentences('Hello there! How are you?') returns ( ("Hello", "there"), ("How", "are", "you") ). | [GenericUDFSentences](https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFSentences.java) | +| **array\\>** | ``` sentences(string str,string lang,string locale) ``` | Tokenizes a string of natural language text into words and sentences, where each sentence is broken at the appropriate sentence boundary and returned as an array of words. The 'lang' and 'locale' are optional arguments. For example, sentences('Hello there! How are you?') returns ( ("Hello", "there"), ("How", "are", "you") ). | [GenericUDFSentences](https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFSentences.java) | | **string** | ``` space(int n) ``` | Returns a string of n spaces. | [UDFSpace](https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/udf/UDFSpace.java) | | **array** | ``` split(string str, string pat) ``` | Splits str around pat (pat is a regular expression). | [GenericUDFSplit](https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFSplit.java) | -| **map** | ``` str_to_map(text[, delimiter1, delimiter2]) ``` | Splits text into key-value pairs using two delimiters. Delimiter1 separates text into K-V pairs, and Delimiter2 splits each K-V pair. Default delimiters are ',' for delimiter1 and ':' for delimiter2. 
| [GenericUDFStringToMap](https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFStringToMap.java) | +| **map\** | ``` str_to_map(text[, delimiter1, delimiter2]) ``` | Splits text into key-value pairs using two delimiters. Delimiter1 separates text into K-V pairs, and Delimiter2 splits each K-V pair. Default delimiters are ',' for delimiter1 and ':' for delimiter2. | [GenericUDFStringToMap](https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFStringToMap.java) | | **string** | ``` substr(string\|binary A,int start) substring(string\|binary A,int start) ``` | Returns the substring or slice of the byte array of A starting from start position till the end of string A. For example, substr('foobar', 4) results in 'bar'. | [GenericUDFSubstringIndex](https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFSubstringIndex.java) | | **string** | ``` substr(string\|binary A,int start,int len) substring(string\|binary A,int start,int len) ``` | Returns the substring or slice of the byte array of A starting from start position with length len. For example, substr('foobar', 4, 1) results in 'b'. | | **string** | ``` substring_index(string A,string delim,int count) ``` | Returns the substring from string A before count occurrences of the delimiter delim. If the count is positive, everything to the left of the final delimiter (counting from the left) is returned. If count is negative, everything to the right of the final delimiter (counting from the right) is returned. Substring_index performs a case-sensitive match when searching for delim. Example: substring_index('[www.apache.org](http://www.apache.org)', '.', 2) = 'www.apache'. 
| [GenericUDFSubstringIndex](https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFSubstringIndex.java) | @@ -229,9 +229,9 @@ The following built-in mathematical functions are supported in Hive.  | **int** **bigint** | ``` shiftleft(TINYINT\|SMALLINT\|INT a,int b) ``` ``` shiftleft(bigint a,int b) ``` | Bitwise left shift. Shifts `a` `b` positions to the left.Returns int for tinyint, smallint andint `a`. Returns bigint for bigint `a`. | [UDFOPBitShiftLeft](https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/udf/UDFOPBitShiftLeft.java) | | **int** **bigint** | ``` shiftright(TINYINT\|SMALLINT\|INT a,int b) ``` ``` shiftright(bigint a,int b) ``` | Bitwise right shift. Shifts `a` `b` positions to the right.Returns int for tinyint, smallint andint `a`. Returns bigint for bigint `a`. | [UDFOPBitShiftRight](https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/udf/UDFOPBitShiftRight.java) | | **int** **bigint** | ``` shiftrightunsigned(TINYINT\|SMALLINT\|INT a,int b), ``` ``` shiftrightunsigned(bigint a,int b) ``` | Bitwise unsigned right shift. Shifts `a` `b` positions to the right.Returns int for tinyint, smallint andint `a`. Returns bigint for bigint `a`. | [UDFOPBitShiftRightUnsigned](https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/udf/UDFOPBitShiftRightUnsigned.java) | -| **T** | ``` greatest(T v1, T v2, ...) ``` | Returns the greatest value of the list of values. Fixed to return NULL when one or more arguments are NULL, and strict type restriction relaxed, consistent with ">" operator. | [GenericUDFGreatest](https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFGreatest.java) | +| **T** | ``` greatest(T v1, T v2, ...) ``` | Returns the greatest value of the list of values. 
Fixed to return NULL when one or more arguments are NULL, and strict type restriction relaxed, consistent with "\>" operator. | [GenericUDFGreatest](https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFGreatest.java) | | **T** | ``` least(T v1, T v2, ...) ``` | Returns the least value of the list of values. Fixed to return NULL when one or more arguments are NULL, and strict type restriction relaxed, consistent with "<" operator. | [GenericUDFLeast](https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFLeast.java) | -| **int** | ``` width_bucket(NUMERIC expr, NUMERIC min_value, NUMERIC max_value,int num_buckets) ``` | Returns an integer between 0 and num_buckets+1 by mapping expr into the ith equally sized bucket. Buckets are made by dividing [min_value, max_value]into equally sized regions. If expr < min_value, return 1, if expr > max_value return num_buckets+1. See | [GenericUDFWidthBucket](https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFWidthBucket.java) | +| **int** | ``` width_bucket(NUMERIC expr, NUMERIC min_value, NUMERIC max_value,int num_buckets) ``` | Returns an integer between 0 and num_buckets+1 by mapping expr into the ith equally sized bucket. Buckets are made by dividing [min_value, max_value]into equally sized regions. If expr \< min_value, return 1, if expr \> max_value return num_buckets+1. See | [GenericUDFWidthBucket](https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFWidthBucket.java) | | **double** | ``` cosh(double x) ``` | NEW Returns the hyperbolic cosine of `x,` where `x` is in radians. Example: cosh(0) Result: 1 | [UDFCosh](https://github.com/apache/hive/blame/master/ql/src/java/org/apache/hadoop/hive/ql/udf/UDFCosh.java) | | **double** | ``` tanh(double x) ``` | NEW Returns the hyperbolic tangent of `x,` where `x` is in radians. 
Example: tanh(0) Result: 1 | [UDFTanh](https://github.com/apache/hive/blame/master/ql/src/java/org/apache/hadoop/hive/ql/udf/UDFTanh.java) | @@ -243,20 +243,20 @@ The following built-in collection functions are supported in Hive.  | --- | --- | --- | --- | | **int** | ``` size(Map) ``` | Returns the number of elements in the map type. | | | **int** | ``` size(Array) ``` | Returns the number of elements in the array type. | | -| **array** | ``` map_keys(Map) ``` | Returns an unordered array containing the keys of the input map. | | -| **array** | ``` map_values(Map) ``` | Returns an unordered array containing the values of the input map. | | +| **array\** | ``` map_keys(Map) ``` | Returns an unordered array containing the keys of the input map. | | +| **array\** | ``` map_values(Map) ``` | Returns an unordered array containing the values of the input map. | | | **boolean** | ``` array_contains(Array, value) ``` | Returns TRUE if the array contains the provided parameter value. | | -| **array** | ``` sort_array(Array) ``` | Sorts the input array in ascending order according to the natural ordering of the array elements and returns it. | | -| **array** | ``` array(obj1, obj2, .... objN) ``` | NEW The function returns an array of the same type as the input array with distinct values. Example: array('b', 'd', 'd', 'a') reurtns ['b', 'd', 'a']  | [GenericUDFArrayDistinct.java](https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFArrayDistinct.java) | -| **array** | ``` array_slice(array, start, length) ``` | NEW Returns the subset or range of elements. Example: array-slice(array(1, 2, 3, 4), 2 , 2) Result: 3,4 | [GenericUDFArraySlice](https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFArraySlice.java)  | +| **array\** | ``` sort_array(Array) ``` | Sorts the input array in ascending order according to the natural ordering of the array elements and returns it. 
| | +| **array\** | ``` array(obj1, obj2, .... objN) ``` | NEW The function returns an array of the same type as the input array with distinct values. Example: array('b', 'd', 'd', 'a') returns ['b', 'd', 'a']  | [GenericUDFArrayDistinct.java](https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFArrayDistinct.java) | +| **array\** | ``` array_slice(array, start, length) ``` | NEW Returns the subset or range of elements. Example: array_slice(array(1, 2, 3, 4), 2 , 2) Result: 3,4 | [GenericUDFArraySlice](https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFArraySlice.java)  | | **t** | ``` array_min((array(obj1, obj2, obj3...)) ``` | NEW The function returns the minimum value in the array with elements for which order is supported. Example: array_min(array(1, 3, 0, NULL)) Result: 0 | [GenericUDFArrayMin](https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFArrayMin.java) | | **t** | ``` array_max((array(obj1, obj2, obj3...)) ``` | NEW The function returns the maximum value in the array with elements for which order is supported. Example: array_max(array(1, 3, 0, NULL)) Result: 3 | [GenericUDFArrayMax](https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFArrayMax.java) | | **t** | ``` array_distinct(array(obj1, obj2, obj3...)) ``` | NEW The function returns an array of the same type as the input array with distinct values. Example: array_distinct(array('b', 'd', 'd', 'a')) Result:  ['b', 'd', 'a'] | [GenericUDFArrayDistinct](https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFArrayDistinct.java) | | **string** | ``` array_join(array, delimiter, replaceNull) ``` | NEW Concatenate the elements of an array with a specified delimiter.
Example: array_join(array(1, 2, NULL, 4), ',',':') Result: 1,2,:,4 | [GenericUDFArrayJoin](https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFArrayJoin.java) | -| **array** | ``` array_expect(array1, array2) ``` | NEW Returns an array of the elements in array1 but not in array2. Example: array_expect(array(1, 2, 3,4), array(2,3)) Result: [1,4] | [GenericUDFArrayExcept](https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFArrayExcept.java) | -| **array** | ``` array_intersect(array1, array2) ``` | NEW Returns an array of the elements in the intersection of array1 and array2, without duplicates. Example: array_intersect(array(1, 2, 3,4), array(1,2,3)) Result: [1,2,3] | [GenericUDFArrayIntersect](https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFArrayIntersect.java) | -| **array** | ``` array_union(array1, array2) ``` | NEW Returns an array of the elements in the union of array1 and array2 without duplicates. Example: array_union(array(1, 2, 2, 4), array(2, 3)) Result: [1, 2, 3, 4] | [GenericUDFArrayUnion](https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFArrayUnion.java) | -| **array** | ``` array_remove(array, element) ``` | NEW Removes all occurrences of elements from the array. Example: array_remove(array(1, 2, 3, 4, 2), 2) Result: [1, 3, 4] | [GenericUDFArrayRemove](https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFArrayRemove.java) | +| **array\** | ``` array_expect(array1, array2) ``` | NEW Returns an array of the elements in array1 but not in array2. 
Example: array_expect(array(1, 2, 3,4), array(2,3)) Result: [1,4] | [GenericUDFArrayExcept](https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFArrayExcept.java) | +| **array\** | ``` array_intersect(array1, array2) ``` | NEW Returns an array of the elements in the intersection of array1 and array2, without duplicates. Example: array_intersect(array(1, 2, 3,4), array(1,2,3)) Result: [1,2,3] | [GenericUDFArrayIntersect](https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFArrayIntersect.java) | +| **array\** | ``` array_union(array1, array2) ``` | NEW Returns an array of the elements in the union of array1 and array2 without duplicates. Example: array_union(array(1, 2, 2, 4), array(2, 3)) Result: [1, 2, 3, 4] | [GenericUDFArrayUnion](https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFArrayUnion.java) | +| **array\** | ``` array_remove(array, element) ``` | NEW Removes all occurrences of elements from the array. Example: array_remove(array(1, 2, 3, 4, 2), 2) Result: [1, 3, 4] | [GenericUDFArrayRemove](https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFArrayRemove.java) | ### Type Conversion Functions @@ -266,7 +266,7 @@ The following built-in type conversion functions are supported in Hive.  | --- | --- | --- | --- | | **binary** | ``` binary(string\|binary) ``` | Casts the parameter into a binary. | [GenericUDFBaseBinary](https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFBaseBinary.java) | | **Expected "=" to follow "type"** | ``` cast(expr as ) ``` | Converts the results of the expression expr to \. For example, cast('1' as bigint) will convert the string '1' to its integral representation. A null is returned if the conversion does not succeed. If cast(expr as boolean) Hive returns true for a non-empty string. 
| | -| **string or datetime** | ``` CAST(expr AS FORMAT ) ``` | Converts the expression to the specified using the provided . The if present follows the SQL:2016 standard specification. Currently only conversions between datetime and string data types are supported. | [GenericUDFCastFormat](https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFCastFormat.java) | +| **string or datetime** | ``` CAST(expr AS FORMAT ) ``` | Converts the expression to the specified \ using the provided \. The \ if present follows the SQL:2016 standard specification. Currently only conversions between datetime and string data types are supported. | [GenericUDFCastFormat](https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFCastFormat.java) | ### Conditional Functions The following built-in conditional functions are supported in Hive. diff --git a/content/docs/latest/language/languagemanual-cli.md b/content/docs/latest/language/languagemanual-cli.md index 2186f1be..590ea9e7 100644 --- a/content/docs/latest/language/languagemanual-cli.md +++ b/content/docs/latest/language/languagemanual-cli.md @@ -116,7 +116,7 @@ When `$HIVE_HOME/bin/hive` is run with the `-e` or `-f` option, it executes SQL Version 0.14 -As of Hive 0.14, can be from one of the Hadoop supported filesystems (HDFS, S3, etc.) as well. +As of Hive 0.14, \ can be from one of the Hadoop supported filesystems (HDFS, S3, etc.) as well. `$HIVE_HOME/bin/hive -f hdfs://:/hive-script.sql``$HIVE_HOME/bin/hive -f s3://mys3bucket/s3-script.sql` @@ -130,21 +130,22 @@ Use ";" (semicolon) to terminate commands. Comments in scripts can be specified | Command | Description | | --- | --- | -| quit exit | Use quit or exit to leave the interactive shell. | -| reset | Resets the configuration to the default values (as of Hive 0.10: see [HIVE-3202](https://issues.apache.org/jira/browse/HIVE-3202)). 
| -| set = | Sets the value of a particular configuration variable (key). **Note:** If you misspell the variable name, the CLI will not show an error. | -| set | Prints a list of configuration variables that are overridden by the user or Hive. | -| set -v | Prints all Hadoop and Hive configuration variables. | -| add FILE[S] * add JAR[S] * add ARCHIVE[S] * | Adds one or more files, jars, or archives to the list of resources in the distributed cache. See [Hive Resources]({{< ref "#hive-resources" >}}) below for more information. | -| add FILE[S] * add JAR[S] * add ARCHIVE[S] * | As of [Hive 1.2.0](https://issues.apache.org/jira/browse/HIVE-9664), adds one or more files, jars or archives to the list of resources in the distributed cache using an [Ivy](http://ant.apache.org/ivy/) URL of the form ivy://group:module:version?query_string. See [Hive Resources]({{< ref "#hive-resources" >}}) below for more information. | -| list FILE[S] list JAR[S] list ARCHIVE[S] | Lists the resources already added to the distributed cache. See [Hive Resources]({{< ref "#hive-resources" >}}) below for more information. | -| list FILE[S] * list JAR[S] * list ARCHIVE[S] * | Checks whether the given resources are already added to the distributed cache or not. See [Hive Resources]({{< ref "#hive-resources" >}}) below for more information. | -| delete FILE[S] * delete JAR[S] * delete ARCHIVE[S] * | Removes the resource(s) from the distributed cache. | -| delete FILE[S] * delete JAR[S] * delete ARCHIVE[S] * | As of [Hive 1.2.0](https://issues.apache.org/jira/browse/HIVE-9664), removes the resource(s) which were added using the from the distributed cache. See [Hive Resources]({{< ref "#hive-resources" >}}) below for more information. | -| ! | Executes a shell command from the Hive shell. | -| dfs | Executes a dfs command from the Hive shell. | -| | Executes a Hive query and prints results to standard output. | -| source | Executes a script file inside the CLI. 
| +| `quit exit` | Use quit or exit to leave the interactive shell. | +| `reset` | Resets the configuration to the default values (as of Hive 0.10: see [HIVE-3202](https://issues.apache.org/jira/browse/HIVE-3202)). | +| `set =` | Sets the value of a particular configuration variable (key). **Note:** If you misspell the variable name, the CLI will not show an error. | +| `set` | Prints a list of configuration variables that are overridden by the user or Hive. | +| `set -v` | Prints all Hadoop and Hive configuration variables. | +| `add FILE[S] *` `add JAR[S] *` `add ARCHIVE[S] *` | Adds one or more files, jars, or archives to the list of resources in the distributed cache. See [Hive Resources]({{< ref "#hive-resources" >}}) below for more information. | +| | | +| `add FILE[S] *` `add JAR[S] *` `add ARCHIVE[S] *` | As of [Hive 1.2.0](https://issues.apache.org/jira/browse/HIVE-9664), adds one or more files, jars or archives to the list of resources in the distributed cache using an [Ivy](http://ant.apache.org/ivy/) URL of the form ivy://group:module:version?query_string. See [Hive Resources]({{< ref "#hive-resources" >}}) below for more information. | +| `list FILE[S]` `list JAR[S]` `list ARCHIVE[S]` | Lists the resources already added to the distributed cache. See [Hive Resources]({{< ref "#hive-resources" >}}) below for more information. | +| `list FILE[S] *` `list JAR[S] *` `list ARCHIVE[S] *` | Checks whether the given resources are already added to the distributed cache or not. See [Hive Resources]({{< ref "#hive-resources" >}}) below for more information. | +| `delete FILE[S] *` `delete JAR[S] *` `delete ARCHIVE[S] *` | Removes the resource(s) from the distributed cache. | +| `delete FILE[S] *` `delete JAR[S] *` `delete ARCHIVE[S] *` | As of [Hive 1.2.0](https://issues.apache.org/jira/browse/HIVE-9664), removes the resource(s) which were added using the \ from the distributed cache. See [Hive Resources]({{< ref "#hive-resources" >}}) below for more information. 
| +| `! ` | Executes a shell command from the Hive shell. | +| `dfs ` | Executes a dfs command from the Hive shell. | +| `` | Executes a Hive query and prints results to standard output. | +| `source ` | Executes a script file inside the CLI. | Sample Usage: @@ -208,7 +209,7 @@ ADD { FILE[S] | JAR[S] | ARCHIVE[S] } * ``` -Also, we can mix and in the same ADD and DELETE commands. +Also, we can mix \ and \ in the same ADD and DELETE commands. ``` ADD { FILE[S] | JAR[S] | ARCHIVE[S] } { | } * * diff --git a/content/docs/latest/language/languagemanual-commands.md b/content/docs/latest/language/languagemanual-commands.md index 12939be1..af63f546 100644 --- a/content/docs/latest/language/languagemanual-commands.md +++ b/content/docs/latest/language/languagemanual-commands.md @@ -9,23 +9,23 @@ Commands are non-SQL statements such as setting a property or adding a resource. | Command | Description | | --- | --- | -| quit exit | Use quit or exit to leave the interactive shell. | -| reset | Resets the configuration to the default values (as of Hive 0.10: see [HIVE-3202](https://issues.apache.org/jira/browse/HIVE-3202)). Any configuration parameters that were set using the set command or -hiveconf parameter in hive commandline will get reset to default value.Note that this does not apply to configuration parameters that were set in set command using the "hiveconf:" prefix for the key name (for historic reasons). | -| set = | Sets the value of a particular configuration variable (key). **Note:** If you misspell the variable name, the CLI will not show an error. | -| set | Prints a list of configuration variables that are overridden by the user or Hive. | -| set -v | Prints all Hadoop and Hive configuration variables. | -| add FILE[S] * add JAR[S] * add ARCHIVE[S] * | Adds one or more files, jars, or archives to the list of resources in the distributed cache. See [Hive Resources]({{< ref "#hive-resources" >}}) for more information. 
| -| add FILE[S] * add JAR[S]  * add ARCHIVE[S] * | As of [Hive 1.2.0](https://issues.apache.org/jira/browse/HIVE-9664), adds one or more files, jars or archives to the list of resources in the distributed cache using an [Ivy](http://ant.apache.org/ivy/) URL of the form ivy://group:module:version?query_string. See [Hive Resources]({{< ref "#hive-resources" >}}) for more information. | -| list FILE[S] list JAR[S] list ARCHIVE[S] | Lists the resources already added to the distributed cache. See [Hive Resources]({{< ref "#hive-resources" >}}) for more information. | -| list FILE[S] * list JAR[S] * list ARCHIVE[S] * | Checks whether the given resources are already added to the distributed cache or not. See [Hive Resources]({{< ref "#hive-resources" >}}) for more information. | -| delete FILE[S] * delete JAR[S] * delete ARCHIVE[S] * | Removes the resource(s) from the distributed cache. | -| delete FILE[S] * delete JAR[S] * delete ARCHIVE[S] * | As of [Hive 1.2.0](https://issues.apache.org/jira/browse/HIVE-9664), removes the resource(s) which were added using the from the distributed cache. See [Hive Resources]({{< ref "#hive-resources" >}}) for more information. | -| ! | Executes a shell command from the Hive shell. | -| dfs | Executes a dfs command from the Hive shell. | -| | Executes a Hive query and prints results to standard output. | -| source FILE | Executes a script file inside the CLI. | -| compile `` AS GROOVY NAMED | This allows inline Groovy code to be compiled and be used as a UDF (as of Hive [0.13.0](https://issues.apache.org/jira/browse/HIVE-5252)). For a usage example, see [Nov. 2013 Hive Contributors Meetup Presentations – Using Dynamic Compilation with Hive](/attachments/27362054/HiveContrib-Nov13-groovy_plus_hive.pptx). | -| show processlist | Displays information about the operations currently running on HiveServer2. It helps to troubleshoot issues such as long running queries, connection starvation, etc. 
The command was introduced in [HIVE-27829](https://issues.apache.org/jira/browse/HIVE-27829). | +| `quit exit` | Use quit or exit to leave the interactive shell. | +| `reset` | Resets the configuration to the default values (as of Hive 0.10: see [HIVE-3202](https://issues.apache.org/jira/browse/HIVE-3202)). Any configuration parameters that were set using the set command or -hiveconf parameter in hive commandline will get reset to default value.Note that this does not apply to configuration parameters that were set in set command using the "hiveconf:" prefix for the key name (for historic reasons). | +| `set =` | Sets the value of a particular configuration variable (key). **Note:** If you misspell the variable name, the CLI will not show an error. | +| `set` | Prints a list of configuration variables that are overridden by the user or Hive. | +| `set -v` | Prints all Hadoop and Hive configuration variables. | +| `add FILE[S] *` `add JAR[S] *` `add ARCHIVE[S] *` | Adds one or more files, jars, or archives to the list of resources in the distributed cache. See [Hive Resources]({{< ref "#hive-resources" >}}) for more information. | +| `add FILE[S] *` `add JAR[S]  *` `add ARCHIVE[S] *` | As of [Hive 1.2.0](https://issues.apache.org/jira/browse/HIVE-9664), adds one or more files, jars or archives to the list of resources in the distributed cache using an [Ivy](http://ant.apache.org/ivy/) URL of the form ivy://group:module:version?query_string. See [Hive Resources]({{< ref "#hive-resources" >}}) for more information. | +| `list FILE[S]` `list JAR[S]` `list ARCHIVE[S]` | Lists the resources already added to the distributed cache. See [Hive Resources]({{< ref "#hive-resources" >}}) for more information. | +| `list FILE[S] *` `list JAR[S] *` `list ARCHIVE[S] *` | Checks whether the given resources are already added to the distributed cache or not. See [Hive Resources]({{< ref "#hive-resources" >}}) for more information. 
| +| `delete FILE[S] *` `delete JAR[S] *` `delete ARCHIVE[S] *` | Removes the resource(s) from the distributed cache. | +| `delete FILE[S] *`  `delete JAR[S] *`  `delete ARCHIVE[S] *` | As of [Hive 1.2.0](https://issues.apache.org/jira/browse/HIVE-9664), removes the resource(s) which were added using the \ from the distributed cache. See [Hive Resources]({{< ref "#hive-resources" >}}) for more information. | +| `! ` | Executes a shell command from the Hive shell. | +| `dfs ` | Executes a dfs command from the Hive shell. | +| `` | Executes a Hive query and prints results to standard output. | +| `source FILE ` | Executes a script file inside the CLI. | +| ``compile `` AS GROOVY NAMED `` | This allows inline Groovy code to be compiled and be used as a UDF (as of Hive [0.13.0](https://issues.apache.org/jira/browse/HIVE-5252)). For a usage example, see [Nov. 2013 Hive Contributors Meetup Presentations – Using Dynamic Compilation with Hive](/attachments/27362054/HiveContrib-Nov13-groovy_plus_hive.pptx). | +| `show processlist` | Displays information about the operations currently running on HiveServer2. It helps to troubleshoot issues such as long running queries, connection starvation, etc. The command was introduced in [HIVE-27829](https://issues.apache.org/jira/browse/HIVE-27829). 
| Sample Usage: diff --git a/content/docs/latest/language/languagemanual-ddl.md b/content/docs/latest/language/languagemanual-ddl.md index f08c198f..68a26d33 100644 --- a/content/docs/latest/language/languagemanual-ddl.md +++ b/content/docs/latest/language/languagemanual-ddl.md @@ -328,7 +328,7 @@ To change a table's SerDe or SERDEPROPERTIES, use the ALTER TABLE statement as d | Row Format | Description | | --- | --- | -| **RegEx**ROW FORMAT SERDE'org.apache.hadoop.hive.serde2.RegexSerDe'WITH SERDEPROPERTIES ("input.regex" = "")STORED AS TEXTFILE; | Stored as plain text file, translated by Regular Expression.The following example defines a table in the default Apache Weblog format.`CREATE` `TABLE` `apachelog (``host STRING,``identity STRING,``user` `STRING,``time` `STRING,``request STRING,``status STRING,``size` `STRING,``referer STRING,``agent STRING)``ROW FORMAT SERDE``'org.apache.hadoop.hive.serde2.RegexSerDe'``WITH` `SERDEPROPERTIES (``"input.regex"` `=``"([^]*) ([^]*) ([^]*) (-|\\[^\\]*\\]) ([^ \"]*|\"[^\"]*\") (-|[0-9]*) (-|[0-9]*)(?: ([^ \"]*|\".*\") ([^ \"]*|\".*\"))?"``)``STORED``AS` `TEXTFILE;`More about RegexSerDe can be found here in [HIVE-662](https://issues.apache.org/jira/browse/HIVE-662) and [HIVE-1719](https://issues.apache.org/jira/browse/HIVE-1719). 
| +| **RegEx**ROW FORMAT SERDE'org.apache.hadoop.hive.serde2.RegexSerDe'WITH SERDEPROPERTIES ("input.regex" = "\")STORED AS TEXTFILE; | Stored as plain text file, translated by Regular Expression.The following example defines a table in the default Apache Weblog format.`CREATE` `TABLE` `apachelog (``host STRING,``identity STRING,``user` `STRING,``time` `STRING,``request STRING,``status STRING,``size` `STRING,``referer STRING,``agent STRING)``ROW FORMAT SERDE``'org.apache.hadoop.hive.serde2.RegexSerDe'``WITH` `SERDEPROPERTIES (``"input.regex"` `=``"([^]*) ([^]*) ([^]*) (-|\\[^\\]*\\]) ([^ \"]*|\"[^\"]*\") (-|[0-9]*) (-|[0-9]*)(?: ([^ \"]*|\".*\") ([^ \"]*|\".*\"))?"``)``STORED``AS` `TEXTFILE;`More about RegexSerDe can be found here in [HIVE-662](https://issues.apache.org/jira/browse/HIVE-662) and [HIVE-1719](https://issues.apache.org/jira/browse/HIVE-1719). | | **JSON** ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe' STORED AS TEXTFILE | Stored as plain text file in JSON format.The JsonSerDe for JSON files is available in [Hive 0.12](https://issues.apache.org/jira/browse/HIVE-4895) and later.In some distributions, a reference to hive-hcatalog-core.jar is required.`ADD JAR /usr/lib/hive-hcatalog/lib/hive-hcatalog-core.jar;CREATE` `TABLE` `my_table(a string, b``bigint``, ...)``ROW FORMAT SERDE``'org.apache.hive.hcatalog.data.JsonSerDe'``STORED``AS` `TEXTFILE;`The JsonSerDe was moved to Hive from HCatalog and before it was in hive-contrib project. 
It was added to the Hive distribution by [HIVE-4895](https://issues.apache.org/jira/browse/HIVE-4895).An Amazon SerDe is available at `s3://elasticmapreduce/samples/hive-ads/libs/jsonserde.jar` for releases prior to 0.12.0.The JsonSerDe for JSON files is available in [Hive 0.12](https://issues.apache.org/jira/browse/HIVE-4895) and later.Starting in Hive 3.0.0, JsonSerDe is added to Hive Serde as "org.apache.hadoop.hive.serde2.JsonSerDe" ([HIVE-19211](https://issues.apache.org/jira/browse/HIVE-19211)).`CREATE` `TABLE` `my_table(a string, b``bigint``, ...)``ROW FORMAT SERDE``'org.apache.hadoop.hive.serde2.JsonSerDe'``STORED``AS` `TEXTFILE;`Or `STORED AS JSONFILE` is supported starting in Hive 4.0.0 ([HIVE-19899](https://issues.apache.org/jira/browse/HIVE-19899)), so you can create table as follows:`CREATE` `TABLE` `my_table(a string, b``bigint``, ...) STORED AS JSONFILE;` | | **CSV/TSV**ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde' STORED AS TEXTFILE | Stored as plain text file in CSV / TSV format. The CSVSerde is available in [Hive 0.14](https://issues.apache.org/jira/browse/HIVE-7777) and greater.The following example creates a TSV (Tab-separated) file.``CREATE` `TABLE` `my_table(a string, b string, ...)`ROW FORMAT SERDE``'org.apache.hadoop.hive.serde2.OpenCSVSerde'``WITH` `SERDEPROPERTIES (``"separatorChar"` `=``"\t"``,``"quoteChar"`     `=``"'"``,``"escapeChar"`    `=``"\\"``)``STORED``AS` `TEXTFILE;`Default properties for SerDe is Comma-Separated (CSV) file `DEFAULT_ESCAPE_CHARACTER \``DEFAULT_QUOTE_CHARACTER  "``DEFAULT_SEPARATOR        ,`This SerDe works for most CSV data, but does not handle embedded newlines. To use the SerDe, specify the fully qualified class name org.apache.hadoop.hive.serde2.OpenCSVSerde.  Documentation is based on original documentation at .**Limitations**This SerDe treats all columns to be of type String. 
Even if you create a table with non-string column types using this SerDe, the DESCRIBE TABLE output would show string column type. The type information is retrieved from the SerDe. To convert columns to the desired type in a table, you can create a view over the table that does the CAST to the desired type.The CSV SerDe is based on , and was added to the Hive distribution in [HIVE-7777](https://issues.apache.org/jira/browse/HIVE-7777).The CSVSerde has been built and tested against Hive 0.14 and later, and uses [Open-CSV](http://opencsv.sourceforge.net/) 2.3 which is bundled with the Hive distribution.For general information about SerDes, see [Hive SerDe](/community/resources/developerguide#hive-serde) in the Developer Guide. Also see [SerDe](/docs/latest/user/serde) for details about input and output processing. | @@ -397,7 +397,7 @@ STORED AS SEQUENCEFILE; The above statement lets you create the same table as the previous table. -In the previous examples the data is stored in /page_view. Specify a value for the key `[hive.metastore.warehouse.dir]({{< ref "#hive-metastore-warehouse-dir" >}})` in the Hive config file hive-site.xml. +In the previous examples the data is stored in \/page_view. Specify a value for the key `[hive.metastore.warehouse.dir]({{< ref "#hive-metastore-warehouse-dir" >}})` in the Hive config file hive-site.xml. #### External Tables diff --git a/content/docs/latest/language/languagemanual-dml.md b/content/docs/latest/language/languagemanual-dml.md index 6fc6fdca..42dba68c 100644 --- a/content/docs/latest/language/languagemanual-dml.md +++ b/content/docs/latest/language/languagemanual-dml.md @@ -333,7 +333,7 @@ SQL Standard requires that an error is raised if the ON clause is such that more * 1, 2, or 3 WHEN clauses may be present; at most 1 of each type:  UPDATE/DELETE/INSERT. * WHEN NOT MATCHED must be the last WHEN clause. -* If both UPDATE and DELETE clauses are present, the first one in the statement must include [AND ]. 
+* If both UPDATE and DELETE clauses are present, the first one in the statement must include [AND \]. * Vectorization will be turned off for merge operations.  This is automatic and requires no action on the part of the user.  Non-delete operations are not affected.  Tables with deleted data can still be queried using vectorization. ##### Examples diff --git a/content/docs/latest/language/languagemanual-lateralview.md b/content/docs/latest/language/languagemanual-lateralview.md index b212ed5d..ed4ad42e 100644 --- a/content/docs/latest/language/languagemanual-lateralview.md +++ b/content/docs/latest/language/languagemanual-lateralview.md @@ -32,7 +32,7 @@ Consider the following base table named `pageAds`. It has two columns: `pageid` | Column name | Column type | | --- | --- | | pageid | STRING | -| adid_list | Array | +| adid_list | Array\ | An example table with two rows: diff --git a/content/docs/latest/language/languagemanual-orc.md b/content/docs/latest/language/languagemanual-orc.md index 7cdb5119..4b7f5919 100644 --- a/content/docs/latest/language/languagemanual-orc.md +++ b/content/docs/latest/language/languagemanual-orc.md @@ -71,10 +71,10 @@ The parameters are all placed in the TBLPROPERTIES (see [Create Table]({{< ref " | orc.compress | ZLIB | high level compression (one of NONE, ZLIB, SNAPPY) | | orc.compress.size | 262,144 | number of bytes in each compression chunk | | orc.stripe.size | 67,108,864 | number of bytes in each stripe | -| orc.row.index.stride | 10,000 | number of rows between index entries (must be >= 1000) | +| orc.row.index.stride | 10,000 | number of rows between index entries (must be \>= 1000) | | orc.create.index | true | whether to create row indexes | | orc.bloom.filter.columns | "" | comma separated list of column names for which bloom filter should be created | -| orc.bloom.filter.fpp | 0.05 | false positive probability for bloom filter (must >0.0 and <1.0) | +| orc.bloom.filter.fpp | 0.05 | false positive probability for bloom 
filter (must \>0.0 and \<1.0) | For example, creating an ORC stored table without compression: @@ -175,9 +175,9 @@ Specifying `--skip-dump` along with `--recover` will perform recovery without Specifying `--backup-path` with a *new-path* will let the recovery tool move corrupted files to the specified backup path (default: /tmp). -** is the URI of the ORC file. +*\* is the URI of the ORC file. -** is the URI of the ORC file or directory. From [Hive 1.3.0](https://issues.apache.org/jira/browse/HIVE-11669) onward, this URI can be a directory containing ORC files. +*\* is the URI of the ORC file or directory. From [Hive 1.3.0](https://issues.apache.org/jira/browse/HIVE-11669) onward, this URI can be a directory containing ORC files. ## ORC Configuration Parameters diff --git a/content/docs/latest/language/scheduled-queries.md b/content/docs/latest/language/scheduled-queries.md index 7c16389e..aa8b4a42 100644 --- a/content/docs/latest/language/scheduled-queries.md +++ b/content/docs/latest/language/scheduled-queries.md @@ -32,7 +32,7 @@ Hive has it’s scheduled query interface built into the language itself for eas ## Create Scheduled query syntax -**CREATE SCHEDULED QUERY +**CREATE SCHEDULED QUERY \ [``](/docs/latest/language/scheduled-queries#schedulespecification-syntax) [[``](/docs/latest/language/scheduled-queries#executedas-syntax)] [[``](/docs/latest/language/scheduled-queries#enablespecification-syntax)] @@ -41,7 +41,7 @@ Hive has it’s scheduled query interface built into the language itself for eas ## Alter Scheduled query syntax -**ALTER SCHEDULED QUERY ( +**ALTER SCHEDULED QUERY \ ( [``](/docs/latest/language/scheduled-queries#schedulespecification-syntax)| [``](/docs/latest/language/scheduled-queries#executedas-syntax)| [``](/docs/latest/language/scheduled-queries#enablespecification-syntax)| @@ -54,7 +54,7 @@ Hive has it’s scheduled query interface built into the language itself for eas -**DROP SCHEDULED QUERY ;** +**DROP SCHEDULED QUERY \;** @@ -64,7 +64,7 
@@ Schedules can be specified using CRON expressions or for common cases there is a ### CRON based schedule syntax -**CRON ** +**CRON \** where quartz_schedule_expression is quoted schedule in the Quartz format @@ -76,7 +76,7 @@ For example the `CRON '0 */10 * * * ? *'`  expression will fire every 10 minut To give a more readable way to declare schedules EVERY can be used. -**EVERY [] (SECOND|MINUTE|HOUR) [(OFFSET BY|AT) ]** +**EVERY [\] (SECOND|MINUTE|HOUR) [(OFFSET BY|AT) \]** the format makes it possible to declare schedules in a more readable way: @@ -90,7 +90,7 @@ EVERY DAY AT '11:35:30'** ## ExecutedAs syntax -**EXECUTED AS ** +**EXECUTED AS \** Scheduled queries are executed as the declaring user by default; but people with admin privileges might be able to change the executing user. @@ -106,7 +106,7 @@ In case there are in-flight scheduled executions at the time when the correspon ## Defined AS syntax -**[DEFINED] AS ** +**[DEFINED] AS \** The “query” is a single statement expression to be scheduled for execution. diff --git a/content/docs/latest/language/sql-standard-based-hive-authorization.md b/content/docs/latest/language/sql-standard-based-hive-authorization.md index 9792209c..9dfa86e6 100644 --- a/content/docs/latest/language/sql-standard-based-hive-authorization.md +++ b/content/docs/latest/language/sql-standard-based-hive-authorization.md @@ -179,7 +179,7 @@ principal_specification Revokes the membership of the roles from the user/roles in the FROM clause. -As of Hive 0.14.0, revoking just the ADMIN OPTION is possible with the use of REVOKE ADMIN OPTION FOR ([HIVE-6252](https://issues.apache.org/jira/browse/HIVE-6252)). +As of Hive 0.14.0, revoking just the ADMIN OPTION is possible with the use of REVOKE ADMIN OPTION FOR \ ([HIVE-6252](https://issues.apache.org/jira/browse/HIVE-6252)). 
#### Show Role Grant @@ -261,7 +261,7 @@ priv_type : INSERT | SELECT | UPDATE | DELETE | ALL ``` -If a user is granted a privilege WITH GRANT OPTION on a table or view, then the user can also grant/revoke privileges of other users and roles on those objects. As of Hive 0.14.0, the grant option for a privilege can be removed while still keeping the privilege by using REVOKE GRANT OPTION FOR ([HIVE-7404](https://issues.apache.org/jira/browse/HIVE-7404)). +If a user is granted a privilege WITH GRANT OPTION on a table or view, then the user can also grant/revoke privileges of other users and roles on those objects. As of Hive 0.14.0, the grant option for a privilege can be removed while still keeping the privilege by using REVOKE GRANT OPTION FOR \ ([HIVE-7404](https://issues.apache.org/jira/browse/HIVE-7404)). Note that in case of the REVOKE statement, the DROP-BEHAVIOR option of CASCADE is not currently supported (which is in SQL standard). As a result, the revoke statement will not drop any dependent privileges. For details on CASCADE behavior, you can check the [Postgres revoke documentation](http://www.postgresql.org/docs/8.4/static/sql-revoke.html). 
diff --git a/content/docs/latest/language/supported-features-apache-hive-2-1.md b/content/docs/latest/language/supported-features-apache-hive-2-1.md
index a56a46ba..f906a8ef 100644
--- a/content/docs/latest/language/supported-features-apache-hive-2-1.md
+++ b/content/docs/latest/language/supported-features-apache-hive-2-1.md
@@ -31,7 +31,7 @@ date: 2024-12-12
 | E051 | Basic query specification | Yes | |
 | E051-01 | SELECT DISTINCT | Yes | |
 | E051-02 | GROUP BY clause | Partial | Empty grouping sets not supported |
-| E051-04 | GROUP BY can contain columns not in | Yes | |
 | E051-05 | Select list items can be renamed | Yes | |
 | E051-06 | HAVING clause | Yes | |
 | E051-07 | Qualified * in select list | Yes | |
@@ -126,10 +126,10 @@ date: 2024-12-12
 | F651 | Catalog name qualifiers | Yes | |
 | F846 | Octet support in regular expression operators | Yes | |
 | F847 | Nonconstant regular expressions | Yes | |
-| F850 | Top-level in | Yes | |
-| F851 | in subqueries | Yes | |
-| F852 | Top-level in views | Yes | |
-| F855 | Nested in | Yes | |
+| F850 | Top-level \ in \ | Yes | |
+| F851 | \ in subqueries | Yes | |
+| F852 | Top-level \ in views | Yes | |
+| F855 | Nested \ in \ | Yes | |
 | S023 | Basic structured types | Yes | |
 | S091 | Basic array support | Yes | |
 | S091-01 | Arrays of built-in data types | Yes | |
diff --git a/content/docs/latest/language/supported-features-apache-hive-2-3.md b/content/docs/latest/language/supported-features-apache-hive-2-3.md
index fc16bb93..ebd8ee15 100644
--- a/content/docs/latest/language/supported-features-apache-hive-2-3.md
+++ b/content/docs/latest/language/supported-features-apache-hive-2-3.md
@@ -32,7 +32,7 @@ date: 2024-12-12
 | E051 | Basic query specification | Yes | |
 | E051-01 | SELECT DISTINCT | Yes | |
 | E051-02 | GROUP BY clause | Partial | Empty grouping sets not supported |
-| E051-04 | GROUP BY can contain columns not in | Yes | |
 | E051-05 | Select list items can be renamed | Yes | |
 | E051-06 | HAVING clause | Yes | |
 | E051-07 | Qualified * in select list | Yes | |
@@ -142,10 +142,10 @@ date: 2024-12-12
 | F651 | Catalog name qualifiers | Yes | |
 | F846 | Octet support in regular expression operators | Yes | |
 | F847 | Nonconstant regular expressions | Yes | |
-| F850 | Top-level in | Yes | |
-| F851 | in subqueries | Yes | |
-| F852 | Top-level in views | Yes | |
-| F855 | Nested in | Yes | |
+| F850 | Top-level \ in \ | Yes | |
+| F851 | \ in subqueries | Yes | |
+| F852 | Top-level \ in views | Yes | |
+| F855 | Nested \ in \ | Yes | |
 | S023 | Basic structured types | Yes | |
 | S091 | Basic array support | Yes | |
 | S091-01 | Arrays of built-in data types | Yes | |
diff --git a/content/docs/latest/language/supported-features.md b/content/docs/latest/language/supported-features.md
index 13466c2b..0522546b 100644
--- a/content/docs/latest/language/supported-features.md
+++ b/content/docs/latest/language/supported-features.md
@@ -36,7 +36,7 @@ This table covers all mandatory features from [SQL:2016](https://en.wikipedia.o
 | E051 | Basic query specification | Yes | Mandatory | |
 | E051-01 | SELECT DISTINCT | Yes | Mandatory | |
 | E051-02 | GROUP BY clause | Yes | Mandatory | |
-| E051-04 | GROUP BY can contain columns not in | Yes | Mandatory | |
 | E051-05 | Select list items can be renamed | Yes | Mandatory | |
 | E051-06 | HAVING clause | Yes | Mandatory | |
 | E051-07 | Qualified * in select list | Yes | Mandatory | |
@@ -196,14 +196,14 @@ This table covers all mandatory features from [SQL:2016](https://en.wikipedia.o
 | F812 | Basic flagging | No | Mandatory | |
 | F841 | LIKE_REGEX predicate | Partial | Optional | use RLIKE instead |
 | F847 | Nonconstant regular expressions | Yes | Optional | |
-| F850 | Top level in | Yes | Optional | |
-| F851 | in subqueries | Yes | Optional | |
-| F852 | Top-level in views | Yes | Optional | |
-| F855 | Nested in | Yes | Optional | |
-| F856 | Nested in | Yes | Optional | |
-| F857 | Top-level in | Yes | Optional | |
-| F858 | in subqueries | Yes | Optional | |
-| F859 | Top-level in views | Yes | Optional | |
+| F850 | Top level \ in \ | Yes | Optional | |
+| F851 | \ in subqueries | Yes | Optional | |
+| F852 | Top-level \ in views | Yes | Optional | |
+| F855 | Nested \ in \ | Yes | Optional | |
+| F856 | Nested \ in \ | Yes | Optional | |
+| F857 | Top-level \ in \ | Yes | Optional | |
+| F858 | \ in subqueries | Yes | Optional | |
+| F859 | Top-level \ in views | Yes | Optional | |
 | S011 | Distinct data types | No | Mandatory | |
 | S091 | Basic array support | Partial | Optional | Syntax non-standard.No option to declare max cardinality.SIZE instead of CARDINALITY. |
 | S091-01 | Arrays of built-in data types | Partial | Optional | Syntax non-standard |