diff --git a/src/UserGuide/Master/Table/SQL-Manual/Basis-Function.md b/src/UserGuide/Master/Table/SQL-Manual/Basis-Function.md index d8c967202..303809766 100644 --- a/src/UserGuide/Master/Table/SQL-Manual/Basis-Function.md +++ b/src/UserGuide/Master/Table/SQL-Manual/Basis-Function.md @@ -156,30 +156,30 @@ SELECT LEAST(temperature,humidity) FROM table2; ### 2.2 Supported Aggregate Functions -| Function Name | Description | Allowed Input Types | Output Type | -|:-----------------------|:---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:-------------------------------------------| -| COUNT | Counts the number of data points. | All types | INT64 | -| COUNT_IF | COUNT_IF(exp) counts the number of rows that satisfy a specified boolean expression. | `exp` must be a boolean expression,(e.g. `count_if(temperature>20)`) | INT64 | -| APPROX_COUNT_DISTINCT | The APPROX_COUNT_DISTINCT(x[, maxStandardError]) function provides an approximation of COUNT(DISTINCT x), returning the estimated number of distinct input values. | `x`: The target column to be calculated, supports all data types.
`maxStandardError` (optional): Specifies the maximum standard error allowed for the function's result. Valid range is [0.0040625, 0.26]. Defaults to 0.023 if not specified. | INT64 | -| APPROX_MOST_FREQUENT | The APPROX_MOST_FREQUENT(x, k, capacity) function is used to approximately calculate the top k most frequent elements in a dataset. It returns a JSON-formatted string where the keys are the element values and the values are their corresponding approximate frequencies. (Available since V2.0.5.1) | `x` : The column to be calculated, supporting all existing data types in IoTDB;
`k`: The number of top-k most frequent values to return;
`capacity`: The number of buckets used for computation, which relates to memory usage—a larger value reduces error but consumes more memory, while a smaller value increases error but uses less memory. | STRING | -| SUM | Calculates the sum. | INT32 INT64 FLOAT DOUBLE | DOUBLE | -| AVG | Calculates the average. | INT32 INT64 FLOAT DOUBLE | DOUBLE | -| MAX | Finds the maximum value. | All types | Same as input type | -| MIN | Finds the minimum value. | All types | Same as input type | -| FIRST | Finds the value with the smallest timestamp that is not NULL. | All types | Same as input type | -| LAST | Finds the value with the largest timestamp that is not NULL. | All types | Same as input type | -| STDDEV | Alias for STDDEV_SAMP, calculates the sample standard deviation. | INT32 INT64 FLOAT DOUBLE | DOUBLE | -| STDDEV_POP | Calculates the population standard deviation. | INT32 INT64 FLOAT DOUBLE | DOUBLE | -| STDDEV_SAMP | Calculates the sample standard deviation. | INT32 INT64 FLOAT DOUBLE | DOUBLE | -| VARIANCE | Alias for VAR_SAMP, calculates the sample variance. | INT32 INT64 FLOAT DOUBLE | DOUBLE | -| VAR_POP | Calculates the population variance. | INT32 INT64 FLOAT DOUBLE | DOUBLE | -| VAR_SAMP | Calculates the sample variance. | INT32 INT64 FLOAT DOUBLE | DOUBLE | -| EXTREME | Finds the value with the largest absolute value. If the largest absolute values of positive and negative values are equal, returns the positive value. | INT32 INT64 FLOAT DOUBLE | Same as input type | -| MODE | Finds the mode. Note: 1. There is a risk of memory exception when the number of distinct values in the input sequence is too large; 2. If all elements have the same frequency, i.e., there is no mode, a random element is returned; 3. If there are multiple modes, a random mode is returned; 4. NULL values are also counted in frequency, so even if not all values in the input sequence are NULL, the final result may still be NULL. | All types | Same as input type | -| MAX_BY | MAX_BY(x, y) finds the value of x corresponding to the maximum y in the binary input x and y. MAX_BY(time, x) returns the timestamp when x is at its maximum. | x and y can be of any type | Same as the data type of the first input x | -| MIN_BY | MIN_BY(x, y) finds the value of x corresponding to the minimum y in the binary input x and y. MIN_BY(time, x) returns the timestamp when x is at its minimum. | x and y can be of any type | Same as the data type of the first input x | -| FIRST_BY | FIRST_BY(x, y) finds the value of x in the same row when y is the first non-null value. | x and y can be of any type | Same as the data type of the first input x | -| LAST_BY | LAST_BY(x, y) finds the value of x in the same row when y is the last non-null value. | x and y can be of any type | Same as the data type of the first input x | +| Function Name | Description | Allowed Input Types | Output Type | +|:-----------------------|:---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:-------------------------------------------| +| COUNT | Counts the number of data points. | All types | INT64 | +| COUNT_IF | COUNT_IF(exp) counts the number of rows that satisfy a specified boolean expression. | `exp` must be a boolean expression,(e.g. `count_if(temperature>20)`) | INT64 | +| APPROX_COUNT_DISTINCT | The APPROX_COUNT_DISTINCT(x[, maxStandardError]) function provides an approximation of COUNT(DISTINCT x), returning the estimated number of distinct input values. | `x`: The target column to be calculated, supports all data types.
`maxStandardError` (optional): Specifies the maximum standard error allowed for the function's result. Valid range is [0.0040625, 0.26]. Defaults to 0.023 if not specified. | INT64 | +| APPROX_MOST_FREQUENT | The APPROX_MOST_FREQUENT(x, k, capacity) function is used to approximately calculate the top k most frequent elements in a dataset. It returns a JSON-formatted string where the keys are the element values and the values are their corresponding approximate frequencies. (Available since V2.0.5.1) | `x` : The column to be calculated, supporting all existing data types in IoTDB;
`k`: The number of top-k most frequent values to return;
`capacity`: The number of buckets used for computation, which relates to memory usage—a larger value reduces error but consumes more memory, while a smaller value increases error but uses less memory. | STRING | +| SUM | Calculates the sum. | INT32 INT64 FLOAT DOUBLE | DOUBLE | +| AVG | Calculates the average. | INT32 INT64 FLOAT DOUBLE | DOUBLE | +| MAX | Finds the maximum value. | All types | Same as input type | +| MIN | Finds the minimum value. | All types | Same as input type | +| FIRST | Finds the value with the smallest timestamp that is not NULL. | All types | Same as input type | +| LAST | Finds the value with the largest timestamp that is not NULL. | All types | Same as input type | +| STDDEV | Alias for STDDEV_SAMP, calculates the sample standard deviation. | INT32 INT64 FLOAT DOUBLE | DOUBLE | +| STDDEV_POP | Calculates the population standard deviation. | INT32 INT64 FLOAT DOUBLE | DOUBLE | +| STDDEV_SAMP | Calculates the sample standard deviation. | INT32 INT64 FLOAT DOUBLE | DOUBLE | +| VARIANCE | Alias for VAR_SAMP, calculates the sample variance. | INT32 INT64 FLOAT DOUBLE | DOUBLE | +| VAR_POP | Calculates the population variance. | INT32 INT64 FLOAT DOUBLE | DOUBLE | +| VAR_SAMP | Calculates the sample variance. | INT32 INT64 FLOAT DOUBLE | DOUBLE | +| EXTREME | Finds the value with the largest absolute value. If the largest absolute values of positive and negative values are equal, returns the positive value. | INT32 INT64 FLOAT DOUBLE | Same as input type | +| MODE | Finds the mode. Note: 1. There is a risk of memory exception when the number of distinct values in the input sequence is too large; 2. If all elements have the same frequency, i.e., there is no mode, a random element is returned; 3. If there are multiple modes, a random mode is returned; 4. NULL values are also counted in frequency, so even if not all values in the input sequence are NULL, the final result may still be NULL. | All types | Same as input type | +| MAX_BY | MAX_BY(x, y) finds the value of x corresponding to the maximum y in the binary input x and y. MAX_BY(time, x) returns the timestamp when x is at its maximum. | x and y can be of any type | Same as the data type of the first input x | +| MIN_BY | MIN_BY(x, y) finds the value of x corresponding to the minimum y in the binary input x and y. MIN_BY(time, x) returns the timestamp when x is at its minimum. | x and y can be of any type | Same as the data type of the first input x | +| FIRST_BY | FIRST_BY(x, y) finds the value of x in the same row when y is the first non-null value. | x and y can be of any type | Same as the data type of the first input x | +| LAST_BY | LAST_BY(x, y) finds the value of x in the same row when y is the last non-null value. | x and y can be of any type | Same as the data type of the first input x | ### 2.3 Examples @@ -686,7 +686,7 @@ FROM table1; ``` -Result**:** +Result: ```Plain +-----------------------------+-----------------------------+ @@ -715,6 +715,59 @@ Total line number = 18 It costs 0.319s ``` +### 4.3 Extract Function + +This function is used to extract the value of a specific part of a date. (Supported from version V2.0.6) + +#### 4.3.1 Syntax Definition + +```SQL +EXTRACT (identifier FROM expression) +``` + +* Parameter Description + * **expression**: `TIMESTAMP` type or a time constant + * **identifier**: The valid ranges and corresponding return value types are shown in the table below. + + | Valid Range | Return Type | Return Range | + |----------------------|---------------|--------------------| + | `YEAR` | `INT64` | `/` | + | `QUARTER` | `INT64` | `1-4` | + | `MONTH` | `INT64` | `1-12` | + | `WEEK` | `INT64` | `1-53` | + | `DAY_OF_MONTH (DAY)` | `INT64` | `1-31` | + | `DAY_OF_WEEK (DOW)` | `INT64` | `1-7` | + | `DAY_OF_YEAR (DOY)` | `INT64` | `1-366` | + | `HOUR` | `INT64` | `0-23` | + | `MINUTE` | `INT64` | `0-59` | + | `SECOND` | `INT64` | `0-59` | + | `MS` | `INT64` | `0-999` | + | `US` | `INT64` | `0-999` | + | `NS` | `INT64` | `0-999` | + + +#### 4.3.2 Usage Example + +Using table1 from the [Sample Data](../Reference/Sample-Data.md) as the source data, query the average temperature for the first 12 hours of each day within a certain period. + +```SQL +IoTDB:database1> select format('%1$tY-%1$tm-%1$td',date_bin(1d,time)) as fmtdate,avg(temperature) as avgtp from table1 where time >= 2024-11-26T00:00:00 and time <= 2024-11-30T23:59:59 and extract(hour from time) <= 12 group by date_bin(1d,time) order by date_bin(1d,time) ++----------+-----+ +| fmtdate|avgtp| ++----------+-----+ +|2024-11-28| 86.0| +|2024-11-29| 85.0| +|2024-11-30| 90.0| ++----------+-----+ +Total line number = 3 +It costs 0.041s +``` + +Introduction to the `Format` function: [Format Function](../SQL-Manual/Basis-Function.md#_7-2-format-function) + +Introduction to the `Date_bin` function: [Date_bin Funtion](../SQL-Manual/Basis-Function.md#_4-2-date-bin-interval-timestamp-timestamp-timestamp) + + ## 5. Mathematical Functions and Operators ### 5.1 Mathematical Operators diff --git a/src/UserGuide/latest-Table/SQL-Manual/Basis-Function.md b/src/UserGuide/latest-Table/SQL-Manual/Basis-Function.md index 65ba20146..303809766 100644 --- a/src/UserGuide/latest-Table/SQL-Manual/Basis-Function.md +++ b/src/UserGuide/latest-Table/SQL-Manual/Basis-Function.md @@ -686,7 +686,7 @@ FROM table1; ``` -Result**:** +Result: ```Plain +-----------------------------+-----------------------------+ @@ -715,6 +715,59 @@ Total line number = 18 It costs 0.319s ``` +### 4.3 Extract Function + +This function is used to extract the value of a specific part of a date. (Supported from version V2.0.6) + +#### 4.3.1 Syntax Definition + +```SQL +EXTRACT (identifier FROM expression) +``` + +* Parameter Description + * **expression**: `TIMESTAMP` type or a time constant + * **identifier**: The valid ranges and corresponding return value types are shown in the table below. + + | Valid Range | Return Type | Return Range | + |----------------------|---------------|--------------------| + | `YEAR` | `INT64` | `/` | + | `QUARTER` | `INT64` | `1-4` | + | `MONTH` | `INT64` | `1-12` | + | `WEEK` | `INT64` | `1-53` | + | `DAY_OF_MONTH (DAY)` | `INT64` | `1-31` | + | `DAY_OF_WEEK (DOW)` | `INT64` | `1-7` | + | `DAY_OF_YEAR (DOY)` | `INT64` | `1-366` | + | `HOUR` | `INT64` | `0-23` | + | `MINUTE` | `INT64` | `0-59` | + | `SECOND` | `INT64` | `0-59` | + | `MS` | `INT64` | `0-999` | + | `US` | `INT64` | `0-999` | + | `NS` | `INT64` | `0-999` | + + +#### 4.3.2 Usage Example + +Using table1 from the [Sample Data](../Reference/Sample-Data.md) as the source data, query the average temperature for the first 12 hours of each day within a certain period. + +```SQL +IoTDB:database1> select format('%1$tY-%1$tm-%1$td',date_bin(1d,time)) as fmtdate,avg(temperature) as avgtp from table1 where time >= 2024-11-26T00:00:00 and time <= 2024-11-30T23:59:59 and extract(hour from time) <= 12 group by date_bin(1d,time) order by date_bin(1d,time) ++----------+-----+ +| fmtdate|avgtp| ++----------+-----+ +|2024-11-28| 86.0| +|2024-11-29| 85.0| +|2024-11-30| 90.0| ++----------+-----+ +Total line number = 3 +It costs 0.041s +``` + +Introduction to the `Format` function: [Format Function](../SQL-Manual/Basis-Function.md#_7-2-format-function) + +Introduction to the `Date_bin` function: [Date_bin Funtion](../SQL-Manual/Basis-Function.md#_4-2-date-bin-interval-timestamp-timestamp-timestamp) + + ## 5. Mathematical Functions and Operators ### 5.1 Mathematical Operators diff --git a/src/zh/UserGuide/Master/Table/SQL-Manual/Basis-Function.md b/src/zh/UserGuide/Master/Table/SQL-Manual/Basis-Function.md index 6ed554c8b..e16267f44 100644 --- a/src/zh/UserGuide/Master/Table/SQL-Manual/Basis-Function.md +++ b/src/zh/UserGuide/Master/Table/SQL-Manual/Basis-Function.md @@ -159,8 +159,8 @@ SELECT LEAST(temperature,humidity) FROM table2; |-----------------------|------------------------------------------------------------------------------------------------------------------------------------------|-----------------------------------------------------------------------------------------|------------------| | COUNT | 计算数据点数。 | 所有类型 | INT64 | | COUNT_IF | COUNT_IF(exp) 用于统计满足指定布尔表达式的记录行数 | exp 必须是一个布尔类型的表达式,例如 count_if(temperature>20) | INT64 | -| APPROX_COUNT_DISTINCT | APPROX_COUNT_DISTINCT(x[,maxStandardError]) 函数提供 COUNT(DISTINCT x) 的近似值,返回不同输入值的近似个数。 | `x`:待计算列,支持所有类型;
`maxStandardError`:指定该函数应产生的最大标准误差,取值范围[0.0040625, 0.26],未指定值时默认0.023。 | INT64 | -| APPROX_MOST_FREQUENT | APPROX_MOST_FREQUENT(x, k, capacity) 函数用于近似计算数据集中出现频率最高的前 k 个元素。它返回一个JSON 格式的字符串,其中键是该元素的值,值是该元素对应的近似频率。(V 2.0.5.1 及以后版本支持) | `x`:待计算列,支持 IoTDB 现有所有的数据类型;
`k`:返回出现频率最高的 k 个值;
`capacity`: 用于计算的桶的数量,跟内存占用相关:其值越大误差越小,但占用内存更大,反之capacity值越小误差越大,但占用内存更小。 | STRING | +| APPROX_COUNT_DISTINCT | APPROX_COUNT_DISTINCT(x[,maxStandardError]) 函数提供 COUNT(DISTINCT x) 的近似值,返回不同输入值的近似个数。 | `x`:待计算列,支持所有类型;
`maxStandardError`:指定该函数应产生的最大标准误差,取值范围[0.0040625, 0.26],未指定值时默认0.023。 | INT64 | +| APPROX_MOST_FREQUENT | APPROX_MOST_FREQUENT(x, k, capacity) 函数用于近似计算数据集中出现频率最高的前 k 个元素。它返回一个JSON 格式的字符串,其中键是该元素的值,值是该元素对应的近似频率。(V 2.0.5.1 及以后版本支持) | `x`:待计算列,支持 IoTDB 现有所有的数据类型;
`k`:返回出现频率最高的 k 个值;
`capacity`: 用于计算的桶的数量,跟内存占用相关:其值越大误差越小,但占用内存更大,反之capacity值越小误差越大,但占用内存更小。 | STRING | | SUM | 求和。 | INT32 INT64 FLOAT DOUBLE | DOUBLE | | AVG | 求平均值。 | INT32 INT64 FLOAT DOUBLE | DOUBLE | | MAX | 求最大值。 | 所有类型 | 与输入类型一致 | @@ -251,7 +251,6 @@ Total line number = 1 It costs 0.022s ``` - #### 2.3.5 Approx_most_frequent 查询 `table1` 中 `temperature` 列出现频次最高的2个值 @@ -716,6 +715,58 @@ Total line number = 18 It costs 0.319s ``` +### 4.3 Extract 函数 + +该函数用于提取日期对应部分的值。(V2.0.6 版本起支持) + +#### 4.3.1 语法定义 + +```SQL +EXTRACT (identifier FROM expression) +``` +* 参数说明 + * **expression**: `TIMESTAMP` 类型或时间常量 + * **identifier** :取值范围及对应的返回值见下表 + + | 取值范围 | 返回值类型 | 返回值范围 | + | -------------------------- | ------------- | ------------- | + | `YEAR` | `INT64` | `/` | + | `QUARTER` | `INT64` | `1-4` | + | `MONTH` | `INT64` | `1-12` | + | `WEEK` | `INT64` | `1-53` | + | `DAY_OF_MONTH (DAY)` | `INT64` | `1-31` | + | `DAY_OF_WEEK (DOW)` | `INT64` | `1-7` | + | `DAY_OF_YEAR (DOY)` | `INT64` | `1-366` | + | `HOUR` | `INT64` | `0-23` | + | `MINUTE` | `INT64` | `0-59` | + | `SECOND` | `INT64` | `0-59` | + | `MS` | `INT64` | `0-999` | + | `US` | `INT64` | `0-999` | + | `NS` | `INT64` | `0-999` | + + +#### 4.3.2 使用示例 + +以[示例数据](../Reference/Sample-Data.md)中的 table1 为源数据,查询某段时间每天前12个小时的温度平均值 + +```SQL +IoTDB:database1> select format('%1$tY-%1$tm-%1$td',date_bin(1d,time)) as fmtdate,avg(temperature) as avgtp from table1 where time >= 2024-11-26T00:00:00 and time <= 2024-11-30T23:59:59 and extract(hour from time) <= 12 group by date_bin(1d,time) order by date_bin(1d,time) ++----------+-----+ +| fmtdate|avgtp| ++----------+-----+ +|2024-11-28| 86.0| +|2024-11-29| 85.0| +|2024-11-30| 90.0| ++----------+-----+ +Total line number = 3 +It costs 0.041s +``` + +`Format` 函数介绍:[Format 函数](../SQL-Manual/Basis-Function.md#_7-2-format-函数) + +`Date_bin` 函数介绍:[Date_bin 函数](../SQL-Manual/Basis-Function.md#_4-2-date-bin-interval-timestamp-timestamp-timestamp) + + ## 5. 数学函数和运算符 ### 5.1 数学运算符 diff --git a/src/zh/UserGuide/latest-Table/SQL-Manual/Basis-Function.md b/src/zh/UserGuide/latest-Table/SQL-Manual/Basis-Function.md index 219d68208..e16267f44 100644 --- a/src/zh/UserGuide/latest-Table/SQL-Manual/Basis-Function.md +++ b/src/zh/UserGuide/latest-Table/SQL-Manual/Basis-Function.md @@ -715,6 +715,58 @@ Total line number = 18 It costs 0.319s ``` +### 4.3 Extract 函数 + +该函数用于提取日期对应部分的值。(V2.0.6 版本起支持) + +#### 4.3.1 语法定义 + +```SQL +EXTRACT (identifier FROM expression) +``` +* 参数说明 + * **expression**: `TIMESTAMP` 类型或时间常量 + * **identifier** :取值范围及对应的返回值见下表 + + | 取值范围 | 返回值类型 | 返回值范围 | + | -------------------------- | ------------- | ------------- | + | `YEAR` | `INT64` | `/` | + | `QUARTER` | `INT64` | `1-4` | + | `MONTH` | `INT64` | `1-12` | + | `WEEK` | `INT64` | `1-53` | + | `DAY_OF_MONTH (DAY)` | `INT64` | `1-31` | + | `DAY_OF_WEEK (DOW)` | `INT64` | `1-7` | + | `DAY_OF_YEAR (DOY)` | `INT64` | `1-366` | + | `HOUR` | `INT64` | `0-23` | + | `MINUTE` | `INT64` | `0-59` | + | `SECOND` | `INT64` | `0-59` | + | `MS` | `INT64` | `0-999` | + | `US` | `INT64` | `0-999` | + | `NS` | `INT64` | `0-999` | + + +#### 4.3.2 使用示例 + +以[示例数据](../Reference/Sample-Data.md)中的 table1 为源数据,查询某段时间每天前12个小时的温度平均值 + +```SQL +IoTDB:database1> select format('%1$tY-%1$tm-%1$td',date_bin(1d,time)) as fmtdate,avg(temperature) as avgtp from table1 where time >= 2024-11-26T00:00:00 and time <= 2024-11-30T23:59:59 and extract(hour from time) <= 12 group by date_bin(1d,time) order by date_bin(1d,time) ++----------+-----+ +| fmtdate|avgtp| ++----------+-----+ +|2024-11-28| 86.0| +|2024-11-29| 85.0| +|2024-11-30| 90.0| ++----------+-----+ +Total line number = 3 +It costs 0.041s +``` + +`Format` 函数介绍:[Format 函数](../SQL-Manual/Basis-Function.md#_7-2-format-函数) + +`Date_bin` 函数介绍:[Date_bin 函数](../SQL-Manual/Basis-Function.md#_4-2-date-bin-interval-timestamp-timestamp-timestamp) + + ## 5. 数学函数和运算符 ### 5.1 数学运算符