diff --git a/src/UserGuide/Master/Table/SQL-Manual/Basis-Function.md b/src/UserGuide/Master/Table/SQL-Manual/Basis-Function.md
index 98f992ba0..cd20b3671 100644
--- a/src/UserGuide/Master/Table/SQL-Manual/Basis-Function.md
+++ b/src/UserGuide/Master/Table/SQL-Manual/Basis-Function.md
@@ -154,30 +154,31 @@ SELECT LEAST(temperature,humidity) FROM table2;
2. Except for `COUNT()`, all other aggregate functions ignore null values and return null when there are no input rows or all values are null. For example, `SUM()` returns null instead of zero, and `AVG()` does not include null values in the count.
-### 2.2 Supported Aggregate Functions
-
-| Function Name | Description | Allowed Input Types | Output Type |
-|:--------------|:---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:-------------------------------------------------------------------|:-------------------------------------------|
-| COUNT | Counts the number of data points. | All types | INT64 |
-| COUNT_IF | COUNT_IF(exp) counts the number of rows that satisfy a specified boolean expression. | `exp` must be a boolean expression (e.g. `count_if(temperature>20)`) | INT64 |
-| SUM | Calculates the sum. | INT32 INT64 FLOAT DOUBLE | DOUBLE |
-| AVG | Calculates the average. | INT32 INT64 FLOAT DOUBLE | DOUBLE |
-| MAX | Finds the maximum value. | All types | Same as input type |
-| MIN | Finds the minimum value. | All types | Same as input type |
-| FIRST | Finds the value with the smallest timestamp that is not NULL. | All types | Same as input type |
-| LAST | Finds the value with the largest timestamp that is not NULL. | All types | Same as input type |
-| STDDEV | Alias for STDDEV_SAMP, calculates the sample standard deviation. | INT32 INT64 FLOAT DOUBLE | DOUBLE |
-| STDDEV_POP | Calculates the population standard deviation. | INT32 INT64 FLOAT DOUBLE | DOUBLE |
-| STDDEV_SAMP | Calculates the sample standard deviation. | INT32 INT64 FLOAT DOUBLE | DOUBLE |
-| VARIANCE | Alias for VAR_SAMP, calculates the sample variance. | INT32 INT64 FLOAT DOUBLE | DOUBLE |
-| VAR_POP | Calculates the population variance. | INT32 INT64 FLOAT DOUBLE | DOUBLE |
-| VAR_SAMP | Calculates the sample variance. | INT32 INT64 FLOAT DOUBLE | DOUBLE |
-| EXTREME | Finds the value with the largest absolute value. If the largest absolute values of positive and negative values are equal, returns the positive value. | INT32 INT64 FLOAT DOUBLE | Same as input type |
-| MODE | Finds the mode. Note: 1. There is a risk of memory exception when the number of distinct values in the input sequence is too large; 2. If all elements have the same frequency, i.e., there is no mode, a random element is returned; 3. If there are multiple modes, a random mode is returned; 4. NULL values are also counted in frequency, so even if not all values in the input sequence are NULL, the final result may still be NULL. | All types | Same as input type |
-| MAX_BY | MAX_BY(x, y) finds the value of x corresponding to the maximum y in the binary input x and y. MAX_BY(time, x) returns the timestamp when x is at its maximum. | x and y can be of any type | Same as the data type of the first input x |
-| MIN_BY | MIN_BY(x, y) finds the value of x corresponding to the minimum y in the binary input x and y. MIN_BY(time, x) returns the timestamp when x is at its minimum. | x and y can be of any type | Same as the data type of the first input x |
-| FIRST_BY | FIRST_BY(x, y) finds the value of x in the same row when y is the first non-null value. | x and y can be of any type | Same as the data type of the first input x |
-| LAST_BY | LAST_BY(x, y) finds the value of x in the same row when y is the last non-null value. | x and y can be of any type | Same as the data type of the first input x |
+### 2.2 Supported Aggregate Functions
+
+| Function Name | Description | Allowed Input Types | Output Type |
+|:-----------------------|:---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:-------------------------------------------|
+| COUNT | Counts the number of data points. | All types | INT64 |
+| COUNT_IF | COUNT_IF(exp) counts the number of rows that satisfy a specified boolean expression. | `exp` must be a boolean expression (e.g. `count_if(temperature>20)`) | INT64 |
+| APPROX_COUNT_DISTINCT | APPROX_COUNT_DISTINCT(x[, maxStandardError]) approximates COUNT(DISTINCT x), returning the estimated number of distinct input values. | `x`: the target column; all data types are supported.<br>`maxStandardError` (optional): the maximum standard error allowed for the result. Valid range is [0.0040625, 0.26]; defaults to 0.023 if not specified. | INT64 |
+| SUM | Calculates the sum. | INT32 INT64 FLOAT DOUBLE | DOUBLE |
+| AVG | Calculates the average. | INT32 INT64 FLOAT DOUBLE | DOUBLE |
+| MAX | Finds the maximum value. | All types | Same as input type |
+| MIN | Finds the minimum value. | All types | Same as input type |
+| FIRST | Finds the value with the smallest timestamp that is not NULL. | All types | Same as input type |
+| LAST | Finds the value with the largest timestamp that is not NULL. | All types | Same as input type |
+| STDDEV | Alias for STDDEV_SAMP, calculates the sample standard deviation. | INT32 INT64 FLOAT DOUBLE | DOUBLE |
+| STDDEV_POP | Calculates the population standard deviation. | INT32 INT64 FLOAT DOUBLE | DOUBLE |
+| STDDEV_SAMP | Calculates the sample standard deviation. | INT32 INT64 FLOAT DOUBLE | DOUBLE |
+| VARIANCE | Alias for VAR_SAMP, calculates the sample variance. | INT32 INT64 FLOAT DOUBLE | DOUBLE |
+| VAR_POP | Calculates the population variance. | INT32 INT64 FLOAT DOUBLE | DOUBLE |
+| VAR_SAMP | Calculates the sample variance. | INT32 INT64 FLOAT DOUBLE | DOUBLE |
+| EXTREME | Finds the value with the largest absolute value. If the largest absolute values of positive and negative values are equal, returns the positive value. | INT32 INT64 FLOAT DOUBLE | Same as input type |
+| MODE | Finds the mode. Note: 1. There is a risk of memory exception when the number of distinct values in the input sequence is too large; 2. If all elements have the same frequency, i.e., there is no mode, a random element is returned; 3. If there are multiple modes, a random mode is returned; 4. NULL values are also counted in frequency, so even if not all values in the input sequence are NULL, the final result may still be NULL. | All types | Same as input type |
+| MAX_BY | MAX_BY(x, y) finds the value of x corresponding to the maximum y in the binary input x and y. MAX_BY(time, x) returns the timestamp when x is at its maximum. | x and y can be of any type | Same as the data type of the first input x |
+| MIN_BY | MIN_BY(x, y) finds the value of x corresponding to the minimum y in the binary input x and y. MIN_BY(time, x) returns the timestamp when x is at its minimum. | x and y can be of any type | Same as the data type of the first input x |
+| FIRST_BY | FIRST_BY(x, y) finds the value of x in the same row when y is the first non-null value. | x and y can be of any type | Same as the data type of the first input x |
+| LAST_BY | LAST_BY(x, y) finds the value of x in the same row when y is the last non-null value. | x and y can be of any type | Same as the data type of the first input x |
### 2.3 Examples
@@ -229,8 +230,29 @@ Total line number = 1
It costs 0.047s
```
+#### 2.3.4 Approx_count_distinct
-#### 2.3.4 First
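+Retrieve the exact and the approximate number of distinct values in the `temperature` column of `table1`, first with the default `maxStandardError` and then with `maxStandardError` set to 0.006.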
+
+```sql
+IoTDB> SELECT COUNT(DISTINCT temperature) as origin, APPROX_COUNT_DISTINCT(temperature) as approx FROM table1;
+IoTDB> SELECT COUNT(DISTINCT temperature) as origin, APPROX_COUNT_DISTINCT(temperature,0.006) as approx FROM table1;
+```
+
+The execution result is as follows:
+
+```sql
++------+------+
+|origin|approx|
++------+------+
+| 3| 3|
++------+------+
+Total line number = 1
+It costs 0.022s
+```
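+
+The optional second argument bounds the standard error of the estimate. As a rough sketch, assuming the same `table1` as above (the aliases `tight` and `loose` are illustrative names, not part of the original example), the two ends of the documented range can be compared side by side:
+
+```sql
+-- Sketch only: 0.0040625 and 0.26 are the documented lower and upper bounds of maxStandardError.
+-- A smaller bound requests a tighter estimate; a larger bound tolerates more error.
+IoTDB> SELECT APPROX_COUNT_DISTINCT(temperature, 0.0040625) AS tight, APPROX_COUNT_DISTINCT(temperature, 0.26) AS loose FROM table1;
+```
+
+With only a handful of distinct values, both columns will likely equal the exact count; any difference is expected to show up mainly on high-cardinality columns.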
+
+
+#### 2.3.5 First
Finds the values with the smallest timestamp that are not NULL in the `temperature` and `humidity` columns.
@@ -250,7 +272,7 @@ Total line number = 1
It costs 0.170s
```
-#### 2.3.5 Last
+#### 2.3.6 Last
Finds the values with the largest timestamp that are not NULL in the `temperature` and `humidity` columns.
@@ -270,7 +292,7 @@ Total line number = 1
It costs 0.211s
```
-#### 2.3.6 First_by
+#### 2.3.7 First_by
Finds the `time` and `humidity` values of the row with the smallest timestamp among rows whose `temperature` is not NULL.
@@ -290,7 +312,7 @@ Total line number = 1
It costs 0.269s
```
-#### 2.3.7 Last_by
+#### 2.3.8 Last_by
Queries the `time` and `humidity` values of the row with the largest timestamp among rows whose `temperature` is not NULL.
@@ -310,7 +332,7 @@ Total line number = 1
It costs 0.070s
```
-#### 2.3.8 Max_by
+#### 2.3.9 Max_by
Queries the `time` and `humidity` values of the row where `temperature` is at its maximum.
@@ -330,7 +352,7 @@ Total line number = 1
It costs 0.172s
```
-#### 2.3.9 Min_by
+#### 2.3.10 Min_by
Queries the `time` and `humidity` values of the row where `temperature` is at its minimum.
@@ -395,7 +417,7 @@ NULL OR true -- true
##### 3.2.2.1 Truth Table
-The following truth table illustrates how `NULL` is handled in `AND` and `OR` operators:
+The following truth table illustrates how `NULL` is handled in `AND` and `OR` operators:
| a | b | a AND b | a OR b |
| :---- | :---- | :------ | :----- |
@@ -469,7 +491,7 @@ date_bin(interval,source,origin)
4. If `source` is `null`, the function returns `null`.
5. Mixing months and non-month time units (e.g., `1 MONTH 1 DAY`) is not supported due to ambiguity.
-> For example, if the starting point is **April 30, 2000**, calculating `1 DAY` first and then `1 MONTH` results in **June 1, 2000**, whereas calculating `1 MONTH` first and then `1 DAY` results in **May 31, 2000**. The resulting dates are different.
+> For example, if the starting point is **April 30, 2000**, calculating `1 DAY` first and then `1 MONTH` results in **June 1, 2000**, whereas calculating `1 MONTH` first and then `1 DAY` results in **May 31, 2000**. The resulting dates are different.
#### 4.2.2 Examples
@@ -980,10 +1002,10 @@ Msg: org.apache.iotdb.jdbc.IoTDBSQLException: 701: Invalid format string: %.5f (
```
3. Invalid Invocation Errors
-Triggered if:
+ Triggered if:
-* Total arguments < 2 (must include `pattern` and at least one argument).•
-* `pattern` is not of type `STRING`/`TEXT`.
+   * Total arguments < 2 (must include `pattern` and at least one argument).
+ * `pattern` is not of type `STRING`/`TEXT`.
```SQL
-- Example 1
@@ -1006,7 +1028,7 @@ The `||` operator is used for string concatenation and functions the same as the
#### 8.1.2 LIKE Statement
-The `LIKE` statement is used for pattern matching. For detailed usage, refer to Pattern Matching:[LIKE](#1-like-运算符).
+ The `LIKE` statement is used for pattern matching. For detailed usage, refer to Pattern Matching: [LIKE](#1-like-运算符).
### 8.2 String Functions
diff --git a/src/UserGuide/latest-Table/SQL-Manual/Basis-Function.md b/src/UserGuide/latest-Table/SQL-Manual/Basis-Function.md
index 5c97d7435..cd20b3671 100644
--- a/src/UserGuide/latest-Table/SQL-Manual/Basis-Function.md
+++ b/src/UserGuide/latest-Table/SQL-Manual/Basis-Function.md
@@ -156,28 +156,29 @@ SELECT LEAST(temperature,humidity) FROM table2;
### 2.2 Supported Aggregate Functions
-| Function Name | Description | Allowed Input Types | Output Type |
-|:--------------|:---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:-------------------------------------------------------------------|:-------------------------------------------|
-| COUNT | Counts the number of data points. | All types | INT64 |
-| COUNT_IF | COUNT_IF(exp) counts the number of rows that satisfy a specified boolean expression. | `exp` must be a boolean expression (e.g. `count_if(temperature>20)`) | INT64 |
-| SUM | Calculates the sum. | INT32 INT64 FLOAT DOUBLE | DOUBLE |
-| AVG | Calculates the average. | INT32 INT64 FLOAT DOUBLE | DOUBLE |
-| MAX | Finds the maximum value. | All types | Same as input type |
-| MIN | Finds the minimum value. | All types | Same as input type |
-| FIRST | Finds the value with the smallest timestamp that is not NULL. | All types | Same as input type |
-| LAST | Finds the value with the largest timestamp that is not NULL. | All types | Same as input type |
-| STDDEV | Alias for STDDEV_SAMP, calculates the sample standard deviation. | INT32 INT64 FLOAT DOUBLE | DOUBLE |
-| STDDEV_POP | Calculates the population standard deviation. | INT32 INT64 FLOAT DOUBLE | DOUBLE |
-| STDDEV_SAMP | Calculates the sample standard deviation. | INT32 INT64 FLOAT DOUBLE | DOUBLE |
-| VARIANCE | Alias for VAR_SAMP, calculates the sample variance. | INT32 INT64 FLOAT DOUBLE | DOUBLE |
-| VAR_POP | Calculates the population variance. | INT32 INT64 FLOAT DOUBLE | DOUBLE |
-| VAR_SAMP | Calculates the sample variance. | INT32 INT64 FLOAT DOUBLE | DOUBLE |
-| EXTREME | Finds the value with the largest absolute value. If the largest absolute values of positive and negative values are equal, returns the positive value. | INT32 INT64 FLOAT DOUBLE | Same as input type |
-| MODE | Finds the mode. Note: 1. There is a risk of memory exception when the number of distinct values in the input sequence is too large; 2. If all elements have the same frequency, i.e., there is no mode, a random element is returned; 3. If there are multiple modes, a random mode is returned; 4. NULL values are also counted in frequency, so even if not all values in the input sequence are NULL, the final result may still be NULL. | All types | Same as input type |
-| MAX_BY | MAX_BY(x, y) finds the value of x corresponding to the maximum y in the binary input x and y. MAX_BY(time, x) returns the timestamp when x is at its maximum. | x and y can be of any type | Same as the data type of the first input x |
-| MIN_BY | MIN_BY(x, y) finds the value of x corresponding to the minimum y in the binary input x and y. MIN_BY(time, x) returns the timestamp when x is at its minimum. | x and y can be of any type | Same as the data type of the first input x |
-| FIRST_BY | FIRST_BY(x, y) finds the value of x in the same row when y is the first non-null value. | x and y can be of any type | Same as the data type of the first input x |
-| LAST_BY | LAST_BY(x, y) finds the value of x in the same row when y is the last non-null value. | x and y can be of any type | Same as the data type of the first input x |
+| Function Name | Description | Allowed Input Types | Output Type |
+|:-----------------------|:---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:-------------------------------------------|
+| COUNT | Counts the number of data points. | All types | INT64 |
+| COUNT_IF | COUNT_IF(exp) counts the number of rows that satisfy a specified boolean expression. | `exp` must be a boolean expression (e.g. `count_if(temperature>20)`) | INT64 |
+| APPROX_COUNT_DISTINCT | APPROX_COUNT_DISTINCT(x[, maxStandardError]) approximates COUNT(DISTINCT x), returning the estimated number of distinct input values. | `x`: the target column; all data types are supported.<br>`maxStandardError` (optional): the maximum standard error allowed for the result. Valid range is [0.0040625, 0.26]; defaults to 0.023 if not specified. | INT64 |
+| SUM | Calculates the sum. | INT32 INT64 FLOAT DOUBLE | DOUBLE |
+| AVG | Calculates the average. | INT32 INT64 FLOAT DOUBLE | DOUBLE |
+| MAX | Finds the maximum value. | All types | Same as input type |
+| MIN | Finds the minimum value. | All types | Same as input type |
+| FIRST | Finds the value with the smallest timestamp that is not NULL. | All types | Same as input type |
+| LAST | Finds the value with the largest timestamp that is not NULL. | All types | Same as input type |
+| STDDEV | Alias for STDDEV_SAMP, calculates the sample standard deviation. | INT32 INT64 FLOAT DOUBLE | DOUBLE |
+| STDDEV_POP | Calculates the population standard deviation. | INT32 INT64 FLOAT DOUBLE | DOUBLE |
+| STDDEV_SAMP | Calculates the sample standard deviation. | INT32 INT64 FLOAT DOUBLE | DOUBLE |
+| VARIANCE | Alias for VAR_SAMP, calculates the sample variance. | INT32 INT64 FLOAT DOUBLE | DOUBLE |
+| VAR_POP | Calculates the population variance. | INT32 INT64 FLOAT DOUBLE | DOUBLE |
+| VAR_SAMP | Calculates the sample variance. | INT32 INT64 FLOAT DOUBLE | DOUBLE |
+| EXTREME | Finds the value with the largest absolute value. If the largest absolute values of positive and negative values are equal, returns the positive value. | INT32 INT64 FLOAT DOUBLE | Same as input type |
+| MODE | Finds the mode. Note: 1. There is a risk of memory exception when the number of distinct values in the input sequence is too large; 2. If all elements have the same frequency, i.e., there is no mode, a random element is returned; 3. If there are multiple modes, a random mode is returned; 4. NULL values are also counted in frequency, so even if not all values in the input sequence are NULL, the final result may still be NULL. | All types | Same as input type |
+| MAX_BY | MAX_BY(x, y) finds the value of x corresponding to the maximum y in the binary input x and y. MAX_BY(time, x) returns the timestamp when x is at its maximum. | x and y can be of any type | Same as the data type of the first input x |
+| MIN_BY | MIN_BY(x, y) finds the value of x corresponding to the minimum y in the binary input x and y. MIN_BY(time, x) returns the timestamp when x is at its minimum. | x and y can be of any type | Same as the data type of the first input x |
+| FIRST_BY | FIRST_BY(x, y) finds the value of x in the same row when y is the first non-null value. | x and y can be of any type | Same as the data type of the first input x |
+| LAST_BY | LAST_BY(x, y) finds the value of x in the same row when y is the last non-null value. | x and y can be of any type | Same as the data type of the first input x |
### 2.3 Examples
@@ -229,8 +230,29 @@ Total line number = 1
It costs 0.047s
```
+#### 2.3.4 Approx_count_distinct
-#### 2.3.4 First
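+Retrieve the exact and the approximate number of distinct values in the `temperature` column of `table1`, first with the default `maxStandardError` and then with `maxStandardError` set to 0.006.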
+
+```sql
+IoTDB> SELECT COUNT(DISTINCT temperature) as origin, APPROX_COUNT_DISTINCT(temperature) as approx FROM table1;
+IoTDB> SELECT COUNT(DISTINCT temperature) as origin, APPROX_COUNT_DISTINCT(temperature,0.006) as approx FROM table1;
+```
+
+The execution result is as follows:
+
+```sql
++------+------+
+|origin|approx|
++------+------+
+| 3| 3|
++------+------+
+Total line number = 1
+It costs 0.022s
+```
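+
+Since `APPROX_COUNT_DISTINCT` is listed as an ordinary aggregate function, it can presumably be combined with a `WHERE` filter like the other functions in this section. A minimal sketch, assuming `table1` contains rows with `temperature` above 20 (the threshold is borrowed from the `count_if` example in the table and is only illustrative):
+
+```sql
+-- Sketch only: compares the exact and the approximate distinct count on a filtered subset.
+IoTDB> SELECT COUNT(DISTINCT temperature) AS origin, APPROX_COUNT_DISTINCT(temperature) AS approx FROM table1 WHERE temperature > 20;
+```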
+
+
+#### 2.3.5 First
Finds the values with the smallest timestamp that are not NULL in the `temperature` and `humidity` columns.
@@ -250,7 +272,7 @@ Total line number = 1
It costs 0.170s
```
-#### 2.3.5 Last
+#### 2.3.6 Last
Finds the values with the largest timestamp that are not NULL in the `temperature` and `humidity` columns.
@@ -270,7 +292,7 @@ Total line number = 1
It costs 0.211s
```
-#### 2.3.6 First_by
+#### 2.3.7 First_by
Finds the `time` and `humidity` values of the row with the smallest timestamp among rows whose `temperature` is not NULL.
@@ -290,7 +312,7 @@ Total line number = 1
It costs 0.269s
```
-#### 2.3.7 Last_by
+#### 2.3.8 Last_by
Queries the `time` and `humidity` values of the row with the largest timestamp among rows whose `temperature` is not NULL.
@@ -310,7 +332,7 @@ Total line number = 1
It costs 0.070s
```
-#### 2.3.8 Max_by
+#### 2.3.9 Max_by
Queries the `time` and `humidity` values of the row where `temperature` is at its maximum.
@@ -330,7 +352,7 @@ Total line number = 1
It costs 0.172s
```
-#### 2.3.9 Min_by
+#### 2.3.10 Min_by
Queries the `time` and `humidity` values of the row where `temperature` is at its minimum.
diff --git a/src/zh/UserGuide/Master/Table/SQL-Manual/Basis-Function.md b/src/zh/UserGuide/Master/Table/SQL-Manual/Basis-Function.md
index 62ad12638..5b3bca340 100644
--- a/src/zh/UserGuide/Master/Table/SQL-Manual/Basis-Function.md
+++ b/src/zh/UserGuide/Master/Table/SQL-Manual/Basis-Function.md
@@ -144,7 +144,7 @@ SELECT GREATEST(temperature,humidity) FROM table2;
-- Query the minimum of temperature and humidity in table2
SELECT LEAST(temperature,humidity) FROM table2;
```
-
+
## 2. Aggregate Functions
@@ -153,30 +153,31 @@ SELECT LEAST(temperature,humidity) FROM table2;
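1. Aggregate functions are many-to-one functions. They perform aggregate calculations on a set of values to produce a single aggregate result.
2. Except for `COUNT()`, all other aggregate functions ignore null values and return null when there are no input rows or all values are null. For example, `SUM()` returns null instead of zero, and `AVG()` does not include null values in the count.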
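-### 2.2 Supported Aggregate Functions
-
-| Function Name | Description | Allowed Input Types | Output Type |
-| ----------- | ------------------------------------------------------------ |-----------------------------------------------|------------------|
-| COUNT | Counts the number of data points. | All types | INT64 |
-| COUNT_IF | COUNT_IF(exp) counts the number of rows that satisfy a specified boolean expression. | exp must be a boolean expression, e.g. count_if(temperature>20) | INT64 |
-| SUM | Calculates the sum. | INT32 INT64 FLOAT DOUBLE | DOUBLE |
-| AVG | Calculates the average. | INT32 INT64 FLOAT DOUBLE | DOUBLE |
-| MAX | Finds the maximum value. | All types | Same as input type |
-| MIN | Finds the minimum value. | All types | Same as input type |
-| FIRST | Finds the value with the smallest timestamp that is not NULL. | All types | Same as input type |
-| LAST | Finds the value with the largest timestamp that is not NULL. | All types | Same as input type |
-| STDDEV | Alias for STDDEV_SAMP, calculates the sample standard deviation. | INT32 INT64 FLOAT DOUBLE | DOUBLE |
-| STDDEV_POP | Calculates the population standard deviation. | INT32 INT64 FLOAT DOUBLE | DOUBLE |
-| STDDEV_SAMP | Calculates the sample standard deviation. | INT32 INT64 FLOAT DOUBLE | DOUBLE |
-| VARIANCE | Alias for VAR_SAMP, calculates the sample variance. | INT32 INT64 FLOAT DOUBLE | DOUBLE |
-| VAR_POP | Calculates the population variance. | INT32 INT64 FLOAT DOUBLE | DOUBLE |
-| VAR_SAMP | Calculates the sample variance. | INT32 INT64 FLOAT DOUBLE | DOUBLE |
-| EXTREME | Finds the value with the largest absolute value. If the largest absolute values of positive and negative values are equal, returns the positive value. | INT32 INT64 FLOAT DOUBLE | Same as input type |
-| MODE | Finds the mode. Note: 1. There is a risk of memory exception when the number of distinct values in the input sequence is too large; 2. If all elements have the same frequency, i.e., there is no mode, a random element is returned; 3. If there are multiple modes, a random mode is returned; 4. NULL values are also counted in frequency, so even if not all values in the input sequence are NULL, the final result may still be NULL. | All types | Same as input type |
-| MAX_BY | MAX_BY(x, y) finds the value of x corresponding to the maximum y in the binary input x and y. MAX_BY(time, x) returns the timestamp when x is at its maximum. | x and y can be of any type | Same as the data type of the first input x |
-| MIN_BY | MIN_BY(x, y) finds the value of x corresponding to the minimum y in the binary input x and y. MIN_BY(time, x) returns the timestamp when x is at its minimum. | x and y can be of any type | Same as the data type of the first input x |
-| FIRST_BY | FIRST_BY(x, y) finds the value of x in the same row when y is the first non-null value. | x and y can be of any type | Same as the data type of the first input x |
-| LAST_BY | LAST_BY(x, y) finds the value of x in the same row when y is the last non-null value. | x and y can be of any type | Same as the data type of the first input x |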
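+### 2.2 Supported Aggregate Functions
+
+| Function Name | Description | Allowed Input Types | Output Type |
+|-----------------------|------------------------------------------------------------------------------------------------------------------------------------------|-----------------------------------------------------------------------------------------|------------------|
+| COUNT | Counts the number of data points. | All types | INT64 |
+| COUNT_IF | COUNT_IF(exp) counts the number of rows that satisfy a specified boolean expression. | exp must be a boolean expression, e.g. count_if(temperature>20) | INT64 |
+| APPROX_COUNT_DISTINCT | APPROX_COUNT_DISTINCT(x[,maxStandardError]) approximates COUNT(DISTINCT x), returning the approximate number of distinct input values. | x: the column to be calculated; all types are supported.<br>maxStandardError: the maximum standard error the function should produce. Valid range is [0.0040625, 0.26]; defaults to 0.023 if not specified. | INT64 |
+| SUM | Calculates the sum. | INT32 INT64 FLOAT DOUBLE | DOUBLE |
+| AVG | Calculates the average. | INT32 INT64 FLOAT DOUBLE | DOUBLE |
+| MAX | Finds the maximum value. | All types | Same as input type |
+| MIN | Finds the minimum value. | All types | Same as input type |
+| FIRST | Finds the value with the smallest timestamp that is not NULL. | All types | Same as input type |
+| LAST | Finds the value with the largest timestamp that is not NULL. | All types | Same as input type |
+| STDDEV | Alias for STDDEV_SAMP, calculates the sample standard deviation. | INT32 INT64 FLOAT DOUBLE | DOUBLE |
+| STDDEV_POP | Calculates the population standard deviation. | INT32 INT64 FLOAT DOUBLE | DOUBLE |
+| STDDEV_SAMP | Calculates the sample standard deviation. | INT32 INT64 FLOAT DOUBLE | DOUBLE |
+| VARIANCE | Alias for VAR_SAMP, calculates the sample variance. | INT32 INT64 FLOAT DOUBLE | DOUBLE |
+| VAR_POP | Calculates the population variance. | INT32 INT64 FLOAT DOUBLE | DOUBLE |
+| VAR_SAMP | Calculates the sample variance. | INT32 INT64 FLOAT DOUBLE | DOUBLE |
+| EXTREME | Finds the value with the largest absolute value. If the largest absolute values of positive and negative values are equal, returns the positive value. | INT32 INT64 FLOAT DOUBLE | Same as input type |
+| MODE | Finds the mode. Note: 1. There is a risk of memory exception when the number of distinct values in the input sequence is too large; 2. If all elements have the same frequency, i.e., there is no mode, a random element is returned; 3. If there are multiple modes, a random mode is returned; 4. NULL values are also counted in frequency, so even if not all values in the input sequence are NULL, the final result may still be NULL. | All types | Same as input type |
+| MAX_BY | MAX_BY(x, y) finds the value of x corresponding to the maximum y in the binary input x and y. MAX_BY(time, x) returns the timestamp when x is at its maximum. | x and y can be of any type | Same as the data type of the first input x |
+| MIN_BY | MIN_BY(x, y) finds the value of x corresponding to the minimum y in the binary input x and y. MIN_BY(time, x) returns the timestamp when x is at its minimum. | x and y can be of any type | Same as the data type of the first input x |
+| FIRST_BY | FIRST_BY(x, y) finds the value of x in the same row when y is the first non-null value. | x and y can be of any type | Same as the data type of the first input x |
+| LAST_BY | LAST_BY(x, y) finds the value of x in the same row when y is the last non-null value. | x and y can be of any type | Same as the data type of the first input x |
### 2.3 Examples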
@@ -213,7 +214,7 @@ It costs 0.834s
Counts the number of rows in `table2` where the arrival time `arrival_time` is not `null`.
```sql
-select count_if(arrival_time is not null) from table2;
+IoTDB> select count_if(arrival_time is not null) from table2;
```
The execution result is as follows:
@@ -228,8 +229,29 @@ Total line number = 1
It costs 0.047s
```
+#### 2.3.4 Approx_count_distinct
+
+Queries the number of distinct values in the `temperature` column of `table1`.
-#### 2.3.4 First
+```sql
+IoTDB> SELECT COUNT(DISTINCT temperature) as origin, APPROX_COUNT_DISTINCT(temperature) as approx FROM table1;
+IoTDB> SELECT COUNT(DISTINCT temperature) as origin, APPROX_COUNT_DISTINCT(temperature,0.006) as approx FROM table1;
+```
+
+The execution result is as follows:
+
+```sql
++------+------+
+|origin|approx|
++------+------+
+| 3| 3|
++------+------+
+Total line number = 1
+It costs 0.022s
+```
+
+
+#### 2.3.5 First
Finds the values with the smallest timestamp that are not NULL in the `temperature` and `humidity` columns.
@@ -249,7 +271,7 @@ Total line number = 1
It costs 0.170s
```
-#### 2.3.5 Last
+#### 2.3.6 Last
Finds the values with the largest timestamp that are not NULL in the `temperature` and `humidity` columns.
@@ -269,7 +291,7 @@ Total line number = 1
It costs 0.211s
```
-#### 2.3.6 First_by
+#### 2.3.7 First_by
Finds the `time` and `humidity` values of the row with the smallest timestamp among rows whose `temperature` is not NULL.
@@ -289,7 +311,7 @@ Total line number = 1
It costs 0.269s
```
-#### 2.3.7 Last_by
+#### 2.3.8 Last_by
Queries the `time` and `humidity` values of the row with the largest timestamp among rows whose `temperature` is not NULL.
@@ -309,7 +331,7 @@ Total line number = 1
It costs 0.070s
```
-#### 2.3.8 Max_by
+#### 2.3.9 Max_by
Queries the `time` and `humidity` values of the row where `temperature` is at its maximum.
@@ -329,7 +351,7 @@ Total line number = 1
It costs 0.172s
```
-#### 2.3.9 Min_by
+#### 2.3.10 Min_by
Queries the `time` and `humidity` values of the row where `temperature` is at its minimum.
@@ -350,7 +372,6 @@ It costs 0.244s
```
-
## 3. Logical Operators
### 3.1 Overview
diff --git a/src/zh/UserGuide/latest-Table/SQL-Manual/Basis-Function.md b/src/zh/UserGuide/latest-Table/SQL-Manual/Basis-Function.md
index 00e421f38..5b3bca340 100644
--- a/src/zh/UserGuide/latest-Table/SQL-Manual/Basis-Function.md
+++ b/src/zh/UserGuide/latest-Table/SQL-Manual/Basis-Function.md
@@ -155,28 +155,29 @@ SELECT LEAST(temperature,humidity) FROM table2;
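### 2.2 Supported Aggregate Functions
-| Function Name | Description | Allowed Input Types | Output Type |
-| ----------- | ------------------------------------------------------------ |-----------------------------------------------|------------------|
-| COUNT | Counts the number of data points. | All types | INT64 |
-| COUNT_IF | COUNT_IF(exp) counts the number of rows that satisfy a specified boolean expression. | exp must be a boolean expression, e.g. count_if(temperature>20) | INT64 |
-| SUM | Calculates the sum. | INT32 INT64 FLOAT DOUBLE | DOUBLE |
-| AVG | Calculates the average. | INT32 INT64 FLOAT DOUBLE | DOUBLE |
-| MAX | Finds the maximum value. | All types | Same as input type |
-| MIN | Finds the minimum value. | All types | Same as input type |
-| FIRST | Finds the value with the smallest timestamp that is not NULL. | All types | Same as input type |
-| LAST | Finds the value with the largest timestamp that is not NULL. | All types | Same as input type |
-| STDDEV | Alias for STDDEV_SAMP, calculates the sample standard deviation. | INT32 INT64 FLOAT DOUBLE | DOUBLE |
-| STDDEV_POP | Calculates the population standard deviation. | INT32 INT64 FLOAT DOUBLE | DOUBLE |
-| STDDEV_SAMP | Calculates the sample standard deviation. | INT32 INT64 FLOAT DOUBLE | DOUBLE |
-| VARIANCE | Alias for VAR_SAMP, calculates the sample variance. | INT32 INT64 FLOAT DOUBLE | DOUBLE |
-| VAR_POP | Calculates the population variance. | INT32 INT64 FLOAT DOUBLE | DOUBLE |
-| VAR_SAMP | Calculates the sample variance. | INT32 INT64 FLOAT DOUBLE | DOUBLE |
-| EXTREME | Finds the value with the largest absolute value. If the largest absolute values of positive and negative values are equal, returns the positive value. | INT32 INT64 FLOAT DOUBLE | Same as input type |
-| MODE | Finds the mode. Note: 1. There is a risk of memory exception when the number of distinct values in the input sequence is too large; 2. If all elements have the same frequency, i.e., there is no mode, a random element is returned; 3. If there are multiple modes, a random mode is returned; 4. NULL values are also counted in frequency, so even if not all values in the input sequence are NULL, the final result may still be NULL. | All types | Same as input type |
-| MAX_BY | MAX_BY(x, y) finds the value of x corresponding to the maximum y in the binary input x and y. MAX_BY(time, x) returns the timestamp when x is at its maximum. | x and y can be of any type | Same as the data type of the first input x |
-| MIN_BY | MIN_BY(x, y) finds the value of x corresponding to the minimum y in the binary input x and y. MIN_BY(time, x) returns the timestamp when x is at its minimum. | x and y can be of any type | Same as the data type of the first input x |
-| FIRST_BY | FIRST_BY(x, y) finds the value of x in the same row when y is the first non-null value. | x and y can be of any type | Same as the data type of the first input x |
-| LAST_BY | LAST_BY(x, y) finds the value of x in the same row when y is the last non-null value. | x and y can be of any type | Same as the data type of the first input x |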
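+| Function Name | Description | Allowed Input Types | Output Type |
+|-----------------------|------------------------------------------------------------------------------------------------------------------------------------------|-----------------------------------------------------------------------------------------|------------------|
+| COUNT | Counts the number of data points. | All types | INT64 |
+| COUNT_IF | COUNT_IF(exp) counts the number of rows that satisfy a specified boolean expression. | exp must be a boolean expression, e.g. count_if(temperature>20) | INT64 |
+| APPROX_COUNT_DISTINCT | APPROX_COUNT_DISTINCT(x[,maxStandardError]) approximates COUNT(DISTINCT x), returning the approximate number of distinct input values. | x: the column to be calculated; all types are supported.<br>maxStandardError: the maximum standard error the function should produce. Valid range is [0.0040625, 0.26]; defaults to 0.023 if not specified. | INT64 |
+| SUM | Calculates the sum. | INT32 INT64 FLOAT DOUBLE | DOUBLE |
+| AVG | Calculates the average. | INT32 INT64 FLOAT DOUBLE | DOUBLE |
+| MAX | Finds the maximum value. | All types | Same as input type |
+| MIN | Finds the minimum value. | All types | Same as input type |
+| FIRST | Finds the value with the smallest timestamp that is not NULL. | All types | Same as input type |
+| LAST | Finds the value with the largest timestamp that is not NULL. | All types | Same as input type |
+| STDDEV | Alias for STDDEV_SAMP, calculates the sample standard deviation. | INT32 INT64 FLOAT DOUBLE | DOUBLE |
+| STDDEV_POP | Calculates the population standard deviation. | INT32 INT64 FLOAT DOUBLE | DOUBLE |
+| STDDEV_SAMP | Calculates the sample standard deviation. | INT32 INT64 FLOAT DOUBLE | DOUBLE |
+| VARIANCE | Alias for VAR_SAMP, calculates the sample variance. | INT32 INT64 FLOAT DOUBLE | DOUBLE |
+| VAR_POP | Calculates the population variance. | INT32 INT64 FLOAT DOUBLE | DOUBLE |
+| VAR_SAMP | Calculates the sample variance. | INT32 INT64 FLOAT DOUBLE | DOUBLE |
+| EXTREME | Finds the value with the largest absolute value. If the largest absolute values of positive and negative values are equal, returns the positive value. | INT32 INT64 FLOAT DOUBLE | Same as input type |
+| MODE | Finds the mode. Note: 1. There is a risk of memory exception when the number of distinct values in the input sequence is too large; 2. If all elements have the same frequency, i.e., there is no mode, a random element is returned; 3. If there are multiple modes, a random mode is returned; 4. NULL values are also counted in frequency, so even if not all values in the input sequence are NULL, the final result may still be NULL. | All types | Same as input type |
+| MAX_BY | MAX_BY(x, y) finds the value of x corresponding to the maximum y in the binary input x and y. MAX_BY(time, x) returns the timestamp when x is at its maximum. | x and y can be of any type | Same as the data type of the first input x |
+| MIN_BY | MIN_BY(x, y) finds the value of x corresponding to the minimum y in the binary input x and y. MIN_BY(time, x) returns the timestamp when x is at its minimum. | x and y can be of any type | Same as the data type of the first input x |
+| FIRST_BY | FIRST_BY(x, y) finds the value of x in the same row when y is the first non-null value. | x and y can be of any type | Same as the data type of the first input x |
+| LAST_BY | LAST_BY(x, y) finds the value of x in the same row when y is the last non-null value. | x and y can be of any type | Same as the data type of the first input x |
### 2.3 Examples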
@@ -213,7 +214,7 @@ It costs 0.834s
Counts the number of rows in `table2` where the arrival time `arrival_time` is not `null`.
```sql
-select count_if(arrival_time is not null) from table2;
+IoTDB> select count_if(arrival_time is not null) from table2;
```
The execution result is as follows:
@@ -228,8 +229,29 @@ Total line number = 1
It costs 0.047s
```
+#### 2.3.4 Approx_count_distinct
-#### 2.3.4 First
+Queries the number of distinct values in the `temperature` column of `table1`.
+
+```sql
+IoTDB> SELECT COUNT(DISTINCT temperature) as origin, APPROX_COUNT_DISTINCT(temperature) as approx FROM table1;
+IoTDB> SELECT COUNT(DISTINCT temperature) as origin, APPROX_COUNT_DISTINCT(temperature,0.006) as approx FROM table1;
+```
+
+The execution result is as follows:
+
+```sql
++------+------+
+|origin|approx|
++------+------+
+| 3| 3|
++------+------+
+Total line number = 1
+It costs 0.022s
+```
+
+
+#### 2.3.5 First
Finds the values with the smallest timestamp that are not NULL in the `temperature` and `humidity` columns.
@@ -249,7 +271,7 @@ Total line number = 1
It costs 0.170s
```
-#### 2.3.5 Last
+#### 2.3.6 Last
Finds the values with the largest timestamp that are not NULL in the `temperature` and `humidity` columns.
@@ -269,7 +291,7 @@ Total line number = 1
It costs 0.211s
```
-#### 2.3.6 First_by
+#### 2.3.7 First_by
Finds the `time` and `humidity` values of the row with the smallest timestamp among rows whose `temperature` is not NULL.
@@ -289,7 +311,7 @@ Total line number = 1
It costs 0.269s
```
-#### 2.3.7 Last_by
+#### 2.3.8 Last_by
Queries the `time` and `humidity` values of the row with the largest timestamp among rows whose `temperature` is not NULL.
@@ -309,7 +331,7 @@ Total line number = 1
It costs 0.070s
```
-#### 2.3.8 Max_by
+#### 2.3.9 Max_by
Queries the `time` and `humidity` values of the row where `temperature` is at its maximum.
@@ -329,7 +351,7 @@ Total line number = 1
It costs 0.172s
```
-#### 2.3.9 Min_by
+#### 2.3.10 Min_by
Queries the `time` and `humidity` values of the row where `temperature` is at its minimum.