diff --git a/src/UserGuide/Master/Table/Tools-System/Data-Import-Tool.md b/src/UserGuide/Master/Table/Tools-System/Data-Import-Tool.md index b56e4a4a0..f8084f79c 100644 --- a/src/UserGuide/Master/Table/Tools-System/Data-Import-Tool.md +++ b/src/UserGuide/Master/Table/Tools-System/Data-Import-Tool.md @@ -1,7 +1,11 @@ # Data Import ## 1. Functional Overview -The data import tool `import-data.sh/bat` is located in the `tools` directory and can import data in ​CSV, ​SQL, and ​TsFile (an open-source time-series file format) into ​IoTDB. Its specific functionalities are as follows: + +IoTDB supports three methods for data import: +- Data Import Tool: Use the `import-data.sh/bat` script in the `tools` directory to manually import CSV, SQL, or TsFile (open-source time-series file format) data into IoTDB. +- `TsFile` Auto-Loading Feature +- Load `TsFile` SQL @@ -19,13 +23,21 @@ The data import tool `import-data.sh/bat` is located in the `tools` directory an - + - + + + + + + + + +
Can be used for single or batch import of SQL files into IoTDB
TsFile | TsFile | Can be used for single or batch import of TsFile files into IoTDB
TsFile Auto-Loading Feature | Can automatically monitor a specified directory for newly generated TsFiles and load them into IoTDB
Load SQL | Can be used for single or batch import of TsFile files into IoTDB
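
For the table model, the note on `load_active_listening_dirs` in section 3.1 says that the name of the directory holding a TsFile is used as the database. The following is a minimal Python sketch of that rule, not IoTDB's implementation; the helper name `database_for_tsfile` is hypothetical, and it assumes that only the immediate parent directory matters, as in the `sensors/temperature/` example of section 3.2.

```python
from pathlib import PurePosixPath


def database_for_tsfile(tsfile_path: str) -> str:
    """Return the database name inferred for a table-model TsFile.

    Assumption: the database is the name of the directory that directly
    contains the file; directories further up the tree are ignored.
    """
    parent = PurePosixPath(tsfile_path).parent
    if parent.name == "":
        # A bare file name (or a file in the root) has no usable directory name.
        raise ValueError(f"{tsfile_path!r} has no containing directory")
    return parent.name


if __name__ == "__main__":
    print(database_for_tsfile(
        "load_active_listening_dir/sensors/temperature/temperature-table.TSFILE"
    ))
```

Under this reading, the `sensors/` level plays no part in the database name, so files that should land in different databases need different immediate parent directories.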
-## 2. Detailed Features +## 2. Data Import Tool ### 2.1 Common Parameters | Short | Full Parameter | Description | Required | Default | @@ -45,9 +57,9 @@ The data import tool `import-data.sh/bat` is located in the `tools` directory an | `-tz` | `--timezone` | Timezone (e.g., `+08:00`, `-01:00`). | No | System default | | `-help` | `--help` | Display help (general or format-specific: `-help csv`). | No | - | -### 2.2 CSV Format +### 2.2 CSV Format -#### 2.2.1 Command +#### 2.2.1 Command ```Shell # Unix/OS X > tools/import-data.sh -ft [-sql_dialect] -db -table @@ -64,7 +76,7 @@ The data import tool `import-data.sh/bat` is located in the `tools` directory an [-tn ] ``` -#### 2.2.2 CSV-Specific Parameters +#### 2.2.2 CSV-Specific Parameters | Short | Full Parameter | Description | Required | Default | | ---------------- | ------------------------------- |----------------------------------------------------------| ---------- |-----------------| @@ -75,7 +87,7 @@ The data import tool `import-data.sh/bat` is located in the `tools` directory an | `-ti` | `--type_infer` | Type mapping (e.g., `BOOLEAN=text,INT=long`). | No | - | | `-tp` | `--timestamp_precision` | Timestamp precision: `ms`, `us`, `ns`. | No | `ms` | -#### 2.2.3 Examples +#### 2.2.3 Examples ```Shell # Valid Example @@ -95,7 +107,7 @@ There are no tables or the target table table5 does not exist - Special Character Escaping Rules: If a text-type field contains special characters (e.g., commas `,`), they must be escaped using a backslash (`\`). - Supported Time Formats: `yyyy-MM-dd'T'HH:mm:ss`, `yyyy-MM-dd HH:mm:ss`, or `yyyy-MM-dd'T'HH:mm:ss.SSSZ`. -- Timestamp Column Requirement: The timestamp column must be the first column in the data file. +- Timestamp Column Requirement: The timestamp column must be the first column in the data file. 2. 
CSV File Example @@ -106,9 +118,9 @@ time,region,device,model,temperature,humidity ``` -### 2.3 SQL Format +### 2.3 SQL Format -#### 2.3.1 Command +#### 2.3.1 Command ```Shell # Unix/OS X @@ -124,7 +136,7 @@ time,region,device,model,temperature,humidity [-batch ] [-tn ] ``` -#### 2.3.2 SQL-Specific Parameters +#### 2.3.2 SQL-Specific Parameters | Short | Full Parameter | Description | Required | Default | | -------------- | ------------------------------- | -------------------------------------------------------------------- | ---------- | ------------------ | @@ -132,7 +144,7 @@ time,region,device,model,temperature,humidity | `-lpf` | `--lines_per_failed_file` | Max lines per failed file. | No | `100000`
Range: 0 to Integer.Max(2147483647). | | `-batch` | `--batch_size` | Rows processed per API call. | No | `100000`
Range: 0 to Integer.Max(2147483647). | -#### 2.3.3 Examples +#### 2.3.3 Examples ```Shell # Valid Example @@ -146,9 +158,9 @@ Source file or directory ./sql/dump1_1.sql does not exist # Log Example Fail to insert measurements '[column.name]' caused by [data type is not consistent, input '[column.value]', registered '[column.DataType]'] ``` -### 2.4 TsFile Format +### 2.4 TsFile Format -#### 2.4.1 Command +#### 2.4.1 Command ```Shell # Unix/OS X @@ -163,7 +175,7 @@ Fail to insert measurements '[column.name]' caused by [data type is not consiste -s -os [-sd ] -of [-fd ] [-tn ] [-tz ] [-tp ] ``` -#### 2.4.2 TsFile-Specific Parameters +#### 2.4.2 TsFile-Specific Parameters | Short | Full Parameter | Description | Required | Default | | ----------- | ----------------------------- |-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| ----------------- | --------------------------- | @@ -173,7 +185,7 @@ Fail to insert measurements '[column.name]' caused by [data type is not consiste | `-fd` | `--fail_dir` | Target directory for `mv`/`cp` actions on failure. Required if `-of` is `mv`/`cp`. The file name will be flattened and concatenated with the original file name. | Conditional | `${EXEC_DIR}/fail` | | `-tp` | `--timestamp_precision` | TsFile timestamp precision: `ms`, `us`, `ns`.
For non-remote TsFile imports: Use -tp to specify the timestamp precision of the TsFile. The system will manually verify if the timestamp precision matches the server. If it does not match, an error will be returned.
​For remote TsFile imports: Use -tp to specify the timestamp precision of the TsFile. The Pipe system will automatically verify if the timestamp precision matches. If it does not match, a Pipe error will be returned. | No | `ms` | -#### 2.4.3 Examples +#### 2.4.3 Examples ```Shell # Valid Example @@ -183,3 +195,105 @@ Fail to insert measurements '[column.name]' caused by [data type is not consiste > tools/import-data.sh -ft tsfile -sql_dialect table -s ./tsfile -db database1 Parse error: Missing required options: os, of ``` + +## 3. TsFile Auto-Loading + +This feature enables IoTDB to automatically monitor a specified directory for new TsFiles and load them into the database without manual intervention. + +![](/img/Data-import2.png) + +### 3.1 Configuration + +Add the following parameters to `iotdb-system.properties` (template: `iotdb-system.properties.template`): + +| Parameter | Description | Value Range | Required | Default | Hot-Load? | +| ---------------------------------------------------- |----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|--------------------------| ---------- | ----------------------------- | ----------------------- | +| `load_active_listening_enable` | Enable auto-loading. | `true`/`false` | Optional | `true` | Yes | +| `load_active_listening_dirs` | Directories to monitor (subdirectories included). Multiple paths separated by commas.
Note: In the table model, the name of the directory that contains the file is used as the database name. | String | Optional | `ext/load/pending` | Yes |
+| `load_active_listening_fail_dir` | Directory to store failed TsFiles. Only one directory can be set. | String | Optional | `ext/load/failed` | Yes |
+| `load_active_listening_max_thread_num` | Maximum number of threads for TsFile loading tasks. When this parameter is commented out, the default is max(1, CPU cores / 2). A user-set value outside the range [1, CPU cores / 2] is reset to the default max(1, CPU cores / 2). | `1` to `Long.MAX_VALUE` | Optional | `max(1, CPU_CORES / 2)` | No (restart required) |
+| `load_active_listening_check_interval_seconds` | Polling interval for active listening, in seconds. Active listening for TsFiles is implemented by polling the target directories; this setting is the interval between two consecutive checks of `load_active_listening_dirs`. After each check, the next one runs `load_active_listening_check_interval_seconds` seconds later. A user-set interval less than 1 is reset to the default of 5 seconds. | `1` to `Long.MAX_VALUE` | Optional | `5` | No (restart required) |
+
+### 3.2 Examples
+
+```bash
+load_active_listening_dir/
+├─sensors/
+│ ├─temperature/
+│ │ └─temperature-table.TSFILE
+
+```
+
+- Table model TsFile
+  - `temperature-table.TSFILE`: will be imported into the `temperature` database (because it is located in the `sensors/temperature/` directory)
+
+
+### 3.3 Notes
+
+1. **Mods Files**: If TsFiles have associated `.mods` files, move the `.mods` files to the monitored directory **before** their corresponding TsFiles. Ensure each `.mods` file and its TsFile are in the same directory.
+2. **Restricted Directories**: Do NOT set Pipe receiver directories, data directories, or other system paths as monitored directories.
+3. **Directory Conflicts**: Ensure `load_active_listening_fail_dir` does not overlap with `load_active_listening_dirs` or its subdirectories.
+4. **Permissions**: IoTDB needs write permission on the monitored directories. Files are deleted after successful loading, so insufficient permissions may cause duplicate loading.
+
+
+## 4. Load SQL
+
+IoTDB supports importing one or more TsFiles containing time series into another running IoTDB instance directly by executing SQL through the CLI.
+
+### 4.1 Command
+
+```SQL
+load '' with (
+ 'attribute-key1'='attribute-value1',
+ 'attribute-key2'='attribute-value2',
+)
+```
+
+* `` : The path to a TsFile or a folder containing multiple TsFiles.
+* ``: Optional parameters, as described below.
+
+| Key | Key Description | Value Type | Value Range | Required | Default Value |
+|-----|-----------------|------------|-------------|----------|---------------|
+| `database-level` | When the database corresponding to the TsFile does not exist, the database hierarchy level can be specified via the `database-level` parameter. The default is the level set in `iotdb-common.properties`. For example, setting level=1 means the prefix path of level 1 in all time series in the TsFile will be used as the database. | Integer | `[1: Integer.MAX_VALUE]` | No | 1 |
+| `on-success` | Action for successfully loaded TsFiles: `delete` (delete the TsFile after successful import) or `none` (retain the TsFile in the source folder). | String | `delete / none` | No | delete |
+| `model` | Specifies whether the TsFile uses the `table` model or `tree` model. | String | `tree / table` | No | Aligns with `-sql_dialect` |
+| `database-name` | Table model only: Target database for import. Automatically created if it does not exist. The `database-name` value must not include the `root.` prefix (an error will occur if included). | String | `-` | No | null |
+| `convert-on-type-mismatch` | Whether to perform type conversion during loading if data types in the TsFile mismatch the target schema. | Boolean | `true / false` | No | true |
+| `verify` | Whether to validate the schema before loading the TsFile. | Boolean | `true / false` | No | true |
+| `tablet-conversion-threshold` | Size threshold (in bytes) for converting TsFiles into tablet format during loading. Default: `-1` (no conversion for any TsFile). | Integer | `-1`, or `[0: Integer.MAX_VALUE]` | No | -1 |
+| `async` | Whether to enable asynchronous loading. If enabled, TsFiles are moved to an active-load directory and loaded into the `database-name` asynchronously. | Boolean | `true / false` | No | false |
+
+### 4.2 Example
+
+```SQL
+-- Create target database: database2
+IoTDB> create database database2
+Msg: The statement is executed successfully.
+
+IoTDB> use database2
+Msg: The statement is executed successfully.
+
+IoTDB:database2> show tables details
++---------+-------+------+-------+
+|TableName|TTL(ms)|Status|Comment|
++---------+-------+------+-------+
++---------+-------+------+-------+
+Empty set.
+
+-- Import the TsFile by executing LOAD SQL
+IoTDB:database2> load '/home/dump0.tsfile' with ( 'on-success'='none', 'database-name'='database2')
+Msg: The statement is executed successfully.
+ +-- Verify whether the import was successful +IoTDB:database2> select * from table2 ++-----------------------------+------+--------+---------+-----------+--------+------+-----------------------------+ +| time|region|plant_id|device_id|temperature|humidity|status| arrival_time| ++-----------------------------+------+--------+---------+-----------+--------+------+-----------------------------+ +|2024-11-30T00:00:00.000+08:00| 上海| 3002| 101| 90.0| 35.2| true| null| +|2024-11-29T00:00:00.000+08:00| 上海| 3001| 101| 85.0| 35.1| null|2024-11-29T10:00:13.000+08:00| +|2024-11-27T00:00:00.000+08:00| 北京| 1001| 101| 85.0| 35.1| true|2024-11-27T16:37:01.000+08:00| +|2024-11-29T11:00:00.000+08:00| 上海| 3002| 100| null| 45.1| true| null| +|2024-11-28T08:00:00.000+08:00| 上海| 3001| 100| 85.0| 35.2| false|2024-11-28T08:00:09.000+08:00| +|2024-11-26T13:37:00.000+08:00| 北京| 1001| 100| 90.0| 35.1| true|2024-11-26T13:37:34.000+08:00| ++-----------------------------+------+--------+---------+-----------+--------+------+-----------------------------+ +``` diff --git a/src/UserGuide/Master/Tree/Tools-System/Data-Import-Tool.md b/src/UserGuide/Master/Tree/Tools-System/Data-Import-Tool.md index 1f5f83c01..efa6ba679 100644 --- a/src/UserGuide/Master/Tree/Tools-System/Data-Import-Tool.md +++ b/src/UserGuide/Master/Tree/Tools-System/Data-Import-Tool.md @@ -1,9 +1,10 @@ # Data Import ## 1. Overview -IoTDB supports two methods for data import: -* Data Import Tool: Use the import-data.sh (Unix/OS X) or import-data.bat (Windows) script in the tools directory to manually import CSV, SQL, or TsFile (open-source time-series file format) data into IoTDB. -* TsFile Auto-Loading Feature +IoTDB supports three methods for data import: +- Data Import Tool: Use the `import-data.sh/bat` script in the `tools` directory to manually import CSV, SQL, or TsFile (open-source time-series file format) data into IoTDB. +- `TsFile` Auto-Loading Feature +- Load `TsFile` SQL
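
The Load SQL statement shown in section 4.1 is a path literal followed by an optional list of quoted key/value attributes. As a rough illustration, the Python sketch below renders such a statement from a dictionary. The helper `build_load_statement` is hypothetical and not part of any IoTDB client; it only produces text in the shape of the documented example, and it rejects values containing single quotes because the quoting rules are not described here.

```python
from typing import Dict, Optional


def _quote(text: str) -> str:
    """Wrap text in single quotes; refuse embedded quotes (escaping is unspecified)."""
    if "'" in text:
        raise ValueError(f"single quote not supported in {text!r}")
    return "'" + text + "'"


def build_load_statement(path: str, attributes: Optional[Dict[str, str]] = None) -> str:
    """Render a `load '<path>' with (...)` statement as a string.

    The attribute keys are passed through unchanged; this sketch does not
    check them against the attribute table in section 4.1.
    """
    statement = f"load {_quote(path)}"
    if attributes:
        pairs = ", ".join(f"{_quote(k)}={_quote(v)}" for k, v in attributes.items())
        statement += f" with ({pairs})"
    return statement


if __name__ == "__main__":
    print(build_load_statement(
        "/home/dump0.tsfile",
        {"on-success": "none", "database-name": "database2"},
    ))
```

Keeping `'on-success'='none'` in the attributes, as the documented example does, leaves the source TsFile in place, which is a cautious choice for a first trial because the documented default is `delete`.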
@@ -21,12 +22,16 @@ IoTDB supports two methods for data import: - + - - + + + + + +
Can be used for single or batch import of SQL files into IoTDB
TsFile | TsFile | Can be used for single or batch import of TsFile files into IoTDB
-TsFile Auto-Loading Feature | Can automatically monitor a specified directory for newly generated TsFiles and load them into IoTDB.
+TsFile Auto-Loading Feature | Can automatically monitor a specified directory for newly generated TsFiles and load them into IoTDB
Load SQL | Can be used for single or batch import of TsFile files into IoTDB
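
The auto-loading configuration table in section 3.1 states two fallback rules: an out-of-range `load_active_listening_max_thread_num` is reset to max(1, CPU cores / 2), and a `load_active_listening_check_interval_seconds` below 1 is reset to 5. The Python sketch below restates those two rules so the resulting values are easy to check. It is an illustration only: the function names are hypothetical, and treating "CPU cores / 2" as integer division is an assumption.

```python
from typing import Optional

DEFAULT_CHECK_INTERVAL_SECONDS = 5  # documented default polling interval


def effective_max_thread_num(configured: Optional[int], cpu_cores: int) -> int:
    """Apply the documented fallback for load_active_listening_max_thread_num.

    `configured` is None when the parameter is commented out.
    Assumption: "CPU cores / 2" means integer division.
    """
    upper = cpu_cores // 2
    default = max(1, upper)
    if configured is None or configured < 1 or configured > upper:
        return default
    return configured


def effective_check_interval(configured: Optional[int]) -> int:
    """Apply the documented fallback for load_active_listening_check_interval_seconds."""
    if configured is None or configured < 1:
        return DEFAULT_CHECK_INTERVAL_SECONDS
    return configured


if __name__ == "__main__":
    print(effective_max_thread_num(16, cpu_cores=8))
    print(effective_check_interval(0))
```

Read this way, the usable range for the thread setting is bounded by half the core count rather than by the `Long.MAX_VALUE` upper limit listed in the Value Range column.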
@@ -46,9 +51,9 @@ IoTDB supports two methods for data import: | `-tz` | `--timezone` | Timezone (e.g., `+08:00`, `-01:00`). | No | System default | | `-help` | `--help` | Display help (general or format-specific: `-help csv`). | No | - | -### 2.2 CSV Format +### 2.2 CSV Format -#### 2.2.1 Command +#### 2.2.1 Command ```Shell # Unix/OS X > tools/import-data.sh -ft [-h ] [-p ] [-u ] [-pw ] @@ -63,7 +68,7 @@ IoTDB supports two methods for data import: [-tn ] ``` -#### 2.2.2 CSV-Specific Parameters +#### 2.2.2 CSV-Specific Parameters | Short | Full Parameter | Description | Required | Default | | ---------------- | ------------------------------- |----------------------------------------------------------| ---------- |-----------------| @@ -74,7 +79,7 @@ IoTDB supports two methods for data import: | `-ti` | `--type_infer` | Type mapping (e.g., `BOOLEAN=text,INT=long`). | No | - | | `-tp` | `--timestamp_precision` | Timestamp precision: `ms`, `us`, `ns`. | No | `ms` | -#### 2.2.3 Examples +#### 2.2.3 Examples ```Shell # Valid Example @@ -134,9 +139,9 @@ Time,Device,str(TEXT),var(INT32) ``` -### 2.3 SQL Format +### 2.3 SQL Format -#### 2.3.1 Command +#### 2.3.1 Command ```Shell # Unix/OS X @@ -150,7 +155,7 @@ Time,Device,str(TEXT),var(INT32) [-batch ] [-tn ] ``` -#### 2.3.2 SQL-Specific Parameters +#### 2.3.2 SQL-Specific Parameters | Short | Full Parameter | Description | Required | Default | | -------------- | ------------------------------- | -------------------------------------------------------------------- | ---------- | ------------------ | @@ -158,7 +163,7 @@ Time,Device,str(TEXT),var(INT32) | `-lpf` | `--lines_per_failed_file` | Max lines per failed file. | No | `100000`
Range: 0 to Integer.Max(2147483647). | | `-batch` | `--batch_size` | Rows processed per API call. | No | `100000`
Range: 0 to Integer.Max(2147483647). | -#### 2.3.3 Examples +#### 2.3.3 Examples ```Shell # Valid Example @@ -174,9 +179,9 @@ error: Source file or directory /path/sql does not exist > tools/import-data.sh -ft sql -s /path/sql -tn 0 error: Invalid thread number '0'. Please set a positive integer. ``` -### 2.4 TsFile Format +### 2.4 TsFile Format -#### 2.4.1 Command +#### 2.4.1 Command ```Shell # Unix/OS X @@ -189,7 +194,7 @@ error: Invalid thread number '0'. Please set a positive integer. -s -os [-sd ] -of [-fd ] [-tn ] [-tz ] [-tp ] ``` -#### 2.4.2 TsFile-Specific Parameters +#### 2.4.2 TsFile-Specific Parameters | Short | Full Parameter | Description | Required | Default | | ----------- | ----------------------------- |-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| ----------------- | --------------------------- | @@ -199,7 +204,7 @@ error: Invalid thread number '0'. Please set a positive integer. | `-fd` | `--fail_dir` | Target directory for `mv`/`cp` actions on failure. Required if `-of` is `mv`/`cp`. The file name will be flattened and concatenated with the original file name. | Conditional | `${EXEC_DIR}/fail` | | `-tp` | `--timestamp_precision` | TsFile timestamp precision: `ms`, `us`, `ns`.
For non-remote TsFile imports: Use -tp to specify the timestamp precision of the TsFile. The system will manually verify if the timestamp precision matches the server. If it does not match, an error will be returned.
​For remote TsFile imports: Use -tp to specify the timestamp precision of the TsFile. The Pipe system will automatically verify if the timestamp precision matches. If it does not match, a Pipe error will be returned. | No | `ms` | -#### 2.4.3 Examples +#### 2.4.3 Examples ```Shell # Valid Example @@ -242,3 +247,54 @@ Add the following parameters to `iotdb-system.properties` (template: `iotdb-syst 2. ​​**Restricted Directories**​: Do NOT set Pipe receiver directories, data directories, or other system paths as monitored directories. 3. ​​**Directory Conflicts**​: Ensure `load_active_listening_fail_dir` does not overlap with `load_active_listening_dirs` or its subdirectories. 4. ​​**Permissions**​: The monitored directory must have write permissions. Files are deleted after successful loading; insufficient permissions may cause duplicate loading. + +## 4. Load SQL + +IoTDB supports importing one or multiple TsFile files containing time series into another running IoTDB instance directly via SQL execution through the CLI. + +### 4.1 Command + +```SQL +load '' with ( + 'attribute-key1'='attribute-value1', + 'attribute-key2'='attribute-value2', +) +``` + +* `` : The path to a TsFile or a folder containing multiple TsFiles. +* ``: Optional parameters, as described below. 
+
+| Key | Key Description | Value Type | Value Range | Required | Default Value |
+|-----|-----------------|------------|-------------|----------|---------------|
+| `database-level` | When the database corresponding to the TsFile does not exist, the database hierarchy level can be specified via the `database-level` parameter. The default is the level set in `iotdb-common.properties`. For example, setting level=1 means the prefix path of level 1 in all time series in the TsFile will be used as the database. | Integer | `[1: Integer.MAX_VALUE]` | No | 1 |
+| `on-success` | Action for successfully loaded TsFiles: `delete` (delete the TsFile after successful import) or `none` (retain the TsFile in the source folder). | String | `delete / none` | No | delete |
+| `model` | Specifies whether the TsFile uses the `table` model or `tree` model. | String | `tree / table` | No | Aligns with `-sql_dialect` |
+| `database-name` | Table model only: Target database for import. Automatically created if it does not exist. The `database-name` value must not include the `root.` prefix (an error will occur if included). | String | `-` | No | null |
+| `convert-on-type-mismatch` | Whether to perform type conversion during loading if data types in the TsFile mismatch the target schema. | Boolean | `true / false` | No | true |
+| `verify` | Whether to validate the schema before loading the TsFile. | Boolean | `true / false` | No | true |
+| `tablet-conversion-threshold` | Size threshold (in bytes) for converting TsFiles into tablet format during loading. Default: `-1` (no conversion for any TsFile). | Integer | `-1`, or `[0: Integer.MAX_VALUE]` | No | -1 |
+| `async` | Whether to enable asynchronous loading. If enabled, TsFiles are moved to an active-load directory and loaded into the `database-name` asynchronously. | Boolean | `true / false` | No | false |
+
+### 4.2 Example
+
+```SQL
+-- Before import
+IoTDB> show databases
++-------------+-----------------------+---------------------+-------------------+---------------------+
+|     Database|SchemaReplicationFactor|DataReplicationFactor|TimePartitionOrigin|TimePartitionInterval|
++-------------+-----------------------+---------------------+-------------------+---------------------+
+|root.__system|                      1|                    1|                  0|            604800000|
++-------------+-----------------------+---------------------+-------------------+---------------------+
+
+-- Import the TsFile by executing LOAD SQL
+IoTDB> load '/home/dump1.tsfile' with ( 'on-success'='none')
+Msg: The statement is executed successfully.
+
+-- Verify whether the import was successful
+IoTDB> select * from root.testdb.**
++-----------------------------+------------------------------------+---------------------------------+-------------------------------+
+|                         Time|root.testdb.device.model.temperature|root.testdb.device.model.humidity|root.testdb.device.model.status|
++-----------------------------+------------------------------------+---------------------------------+-------------------------------+
+|2025-04-17T10:35:47.218+08:00|                                22.3|                             19.4|                           true|
++-----------------------------+------------------------------------+---------------------------------+-------------------------------+
+```
\ No newline at end of file
diff --git a/src/UserGuide/latest-Table/Tools-System/Data-Import-Tool.md b/src/UserGuide/latest-Table/Tools-System/Data-Import-Tool.md
index b56e4a4a0..726ada898 100644
--- a/src/UserGuide/latest-Table/Tools-System/Data-Import-Tool.md
+++ b/src/UserGuide/latest-Table/Tools-System/Data-Import-Tool.md
@@ -1,7 +1,11 @@
 # Data Import
 ## 1. 
Functional Overview -The data import tool `import-data.sh/bat` is located in the `tools` directory and can import data in ​CSV, ​SQL, and ​TsFile (an open-source time-series file format) into ​IoTDB. Its specific functionalities are as follows: + +IoTDB supports three methods for data import: +- Data Import Tool: Use the `import-data.sh/bat` script in the `tools` directory to manually import CSV, SQL, or TsFile (open-source time-series file format) data into IoTDB. +- `TsFile` Auto-Loading Feature +- Load `TsFile` SQL @@ -19,13 +23,21 @@ The data import tool `import-data.sh/bat` is located in the `tools` directory an - + - + + + + + + + + +
Can be used for single or batch import of SQL files into IoTDB
TsFile | TsFile | Can be used for single or batch import of TsFile files into IoTDB
TsFile Auto-Loading Feature | Can automatically monitor a specified directory for newly generated TsFiles and load them into IoTDB
Load SQL | Can be used for single or batch import of TsFile files into IoTDB
-## 2. Detailed Features +## 2. Data Import Tool ### 2.1 Common Parameters | Short | Full Parameter | Description | Required | Default | @@ -183,3 +195,105 @@ Fail to insert measurements '[column.name]' caused by [data type is not consiste > tools/import-data.sh -ft tsfile -sql_dialect table -s ./tsfile -db database1 Parse error: Missing required options: os, of ``` + +## 3. TsFile Auto-Loading + +This feature enables IoTDB to automatically monitor a specified directory for new TsFiles and load them into the database without manual intervention. + +![](/img/Data-import2.png) + +### 3.1 Configuration + +Add the following parameters to `iotdb-system.properties` (template: `iotdb-system.properties.template`): + +| Parameter | Description | Value Range | Required | Default | Hot-Load? | +| ---------------------------------------------------- |----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|--------------------------| ---------- | ----------------------------- | ----------------------- | +| `load_active_listening_enable` | Enable auto-loading. | `true`/`false` | Optional | `true` | Yes | +| `load_active_listening_dirs` | Directories to monitor (subdirectories included). Multiple paths separated by commas.
Note: In the table model, the name of the directory that contains the file is used as the database name. | String | Optional | `ext/load/pending` | Yes |
+| `load_active_listening_fail_dir` | Directory to store failed TsFiles. Only one directory can be set. | String | Optional | `ext/load/failed` | Yes |
+| `load_active_listening_max_thread_num` | Maximum number of threads for TsFile loading tasks. When this parameter is commented out, the default is max(1, CPU cores / 2). A user-set value outside the range [1, CPU cores / 2] is reset to the default max(1, CPU cores / 2). | `1` to `Long.MAX_VALUE` | Optional | `max(1, CPU_CORES / 2)` | No (restart required) |
+| `load_active_listening_check_interval_seconds` | Polling interval for active listening, in seconds. Active listening for TsFiles is implemented by polling the target directories; this setting is the interval between two consecutive checks of `load_active_listening_dirs`. After each check, the next one runs `load_active_listening_check_interval_seconds` seconds later. A user-set interval less than 1 is reset to the default of 5 seconds. | `1` to `Long.MAX_VALUE` | Optional | `5` | No (restart required) |
+
+### 3.2 Examples
+
+```bash
+load_active_listening_dir/
+├─sensors/
+│ ├─temperature/
+│ │ └─temperature-table.TSFILE
+
+```
+
+- Table model TsFile
+  - `temperature-table.TSFILE`: will be imported into the `temperature` database (because it is located in the `sensors/temperature/` directory)
+
+
+### 3.3 Notes
+
+1. **Mods Files**: If TsFiles have associated `.mods` files, move the `.mods` files to the monitored directory **before** their corresponding TsFiles. Ensure each `.mods` file and its TsFile are in the same directory.
+2. **Restricted Directories**: Do NOT set Pipe receiver directories, data directories, or other system paths as monitored directories.
+3. **Directory Conflicts**: Ensure `load_active_listening_fail_dir` does not overlap with `load_active_listening_dirs` or its subdirectories.
+4. **Permissions**: IoTDB needs write permission on the monitored directories. Files are deleted after successful loading, so insufficient permissions may cause duplicate loading.
+
+
+## 4. Load SQL
+
+IoTDB supports importing one or more TsFiles containing time series into another running IoTDB instance directly by executing SQL through the CLI.
+
+### 4.1 Command
+
+```SQL
+load '' with (
+ 'attribute-key1'='attribute-value1',
+ 'attribute-key2'='attribute-value2',
+)
+```
+
+* `` : The path to a TsFile or a folder containing multiple TsFiles.
+* ``: Optional parameters, as described below.
+
+| Key | Key Description | Value Type | Value Range | Required | Default Value |
+|-----|-----------------|------------|-------------|----------|---------------|
+| `database-level` | When the database corresponding to the TsFile does not exist, the database hierarchy level can be specified via the `database-level` parameter. The default is the level set in `iotdb-common.properties`. For example, setting level=1 means the prefix path of level 1 in all time series in the TsFile will be used as the database. | Integer | `[1: Integer.MAX_VALUE]` | No | 1 |
+| `on-success` | Action for successfully loaded TsFiles: `delete` (delete the TsFile after successful import) or `none` (retain the TsFile in the source folder). | String | `delete / none` | No | delete |
+| `model` | Specifies whether the TsFile uses the `table` model or `tree` model. | String | `tree / table` | No | Aligns with `-sql_dialect` |
+| `database-name` | Table model only: Target database for import. Automatically created if it does not exist. The `database-name` value must not include the `root.` prefix (an error will occur if included). | String | `-` | No | null |
+| `convert-on-type-mismatch` | Whether to perform type conversion during loading if data types in the TsFile mismatch the target schema. | Boolean | `true / false` | No | true |
+| `verify` | Whether to validate the schema before loading the TsFile. | Boolean | `true / false` | No | true |
+| `tablet-conversion-threshold` | Size threshold (in bytes) for converting TsFiles into tablet format during loading. Default: `-1` (no conversion for any TsFile). | Integer | `-1`, or `[0: Integer.MAX_VALUE]` | No | -1 |
+| `async` | Whether to enable asynchronous loading. If enabled, TsFiles are moved to an active-load directory and loaded into the `database-name` asynchronously. | Boolean | `true / false` | No | false |
+
+### 4.2 Example
+
+```SQL
+-- Create target database: database2
+IoTDB> create database database2
+Msg: The statement is executed successfully.
+
+IoTDB> use database2
+Msg: The statement is executed successfully.
+
+IoTDB:database2> show tables details
++---------+-------+------+-------+
+|TableName|TTL(ms)|Status|Comment|
++---------+-------+------+-------+
++---------+-------+------+-------+
+Empty set.
+
+-- Import the TsFile by executing LOAD SQL
+IoTDB:database2> load '/home/dump0.tsfile' with ( 'on-success'='none', 'database-name'='database2')
+Msg: The statement is executed successfully.
+ +-- Verify whether the import was successful +IoTDB:database2> select * from table2 ++-----------------------------+------+--------+---------+-----------+--------+------+-----------------------------+ +| time|region|plant_id|device_id|temperature|humidity|status| arrival_time| ++-----------------------------+------+--------+---------+-----------+--------+------+-----------------------------+ +|2024-11-30T00:00:00.000+08:00| 上海| 3002| 101| 90.0| 35.2| true| null| +|2024-11-29T00:00:00.000+08:00| 上海| 3001| 101| 85.0| 35.1| null|2024-11-29T10:00:13.000+08:00| +|2024-11-27T00:00:00.000+08:00| 北京| 1001| 101| 85.0| 35.1| true|2024-11-27T16:37:01.000+08:00| +|2024-11-29T11:00:00.000+08:00| 上海| 3002| 100| null| 45.1| true| null| +|2024-11-28T08:00:00.000+08:00| 上海| 3001| 100| 85.0| 35.2| false|2024-11-28T08:00:09.000+08:00| +|2024-11-26T13:37:00.000+08:00| 北京| 1001| 100| 90.0| 35.1| true|2024-11-26T13:37:34.000+08:00| ++-----------------------------+------+--------+---------+-----------+--------+------+-----------------------------+ +``` diff --git a/src/UserGuide/latest/Tools-System/Data-Import-Tool.md b/src/UserGuide/latest/Tools-System/Data-Import-Tool.md index 1f5f83c01..4387a9aea 100644 --- a/src/UserGuide/latest/Tools-System/Data-Import-Tool.md +++ b/src/UserGuide/latest/Tools-System/Data-Import-Tool.md @@ -1,9 +1,10 @@ # Data Import ## 1. Overview -IoTDB supports two methods for data import: -* Data Import Tool: Use the import-data.sh (Unix/OS X) or import-data.bat (Windows) script in the tools directory to manually import CSV, SQL, or TsFile (open-source time-series file format) data into IoTDB. -* TsFile Auto-Loading Feature +IoTDB supports three methods for data import: +- Data Import Tool: Use the `import-data.sh/bat` script in the `tools` directory to manually import CSV, SQL, or TsFile (open-source time-series file format) data into IoTDB. 
+- `TsFile` Auto-Loading Feature +- Load `TsFile` SQL @@ -21,12 +22,16 @@ IoTDB supports two methods for data import: - + - - + + + + + +
Can be used for single or batch import of SQL files into IoTDB
TsFileTsFile Can be used for single or batch import of TsFile files into IoTDB
TsFile Auto-Loading FeatureCan automatically monitor a specified directory for newly generated TsFiles and load them into IoTDB.TsFile Auto-Loading FeatureCan automatically monitor a specified directory for newly generated TsFiles and load them into IoTDB
Load SQLCan be used for single or batch import of TsFile files into IoTDB
@@ -242,3 +247,54 @@ Add the following parameters to `iotdb-system.properties` (template: `iotdb-syst 2. ​​**Restricted Directories**​: Do NOT set Pipe receiver directories, data directories, or other system paths as monitored directories. 3. ​​**Directory Conflicts**​: Ensure `load_active_listening_fail_dir` does not overlap with `load_active_listening_dirs` or its subdirectories. 4. ​​**Permissions**​: The monitored directory must have write permissions. Files are deleted after successful loading; insufficient permissions may cause duplicate loading. + +## 4. Load SQL + +IoTDB supports importing one or multiple TsFile files containing time series into another running IoTDB instance directly via SQL execution through the CLI. + +### 4.1 Command + +```SQL +load '' with ( + 'attribute-key1'='attribute-value1', + 'attribute-key2'='attribute-value2', +) +``` + +* `` : The path to a TsFile or a folder containing multiple TsFiles. +* ``: Optional parameters, as described below. + +| Key | Key Description | Value Type | Value Range | Value is Required | Default Value | +|--------------------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-------------|--------------------------------|-------------------|----------------------------| +| `database-level` | When the database corresponding to the TsFile does not exist, the database hierarchy level can be specified via the ` database-level` parameter. The default is the level set in `iotdb-common.properties`. For example, setting level=1 means the prefix path of level 1 in all time series in the TsFile will be used as the database. 
| Integer | `[1: Integer.MAX_VALUE]` | No | 1 | +| `on-success` | Action for successfully loaded TsFiles: `delete` (delete the TsFile after successful import) or `none` (retain the TsFile in the source folder). | String | `delete / none` | No | delete | +| `model` | Specifies whether the TsFile uses the `table` model or `tree` model. | String | `tree / table` | No | Aligns with `-sql_dialect` | +| `database-name` | Table model only: Target database for import. Automatically created if it does not exist. `database-name` must not include the `root.` prefix (an error will occur if included). | String | `-` | No | null | +| `convert-on-type-mismatch` | Whether to perform type conversion during loading if data types in the TsFile do not match the target schema. | Boolean | `true / false` | No | true | +| `verify` | Whether to validate the schema before loading the TsFile. | Boolean | `true / false` | No | true | +| `tablet-conversion-threshold` | Size threshold (in bytes) for converting TsFiles into tablet format during loading. Default: `-1` (no conversion for any TsFile). | Integer | `[-1,0 : Integer.MAX_VALUE]` | No | -1 | +| `async` | Whether to enable asynchronous loading. If enabled, TsFiles are moved to an active-load directory and loaded into the `database-name` asynchronously. 
| Boolean | `true / false` | No | false | + +### 4.2 Example + +```SQL +-- Before import +IoTDB> show databases ++-------------+-----------------------+---------------------+-------------------+---------------------+ +| Database|SchemaReplicationFactor|DataReplicationFactor|TimePartitionOrigin|TimePartitionInterval| ++-------------+-----------------------+---------------------+-------------------+---------------------+ +|root.__system| 1| 1| 0| 604800000| ++-------------+-----------------------+---------------------+-------------------+---------------------+ + +-- Import the TsFile by executing load sql +IoTDB> load '/home/dump1.tsfile' with ( 'on-success'='none') +Msg: The statement is executed successfully. + +-- Verify whether the import was successful +IoTDB> select * from root.testdb.** ++-----------------------------+------------------------------------+---------------------------------+-------------------------------+ +| Time|root.testdb.device.model.temperature|root.testdb.device.model.humidity|root.testdb.device.model.status| ++-----------------------------+------------------------------------+---------------------------------+-------------------------------+ +|2025-04-17T10:35:47.218+08:00| 22.3| 19.4| true| ++-----------------------------+------------------------------------+---------------------------------+-------------------------------+ +``` \ No newline at end of file diff --git a/src/zh/UserGuide/Master/Table/Tools-System/Data-Import-Tool.md b/src/zh/UserGuide/Master/Table/Tools-System/Data-Import-Tool.md index b77509a3f..738b440d4 100644 --- a/src/zh/UserGuide/Master/Table/Tools-System/Data-Import-Tool.md +++ b/src/zh/UserGuide/Master/Table/Tools-System/Data-Import-Tool.md @@ -2,7 +2,10 @@ ## 1. 
功能概述 -数据导出工具 `import-data.sh/bat` 位于 `tools` 目录下,可以将 CSV、SQL、及 TsFile(开源时序文件格式)的数据导入 IoTDB。具体功能如下: +IoTDB 支持三种方式进行数据导入: +- 数据导入工具 :`import-data.sh/bat` 位于 `tools` 目录下,可以将 `CSV`、`SQL`、及`TsFile`(开源时序文件格式)的数据导入 `IoTDB`。 +- `TsFile` 自动加载功能。 +- `Load SQL` 导入 `TsFile` 。 @@ -20,15 +23,23 @@ - + + + + + + + + +
可用于单个或一个目录的 SQL 文件批量导入 IoTDB
TsFileTsFile 可用于单个或一个目录的 TsFile 文件批量导入 IoTDB
TsFile 自动加载可以监听指定路径下新产生的 TsFile 文件,并将其加载进 IoTDB
Load SQL可用于单个或一个目录的 TsFile 文件批量导入 IoTDB
- **表模型 TsFile 导入暂时只支持本地导入。** -## 2. 功能详解 +## 2. 数据导入工具 ### 2.1 公共参数 @@ -98,11 +109,11 @@ There are no tables or the target table table5 does not exist 1. CSV 导入规范 - - 特殊字符转义规则:若Text类型的字段中包含特殊字符(例如逗号`,`),需使用反斜杠(`\`)​进行转义处理。 - - 支持的时间格式:`yyyy-MM-dd'T'HH:mm:ss`, `yyy-MM-dd HH:mm:ss`, 或者 `yyyy-MM-dd'T'HH:mm:ss.SSSZ` 。 - - 时间戳列​必须作为数据文件的首列存在。 +- 特殊字符转义规则:若Text类型的字段中包含特殊字符(例如逗号`,`),需使用反斜杠(`\`)​进行转义处理。 +- 支持的时间格式:`yyyy-MM-dd'T'HH:mm:ss`, `yyy-MM-dd HH:mm:ss`, 或者 `yyyy-MM-dd'T'HH:mm:ss.SSSZ` 。 +- 时间戳列​必须作为数据文件的首列存在。 -2. CSV 文件示例 +2. CSV 文件示例 ```sql time,region,device,model,temperature,humidity @@ -110,6 +121,7 @@ time,region,device,model,temperature,humidity 1970-01-01T08:00:00.002+08:00,"上海","101","F",90.0,34.8 ``` + ### 2.3 SQL 格式 #### 2.3.1 运行命令 @@ -189,3 +201,103 @@ Fail to insert measurements '[column.name]' caused by [data type is not consiste > tools/import-data.sh -ft tsfile -sql_dialect table -s ./tsfile -db database1 Parse error: Missing required options: os, of ``` + +## 3. TsFile 自动加载功能 + +本功能允许 IoTDB 主动监听指定目录下的新增 TsFile,并将 TsFile 自动加载至 IoTDB 中。通过此功能,IoTDB 能自动检测并加载 TsFile,无需手动执行任何额外的加载操作。 + +![](/img/Data-import1.png) + +### 3.1 配置参数 + +可通过从配置文件模版 `iotdb-system.properties.template` 中找到下列参数,添加到 IoTDB 配置文件 `iotdb-system.properties` 中开启 TsFile 自动加载功能。完整配置如下: + +| **配置参数** | **参数说明** | **value 取值范围** | **是否必填** | **默认值** | **加载方式** | +| --------------------------------------------------- |-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| ---------------------------- | -------------------- | ------------------------ | -------------------- | +| load\_active\_listening\_enable | 是否开启 DataNode 主动监听并且加载 tsfile 的功能(默认开启)。 | Boolean: true,false | 选填 | true | 热加载 | +| load\_active\_listening\_dirs | 需要监听的目录(自动包括目录中的子目录),如有多个使用 “,“ 隔开;
默认的目录为 `ext/load/pending`;
支持热装载;
**注意:表模型中,文件所在的目录名会作为 database**; | String: 一个或多个文件目录 | 选填 | `ext/load/pending` | 热加载 | +| load\_active\_listening\_fail\_dir | 执行加载 tsfile 文件失败后将文件转存的目录,只能配置一个 | String: 一个文件目录 | 选填 | `ext/load/failed` | 热加载 | +| load\_active\_listening\_max\_thread\_num | 同时执行加载 tsfile 任务的最大线程数,参数被注释掉时的默值为 max(1, CPU 核心数 / 2),当用户设置的值不在这个区间[1, CPU核心数 /2]内时,会设置为默认值 (1, CPU 核心数 / 2) | Long: [1, Long.MAX\_VALUE] | 选填 | max(1, CPU 核心数 / 2) | 重启后生效 | +| load\_active\_listening\_check\_interval\_seconds | 主动监听轮询间隔,单位秒。主动监听 tsfile 的功能是通过轮询检查文件夹实现的。该配置指定了两次检查 `load_active_listening_dirs` 的时间间隔,每次检查完成 `load_active_listening_check_interval_seconds` 秒后,会执行下一次检查。当用户设置的轮询间隔小于 1 时,会被设置为默认值 5 秒 | Long: [1, Long.MAX\_VALUE] | 选填 | 5 | 重启后生效 | + +### 3.2 示例说明 + +```bash +load_active_listening_dir/ +├─sensors/ +│ ├─temperature/ +│ │ └─temperature-table.TSFILE + +``` + +- 表模型 TsFile + - `temperature-table.TSFILE`: 会被导入到 `temperature` database 下(因为它位于`sensors/temperature/` 目录下) + +### 3.3 注意事项 + +1. 如果待加载的文件中,存在 mods 文件,应优先将 mods 文件移动到监听目录下面,然后再移动 tsfile 文件,且 mods 文件应和对应的 tsfile 文件处于同一目录。防止加载到 tsfile 文件时,加载不到对应的 mods 文件 +2. 禁止设置 Pipe 的 receiver 目录、存放数据的 data 目录等作为监听目录 +3. 禁止 `load_active_listening_fail_dir` 与 `load_active_listening_dirs` 存在相同的目录,或者互相嵌套 +4. 保证 `load_active_listening_dirs` 目录有足够的权限,在加载成功之后,文件将会被删除,如果没有删除权限,则会重复加载 + +## 4. 
Load SQL + +IoTDB 支持通过 CLI 执行 SQL 直接将存有时间序列的一个或多个 TsFile 文件导入到另外一个正在运行的 IoTDB 实例中。 + +### 4.1 运行命令 + +```SQL +load '' with ( + 'attribute-key1'='attribute-value1', + 'attribute-key2'='attribute-value2', +) +``` + +* `` :文件本身,或是包含若干文件的文件夹路径 +* ``:可选参数,具体如下表所示 + +| Key | Key 描述 | Value 类型 | Value 取值范围 | Value 是否必填 | Value 默认值 | +| --------------------------------------- |------------------------------------------------------------------------------------------------------------------------------------------------------------------------| ------------ | ----------------------------------------- | ---------------- | -------------------------- | +| `database-level` | 当 tsfile 对应的 database 不存在时,可以通过` database-level`参数的值来制定 database 的级别,默认为`iotdb-common.properties`中设置的级别。
例如当设置 level 参数为 1 时表明此 tsfile 中所有时间序列中层级为1的前缀路径是 database。 | Integer | `[1: Integer.MAX_VALUE]` | 否 | 1 | +| `on-success` | 表示对于成功载入的 tsfile 的处置方式:默认为`delete`,即tsfile 成功加载后将被删除;`none `表明 tsfile 成功加载之后依然被保留在源文件夹, | String | `delete / none` | 否 | delete | +| `model` | 指定写入的 tsfile 是表模型还是树模型 | String | `tree / table` | 否 | 与`-sql_dialect`一致 | +| `database-name` | **仅限表模型有效**: 文件导入的目标 database,不存在时会自动创建,`database-name`中不允许包括"`root.`"前缀,如果包含,将会报错。 | String | `-` | 否 | null | +| `convert-on-type-mismatch` | 加载 tsfile 时,如果数据类型不一致,是否进行转换 | Boolean | `true / false` | 否 | true | +| `verify` | 加载 tsfile 前是否校验 schema | Boolean | `true / false` | 否 | true | +| `tablet-conversion-threshold` | 转换为 tablet 形式的 tsfile 大小阈值,针对小文件 tsfile 加载,采用将其转换为 tablet 形式进行写入:默认值为 -1,即任意大小 tsfile 都不进行转换 | Integer | `[-1,0 :`​`Integer.MAX_VALUE]` | 否 | -1 | +| `async` | 是否开启异步加载 tsfile,将文件移到 active load 目录下面,所有的 tsfile 都 load 到`database-name`下. | Boolean | `true / false` | 否 | false | + +### 4.2 运行示例 + +```SQL +-- 准备目标数据库 database2 +IoTDB> create database database2 +Msg: The statement is executed successfully. + +IoTDB> use database2 +Msg: The statement is executed successfully. + +IoTDB:database2> show tables details ++---------+-------+------+-------+ +|TableName|TTL(ms)|Status|Comment| ++---------+-------+------+-------+ ++---------+-------+------+-------+ +Empty set. + +--通过执行load sql 导入tsfile +IoTDB:database2> load '/home/dump0.tsfile' with ( 'on-success'='none', 'database-name'='database2') +Msg: The statement is executed successfully. 
+ +-- 验证数据导入成功 +IoTDB:database2> select * from table2 ++-----------------------------+------+--------+---------+-----------+--------+------+-----------------------------+ +| time|region|plant_id|device_id|temperature|humidity|status| arrival_time| ++-----------------------------+------+--------+---------+-----------+--------+------+-----------------------------+ +|2024-11-30T00:00:00.000+08:00| 上海| 3002| 101| 90.0| 35.2| true| null| +|2024-11-29T00:00:00.000+08:00| 上海| 3001| 101| 85.0| 35.1| null|2024-11-29T10:00:13.000+08:00| +|2024-11-27T00:00:00.000+08:00| 北京| 1001| 101| 85.0| 35.1| true|2024-11-27T16:37:01.000+08:00| +|2024-11-29T11:00:00.000+08:00| 上海| 3002| 100| null| 45.1| true| null| +|2024-11-28T08:00:00.000+08:00| 上海| 3001| 100| 85.0| 35.2| false|2024-11-28T08:00:09.000+08:00| +|2024-11-26T13:37:00.000+08:00| 北京| 1001| 100| 90.0| 35.1| true|2024-11-26T13:37:34.000+08:00| ++-----------------------------+------+--------+---------+-----------+--------+------+-----------------------------+ +``` diff --git a/src/zh/UserGuide/Master/Tree/Tools-System/Data-Import-Tool.md b/src/zh/UserGuide/Master/Tree/Tools-System/Data-Import-Tool.md index fd267bcb7..0338fcdbe 100644 --- a/src/zh/UserGuide/Master/Tree/Tools-System/Data-Import-Tool.md +++ b/src/zh/UserGuide/Master/Tree/Tools-System/Data-Import-Tool.md @@ -2,10 +2,10 @@ ## 1. 功能概述 -IoTDB 支持两种方式进行数据导入 - -* 数据导入工具:tools 目录下的手动数据导入工具 `import-data.sh/bat`,可以将 CSV、SQL、及TsFile(开源时序文件格式)的数据导入 IoTDB。 -* TsFile 自动加载功能 +IoTDB 支持三种方式进行数据导入: +- 数据导入工具 :`import-data.sh/bat` 位于 `tools` 目录下,可以将 `CSV`、`SQL`、及`TsFile`(开源时序文件格式)的数据导入 `IoTDB`。 +- `TsFile` 自动加载功能。 +- `Load SQL` 导入 `TsFile` 。 @@ -23,12 +23,16 @@ IoTDB 支持两种方式进行数据导入 - + - - + + + + + +
可用于单个或一个目录的 SQL 文件批量导入 IoTDB
TsFileTsFile 可用于单个或一个目录的 TsFile 文件批量导入 IoTDB
TsFile 自动加载功能 可以监听指定路径下新产生的TsFile文件,并将其加载进IoTDBTsFile 自动加载可以监听指定路径下新产生的 TsFile 文件,并将其加载进 IoTDB
Load SQL可用于单个或一个目录的 TsFile 文件批量导入 IoTDB
@@ -250,3 +254,54 @@ error: Invalid thread number '0'. Please set a positive integer. 2. 禁止设置 Pipe 的 receiver 目录、存放数据的 data 目录等作为监听目录 3. 禁止 `load_active_listening_fail_dir` 与 `load_active_listening_dirs` 存在相同的目录,或者互相嵌套 4. 保证 `load_active_listening_dirs` 目录有足够的权限,在加载成功之后,文件将会被删除,如果没有删除权限,则会重复加载 + +## 4. Load SQL + +IoTDB 支持通过 CLI 执行 SQL 直接将存有时间序列的一个或多个 TsFile 文件导入到另外一个正在运行的 IoTDB 实例中。 + +### 4.1 运行命令 + +```SQL +load '' with ( + 'attribute-key1'='attribute-value1', + 'attribute-key2'='attribute-value2', +) +``` + +* `` :文件本身,或是包含若干文件的文件夹路径 +* ``:可选参数,具体如下表所示 + +| Key | Key 描述 | Value 类型 | Value 取值范围 | Value 是否必填 | Value 默认值 | +| --------------------------------------- |------------------------------------------------------------------------------------------------------------------------------------------------------------------------| ------------ | ----------------------------------------- | ---------------- | -------------------------- | +| `database-level` | 当 tsfile 对应的 database 不存在时,可以通过` database-level`参数的值来制定 database 的级别,默认为`iotdb-common.properties`中设置的级别。
例如当设置 level 参数为 1 时表明此 tsfile 中所有时间序列中层级为1的前缀路径是 database。 | Integer | `[1: Integer.MAX_VALUE]` | 否 | 1 | +| `on-success` | 表示对于成功载入的 tsfile 的处置方式:默认为`delete`,即tsfile 成功加载后将被删除;`none `表明 tsfile 成功加载之后依然被保留在源文件夹, | String | `delete / none` | 否 | delete | +| `model` | 指定写入的 tsfile 是表模型还是树模型 | String | `tree / table` | 否 | 与`-sql_dialect`一致 | +| `database-name` | **仅限表模型有效**: 文件导入的目标 database,不存在时会自动创建,`database-name`中不允许包括"`root.`"前缀,如果包含,将会报错。 | String | `-` | 否 | null | +| `convert-on-type-mismatch` | 加载 tsfile 时,如果数据类型不一致,是否进行转换 | Boolean | `true / false` | 否 | true | +| `verify` | 加载 tsfile 前是否校验 schema | Boolean | `true / false` | 否 | true | +| `tablet-conversion-threshold` | 转换为 tablet 形式的 tsfile 大小阈值,针对小文件 tsfile 加载,采用将其转换为 tablet 形式进行写入:默认值为 -1,即任意大小 tsfile 都不进行转换 | Integer | `[-1,0 :`​`Integer.MAX_VALUE]` | 否 | -1 | +| `async` | 是否开启异步加载 tsfile,将文件移到 active load 目录下面,所有的 tsfile 都 load 到`database-name`下. | Boolean | `true / false` | 否 | false | + +### 4.2 运行示例 + +```SQL +-- 准备待导入环境 +IoTDB> show databases ++-------------+-----------------------+---------------------+-------------------+---------------------+ +| Database|SchemaReplicationFactor|DataReplicationFactor|TimePartitionOrigin|TimePartitionInterval| ++-------------+-----------------------+---------------------+-------------------+---------------------+ +|root.__system| 1| 1| 0| 604800000| ++-------------+-----------------------+---------------------+-------------------+---------------------+ + +-- 通过load sql 导入 tsfile +IoTDB> load '/home/dump1.tsfile' with ( 'on-success'='none') +Msg: The statement is executed successfully. 
+ +-- 验证数据导入成功 +IoTDB> select * from root.testdb.** ++-----------------------------+------------------------------------+---------------------------------+-------------------------------+ +| Time|root.testdb.device.model.temperature|root.testdb.device.model.humidity|root.testdb.device.model.status| ++-----------------------------+------------------------------------+---------------------------------+-------------------------------+ +|2025-04-17T10:35:47.218+08:00| 22.3| 19.4| true| ++-----------------------------+------------------------------------+---------------------------------+-------------------------------+ +``` \ No newline at end of file diff --git a/src/zh/UserGuide/latest-Table/Tools-System/Data-Import-Tool.md b/src/zh/UserGuide/latest-Table/Tools-System/Data-Import-Tool.md index aec8b0017..ac4ac40e7 100644 --- a/src/zh/UserGuide/latest-Table/Tools-System/Data-Import-Tool.md +++ b/src/zh/UserGuide/latest-Table/Tools-System/Data-Import-Tool.md @@ -2,7 +2,10 @@ ## 1. 功能概述 -数据导出工具 `import-data.sh/bat` 位于 `tools` 目录下,可以将 CSV、SQL、及 TsFile(开源时序文件格式)的数据导入 IoTDB。具体功能如下: +IoTDB 支持三种方式进行数据导入: +- 数据导入工具 :`import-data.sh/bat` 位于 `tools` 目录下,可以将 `CSV`、`SQL`、及`TsFile`(开源时序文件格式)的数据导入 `IoTDB`。 +- `TsFile` 自动加载功能。 +- `Load SQL` 导入 `TsFile` 。 @@ -20,15 +23,23 @@ - + + + + + + + + +
可用于单个或一个目录的 SQL 文件批量导入 IoTDB
TsFileTsFile 可用于单个或一个目录的 TsFile 文件批量导入 IoTDB
TsFile 自动加载可以监听指定路径下新产生的 TsFile 文件,并将其加载进 IoTDB
Load SQL可用于单个或一个目录的 TsFile 文件批量导入 IoTDB
- **表模型 TsFile 导入暂时只支持本地导入。** -## 2. 功能详解 +## 2. 数据导入工具 ### 2.1 公共参数 @@ -190,3 +201,103 @@ Fail to insert measurements '[column.name]' caused by [data type is not consiste > tools/import-data.sh -ft tsfile -sql_dialect table -s ./tsfile -db database1 Parse error: Missing required options: os, of ``` + +## 3. TsFile 自动加载功能 + +本功能允许 IoTDB 主动监听指定目录下的新增 TsFile,并将 TsFile 自动加载至 IoTDB 中。通过此功能,IoTDB 能自动检测并加载 TsFile,无需手动执行任何额外的加载操作。 + +![](/img/Data-import1.png) + +### 3.1 配置参数 + +可通过从配置文件模版 `iotdb-system.properties.template` 中找到下列参数,添加到 IoTDB 配置文件 `iotdb-system.properties` 中开启 TsFile 自动加载功能。完整配置如下: + +| **配置参数** | **参数说明** | **value 取值范围** | **是否必填** | **默认值** | **加载方式** | +| --------------------------------------------------- |-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| ---------------------------- | -------------------- | ------------------------ | -------------------- | +| load\_active\_listening\_enable | 是否开启 DataNode 主动监听并且加载 tsfile 的功能(默认开启)。 | Boolean: true,false | 选填 | true | 热加载 | +| load\_active\_listening\_dirs | 需要监听的目录(自动包括目录中的子目录),如有多个使用 “,“ 隔开;
默认的目录为 `ext/load/pending`;
支持热装载;
**注意:表模型中,文件所在的目录名会作为 database**; | String: 一个或多个文件目录 | 选填 | `ext/load/pending` | 热加载 | +| load\_active\_listening\_fail\_dir | 执行加载 tsfile 文件失败后将文件转存的目录,只能配置一个 | String: 一个文件目录 | 选填 | `ext/load/failed` | 热加载 | +| load\_active\_listening\_max\_thread\_num | 同时执行加载 tsfile 任务的最大线程数,参数被注释掉时的默值为 max(1, CPU 核心数 / 2),当用户设置的值不在这个区间[1, CPU核心数 /2]内时,会设置为默认值 (1, CPU 核心数 / 2) | Long: [1, Long.MAX\_VALUE] | 选填 | max(1, CPU 核心数 / 2) | 重启后生效 | +| load\_active\_listening\_check\_interval\_seconds | 主动监听轮询间隔,单位秒。主动监听 tsfile 的功能是通过轮询检查文件夹实现的。该配置指定了两次检查 `load_active_listening_dirs` 的时间间隔,每次检查完成 `load_active_listening_check_interval_seconds` 秒后,会执行下一次检查。当用户设置的轮询间隔小于 1 时,会被设置为默认值 5 秒 | Long: [1, Long.MAX\_VALUE] | 选填 | 5 | 重启后生效 | + +### 3.2 示例说明 + +```bash +load_active_listening_dir/ +├─sensors/ +│ ├─temperature/ +│ │ └─temperature-table.TSFILE + +``` + +- 表模型 TsFile + - `temperature-table.TSFILE`: 会被导入到 `temperature` database 下(因为它位于`sensors/temperature/` 目录下) + +### 3.3 注意事项 + +1. 如果待加载的文件中,存在 mods 文件,应优先将 mods 文件移动到监听目录下面,然后再移动 tsfile 文件,且 mods 文件应和对应的 tsfile 文件处于同一目录。防止加载到 tsfile 文件时,加载不到对应的 mods 文件 +2. 禁止设置 Pipe 的 receiver 目录、存放数据的 data 目录等作为监听目录 +3. 禁止 `load_active_listening_fail_dir` 与 `load_active_listening_dirs` 存在相同的目录,或者互相嵌套 +4. 保证 `load_active_listening_dirs` 目录有足够的权限,在加载成功之后,文件将会被删除,如果没有删除权限,则会重复加载 + +## 4. 
Load SQL + +IoTDB 支持通过 CLI 执行 SQL 直接将存有时间序列的一个或多个 TsFile 文件导入到另外一个正在运行的 IoTDB 实例中。 + +### 4.1 运行命令 + +```SQL +load '' with ( + 'attribute-key1'='attribute-value1', + 'attribute-key2'='attribute-value2', +) +``` + +* `` :文件本身,或是包含若干文件的文件夹路径 +* ``:可选参数,具体如下表所示 + +| Key | Key 描述 | Value 类型 | Value 取值范围 | Value 是否必填 | Value 默认值 | +| --------------------------------------- |------------------------------------------------------------------------------------------------------------------------------------------------------------------------| ------------ | ----------------------------------------- | ---------------- | -------------------------- | +| `database-level` | 当 tsfile 对应的 database 不存在时,可以通过` database-level`参数的值来制定 database 的级别,默认为`iotdb-common.properties`中设置的级别。
例如当设置 level 参数为 1 时表明此 tsfile 中所有时间序列中层级为1的前缀路径是 database。 | Integer | `[1: Integer.MAX_VALUE]` | 否 | 1 | +| `on-success` | 表示对于成功载入的 tsfile 的处置方式:默认为`delete`,即tsfile 成功加载后将被删除;`none `表明 tsfile 成功加载之后依然被保留在源文件夹, | String | `delete / none` | 否 | delete | +| `model` | 指定写入的 tsfile 是表模型还是树模型 | String | `tree / table` | 否 | 与`-sql_dialect`一致 | +| `database-name` | **仅限表模型有效**: 文件导入的目标 database,不存在时会自动创建,`database-name`中不允许包括"`root.`"前缀,如果包含,将会报错。 | String | `-` | 否 | null | +| `convert-on-type-mismatch` | 加载 tsfile 时,如果数据类型不一致,是否进行转换 | Boolean | `true / false` | 否 | true | +| `verify` | 加载 tsfile 前是否校验 schema | Boolean | `true / false` | 否 | true | +| `tablet-conversion-threshold` | 转换为 tablet 形式的 tsfile 大小阈值,针对小文件 tsfile 加载,采用将其转换为 tablet 形式进行写入:默认值为 -1,即任意大小 tsfile 都不进行转换 | Integer | `[-1,0 :`​`Integer.MAX_VALUE]` | 否 | -1 | +| `async` | 是否开启异步加载 tsfile,将文件移到 active load 目录下面,所有的 tsfile 都 load 到`database-name`下. | Boolean | `true / false` | 否 | false | + +### 4.2 运行示例 + +```SQL +-- 准备目标数据库 database2 +IoTDB> create database database2 +Msg: The statement is executed successfully. + +IoTDB> use database2 +Msg: The statement is executed successfully. + +IoTDB:database2> show tables details ++---------+-------+------+-------+ +|TableName|TTL(ms)|Status|Comment| ++---------+-------+------+-------+ ++---------+-------+------+-------+ +Empty set. + +--通过执行load sql 导入tsfile +IoTDB:database2> load '/home/dump0.tsfile' with ( 'on-success'='none', 'database-name'='database2') +Msg: The statement is executed successfully. 
+ +-- 验证数据导入成功 +IoTDB:database2> select * from table2 ++-----------------------------+------+--------+---------+-----------+--------+------+-----------------------------+ +| time|region|plant_id|device_id|temperature|humidity|status| arrival_time| ++-----------------------------+------+--------+---------+-----------+--------+------+-----------------------------+ +|2024-11-30T00:00:00.000+08:00| 上海| 3002| 101| 90.0| 35.2| true| null| +|2024-11-29T00:00:00.000+08:00| 上海| 3001| 101| 85.0| 35.1| null|2024-11-29T10:00:13.000+08:00| +|2024-11-27T00:00:00.000+08:00| 北京| 1001| 101| 85.0| 35.1| true|2024-11-27T16:37:01.000+08:00| +|2024-11-29T11:00:00.000+08:00| 上海| 3002| 100| null| 45.1| true| null| +|2024-11-28T08:00:00.000+08:00| 上海| 3001| 100| 85.0| 35.2| false|2024-11-28T08:00:09.000+08:00| +|2024-11-26T13:37:00.000+08:00| 北京| 1001| 100| 90.0| 35.1| true|2024-11-26T13:37:34.000+08:00| ++-----------------------------+------+--------+---------+-----------+--------+------+-----------------------------+ +``` diff --git a/src/zh/UserGuide/latest/Tools-System/Data-Import-Tool.md b/src/zh/UserGuide/latest/Tools-System/Data-Import-Tool.md index fd267bcb7..0338fcdbe 100644 --- a/src/zh/UserGuide/latest/Tools-System/Data-Import-Tool.md +++ b/src/zh/UserGuide/latest/Tools-System/Data-Import-Tool.md @@ -2,10 +2,10 @@ ## 1. 功能概述 -IoTDB 支持两种方式进行数据导入 - -* 数据导入工具:tools 目录下的手动数据导入工具 `import-data.sh/bat`,可以将 CSV、SQL、及TsFile(开源时序文件格式)的数据导入 IoTDB。 -* TsFile 自动加载功能 +IoTDB 支持三种方式进行数据导入: +- 数据导入工具 :`import-data.sh/bat` 位于 `tools` 目录下,可以将 `CSV`、`SQL`、及`TsFile`(开源时序文件格式)的数据导入 `IoTDB`。 +- `TsFile` 自动加载功能。 +- `Load SQL` 导入 `TsFile` 。 @@ -23,12 +23,16 @@ IoTDB 支持两种方式进行数据导入 - + - - + + + + + +
可用于单个或一个目录的 SQL 文件批量导入 IoTDB
TsFileTsFile 可用于单个或一个目录的 TsFile 文件批量导入 IoTDB
TsFile 自动加载功能 可以监听指定路径下新产生的TsFile文件,并将其加载进IoTDBTsFile 自动加载可以监听指定路径下新产生的 TsFile 文件,并将其加载进 IoTDB
Load SQL可用于单个或一个目录的 TsFile 文件批量导入 IoTDB
@@ -250,3 +254,54 @@ error: Invalid thread number '0'. Please set a positive integer. 2. 禁止设置 Pipe 的 receiver 目录、存放数据的 data 目录等作为监听目录 3. 禁止 `load_active_listening_fail_dir` 与 `load_active_listening_dirs` 存在相同的目录,或者互相嵌套 4. 保证 `load_active_listening_dirs` 目录有足够的权限,在加载成功之后,文件将会被删除,如果没有删除权限,则会重复加载 + +## 4. Load SQL + +IoTDB 支持通过 CLI 执行 SQL 直接将存有时间序列的一个或多个 TsFile 文件导入到另外一个正在运行的 IoTDB 实例中。 + +### 4.1 运行命令 + +```SQL +load '' with ( + 'attribute-key1'='attribute-value1', + 'attribute-key2'='attribute-value2', +) +``` + +* `` :文件本身,或是包含若干文件的文件夹路径 +* ``:可选参数,具体如下表所示 + +| Key | Key 描述 | Value 类型 | Value 取值范围 | Value 是否必填 | Value 默认值 | +| --------------------------------------- |------------------------------------------------------------------------------------------------------------------------------------------------------------------------| ------------ | ----------------------------------------- | ---------------- | -------------------------- | +| `database-level` | 当 tsfile 对应的 database 不存在时,可以通过` database-level`参数的值来制定 database 的级别,默认为`iotdb-common.properties`中设置的级别。
例如当设置 level 参数为 1 时表明此 tsfile 中所有时间序列中层级为1的前缀路径是 database。 | Integer | `[1: Integer.MAX_VALUE]` | 否 | 1 | +| `on-success` | 表示对于成功载入的 tsfile 的处置方式:默认为`delete`,即tsfile 成功加载后将被删除;`none `表明 tsfile 成功加载之后依然被保留在源文件夹, | String | `delete / none` | 否 | delete | +| `model` | 指定写入的 tsfile 是表模型还是树模型 | String | `tree / table` | 否 | 与`-sql_dialect`一致 | +| `database-name` | **仅限表模型有效**: 文件导入的目标 database,不存在时会自动创建,`database-name`中不允许包括"`root.`"前缀,如果包含,将会报错。 | String | `-` | 否 | null | +| `convert-on-type-mismatch` | 加载 tsfile 时,如果数据类型不一致,是否进行转换 | Boolean | `true / false` | 否 | true | +| `verify` | 加载 tsfile 前是否校验 schema | Boolean | `true / false` | 否 | true | +| `tablet-conversion-threshold` | 转换为 tablet 形式的 tsfile 大小阈值,针对小文件 tsfile 加载,采用将其转换为 tablet 形式进行写入:默认值为 -1,即任意大小 tsfile 都不进行转换 | Integer | `[-1,0 :`​`Integer.MAX_VALUE]` | 否 | -1 | +| `async` | 是否开启异步加载 tsfile,将文件移到 active load 目录下面,所有的 tsfile 都 load 到`database-name`下. | Boolean | `true / false` | 否 | false | + +### 4.2 运行示例 + +```SQL +-- 准备待导入环境 +IoTDB> show databases ++-------------+-----------------------+---------------------+-------------------+---------------------+ +| Database|SchemaReplicationFactor|DataReplicationFactor|TimePartitionOrigin|TimePartitionInterval| ++-------------+-----------------------+---------------------+-------------------+---------------------+ +|root.__system| 1| 1| 0| 604800000| ++-------------+-----------------------+---------------------+-------------------+---------------------+ + +-- 通过load sql 导入 tsfile +IoTDB> load '/home/dump1.tsfile' with ( 'on-success'='none') +Msg: The statement is executed successfully. 
+ +-- 验证数据导入成功 +IoTDB> select * from root.testdb.** ++-----------------------------+------------------------------------+---------------------------------+-------------------------------+ +| Time|root.testdb.device.model.temperature|root.testdb.device.model.humidity|root.testdb.device.model.status| ++-----------------------------+------------------------------------+---------------------------------+-------------------------------+ +|2025-04-17T10:35:47.218+08:00| 22.3| 19.4| true| ++-----------------------------+------------------------------------+---------------------------------+-------------------------------+ +``` \ No newline at end of file