From 3f9b0a6556b324cc1332e3ecc0786062dcb883e0 Mon Sep 17 00:00:00 2001 From: Leto_b Date: Wed, 26 Mar 2025 16:25:11 +0800 Subject: [PATCH] add schema sync in table model --- .../Table/User-Manual/Data-Sync_apache.md | 40 +++++++++++++++++- .../Table/User-Manual/Data-Sync_timecho.md | 40 +++++++++++++++++- .../User-Manual/Data-Sync_apache.md | 40 +++++++++++++++++- .../User-Manual/Data-Sync_timecho.md | 40 +++++++++++++++++- .../Table/User-Manual/Data-Sync_apache.md | 41 ++++++++++++++++++- .../Table/User-Manual/Data-Sync_timecho.md | 41 ++++++++++++++++++- .../User-Manual/Data-Sync_apache.md | 41 ++++++++++++++++++- .../User-Manual/Data-Sync_timecho.md | 40 +++++++++++++++++- 8 files changed, 315 insertions(+), 8 deletions(-) diff --git a/src/UserGuide/Master/Table/User-Manual/Data-Sync_apache.md b/src/UserGuide/Master/Table/User-Manual/Data-Sync_apache.md index df9ed3a38..b4f2399f2 100644 --- a/src/UserGuide/Master/Table/User-Manual/Data-Sync_apache.md +++ b/src/UserGuide/Master/Table/User-Manual/Data-Sync_apache.md @@ -35,7 +35,43 @@ A data synchronization task consists of three stages: - Process Stage: This stage is used to process the data extracted from the source IoTDB, defined in the `processor` section of the SQL statement. - Sink Stage: This stage is used to send data to the target IoTDB, defined in the `sink` section of the SQL statement. -By declaratively configuring these three parts in an SQL statement, flexible data synchronization capabilities can be achieved. +By declaratively configuring these three parts in an SQL statement, flexible data synchronization capabilities can be achieved. Currently, data synchronization supports the synchronization of the following information, and you can select the synchronization scope when creating a synchronization task (the default is data.insert, which means synchronizing newly written data): + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+| Synchronization Scope | Synchronization Content | Description |
+| :-------------------- | :---------------------- | :------------------------------------------------------------ |
+| all                   | -                       | All scopes                                                     |
+| data                  | insert                  | Synchronize newly written data                                 |
+| schema                | database                | Synchronize database creation, modification, or deletion operations |
+| schema                | table                   | Synchronize table creation, modification, or deletion operations |
+| schema                | TTL                     | Synchronize the data retention time                            |
+| auth                  | -                       | Synchronize user permissions and access control                |
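As a sketch of how scope selection maps onto pipe creation (the pipe name and the sink address below are hypothetical, not from this document), the `inclusion` attribute in the `source` section selects which of the scopes above the task captures:

```sql
-- Hypothetical example: synchronize newly written data plus table-model
-- schema changes (databases, tables, TTL) to a target IoTDB node.
CREATE PIPE table_sync
WITH SOURCE (
  'source'    = 'iotdb-source',
  'inclusion' = 'data.insert, schema'  -- default is data.insert only
)
WITH SINK (
  'sink'           = 'iotdb-thrift-sink',
  'sink.node-urls' = '192.168.0.2:6667'
);
```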
### 1.2 Functional Limitations and Notes

@@ -471,6 +507,8 @@ pipe_all_sinks_rate_limit_bytes_per_second=-1

| **Parameter** | **Description** | **Value Range** | **Required** | **Default Value** |
| :------------ | :-------------- | :-------------- | :----------- | :---------------- |
| source | iotdb-source | String: iotdb-source | Yes | - |
+| inclusion | Specifies the scope of content to be synchronized by the task, covering data, schema, and auth | String: all, data(insert), schema(database,table,ttl), auth | Optional | data.insert |
+| inclusion.exclusion | Excludes specific operations from the scope specified by `inclusion`, reducing the amount of content synchronized | String: all, data(insert), schema(database,table,ttl), auth | Optional | - |
| mode.streaming | Specifies the capture source for time-series data. It applies when `mode.snapshot` is set to `false`, determining the capture source for `data.insert` in `inclusion`. Two capture strategies are available: - **true**: Dynamically selects the capture type. The system adapts to downstream processing speed, choosing between capturing each write request and capturing only TsFile sealing requests. When downstream processing is fast, write requests are prioritized to reduce latency; when it is slow, only sealing requests are captured to prevent backlogs. This mode suits most scenarios, balancing processing latency and throughput. - **false**: Uses fixed batch capture, capturing only TsFile sealing requests; suitable for resource-constrained deployments, reducing system load. **Note**: Snapshot data captured when the pipe starts is provided downstream only as files.
| Boolean: true / false | No | true | | mode.strict | Determines whether to strictly filter data when using the `time`, `path`, `database-name`, or `table-name` parameters: - **true**: Strict filtering. The system will strictly filter captured data according to the specified conditions, ensuring that only matching data is selected. - **false**: Non-strict filtering. Some extra data may be included during the selection process to optimize performance and reduce CPU and I/O consumption. | Boolean: true / false | No | true | | mode.snapshot | This parameter determines the data capture mode, affecting the `data` in `inclusion`. Two modes are available: - **true**: Static data capture. A one-time data snapshot is taken when the pipe starts. Once the snapshot data is fully consumed, the pipe automatically terminates (executing `DROP PIPE` SQL automatically). - **false**: Dynamic data capture. In addition to capturing snapshot data when the pipe starts, it continuously captures subsequent data changes. The pipe remains active to process the dynamic data stream. | Boolean: true / false | No | false | diff --git a/src/UserGuide/Master/Table/User-Manual/Data-Sync_timecho.md b/src/UserGuide/Master/Table/User-Manual/Data-Sync_timecho.md index 655087533..98961b336 100644 --- a/src/UserGuide/Master/Table/User-Manual/Data-Sync_timecho.md +++ b/src/UserGuide/Master/Table/User-Manual/Data-Sync_timecho.md @@ -34,7 +34,43 @@ A data synchronization task consists of three stages: - Process Stage: This stage is used to process the data extracted from the source IoTDB, defined in the `processor` section of the SQL statement. - Sink Stage: This stage is used to send data to the target IoTDB, defined in the `sink` section of the SQL statement. -By declaratively configuring these three parts in an SQL statement, flexible data synchronization capabilities can be achieved. 
+By declaratively configuring these three parts in an SQL statement, flexible data synchronization capabilities can be achieved. Currently, data synchronization supports the synchronization of the following information, and you can select the synchronization scope when creating a synchronization task (the default is data.insert, which means synchronizing newly written data): + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+| Synchronization Scope | Synchronization Content | Description |
+| :-------------------- | :---------------------- | :------------------------------------------------------------ |
+| all                   | -                       | All scopes                                                     |
+| data                  | insert                  | Synchronize newly written data                                 |
+| schema                | database                | Synchronize database creation, modification, or deletion operations |
+| schema                | table                   | Synchronize table creation, modification, or deletion operations |
+| schema                | TTL                     | Synchronize the data retention time                            |
+| auth                  | -                       | Synchronize user permissions and access control                |
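The `inclusion.exclusion` source parameter documented later can narrow a broad scope. A minimal sketch (pipe name and sink address are hypothetical) that synchronizes everything except permissions:

```sql
-- Hypothetical example: capture all scopes, then carve out auth so user
-- permissions are not propagated to the target cluster.
CREATE PIPE all_but_auth
WITH SOURCE (
  'source'              = 'iotdb-source',
  'inclusion'           = 'all',
  'inclusion.exclusion' = 'auth'
)
WITH SINK (
  'sink'           = 'iotdb-thrift-sink',
  'sink.node-urls' = '192.168.0.3:6667'
);
```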
### 1.2 Functional Limitations and Notes

@@ -498,6 +534,8 @@ pipe_all_sinks_rate_limit_bytes_per_second=-1

| **Parameter** | **Description** | **Value Range** | **Required** | **Default Value** |
| :------------ | :-------------- | :-------------- | :----------- | :---------------- |
| source | iotdb-source | String: iotdb-source | Yes | - |
+| inclusion | Specifies the scope of content to be synchronized by the task, covering data, schema, and auth | String: all, data(insert), schema(database,table,ttl), auth | Optional | data.insert |
+| inclusion.exclusion | Excludes specific operations from the scope specified by `inclusion`, reducing the amount of content synchronized | String: all, data(insert), schema(database,table,ttl), auth | Optional | - |
| mode.streaming | Specifies the capture source for time-series data. It applies when `mode.snapshot` is set to `false`, determining the capture source for `data.insert` in `inclusion`. Two capture strategies are available: - **true**: Dynamically selects the capture type. The system adapts to downstream processing speed, choosing between capturing each write request and capturing only TsFile sealing requests. When downstream processing is fast, write requests are prioritized to reduce latency; when it is slow, only sealing requests are captured to prevent backlogs. This mode suits most scenarios, balancing processing latency and throughput. - **false**: Uses fixed batch capture, capturing only TsFile sealing requests; suitable for resource-constrained deployments, reducing system load. **Note**: Snapshot data captured when the pipe starts is provided downstream only as files.
| Boolean: true / false | No | true | | mode.strict | Determines whether to strictly filter data when using the `time`, `path`, `database-name`, or `table-name` parameters: - **true**: Strict filtering. The system will strictly filter captured data according to the specified conditions, ensuring that only matching data is selected. - **false**: Non-strict filtering. Some extra data may be included during the selection process to optimize performance and reduce CPU and I/O consumption. | Boolean: true / false | No | true | | mode.snapshot | This parameter determines the data capture mode, affecting the `data` in `inclusion`. Two modes are available: - **true**: Static data capture. A one-time data snapshot is taken when the pipe starts. Once the snapshot data is fully consumed, the pipe automatically terminates (executing `DROP PIPE` SQL automatically). - **false**: Dynamic data capture. In addition to capturing snapshot data when the pipe starts, it continuously captures subsequent data changes. The pipe remains active to process the dynamic data stream. | Boolean: true / false | No | false | diff --git a/src/UserGuide/latest-Table/User-Manual/Data-Sync_apache.md b/src/UserGuide/latest-Table/User-Manual/Data-Sync_apache.md index df9ed3a38..4ce43b605 100644 --- a/src/UserGuide/latest-Table/User-Manual/Data-Sync_apache.md +++ b/src/UserGuide/latest-Table/User-Manual/Data-Sync_apache.md @@ -35,7 +35,43 @@ A data synchronization task consists of three stages: - Process Stage: This stage is used to process the data extracted from the source IoTDB, defined in the `processor` section of the SQL statement. - Sink Stage: This stage is used to send data to the target IoTDB, defined in the `sink` section of the SQL statement. -By declaratively configuring these three parts in an SQL statement, flexible data synchronization capabilities can be achieved. 
+By declaratively configuring these three parts in an SQL statement, flexible data synchronization capabilities can be achieved. Currently, data synchronization supports the synchronization of the following information, and you can select the synchronization scope when creating a synchronization task (the default is data.insert, which means synchronizing newly written data): + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+| Synchronization Scope | Synchronization Content | Description |
+| :-------------------- | :---------------------- | :------------------------------------------------------------ |
+| all                   | -                       | All scopes                                                     |
+| data                  | insert                  | Synchronize newly written data                                 |
+| schema                | database                | Synchronize database creation, modification, or deletion operations |
+| schema                | table                   | Synchronize table creation, modification, or deletion operations |
+| schema                | TTL                     | Synchronize the data retention time                            |
+| auth                  | -                       | Synchronize user permissions and access control                |
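For a one-time migration of existing data and schema, the `mode.snapshot` source parameter documented later can be combined with the scope selection above. A sketch, assuming a hypothetical target address:

```sql
-- Hypothetical example: one-shot sync. The pipe takes a snapshot at startup
-- and is dropped automatically once the snapshot is fully consumed.
CREATE PIPE one_shot_migration
WITH SOURCE (
  'source'        = 'iotdb-source',
  'inclusion'     = 'data, schema',
  'mode.snapshot' = 'true'
)
WITH SINK (
  'sink'           = 'iotdb-thrift-sink',
  'sink.node-urls' = '192.168.0.4:6667'
);
```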
### 1.2 Functional Limitations and Notes

@@ -471,6 +507,8 @@ pipe_all_sinks_rate_limit_bytes_per_second=-1

| **Parameter** | **Description** | **Value Range** | **Required** | **Default Value** |
| :------------ | :-------------- | :-------------- | :----------- | :---------------- |
| source | iotdb-source | String: iotdb-source | Yes | - |
+| inclusion | Specifies the scope of content to be synchronized by the task, covering data, schema, and auth | String: all, data(insert), schema(database,table,ttl), auth | Optional | data.insert |
+| inclusion.exclusion | Excludes specific operations from the scope specified by `inclusion`, reducing the amount of content synchronized | String: all, data(insert), schema(database,table,ttl), auth | Optional | - |
| mode.streaming | Specifies the capture source for time-series data. It applies when `mode.snapshot` is set to `false`, determining the capture source for `data.insert` in `inclusion`. Two capture strategies are available: - **true**: Dynamically selects the capture type. The system adapts to downstream processing speed, choosing between capturing each write request and capturing only TsFile sealing requests. When downstream processing is fast, write requests are prioritized to reduce latency; when it is slow, only sealing requests are captured to prevent backlogs. This mode suits most scenarios, balancing processing latency and throughput. - **false**: Uses fixed batch capture, capturing only TsFile sealing requests; suitable for resource-constrained deployments, reducing system load. **Note**: Snapshot data captured when the pipe starts is provided downstream only as files.
| Boolean: true / false | No | true | | mode.strict | Determines whether to strictly filter data when using the `time`, `path`, `database-name`, or `table-name` parameters: - **true**: Strict filtering. The system will strictly filter captured data according to the specified conditions, ensuring that only matching data is selected. - **false**: Non-strict filtering. Some extra data may be included during the selection process to optimize performance and reduce CPU and I/O consumption. | Boolean: true / false | No | true | | mode.snapshot | This parameter determines the data capture mode, affecting the `data` in `inclusion`. Two modes are available: - **true**: Static data capture. A one-time data snapshot is taken when the pipe starts. Once the snapshot data is fully consumed, the pipe automatically terminates (executing `DROP PIPE` SQL automatically). - **false**: Dynamic data capture. In addition to capturing snapshot data when the pipe starts, it continuously captures subsequent data changes. The pipe remains active to process the dynamic data stream. | Boolean: true / false | No | false | diff --git a/src/UserGuide/latest-Table/User-Manual/Data-Sync_timecho.md b/src/UserGuide/latest-Table/User-Manual/Data-Sync_timecho.md index 655087533..98961b336 100644 --- a/src/UserGuide/latest-Table/User-Manual/Data-Sync_timecho.md +++ b/src/UserGuide/latest-Table/User-Manual/Data-Sync_timecho.md @@ -34,7 +34,43 @@ A data synchronization task consists of three stages: - Process Stage: This stage is used to process the data extracted from the source IoTDB, defined in the `processor` section of the SQL statement. - Sink Stage: This stage is used to send data to the target IoTDB, defined in the `sink` section of the SQL statement. -By declaratively configuring these three parts in an SQL statement, flexible data synchronization capabilities can be achieved. 
+By declaratively configuring these three parts in an SQL statement, flexible data synchronization capabilities can be achieved. Currently, data synchronization supports the synchronization of the following information, and you can select the synchronization scope when creating a synchronization task (the default is data.insert, which means synchronizing newly written data): + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+| Synchronization Scope | Synchronization Content | Description |
+| :-------------------- | :---------------------- | :------------------------------------------------------------ |
+| all                   | -                       | All scopes                                                     |
+| data                  | insert                  | Synchronize newly written data                                 |
+| schema                | database                | Synchronize database creation, modification, or deletion operations |
+| schema                | table                   | Synchronize table creation, modification, or deletion operations |
+| schema                | TTL                     | Synchronize the data retention time                            |
+| auth                  | -                       | Synchronize user permissions and access control                |
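The `table-name` and `mode.strict` source parameters documented later filter which of the captured data is selected. A sketch (pipe name, table pattern, and address are hypothetical):

```sql
-- Hypothetical example: sync only data of tables matching a pattern, with
-- non-strict filtering traded for lower CPU and IO cost on the source.
CREATE PIPE filtered_sync
WITH SOURCE (
  'source'      = 'iotdb-source',
  'table-name'  = 'sensor_*',
  'mode.strict' = 'false'
)
WITH SINK (
  'sink'           = 'iotdb-thrift-sink',
  'sink.node-urls' = '192.168.0.5:6667'
);
```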
### 1.2 Functional Limitations and Notes

@@ -498,6 +534,8 @@ pipe_all_sinks_rate_limit_bytes_per_second=-1

| **Parameter** | **Description** | **Value Range** | **Required** | **Default Value** |
| :------------ | :-------------- | :-------------- | :----------- | :---------------- |
| source | iotdb-source | String: iotdb-source | Yes | - |
+| inclusion | Specifies the scope of content to be synchronized by the task, covering data, schema, and auth | String: all, data(insert), schema(database,table,ttl), auth | Optional | data.insert |
+| inclusion.exclusion | Excludes specific operations from the scope specified by `inclusion`, reducing the amount of content synchronized | String: all, data(insert), schema(database,table,ttl), auth | Optional | - |
| mode.streaming | Specifies the capture source for time-series data. It applies when `mode.snapshot` is set to `false`, determining the capture source for `data.insert` in `inclusion`. Two capture strategies are available: - **true**: Dynamically selects the capture type. The system adapts to downstream processing speed, choosing between capturing each write request and capturing only TsFile sealing requests. When downstream processing is fast, write requests are prioritized to reduce latency; when it is slow, only sealing requests are captured to prevent backlogs. This mode suits most scenarios, balancing processing latency and throughput. - **false**: Uses fixed batch capture, capturing only TsFile sealing requests; suitable for resource-constrained deployments, reducing system load. **Note**: Snapshot data captured when the pipe starts is provided downstream only as files.
| Boolean: true / false | No | true | | mode.strict | Determines whether to strictly filter data when using the `time`, `path`, `database-name`, or `table-name` parameters: - **true**: Strict filtering. The system will strictly filter captured data according to the specified conditions, ensuring that only matching data is selected. - **false**: Non-strict filtering. Some extra data may be included during the selection process to optimize performance and reduce CPU and I/O consumption. | Boolean: true / false | No | true | | mode.snapshot | This parameter determines the data capture mode, affecting the `data` in `inclusion`. Two modes are available: - **true**: Static data capture. A one-time data snapshot is taken when the pipe starts. Once the snapshot data is fully consumed, the pipe automatically terminates (executing `DROP PIPE` SQL automatically). - **false**: Dynamic data capture. In addition to capturing snapshot data when the pipe starts, it continuously captures subsequent data changes. The pipe remains active to process the dynamic data stream. | Boolean: true / false | No | false | diff --git a/src/zh/UserGuide/Master/Table/User-Manual/Data-Sync_apache.md b/src/zh/UserGuide/Master/Table/User-Manual/Data-Sync_apache.md index 20e2a316b..c6ce3dc83 100644 --- a/src/zh/UserGuide/Master/Table/User-Manual/Data-Sync_apache.md +++ b/src/zh/UserGuide/Master/Table/User-Manual/Data-Sync_apache.md @@ -34,7 +34,44 @@ - 处理(Process)阶段:该部分用于处理从源 IoTDB 抽取出的数据,在 SQL 语句中的 processor 部分定义 - 发送(Sink)阶段:该部分用于向目标 IoTDB 发送数据,在 SQL 语句中的 sink 部分定义 -通过 SQL 语句声明式地配置 3 个部分的具体内容,可实现灵活的数据同步能力。 +通过 SQL 语句声明式地配置 3 个部分的具体内容,可实现灵活的数据同步能力。目前数据同步支持以下信息的同步,您可以在创建同步任务时对同步范围进行选择(默认选择 data.insert,即同步新写入的数据): + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+| 同步范围 | 同步内容 | 说明 |
+| :------- | :------- | :----------------------------------- |
+| all | - | 所有范围 |
+| data(数据) | insert(增量) | 同步新写入的数据 |
+| schema(元数据) | database(数据库) | 同步数据库的创建、修改或删除操作 |
+| schema(元数据) | table(表) | 同步表的创建、修改或删除操作 |
+| schema(元数据) | TTL(数据到期时间) | 同步数据的存活时间 |
+| auth(权限) | - | 同步用户权限和访问控制 |
+ ### 1.2 功能限制及说明 @@ -463,6 +500,8 @@ pipe_all_sinks_rate_limit_bytes_per_second=-1 | **参数** | **描述** | **value 取值范围** | **是否必填** | **默认取值** | | ------------------------ | ------------------------------------------------------------ | ------------------------------------------------------------ | ------------ | ------------------------------- | | source | iotdb-source | String: iotdb-source | 必填 | - | +| inclusion | 用于指定数据同步任务中需要同步范围,分为数据、元数据和权限 | String:all, data(insert), schema(database,timeseries,ttl), auth | 选填 | data.insert | +| inclusion.exclusion | 用于从 inclusion 指定的同步范围内排除特定的操作,减少同步的数据量 | String:all, data(insert), schema(database,timeseries,ttl), auth | 选填 | 空字符串 | | mode.streaming | 此参数指定时序数据写入的捕获来源。适用于 `mode.streaming`为 `false` 模式下的场景,决定`inclusion`中`data.insert`数据的捕获来源。提供两种捕获策略:true: 动态选择捕获的类型。系统将根据下游处理速度,自适应地选择是捕获每个写入请求还是仅捕获 TsFile 文件的封口请求。当下游处理速度快时,优先捕获写入请求以减少延迟;当处理速度慢时,仅捕获文件封口请求以避免处理堆积。这种模式适用于大多数场景,能够实现处理延迟和吞吐量的最优平衡。false:固定按批捕获方式。仅捕获 TsFile 文件的封口请求,适用于资源受限的应用场景,以降低系统负载。注意,pipe 启动时捕获的快照数据只会以文件的方式供下游处理。 | Boolean: true / false | 否 | true | | mode.strict | 在使用 time / path / database-name / table-name 参数过滤数据时,是否需要严格按照条件筛选:`true`: 严格筛选。系统将完全按照给定条件过滤筛选被捕获的数据,确保只有符合条件的数据被选中。`false`:非严格筛选。系统在筛选被捕获的数据时可能会包含一些额外的数据,适用于性能敏感的场景,可降低 CPU 和 IO 消耗。 | Boolean: true / false | 否 | true | | mode.snapshot | 此参数决定时序数据的捕获方式,影响`inclusion`中的`data`数据。提供两种模式:`true`:静态数据捕获。启动 pipe 时,会进行一次性的数据快照捕获。当快照数据被完全消费后,**pipe 将自动终止(DROP PIPE SQL 会自动执行)**。`false`:动态数据捕获。除了在 pipe 启动时捕获快照数据外,还会持续捕获后续的数据变更。pipe 将持续运行以处理动态数据流。 | Boolean: true / false | 否 | false | diff --git a/src/zh/UserGuide/Master/Table/User-Manual/Data-Sync_timecho.md b/src/zh/UserGuide/Master/Table/User-Manual/Data-Sync_timecho.md index fc65c8d9f..2e426d3cc 100644 --- a/src/zh/UserGuide/Master/Table/User-Manual/Data-Sync_timecho.md +++ b/src/zh/UserGuide/Master/Table/User-Manual/Data-Sync_timecho.md @@ -34,7 +34,44 @@ - 处理(Process)阶段:该部分用于处理从源 IoTDB 抽取出的数据,在 SQL 语句中的 processor 部分定义 - 发送(Sink)阶段:该部分用于向目标 IoTDB 发送数据,在 SQL 
语句中的 sink 部分定义 -通过 SQL 语句声明式地配置 3 个部分的具体内容,可实现灵活的数据同步能力。 +通过 SQL 语句声明式地配置 3 个部分的具体内容,可实现灵活的数据同步能力。目前数据同步支持以下信息的同步,您可以在创建同步任务时对同步范围进行选择(默认选择 data.insert,即同步新写入的数据): + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+| 同步范围 | 同步内容 | 说明 |
+| :------- | :------- | :----------------------------------- |
+| all | - | 所有范围 |
+| data(数据) | insert(增量) | 同步新写入的数据 |
+| schema(元数据) | database(数据库) | 同步数据库的创建、修改或删除操作 |
+| schema(元数据) | table(表) | 同步表的创建、修改或删除操作 |
+| schema(元数据) | TTL(数据到期时间) | 同步数据的存活时间 |
+| auth(权限) | - | 同步用户权限和访问控制 |
+ ### 1.2 功能限制及说明 @@ -505,6 +542,8 @@ pipe_all_sinks_rate_limit_bytes_per_second=-1 | **参数** | **描述** | **value 取值范围** | **是否必填** | **默认取值** | | ------------------------ | ------------------------------------------------------------ | ------------------------------------------------------------ | ------------ | ------------------------------- | | source | iotdb-source | String: iotdb-source | 必填 | - | +| inclusion | 用于指定数据同步任务中需要同步范围,分为数据、元数据和权限 | String:all, data(insert), schema(database,timeseries,ttl), auth | 选填 | data.insert | +| inclusion.exclusion | 用于从 inclusion 指定的同步范围内排除特定的操作,减少同步的数据量 | String:all, data(insert), schema(database,timeseries,ttl), auth | 选填 | 空字符串 | | mode.streaming | 此参数指定时序数据写入的捕获来源。适用于 `mode.streaming`为 `false` 模式下的场景,决定`inclusion`中`data.insert`数据的捕获来源。提供两种捕获策略:true: 动态选择捕获的类型。系统将根据下游处理速度,自适应地选择是捕获每个写入请求还是仅捕获 TsFile 文件的封口请求。当下游处理速度快时,优先捕获写入请求以减少延迟;当处理速度慢时,仅捕获文件封口请求以避免处理堆积。这种模式适用于大多数场景,能够实现处理延迟和吞吐量的最优平衡。false:固定按批捕获方式。仅捕获 TsFile 文件的封口请求,适用于资源受限的应用场景,以降低系统负载。注意,pipe 启动时捕获的快照数据只会以文件的方式供下游处理。 | Boolean: true / false | 否 | true | | mode.strict | 在使用 time / path / database-name / table-name 参数过滤数据时,是否需要严格按照条件筛选:`true`: 严格筛选。系统将完全按照给定条件过滤筛选被捕获的数据,确保只有符合条件的数据被选中。`false`:非严格筛选。系统在筛选被捕获的数据时可能会包含一些额外的数据,适用于性能敏感的场景,可降低 CPU 和 IO 消耗。 | Boolean: true / false | 否 | true | | mode.snapshot | 此参数决定时序数据的捕获方式,影响`inclusion`中的`data`数据。提供两种模式:`true`:静态数据捕获。启动 pipe 时,会进行一次性的数据快照捕获。当快照数据被完全消费后,**pipe 将自动终止(DROP PIPE SQL 会自动执行)**。`false`:动态数据捕获。除了在 pipe 启动时捕获快照数据外,还会持续捕获后续的数据变更。pipe 将持续运行以处理动态数据流。 | Boolean: true / false | 否 | false | diff --git a/src/zh/UserGuide/latest-Table/User-Manual/Data-Sync_apache.md b/src/zh/UserGuide/latest-Table/User-Manual/Data-Sync_apache.md index 20e2a316b..db6e551f8 100644 --- a/src/zh/UserGuide/latest-Table/User-Manual/Data-Sync_apache.md +++ b/src/zh/UserGuide/latest-Table/User-Manual/Data-Sync_apache.md @@ -34,7 +34,44 @@ - 处理(Process)阶段:该部分用于处理从源 IoTDB 抽取出的数据,在 SQL 语句中的 processor 部分定义 - 发送(Sink)阶段:该部分用于向目标 IoTDB 发送数据,在 SQL 语句中的 
sink 部分定义 -通过 SQL 语句声明式地配置 3 个部分的具体内容,可实现灵活的数据同步能力。 +通过 SQL 语句声明式地配置 3 个部分的具体内容,可实现灵活的数据同步能力。目前数据同步支持以下信息的同步,您可以在创建同步任务时对同步范围进行选择(默认选择 data.insert,即同步新写入的数据): + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+| 同步范围 | 同步内容 | 说明 |
+| :------- | :------- | :----------------------------------- |
+| all | - | 所有范围 |
+| data(数据) | insert(增量) | 同步新写入的数据 |
+| schema(元数据) | database(数据库) | 同步数据库的创建、修改或删除操作 |
+| schema(元数据) | table(表) | 同步表的创建、修改或删除操作 |
+| schema(元数据) | TTL(数据到期时间) | 同步数据的存活时间 |
+| auth(权限) | - | 同步用户权限和访问控制 |
+ ### 1.2 功能限制及说明 @@ -463,6 +500,8 @@ pipe_all_sinks_rate_limit_bytes_per_second=-1 | **参数** | **描述** | **value 取值范围** | **是否必填** | **默认取值** | | ------------------------ | ------------------------------------------------------------ | ------------------------------------------------------------ | ------------ | ------------------------------- | | source | iotdb-source | String: iotdb-source | 必填 | - | +| inclusion | 用于指定数据同步任务中需要同步范围,分为数据、元数据和权限 | String:all, data(insert), schema(database,table,ttl), auth | 选填 | data.insert | +| inclusion.exclusion | 用于从 inclusion 指定的同步范围内排除特定的操作,减少同步的数据量 | String:all, data(insert), schema(database,table,ttl), auth | 选填 | 空字符串 | | mode.streaming | 此参数指定时序数据写入的捕获来源。适用于 `mode.streaming`为 `false` 模式下的场景,决定`inclusion`中`data.insert`数据的捕获来源。提供两种捕获策略:true: 动态选择捕获的类型。系统将根据下游处理速度,自适应地选择是捕获每个写入请求还是仅捕获 TsFile 文件的封口请求。当下游处理速度快时,优先捕获写入请求以减少延迟;当处理速度慢时,仅捕获文件封口请求以避免处理堆积。这种模式适用于大多数场景,能够实现处理延迟和吞吐量的最优平衡。false:固定按批捕获方式。仅捕获 TsFile 文件的封口请求,适用于资源受限的应用场景,以降低系统负载。注意,pipe 启动时捕获的快照数据只会以文件的方式供下游处理。 | Boolean: true / false | 否 | true | | mode.strict | 在使用 time / path / database-name / table-name 参数过滤数据时,是否需要严格按照条件筛选:`true`: 严格筛选。系统将完全按照给定条件过滤筛选被捕获的数据,确保只有符合条件的数据被选中。`false`:非严格筛选。系统在筛选被捕获的数据时可能会包含一些额外的数据,适用于性能敏感的场景,可降低 CPU 和 IO 消耗。 | Boolean: true / false | 否 | true | | mode.snapshot | 此参数决定时序数据的捕获方式,影响`inclusion`中的`data`数据。提供两种模式:`true`:静态数据捕获。启动 pipe 时,会进行一次性的数据快照捕获。当快照数据被完全消费后,**pipe 将自动终止(DROP PIPE SQL 会自动执行)**。`false`:动态数据捕获。除了在 pipe 启动时捕获快照数据外,还会持续捕获后续的数据变更。pipe 将持续运行以处理动态数据流。 | Boolean: true / false | 否 | false | diff --git a/src/zh/UserGuide/latest-Table/User-Manual/Data-Sync_timecho.md b/src/zh/UserGuide/latest-Table/User-Manual/Data-Sync_timecho.md index fc65c8d9f..73f16f84a 100644 --- a/src/zh/UserGuide/latest-Table/User-Manual/Data-Sync_timecho.md +++ b/src/zh/UserGuide/latest-Table/User-Manual/Data-Sync_timecho.md @@ -34,7 +34,43 @@ - 处理(Process)阶段:该部分用于处理从源 IoTDB 抽取出的数据,在 SQL 语句中的 processor 部分定义 - 发送(Sink)阶段:该部分用于向目标 IoTDB 发送数据,在 SQL 语句中的 sink 
部分定义 -通过 SQL 语句声明式地配置 3 个部分的具体内容,可实现灵活的数据同步能力。 +通过 SQL 语句声明式地配置 3 个部分的具体内容,可实现灵活的数据同步能力。目前数据同步支持以下信息的同步,您可以在创建同步任务时对同步范围进行选择(默认选择 data.insert,即同步新写入的数据): + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+| 同步范围 | 同步内容 | 说明 |
+| :------- | :------- | :----------------------------------- |
+| all | - | 所有范围 |
+| data(数据) | insert(增量) | 同步新写入的数据 |
+| schema(元数据) | database(数据库) | 同步数据库的创建、修改或删除操作 |
+| schema(元数据) | table(表) | 同步表的创建、修改或删除操作 |
+| schema(元数据) | TTL(数据到期时间) | 同步数据的存活时间 |
+| auth(权限) | - | 同步用户权限和访问控制 |
### 1.2 功能限制及说明 @@ -505,6 +541,8 @@ pipe_all_sinks_rate_limit_bytes_per_second=-1 | **参数** | **描述** | **value 取值范围** | **是否必填** | **默认取值** | | ------------------------ | ------------------------------------------------------------ | ------------------------------------------------------------ | ------------ | ------------------------------- | | source | iotdb-source | String: iotdb-source | 必填 | - | +| inclusion | 用于指定数据同步任务中需要同步范围,分为数据、元数据和权限 | String:all, data(insert), schema(database,timeseries,ttl), auth | 选填 | data.insert | +| inclusion.exclusion | 用于从 inclusion 指定的同步范围内排除特定的操作,减少同步的数据量 | String:all, data(insert), schema(database,timeseries,ttl), auth | 选填 | 空字符串 | | mode.streaming | 此参数指定时序数据写入的捕获来源。适用于 `mode.streaming`为 `false` 模式下的场景,决定`inclusion`中`data.insert`数据的捕获来源。提供两种捕获策略:true: 动态选择捕获的类型。系统将根据下游处理速度,自适应地选择是捕获每个写入请求还是仅捕获 TsFile 文件的封口请求。当下游处理速度快时,优先捕获写入请求以减少延迟;当处理速度慢时,仅捕获文件封口请求以避免处理堆积。这种模式适用于大多数场景,能够实现处理延迟和吞吐量的最优平衡。false:固定按批捕获方式。仅捕获 TsFile 文件的封口请求,适用于资源受限的应用场景,以降低系统负载。注意,pipe 启动时捕获的快照数据只会以文件的方式供下游处理。 | Boolean: true / false | 否 | true | | mode.strict | 在使用 time / path / database-name / table-name 参数过滤数据时,是否需要严格按照条件筛选:`true`: 严格筛选。系统将完全按照给定条件过滤筛选被捕获的数据,确保只有符合条件的数据被选中。`false`:非严格筛选。系统在筛选被捕获的数据时可能会包含一些额外的数据,适用于性能敏感的场景,可降低 CPU 和 IO 消耗。 | Boolean: true / false | 否 | true | | mode.snapshot | 此参数决定时序数据的捕获方式,影响`inclusion`中的`data`数据。提供两种模式:`true`:静态数据捕获。启动 pipe 时,会进行一次性的数据快照捕获。当快照数据被完全消费后,**pipe 将自动终止(DROP PIPE SQL 会自动执行)**。`false`:动态数据捕获。除了在 pipe 启动时捕获快照数据外,还会持续捕获后续的数据变更。pipe 将持续运行以处理动态数据流。 | Boolean: true / false | 否 | false |