diff --git a/docs/case-practices/best-practice/compare-data-model.md b/docs/case-practices/best-practice/compare-data-model.md
new file mode 100644
index 00000000..0f1cc2e7
--- /dev/null
+++ b/docs/case-practices/best-practice/compare-data-model.md
@@ -0,0 +1,50 @@
+# Compare Target Data Models
+
+Tapdata provides a target model comparison feature to help you identify and resolve structural differences between your logical model and the actual target table during task configuration. This ensures smooth and reliable data synchronization, even in complex production environments.
+
+## Overview
+
+When you create a data replication or transformation task, Tapdata automatically generates a logical model based on the source table structure—this includes fields, data types, primary keys, etc. The platform can then create or adjust tables on the target side to ensure consistency throughout the sync process.
+
+However, in many real-world scenarios, target tables may already exist—created and managed by external systems and used by multiple business processes. In such cases, you typically don’t want Tapdata to modify those tables automatically. This can lead to mismatches between the logical model and the actual target schema, such as:
+
+* Missing fields
+* Incompatible data types
+* Read-only or system-maintained fields that can't be written to
+
+These mismatches can cause task failures or force you to add manual field mapping and filtering, increasing operational overhead and troubleshooting effort.
+
+To address this, Tapdata's **Target Model Comparison** feature helps you detect schema differences and provides flexible handling options to keep your sync tasks stable and efficient.
+
+## Prerequisites
+
+* The task's target must be a relational database (e.g., MySQL, PostgreSQL).
+* The target table strategy must be set to **Keep existing table structure and data** (default setting).
+
+## How to Use
+
+1. Log in to the Tapdata platform.
+
+2. Create a [data replication task](../../data-replication/create-task.md) or a [data transformation task](../../data-transformation/create-task.md).
+
+3. Before starting the task, click the target node in the task canvas. Tapdata will automatically compare the source model with the target table structure and display the differences.
+
+ ![Model Comparison Result](../../images/data_comparison_result.png)
+
+4. If differences are found, you’ll be able to choose how to handle them. Options include removing missing or non-writable fields, or updating field types to match the target schema.
+
+ In the example below, a field type mismatch is resolved by updating the field type to match the target.
+
+ ![Handle Schema Differences](../../images/data_comparison_handle.png)
+
+ **What the Differences Mean**
+
+ * **Read-Only**: Target table contains fields that cannot be written to (e.g., read-only or system-managed). These should be removed from the model to avoid write failures.
+ * **Missing Fields**: The target table lacks certain fields present in the source model. This can cause mapping errors during sync.
+ * **Type Different**: The field exists in both source and target but with different data types. This may result in type conversion errors during sync.
+ * **Not Defined**: Usually means the source model hasn’t fully loaded or the field is missing. Try reloading the model from the source connection.
+ * **Type Different(General)**: Slight variations (e.g., field length or precision) that typically don’t impact the sync process.
+
+5. Once your configuration is complete, click **Start**.
+
+ After the task is successfully launched, you’ll be redirected to the task monitoring page where you can track its progress and status.
\ No newline at end of file
diff --git a/docs/data-replication/create-task.md b/docs/data-replication/create-task.md
index 85b518a2..4457544e 100644
--- a/docs/data-replication/create-task.md
+++ b/docs/data-replication/create-task.md
@@ -100,7 +100,8 @@ As an example of creating a data replication task, the article demonstrates the
 * **Data Source Exclusive Configuration**: Choose whether to save deleted data.
 * **Synchronize Partition Properties**: When this feature is enabled, TapData will automatically create a sharded collection in the target database. This function is only effective when both the source and target databases are MongoDB clusters.
 * **Data Model**
- Displays table structure information of the target table, including field names and field types.
+ Displays table structure information of the target table, including field names and field types. When the inferred model differs from the target table structure, Tapdata will prompt you and guide you to select an automatic handling strategy to ensure stable task execution. For details, see [Compare Data Models](../case-practices/best-practice/compare-data-model.md).
+
 * **Alert Settings**
 Defaults as per source node alert settings.
 
diff --git a/docs/data-transformation/create-task.md b/docs/data-transformation/create-task.md
index ade14053..6fbf591d 100644
--- a/docs/data-transformation/create-task.md
+++ b/docs/data-transformation/create-task.md
@@ -89,7 +89,7 @@ As an example, we will show how to change the **birthdate** field's data type fr
 * **Maximum Wait Time per Batch Write**: Set the maximum waiting time based on the target database’s performance and network latency, measured in milliseconds.
 * **Full Multi-Threaded Write**: The number of concurrent threads for writing full data. The default is **8**; adjust based on the target end's write performance.
 * **Incremental Multi-Threaded Write**: The number of concurrent threads for writing incremental data. By default, it is not enabled. Enable and adjust based on the target end's write performance.
- * **Schema**: Displays the source table structure information, including field names and field types.
+ * **Data Model**: Displays the target table structure information, including field names and field types. When the inferred model differs from the target table structure, Tapdata will prompt you and guide you to select an automatic handling strategy to ensure stable task execution. For details, see [Compare Data Models](../case-practices/best-practice/compare-data-model.md).
 * **Advanced Settings**
 Choose the data writing mode according to business needs:
 * **Handle by Event Type**: After selecting this option, you need to also choose the data writing strategy for insert, update, and delete events.
diff --git a/docs/images/create_api_service.png b/docs/images/create_api_service.png
index e1c7d851..d9f841fc 100644
Binary files a/docs/images/create_api_service.png and b/docs/images/create_api_service.png differ
diff --git a/docs/images/data_comparison_handle.png b/docs/images/data_comparison_handle.png
new file mode 100644
index 00000000..e2ca8fd1
Binary files /dev/null and b/docs/images/data_comparison_handle.png differ
diff --git a/docs/images/data_comparison_result.png b/docs/images/data_comparison_result.png
new file mode 100644
index 00000000..5097109b
Binary files /dev/null and b/docs/images/data_comparison_result.png differ
diff --git a/docs/publish-apis/create-api-service.md b/docs/publish-apis/create-api-service.md
index fc097e1a..3ceae636 100644
--- a/docs/publish-apis/create-api-service.md
+++ b/docs/publish-apis/create-api-service.md
@@ -20,7 +20,6 @@ Currently, it supports Doris, MongoDB, MySQL, Oracle, PostgreSQL, SQL Server, an
 * **Key Configuration Fields**
 * **Service Name**: Give your API a meaningful name for easier identification and management.
 * **Owner Application**: Select the business application this API belongs to. This helps categorize your APIs clearly. See [Application Management](manage-app.md) for more details.
- * **Connection Type**, **Connection Name**, **Object Name**: Choose the data source and object (e.g. a view like `orders-wide-view`) that the API will query.
 - **Interface Type**: TapData provides two modes for querying data via APIs:
 - **Default Query**: A general-purpose mode with built-in pagination and filtering, suitable for client-driven access.
 - **Custom Query**: A structured mode that enables domain-specific APIs with full control over query logic, sorting, and inputs.
@@ -30,7 +29,7 @@ Currently, it supports Doris, MongoDB, MySQL, Oracle, PostgreSQL, SQL Server, an
 - **Input Parameters**: Define the parameters clients can pass when calling this API.
 - For **Default Query**, the platform automatically includes three built-in parameters: `page`, `limit`, and `filter`. This allows dynamic pagination and filtering by the client; custom parameters are **not** supported.
 - For **Custom Query**, you can define your own parameters (such as `region`, `startDate`, or `userLevel`), and map them to specific filter or sort conditions in the UI. In this mode, all filtering is managed server-side; the `filter` parameter is not included unless you explicitly add it. For supported types and configuration rules, see [API Query Parameters](query/api-query-params.md).
- - **Output Results**: By default, all fields from the selected object are returned. You can manually adjust the list to return only selected fields.
+ - **Output Results**: By default, the response will include all fields of the selected object. You can also manually adjust the response to return only specific fields, assign aliases to fields, or apply data masking rules.
 
 4. Click **Save** at the top right of the page.
 
diff --git a/docs/release-notes-on-prem.md b/docs/release-notes-on-prem.md
index 854d0340..bdb79a6b 100644
--- a/docs/release-notes-on-prem.md
+++ b/docs/release-notes-on-prem.md
@@ -14,6 +14,35 @@ import TabItem from '@theme/TabItem';
 
 ```
 
+## 4.8.0
+
+### New Features
+
+* Added support for detecting and [comparing target data models](case-practices/best-practice/compare-data-model.md) to ensure greater data consistency.
+
+### Enhancements
+
+* Enabled rolling upgrades for the API service to ensure uninterrupted availability during updates.
+
+### Bug Fixes
+
+* Fixed an issue where the audit log could not identify client information when the token had expired.
+* Fixed an issue where the `Limit` parameter in API requests was not being applied correctly.
+
+## 4.7.0
+
+### New Features
+
+* Added support for configuring field masking rules when [creating APIs](publish-apis/create-api-service.md). Masked fields will also appear masked in audit logs and debug results, improving data security.
+* When publishing custom query APIs, users can now preview and edit the final query statement and use custom parameter values.
+* API responses now support returning only specific fields from nested documents or arrays, reducing data redundancy and improving performance.
+
+### Enhancements
+
+* Added the ability to configure API database timeout settings in the system settings. When a request times out, clear log messages are generated to simplify troubleshooting.
+* Improved the generated API Swagger documentation to fully include request bodies, response formats, and error codes, enhancing integration efficiency and consistency.
+
+
 ## 4.6.0
 
 ### New Features
diff --git a/sidebars.js b/sidebars.js
index 30e3ce9f..5639c6af 100644
--- a/sidebars.js
+++ b/sidebars.js
@@ -450,6 +450,7 @@ const sidebars = {
 link: {type: 'doc', id: 'case-practices/best-practice/README'},
 items: [
 'case-practices/best-practice/data-sync',
+ 'case-practices/best-practice/compare-data-model',
 'case-practices/best-practice/handle-schema-changes',
 'case-practices/best-practice/heart-beat-task',
 'case-practices/best-practice/alert-via-qqmail',
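
Reviewer note on `compare-data-model.md`: the five difference categories listed under **What the Differences Mean** can be illustrated with a small sketch. The Python below is not TapData's implementation. `Field`, `base_type`, and `compare_models` are hypothetical names, and the rules are assumptions read off the category descriptions: a field that exists only on the target is reported as **Not Defined**, and a change in length or precision alone is reported as **Type Different(General)**.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Field:
    """Hypothetical field descriptor; not a TapData type."""
    name: str
    data_type: str          # e.g. "varchar(64)", "decimal(10,2)"
    writable: bool = True   # False for read-only or system-managed columns


def base_type(data_type: str) -> str:
    """Drop any length/precision suffix, e.g. 'VARCHAR(64)' -> 'varchar'."""
    return re.sub(r"\(.*\)", "", data_type).strip().lower()


def compare_models(logical, target):
    """Return {field name: category} for every field that differs.

    Assumed rules, in priority order for fields in the logical model:
    absent on target -> Missing Fields; not writable -> Read-Only;
    different base type -> Type Different; same base type but different
    length/precision -> Type Different(General).
    Fields present only on the target are labeled Not Defined.
    """
    logical_by_name = {f.name: f for f in logical}
    target_by_name = {f.name: f for f in target}
    diffs = {}
    for name, lf in logical_by_name.items():
        tf = target_by_name.get(name)
        if tf is None:
            diffs[name] = "Missing Fields"
        elif not tf.writable:
            diffs[name] = "Read-Only"
        elif base_type(lf.data_type) != base_type(tf.data_type):
            diffs[name] = "Type Different"
        elif lf.data_type.strip().lower() != tf.data_type.strip().lower():
            diffs[name] = "Type Different(General)"
    for name in target_by_name:
        if name not in logical_by_name:
            diffs[name] = "Not Defined"
    return diffs


# Made-up example models, for illustration only.
logical = [
    Field("id", "int"),
    Field("name", "varchar(64)"),
    Field("amount", "decimal(10,2)"),
    Field("note", "text"),
    Field("updated_at", "datetime"),
]
target = [
    Field("id", "bigint"),
    Field("name", "varchar(128)"),
    Field("amount", "decimal(10,2)"),
    Field("updated_at", "timestamp", writable=False),
    Field("legacy_flag", "tinyint"),
]
diffs = compare_models(logical, target)
for field_name, category in sorted(diffs.items()):
    print(f"{field_name}: {category}")
```

Fields that match exactly, such as `amount` here, are left out of the result, which mirrors the page's description of the comparison as showing only the differences.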