diff --git a/.github/actions/spelling/allow.txt b/.github/actions/spelling/allow.txt
index 500f57a8..226dba80 100644
--- a/.github/actions/spelling/allow.txt
+++ b/.github/actions/spelling/allow.txt
@@ -647,6 +647,50 @@ placeholders
integratetax
webhooks
zohodesk
+ACLs
+apps
+auditability
+CCPA
+CIAM
+CRMs
+deduplicators
+DWH
+evolvability
+extname
+fdm
+feb
+fintech
+fmd
+imv
+informatica
+KDYz
+lte
+MSSQL
+odh
+operationalize
+PAYPAL
+Porcedure
+prebuilt
+Productization
+readdir
+roadmap
+SCD
+Shopify
+shrcno
+signup
+Smartphone
+Takeaways
+Unpublish
+UUc
+WMS
+YXtxk
+IMVs
+nocheck
+Inno
+ipv
+netdev
+somaxconn
+JVM
diff --git a/.gitignore b/.gitignore
index 810cb7bc..acda9de3 100644
--- a/.gitignore
+++ b/.gitignore
@@ -24,3 +24,8 @@ yarn-error.log*
.idea/workspace.xml
.idea/workspace.xml
.idea/workspace.xml
+
+# Broken links reports
+broken-links-report-*.json
+broken-links-report-*.csv
+quick-broken-links-*.csv
diff --git a/README.md b/README.md
index 7779ba80..0dd994bb 100644
--- a/README.md
+++ b/README.md
@@ -49,6 +49,17 @@ We welcome contributions to help improve the documentation! Here are the steps:
npm run start
```
+5. (Optional) Check for broken links.
+
+ ```bash
+ cd tools
+   # Quick check
+ npm run check-links:quick
+
+   # Or run a detailed check
+ npm run check-links
+ ```
+
5. Create a pull request.
## Project structure
diff --git a/docs/administration/README.md b/docs/administration/README.md
deleted file mode 100644
index 441efa66..00000000
--- a/docs/administration/README.md
+++ /dev/null
@@ -1,9 +0,0 @@
-# Admin & Operations
-
-import Content from '../reuse-content/_all-features.md';
-
-
-
-import DocCardList from '@theme/DocCardList';
-
-
\ No newline at end of file
diff --git a/docs/appendix/benchmark.md b/docs/appendix/benchmark.md
index 1929618e..b8a96d43 100644
--- a/docs/appendix/benchmark.md
+++ b/docs/appendix/benchmark.md
@@ -1,9 +1,5 @@
# Performance Testing
-import Content from '../reuse-content/_enterprise-features.md';
-
-
-
This document aims to detail the methods and steps for conducting performance tests on TapData. We will explore how to accurately assess TapData's data processing capabilities, response times, and system stability under various conditions. This helps you understand TapData's performance under different loads, allowing for better resource planning and configuration optimization.
## Testing Environment
diff --git a/docs/appendix/enhanced-js.md b/docs/appendix/enhanced-js.md
index 512af072..66694eaa 100644
--- a/docs/appendix/enhanced-js.md
+++ b/docs/appendix/enhanced-js.md
@@ -1,12 +1,8 @@
# Enhanced JS Built-in Function
-import Content from '../reuse-content/_all-features.md';
-
-
-
Enhanced JS nodes allow you to utilize all built-in functions for external calls, such as networking and database operations. If your requirement is solely to process and operate on data records, it is recommended to use [standard JS nodes](standard-js.md).
-For detailed instructions on how to use enhanced JS nodes and explore various scenarios, please refer to the documentation and resources available for [JS processing node](../user-guide/data-development/process-node.md#js-process).
+For detailed instructions on how to use enhanced JS nodes and explore various scenarios, please refer to the documentation and resources available for [JS processing node](../data-transformation/process-node.md#js-process).
:::tip
diff --git a/docs/appendix/standard-js.md b/docs/appendix/standard-js.md
index c8208ef5..2f8262f1 100644
--- a/docs/appendix/standard-js.md
+++ b/docs/appendix/standard-js.md
@@ -1,12 +1,9 @@
# Standard JS Built-in Function
-import Content from '../reuse-content/_all-features.md';
-
-
Standard JS nodes can only process and operate on data records. If you require the usage of system built-in functions for external calls, such as networking or database operations, you can utilize [enhanced JS nodes](enhanced-js.md).
-For information on how to use and scenarios, see [JS processing node](../user-guide/data-development/process-node.md#js-process).
+For information on how to use and scenarios, see [JS processing node](../data-transformation/process-node.md#js-process).
## DateUtil
diff --git a/docs/appendix/support.md b/docs/appendix/support.md
index 155ee767..a7e2a431 100644
--- a/docs/appendix/support.md
+++ b/docs/appendix/support.md
@@ -1,9 +1,5 @@
# Technical Support
-import Content from '../reuse-content/_all-features.md';
-
-
-
In addition to consulting the TapData documentation, you can also access the original technical support of TapData products through the online customer service, user community and ticket system.
## Description of the account
diff --git a/docs/billing/README.md b/docs/billing/README.md
deleted file mode 100644
index 03cf05c4..00000000
--- a/docs/billing/README.md
+++ /dev/null
@@ -1,17 +0,0 @@
-# Billing
-
-import Content from '../reuse-content/_cloud-features.md';
-
-
-
-This section will introduce the billing details of the TapData Cloud product.
-
-:::tip
-
-If you choose the TapData Enterprise, it is paid annually and can be deployed to your local data center, suitable for scenarios with strict requirements on data sensitivity or network isolation. Before making a purchase, you can [apply for a trial](https://tapdata.net/tapdata-on-prem/demo.html).
-
-:::
-
-import DocCardList from '@theme/DocCardList';
-
-
\ No newline at end of file
diff --git a/docs/billing/billing-overview.md b/docs/billing/billing-overview.md
deleted file mode 100644
index 6295e03f..00000000
--- a/docs/billing/billing-overview.md
+++ /dev/null
@@ -1,116 +0,0 @@
-# Billing Overview
-import Content from '../reuse-content/_cloud-features.md';
-
-
-
-This article introduces billing information such as billing items, billing methods and price descriptions in TapData Cloud.
-
-:::tip
-
-If you choose the TapData Enterprise, it is paid annually and can be deployed to your local data center, suitable for scenarios with strict requirements on data sensitivity or network isolation. Before making a purchase, you can [apply for a trial](https://tapdata.net/tapdata-on-prem/demo.html).
-
-:::
-
-## Billing method
-
-TapData Cloud charges based on the **specifications** and **number** of Agent instances you want to subscribe. You will get 1 Agent instance upon completing account registration, and you have the option to purchase additional Agent instances through monthly, annual, consecutive monthly, or consecutive annual subscriptions to meet your business requirements.
-
-There are several subscription options available for TapData Cloud:
-
-- **One Month Only**: This is a one-time purchase of a one-month service. The subscription will not automatically renew after the expiration and can be manually renewed if desired.
-- **One Year Only**: This option allows for a one-time purchase of a one-year service. The subscription will not automatically renew after the expiration and can be manually renewed when needed.
-- **Monthly**: With the monthly subscription option, you pay a monthly fee. The subscription fee for the next month will be automatically deducted before the due date, ensuring uninterrupted service.
-- **Annually**: The annual subscription option requires paying the subscription fee once a year. Similar to the monthly option, the subscription fee for the next year will be automatically deducted before the due date, providing convenience and continuity.
-
-:::tip
-
-- When you select the recurring monthly or annual billing method, TapData Cloud will automatically deduct the subscription fee for the next billing cycle on the expiration date of each period. You can conveniently review the charge details in the **user center**, allowing you to stay informed about the payment information and have a clear understanding of the billing process for your TapData Cloud subscription.
-- When selecting a **fully managed** instance, you will also need to pay for traffic based on the amount of synchronized data (charged monthly)
-
-:::
-
-## Payment Methods
-
-You can pay for TapData Cloud by credit card.
-
-
-
-## Agent Specifications and Descriptions
-
-Please note that the performance of the following tables is provided for reference purposes only, as the data flow can be influenced by various factors such as the load performance of the Agent's machine, network transmission delay, network bandwidth, and the workload of the source and target databases.
-
-
-
-
-
-| Specifications | Running Tasks | Host CPU Cores ① | Host RAM ① | Performance Reference (RPS) |
-| -------------- | ------------- | ---------------- | ---------- | --------------------------- |
-| SMALL          | 3             | 1 core           | 4 GB       | 2,000                       |
-| LARGE          | 5             | 2 cores          | 6 GB       | 4,000                       |
-| XLARGE         | 10            | 4 cores          | 10 GB      | 8,000                       |
-| 2XLARGE        | 20            | 8 cores          | 19 GB      | 16,000                      |
-| 3XLARGE        | 30            | 12 cores         | 28 GB      | 24,000                      |
-| 4XLARGE        | 40            | 16 cores         | 37 GB      | 32,000                      |
-| 8XLARGE        | 80            | 32 cores         | 72 GB      | 64,000                      |
-
-
-:::tip
-
-① In order to ensure the maximum data flow performance, it is recommended that the machine deployed by the Agent (referred to as the **host** in the above table) has sufficient resources such as computing, storage and bandwidth. For more information, see [Install Agent](../installation/install-tapdata-agent.md).
-
-:::
-
diff --git a/docs/billing/expiration.md b/docs/billing/expiration.md
deleted file mode 100644
index f2f5b835..00000000
--- a/docs/billing/expiration.md
+++ /dev/null
@@ -1,21 +0,0 @@
-# Expiration Policy
-import Content from '../reuse-content/_cloud-features.md';
-
-
-
-To prevent any disruptions to your business operations, we highly recommend [renewing](renew-subscribe.md) your annual/monthly instance before it expires. Alternatively, during the initial subscription process, you can choose the continuous annual/monthly payment method. This ensures seamless continuity of your instance and avoids any potential impact on your business.
-
-
-
-After the expiration of a subscribed Agent instance, the following effects can be observed:
-
-* The associated tasks can continue to run, but they will be unable to execute any scheduled policies within the tasks.
-* The data source associated with the Agent will be unable to load the Schema.
-* New tasks associated with the Agent cannot be started.
-* The data source associated with the Agent cannot undergo a connection test.
-
-:::tip
-
-It is important to keep these effects in mind and ensure timely renewal or consider the appropriate actions to maintain uninterrupted functionality and data source operations for your Agent instance.
-
-:::
\ No newline at end of file
diff --git a/docs/billing/purchase.md b/docs/billing/purchase.md
deleted file mode 100644
index 396c812e..00000000
--- a/docs/billing/purchase.md
+++ /dev/null
@@ -1,59 +0,0 @@
-# Subscription Instance
-
-import Content from '../reuse-content/_cloud-features.md';
-
-
-
-After registering with TapData Cloud, you will receive the benefit of creating one free Agent instance. If you require additional agents or desire higher transfer performance, you can refer to the instructions in this article to complete the subscription process for the desired instance.
-
-## Procedure
-
-1. Log in to [TapData Cloud](https://cloud.tapdata.io/).
-
-2. In the left navigation panel, click **Resource Management**.
-
- After successfully creating a free Agent instance, if you find that your business requires additional Agent instances to meet performance needs, you can proceed with subscribing to more instances. This will allow you to scale up the capabilities of TapData Cloud to accommodate your business requirements effectively.
-
- 
-
-3. On the right side of the page, click **Create Agent**.
-
-4. In the pop-up dialog, select deploy mode, spec and subscription period.
-
- 
-
- * **Deploy Mode**: Choose the deploy mode based on your business needs:
- * **Self-Hosted Mode**: You need provide the equipment for [deploying](../installation/install-tapdata-agent.md) and maintaining the Agent. This allows for the optimal utilization of existing hardware resources, resulting in lower costs and enhanced security.
- * **Fully Managed Mode**: TapData Cloud provides the required computing/storage resources for running the Agent and deploys it automatically. Additionally, we offer unified operational maintenance and resource monitoring to enhance reliability. This enables one-click delivery and usage, eliminating the need for deployment and operational efforts, allowing you to focus on your core business activities.
- :::tip
- When selecting the **Fully Managed Mode**, you also need to choose the cloud provider and region where the Agent will be deployed.
- :::
- * **Cloud Provider** and **Region**: Required when choosing the **Fully Managed Mode**.
- * **Agent Spec**: Select product specifications based on the number of tasks and performance requirements required for evaluation. You can create an example of **SMALL** specifications for free. For detailed descriptions of product pricing and specifications, see [Billing Overview](billing-overview.md).
- * **Traffic Billing**: When choosing **Fully Managed Mode**, you will also need to pay for traffic based on the amount of synchronized data (billed monthly).
- * **Subscription Method**: Select the required subscription period, in order to avoid the expiration of the instance affecting the execution of the task, it is recommended to choose the Annually (**10% off**) or Monthly (**5% off**).
-
-5. Click **Next**, on the following page, carefully review and confirm the specifications you wish to purchase. Ensure that the selected billing method aligns with your preferences. Additionally, verify that the email address provided is accurate and where you would like to receive the bill.
-
- Once you have double-checked all the information, click on the **OK** button to proceed with the purchase.
-
-6. You will redirected to payment page. Please follow the instructions on the payment page to complete the payment process. After completing the payment, you will be able to download the payment credentials.
-
-7. After the payment is successful, return to the TapData Cloud platform to see that the Agent instance you purchased is **To be deployed**.
-
- Next, you can deploy the Agent on your server. For more information, see [Install Agent](../installation/install-tapdata-agent.md).
-
- 
-
-
-
-## Next Steps
-
-To ensure the proper use of subsequent data replication/transformation functions, you need to adjust the relevant firewalls to ensure that the Agent can communicate normally with TapData Cloud and the source/target databases. The Agent workflow is shown below:
-
-
-
-If you have subscribed to the [Fully Managed Agent](#hosted-mode) and the connected data sources only accept connections from specific IP addresses, you need to add the Agent's server address to the security settings of the corresponding data source. For example, add it to the firewall whitelist rules of a self-hosted database to allow the Agent to establish communication and transfer data with your data sources. The Agent server addresses for each region are as follows:
-
-- Alibaba Cloud Beijing: **47.93.190.224**
-- Alibaba Cloud Hong Kong: **47.242.251.110**
diff --git a/docs/billing/renew-subscribe.md b/docs/billing/renew-subscribe.md
deleted file mode 100644
index e0ed317d..00000000
--- a/docs/billing/renew-subscribe.md
+++ /dev/null
@@ -1,25 +0,0 @@
-# Manage Subscription
-
-import Content from '../reuse-content/_cloud-features.md';
-
-
-
-For annual/monthly subscription instances, TapData Cloud will remind you to renew one month before the expiration date. To avoid any impact on your business, please renew your subscription in time before it expires, or choose the **Continuous Annual/Monthly** billing method at the time of purchase. Additionally, you can view traffic bills in the Subscription Center.
-
-
-
-## Procedure
-
-1. Log in to [TapData Cloud](https://cloud.tapdata.io/).
-
-2. In the left navigation bar, click **Subscriptions**.
-
-3. View your current subscription information, locate the target instance, and choose the action you wish to perform:
-
- 
-
- - **Renew**: Renew the instance. If you have a continuous annual/monthly subscription, this action is not required.
- - **Change**: Upgrade the instance specifications and follow the prompts to complete the payment process. You can also view the instance change history.
- - **Unsubscribe**: If the instance is no longer needed, you can [unsubscribe](refund.md) as long as its associated tasks do not impact your business.
- - **Authorization Code**: This section displays the authorization code information for Agent instances purchased through the Alibaba Cloud Marketplace. You can choose to issue invoices or renew subscriptions.
- - **Traffic Bill**: View the traffic bills for data synchronization, with charges billed monthly.
diff --git a/docs/case-practices/README.md b/docs/case-practices/README.md
index fb3fd7ac..e84a84c9 100644
--- a/docs/case-practices/README.md
+++ b/docs/case-practices/README.md
@@ -1,8 +1,6 @@
# Practical Cases
-import Content from '../reuse-content/_all-features.md';
-
import DocCardList from '@theme/DocCardList';
diff --git a/docs/case-practices/best-practice/alert-via-qqmail.md b/docs/case-practices/best-practice/alert-via-qqmail.md
index d5e6550e..c52d394d 100644
--- a/docs/case-practices/best-practice/alert-via-qqmail.md
+++ b/docs/case-practices/best-practice/alert-via-qqmail.md
@@ -1,7 +1,4 @@
# Send Alert Emails via QQ Mail
-import Content from '../../reuse-content/_enterprise-features.md';
-
-
TapData supports sending alert emails through SMTP protocol, enabling users to receive timely notifications in their commonly used email accounts, thus helping you promptly perceive operational anomalies and ensure the stability and reliability of task operations.
@@ -14,7 +11,7 @@ You can also integrate other email services (such as 163 Mail) in TapData platfo
## Notes
Tapdata can monitor task status and trigger alerts when specific events occur.
-For configurable alert types, see [Alert Settings](../../user-guide/other-settings/notification.md). You can choose which events should trigger email notifications based on your needs.
+For configurable alert types, see [Alert Settings](../../system-admin/other-settings/notification.md). You can choose which events should trigger email notifications based on your needs.
## Step One: Obtain Email Authorization Code
@@ -54,7 +51,7 @@ The email authorization code is a special password used by QQ Mail to log into t
## Step Two: Configure SMTP Service
-1. [Log in to TapData platform](../../user-guide/log-in.md).
+1. Log in to the TapData platform.
2. In the top right corner of the page, click the  icon, then select **System Settings**.
diff --git a/docs/case-practices/best-practice/data-sync.md b/docs/case-practices/best-practice/data-sync.md
index 25acac77..f8ab6704 100644
--- a/docs/case-practices/best-practice/data-sync.md
+++ b/docs/case-practices/best-practice/data-sync.md
@@ -1,7 +1,5 @@
# Data Sync Best Practices
-import Content from '../../reuse-content/_all-features.md';
-
This guide aims to provide best practices for data synchronization using TapData Cloud. We will discuss in detail aspects like data source analysis, task configuration, and monitoring, to help you build efficient and reliable data synchronization tasks.
@@ -14,13 +12,8 @@ Analyzing the data sources is fundamental to data synchronization. It helps asse
| **Number of Tables to Synchronize** | Estimate the scale and complexity of the synchronization task based on this data. If there are many tables, create data synchronization tasks in batches or prioritize synchronizing key data. |
| **Volume of Data Changes** | Estimate the daily data change volume to adjust the synchronization frequency and performance parameters, ensuring real-time or near-real-time data updates. |
| **Primary Keys/Unique Indexes** | Primary keys or unique indexes play a crucial role in synchronization performance and data consistency. If absent, special configurations may be needed for these tables in subsequent task settings. |
-| **Target Database Type** | Confirm the type of target database. For heterogeneous data synchronization, ensure data type compatibility. For more information, see [Data Type Support](../../user-guide/no-supported-data-type.md). |
+| **Target Database Type** | Confirm the type of target database. For heterogeneous data synchronization, ensure data type compatibility. For more information, see [Data Type Support](../../faq/no-supported-data-type.md). |
-:::tip
-
-When subscribe an instance, you can choose the specifications based on the estimated scale of table data and data change volume. For more details, see [Specification Description](../../billing/billing-overview.md#spec).
-
-:::
## Configure and Optimize Tasks
@@ -34,11 +27,11 @@ Based on the understanding of the data source, the next step is to configure dat
## Monitor and Maintain
-After starting the task, regularly check the task [monitoring page](../../user-guide/copy-data/monitor-task.md) for details such as the synchronization rate during the full synchronization phase and changes in the source database data, so you can ensure timely identification and resolution of any issues. If you encounter task anomalies, consult the task logs for detailed [Error Codes and Solutions](../../user-guide/error-code-solution.md) to facilitate troubleshooting.
+After starting the task, regularly check the task [monitoring page](../../data-replication/monitor-task.md) for details such as the synchronization rate during the full synchronization phase and changes in the source database data, so you can ensure timely identification and resolution of any issues. If you encounter task anomalies, consult the task logs for detailed [Error Codes and Solutions](../../faq/error-code-solution.md) to facilitate troubleshooting.
Additionally, during the task execution, you can log into the TapData server and use commands like `top` or `free` to monitor whether the server's compute or memory resources have reached their limits.
## See also
-* [Create Data Replication Tasks](../../user-guide/copy-data/README.md)
+* [Create Data Replication Tasks](../../data-replication/README.md)
* [Frequently Asked Questions](../../faq/README.md)
\ No newline at end of file
diff --git a/docs/case-practices/best-practice/full-breakpoint-resumption.md b/docs/case-practices/best-practice/full-breakpoint-resumption.md
index 565d7599..d0dc50d6 100644
--- a/docs/case-practices/best-practice/full-breakpoint-resumption.md
+++ b/docs/case-practices/best-practice/full-breakpoint-resumption.md
@@ -1,8 +1,6 @@
# Ensure Data Migration with Breakpoint Continuation
-import Content from '../../reuse-content/_all-features.md';
-
In scenarios involving massive data migration, you can utilize TapData's full resumption from breakpoint feature to segment and migrate data, enhancing the reliability of data migration and ensuring successful execution of migration tasks.
@@ -17,13 +15,13 @@ To address this issue, TapData introduces the full resumption from breakpoint fu
## Prerequisites
* The full resumption from breakpoint is currently only supported for MongoDB data sources, i.e., the source database must be MongoDB.
-* Before creating a data transformation task, ensure you have configured the relevant data sources, see [Configuring MongoDB Connection](../../prerequisites/on-prem-databases/mongodb.md) for details.
+* Before creating a data transformation task, ensure you have configured the relevant data sources, see [Configuring MongoDB Connection](../../connectors/on-prem-databases/mongodb.md) for details.
## Procedure
In this case, we will demonstrate the specific configuration process for data migration between MongoDB instances.
-1. [Log in to the TapData platform](../../user-guide/log-in.md).
+1. Log in to the TapData platform.
2. In the left navigation bar, select **Data Pipeline** > **Data Replication**.
@@ -127,7 +125,7 @@ In this case, we will demonstrate the specific configuration process for data mi
:::tip
- If the pre-check fails, adjust according to the log prompts on the current page. For more information, see [Task Pre-Check Explanation](../../user-guide/pre-check.md).
+ If the pre-check fails, adjust according to the log prompts on the current page. For more information, see [Task Pre-Check Explanation](../../connectors/pre-check.md).
:::
diff --git a/docs/case-practices/best-practice/handle-schema-changes.md b/docs/case-practices/best-practice/handle-schema-changes.md
index 8cb966f5..19e79251 100644
--- a/docs/case-practices/best-practice/handle-schema-changes.md
+++ b/docs/case-practices/best-practice/handle-schema-changes.md
@@ -1,7 +1,4 @@
# Handle DDL Changes During Data Sync
-import Content from '../../reuse-content/_all-features.md';
-
-
During data migration and synchronization with TapData Cloud, recognizing the impact of table structure modifications, such as DDL (Data Definition Language) operations, is crucial for continuous business operations. The platform seamlessly manages most DDL changes, ensuring a smooth synchronization process.
@@ -9,7 +6,7 @@ During data migration and synchronization with TapData Cloud, recognizing the im
To ensure the high availability and fault tolerance of data replication/transformation tasks, by default, TapData does not synchronize the DDL statements from the source database to the target database. If you need to enable this feature, please follow these steps:
-1. When creating or editing a [data replication](../../user-guide/copy-data/create-task.md) or [data transformation](../../user-guide/data-development/create-task.md) task, go to the configuration page of the source database node.
+1. When creating or editing a [data replication](../../data-replication/create-task.md) or [data transformation](../../data-transformation/create-views/README.md) task, go to the configuration page of the source database node.
2. Find the **Advanced Settings** tab and check if the **Sync DDL Events** option is available.
@@ -21,7 +18,7 @@ To ensure the high availability and fault tolerance of data replication/transfor
:::tip
- Besides enabling this switch, the target database must also support **DDL** **application**. You can check the support status of various data sources for DDL event collection and DDL apply through the [supported data sources](../../prerequisites/supported-databases.md) document, or in the **Advanced Settings** of the target node, see the hint for **DDL Event Apply**.
+ Besides enabling this switch, the target database must also support **DDL application**. You can check each data source's support for DDL event collection and DDL apply in the [supported data sources](../../connectors/supported-data-sources.md) document, or check the **DDL Event Apply** hint in the **Advanced Settings** of the target node.
:::
@@ -29,7 +26,7 @@ To ensure the high availability and fault tolerance of data replication/transfor
| DDL Collection | DDL Apply |
| ----------------------- | ------------------------------------------------------------ |
-| Add Fields | TapData will automatically adapt the field type when adding fields to the target database, for example, converting from MySQL's **INT** to Oracle's **NUMBER(38,0)**. If there are [unsupported column types](../../user-guide/no-supported-data-type.md), this may lead to the failure of adding fields. |
+| Add Fields | TapData will automatically adapt the field type when adding fields to the target database, for example, converting from MySQL's **INT** to Oracle's **NUMBER(38,0)**. If there are [unsupported column types](../../faq/no-supported-data-type.md), this may lead to the failure of adding fields. |
| Modify Field Names | TapData will automatically complete this operation in the target database, be aware of the target database's field naming restrictions. |
| Modify Field Attributes | When synchronizing between different types of databases (for example, from MySQL to Oracle), ensure that the target database supports the changed data types and attributes. Otherwise, this may lead to errors or interruptions in the synchronization task. |
| Delete Fields | Deleting columns from the source table can have a severe impact on the data pipeline, especially when the column is a key part of the data processing logic, such as a primary key or as a field for update conditions in the synchronization link. Before making such changes, ensure that other components in the data pipeline no longer depend on this column. |
@@ -42,9 +39,9 @@ For data sources with weak Schema constraints, the Schema information of histori
-If a DDL synchronization error causes a data synchronization task to interrupt, you can either undo the relevant DDL operation in the source database or choose to [reload the source database's Schema](../../user-guide/manage-connection.md), then reset and restart the task to repair it.
+If a DDL synchronization error causes a data synchronization task to interrupt, you can either undo the relevant DDL operation in the source database or choose to [reload the source database's Schema](../../connectors/manage-connection.md), then reset and restart the task to repair it.
-Additionally, consulting the [task log](../../user-guide/copy-data/monitor-task.md) and [error codes](../../user-guide/error-code-solution.md) can aid in identifying and rectifying the root causes of the task failure. A normal DDL collection and application log example is as follows:
+Additionally, consulting the [task log](../../data-replication/monitor-task.md) and [error codes](../../faq/error-code-solution.md) can aid in identifying and rectifying the root causes of the task failure. A normal DDL collection and application log example is as follows:

@@ -55,4 +52,4 @@ To safely manage DDL changes and reduce the potential risk to data synchronizati
1. **Pre-change Verification**: Verify the full extent of DDL changes in a test environment before applying them in a production environment to identify potential issues that may interrupt synchronization.
2. **Planning and Notification**: Schedule DDL changes during off-peak business hours and notify related teams in advance.
3. **Pipeline Configuration Updates**: Regularly review and update your data pipeline configurations to match the latest table structures.
-4. **Monitoring and Alerts**: Set up [monitoring](../../user-guide/copy-data/monitor-task.md) and alerts for your data pipelines to respond quickly to unsupported DDL operations.
+4. **Monitoring and Alerts**: Set up [monitoring](../../data-replication/monitor-task.md) and alerts for your data pipelines to respond quickly to unsupported DDL operations.
diff --git a/docs/case-practices/best-practice/heart-beat-task.md b/docs/case-practices/best-practice/heart-beat-task.md
index e8be12a3..14e59b02 100644
--- a/docs/case-practices/best-practice/heart-beat-task.md
+++ b/docs/case-practices/best-practice/heart-beat-task.md
@@ -1,7 +1,5 @@
# Monitor Data Synchronization with Heartbeat Tables
-import Content from '../../reuse-content/_all-features.md';
-
TapData uses heartbeat tables to write timestamp information to the source database every **10 seconds**. By checking the timestamp information in the heartbeat tables, we can quickly determine the activity and health of the data source, thereby better monitoring the data synchronization path and ensuring the stability and reliability of the data synchronization path.
@@ -15,18 +13,18 @@ TapData uses heartbeat tables to write timestamp information to the source datab
## Considerations
-* Since the heartbeat table function needs to automatically create the table and update the timestamp in the source database, ensure the database account has the necessary permissions before enabling. For example, in MySQL, ensure the account has **CREATE**, **INSERT**, and **UPDATE** permissions. For more on permissions, see [Data Source Preparation](../../prerequisites/README.md).
+* Since the heartbeat table function needs to automatically create the table and update the timestamp in the source database, ensure the database account has the necessary permissions before enabling. For example, in MySQL, ensure the account has **CREATE**, **INSERT**, and **UPDATE** permissions. For more on permissions, see [Data Source Preparation](../../connectors/README.md).
* The heartbeat table is named **_tapdata_heartbeat_table**. Ensure the integrity and reliability of the heartbeat table data and avoid operations on it in the source database (such as deleting the table).
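Since TapData refreshes the heartbeat timestamp every 10 seconds, you can gauge the health of the capture path by querying the heartbeat table directly in the source database. A minimal MySQL sketch is shown below; it assumes the timestamp column is named `ts`, which may differ in your deployment:

```sql
-- Check how stale the latest heartbeat is in the source MySQL database.
-- A lag consistently well above 10 seconds suggests the sync path is unhealthy.
SELECT ts AS last_heartbeat,
       TIMESTAMPDIFF(SECOND, ts, NOW()) AS lag_seconds
FROM _tapdata_heartbeat_table
ORDER BY ts DESC
LIMIT 1;
```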
## Enabling Heartbeat Tables for a Data Source
-1. [Log in to the TapData platform](../../user-guide/log-in.md).
+1. Log in to the TapData platform.
2. In the left navigation bar, click **Connection Management**.
3. Find the created data source and click **Edit** in the **Actions** column.
- If setting this up during the creation of a new data source, the method is the same. For more information, see [Connecting Data Sources](../../prerequisites/README.md).
+ If setting this up during the creation of a new data source, the method is the same. For more information, see [Connecting Data Sources](../../connectors/README.md).
4. Scroll down to the bottom of the page and turn on the heartbeat table switch.
@@ -34,7 +32,7 @@ TapData uses heartbeat tables to write timestamp information to the source datab
:::tip
- If you do not see this switch, check the **Connection Type** setting on the page to ensure it is set to **Source and Target**. Additionally, some data sources do not support being used as both source and target. For more information, see [Supported Data Sources](../../prerequisites/supported-databases.md).
+ If you do not see this switch, check the **Connection Type** setting on the page to ensure it is set to **Source and Target**. Additionally, some data sources do not support being used as both source and target. For more information, see [Supported Data Sources](../../connectors/supported-data-sources.md).
:::
diff --git a/docs/case-practices/best-practice/raw-logs-solution.md b/docs/case-practices/best-practice/raw-logs-solution.md
index d5675a79..86a73662 100644
--- a/docs/case-practices/best-practice/raw-logs-solution.md
+++ b/docs/case-practices/best-practice/raw-logs-solution.md
@@ -1,7 +1,5 @@
# Deploy Oracle Raw Log Parsing Service
-import Content from '../../reuse-content/_enterprise-and-cloud-features.md';
-
To enhance the efficiency of capturing data changes, TapData supports not only using the native log parsing tools of databases (LogMiner) but also has developed the capability to directly parse the incremental log files of the database. This allows for more efficient event capture, achieving higher data collection performance (RPS over 20,000), reducing the impact on the source database during incremental data collection, but it requires the deployment of an additional component, which increases operational costs, making it suitable for scenarios with frequent data changes.
@@ -13,7 +11,7 @@ To enhance the efficiency of capturing data changes, TapData supports not only u
* **Operating System**: Linux 64 or Windows 64 platforms.
* **Storage**: Supported file systems include ext4, btrfs, zfs, xfs, sshfs; supported database block sizes are 2k, 4k, 8k, 16k, 32k.
* **Port Requirements**: Some server ports must be open for service communication, including: default data transfer port: **8203**, web management default port: **8303**, raw log service port: **8190**.
-* **Permission**: The operating system user running the raw log plugin must have read access to redo log files; in addition to the permissions required for the source database as per the [Oracle Preparation Work](../../prerequisites/on-prem-databases/oracle.md#source) and enabling archive logs, additional permissions must be granted to simulate Oracle's data information structure and processes to cache part of Oracle Schema information to support the parsing of redo logs.
+* **Permission**: The operating system user running the raw log plugin must have read access to redo log files; in addition to the permissions required for the source database as per the [Oracle Preparation Work](../../connectors/on-prem-databases/oracle.md#source) and enabling archive logs, additional permissions must be granted to simulate Oracle's data information structure and processes to cache part of Oracle Schema information to support the parsing of redo logs.
```sql
-- Replace with the actual username
@@ -106,7 +104,7 @@ Next, we'll demonstrate the deployment process for raw log querying using Oracle
## Next Steps
-When [configuring an Oracle connection](../../prerequisites/on-prem-databases/oracle.md), choose the log plugin as **bridge** and then enter the IP address of the raw log service, with the default service port being **8190**.
+When [configuring an Oracle connection](../../connectors/on-prem-databases/oracle.md), choose the log plugin as **bridge** and then enter the IP address of the raw log service, with the default service port being **8190**.
## Common Questions
diff --git a/docs/case-practices/pipeline-tutorial/README.md b/docs/case-practices/pipeline-tutorial/README.md
index c68a7089..832ddb0c 100644
--- a/docs/case-practices/pipeline-tutorial/README.md
+++ b/docs/case-practices/pipeline-tutorial/README.md
@@ -1,8 +1,6 @@
# Data Pipeline Tutorial
-import Content from '../../reuse-content/_all-features.md';
-
import DocCardList from '@theme/DocCardList';
diff --git a/docs/case-practices/pipeline-tutorial/excel-to-mysql.md b/docs/case-practices/pipeline-tutorial/excel-to-mysql.md
index b2fc2920..834a3e1a 100644
--- a/docs/case-practices/pipeline-tutorial/excel-to-mysql.md
+++ b/docs/case-practices/pipeline-tutorial/excel-to-mysql.md
@@ -1,8 +1,6 @@
# Excel to MySQL Real-Time Sync
-import Content from '../../reuse-content/_all-features.md';
-
Excel is widely used software for data statistics and analysis. TapData can read Excel files stored locally or on FTP, SFTP, SMB, OSS, or S3FS, meeting diverse data flow needs.
@@ -20,12 +18,12 @@ The business provides real-time data updates and improved data analysis capabili
Before you create a replication task, make sure you have configured the relevant data source:
-1. [Configure Excel Connection](../../prerequisites/files/excel.md)
-2. [Configure MySQL Connection](../../prerequisites/on-prem-databases/mysql.md)
+1. [Configure Excel Connection](../../connectors/files/excel.md)
+2. [Configure MySQL Connection](../../connectors/on-prem-databases/mysql.md)
## Procedure
-1. [Log in to TapData Platform](../../user-guide/log-in.md).
+1. Log in to the TapData platform.
2. Based on the product type, select the operation entry:
diff --git a/docs/case-practices/pipeline-tutorial/extract-array.md b/docs/case-practices/pipeline-tutorial/extract-array.md
index aa9aad1a..13433185 100644
--- a/docs/case-practices/pipeline-tutorial/extract-array.md
+++ b/docs/case-practices/pipeline-tutorial/extract-array.md
@@ -1,7 +1,5 @@
# Building an Array Extraction Link to Simplify Data Analysis
-import Content from '../../reuse-content/_all-features.md';
-
In modern payment systems, the analysis of payment data is crucial for understanding user behavior, optimizing business processes, and making decisions. For database tables storing payment data, payment data is sometimes written as a JSON string in a field, complicating its structure and making subsequent analysis complex.
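Outside of TapData, the flattening that such a link performs can be sketched in plain JavaScript as follows. The `payments` field name, the sample values, and the output shape are illustrative assumptions, not the actual settlement schema:

```javascript
// Illustrative settlement row whose `payments` field stores a
// JSON-encoded array (field names are hypothetical).
const record = {
  order_id: 1001,
  payments: '[{"method":"CARD","amount":30},{"method":"CASH","amount":20}]'
};

// Flatten: parse the JSON string and emit one output row per array
// element, carrying the parent key alongside each payment.
function extractPayments(row) {
  const items = JSON.parse(row.payments || "[]");
  return items.map((p, i) => ({
    order_id: row.order_id,
    payment_no: i + 1,
    method: p.method,
    amount: p.amount
  }));
}

console.log(extractPayments(record));
```

Each emitted row is now a flat structure that downstream analysis can query directly, instead of a JSON string buried in a single column.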
@@ -52,11 +50,11 @@ Next, we will introduce how to use the built-in **Standard JS** node in TapData
## Prerequisites
-Before creating a data conversion task, you need to add the data source to which the settlement table belongs to TapData. Also, you need to add a data source (such as a MySQL database) as the target database. For specific operations, see [Configure MySQL Connection](../../prerequisites/on-prem-databases/mysql.md).
+Before creating a data conversion task, you need to add the data source to which the settlement table belongs to TapData. Also, you need to add a data source (such as a MySQL database) as the target database. For specific operations, see [Configure MySQL Connection](../../connectors/on-prem-databases/mysql.md).
## Procedure
-1. [Log in to TapData Platform](../../user-guide/log-in.md).
+1. Log in to the TapData platform.
2. Based on the product type, select the operation entry:
* **TapData Cloud**: In the left navigation panel, click **Data Transformation**.
@@ -74,7 +72,7 @@ Before creating a data conversion task, you need to add the data source to which

- For more configuration introductions, see [Create Data Transform Task](../../user-guide/data-development/create-task.md).
+ For more configuration introductions, see [Create Data Transform Task](../../data-transformation/create-views/README.md).
6. Click the middle Standard JS node and enter the following code in the script text box on the right.
@@ -121,7 +119,7 @@ Before creating a data conversion task, you need to add the data source to which

- For more configuration introductions, see [Create Data Transform Task](../../user-guide/data-development/create-task.md).
+ For more configuration introductions, see [Create Data Transform Task](../../data-transformation/create-views/README.md).
9. After the configuration is complete, click **Save** in the lower right corner. Name the task and select the relevant directory to save. Click **Start**.
diff --git a/docs/case-practices/pipeline-tutorial/mysql-bi-directional-sync.md b/docs/case-practices/pipeline-tutorial/mysql-bi-directional-sync.md
index 7f91ac2c..33a0912a 100644
--- a/docs/case-practices/pipeline-tutorial/mysql-bi-directional-sync.md
+++ b/docs/case-practices/pipeline-tutorial/mysql-bi-directional-sync.md
@@ -1,8 +1,6 @@
# Implement Multi-Active with MySQL Bi-directional Sync
-import Content from '../../reuse-content/_all-features.md';
-
With the rapid development of enterprise business, ensuring data consistency and high availability has become a core requirement. Data synchronization across different regions not only allows for local access and reduced response latency but also helps build a multi-active architecture, enhancing system stability and reliability against single points of failure.
@@ -41,7 +39,7 @@ Tapdata supports bi-directional data synchronization for MySQL ↔ MySQL, Postgr
## Preparation
-[Connect MySQL databases in two regions separately](../../prerequisites/on-prem-databases/mysql.md).
+[Connect MySQL databases in two regions separately](../../connectors/on-prem-databases/mysql.md).
:::tip
@@ -51,7 +49,7 @@ Follow the instructions in the document to complete the Binlog configuration and
## Operation Steps
-1. [Log in to the TapData platform](../../user-guide/log-in.md).
+1. Log in to the TapData platform.
2. Based on the product type, select the operation entry:
@@ -64,7 +62,7 @@ Follow the instructions in the document to complete the Binlog configuration and
2. On the left side of the page, drag the MySQL data sources created in the preparation work (named Region A and Region B) to the right canvas, and then connect them.
- 3. Click the Region A node, select the table to be synchronized, which is `customer` in this case. For more parameter details (such as advanced settings), see [Creating Data Replication Task](../../user-guide/copy-data/create-task.md).
+ 3. Click the Region A node, select the table to be synchronized, which is `customer` in this case. For more parameter details (such as advanced settings), see [Creating Data Replication Task](../../data-replication/create-task.md).

@@ -123,7 +121,7 @@ Follow the instructions in the document to complete the Binlog configuration and
:::tip
- For Tapdata Enterprise, you can use the [data verification](../../user-guide/verify-data.md) to continuously verify the data of the two bi-directional sync tasks, better meeting your business needs.
+ For Tapdata Enterprise, you can use the [data verification](../../operational-data-hub/fdm-layer/validate-data-quality.md) to continuously verify the data of the two bi-directional sync tasks, better meeting your business needs.
:::
@@ -131,4 +129,4 @@ Follow the instructions in the document to complete the Binlog configuration and
In the task list page, you can start/stop, monitor, edit, copy, reset, and delete tasks.
-For detailed operations, see [Manage Tasks](../../user-guide/copy-data/manage-task.md).
\ No newline at end of file
+For detailed operations, see [Manage Tasks](../../data-transformation/manage-task.md).
\ No newline at end of file
diff --git a/docs/case-practices/pipeline-tutorial/mysql-to-aliyun.md b/docs/case-practices/pipeline-tutorial/mysql-to-aliyun.md
index 48aec16c..334aeddb 100644
--- a/docs/case-practices/pipeline-tutorial/mysql-to-aliyun.md
+++ b/docs/case-practices/pipeline-tutorial/mysql-to-aliyun.md
@@ -1,7 +1,5 @@
# MySQL to Alibaba Cloud Real-Time Sync
-import Content from '../../reuse-content/_enterprise-and-cloud-features.md';
-
As cloud computing evolves and becomes more prevalent, an increasing number of enterprises are looking to migrate their business from on-premises data centers to the cloud to leverage benefits such as lower operational costs and flexible scalability. For businesses with an on-premises MySQL database, migrating to the cloud is a critical step.
@@ -25,7 +23,7 @@ Next, we will introduce the specific operational procedures.
## Preparation
-1. [Connect to your on-prem MySQL database](../../prerequisites/on-prem-databases/mysql.md).
+1. [Connect to your on-prem MySQL database](../../connectors/on-prem-databases/mysql.md).
:::tip
@@ -33,11 +31,11 @@ Next, we will introduce the specific operational procedures.
:::
-2. [Connect to Alibaba Cloud RDS MySQL](../../prerequisites/cloud-databases/aliyun-rds-for-mysql.md).
+2. [Connect to Alibaba Cloud RDS MySQL](../../connectors/cloud-databases/aliyun-rds-for-mysql.md).
## Steps
-1. [Log in to the TapData platform](../../user-guide/log-in.md).
+1. Log in to the TapData platform.
2. Based on the product type, select the operation entry:
@@ -109,4 +107,4 @@ Next, we will introduce the specific operational procedures.
In the task list page, you can also start/stop, monitor, edit, copy, reset, or delete tasks.
-For detailed operations, see [Manage Tasks](../../user-guide/copy-data/manage-task.md).
\ No newline at end of file
+For detailed operations, see [Manage Tasks](../../data-transformation/manage-task.md).
\ No newline at end of file
diff --git a/docs/case-practices/pipeline-tutorial/mysql-to-bigquery.md b/docs/case-practices/pipeline-tutorial/mysql-to-bigquery.md
index e13dcaec..e48896a5 100644
--- a/docs/case-practices/pipeline-tutorial/mysql-to-bigquery.md
+++ b/docs/case-practices/pipeline-tutorial/mysql-to-bigquery.md
@@ -1,8 +1,6 @@
# MySQL to BigQuery Real-Time Sync
-import Content from '../../reuse-content/_all-features.md';
-
[BigQuery](https://cloud.google.com/bigquery/docs?hl=zh-cn) is a fully serverless and cost-effective enterprise data warehouse that operates seamlessly across different cloud platforms and effortlessly scales with your data. It incorporates business intelligence, machine learning, and AI functionalities. TapData, on the other hand, enables real-time synchronization of multiple data sources with BigQuery, facilitating smooth data flow and effectively accommodating changes in data architecture or big data analysis requirements.
@@ -12,14 +10,14 @@ To illustrate this synchronization process, let's consider MySQL as the source d
Before you create a replication task, make sure you have configured the relevant data source:
-1. [Configure MySQL Connection](../../prerequisites/on-prem-databases/mysql.md)
-2. [Configure BigQuery Connection](../../prerequisites/warehouses-and-lake/big-query.md)
+1. [Configure MySQL Connection](../../connectors/on-prem-databases/mysql.md)
+2. [Configure BigQuery Connection](../../connectors/warehouses-and-lake/big-query.md)
-Also note the reference [data type support](../../user-guide/no-supported-data-type.md).
+Also refer to the [data type support](../../faq/no-supported-data-type.md) documentation.
## Configure Task
-1. [Log in to TapData Platform](../../user-guide/log-in.md).
+1. Log in to the TapData platform.
2. Based on the product type, select the operation entry:
@@ -83,7 +81,7 @@ Also note the reference [data type support](../../user-guide/no-supported-data-t
On the Task List page, you can also start, stop, monitor, edit, copy, reset, and delete tasks.
-For more information, See [Management Tasks](../../user-guide/copy-data/manage-task.md).
+For more information, see [Manage Tasks](../../data-transformation/manage-task.md).
diff --git a/docs/case-practices/pipeline-tutorial/mysql-to-clickhouse.md b/docs/case-practices/pipeline-tutorial/mysql-to-clickhouse.md
index 09702928..478f86b9 100644
--- a/docs/case-practices/pipeline-tutorial/mysql-to-clickhouse.md
+++ b/docs/case-practices/pipeline-tutorial/mysql-to-clickhouse.md
@@ -1,8 +1,6 @@
# How to Build a Real-time Data Warehouse by Syncing MySQL to ClickHouse
-import Content from '../../reuse-content/_all-features.md';
-
ClickHouse® is an open-source column-oriented database management system that allows generating analytical data reports in real-time. Its official ClickHouse Cloud offers scalable, real-time analytical processing without the need to manage infrastructure. With storage and computation decoupled, ClickHouse Cloud can auto-scale to accommodate modern workloads, ensuring high-speed query processing.
@@ -25,12 +23,12 @@ Recognizing the demand for data migration, TapData introduced ClickHouse as a sy
Before setting up a data sync pipeline on TapData Cloud, connect your data sources:
-1. [Connect to MySQL](../../prerequisites/on-prem-databases/mysql.md)
-2. [Connect to ClickHouse](../../prerequisites/warehouses-and-lake/clickhouse.md)
+1. [Connect to MySQL](../../connectors/on-prem-databases/mysql.md)
+2. [Connect to ClickHouse](../../connectors/warehouses-and-lake/clickhouse.md)
## Configure Task
-1. [Log in to TapData Platform](../../user-guide/log-in.md).
+1. Log in to the TapData platform.
2. Based on the product type, select the operation entry:
@@ -67,4 +65,4 @@ Before setting up a data sync pipeline on TapData Cloud, connect your data sourc
## See also
-For more advanced features like table merging or building wide tables, you can [create data transformation task](../../user-guide/data-development/create-task.md) on TapData Cloud. Additionally, you can explore the [Real-Time Data Hub](../../user-guide/real-time-data-hub/daas-mode/enable-daas-mode.md), simply drag the source table to generate a data pipeline, which will then automatically start the task. This greatly simplifies the task configuration process.
\ No newline at end of file
+For more advanced features like table merging or building wide tables, you can [create a data transformation task](../../data-transformation/create-views/README.md) on TapData Cloud. Additionally, you can explore the [Real-Time Data Hub](../../operational-data-hub/set-up-odh.md): simply drag the source table to generate a data pipeline, which will then automatically start the task. This greatly simplifies the task configuration process.
\ No newline at end of file
diff --git a/docs/case-practices/pipeline-tutorial/mysql-to-oracle.md b/docs/case-practices/pipeline-tutorial/mysql-to-oracle.md
index 9ca24c7c..f961b798 100644
--- a/docs/case-practices/pipeline-tutorial/mysql-to-oracle.md
+++ b/docs/case-practices/pipeline-tutorial/mysql-to-oracle.md
@@ -1,7 +1,5 @@
# Real-time Heterogeneous Sync from MySQL to Oracle
-import Content from '../../reuse-content/_all-features.md';
-
With the rapid development of modern enterprises, data has become one of the most important assets. In many organizations, to meet a variety of business and technical requirements, various types of databases might be in use. Through a real case of migration from Oracle to MySQL, this article introduces how to achieve real-time synchronization of heterogeneous databases through TapData. This helps to quickly complete data flow between databases of different types, structures, and technologies, building a unified data service platform and preventing data silos.
@@ -24,12 +22,12 @@ Having understood the differences between TapData and traditional solutions, we
Before building the data sync pipeline, we first need to establish a connection to the data source on TapData. The specific steps are as follows:
-1. [Connect to MySQL](../../prerequisites/on-prem-databases/mysql.md)
-2. [Connect to Oracle](../../prerequisites/on-prem-databases/oracle.md)
+1. [Connect to MySQL](../../connectors/on-prem-databases/mysql.md)
+2. [Connect to Oracle](../../connectors/on-prem-databases/oracle.md)
## Configure Task
-1. [Log in to TapData Platform](../../user-guide/log-in.md).
+1. Log in to the TapData platform.
2. Based on the product type, select the operation entry:
@@ -66,4 +64,4 @@ Before building the data sync pipeline, we first need to establish a connection
## See also
-For more advanced features like table merging or building wide tables, you can [create data transformation task](../../user-guide/data-development/create-task.md) on TapData. Additionally, you can explore the [Real-Time Data Hub](../../user-guide/real-time-data-hub/daas-mode/enable-daas-mode.md), simply drag the source table to generate a data pipeline, which will then automatically start the task. This greatly simplifies the task configuration process.
\ No newline at end of file
+For more advanced features like table merging or building wide tables, you can [create a data transformation task](../../data-transformation/create-views/README.md) on TapData. Additionally, you can explore the [Real-Time Data Hub](../../operational-data-hub/set-up-odh.md): simply drag the source table to generate a data pipeline, which will then automatically start the task. This greatly simplifies the task configuration process.
\ No newline at end of file
diff --git a/docs/case-practices/pipeline-tutorial/mysql-to-redis.md b/docs/case-practices/pipeline-tutorial/mysql-to-redis.md
index c0bfd43f..f5c3b0e6 100644
--- a/docs/case-practices/pipeline-tutorial/mysql-to-redis.md
+++ b/docs/case-practices/pipeline-tutorial/mysql-to-redis.md
@@ -1,7 +1,5 @@
# MySQL to Redis Real-Time Sync
-import Content from '../../reuse-content/_all-features.md';
-
Redis is an in-memory key-value database, suitable for scenarios such as data caching, event publishing/subscribing, and high-speed queues. TapData allows you to sync data from relational databases (Oracle, MySQL, MongoDB, PostgreSQL, SQL Server) to Redis in real-time, helping you complete data flows quickly.
@@ -9,7 +7,7 @@ This article explains how to sync data from MySQL to Redis using a data transfor
:::tip
-If you need to sync a table from the source MySQL to Redis at the same time, you can [create a data replication task](../../user-guide/copy-data/create-task.md). The setup process is similar to this article.
+If you need to sync a table from the source MySQL to Redis at the same time, you can [create a data replication task](../../data-replication/create-task.md). The setup process is similar to this article.
:::
@@ -17,12 +15,12 @@ If you need to sync a table from the source MySQL to Redis at the same time, you
Before creating a data transformation task, make sure you have set up the relevant data sources:
-1. [Configure MySQL connection](../../prerequisites/on-prem-databases/mysql.md)
-2. [Configure Redis connection](../../prerequisites/on-prem-databases/redis.md)
+1. [Configure MySQL connection](../../connectors/on-prem-databases/mysql.md)
+2. [Configure Redis connection](../../connectors/on-prem-databases/redis.md)
## Procedure
-1. [Log in to TapData Platform](../../user-guide/log-in.md).
+1. Log in to the TapData platform.
2. Based on the product type, select the operation entry:
@@ -141,4 +139,4 @@ Then in Redis, we query the corresponding data:
On the task list page, you can start/stop, monitor, edit, copy, reset, delete, etc. tasks.
-For detailed operations, refer to [Manage Tasks](../../user-guide/data-development/monitor-task.md).
+For detailed operations, refer to [Manage Tasks](../../data-transformation/manage-task.md).
diff --git a/docs/case-practices/pipeline-tutorial/oracle-to-kafka.md b/docs/case-practices/pipeline-tutorial/oracle-to-kafka.md
index d02fe3ad..41c51a99 100644
--- a/docs/case-practices/pipeline-tutorial/oracle-to-kafka.md
+++ b/docs/case-practices/pipeline-tutorial/oracle-to-kafka.md
@@ -1,7 +1,5 @@
# Real-Time Oracle to Kafka Synchronization
-import Content from '../../reuse-content/_all-features.md';
-
In the era of big data, more and more enterprises need to synchronize data from traditional relational databases to big data processing platforms to support real-time data processing, data lake construction, and alternative data warehousing scenarios. Oracle, widely used in enterprise applications, increasingly requires synchronization to big data platforms.
@@ -34,12 +32,12 @@ In this case, we aim to read real-time data from the car insurance claims table
Before creating a data transformation task, ensure you have configured the necessary data sources:
-1. [Configure Oracle Connection](../../prerequisites/on-prem-databases/oracle.md)
-2. [Configure Kafka Connection](../../prerequisites/mq-and-middleware/kafka.md)
+1. [Configure Oracle Connection](../../connectors/on-prem-databases/oracle.md)
+2. [Configure Kafka Connection](../../connectors/mq-and-middleware/kafka.md)
## Configure Task
-1. [Log in to TapData Platform](../../user-guide/log-in.md).
+1. Log in to the TapData platform.
2. Based on the product type, select the operation entry:
@@ -110,7 +108,7 @@ Before creating a data transformation task, ensure you have configured the neces
:::tip
- For more information on how to use the JS node and supported functions, see [Process Node](../../user-guide/data-development/process-node.md).
+ For more information on how to use the JS node and supported functions, see [Process Node](../../data-transformation/process-node.md).
:::
@@ -261,4 +259,4 @@ Processed a total of 1 messages
On the task list page, you can also start/stop, monitor, edit, copy, reset, delete, and perform other operations on the task.
-For more information, see [Managing Tasks](../../user-guide/data-development/monitor-task.md).
+For more information, see [Managing Tasks](../../data-transformation/manage-task.md).
diff --git a/docs/case-practices/pipeline-tutorial/oracle-to-tablestore.md b/docs/case-practices/pipeline-tutorial/oracle-to-tablestore.md
index 338dd65d..f0b409b6 100644
--- a/docs/case-practices/pipeline-tutorial/oracle-to-tablestore.md
+++ b/docs/case-practices/pipeline-tutorial/oracle-to-tablestore.md
@@ -1,7 +1,5 @@
# Oracle to Tablestore Real-Time Sync
-import Content from '../../reuse-content/_all-features.md';
-
[Alibaba Cloud Tablestore](https://www.alibabacloud.com/help/en/tablestore) is a serverless table storage service designed for handling large volumes of structured data. It also provides a comprehensive solution for IoT scenarios, offering optimized data storage capabilities. TapData enables real-time synchronization of Oracle data to Tablestore, providing seamless data flow and facilitating easy adaptation to data architecture changes and big data analysis scenarios.
@@ -9,14 +7,14 @@ import Content from '../../reuse-content/_all-features.md';
Before you create a replication task, make sure you have configured the relevant data source:
-1. [Configure Oracle Connection](../../prerequisites/on-prem-databases/oracle.md)
-2. [Configure Tablestore Connection](../../prerequisites/warehouses-and-lake/tablestore.md)
+1. [Configure Oracle Connection](../../connectors/on-prem-databases/oracle.md)
+2. [Configure Tablestore Connection](../../connectors/warehouses-and-lake/tablestore.md)
-Also note the reference [data type support](../../user-guide/no-supported-data-type.md).
+Also refer to the [data type support](../../faq/no-supported-data-type.md) documentation.
## Configure Task
-1. [Log in to TapData Platform](../../user-guide/log-in.md).
+1. Log in to the TapData platform.
2. Based on the product type, select the operation entry:
@@ -33,7 +31,7 @@ Also note the reference [data type support](../../user-guide/no-supported-data-t
:::tip
- Since the number of columns in a single table in Tablestore cannot exceed 32, if the number of columns in the Oracle tables to be synchronized exceeds 32, you can address this limitation by adding a **Field Edit** node between the Oracle and Tablestore data sources. This node allows you to handle the situation and selectively exclude business-independent columns from the synchronization process. For more information, see [Processing Node](../../user-guide/data-development/process-node.md).
+ Since the number of columns in a single table in Tablestore cannot exceed 32, if the number of columns in the Oracle tables to be synchronized exceeds 32, you can address this limitation by adding a **Field Edit** node between the Oracle and Tablestore data sources. This node allows you to handle the situation and selectively exclude business-independent columns from the synchronization process. For more information, see [Processing Node](../../data-transformation/process-node.md).
:::
@@ -47,4 +45,4 @@ Also note the reference [data type support](../../user-guide/no-supported-data-t
On the Task List page, you can also start, stop, monitor, edit, copy, reset, and delete tasks.
-For more information, See [Management Tasks](../../user-guide/copy-data/manage-task.md).
\ No newline at end of file
+For more information, see [Manage Tasks](../../data-transformation/manage-task.md).
\ No newline at end of file
diff --git a/docs/case-practices/pipeline-tutorial/sql-server-to-bigquery.md b/docs/case-practices/pipeline-tutorial/sql-server-to-bigquery.md
index 9f99bdf6..7cb5789f 100644
--- a/docs/case-practices/pipeline-tutorial/sql-server-to-bigquery.md
+++ b/docs/case-practices/pipeline-tutorial/sql-server-to-bigquery.md
@@ -1,7 +1,5 @@
# SQL Server to BigQuery Real-Time Sync
-import Content from '../../reuse-content/_all-features.md';
-
In today's age of rapidly expanding data, companies are increasingly turning to [BigQuery](https://cloud.google.com/bigquery/docs) in order to extract valuable insights and further modernize their data analysis strategies. Through BigQuery, they aim to run large-scale critical business applications, optimizing operations, enhancing customer experience, and reducing overall costs.
@@ -31,16 +29,16 @@ To fully tap into these advantages, the initial step is to ensure effective sync
Before you create a replication task, make sure you have configured the relevant data source:
-1. [Configure SQL Server Connection](../../prerequisites/on-prem-databases/sqlserver.md)
-2. [Configure BigQuery Connection](../../prerequisites/warehouses-and-lake/big-query.md)
+1. [Configure SQL Server Connection](../../connectors/on-prem-databases/sqlserver.md)
+2. [Configure BigQuery Connection](../../connectors/warehouses-and-lake/big-query.md)
-Also note the reference [data type support](../../user-guide/no-supported-data-type.md).
+Also refer to the [data type support](../../faq/no-supported-data-type.md) documentation.
## Configure Task
-1. [Log in to TapData Platform](../../user-guide/log-in.md).
+1. Log in to the TapData platform.
2. Based on the product type, select the operation entry:
diff --git a/docs/connectors/README.md b/docs/connectors/README.md
new file mode 100644
index 00000000..daf0596c
--- /dev/null
+++ b/docs/connectors/README.md
@@ -0,0 +1,9 @@
+# Connect to Data Sources
+
+
+
+TapData supports nearly a hundred [diverse data sources](supported-data-sources.md), including commercial databases, open-source databases, cloud databases, data warehouses, data lakes, message queues, SaaS platforms, files, and custom data sources.
+
+import DocCardList from '@theme/DocCardList';
+
+
diff --git a/docs/connectors/allow-access-network.md b/docs/connectors/allow-access-network.md
new file mode 100644
index 00000000..c82d6419
--- /dev/null
+++ b/docs/connectors/allow-access-network.md
@@ -0,0 +1,14 @@
+# Configure Network Access
+
+Before deploying the Agent, adjust the relevant firewall rules according to the requirements in this document to ensure that the Agent can communicate properly. The Agent's workflow is shown below:
+
+
+
+
+
+| Requirements | Description |
+| ---------------------------------- | ------------------------------------------------------------ |
+| Agent can connect to the source database's port. | Ensure that the Agent can read data from the source database. |
+| Agent can connect to the target database's port. | Ensure that the Agent can write data to the target database. |
+| Agent can connect to the extranet. | Ensure that the Agent can report task status to, and retrieve configurations and tasks from, TapData Cloud. |
+
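The connectivity requirements above can be checked before deployment with a short script such as the following sketch. The host names and ports here are hypothetical placeholders; substitute your actual source, target, and TapData Cloud addresses.

```python
import socket

def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Hypothetical endpoints -- replace with your real source/target/cloud addresses.
checks = {
    "source database": ("source-db.internal", 3306),
    "target database": ("target-db.internal", 5432),
    "TapData Cloud": ("cloud.example.com", 443),
}

for name, (host, port) in checks.items():
    status = "OK" if can_connect(host, port) else "BLOCKED"
    print(f"{name} ({host}:{port}): {status}")
```

If any check reports `BLOCKED`, review the firewall or security-group rules between the Agent host and that endpoint.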
diff --git a/docs/prerequisites/cloud-databases/README.md b/docs/connectors/cloud-databases/README.md
similarity index 59%
rename from docs/prerequisites/cloud-databases/README.md
rename to docs/connectors/cloud-databases/README.md
index efd5d305..ab269986 100644
--- a/docs/prerequisites/cloud-databases/README.md
+++ b/docs/connectors/cloud-databases/README.md
@@ -1,9 +1,5 @@
# Cloud Databases
-import Content from '../../reuse-content/_enterprise-and-cloud-features.md';
-
-
-
Please select the database you would like to add:
import DocCardList from '@theme/DocCardList';
diff --git a/docs/prerequisites/cloud-databases/aliyun-adb-mysql.md b/docs/connectors/cloud-databases/aliyun-adb-mysql.md
similarity index 96%
rename from docs/prerequisites/cloud-databases/aliyun-adb-mysql.md
rename to docs/connectors/cloud-databases/aliyun-adb-mysql.md
index 0a9ff87e..f715bdee 100644
--- a/docs/prerequisites/cloud-databases/aliyun-adb-mysql.md
+++ b/docs/connectors/cloud-databases/aliyun-adb-mysql.md
@@ -1,8 +1,6 @@
# Aliyun ADB MySQL
-import Content from '../../reuse-content/_all-features.md';
-
Please follow the instructions below to successfully add and use Aliyun ADB MySQL database in TapData Cloud.
diff --git a/docs/prerequisites/cloud-databases/aliyun-adb-postgresql.md b/docs/connectors/cloud-databases/aliyun-adb-postgresql.md
similarity index 88%
rename from docs/prerequisites/cloud-databases/aliyun-adb-postgresql.md
rename to docs/connectors/cloud-databases/aliyun-adb-postgresql.md
index d1e22975..4b6faa90 100644
--- a/docs/prerequisites/cloud-databases/aliyun-adb-postgresql.md
+++ b/docs/connectors/cloud-databases/aliyun-adb-postgresql.md
@@ -1,8 +1,6 @@
# Aliyun ADB PostgreSQL
-import Content from '../../reuse-content/_all-features.md';
-
Please follow the instructions below to successfully add and use Aliyun ADB PostgreSQL database in TapData.
diff --git a/docs/prerequisites/cloud-databases/aliyun-mongodb.md b/docs/connectors/cloud-databases/aliyun-mongodb.md
similarity index 98%
rename from docs/prerequisites/cloud-databases/aliyun-mongodb.md
rename to docs/connectors/cloud-databases/aliyun-mongodb.md
index f3504053..223d50d1 100644
--- a/docs/prerequisites/cloud-databases/aliyun-mongodb.md
+++ b/docs/connectors/cloud-databases/aliyun-mongodb.md
@@ -1,8 +1,6 @@
# Aliyun MongoDB
-import Content from '../../reuse-content/_all-features.md';
-
Please follow the instructions below to ensure that the Aliyun MongoDB database is successfully added and used in TapData Cloud.
diff --git a/docs/prerequisites/cloud-databases/aliyun-rds-for-mariadb.md b/docs/connectors/cloud-databases/aliyun-rds-for-mariadb.md
similarity index 94%
rename from docs/prerequisites/cloud-databases/aliyun-rds-for-mariadb.md
rename to docs/connectors/cloud-databases/aliyun-rds-for-mariadb.md
index 23b19e19..9e5e8a38 100644
--- a/docs/prerequisites/cloud-databases/aliyun-rds-for-mariadb.md
+++ b/docs/connectors/cloud-databases/aliyun-rds-for-mariadb.md
@@ -1,8 +1,6 @@
# Aliyun RDS for MariaDB
-import Content from '../../reuse-content/_all-features.md';
-
Please follow the instructions below to successfully add and use Aliyun RDS for MariaDB database in TapData Cloud.
diff --git a/docs/prerequisites/cloud-databases/aliyun-rds-for-mongodb.md b/docs/connectors/cloud-databases/aliyun-rds-for-mongodb.md
similarity index 98%
rename from docs/prerequisites/cloud-databases/aliyun-rds-for-mongodb.md
rename to docs/connectors/cloud-databases/aliyun-rds-for-mongodb.md
index 77dddc6e..12d4b96b 100644
--- a/docs/prerequisites/cloud-databases/aliyun-rds-for-mongodb.md
+++ b/docs/connectors/cloud-databases/aliyun-rds-for-mongodb.md
@@ -1,8 +1,6 @@
# Aliyun RDS for MongoDB
-import Content from '../../reuse-content/_all-features.md';
-
Please follow the instructions below to ensure that the MongoDB database is successfully added and used in TapData.
diff --git a/docs/prerequisites/cloud-databases/aliyun-rds-for-mysql.md b/docs/connectors/cloud-databases/aliyun-rds-for-mysql.md
similarity index 98%
rename from docs/prerequisites/cloud-databases/aliyun-rds-for-mysql.md
rename to docs/connectors/cloud-databases/aliyun-rds-for-mysql.md
index f1438429..df55d221 100644
--- a/docs/prerequisites/cloud-databases/aliyun-rds-for-mysql.md
+++ b/docs/connectors/cloud-databases/aliyun-rds-for-mysql.md
@@ -1,8 +1,6 @@
# Aliyun RDS MySQL
-import Content from '../../reuse-content/_all-features.md';
-
ApsaraDB RDS MySQL is a relational database service with high availability, scalability, security and reliability provided by Alibaba Cloud.
diff --git a/docs/prerequisites/cloud-databases/aliyun-rds-for-pg.md b/docs/connectors/cloud-databases/aliyun-rds-for-pg.md
similarity index 95%
rename from docs/prerequisites/cloud-databases/aliyun-rds-for-pg.md
rename to docs/connectors/cloud-databases/aliyun-rds-for-pg.md
index ae133030..a6494c38 100644
--- a/docs/prerequisites/cloud-databases/aliyun-rds-for-pg.md
+++ b/docs/connectors/cloud-databases/aliyun-rds-for-pg.md
@@ -1,8 +1,6 @@
# Aliyun RDS for PostgreSQL
-import Content from '../../reuse-content/_all-features.md';
-
Follow the instructions below to successfully add and use PostgreSQL database in TapData Cloud.
diff --git a/docs/prerequisites/cloud-databases/aliyun-rds-for-sql-server.md b/docs/connectors/cloud-databases/aliyun-rds-for-sql-server.md
similarity index 94%
rename from docs/prerequisites/cloud-databases/aliyun-rds-for-sql-server.md
rename to docs/connectors/cloud-databases/aliyun-rds-for-sql-server.md
index dc77426c..0062f545 100644
--- a/docs/prerequisites/cloud-databases/aliyun-rds-for-sql-server.md
+++ b/docs/connectors/cloud-databases/aliyun-rds-for-sql-server.md
@@ -1,8 +1,6 @@
# Aliyun RDS for SQL Server
-import Content from '../../reuse-content/_enterprise-and-cloud-features.md';
-
Aliyun RDS for SQL Server is an on-demand database hosting service for SQL Server with automated monitoring, backup and disaster recovery capabilities.
@@ -14,7 +12,7 @@ SQL Server 2005, 2008, 2008 R2, 2012, 2014, 2016, and 2017.
## Connect to Aliyun RDS for SQL Server
-1. [Log in to TapData Platform](../../user-guide/log-in.md).
+1. Log in to the TapData platform.
2. In the left navigation panel, click **Connections**.
diff --git a/docs/prerequisites/cloud-databases/amazon-rds-mysql.md b/docs/connectors/cloud-databases/amazon-rds-mysql.md
similarity index 97%
rename from docs/prerequisites/cloud-databases/amazon-rds-mysql.md
rename to docs/connectors/cloud-databases/amazon-rds-mysql.md
index 665b6262..65c2de1b 100644
--- a/docs/prerequisites/cloud-databases/amazon-rds-mysql.md
+++ b/docs/connectors/cloud-databases/amazon-rds-mysql.md
@@ -1,8 +1,6 @@
# Amazon RDS for MySQL
-import Content from '../../reuse-content/_all-features.md';
-
Follow the instructions below to successfully add and use Amazon RDS for MySQL database in TapData Cloud.
diff --git a/docs/prerequisites/cloud-databases/azure-cosmos-db.md b/docs/connectors/cloud-databases/azure-cosmos-db.md
similarity index 95%
rename from docs/prerequisites/cloud-databases/azure-cosmos-db.md
rename to docs/connectors/cloud-databases/azure-cosmos-db.md
index 7f7750ed..a48957f8 100644
--- a/docs/prerequisites/cloud-databases/azure-cosmos-db.md
+++ b/docs/connectors/cloud-databases/azure-cosmos-db.md
@@ -1,8 +1,6 @@
# Azure Cosmos DB
-import Content from '../../reuse-content/_enterprise-and-cloud-features.md';
-
Azure Cosmos DB is a fully managed NoSQL and relational database for modern app development. This article explains how to connect to Azure Cosmos DB on the TapData platform, helping you quickly put cloud data into circulation.
@@ -30,7 +28,7 @@ Before connecting the data source, you need to log in to the Azure console to ob
## Steps
-1. [Log in to TapData Platform](../../user-guide/log-in.md).
+1. Log in to the TapData platform.
2. In the left navigation panel, click **Connections**.
diff --git a/docs/prerequisites/cloud-databases/huawei-cloud-gaussdb.md b/docs/connectors/cloud-databases/huawei-cloud-gaussdb.md
similarity index 94%
rename from docs/prerequisites/cloud-databases/huawei-cloud-gaussdb.md
rename to docs/connectors/cloud-databases/huawei-cloud-gaussdb.md
index 408641c9..054645d9 100644
--- a/docs/prerequisites/cloud-databases/huawei-cloud-gaussdb.md
+++ b/docs/connectors/cloud-databases/huawei-cloud-gaussdb.md
@@ -1,8 +1,6 @@
# Huawei Cloud GaussDB
-import Content from '../../reuse-content/_all-features.md';
-
GaussDB is a distributed relational database independently developed by Huawei, supporting distributed transactions, cross-AZ deployment, and zero data loss. It offers scalability of over 1000 nodes, PB-level massive storage, providing enterprises with a comprehensive, stable, reliable, scalable, and high-performance enterprise-grade database service. TapData supports using GaussDB as a source or target database, helping you quickly build data flow pipelines. Next, we will introduce how to connect GaussDB data sources in the TapData platform.
@@ -74,7 +72,7 @@ To achieve incremental data reading, TapData requires Huawei Cloud GaussDB's [lo
## Connect to GaussDB
-1. [Log in to the TapData platform](../../user-guide/log-in.md).
+1. Log in to the TapData platform.
2. In the left navigation bar, click **Connection Management**.
@@ -102,7 +100,7 @@ To achieve incremental data reading, TapData requires Huawei Cloud GaussDB's [lo
* **Advanced Settings**
- * **CDC Log Caching**: [Mining the source database's](../../user-guide/advanced-settings/share-mining.md) incremental logs, this feature allows multiple tasks to share incremental logs from the source database, avoiding redundant reads and thus significantly reducing the load on the source database during incremental synchronization. Upon enabling this feature, an external storage should be selected to store the incremental log.
+ * **CDC Log Caching**: [Mining the source database's](../../operational-data-hub/advanced/share-mining.md) incremental logs, this feature allows multiple tasks to share incremental logs from the source database, avoiding redundant reads and thus significantly reducing the load on the source database during incremental synchronization. Upon enabling this feature, an external storage should be selected to store the incremental log.
* **Contain table**: The default option is **All**, which includes all tables. Alternatively, you can select **Custom** and manually specify the desired tables by separating their names with commas (,).
diff --git a/docs/prerequisites/cloud-databases/mongodb-atlas.md b/docs/connectors/cloud-databases/mongodb-atlas.md
similarity index 98%
rename from docs/prerequisites/cloud-databases/mongodb-atlas.md
rename to docs/connectors/cloud-databases/mongodb-atlas.md
index e4785d64..cd6b4ae6 100644
--- a/docs/prerequisites/cloud-databases/mongodb-atlas.md
+++ b/docs/connectors/cloud-databases/mongodb-atlas.md
@@ -10,9 +10,7 @@ keywords:
# MongoDB Atlas
-import Content from '../../reuse-content/_all-features.md';
-
[TapData](https://tapdata.io/) supports [MongoDB Atlas](https://www.mongodb.com/atlas) as a data source, enabling real-time data sync, incremental replication, and seamless cloud-to-local data integration.
@@ -78,7 +76,7 @@ Before establishing the connection, it is essential to complete the necessary pr
## Connect to MongoDB Atlas
-1. [Log in to TapData Platform](../../user-guide/log-in.md).
+1. Log in to the TapData platform.
2. In the left navigation panel, click **Connections**.
diff --git a/docs/prerequisites/cloud-databases/polardb-mysql.md b/docs/connectors/cloud-databases/polardb-mysql.md
similarity index 97%
rename from docs/prerequisites/cloud-databases/polardb-mysql.md
rename to docs/connectors/cloud-databases/polardb-mysql.md
index dde3c2e8..e67b3a93 100644
--- a/docs/prerequisites/cloud-databases/polardb-mysql.md
+++ b/docs/connectors/cloud-databases/polardb-mysql.md
@@ -1,8 +1,6 @@
# PolarDB MySQL
-import Content from '../../reuse-content/_all-features.md';
-
Follow the instructions below to successfully add and use PolarDB MySQL database in TapData Cloud.
diff --git a/docs/prerequisites/cloud-databases/polardb-postgresql.md b/docs/connectors/cloud-databases/polardb-postgresql.md
similarity index 90%
rename from docs/prerequisites/cloud-databases/polardb-postgresql.md
rename to docs/connectors/cloud-databases/polardb-postgresql.md
index f6b0101e..839ad3b0 100644
--- a/docs/prerequisites/cloud-databases/polardb-postgresql.md
+++ b/docs/connectors/cloud-databases/polardb-postgresql.md
@@ -1,8 +1,6 @@
# PolarDB PostgreSQL
-import Content from '../../reuse-content/_all-features.md';
-
Please follow the instructions below to successfully add and use PolarDB PostgreSQL databases in TapData Cloud.
diff --git a/docs/prerequisites/cloud-databases/tencentdb-for-mariadb.md b/docs/connectors/cloud-databases/tencentdb-for-mariadb.md
similarity index 95%
rename from docs/prerequisites/cloud-databases/tencentdb-for-mariadb.md
rename to docs/connectors/cloud-databases/tencentdb-for-mariadb.md
index 1c81bbc9..f5b8af3f 100644
--- a/docs/prerequisites/cloud-databases/tencentdb-for-mariadb.md
+++ b/docs/connectors/cloud-databases/tencentdb-for-mariadb.md
@@ -1,8 +1,6 @@
# TencentDB MariaDB
-import Content from '../../reuse-content/_all-features.md';
-
Please follow the instructions below to successfully add and use MariaDB databases in TapData Cloud.
diff --git a/docs/prerequisites/cloud-databases/tencentdb-for-mongodb.md b/docs/connectors/cloud-databases/tencentdb-for-mongodb.md
similarity index 98%
rename from docs/prerequisites/cloud-databases/tencentdb-for-mongodb.md
rename to docs/connectors/cloud-databases/tencentdb-for-mongodb.md
index 6a758401..d899a226 100644
--- a/docs/prerequisites/cloud-databases/tencentdb-for-mongodb.md
+++ b/docs/connectors/cloud-databases/tencentdb-for-mongodb.md
@@ -1,8 +1,6 @@
# TencentDB for MongoDB
-import Content from '../../reuse-content/_all-features.md';
-
## Supported versions
diff --git a/docs/prerequisites/cloud-databases/tencentdb-for-mysql.md b/docs/connectors/cloud-databases/tencentdb-for-mysql.md
similarity index 97%
rename from docs/prerequisites/cloud-databases/tencentdb-for-mysql.md
rename to docs/connectors/cloud-databases/tencentdb-for-mysql.md
index 48aed34c..ef152dd3 100644
--- a/docs/prerequisites/cloud-databases/tencentdb-for-mysql.md
+++ b/docs/connectors/cloud-databases/tencentdb-for-mysql.md
@@ -1,8 +1,6 @@
# TencentDB TD-SQL
-import Content from '../../reuse-content/_all-features.md';
-
Please follow the instructions below to ensure the distributed TD-SQL database is successfully added and used in TapData.
diff --git a/docs/prerequisites/cloud-databases/tencentdb-for-pg.md b/docs/connectors/cloud-databases/tencentdb-for-pg.md
similarity index 95%
rename from docs/prerequisites/cloud-databases/tencentdb-for-pg.md
rename to docs/connectors/cloud-databases/tencentdb-for-pg.md
index a040f34e..b535b49d 100644
--- a/docs/prerequisites/cloud-databases/tencentdb-for-pg.md
+++ b/docs/connectors/cloud-databases/tencentdb-for-pg.md
@@ -1,8 +1,6 @@
# TencentDB for PG
-import Content from '../../reuse-content/_all-features.md';
-
## Supported Versions
diff --git a/docs/prerequisites/cloud-databases/tencentdb-for-sql-server.md b/docs/connectors/cloud-databases/tencentdb-for-sql-server.md
similarity index 94%
rename from docs/prerequisites/cloud-databases/tencentdb-for-sql-server.md
rename to docs/connectors/cloud-databases/tencentdb-for-sql-server.md
index 5dc3098d..dc70b472 100644
--- a/docs/prerequisites/cloud-databases/tencentdb-for-sql-server.md
+++ b/docs/connectors/cloud-databases/tencentdb-for-sql-server.md
@@ -1,8 +1,6 @@
# TencentDB for SQL Server
-import Content from '../../reuse-content/_enterprise-and-cloud-features.md';
-
Tencent Cloud TencentDB for SQL Server is one of the most popular commercial databases in the industry, providing perfect compatibility with Windows-based applications.
@@ -15,7 +13,7 @@ SQL Server 2005, 2008, 2008 R2, 2012, 2014, 2016, and 2017.
## Connect to TencentDB for SQL Server
-1. [Log in to TapData Platform](../../user-guide/log-in.md).
+1. Log in to the TapData platform.
2. In the left navigation panel, click **Connections**.
diff --git a/docs/prerequisites/crm-and-sales-analytics/README.md b/docs/connectors/crm-and-sales-analytics/README.md
similarity index 61%
rename from docs/prerequisites/crm-and-sales-analytics/README.md
rename to docs/connectors/crm-and-sales-analytics/README.md
index 5ba60519..99ea88ba 100644
--- a/docs/prerequisites/crm-and-sales-analytics/README.md
+++ b/docs/connectors/crm-and-sales-analytics/README.md
@@ -1,8 +1,6 @@
# CRM and Sales Analytics
-import Content from '../../reuse-content/_enterprise-and-cloud-features.md';
-
Please select the database you would like to add:
diff --git a/docs/prerequisites/crm-and-sales-analytics/hubspot.md b/docs/connectors/crm-and-sales-analytics/hubspot.md
similarity index 88%
rename from docs/prerequisites/crm-and-sales-analytics/hubspot.md
rename to docs/connectors/crm-and-sales-analytics/hubspot.md
index c81e35c3..3169b7c9 100644
--- a/docs/prerequisites/crm-and-sales-analytics/hubspot.md
+++ b/docs/connectors/crm-and-sales-analytics/hubspot.md
@@ -1,13 +1,11 @@
# HubSpot
-import Content from '../../reuse-content/_enterprise-and-cloud-features.md';
-
HubSpot's CRM platform contains the marketing, sales, service, operations, and website-building software you need to grow your business. TapData Cloud supports building data pipelines with HubSpot as the source database, allowing you to read HubSpot operational data and sync it to a specified data source. This document explains how to add a HubSpot data source in TapData Cloud.
## Connect HubSpot
-1. [Log in to TapData Platform](../../user-guide/log-in.md).
+1. Log in to the TapData platform.
2. In the left navigation panel, click **Connections**.
@@ -31,7 +29,7 @@ HubSpot's CRM platform contains the marketing, sales, service, operations, and w
After completing the operation, the page will automatically return to the data source configuration page, displaying **Successfully Authorized**.
-7. Click **Connection Test**. After passing the test, click **Save**.
+7. Click **Test**. After passing the test, click **Save**.
:::tip
diff --git a/docs/connectors/crm-and-sales-analytics/metabase.md b/docs/connectors/crm-and-sales-analytics/metabase.md
new file mode 100644
index 00000000..36662083
--- /dev/null
+++ b/docs/connectors/crm-and-sales-analytics/metabase.md
@@ -0,0 +1,39 @@
+# Metabase
+
+Metabase is an open-source business intelligence tool that helps you analyze and visualize data to drive business decisions. TapData supports using Metabase as a **source** to build data pipelines. This document explains how to add Metabase as a data source in TapData.
+
+## Prerequisites
+
+You need a Metabase instance, either registered on Metabase Cloud or deployed locally.
+
+## Connect to Metabase
+
+1. Log in to the TapData platform.
+
+2. In the left navigation menu, click **Connections**.
+
+3. On the right side of the page, click **Create**.
+
+4. In the dialog box, search for and select **Metabase**.
+
+5. Complete the data source configuration as described below:
+
+ 
+
+ * **Connection Settings**
+ * **Name**: Enter a unique, descriptive name relevant to your business.
+ * **Type**: Only supports Metabase as a **source**.
+ * **Username**: Enter the Metabase login account, usually an email address.
+ * **Password**: Enter the password for the Metabase account.
+ * **HTTP Host**: The Metabase connection URL, including port number (e.g., `http://192.168.1.18:3000`).
+ * **Advanced Settings**
+ * **Agent Settings**: Defaults to **Platform Automatic Allocation**; you can also manually specify an agent.
+ * **Model Load Time**: If the data source contains fewer than 10,000 models, their information is refreshed every hour; if it exceeds 10,000 models, the refresh takes place daily at the time you specify.
+
+6. Click **Test**. After passing the test, click **Save**.
+
+ :::tip
+
+ If the connection test fails, please follow the prompts on the page to fix the issue.
+
+ :::
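Under the hood, the connection test presumably authenticates against Metabase's session API with the username and password configured above. The sketch below shows a minimal login against Metabase's documented `POST /api/session` endpoint; the host and credentials are illustrative placeholders.

```python
import json
from urllib import request

def session_url(http_host: str) -> str:
    """Build the Metabase session endpoint from the configured HTTP Host."""
    return http_host.rstrip("/") + "/api/session"

def metabase_login(http_host: str, username: str, password: str) -> str:
    """Authenticate and return the Metabase session token."""
    body = json.dumps({"username": username, "password": password}).encode()
    req = request.Request(
        session_url(http_host),
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        return json.load(resp)["id"]

# Illustrative usage (placeholder host and credentials):
# token = metabase_login("http://192.168.1.18:3000", "user@example.com", "secret")
```

A failed login here (HTTP 401) would correspond to the connection test failing on the configuration page.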
diff --git a/docs/prerequisites/crm-and-sales-analytics/salesforce.md b/docs/connectors/crm-and-sales-analytics/salesforce.md
similarity index 92%
rename from docs/prerequisites/crm-and-sales-analytics/salesforce.md
rename to docs/connectors/crm-and-sales-analytics/salesforce.md
index b04cc6ec..343eec74 100644
--- a/docs/prerequisites/crm-and-sales-analytics/salesforce.md
+++ b/docs/connectors/crm-and-sales-analytics/salesforce.md
@@ -1,14 +1,12 @@
# Salesforce
-import Content from '../../reuse-content/_enterprise-and-cloud-features.md';
-
Salesforce is a massive infrastructure of customer relationship management software products that help marketing, sales, commerce, service, and IT teams connect with their customers. TapData Cloud supports building data pipelines with Salesforce as a source database, and this article describes how to add Salesforce data sources to TapData Cloud.
## Connect to Salesforce
-1. [Log in to TapData Platform](../../user-guide/log-in.md).
+1. Log in to the TapData platform.
2. In the left navigation panel, click **Connections**.
diff --git a/docs/prerequisites/crm-and-sales-analytics/zoho-crm.md b/docs/connectors/crm-and-sales-analytics/zoho-crm.md
similarity index 93%
rename from docs/prerequisites/crm-and-sales-analytics/zoho-crm.md
rename to docs/connectors/crm-and-sales-analytics/zoho-crm.md
index 41cf823f..e1cd361f 100644
--- a/docs/prerequisites/crm-and-sales-analytics/zoho-crm.md
+++ b/docs/connectors/crm-and-sales-analytics/zoho-crm.md
@@ -1,8 +1,6 @@
# Zoho-CRM
-import Content from '../../reuse-content/_enterprise-and-cloud-features.md';
-
Zoho CRM acts as a single repository that brings your sales, marketing, and customer support activities together, streamlining your processes, policies, and people on one platform. TapData Cloud supports using Zoho CRM as a data source to build data pipelines, helping you read CRM data and synchronize it to a specified target, quickly opening up a data flow channel. This article explains how to add a Zoho CRM data source to TapData Cloud.
@@ -21,7 +19,7 @@ TapData Cloud allows you to read data in Zoho CRM as a table and synchronize it
## Connect to Zoho CRM
-1. [Log in to TapData Platform](../../user-guide/log-in.md).
+1. Log in to the TapData platform.
2. In the left navigation panel, click **Connections**.
diff --git a/docs/prerequisites/e-commerce/README.md b/docs/connectors/e-commerce/README.md
similarity index 59%
rename from docs/prerequisites/e-commerce/README.md
rename to docs/connectors/e-commerce/README.md
index a647a9b7..4cedb68e 100644
--- a/docs/prerequisites/e-commerce/README.md
+++ b/docs/connectors/e-commerce/README.md
@@ -1,8 +1,6 @@
# E-Commerce
-import Content from '../../reuse-content/_enterprise-and-cloud-features.md';
-
Please select the database you would like to add:
diff --git a/docs/prerequisites/e-commerce/alibaba-1688.md b/docs/connectors/e-commerce/alibaba-1688.md
similarity index 88%
rename from docs/prerequisites/e-commerce/alibaba-1688.md
rename to docs/connectors/e-commerce/alibaba-1688.md
index 787be8ca..4482f486 100644
--- a/docs/prerequisites/e-commerce/alibaba-1688.md
+++ b/docs/connectors/e-commerce/alibaba-1688.md
@@ -1,14 +1,12 @@
# Alibaba 1688
-import Content from '../../reuse-content/_enterprise-and-cloud-features.md';
-
This article describes how to connect to Alibaba 1688 data sources on TapData Cloud.
## Connect to Alibaba 1688
-1. [Log in to TapData Platform](../../user-guide/log-in.md).
+1. Log in to the TapData platform.
2. In the left navigation panel, click **Connections**.
diff --git a/docs/prerequisites/e-commerce/shein.md b/docs/connectors/e-commerce/shein.md
similarity index 92%
rename from docs/prerequisites/e-commerce/shein.md
rename to docs/connectors/e-commerce/shein.md
index 86f3a0a0..9f57e969 100644
--- a/docs/prerequisites/e-commerce/shein.md
+++ b/docs/connectors/e-commerce/shein.md
@@ -1,8 +1,6 @@
# Shein
-import Content from '../../reuse-content/_enterprise-and-cloud-features.md';
-
This article describes how to connect to Shein data sources on TapData Cloud.
diff --git a/docs/prerequisites/files/README.md b/docs/connectors/files/README.md
similarity index 70%
rename from docs/prerequisites/files/README.md
rename to docs/connectors/files/README.md
index ec3e11aa..3bc40d8d 100644
--- a/docs/prerequisites/files/README.md
+++ b/docs/connectors/files/README.md
@@ -1,8 +1,6 @@
# Files
-import Content from '../../reuse-content/_enterprise-and-cloud-features.md';
-
TapData supports reading data from files and synchronizing it to databases. Please select the file data source you would like to add:
diff --git a/docs/prerequisites/files/csv.md b/docs/connectors/files/csv.md
similarity index 97%
rename from docs/prerequisites/files/csv.md
rename to docs/connectors/files/csv.md
index d859df42..56058af4 100644
--- a/docs/prerequisites/files/csv.md
+++ b/docs/connectors/files/csv.md
@@ -1,8 +1,6 @@
# CSV
-import Content from '../../reuse-content/_all-features.md';
-
A comma-separated values (CSV) file is a delimited text file that uses a comma to separate values. Each line of the file is a data record, and each record consists of one or more fields separated by a delimiter (such as a comma, semicolon, or tab).
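As an illustration of the record/field structure described above (not TapData code), Python's standard `csv` module parses such a file, treating the first line as the header:

```python
import csv
import io

# A small CSV sample: each line is a record, fields separated by commas.
sample = "id,name,city\n1,Alice,Berlin\n2,Bob,Paris\n"

# DictReader maps each record's fields onto the header names.
reader = csv.DictReader(io.StringIO(sample))
rows = list(reader)
print(rows[0])  # {'id': '1', 'name': 'Alice', 'city': 'Berlin'}

# Other delimiters (semicolons, tabs) are handled via the delimiter argument,
# e.g. csv.DictReader(f, delimiter=';').
```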
@@ -53,7 +51,7 @@ import Content3 from '../../reuse-content/_files_on_oss.md';
## Connect to CSV
-1. [Log in to TapData Platform](../../user-guide/log-in.md).
+1. Log in to the TapData platform.
2. In the left navigation panel, click **Connections**.
diff --git a/docs/prerequisites/files/excel.md b/docs/connectors/files/excel.md
similarity index 97%
rename from docs/prerequisites/files/excel.md
rename to docs/connectors/files/excel.md
index e06b39bc..121d1c5a 100644
--- a/docs/prerequisites/files/excel.md
+++ b/docs/connectors/files/excel.md
@@ -1,8 +1,6 @@
# Excel
-import Content from '../../reuse-content/_all-features.md';
-
Excel is a widely used application for data statistics and analysis. TapData supports reading Excel files stored locally or on FTP, SFTP, SMB, or S3FS to meet a variety of data flow needs.
@@ -54,7 +52,7 @@ import Content3 from '../../reuse-content/_files_on_oss.md';
## Connect to Excel
-1. [Log in to TapData Platform](../../user-guide/log-in.md).
+1. Log in to the TapData platform.
2. In the left navigation panel, click **Connections**.
diff --git a/docs/prerequisites/files/json.md b/docs/connectors/files/json.md
similarity index 97%
rename from docs/prerequisites/files/json.md
rename to docs/connectors/files/json.md
index 5e7e5d13..a1bf2757 100644
--- a/docs/prerequisites/files/json.md
+++ b/docs/connectors/files/json.md
@@ -1,8 +1,6 @@
# JSON
-import Content from '../../reuse-content/_all-features.md';
-
JavaScript Object Notation (JSON) is a standard text-based format for representing structured data based on JavaScript object syntax. It is commonly used for transmitting data in web applications.
@@ -50,7 +48,7 @@ import Content3 from '../../reuse-content/_files_on_oss.md';
## Connect to JSON
-1. [Log in to TapData Platform](../../user-guide/log-in.md).
+1. Log in to the TapData platform.
2. In the left navigation panel, click **Connections**.
diff --git a/docs/prerequisites/files/xml.md b/docs/connectors/files/xml.md
similarity index 97%
rename from docs/prerequisites/files/xml.md
rename to docs/connectors/files/xml.md
index 467d2c3e..2aa8106e 100644
--- a/docs/prerequisites/files/xml.md
+++ b/docs/connectors/files/xml.md
@@ -1,8 +1,6 @@
# XML
-import Content from '../../reuse-content/_all-features.md';
-
Extensible Markup Language (XML) lets you define and store data in a shareable manner. XML supports information exchange between computer systems such as websites, databases, and third-party applications.
@@ -51,7 +49,7 @@ import Content3 from '../../reuse-content/_files_on_oss.md';
## Connect to XML
-1. [Log in to TapData Platform](../../user-guide/log-in.md).
+1. Log in to the TapData platform.
2. In the left navigation panel, click **Connections**.
diff --git a/docs/user-guide/manage-connection.md b/docs/connectors/manage-connection.md
similarity index 91%
rename from docs/user-guide/manage-connection.md
rename to docs/connectors/manage-connection.md
index 3ea304c2..70281659 100644
--- a/docs/user-guide/manage-connection.md
+++ b/docs/connectors/manage-connection.md
@@ -1,7 +1,5 @@
# Manage Connection
-import Content from '../reuse-content/_all-features.md';
-
TapData saves the connection information for each database using a connection, allowing you to reference it directly when creating data replication/development tasks. This eliminates the need for repetitive configuration and improves the convenience of operation and maintenance.
@@ -9,7 +7,7 @@ This article provides a guide on the common operations for managing connections.
## Procedure
-1. [Log in to TapData Platform](log-in.md).
+1. Log in to the TapData platform.
2. In the left navigation panel, click **Connections**.
diff --git a/docs/prerequisites/mq-and-middleware/README.md b/docs/connectors/mq-and-middleware/README.md
similarity index 62%
rename from docs/prerequisites/mq-and-middleware/README.md
rename to docs/connectors/mq-and-middleware/README.md
index 20b5bfff..4f7af1e2 100644
--- a/docs/prerequisites/mq-and-middleware/README.md
+++ b/docs/connectors/mq-and-middleware/README.md
@@ -1,8 +1,6 @@
# Message Queue and Middleware
-import Content from '../../reuse-content/_enterprise-and-cloud-features.md';
-
Please select the database you would like to add:
diff --git a/docs/prerequisites/mq-and-middleware/activemq.md b/docs/connectors/mq-and-middleware/activemq.md
similarity index 94%
rename from docs/prerequisites/mq-and-middleware/activemq.md
rename to docs/connectors/mq-and-middleware/activemq.md
index d41a608e..4810578a 100644
--- a/docs/prerequisites/mq-and-middleware/activemq.md
+++ b/docs/connectors/mq-and-middleware/activemq.md
@@ -1,8 +1,6 @@
# ActiveMQ
-import Content from '../../reuse-content/_all-features.md';
-
ActiveMQ is an open-source Java message broker that supports multiple industry-standard protocols. It is widely used for enterprise-level messaging, ideal for scenarios such as asynchronous processing, application decoupling, traffic peak shaving (e.g., flash sales), log processing, and message communication. This guide will walk you through adding ActiveMQ as a data source in TapData, enabling seamless integration as either a source or a target to build efficient data pipelines and achieve real-time data synchronization.
@@ -28,7 +26,7 @@ ActiveMQ is an open-source Java message broker that supports multiple industry-s
## Connect to ActiveMQ
-1. [Log in to Tapdata platform](../../user-guide/log-in.md).
+1. Log in to the TapData platform.
2. In the left navigation bar, click **Connections**.
diff --git a/docs/prerequisites/mq-and-middleware/ai-chat.md b/docs/connectors/mq-and-middleware/ai-chat.md
similarity index 68%
rename from docs/prerequisites/mq-and-middleware/ai-chat.md
rename to docs/connectors/mq-and-middleware/ai-chat.md
index 1c49ea20..3cea9b7a 100644
--- a/docs/prerequisites/mq-and-middleware/ai-chat.md
+++ b/docs/connectors/mq-and-middleware/ai-chat.md
@@ -1,8 +1,6 @@
# AI Chat
-import Content from '../../reuse-content/_enterprise-and-cloud-features.md';
-
http://45.120.216.132:5002/openapi/swagger#/
diff --git a/docs/prerequisites/mq-and-middleware/bes-channels.md b/docs/connectors/mq-and-middleware/bes-channels.md
similarity index 95%
rename from docs/prerequisites/mq-and-middleware/bes-channels.md
rename to docs/connectors/mq-and-middleware/bes-channels.md
index 29349746..edc830ed 100644
--- a/docs/prerequisites/mq-and-middleware/bes-channels.md
+++ b/docs/connectors/mq-and-middleware/bes-channels.md
@@ -1,8 +1,6 @@
# BesChannels
-import Content from '../../reuse-content/_enterprise-and-cloud-features.md';
-
BesChannels is a one-stop B2B marketing cloud focused on the B2B track, helping B2B marketing departments obtain more leads, improve lead conversion rates, and deliver more business opportunities to sales.
diff --git a/docs/prerequisites/mq-and-middleware/hazelcast-cloud.md b/docs/connectors/mq-and-middleware/hazelcast-cloud.md
similarity index 89%
rename from docs/prerequisites/mq-and-middleware/hazelcast-cloud.md
rename to docs/connectors/mq-and-middleware/hazelcast-cloud.md
index fb5b9a17..10e8a210 100644
--- a/docs/prerequisites/mq-and-middleware/hazelcast-cloud.md
+++ b/docs/connectors/mq-and-middleware/hazelcast-cloud.md
@@ -1,8 +1,6 @@
# Hazelcast
-import Content from '../../reuse-content/_all-features.md';
-
Please follow the instructions below to successfully add and use the Hazelcast Cloud database in TapData Cloud.
diff --git a/docs/prerequisites/mq-and-middleware/kafka-enhanced.md b/docs/connectors/mq-and-middleware/kafka-enhanced.md
similarity index 96%
rename from docs/prerequisites/mq-and-middleware/kafka-enhanced.md
rename to docs/connectors/mq-and-middleware/kafka-enhanced.md
index a3713316..57a2c303 100644
--- a/docs/prerequisites/mq-and-middleware/kafka-enhanced.md
+++ b/docs/connectors/mq-and-middleware/kafka-enhanced.md
@@ -1,8 +1,6 @@
# Kafka-Enhanced
-import Content from '../../reuse-content/_all-features.md';
-
[Apache Kafka](https://kafka.apache.org/) is a distributed data streaming platform that allows real-time publishing, subscribing, storing, and processing of data streams. Kafka-Enhanced is an upgraded version of the Kafka connector, supporting both standard event structures and native Kafka data structures for data transmission. It removes the limitation of the previous Kafka connector, which only supported JSON Object formats, allowing non-JSON Object structures to be loaded into applications for processing. It also provides a more reliable resume-from-breakpoint mechanism.
@@ -61,7 +59,7 @@ Since Kafka as a message queue only supports append operations, avoid duplicate
## Connect Kafka-Enhanced
-1. [Log in to Tapdata platform](../../user-guide/log-in.md).
+1. Log in to the TapData platform.
2. In the left navigation bar, click **Connections**.
@@ -82,7 +80,7 @@ Since Kafka as a message queue only supports append operations, avoid duplicate
* **ACK Confirmation Mechanism**: Choose based on business needs: No confirmation, write to Master partition only, write to most ISR partitions (default), or write to all ISR partitions.
* **Compression Type**: Supports **lz4** (default), **gzip**, **snappy**, **zstd**. Enable compression for large messages to improve transmission efficiency.
* **Extended Configuration**: Supports custom advanced connection properties for Kafka managers, producers, and consumers for optimization in specific scenarios.
- * **CDC Log Caching**: [Mining the source database's](../../user-guide/advanced-settings/share-mining.md) incremental logs. This allows multiple tasks to share the same source database’s incremental log mining process, reducing duplicate reads and minimizing the impact of incremental synchronization on the source database. After enabling this feature, you will need to select an external storage to store the incremental log information.
+ * **CDC Log Caching**: [Mining the source database's](../../operational-data-hub/advanced/share-mining.md) incremental logs. This allows multiple tasks to share the same source database’s incremental log mining process, reducing duplicate reads and minimizing the impact of incremental synchronization on the source database. After enabling this feature, you will need to select an external storage to store the incremental log information.
* **Include Tables**: The default option is **All**, which includes all tables. Alternatively, you can select **Custom** and manually specify the desired topics by separating their names with commas (,).
* **Exclude Tables**: Once the switch is enabled, you have the option to specify topics to be excluded. You can do this by listing the table names separated by commas (,) in case there are multiple topics to be excluded.
* **Agent settings**: Defaults to **Platform automatic allocation**, you can also manually specify an agent.
@@ -152,7 +150,7 @@ When configuring data replication or transformation tasks, and using Kafka-Enhan
**Description**: Uses Kafka's native data synchronization method, supporting append-only operations similar to `INSERT`. As a source, it handles complex, unstructured data and passes it downstream; as a target, it allows flexible control over partitions, headers, keys, and values, enabling custom data insertion.
-**Typical Use Case**: Used for **homogeneous data migration** or **unstructured data transformation**, enabling data filtering and transformation through a Kafka -> [JS Processing Node](../../user-guide/data-development/process-node.md#js-process) -> Kafka/MySQL data pipeline.
+**Typical Use Case**: Used for **homogeneous data migration** or **unstructured data transformation**, enabling data filtering and transformation through a Kafka -> [JS Processing Node](../../data-transformation/process-node.md#js-process) -> Kafka/MySQL data pipeline.
**Sample Data**:
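The Kafka -> JS Processing Node -> Kafka/MySQL pipeline mentioned above can be sketched as a small record-filtering function. This is a minimal illustration only, assuming the JS node exposes a `process(record)` hook that returns the (possibly modified) record, or `null` to drop it — check the JS Processing Node documentation for the exact contract; the field names here are hypothetical.

```javascript
// Hypothetical JS processing node body: drop unwanted records and
// normalize a field before passing the record downstream (assumed contract).
function process(record) {
  if (record.type === 'heartbeat') {
    return null; // returning null discards the record in this sketch
  }
  // Light transformation: uppercase a field and stamp processing time.
  record.channel = String(record.channel || '').toUpperCase();
  record.processedAt = Date.now();
  return record;
}
```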
diff --git a/docs/prerequisites/mq-and-middleware/kafka.md b/docs/connectors/mq-and-middleware/kafka.md
similarity index 92%
rename from docs/prerequisites/mq-and-middleware/kafka.md
rename to docs/connectors/mq-and-middleware/kafka.md
index 4c64d506..65963657 100644
--- a/docs/prerequisites/mq-and-middleware/kafka.md
+++ b/docs/connectors/mq-and-middleware/kafka.md
@@ -1,8 +1,6 @@
# Kafka
-import Content from '../../reuse-content/_all-features.md';
-
Apache Kafka is an open-source distributed event streaming platform that is utilized by numerous companies for a variety of purposes, including high-performance data pipelines, streaming analytics, data integration, and crucial applications.
@@ -50,7 +48,7 @@ In the subsequent configuration of data replication/data transformation tasks, y
## Connect to Kafka
-1. [Log in to TapData Platform](../../user-guide/log-in.md).
+1. Log in to the TapData platform.
2. In the left navigation panel, click **Connections**.
@@ -78,7 +76,7 @@ In the subsequent configuration of data replication/data transformation tasks, y
* Write to all ISR partitions.
* **Message compression type**: When dealing with large message volumes, enabling compression can significantly enhance transmission efficiency. Supports gzip, snappy, lz4, and zstd. By leveraging compression, you can effectively reduce the size of the messages, resulting in improved data transfer efficiency.
* **Ignore push message exception**: Once the switch is turned on, the system will continue to record the offset of the relevant message; however, it will not push any further messages. It's important to note that this approach carries a risk of potential data loss since the system will not deliver subsequent messages.
- * **CDC Log Caching**: [Mining the source database's](../../user-guide/advanced-settings/share-mining.md) incremental logs, this feature allows multiple tasks to share incremental logs from the source database, avoiding redundant reads and thus significantly reducing the load on the source database during incremental synchronization. Upon enabling this feature, an external storage should be selected to store the incremental log.
+ * **CDC Log Caching**: [Mining the source database's](../../operational-data-hub/advanced/share-mining.md) incremental logs, this feature allows multiple tasks to share incremental logs from the source database, avoiding redundant reads and thus significantly reducing the load on the source database during incremental synchronization. Upon enabling this feature, an external storage should be selected to store the incremental log.
* **Agent settings**: Defaults to **Platform automatic allocation**, you can also manually specify an agent.
* **Model load time**: If there are fewer than 10,000 models in the data source, their information will be updated every hour. If the number of models exceeds 10,000, the refresh will take place daily at the time you have specified.
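The ACK confirmation options above correspond to Kafka's standard producer `acks` values. The mapping below is an illustrative sketch (the level names are ours, not TapData's): `0` for no confirmation, `1` when the master/leader partition has written the message, and `-1` when all in-sync replicas must confirm.

```javascript
// Illustrative mapping from the confirmation levels described above to the
// standard Kafka producer `acks` setting (level names are made up for the sketch).
function acksFor(level) {
  switch (level) {
    case 'none':        return 0;  // no confirmation: fastest, may lose messages
    case 'leader-only': return 1;  // leader/master partition has written the message
    case 'all-isr':     return -1; // every in-sync replica has confirmed the write
    default:
      throw new Error(`unknown confirmation level: ${level}`);
  }
}
```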
diff --git a/docs/prerequisites/mq-and-middleware/rabbitmq.md b/docs/connectors/mq-and-middleware/rabbitmq.md
similarity index 88%
rename from docs/prerequisites/mq-and-middleware/rabbitmq.md
rename to docs/connectors/mq-and-middleware/rabbitmq.md
index c92996ec..1a905be6 100644
--- a/docs/prerequisites/mq-and-middleware/rabbitmq.md
+++ b/docs/connectors/mq-and-middleware/rabbitmq.md
@@ -1,8 +1,6 @@
# RabbitMQ
-import Content from '../../reuse-content/_all-features.md';
-
RabbitMQ is a lightweight, open-source message broker that supports the AMQP protocol. It is widely used in distributed systems for asynchronous communication, application decoupling, and traffic peak shaving (such as handling high-volume order bursts). Combined with TapData, RabbitMQ enables you to build high-performance real-time data pipelines to meet low-latency, high-throughput use cases.
@@ -45,7 +43,7 @@ rabbitmqctl set_user_tags username management
## Connect to RabbitMQ
-1. [Log in to TapData Platform](../../user-guide/log-in.md).
+1. Log in to the TapData platform.
2. In the left navigation panel, click **Connections**.
@@ -67,7 +65,7 @@ rabbitmqctl set_user_tags username management
- **Account / Password**: Provide the credentials of the RabbitMQ user with both AMQP and HTTP API permissions. If not created yet, refer to the Prerequisites section above.
- **Virtual Host**: Defaults to `/`. If using a custom vhost, ensure the user has access to it.
- **Advanced Settings**
- - **CDC Log Caching**: [Mining the source database's](../../user-guide/advanced-settings/share-mining.md) incremental logs. This allows multiple tasks to share the same source database’s incremental log mining process, reducing duplicate reads and minimizing the impact of incremental synchronization on the source database. After enabling this feature, you will need to select an external storage to store the incremental log information.
+ - **CDC Log Caching**: [Mining the source database's](../../operational-data-hub/advanced/share-mining.md) incremental logs. This allows multiple tasks to share the same source database’s incremental log mining process, reducing duplicate reads and minimizing the impact of incremental synchronization on the source database. After enabling this feature, you will need to select an external storage to store the incremental log information.
- **Agent settings**: Defaults to **Platform automatic allocation**, you can also manually specify an agent.
- **Model Load Time**: If there are fewer than 10,000 models in the data source, their schema will be updated every hour. If the number of models exceeds 10,000, the refresh will take place daily at the time you have specified.
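For reference, an AMQP client URL for the host, credentials, and vhost settings above can be assembled as below. This is a generic sketch (not TapData's internal logic); the hostname is a placeholder. Note that the default vhost `/` must be percent-encoded as `%2F` in AMQP URLs.

```javascript
// Build an AMQP connection URL from the connection fields described above.
// The vhost segment must be URL-encoded, so the default vhost "/" becomes "%2F".
function amqpUrl({ host, port = 5672, user, password, vhost = '/' }) {
  const auth = `${encodeURIComponent(user)}:${encodeURIComponent(password)}`;
  return `amqp://${auth}@${host}:${port}/${encodeURIComponent(vhost)}`;
}
```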
diff --git a/docs/prerequisites/mq-and-middleware/rocketmq.md b/docs/connectors/mq-and-middleware/rocketmq.md
similarity index 72%
rename from docs/prerequisites/mq-and-middleware/rocketmq.md
rename to docs/connectors/mq-and-middleware/rocketmq.md
index 20b661e5..5bdc95ec 100644
--- a/docs/prerequisites/mq-and-middleware/rocketmq.md
+++ b/docs/connectors/mq-and-middleware/rocketmq.md
@@ -1,8 +1,6 @@
# RocketMQ
-import Content from '../../reuse-content/_all-features.md';
-
**RocketMQ** is a high-performance message queue middleware that supports various scenarios such as ordered messaging, delayed messaging, and batch messaging. It is widely used in high-concurrency systems in finance, e-commerce, and other industries. Integrated with TapData, RocketMQ enables rapid construction of stable real-time data pipelines, helping enterprises achieve system decoupling and data-driven operations, thereby improving overall responsiveness.
@@ -13,14 +11,14 @@ import Content from '../../reuse-content/_all-features.md';
## Considerations
-Before configuring a RocketMQ connection, make sure your Tapdata platform or Agent version supports this connector. Otherwise, connection tests may return an error like `"the specified group is blank"`.
+Before configuring a RocketMQ connection, make sure your TapData platform or Agent version supports this connector. Otherwise, connection tests may return an error like `"the specified group is blank"`.
-- For Tapdata Enterprise/Community users: [upgrade the platform](../../administration/operation.md) to the latest version
-- For Tapdata Cloud users: [upgrade the Agent](../../user-guide/manage-agent.md) to the latest version
+- For TapData Enterprise/Community users: upgrade the platform to the latest version
+- For TapData Cloud users: upgrade the Agent to the latest version
## Connect to RocketMQ
-1. [Log in to TapData Platform](../../user-guide/log-in.md).
+1. Log in to the TapData platform.
2. In the left navigation panel, click **Connections**.
@@ -40,7 +38,7 @@ Before configuring a RocketMQ connection, make sure your Tapdata platform or Age
- **MQ Port**: Service port of RocketMQ. Default is **9876**. TapData reads message data via this port.
- **Account / Password**: Enter the configured RocketMQ username and password.
- **Advanced Settings**
- - **CDC Log Caching**: [Mining the source database's](../../user-guide/advanced-settings/share-mining.md) incremental logs. This allows multiple tasks to share the same source database’s incremental log mining process, reducing duplicate reads and minimizing the impact of incremental synchronization on the source database. After enabling this feature, you will need to select an external storage to store the incremental log information.
+ - **CDC Log Caching**: [Mining the source database's](../../operational-data-hub/advanced/share-mining.md) incremental logs. This allows multiple tasks to share the same source database’s incremental log mining process, reducing duplicate reads and minimizing the impact of incremental synchronization on the source database. After enabling this feature, you will need to select an external storage to store the incremental log information.
- **Agent Settings**: Defaults to **Platform automatic allocation**, you can also manually specify an agent.
- **Model Load Time**: If there are fewer than 10,000 models in the data source, their schema will be updated every hour. If the number of models exceeds 10,000, the refresh will take place daily at the time you have specified.
- **Enable Heartbeat Table**: When the connection type is source or target, you can enable this switch. TapData will create a `_tapdata_heartbeat_table` heartbeat table in the source database and update it every 10 seconds (requires appropriate permissions) to monitor the health of the data source connection and tasks. The heartbeat task starts automatically after the data replication/development task starts, and you can view the heartbeat task in the data source editing page.
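The heartbeat mechanism described above (a `_tapdata_heartbeat_table` row refreshed every 10 seconds) lends itself to a simple staleness check. The sketch below is our own illustration, not TapData internals; the number of missed beats to tolerate is an assumption.

```javascript
// Decide whether a connection looks healthy given the last heartbeat time.
// TapData refreshes the heartbeat roughly every 10 s, so this sketch allows
// a couple of missed beats before flagging the source as unhealthy.
const HEARTBEAT_INTERVAL_MS = 10_000;

function isHealthy(lastBeatMs, nowMs, missedBeatsAllowed = 2) {
  return nowMs - lastBeatMs <= HEARTBEAT_INTERVAL_MS * (missedBeatsAllowed + 1);
}
```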
diff --git a/docs/prerequisites/on-prem-databases/README.md b/docs/connectors/on-prem-databases/README.md
similarity index 73%
rename from docs/prerequisites/on-prem-databases/README.md
rename to docs/connectors/on-prem-databases/README.md
index ed1d2273..4771bd24 100644
--- a/docs/prerequisites/on-prem-databases/README.md
+++ b/docs/connectors/on-prem-databases/README.md
@@ -1,8 +1,6 @@
# On-Premises Databases
-import Content from '../../reuse-content/_enterprise-and-cloud-features.md';
-
TapData Cloud supports a variety of database types, including relational and non-relational databases. Please select the database you would like to add:
diff --git a/docs/prerequisites/on-prem-databases/dameng.md b/docs/connectors/on-prem-databases/dameng.md
similarity index 97%
rename from docs/prerequisites/on-prem-databases/dameng.md
rename to docs/connectors/on-prem-databases/dameng.md
index 4572de40..dadc3f03 100644
--- a/docs/prerequisites/on-prem-databases/dameng.md
+++ b/docs/connectors/on-prem-databases/dameng.md
@@ -1,7 +1,5 @@
# Dameng
-import Content from '../../reuse-content/_enterprise-and-cloud-features.md';
-
[Dameng Database Management System](https://en.dameng.com/) is a new generation of large-scale general-purpose relational database developed by Dameng. It fully supports SQL standards and mainstream programming language interfaces/development frameworks. With its hybrid row-column storage technology, it caters to both OLAP and OLTP, meeting the needs of HTAP (Hybrid Transactional and Analytical Processing) applications. This document will guide you on how to add Dameng as a data source in TapData, which can subsequently be used as a source or target database to build data pipelines.
@@ -38,7 +36,7 @@ DM versions 7.x and 8.x (standalone architecture)
## Considerations
-* Incremental log mining uses official features to load log files into temporary views and filter the DML and DDL logs of the tracked tables. This is similar to Oracle's LogMiner and may consume some database performance. If there are multiple synchronization tasks with scattered data tables, it is recommended to use [Shared Mining](../../user-guide/advanced-settings/share-mining.md) to reduce database load.
+* Incremental log mining uses official features to load log files into temporary views and filter the DML and DDL logs of the tracked tables. This is similar to Oracle's LogMiner and may consume some database performance. If there are multiple synchronization tasks with scattered data tables, it is recommended to use [Shared Mining](../../operational-data-hub/advanced/share-mining.md) to reduce database load.
* When the database log space is insufficient, you can use the command `SF_ARCHIVELOG_DELETE_BEFORE_TIME(SYSDATE-1);` to clean up archive logs from the day before, retaining only the logs from the last day. You can also specify the number of days to retain based on your needs.
## Preparation
@@ -154,7 +152,7 @@ Before connecting to the Dameng database, you need to complete some preparatory
## Connect to Dameng
-1. [Log in to TapData Platform](../../user-guide/log-in.md).
+1. Log in to the TapData platform.
2. In the left navigation panel, click **Connections**.
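The archive-log cleanup command from the Considerations section can be parameterized by the number of retention days. A small sketch that only builds the SQL string (run the statement with your Dameng client of choice):

```javascript
// Build the Dameng archive-log cleanup statement shown above, keeping
// `retainDays` days of logs (SYSDATE-1 retains only the last day).
function archiveCleanupSql(retainDays) {
  if (!Number.isInteger(retainDays) || retainDays < 1) {
    throw new Error('retainDays must be a positive integer');
  }
  return `SF_ARCHIVELOG_DELETE_BEFORE_TIME(SYSDATE-${retainDays});`;
}
```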
diff --git a/docs/prerequisites/on-prem-databases/db2.md b/docs/connectors/on-prem-databases/db2.md
similarity index 95%
rename from docs/prerequisites/on-prem-databases/db2.md
rename to docs/connectors/on-prem-databases/db2.md
index 73877d4f..9610486a 100644
--- a/docs/prerequisites/on-prem-databases/db2.md
+++ b/docs/connectors/on-prem-databases/db2.md
@@ -1,8 +1,6 @@
# Db2
-import Content from '../../reuse-content/_enterprise-and-cloud-features.md';
-
[IBM Db2](https://www.ibm.com/docs/zh/db2) is a relational database known for its high performance, scalability, and reliability in managing structured data. TapData supports using Db2 as both a source and target database, helping you quickly build data pipelines. This guide will walk you through connecting a Db2 data source in TapData.
@@ -137,7 +135,7 @@ Before connecting to a Db2 database, you need to complete account authorization
## Connect to Db2
-1. [Log in to TapData platform](../../user-guide/log-in.md).
+1. Log in to the TapData platform.
2. In the left navigation bar, click **Connections**.
@@ -163,7 +161,7 @@ Before connecting to a Db2 database, you need to complete account authorization
* **Advanced Settings**
* **Time Zone**: Default is UTC (0). If changed to another timezone, it will impact the synchronization time, particularly for fields without timezone information, such as TIMESTAMP and TIME types. However, DATE types will remain unaffected.
- * **CDC Log Caching**: [Extract the incremental logs](../../user-guide/advanced-settings/share-mining.md) from the source database. This allows multiple tasks to share the incremental log extraction process from the same source, reducing the load on the source database. When enabled, you also need to select a storage location for the incremental log information.
+ * **CDC Log Caching**: [Extract the incremental logs](../../operational-data-hub/advanced/share-mining.md) from the source database. This allows multiple tasks to share the incremental log extraction process from the same source, reducing the load on the source database. When enabled, you also need to select a storage location for the incremental log information.
* **Include Tables**: By default, all tables are included. You can choose to customize and specify the tables to include, separated by commas.
* **Exclude Tables**: When enabled, you can specify tables to exclude, separated by commas.
* **Agent Settings**: The default is automatic assignment by the platform. You can also manually specify an Agent.
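The Time Zone setting above matters because TIMESTAMP and TIME values carry no zone of their own: the configured offset decides which instant a zone-less literal denotes, while DATE values are unaffected. A sketch of that interpretation (our own illustration, not TapData code):

```javascript
// Interpret a zone-less "YYYY-MM-DD HH:MM:SS" timestamp under a UTC offset:
// the same literal text denotes different instants for different offsets.
function toInstant(naiveTs, utcOffsetHours) {
  const [d, t] = naiveTs.split(' ');
  const [y, mo, day] = d.split('-').map(Number);
  const [h, mi, s] = t.split(':').map(Number);
  const asUtc = Date.UTC(y, mo - 1, day, h, mi, s);
  return new Date(asUtc - utcOffsetHours * 3600 * 1000).toISOString();
}
```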
diff --git a/docs/prerequisites/on-prem-databases/elasticsearch.md b/docs/connectors/on-prem-databases/elasticsearch.md
similarity index 93%
rename from docs/prerequisites/on-prem-databases/elasticsearch.md
rename to docs/connectors/on-prem-databases/elasticsearch.md
index d1c64771..55856166 100644
--- a/docs/prerequisites/on-prem-databases/elasticsearch.md
+++ b/docs/connectors/on-prem-databases/elasticsearch.md
@@ -1,8 +1,6 @@
# Elasticsearch
-import Content from '../../reuse-content/_all-features.md';
-
Elasticsearch is a distributed, RESTful search and analytics engine capable of addressing a growing number of use cases.
@@ -14,7 +12,7 @@ Elasticsearch 7.6
## Connect to Elasticsearch
-1. [Log in to TapData Platform](../../user-guide/log-in.md).
+1. Log in to the TapData platform.
2. In the left navigation panel, click **Connections**.
diff --git a/docs/prerequisites/on-prem-databases/gbase-8a.md b/docs/connectors/on-prem-databases/gbase-8a.md
similarity index 96%
rename from docs/prerequisites/on-prem-databases/gbase-8a.md
rename to docs/connectors/on-prem-databases/gbase-8a.md
index 94fd8059..0055bce0 100644
--- a/docs/prerequisites/on-prem-databases/gbase-8a.md
+++ b/docs/connectors/on-prem-databases/gbase-8a.md
@@ -1,8 +1,6 @@
# GBase 8a
-import Content from '../../reuse-content/_enterprise-and-cloud-features.md';
-
GBase 8a is an analytical database developed by GBASE, highly compatible with MySQL syntax, features, and field types. TapData supports using GBase 8a as a target database, helping you quickly build real-time data synchronization links. Next, we will introduce how to add GBase 8a as a data source in TapData.
@@ -72,7 +70,7 @@ You can configure the write strategy in the **Advanced Settings** of the task no
## Connect to GBase 8a
-1. [Log in to TapData Platform](../../user-guide/log-in.md).
+1. Log in to the TapData platform.
2. In the left navigation panel, click **Connections**.
diff --git a/docs/prerequisites/on-prem-databases/gbase-8s.md b/docs/connectors/on-prem-databases/gbase-8s.md
similarity index 95%
rename from docs/prerequisites/on-prem-databases/gbase-8s.md
rename to docs/connectors/on-prem-databases/gbase-8s.md
index c943af87..7c3731a2 100644
--- a/docs/prerequisites/on-prem-databases/gbase-8s.md
+++ b/docs/connectors/on-prem-databases/gbase-8s.md
@@ -1,8 +1,6 @@
# GBase 8s
-import Content from '../../reuse-content/_enterprise-and-cloud-features.md';
-
GBase 8s is a database developed based on Informix, retaining most of the native syntax, features, and field types, while also incorporating many advantages of Oracle. TapData supports using GBase 8s as a target database, helping you quickly build real-time data synchronization pipelines. In this article, we will explain how to add a GBase 8s data source in TapData.
@@ -46,7 +44,7 @@ You can choose the write policy in the **Advanced Configuration** of the task no
## Connect to GBase 8s
-1. [Log in to TapData platform](../../user-guide/log-in.md).
+1. Log in to the TapData platform.
2. In the left navigation bar, click on **Connections**.
diff --git a/docs/prerequisites/on-prem-databases/hive1.md b/docs/connectors/on-prem-databases/hive1.md
similarity index 94%
rename from docs/prerequisites/on-prem-databases/hive1.md
rename to docs/connectors/on-prem-databases/hive1.md
index 27ebe7d1..232a8b7b 100644
--- a/docs/prerequisites/on-prem-databases/hive1.md
+++ b/docs/connectors/on-prem-databases/hive1.md
@@ -1,8 +1,6 @@
# Hive1
-import Content from '../../reuse-content/_all-features.md';
-
Please follow the instructions below to successfully add and use the Hive database in TapData Cloud.
diff --git a/docs/prerequisites/on-prem-databases/hive3.md b/docs/connectors/on-prem-databases/hive3.md
similarity index 94%
rename from docs/prerequisites/on-prem-databases/hive3.md
rename to docs/connectors/on-prem-databases/hive3.md
index 0c3d90a6..af6684e7 100644
--- a/docs/prerequisites/on-prem-databases/hive3.md
+++ b/docs/connectors/on-prem-databases/hive3.md
@@ -1,8 +1,6 @@
# Hive3
-import Content from '../../reuse-content/_all-features.md';
-
Please follow the instructions below to successfully add and use the Hive database in TapData Cloud.
diff --git a/docs/prerequisites/on-prem-databases/informix.md b/docs/connectors/on-prem-databases/informix.md
similarity index 92%
rename from docs/prerequisites/on-prem-databases/informix.md
rename to docs/connectors/on-prem-databases/informix.md
index a0cbc560..df98af64 100644
--- a/docs/prerequisites/on-prem-databases/informix.md
+++ b/docs/connectors/on-prem-databases/informix.md
@@ -1,8 +1,6 @@
# Informix
-import Content from '../../reuse-content/_enterprise-and-cloud-features.md';
-
Informix is a product family within IBM's Information Management division that is centered on several relational database management systems and multi-model database offerings.
@@ -10,7 +8,7 @@ This article describes how to connect to Informix data sources on TapData Cloud.
## Connect to Informix
-1. [Log in to TapData Platform](../../user-guide/log-in.md).
+1. Log in to the TapData platform.
2. In the left navigation panel, click **Connections**.
diff --git a/docs/prerequisites/on-prem-databases/kingbase-es-r3.md b/docs/connectors/on-prem-databases/kingbase-es-r3.md
similarity index 96%
rename from docs/prerequisites/on-prem-databases/kingbase-es-r3.md
rename to docs/connectors/on-prem-databases/kingbase-es-r3.md
index a4b934eb..003d753f 100644
--- a/docs/prerequisites/on-prem-databases/kingbase-es-r3.md
+++ b/docs/connectors/on-prem-databases/kingbase-es-r3.md
@@ -1,8 +1,6 @@
# KingbaseES-R3
-import Content from '../../reuse-content/_enterprise-and-cloud-features.md';
-
The Kingbase Database Management System (KingbaseES) is a commercial relational database management system developed independently by Beijing Kingbase Technology Inc, with proprietary intellectual property rights. This article will introduce how to add KingbaseES-R3 data source in TapData Cloud, which can then be used as a source or target database to build data pipelines.
@@ -74,7 +72,7 @@ GRANT SELECT, INSERT, UPDATE, DELETE, TRUNCATE ON ALL TABLES IN SCHEMA schema_na
## Connect to KingbaseES-R3
-1. [Log in to TapData Platform](../../user-guide/log-in.md).
+1. Log in to the TapData platform.
2. In the left navigation panel, click **Connections**.
diff --git a/docs/prerequisites/on-prem-databases/kingbase-es-r6.md b/docs/connectors/on-prem-databases/kingbase-es-r6.md
similarity index 95%
rename from docs/prerequisites/on-prem-databases/kingbase-es-r6.md
rename to docs/connectors/on-prem-databases/kingbase-es-r6.md
index 5ebeca54..f30cd786 100644
--- a/docs/prerequisites/on-prem-databases/kingbase-es-r6.md
+++ b/docs/connectors/on-prem-databases/kingbase-es-r6.md
@@ -1,9 +1,5 @@
# KingbaseES-R6
-import Content1 from '../../reuse-content/_enterprise-and-cloud-features.md';
-
-
-
[KingbaseES](https://www.kingbase.com.cn/en/kingbasees1/index.htm) is a commercial relational database management system (RDBMS) developed by Beijing Kingbase Technology Inc. KingbaseES-R6 is compatible with most features of PostgreSQL 9.6. This document will guide you on how to add KingbaseES-R6 as a data source in TapData and use it as either a source or target database to build data pipelines.
```mdx-code-block
@@ -201,7 +197,7 @@ When KingbaseES-R6 is used as a target, you can choose write strategies through
## Connect to KingbaseES-R6
-1. [Log in to the TapData platform](../../user-guide/log-in.md).
+1. Log in to the TapData platform.
2. In the left-hand navigation bar, click **Connections**.
@@ -225,7 +221,7 @@ When KingbaseES-R6 is used as a target, you can choose write strategies through
* **Password**: The password for the database user.
* **logPluginName**: To capture incremental data from KingbaseES-R6, follow the [Prerequisites](#prerequisites) to install the required plugin.
* **Advanced Settings**:
- * **CDC Log Caching**: [Mining the source database's](../../user-guide/advanced-settings/share-mining.md) incremental logs, this feature allows multiple tasks to share incremental logs from the source database, avoiding redundant reads and thus significantly reducing the load on the source database during incremental synchronization. Upon enabling this feature, an external storage should be selected to store the incremental log.
+ * **CDC Log Caching**: [Mining the source database's](../../operational-data-hub/advanced/share-mining.md) incremental logs, this feature allows multiple tasks to share incremental logs from the source database, avoiding redundant reads and thus significantly reducing the load on the source database during incremental synchronization. Upon enabling this feature, an external storage should be selected to store the incremental log.
* **Contain Table**: The default option is **All**, which includes all tables. Alternatively, you can select **Custom** and manually specify the desired tables by separating their names with commas (,).
* **Exclude Tables**: Once the switch is enabled, you have the option to specify tables to be excluded. You can do this by listing the table names separated by commas (,) in case there are multiple tables to be excluded.
* **Agent Settings**: Defaults to **Platform automatic allocation**, you can also manually specify an agent.
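The Contain Table / Exclude Tables options above combine into a simple filter: a table is synchronized when it is covered by the include setting (**All** or a custom list) and not present in the exclude list. An illustrative sketch of that logic, using comma-separated lists as in the UI:

```javascript
// Decide whether a table is synchronized given the comma-separated
// include ("All" or a custom list) and exclude settings described above.
function tableIncluded(table, includeSpec, excludeSpec = '') {
  const parse = (s) => s.split(',').map((x) => x.trim()).filter(Boolean);
  const excluded = new Set(parse(excludeSpec));
  if (excluded.has(table)) return false;       // exclusions always win
  if (includeSpec === 'All') return true;      // default: include everything
  return new Set(parse(includeSpec)).has(table);
}
```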
diff --git a/docs/prerequisites/on-prem-databases/mariadb.md b/docs/connectors/on-prem-databases/mariadb.md
similarity index 98%
rename from docs/prerequisites/on-prem-databases/mariadb.md
rename to docs/connectors/on-prem-databases/mariadb.md
index 2e30930d..13ae3d01 100644
--- a/docs/prerequisites/on-prem-databases/mariadb.md
+++ b/docs/connectors/on-prem-databases/mariadb.md
@@ -1,8 +1,6 @@
# MariaDB
-import Content from '../../reuse-content/_all-features.md';
-
MariaDB is a versatile open-source relational database management system used for high-availability transaction data, analytics, as an embedded server, and is widely supported by various tools and applications. TapData Cloud provides comprehensive support for building data pipelines utilizing MariaDB as both the source and target database.
@@ -245,7 +243,7 @@ To further enhance the security of the data link, you can choose to enable SSL (
## Connect to MariaDB
-1. [Log in to TapData Platform](../../user-guide/log-in.md).
+1. Log in to the TapData platform.
2. In the left navigation panel, click **Connections**.
diff --git a/docs/prerequisites/on-prem-databases/mongodb-atlas.md b/docs/connectors/on-prem-databases/mongodb-atlas.md
similarity index 97%
rename from docs/prerequisites/on-prem-databases/mongodb-atlas.md
rename to docs/connectors/on-prem-databases/mongodb-atlas.md
index a0b27aae..d1776155 100644
--- a/docs/prerequisites/on-prem-databases/mongodb-atlas.md
+++ b/docs/connectors/on-prem-databases/mongodb-atlas.md
@@ -10,9 +10,7 @@ keywords:
# MongoDB Atlas
-import Content from '../../reuse-content/_all-features.md';
-
[TapData](https://tapdata.io/) supports [MongoDB Atlas](https://www.mongodb.com/atlas) as a data source, enabling real-time data sync, incremental replication, and seamless cloud-to-local data integration.
@@ -79,7 +77,7 @@ Before establishing the connection, it is essential to complete the necessary pr
## Connect to MongoDB Atlas
-1. [Log in to TapData Platform](../../user-guide/log-in.md).
+1. Log in to the TapData platform.
2. In the left navigation panel, click **Connections**.
diff --git a/docs/prerequisites/on-prem-databases/mongodb-below34.md b/docs/connectors/on-prem-databases/mongodb-below34.md
similarity index 95%
rename from docs/prerequisites/on-prem-databases/mongodb-below34.md
rename to docs/connectors/on-prem-databases/mongodb-below34.md
index 773ae16d..0d5083ce 100644
--- a/docs/prerequisites/on-prem-databases/mongodb-below34.md
+++ b/docs/connectors/on-prem-databases/mongodb-below34.md
@@ -10,9 +10,7 @@ keywords:
# MongoDB Below 3.4
-import Content from '../../reuse-content/_all-features.md';
-
[TapData](https://tapdata.io/) supports [MongoDB](https://www.mongodb.com/) (3.4 and earlier) as a data source, enabling real-time CDC sync, incremental replication, and flexible pipeline building.
@@ -144,7 +142,7 @@ When using MongoDB version 3.2, you also need to grant read permissions for the
## Connect to MongoDB Below 3.4
-1. [Log in to TapData platform](../../user-guide/log-in.md).
+1. Log in to the TapData platform.
2. In the left navigation bar, click **Connections**.
@@ -168,7 +166,7 @@ When using MongoDB version 3.2, you also need to grant read permissions for the
* **Direct Connection**: TapData Cloud will connect directly to the database and you need to set up security rules to allow access.
* **Number of Sampling Records for Loading Model**: Specifies the number of records to sample when loading the schema to ensure that the generated schema structure matches the source data. Default is **1000**.
* **Fields Load Limit For Each Collection**: Limits the maximum number of fields loaded per collection to avoid slow schema generation due to excessive fields. Default is **1024**.
- * **CDC Log Caching**: [Mining the source database's](../../user-guide/advanced-settings/share-mining.md) incremental logs, this feature allows multiple tasks to share incremental logs from the source database, avoiding redundant reads and thus significantly reducing the load on the source database during incremental synchronization. Upon enabling this feature, an external storage should be selected to store the incremental log.
+ * **CDC Log Caching**: [Mines the source database's incremental logs](../../operational-data-hub/advanced/share-mining.md) so that multiple tasks can share them, avoiding redundant reads and significantly reducing the load on the source database during incremental synchronization. When this feature is enabled, select an external storage to hold the incremental logs.
* **Contain Table**: The default option is All, which includes all tables. Alternatively, you can select Custom and manually specify the desired tables by separating their names with commas (,).
* **Exclude Tables**: Once the switch is enabled, you have the option to specify tables to be excluded. You can do this by listing the table names separated by commas (,) in case there are multiple tables to be excluded.
* **Agent Settings**: Defaults to Platform automatic allocation, you can also manually specify an agent.
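The sampling-based schema loading described in the options above (sample a fixed number of records, cap the fields loaded per collection) can be sketched as follows. This is a hypothetical illustration of the idea behind the **Number of Sampling Records for Loading Model** and **Fields Load Limit For Each Collection** knobs, not TapData's actual implementation:

```python
from itertools import islice

def infer_schema(records, sample_size=1000, field_limit=1024):
    """Infer a collection's schema by sampling records.

    sample_size mirrors 'Number of Sampling Records for Loading Model';
    field_limit mirrors 'Fields Load Limit For Each Collection'.
    """
    schema = {}
    for record in islice(records, sample_size):
        for field, value in record.items():
            if field not in schema:
                if len(schema) >= field_limit:
                    continue  # cap fields to keep schema generation fast
                schema[field] = type(value).__name__
    return schema

# Fields that appear anywhere in the sample are merged into one schema.
docs = [{"_id": 1, "name": "a"}, {"_id": 2, "name": "b", "age": 30}]
print(infer_schema(docs))
```

Sampling trades exactness for speed: a field that never occurs in the sampled records will be missing from the generated schema, which is why the default sample of 1000 records is configurable.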
diff --git a/docs/prerequisites/on-prem-databases/mongodb.md b/docs/connectors/on-prem-databases/mongodb.md
similarity index 96%
rename from docs/prerequisites/on-prem-databases/mongodb.md
rename to docs/connectors/on-prem-databases/mongodb.md
index d054ce13..448bd62a 100644
--- a/docs/prerequisites/on-prem-databases/mongodb.md
+++ b/docs/connectors/on-prem-databases/mongodb.md
@@ -10,9 +10,7 @@ keywords:
# MongoDB
-import Content from '../../reuse-content/_all-features.md';
-
[TapData](https://tapdata.io/) supports [MongoDB](https://www.mongodb.com/) (4.0 and above) as a data source, enabling real-time CDC sync, incremental replication, and flexible pipeline building.
@@ -139,7 +137,7 @@ db.createUser(
## Connect to MongoDB
-1. [Log in to TapData platform](../../user-guide/log-in.md).
+1. Log in to the TapData platform.
2. In the left navigation bar, click **Connections**.
@@ -163,7 +161,7 @@ db.createUser(
* **Direct Connection**: TapData Cloud will connect directly to the database and you need to set up security rules to allow access.
* **Number of Sampling Records for Loading Model**: Specifies the number of records to sample when loading the schema to ensure that the generated schema structure matches the source data. Default is **1000**.
* **Fields Load Limit For Each Collection**: Limits the maximum number of fields loaded per collection to avoid slow schema generation due to excessive fields. Default is **1024**.
- * **CDC Log Caching**: [Mining the source database's](../../user-guide/advanced-settings/share-mining.md) incremental logs, this feature allows multiple tasks to share incremental logs from the source database, avoiding redundant reads and thus significantly reducing the load on the source database during incremental synchronization. Upon enabling this feature, an external storage should be selected to store the incremental log.
+ * **CDC Log Caching**: [Mines the source database's incremental logs](../../operational-data-hub/advanced/share-mining.md) so that multiple tasks can share them, avoiding redundant reads and significantly reducing the load on the source database during incremental synchronization. When this feature is enabled, select an external storage to hold the incremental logs.
* **Contain Table**: The default option is All, which includes all tables. Alternatively, you can select Custom and manually specify the desired tables by separating their names with commas (,).
* **Exclude Tables**: Once the switch is enabled, you have the option to specify tables to be excluded. You can do this by listing the table names separated by commas (,) in case there are multiple tables to be excluded.
* **Agent Settings**: Defaults to Platform automatic allocation, you can also manually specify an agent.
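The CDC log caching option above boils down to one idea: one reader mines the source's incremental log and fans events out to every subscribed task, instead of each task opening its own read against the source. A minimal conceptual sketch (hypothetical, not TapData code):

```python
class SharedLogMiner:
    """One pass over the source log, shared by all subscribed tasks."""

    def __init__(self, source_log):
        self.source_log = source_log
        self.subscribers = []
        self.reads_from_source = 0

    def subscribe(self, task_events):
        self.subscribers.append(task_events)

    def mine(self):
        for event in self.source_log:
            self.reads_from_source += 1    # each event is read once...
            for task in self.subscribers:  # ...and delivered to every task
                task.append(event)

log = ["insert t1", "update t1", "delete t2"]
miner = SharedLogMiner(log)
task_a, task_b = [], []
miner.subscribe(task_a)
miner.subscribe(task_b)
miner.mine()
print(miner.reads_from_source)  # 3 source reads serve both tasks
```

Without sharing, two tasks would cost six source reads; with the shared miner the source pays for three, which is why the feature needs an external storage to buffer mined events for consumers.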
diff --git a/docs/prerequisites/on-prem-databases/mrs-hive3.md b/docs/connectors/on-prem-databases/mrs-hive3.md
similarity index 95%
rename from docs/prerequisites/on-prem-databases/mrs-hive3.md
rename to docs/connectors/on-prem-databases/mrs-hive3.md
index 88056128..1b886e2a 100644
--- a/docs/prerequisites/on-prem-databases/mrs-hive3.md
+++ b/docs/connectors/on-prem-databases/mrs-hive3.md
@@ -1,8 +1,6 @@
# mrs-hive3
-import Content from '../../reuse-content/_all-features.md';
-
Follow these instructions to ensure that the Hive database is successfully added and used in TapData Cloud.
diff --git a/docs/prerequisites/on-prem-databases/mysql-pxc.md b/docs/connectors/on-prem-databases/mysql-pxc.md
similarity index 97%
rename from docs/prerequisites/on-prem-databases/mysql-pxc.md
rename to docs/connectors/on-prem-databases/mysql-pxc.md
index e704a7df..77bf5058 100644
--- a/docs/prerequisites/on-prem-databases/mysql-pxc.md
+++ b/docs/connectors/on-prem-databases/mysql-pxc.md
@@ -1,8 +1,6 @@
# MySQL PXC
-import Content from '../../reuse-content/_all-features.md';
-
Please follow the instructions below to successfully add and use MySQL PXC databases in TapData.
diff --git a/docs/prerequisites/on-prem-databases/mysql.md b/docs/connectors/on-prem-databases/mysql.md
similarity index 97%
rename from docs/prerequisites/on-prem-databases/mysql.md
rename to docs/connectors/on-prem-databases/mysql.md
index 4c000458..a2037e98 100644
--- a/docs/prerequisites/on-prem-databases/mysql.md
+++ b/docs/connectors/on-prem-databases/mysql.md
@@ -1,7 +1,5 @@
# MySQL
-import Content from '../../reuse-content/_all-features.md';
-
MySQL is the most widely used open-source relational database, serving as the data storage solution for many websites, applications, and commercial products. This document will guide you through adding an MySQL data source in TapData, which can be used as a **source** or **target database** to build real-time data pipelines.
@@ -314,7 +312,7 @@ To further enhance the security of the data connection, you can choose to enable
## Connect to MySQL
-1. [Log in to TapData Platform](../../user-guide/log-in.md).
+1. Log in to TapData platform.
2. In the left navigation panel, click **Connections**.
@@ -338,7 +336,7 @@ To further enhance the security of the data connection, you can choose to enable
* **Advanced Settings**
* **Connection Parameter String**: Default is `useUnicode=yes&characterEncoding=UTF-8`, indicating that data transmission will use the UTF-8 encoded Unicode character set, which helps avoid character encoding issues.
* **Timezone**: Default is set to 0 timezone. If configured to another timezone, it will affect fields without timezone information (e.g., `datetime`). Fields with timezone information (e.g., `timestamp`, `date`, and `time`) are not affected.
- * **CDC Log Caching**: [Mining the source database's](../../user-guide/advanced-settings/share-mining.md) incremental logs. This allows multiple tasks to share the same source database’s incremental log mining process, reducing duplicate reads and minimizing the impact of incremental synchronization on the source database. After enabling this feature, you will need to select an external storage to store the incremental log information.
+ * **CDC Log Caching**: [Mines the source database's incremental logs](../../operational-data-hub/advanced/share-mining.md) so that multiple tasks can share a single log mining process, reducing duplicate reads and minimizing the impact of incremental synchronization on the source database. After enabling this feature, select an external storage to hold the incremental log information.
* **Contain Table**: The default option is **All**, which includes all tables. Alternatively, you can select **Custom** and manually specify the desired tables by separating their names with commas (,).
* **Exclude Tables**: Once the switch is enabled, you have the option to specify tables to be excluded. You can do this by listing the table names separated by commas (,) in case there are multiple tables to be excluded.
* **Agent Settings**: Defaults to **Platform automatic allocation**, you can also manually specify an agent.
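The **Timezone** setting above only matters for types that carry no zone information (such as `datetime`): the same naive value maps to a different instant depending on which zone is assumed. A quick Python illustration of the effect (not TapData code):

```python
from datetime import datetime, timezone, timedelta

# A `datetime`-style value with no timezone information attached.
naive = datetime(2024, 5, 1, 12, 0, 0)

# Interpreting the same naive value under two different zones:
as_utc = naive.replace(tzinfo=timezone.utc)
as_utc8 = naive.replace(tzinfo=timezone(timedelta(hours=8)))

# The two interpretations are eight hours apart as real instants.
print((as_utc - as_utc8).total_seconds() / 3600)  # 8.0
```

This is why types that already embed zone information (`timestamp` and friends) are unaffected by the setting: their instant is unambiguous regardless of the configured zone.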
diff --git a/docs/prerequisites/on-prem-databases/oceanbase-oracle.md b/docs/connectors/on-prem-databases/oceanbase-oracle.md
similarity index 94%
rename from docs/prerequisites/on-prem-databases/oceanbase-oracle.md
rename to docs/connectors/on-prem-databases/oceanbase-oracle.md
index 856172d3..f64c68dc 100644
--- a/docs/prerequisites/on-prem-databases/oceanbase-oracle.md
+++ b/docs/connectors/on-prem-databases/oceanbase-oracle.md
@@ -1,9 +1,5 @@
# OceanBase (Oracle Mode)
-import Content1 from '../../reuse-content/_enterprise-and-cloud-features.md';
-
-
-
**OceanBase** is a natively distributed relational database developed by Ant Group. It is compatible with both MySQL and Oracle syntax and features high availability, high performance, and strong consistency. Tapdata supports **OceanBase (Oracle Mode)** as both a source and a target database, enabling you to build real-time, multi-source data pipelines for synchronization and integration across heterogeneous systems.
```mdx-code-block
@@ -148,7 +144,7 @@ import TabItem from '@theme/TabItem';
## Connect to OceanBase (Oracle Mode)
-1. [Log in to Tapdata platform](../../user-guide/log-in.md).
+1. Log in to the TapData platform.
2. In the left navigation bar, click **Connections**.
@@ -177,7 +173,7 @@ import TabItem from '@theme/TabItem';
- **CDC Password**: Password for the CDC account.
- **Time Zone**: Defaults to UTC (UTC+0). If changed, only affects time zone–less types like `TIMESTAMP`. `TIMESTAMP WITH TIME ZONE` and `DATE` are unaffected.
- **Advanced Settings**
- - **CDC Log Caching**: [Mining the source database's](../../user-guide/advanced-settings/share-mining.md) incremental logs. This allows multiple tasks to share the same source database’s incremental log mining process, reducing duplicate reads and minimizing the impact of incremental synchronization on the source database. After enabling this feature, you will need to select an external storage to store the incremental log information.
+ - **CDC Log Caching**: [Mines the source database's incremental logs](../../operational-data-hub/advanced/share-mining.md) so that multiple tasks can share a single log mining process, reducing duplicate reads and minimizing the impact of incremental synchronization on the source database. After enabling this feature, select an external storage to hold the incremental log information.
- **Agent Settings**: Defaults to **Platform Automatic Allocation**, you can also manually specify an agent.
- **Model Load Time**: If there are less than 10,000 models in the data source, their schema will be updated every hour. But if the number of models exceeds 10,000, the refresh will take place daily at the time you have specified.
- **Enable Heartbeat Table**: When OceanBase is used as **source and target** or **source**, you can enable this option. Tapdata will create a `_tapdata_heartbeat_table` in the source and update it every 10 seconds for health monitoring (requires proper permissions).
diff --git a/docs/prerequisites/on-prem-databases/oceanbase.md b/docs/connectors/on-prem-databases/oceanbase.md
similarity index 93%
rename from docs/prerequisites/on-prem-databases/oceanbase.md
rename to docs/connectors/on-prem-databases/oceanbase.md
index 0c3f41f1..dcbd91ec 100644
--- a/docs/prerequisites/on-prem-databases/oceanbase.md
+++ b/docs/connectors/on-prem-databases/oceanbase.md
@@ -1,8 +1,6 @@
# OceanBase (MySQL Mode)
-import Content from '../../reuse-content/_all-features.md';
-
**OceanBase** is a natively distributed relational database developed by Ant Group. It is compatible with both MySQL and Oracle syntax and features high availability, high performance, and strong consistency. Tapdata supports OceanBase as both a source and a target database, enabling you to build real-time, multi-source data pipelines for synchronization and integration across heterogeneous systems.
@@ -105,7 +103,7 @@ import TabItem from '@theme/TabItem';
## Connect to OceanBase (MySQL Mode)
-1. [Log in to Tapdata platform](../../user-guide/log-in.md).
+1. Log in to the TapData platform.
2. In the left navigation bar, click **Connections**.
@@ -131,7 +129,7 @@ import TabItem from '@theme/TabItem';
- **Log Proxy Port**: Port for the log proxy service (used for Binlog/incremental sync). Default is **2983**.
- **Time Zone**: Default is UTC (UTC+0). If the source or target database uses a different time zone, configure accordingly. Affects time zone–less fields (e.g., `DATETIME`), but not `TIMESTAMP`, `DATE`, or `TIME`.
- **Advanced Settings**
- - **CDC Log Caching**: [Mining the source database's](../../user-guide/advanced-settings/share-mining.md) incremental logs. This allows multiple tasks to share the same source database’s incremental log mining process, reducing duplicate reads and minimizing the impact of incremental synchronization on the source database. After enabling this feature, you will need to select an external storage to store the incremental log information.
+ - **CDC Log Caching**: [Mines the source database's incremental logs](../../operational-data-hub/advanced/share-mining.md) so that multiple tasks can share a single log mining process, reducing duplicate reads and minimizing the impact of incremental synchronization on the source database. After enabling this feature, select an external storage to hold the incremental log information.
- **Agent Settings**: Defaults to **Platform Automatic Allocation**, you can also manually specify an agent.
- **Model Load Time**: If there are less than 10,000 models in the data source, their schema will be updated every hour. But if the number of models exceeds 10,000, the refresh will take place daily at the time you have specified.
- **Enable Heartbeat Table**: When OceanBase is used as **source and target** or **source**, you can enable this option. Tapdata will create a `_tapdata_heartbeat_table` in the source and update it every 10 seconds for health monitoring (requires proper permissions).
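The heartbeat-table option above implies a simple health check: since the source row is updated every 10 seconds, the staleness of the replicated heartbeat on the target approximates pipeline lag. A hypothetical helper sketching that check (the function names are illustrative, not a TapData API):

```python
import time

HEARTBEAT_INTERVAL = 10  # seconds, per the heartbeat description above

def replication_lag(last_heartbeat_ts, now=None):
    """Estimate lag from the heartbeat timestamp observed on the target."""
    now = time.time() if now is None else now
    return max(0.0, now - last_heartbeat_ts)

def is_healthy(last_heartbeat_ts, now=None, tolerance=3):
    # Allow a few missed intervals of slack before flagging the pipeline.
    return replication_lag(last_heartbeat_ts, now) <= tolerance * HEARTBEAT_INTERVAL

now = 1_700_000_000.0
print(is_healthy(now - 12, now=now))   # about one interval behind
print(is_healthy(now - 120, now=now))  # two minutes behind
```

A tolerance of a few intervals avoids false alarms from normal batching jitter while still catching a stalled pipeline quickly.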
diff --git a/docs/prerequisites/on-prem-databases/opengauss.md b/docs/connectors/on-prem-databases/opengauss.md
similarity index 98%
rename from docs/prerequisites/on-prem-databases/opengauss.md
rename to docs/connectors/on-prem-databases/opengauss.md
index bf4afb08..0d990e15 100644
--- a/docs/prerequisites/on-prem-databases/opengauss.md
+++ b/docs/connectors/on-prem-databases/opengauss.md
@@ -1,8 +1,6 @@
# openGauss
-import Content from '../../reuse-content/_all-features.md';
-
Please follow the instructions below to successfully add and use openGauss databases in TapData Cloud.
diff --git a/docs/prerequisites/on-prem-databases/oracle.md b/docs/connectors/on-prem-databases/oracle.md
similarity index 98%
rename from docs/prerequisites/on-prem-databases/oracle.md
rename to docs/connectors/on-prem-databases/oracle.md
index 11416e4b..d775539d 100644
--- a/docs/prerequisites/on-prem-databases/oracle.md
+++ b/docs/connectors/on-prem-databases/oracle.md
@@ -1,8 +1,6 @@
# Oracle
-import Content1 from '../../reuse-content/_enterprise-and-cloud-features.md';
-
Oracle Database is a powerful and widely used relational database management system (RDBMS) developed by Oracle Corporation. It provides a comprehensive set of features to store, organize and retrieve large amounts of data efficiently. Because of its scalability, reliability, concurrency and performance, it is a popular choice for large-scale enterprise applications. This document will guide you through adding an Oracle data source in TapData, which can be used as a **source** or **target database** to build real-time data pipelines.
@@ -58,7 +56,7 @@ To improve data change capture efficiency, TapData supports both the native data
:::tip
-- When using the traditional LogMiner method, each data synchronization task will start a LogMiner session. For large-scale incremental changes, it can occupy up to one CPU core; for smaller changes, it may use about 0.25 CPU cores. It is recommended to dedicate a separate mining process for large data tasks, while small tasks can use [shared mining](../../user-guide/advanced-settings/share-mining.md).
+- When using the traditional LogMiner method, each data synchronization task starts its own LogMiner session. Large-scale incremental changes can occupy up to one CPU core per session; smaller changes use about 0.25 CPU cores. It is recommended to dedicate a separate mining process to large data tasks, while small tasks can use [shared mining](../../operational-data-hub/advanced/share-mining.md).
- When using the Raw Log method, due to its high parsing performance, it may occupy CPU, memory, and disk I/O resources during peak times. It is recommended to connect to standby nodes to minimize the impact on business operations.
:::
@@ -485,7 +483,7 @@ XE=
## Connect to Oracle
-1. [Log in to TapData Platform](../../user-guide/log-in.md).
+1. Log in to TapData platform.
2. In the left navigation panel, click **Connections**.
@@ -515,7 +513,7 @@ XE=
* **Use SSL**: Select whether to enable SSL connection for the data source to enhance data security. After enabling this feature, you will need to upload SSL certificate files and enter the certificate password. The relevant files can be obtained from [Enabling SSL Connection](#ssl).
* **Timezone for datetime**: Default is set to UTC (0 timezone). If changed to another timezone, fields without timezone (such as TIMESTAMP) will be affected, while fields with timezone (such as TIMESTAMP WITH TIME ZONE) and DATE types will remain unaffected.
* **Socket Read Timeout**: Set this parameter to avoid zombie connections that may occur due to unexpected situations (e.g., socket interaction timeout) when LogMiner automatically mines incremental changes. The default value of 0 means no timeout is set.
- * **CDC Log Caching**: [Mining the source database's](../../user-guide/advanced-settings/share-mining.md) incremental logs. This allows multiple tasks to share the same source database’s incremental log mining process, reducing duplicate reads and minimizing the impact of incremental synchronization on the source database. After enabling this feature, you will need to select an external storage to store the incremental log information.
+ * **CDC Log Caching**: [Mines the source database's incremental logs](../../operational-data-hub/advanced/share-mining.md) so that multiple tasks can share a single log mining process, reducing duplicate reads and minimizing the impact of incremental synchronization on the source database. After enabling this feature, select an external storage to hold the incremental log information.
* **Contain table**: The default option is **All**, which includes all tables. Alternatively, you can select **Custom** and manually specify the desired tables by separating their names with commas (,).
* **Exclude tables**: Once the switch is enabled, you have the option to specify tables to be excluded. You can do this by listing the table names separated by commas (,) in case there are multiple tables to be excluded.
* **Agent settings**: Defaults to **Platform automatic allocation**, you can also manually specify an agent.
diff --git a/docs/prerequisites/on-prem-databases/postgresql.md b/docs/connectors/on-prem-databases/postgresql.md
similarity index 97%
rename from docs/prerequisites/on-prem-databases/postgresql.md
rename to docs/connectors/on-prem-databases/postgresql.md
index 3268101a..a6e4c08e 100644
--- a/docs/prerequisites/on-prem-databases/postgresql.md
+++ b/docs/connectors/on-prem-databases/postgresql.md
@@ -1,9 +1,5 @@
# PostgreSQL
-import Content1 from '../../reuse-content/_all-features.md';
-
-
-
[PostgreSQL](https://www.postgresql.org/) is a powerful open-source object-relational database management system (ORDBMS). TapData supports using PostgreSQL as both a source and target database, helping you quickly build real-time data pipelines. This document will introduce how to connect PostgreSQL as a data source in the TapData platform.
```mdx-code-block
@@ -410,7 +406,7 @@ To further enhance the security of the data pipeline, you can enable SSL (Secure
## Connect to PostgreSQL
-1. [Log in to TapData Platform](../../user-guide/log-in.md).
+1. Log in to TapData platform.
1. In the left navigation bar, click **Connections**.
@@ -435,7 +431,7 @@ To further enhance the security of the data pipeline, you can enable SSL (Secure
* **Advanced Settings**
* **ExtParams**: Additional connection parameters, default is empty.
* **Timezone**: Defaults to timezone 0. You can also specify it manually according to business needs. Configuring a different timezone will affect timezone-related fields, such as DATE, TIMESTAMP, TIMESTAMP WITH TIME ZONE, etc.
- * **CDC Log Caching**: [Mining the source database's](../../user-guide/advanced-settings/share-mining.md) incremental logs. This allows multiple tasks to share the same source database’s incremental log mining process, reducing duplicate reads and minimizing the impact of incremental synchronization on the source database. After enabling this feature, you will need to select an external storage to store the incremental log information.
+ * **CDC Log Caching**: [Mines the source database's incremental logs](../../operational-data-hub/advanced/share-mining.md) so that multiple tasks can share a single log mining process, reducing duplicate reads and minimizing the impact of incremental synchronization on the source database. After enabling this feature, select an external storage to hold the incremental log information.
* **Contain Table**: The default option is **All**, which includes all tables. Alternatively, you can select **Custom** and manually specify the desired tables by separating their names with commas (,).
* **Exclude Tables**: Once the switch is enabled, you have the option to specify tables to be excluded. You can do this by listing the table names separated by commas (,) in case there are multiple tables to be excluded.
* **Agent Settings**: Defaults to **Platform automatic allocation**, you can also manually specify an agent.
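The **Contain Table** / **Exclude Tables** options above combine into a simple filter: an inclusion list (`All` or comma-separated names) minus an exclusion list. A hypothetical sketch of that selection logic (not TapData's implementation):

```python
def select_tables(all_tables, contain="All", exclude=""):
    """Resolve the Contain/Exclude table options into a final table list.

    contain: 'All', or comma-separated table names (the Custom option).
    exclude: comma-separated table names, applied after inclusion.
    """
    if contain.strip().lower() == "all":
        chosen = set(all_tables)
    else:
        chosen = {t.strip() for t in contain.split(",") if t.strip()}
        chosen &= set(all_tables)  # ignore names that don't exist
    excluded = {t.strip() for t in exclude.split(",") if t.strip()}
    return sorted(chosen - excluded)

tables = ["orders", "users", "audit_log"]
print(select_tables(tables))                           # everything
print(select_tables(tables, contain="orders, users"))  # custom list
print(select_tables(tables, exclude="audit_log"))      # all but one
```

Stripping whitespace around each name matters in practice, since comma-separated lists are usually typed with spaces after the commas.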
diff --git a/docs/prerequisites/on-prem-databases/redis.md b/docs/connectors/on-prem-databases/redis.md
similarity index 92%
rename from docs/prerequisites/on-prem-databases/redis.md
rename to docs/connectors/on-prem-databases/redis.md
index c5c4dc90..0e93333d 100644
--- a/docs/prerequisites/on-prem-databases/redis.md
+++ b/docs/connectors/on-prem-databases/redis.md
@@ -1,8 +1,6 @@
# Redis
-import Content from '../../reuse-content/_all-features.md';
-
Redis is an open source (BSD licensed), in-memory data structure store used as a database, cache, message broker, and streaming engine. TapData Cloud supports real-time synchronization of data from relational databases (Oracle, MySQL, MongoDB, PostgreSQL, SQL Server) to Redis to help you quickly complete data flow.
This article describes how to connect to Redis data sources on TapData Cloud.
@@ -13,7 +11,7 @@ Redis 2.8 ~ 6.0
## Connect to Redis
-1. [Log in to TapData Platform](../../user-guide/log-in.md).
+1. Log in to TapData platform.
2. In the left navigation panel, click **Connections**.
diff --git a/docs/prerequisites/on-prem-databases/sqlserver.md b/docs/connectors/on-prem-databases/sqlserver.md
similarity index 97%
rename from docs/prerequisites/on-prem-databases/sqlserver.md
rename to docs/connectors/on-prem-databases/sqlserver.md
index 8769efba..5e5cdb77 100644
--- a/docs/prerequisites/on-prem-databases/sqlserver.md
+++ b/docs/connectors/on-prem-databases/sqlserver.md
@@ -1,8 +1,6 @@
# SQL Server
-import Content from '../../reuse-content/_enterprise-and-cloud-features.md';
-
[SQL Server](https://www.microsoft.com/en-us/sql-server/) is a relational database management system (RDBMS) developed by Microsoft. TapData supports using SQL Server as both a source and target database, helping you quickly build real-time data synchronization pipelines. In this guide, we will walk you through how to add SQL Server as a data source in TapData.
@@ -277,7 +275,7 @@ After completing the configuration, be sure to securely store the certificate-re
## Connect to SQL Server
-1. [Log in to TapData platform](../../user-guide/log-in.md).
+1. Log in to the TapData platform.
2. In the left-hand navigation bar, click **Connections**.
@@ -302,7 +300,7 @@ After completing the configuration, be sure to securely store the certificate-re
* **Additional Connection Parameters**: Additional connection parameters, default empty.
* **Timezone**: The default timezone is UTC (0 timezone). If another timezone is configured, it will affect synchronization times for fields without timezone information, such as `time`, `datetime`, `datetime2`, and `smalldatetime`. Fields with timezone information (e.g., `datetimeoffset`) and the `date` type will not be affected.
* **Use SSL/TLS**: Select whether to enable SSL for a more secure connection. If enabled, you will need to upload a CA certificate, certificate password, and server hostname details (see the [Enable SSL](#ssl) section for files).
- * **Using CDC Log Caching**: [Mining the source database's](../../user-guide/advanced-settings/share-mining.md) incremental logs. This allows multiple tasks to share the same source database’s incremental log mining process, reducing duplicate reads and minimizing the impact of incremental synchronization on the source database. After enabling this feature, you will need to select an external storage to store the incremental log information.
+ * **Using CDC Log Caching**: [Mines the source database's incremental logs](../../operational-data-hub/advanced/share-mining.md) so that multiple tasks can share a single log mining process, reducing duplicate reads and minimizing the impact of incremental synchronization on the source database. After enabling this feature, select an external storage to hold the incremental log information.
* **Contain Table**: The default option is **All**, which includes all tables. Alternatively, you can select **Custom** and manually specify the desired tables by separating their names with commas (,).
* **Exclude Tables**: Once the switch is enabled, you have the option to specify tables to be excluded. You can do this by listing the table names separated by commas (,) in case there are multiple tables to be excluded.
* **Agent Settings**: Defaults to **Platform automatic allocation**, you can also manually specify an agent.
diff --git a/docs/prerequisites/on-prem-databases/sybase.md b/docs/connectors/on-prem-databases/sybase.md
similarity index 91%
rename from docs/prerequisites/on-prem-databases/sybase.md
rename to docs/connectors/on-prem-databases/sybase.md
index 21215fdd..c1c99e98 100644
--- a/docs/prerequisites/on-prem-databases/sybase.md
+++ b/docs/connectors/on-prem-databases/sybase.md
@@ -1,8 +1,6 @@
# Sybase
-import Content from '../../reuse-content/_enterprise-and-cloud-features.md';
-
[Sybase Database](https://infocenter.sybase.com/help/index.jsp), also known as Adaptive Server Enterprise (ASE), is a high-performance, reliable, and scalable enterprise-grade relational database management system. Sybase is nearing the end of its support lifecycle, and it is recommended to migrate to other databases to reduce risk. With TapData, you can easily build real-time synchronization pipelines to sync Sybase data to other database platforms, ensuring business continuity.
@@ -42,7 +40,7 @@ DML Operations: INSERT, UPDATE, DELETE
- DDL event capture and application are not supported. If a DDL event occurs during synchronization, you must stop the task and re-run a full data sync.
-- Due to Sybase limitations, if multiple synchronization tasks are enabled on the same database, you must enable **[Shared Mining](../../user-guide/advanced-settings/share-mining.md)** in the Sybase connection and task configurations to prevent new tasks from failing to correctly sync incremental data.
+- Due to Sybase limitations, if multiple synchronization tasks are enabled on the same database, you must enable **[Shared Mining](../../operational-data-hub/advanced/share-mining.md)** in the Sybase connection and task configurations; otherwise, new tasks may fail to sync incremental data correctly.
- Due to Sybase's cache limitations when executing SQL statements, if you encounter the error **"Procedure cache exhausted before a query plan could be found."** while loading the schema, you can adjust the cache size by running the following command:
@@ -101,7 +99,7 @@ DML Operations: INSERT, UPDATE, DELETE
## Connect to Sybase
-1. [Log in to TapData platform](../../user-guide/log-in.md).
+1. Log in to the TapData platform.
2. In the left navigation panel, click **Connections**.
@@ -124,7 +122,7 @@ DML Operations: INSERT, UPDATE, DELETE
* **Password**: The password associated with the database account.
* **Byte Order**: Choose between big-endian and little-endian based on the machine architecture. For example, Linux machines typically use little-endian, while some dedicated Sybase machines use big-endian. Incorrect configuration may cause inconsistent data during the incremental synchronization phase.
* **Advanced Settings**
- * **Shared Mining**: [Mining the source database's](../../user-guide/advanced-settings/share-mining.md) incremental logs allows multiple tasks to share the same source database’s incremental log mining process, reducing duplicate reads and minimizing the impact of incremental synchronization on the source database. After enabling this feature, you will need to select external storage to store the incremental log information.
+ * **Shared Mining**: [Mining the source database's](../../operational-data-hub/advanced/share-mining.md) incremental logs allows multiple tasks to share the same source database’s incremental log mining process, reducing duplicate reads and minimizing the impact of incremental synchronization on the source database. After enabling this feature, you will need to select external storage to store the incremental log information.
* **Include Tables**: Default is **All**. You can customize and specify the topics to include by separating table names with commas (`,`).
* **Exclude Tables**: When enabled, you can specify topics to exclude, separated by commas (`,`).
* **Agent Settings**: Defaults to **Platform Automatic Allocation**, but you can manually assign an agent.
diff --git a/docs/prerequisites/on-prem-databases/tdengine.md b/docs/connectors/on-prem-databases/tdengine.md
similarity index 93%
rename from docs/prerequisites/on-prem-databases/tdengine.md
rename to docs/connectors/on-prem-databases/tdengine.md
index be7d9905..4d2a382e 100644
--- a/docs/prerequisites/on-prem-databases/tdengine.md
+++ b/docs/connectors/on-prem-databases/tdengine.md
@@ -1,8 +1,6 @@
# TDengine
-import Content from '../../reuse-content/_all-features.md';
-
Please follow the instructions below to ensure that the TDengine database is successfully added and used in TapData Cloud.
diff --git a/docs/prerequisites/on-prem-databases/tidb.md b/docs/connectors/on-prem-databases/tidb.md
similarity index 93%
rename from docs/prerequisites/on-prem-databases/tidb.md
rename to docs/connectors/on-prem-databases/tidb.md
index 97aba13e..0b6cb6bc 100644
--- a/docs/prerequisites/on-prem-databases/tidb.md
+++ b/docs/connectors/on-prem-databases/tidb.md
@@ -1,8 +1,6 @@
# TiDB
-import Content from '../../reuse-content/_enterprise-and-cloud-features.md';
-
[TiDB](https://docs.pingcap.com/tidb/stable) is an open-source, distributed relational database developed by PingCAP. It is a hybrid database product that supports both Online Transaction Processing (OLTP) and Online Analytical Processing (OLAP). After deploying the Agent, you can follow this tutorial to add TiDB as a data source in TapData, where it can be used as a **source** or **target database** to build data pipelines.
@@ -56,8 +54,6 @@ To simplify the usage process, the TapData TiDB connector integrates with the [T
* TapData engine must be deployed on an **arm or amd** system architecture.
- * Due to communication restrictions between TiDB components, when using the Tapdata Cloud product, the deployed Agent must be a [semi-managed instance](../../billing/purchase.md).
-
## Prerequisites
1. Log in to the TiDB database and create a user account for data synchronization/transformation tasks using the following command:
@@ -104,7 +100,7 @@ GRANT SELECT, INSERT, UPDATE, DELETE, ALTER, CREATE, DROP ON *.* TO 'username';
* **username**: The username.
## Connect to TiDB
-1. [Log in to TapData platform](../../user-guide/log-in.md).
+1. Log in to the TapData platform.
2. In the left navigation bar, click **Connections**.
@@ -128,7 +124,7 @@ GRANT SELECT, INSERT, UPDATE, DELETE, ALTER, CREATE, DROP ON *.* TO 'username';
* **Advanced Settings**
* **Other Connection String Parameters**: Additional connection parameters, which are empty by default.
* **Timezone**: The default time zone is 0 (UTC). If another time zone is configured, it may affect the synchronization of fields without time zone information (e.g., `datetime`). Fields with time zone information (e.g., `timestamp with time zone`) and `date` and `time` types will not be affected.
- * **CDC Log Caching**: [Mining the source database's](../../user-guide/advanced-settings/share-mining.md) incremental logs. This allows multiple tasks to share the same source database’s incremental log mining process, reducing duplicate reads and minimizing the impact of incremental synchronization on the source database. After enabling this feature, you will need to select an external storage to store the incremental log information.
+ * **CDC Log Caching**: [Mining the source database's](../../operational-data-hub/advanced/share-mining.md) incremental logs. This allows multiple tasks to share the same source database’s incremental log mining process, reducing duplicate reads and minimizing the impact of incremental synchronization on the source database. After enabling this feature, you will need to select an external storage to store the incremental log information.
* **Contain Table**: The default option is All, which includes all tables. Alternatively, you can select Custom and manually specify the desired tables by separating their names with commas (,).
* **Exclude Tables**: Once the switch is enabled, you have the option to specify tables to be excluded. You can do this by listing the table names separated by commas (,) in case there are multiple tables to be excluded.
* **Agent Settings**: Defaults to Platform automatic allocation, you can also manually specify an agent.
diff --git a/docs/prerequisites/on-prem-databases/vastbase.md b/docs/connectors/on-prem-databases/vastbase.md
similarity index 95%
rename from docs/prerequisites/on-prem-databases/vastbase.md
rename to docs/connectors/on-prem-databases/vastbase.md
index 4d6e28f0..c0518242 100644
--- a/docs/prerequisites/on-prem-databases/vastbase.md
+++ b/docs/connectors/on-prem-databases/vastbase.md
@@ -1,8 +1,6 @@
# Vastbase
-import Content from '../../reuse-content/_all-features.md';
-
Vastbase is an enterprise-level relational database based on the open-source openGauss kernel. It adds numerous Oracle-compatible features and security enhancements to the original functionality, along with enterprise-level capabilities in specialized application areas such as GIS and stream computing. TapData supports using Vastbase as either a source or target database to build data pipelines. This document explains how to add a Vastbase data source in TapData.
@@ -201,7 +199,7 @@ In this example, we will use Vastbase's built-in [wal2json](https://docs.vastdat
## Add Data Source
-1. [Log in to the TapData platform](../../user-guide/log-in.md).
+1. Log in to the TapData platform.
2. In the left navigation bar, click **Connections**.
@@ -226,7 +224,7 @@ In this example, we will use Vastbase's built-in [wal2json](https://docs.vastdat
* **Advanced Settings**
* **Additional Parameters**: Additional connection parameters, default is empty.
* **Timezone**: Defaults to the time zone used by the database, which you can also manually specify according to your business needs.
- * **CDC Log Caching**: [Mining the source database's](../../user-guide/advanced-settings/share-mining.md) incremental logs, this feature allows multiple tasks to share incremental logs from the source database, avoiding redundant reads and thus significantly reducing the load on the source database during incremental synchronization. Upon enabling this feature, an external storage should be selected to store the incremental log.
+ * **CDC Log Caching**: [Mining the source database's](../../operational-data-hub/advanced/share-mining.md) incremental logs, this feature allows multiple tasks to share incremental logs from the source database, avoiding redundant reads and thus significantly reducing the load on the source database during incremental synchronization. Upon enabling this feature, an external storage should be selected to store the incremental log.
* **Contain Table**: The default option is **All**, which includes all tables. Alternatively, you can select **Custom** and manually specify the desired tables by separating their names with commas (,).
* **Exclude tables**: Once the switch is enabled, you have the option to specify tables to be excluded. You can do this by listing the table names separated by commas (,) in case there are multiple tables to be excluded.
* **Agent Settings**: Defaults to **Platform automatic allocation**, you can also manually specify an agent.
diff --git a/docs/prerequisites/others/README.md b/docs/connectors/others/README.md
similarity index 59%
rename from docs/prerequisites/others/README.md
rename to docs/connectors/others/README.md
index 647253dd..36fad8c4 100644
--- a/docs/prerequisites/others/README.md
+++ b/docs/connectors/others/README.md
@@ -1,8 +1,6 @@
# SaaS and APIs
-import Content from '../../reuse-content/_enterprise-and-cloud-features.md';
-
Please select the database you would like to add:
diff --git a/docs/prerequisites/others/custom-connection.md b/docs/connectors/others/custom-connection.md
similarity index 99%
rename from docs/prerequisites/others/custom-connection.md
rename to docs/connectors/others/custom-connection.md
index b2067492..3ea4cd98 100644
--- a/docs/prerequisites/others/custom-connection.md
+++ b/docs/connectors/others/custom-connection.md
@@ -1,8 +1,6 @@
# Custom Connection
-import Content from '../../reuse-content/_all-features.md';
-
If the existing data sources don't meet your requirements, you can create custom connections based on your business needs. This article outlines the configuration process.
diff --git a/docs/prerequisites/others/dummy.md b/docs/connectors/others/dummy.md
similarity index 97%
rename from docs/prerequisites/others/dummy.md
rename to docs/connectors/others/dummy.md
index da0884cf..ed7d0dfa 100644
--- a/docs/prerequisites/others/dummy.md
+++ b/docs/connectors/others/dummy.md
@@ -1,8 +1,6 @@
# Dummy
-import Content from '../../reuse-content/_all-features.md';
-
Dummy is a data source that generates test data. This article describes how to add Dummy data sources to TapData Cloud.
@@ -32,7 +30,7 @@ Dummy is a data source that generates test data. This article describes how to a
## Connect to Dummy
-1. [Log in to TapData Platform](../../user-guide/log-in.md).
+1. Log in to the TapData platform.
2. In the left navigation panel, click **Connections**.
3. On the right side of the page, click **Create**.
4. In the pop-up dialog, select **Dummy**.
diff --git a/docs/prerequisites/others/http-receiver.md b/docs/connectors/others/http-receiver.md
similarity index 97%
rename from docs/prerequisites/others/http-receiver.md
rename to docs/connectors/others/http-receiver.md
index 5cebdfae..9a176374 100644
--- a/docs/prerequisites/others/http-receiver.md
+++ b/docs/connectors/others/http-receiver.md
@@ -1,14 +1,12 @@
# HTTP Receiver
-import Content from '../../reuse-content/_all-features.md';
-
With TapData Cloud's HTTP Receiver data source, you can receive data pushed from platforms such as SaaS to quickly bridge data silos and build a unified data platform. This article explains how to add an HTTP Receiver data source in TapData Cloud.
## Connect to HTTP Receiver
-1. [Log in to TapData Platform](../../user-guide/log-in.md).
+1. Log in to the TapData platform.
2. In the left navigation pane, click on **Connection**.
diff --git a/docs/prerequisites/others/mock-source.md b/docs/connectors/others/mock-source.md
similarity index 97%
rename from docs/prerequisites/others/mock-source.md
rename to docs/connectors/others/mock-source.md
index 4ddf87c5..a8a7c09e 100644
--- a/docs/prerequisites/others/mock-source.md
+++ b/docs/connectors/others/mock-source.md
@@ -1,8 +1,6 @@
# Mock Source
-import Content from '../../reuse-content/_all-features.md';
-
The Mock Source can be used as a source database, primarily for performance testing scenarios.
diff --git a/docs/prerequisites/others/mock-target.md b/docs/connectors/others/mock-target.md
similarity index 86%
rename from docs/prerequisites/others/mock-target.md
rename to docs/connectors/others/mock-target.md
index d0f58a3d..e319c396 100644
--- a/docs/prerequisites/others/mock-target.md
+++ b/docs/connectors/others/mock-target.md
@@ -1,7 +1,5 @@
# Mock Target
-import Content from '../../reuse-content/_all-features.md';
-
The Mock Target can be used as a target database, primarily for performance testing scenarios.
diff --git a/docs/user-guide/pre-check.md b/docs/connectors/pre-check.md
similarity index 98%
rename from docs/user-guide/pre-check.md
rename to docs/connectors/pre-check.md
index d98236f5..10c7f72b 100644
--- a/docs/user-guide/pre-check.md
+++ b/docs/connectors/pre-check.md
@@ -1,8 +1,6 @@
# Task Pre-check
-import Content from '../reuse-content/_all-features.md';
-
To ensure the normal operation of data replication/development tasks, when you save or start a task, TapData will conduct a pre-check based on node configuration and data source characteristics. Simultaneously, it prints the check results through logs, helping you avoid the risk of task execution failure and manage tasks more efficiently.
diff --git a/docs/prerequisites/saas-and-api/README.md b/docs/connectors/saas-and-api/README.md
similarity index 60%
rename from docs/prerequisites/saas-and-api/README.md
rename to docs/connectors/saas-and-api/README.md
index 3df9db2b..9be27695 100644
--- a/docs/prerequisites/saas-and-api/README.md
+++ b/docs/connectors/saas-and-api/README.md
@@ -1,8 +1,6 @@
# SaaS and APIs
-import Content from '../../reuse-content/_enterprise-and-cloud-features.md';
-
Please select the database you would like to add:
diff --git a/docs/prerequisites/saas-and-api/coding.md b/docs/connectors/saas-and-api/coding.md
similarity index 90%
rename from docs/prerequisites/saas-and-api/coding.md
rename to docs/connectors/saas-and-api/coding.md
index 3525d794..2de10756 100644
--- a/docs/prerequisites/saas-and-api/coding.md
+++ b/docs/connectors/saas-and-api/coding.md
@@ -1,8 +1,6 @@
# Coding
-import Content from '../../reuse-content/_all-features.md';
-
**Coding** is an all-in-one DevOps platform by Tencent Cloud. It supports core features such as code hosting, project collaboration, and continuous integration. By integrating with Tapdata, you can monitor key events in Coding (e.g., code commits, task updates) in real time, build automated data pipelines, and unify R&D data for intelligent analysis—accelerating the digital and intelligent transformation of your DevOps processes.
@@ -22,7 +20,7 @@ For more details on data structures and event support, refer to Coding's [offici
## Connect to Coding
-1. [Log in to Tapdata platform](../../user-guide/log-in.md).
+1. Log in to the TapData platform.
2. In the left navigation panel, click **Connections**.
@@ -71,7 +69,7 @@ For more details on data structures and event support, refer to Coding's [offici
- **Advanced Settings**: Configure based on your business needs:
- - **CDC Log Caching**: [Mining the source database's](../../user-guide/advanced-settings/share-mining.md) incremental logs. This allows multiple tasks to share the same source database’s incremental log mining process, reducing duplicate reads and minimizing the impact of incremental synchronization on the source database. After enabling this feature, you will need to select an external storage to store the incremental log information.
+ - **CDC Log Caching**: [Mining the source database's](../../operational-data-hub/advanced/share-mining.md) incremental logs. This allows multiple tasks to share the same source database’s incremental log mining process, reducing duplicate reads and minimizing the impact of incremental synchronization on the source database. After enabling this feature, you will need to select an external storage to store the incremental log information.
- **Agent Settings**: Defaults to **Platform automatic allocation**, you can also manually specify an agent.
- **Model Load Time**: If there are fewer than 10,000 models in the data source, their schema is refreshed every hour. If the number of models exceeds 10,000, the refresh takes place daily at the time you specify.
diff --git a/docs/connectors/saas-and-api/feishu-bitable.md b/docs/connectors/saas-and-api/feishu-bitable.md
new file mode 100644
index 00000000..35efad33
--- /dev/null
+++ b/docs/connectors/saas-and-api/feishu-bitable.md
@@ -0,0 +1,65 @@
+# Feishu Bitable
+
+[Feishu Bitable](https://open.feishu.cn/document/server-docs/docs/bitable-v1/bitable-overview) (also known as Lark Bitable) is a collaborative spreadsheet tool designed for managing structured data. Tapdata supports integrating it as both a source and target, enabling real-time extraction of table data and synchronization to target databases, search engines, or data lakes. This helps enterprises unify data governance, enhance data usability, and supports use cases such as BI reporting, automation workflows, and real-time analytics.
+
+## Considerations
+
+Feishu Open Platform enforces different rate limiting strategies for different OpenAPI endpoints to ensure system stability and provide optimal performance and developer experience:
+
+- **Batch Insert/Update APIs**: Up to 50 requests per second; each request can contain up to 1,000 records.
+- **Batch Delete API**: Up to 50 requests per second; each request can contain up to 500 records.
+- **Batch Query API**: Up to 20 requests per second.
+ - For tables with a **single primary key**, up to 50 records can be retrieved per request.
+ - For **composite primary keys**, only 10 records per request are supported.
+
+For more details, see [Rate Limiting Policy](https://open.feishu.cn/document/server-docs/api-call-guide/frequency-control).
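+The batch limits above can be respected client-side by chunking records and pacing requests. The sketch below is a minimal illustration of that planning logic only; the record payload shape is hypothetical, and no Feishu API call is made:

```python
import math

# Feishu Bitable batch-write limits, per the rate limiting policy above.
MAX_RECORDS_PER_REQUEST = 1000   # batch insert/update cap per request
MAX_REQUESTS_PER_SECOND = 50     # batch insert/update request rate cap

def plan_batches(records):
    """Split records into request-sized chunks honoring the per-request cap."""
    return [
        records[i:i + MAX_RECORDS_PER_REQUEST]
        for i in range(0, len(records), MAX_RECORDS_PER_REQUEST)
    ]

def min_duration_seconds(num_records):
    """Lower bound on wall-clock time to write num_records under both limits."""
    requests_needed = math.ceil(num_records / MAX_RECORDS_PER_REQUEST)
    return requests_needed / MAX_REQUESTS_PER_SECOND

# Example: 2,500 hypothetical records need 3 requests (1000 + 1000 + 500).
batches = plan_batches([{"fields": {"id": n}} for n in range(2500)])
print(len(batches))
print(min_duration_seconds(2500))
```

+In practice you would send each chunk with your HTTP client of choice and sleep between requests when the computed rate is exceeded.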
+
+## Supported Sync Operations
+
+DML operations: INSERT, UPDATE, DELETE
+
+:::tip
+
+When Feishu Bitable is used as a **target**, you can configure data write policies in the task node’s advanced settings:
+
+- For **insert events**: Choose to update if the record exists, discard it, or insert only.
+- For **update events**: Choose to discard if the record doesn't exist, or insert instead.
+
+:::
+
+## Prerequisites
+
+1. Log in to the [Lark Open Platform](https://open.feishu.cn/app) as an administrator.
+
+2. On the homepage of the development platform, open the self-built application.
+
+ For instructions on creating a self-built application, see [Development Process](https://open.feishu.cn/document/develop-process/self-built-application-development-process).
+
+3. In the left navigation bar, click **Credentials & Basic Info** to obtain the App ID and App Secret, which are needed when configuring the data source connection.
+
+ 
+
+## Connect to Feishu Bitable
+
+1. Log in to the TapData platform.
+
+2. In the left navigation panel, click **Connections**.
+
+3. On the right side of the page, click **Create Connection**.
+
+4. In the dialog box, search for and select **Feishu Bitable**.
+
+5. Complete the data source configuration as follows:
+
+ 
+
+ - **Name**: A unique, descriptive name with business significance.
+ - **Type**: Supports using Feishu Bitable as either a source or target.
+ - **App ID**, **App Secret**: Obtain these from the Feishu Open Platform. See [Prerequisites](#prerequisites) for details.
+ - **App Token**: Each Bitable is treated as an independent app with a corresponding `app_token`. See the [Integration Guide](https://open.feishu.cn/document/server-docs/docs/bitable-v1/notification) for details.
+ - **Table ID**: Each Bitable app may include multiple tables, each identified by a unique `table_id`. You can find this in the [Integration Guide](https://open.feishu.cn/document/server-docs/docs/bitable-v1/notification), or by calling the [List all tables](https://open.feishu.cn/document/uAjLw4CM/ukTMukTMukTM/reference/bitable-v1/app-table/list) API.
+ - **Advanced Settings**
+ - **Agent Settings**: Defaults to **Platform automatic allocation**, you can also manually specify an agent.
+ - **Model Load Time**: If there are fewer than 10,000 models in the data source, their schema is refreshed every hour. If the number of models exceeds 10,000, the refresh takes place daily at the time you specify.
+
+6. Click **Test**. If successful, click **Save**.
\ No newline at end of file
diff --git a/docs/prerequisites/saas-and-api/github.md b/docs/connectors/saas-and-api/github.md
similarity index 92%
rename from docs/prerequisites/saas-and-api/github.md
rename to docs/connectors/saas-and-api/github.md
index d12bd4a9..7b47f94a 100644
--- a/docs/prerequisites/saas-and-api/github.md
+++ b/docs/connectors/saas-and-api/github.md
@@ -1,8 +1,6 @@
# GitHub
-import Content from '../../reuse-content/_enterprise-and-cloud-features.md';
-
GitHub is a website and cloud-based service that helps developers store and manage their code, as well as track and control changes to their code. TapData supports building data pipelines with GitHub as a source database, helping you to read the Issue and Pull Requests change data of the specified repository and synchronize to the specified data source.
@@ -12,7 +10,7 @@ This article describes how to add GitHub data source to TapData Cloud.
## Procedure
-1. [Log in to TapData Platform](../../user-guide/log-in.md).
+1. Log in to the TapData platform.
2. In the left navigation panel, click **Connections**.
diff --git a/docs/prerequisites/saas-and-api/lark-approval.md b/docs/connectors/saas-and-api/lark-approval.md
similarity index 85%
rename from docs/prerequisites/saas-and-api/lark-approval.md
rename to docs/connectors/saas-and-api/lark-approval.md
index 083dee04..48cd4d4d 100644
--- a/docs/prerequisites/saas-and-api/lark-approval.md
+++ b/docs/connectors/saas-and-api/lark-approval.md
@@ -1,8 +1,6 @@
# Lark Approval
-import Content from '../../reuse-content/_enterprise-and-cloud-features.md';
-
This article describes how to connect to Lark Approval data sources on TapData Cloud.
diff --git a/docs/prerequisites/saas-and-api/lark-doc.md b/docs/connectors/saas-and-api/lark-doc.md
similarity index 94%
rename from docs/prerequisites/saas-and-api/lark-doc.md
rename to docs/connectors/saas-and-api/lark-doc.md
index cf2c3dac..311e3987 100644
--- a/docs/prerequisites/saas-and-api/lark-doc.md
+++ b/docs/connectors/saas-and-api/lark-doc.md
@@ -1,8 +1,6 @@
# Lark Doc
-import Content from '../../reuse-content/_enterprise-and-cloud-features.md';
-
**Lark Docs** is the core platform within the Lark ecosystem for knowledge accumulation, sharing, and collaboration. Tapdata supports using Lark Docs as a data source, allowing you to extract document content in real time and sync it to target databases, search engines, or data lakes.
Through structured management and unified governance, this integration helps enterprises build intelligent knowledge systems, improve document search efficiency, and support downstream BI analytics, question-answering systems, and data-driven decision-making.
@@ -29,7 +27,7 @@ The Lark Open Platform enforces rate limiting policies for different OpenAPI end
## Connect to Lark Doc
-1. [Log in to Tapdata platform](../../user-guide/log-in.md).
+1. Log in to the TapData platform.
2. In the left navigation bar, click **Connections**.
diff --git a/docs/prerequisites/saas-and-api/lark-im.md b/docs/connectors/saas-and-api/lark-im.md
similarity index 89%
rename from docs/prerequisites/saas-and-api/lark-im.md
rename to docs/connectors/saas-and-api/lark-im.md
index e7d92086..64dc1686 100644
--- a/docs/prerequisites/saas-and-api/lark-im.md
+++ b/docs/connectors/saas-and-api/lark-im.md
@@ -1,9 +1,5 @@
# Lark-IM
-import Content from '../../reuse-content/_enterprise-and-cloud-features.md';
-
-
-
Lark is an enterprise collaboration and management platform that integrates instant messaging, audio and video conferencing, and open applications. By configuring **Lark-IM (Lark chat messages)** as a target in Tapdata, you can push alert events, operational metrics, or custom notifications from data pipelines to Lark in real time, enabling real-time alerting, automated collaboration, and decision support for your teams.
## Considerations
@@ -13,7 +9,7 @@ Lark is an enterprise collaboration and management platform that integrates inst
- The maximum size for a text message request body is **150 KB**, and the maximum size for a card or rich text message request body is **30 KB**.
- Typically, upstream sources (such as databases or logs) do not produce data in the message body structure required by Lark.
- Therefore, you usually need to add a [JavaScript](../../user-guide/data-development/process-node.md#js-process) or [Python](../../user-guide/data-development/process-node.md#python) processing node in the data pipeline to clean and format the raw data into a JSON structure like the one below:
+ Therefore, you usually need to add a [JavaScript](../../data-transformation/process-node.md#js-process) or [Python](../../data-transformation/process-node.md#python) processing node in the data pipeline to clean and format the raw data into a JSON structure like the one below:
```json
[
@@ -57,7 +53,7 @@ For more details, see the [Send message content structure](https://open.feishu.c
## Connect to Lark-IM
-1. [Log in to Tapdata platform](../../user-guide/log-in.md).
+1. Log in to the TapData platform.
2. In the left navigation bar, click **Connections**.
diff --git a/docs/prerequisites/saas-and-api/lark-task.md b/docs/connectors/saas-and-api/lark-task.md
similarity index 92%
rename from docs/prerequisites/saas-and-api/lark-task.md
rename to docs/connectors/saas-and-api/lark-task.md
index eed3df88..8c8d2f7c 100644
--- a/docs/prerequisites/saas-and-api/lark-task.md
+++ b/docs/connectors/saas-and-api/lark-task.md
@@ -1,7 +1,5 @@
# LarkTask
-import Content from '../../reuse-content/_enterprise-and-cloud-features.md';
-
**Lark** is a collaboration and management platform that provides instant messaging, video conferencing, and other features. After completing the Agent deployment, you can follow this guide to add a **LarkTask** data source in Tapdata, and later use it as a target to build data pipelines.
@@ -19,7 +17,7 @@ import Content from '../../reuse-content/_enterprise-and-cloud-features.md';
## Connect to LarkTask
-1. [Log in to the Tapdata platform](../../user-guide/log-in.md).
+1. Log in to the TapData platform.
2. In the left navigation panel, click **Connections**.
diff --git a/docs/prerequisites/saas-and-api/quick-api.md b/docs/connectors/saas-and-api/quick-api.md
similarity index 95%
rename from docs/prerequisites/saas-and-api/quick-api.md
rename to docs/connectors/saas-and-api/quick-api.md
index bcc93584..088b1d72 100644
--- a/docs/prerequisites/saas-and-api/quick-api.md
+++ b/docs/connectors/saas-and-api/quick-api.md
@@ -1,8 +1,6 @@
# Quick API
-import Content from '../../reuse-content/_all-features.md';
-
### 1. Fill in the connection name (required)
@@ -27,7 +25,7 @@ You need to add some standardized labels to this API name on the corresponding t
which contains the following keywords:
- A. TAP_TABLE: the table creation keyword, which indicates that the data obtained by the current API will form a data table.
-- B. [Tickets]: Specify the table name, generally the same as TAP_ The TABLE keyword appears together, specifying the table name after the table is created and the data obtained by the API is stored in this table. A text wrapped with []. Please organize the table name reasonably. It is not recommended to use special characters. For example, using one of the two characters [] in the table name will affect the table name after the table is created.
+- B. Tickets: Specifies the table name. It generally appears together with the TAP_TABLE keyword and sets the name of the table in which the data obtained by the API is stored after creation. The table name is text wrapped in []. Choose table names carefully and avoid special characters; for example, including either of the [] characters in a table name will affect the name of the created table.
- C. (PAGE_LIMIT: data): the LIMIT paging keyword, indicating that the API pages through data based on the record index and in-page offset. Determine the correct paging type by analyzing the API interface first; an incorrect type will affect the query results and cause data errors. The following paging types are provided; specify the one that matches the relevant API's behavior:
```
diff --git a/docs/prerequisites/saas-and-api/vika.md b/docs/connectors/saas-and-api/vika.md
similarity index 94%
rename from docs/prerequisites/saas-and-api/vika.md
rename to docs/connectors/saas-and-api/vika.md
index d16f49af..da2641f4 100644
--- a/docs/prerequisites/saas-and-api/vika.md
+++ b/docs/connectors/saas-and-api/vika.md
@@ -1,8 +1,6 @@
# Vika
-import Content from '../../reuse-content/_all-features.md';
-
This article describes how to connect to Vika data sources on TapData Cloud.
diff --git a/docs/prerequisites/saas-and-api/zoho-desk.md b/docs/connectors/saas-and-api/zoho-desk.md
similarity index 97%
rename from docs/prerequisites/saas-and-api/zoho-desk.md
rename to docs/connectors/saas-and-api/zoho-desk.md
index 4be2d0e9..19caebbf 100644
--- a/docs/prerequisites/saas-and-api/zoho-desk.md
+++ b/docs/connectors/saas-and-api/zoho-desk.md
@@ -1,8 +1,6 @@
# Zoho Desk
-import Content from '../../reuse-content/_all-features.md';
-
Zoho Desk is a cloud-based customer service platform that supports ticket management, customer communication, and service automation. By integrating Zoho Desk as a data source in TapData, you can stream key data such as tickets, contacts, and conversations in real time. This is ideal for building unified customer service views, customer profiles, and service response monitoring systems.
@@ -62,7 +60,7 @@ Before connecting Zoho Desk to TapData, follow these steps to retrieve authentic
## Connect to Zoho Desk
-1. [Log in to TapData](../../user-guide/log-in.md).
+1. Log in to TapData.
2. From the left navigation panel, click **Connections**.
diff --git a/docs/prerequisites/supported-databases.md b/docs/connectors/supported-data-sources.md
similarity index 94%
rename from docs/prerequisites/supported-databases.md
rename to docs/connectors/supported-data-sources.md
index dd8ec1e6..aa5fa012 100644
--- a/docs/prerequisites/supported-databases.md
+++ b/docs/connectors/supported-data-sources.md
@@ -1,8 +1,5 @@
# Supported Data Sources
-import Content from '../reuse-content/_all-features.md';
-
-
TapData supports a rich set of data sources, as described below:
@@ -12,17 +9,16 @@ If you need to synchronize DDL operations, you need to enable DDL collection and
:::
-### Synchronization Types
+### Data Source Types
+
+TapData supports various data sources that can serve different roles in data synchronization:
+
+- **Source**: The origin database or system from which data is read and synchronized
+- **Target**: The destination database or system where data is written and stored
+
+TapData supports both **full synchronization** and **incremental synchronization** between these data sources. For detailed support information, refer to the data source support table below.
-TapData supports two types of synchronization: **full synchronization** and **incremental synchronization**, covering both one-way and two-way synchronization scenarios. It is compatible with a variety of data sources, as described below:
-- **One-Way Synchronization**: Data is synchronized from the source to the target data source. For detailed support information on incremental synchronization, refer to the data source support table in this document.
-- **Two-Way Synchronization**: Enables real-time bidirectional data flow between source and target data sources, ensuring data consistency on both ends. For detailed configuration steps, see [Bidirectional Synchronization Case](../case-practices/pipeline-tutorial/mysql-bi-directional-sync.md). Currently, the following data sources support **Two-Way Synchronization**, applicable to both full and incremental synchronization scenarios:
- - MySQL ↔ MySQL
- - PostgreSQL ↔ PostgreSQL
- - MongoDB ↔ MongoDB
- - PostgreSQL ↔ MySQL
- - SQL Server ↔ SQL Server
```mdx-code-block
import Tabs from '@theme/Tabs';
@@ -478,7 +474,7 @@ The beta version of the data sources is in public preview and has passed the bas
-
Huawei's Cloud GaussDB
+
Huawei Cloud GaussDB
✅
✅
➖
diff --git a/docs/user-guide/trouble-shooting-connection.md b/docs/connectors/trouble-shooting-connection.md
similarity index 73%
rename from docs/user-guide/trouble-shooting-connection.md
rename to docs/connectors/trouble-shooting-connection.md
index ef346428..9b834e7e 100644
--- a/docs/user-guide/trouble-shooting-connection.md
+++ b/docs/connectors/trouble-shooting-connection.md
@@ -1,10 +1,8 @@
# Troubleshooting Connections
-import Content from '../reuse-content/_all-features.md';
-
-To ensure the effectiveness of the data source connection, you can perform a connection test after completing the [data connection configuration](../prerequisites/README.md). By clicking on **Test Connection**, you can verify if the data source configuration meets the requirements and check for normal network connectivity. This article provides an overview of common inspection items within TapData and offers troubleshooting methods in case of connection failures.
+To ensure the effectiveness of the data source connection, you can perform a connection test after completing the [data connection configuration](README.md). By clicking on **Test Connection**, you can verify if the data source configuration meets the requirements and check for normal network connectivity. This article provides an overview of common inspection items within TapData and offers troubleshooting methods in case of connection failures.
- **Check if connections are available**
@@ -24,7 +22,7 @@ To ensure the effectiveness of the data source connection, you can perform a con
- **Check if binlog is enabled and set to ROW format** (for MySQL)
- TapData verifies whether the database's binlog is enabled and set to the ROW format. If these requirements are not met, the connection test will fail. To gather more information about binlog settings, it is recommended to refer to the [MySQL preparations documentation](../prerequisites/on-prem-databases/mysql.md). In such cases, it is necessary to review and verify the configuration of the database's binlog to ensure it is properly enabled and set to the ROW format as per the requirements.
+ TapData verifies whether the database's binlog is enabled and set to the ROW format. If these requirements are not met, the connection test will fail. To gather more information about binlog settings, it is recommended to refer to the [MySQL preparations documentation](on-prem-databases/mysql.md). In such cases, it is necessary to review and verify the configuration of the database's binlog to ensure it is properly enabled and set to the ROW format as per the requirements.
- **Check if permissions required for CDC are authorized**
@@ -32,18 +30,18 @@ To ensure the effectiveness of the data source connection, you can perform a con
- **Check if archive logging is enabled** (for Oracle)
- TapData checks if the archive log is enabled. If it is not enabled, the test fails. For more information about how to enable it, see [Oracle preparation](../prerequisites/on-prem-databases/oracle.md).
+ TapData checks if the archive log is enabled. If it is not enabled, the test fails. For more information about how to enable it, see [Oracle preparation](on-prem-databases/oracle.md).
- **Check if supplemental log mode is correct** (for Oracle)
- TapData checks if the supplemental log mode is correct. If it is incorrect, the test fails. For more information about how to set it up, see [Oracle preparation](../prerequisites/on-prem-databases/oracle.md).
+ TapData checks if the supplemental log mode is correct. If it is incorrect, the test fails. For more information about how to set it up, see [Oracle preparation](on-prem-databases/oracle.md).
- **Check if permissions required for DDL are authorized** (for Oracle)
- TapData checks if the database account has DDL execution permissions. If the permission is not met, the test fails. For an example of authorization, see [Oracle preparation](../prerequisites/on-prem-databases/oracle.md).
+ TapData checks if the database account has DDL execution permissions. If the permission is not met, the test fails. For an example of authorization, see [Oracle preparation](on-prem-databases/oracle.md).
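+
+Several of the checks above can also be verified manually before running the connection test. A quick sketch of the corresponding queries (adjust account names and privileges to your environment):
+
+```sql
+-- MySQL: binlog must be enabled and set to ROW format
+SHOW VARIABLES LIKE 'log_bin';        -- expected: ON
+SHOW VARIABLES LIKE 'binlog_format';  -- expected: ROW
+
+-- Oracle: archive logging and supplemental logging must be configured
+SELECT log_mode FROM v$database;                   -- expected: ARCHIVELOG
+SELECT supplemental_log_data_min FROM v$database;  -- expected: YES
+```
+
+If any of these return unexpected values, follow the preparation guides linked above before retrying the connection test.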
## See also
-[Preparations Before Connection](../prerequisites/README.md)
\ No newline at end of file
+[Preparations Before Connection](README.md)
\ No newline at end of file
diff --git a/docs/prerequisites/warehouses-and-lake/README.md b/docs/connectors/warehouses-and-lake/README.md
similarity index 62%
rename from docs/prerequisites/warehouses-and-lake/README.md
rename to docs/connectors/warehouses-and-lake/README.md
index 9d73ca07..c7fe9249 100644
--- a/docs/prerequisites/warehouses-and-lake/README.md
+++ b/docs/connectors/warehouses-and-lake/README.md
@@ -1,8 +1,6 @@
# Data Warehouse and Data Lake
-import Content from '../../reuse-content/_enterprise-and-cloud-features.md';
-
Please select the database you would like to add:
diff --git a/docs/prerequisites/warehouses-and-lake/big-query.md b/docs/connectors/warehouses-and-lake/big-query.md
similarity index 95%
rename from docs/prerequisites/warehouses-and-lake/big-query.md
rename to docs/connectors/warehouses-and-lake/big-query.md
index feb211df..cbb8e37a 100644
--- a/docs/prerequisites/warehouses-and-lake/big-query.md
+++ b/docs/connectors/warehouses-and-lake/big-query.md
@@ -1,8 +1,6 @@
# BigQuery
-import Content from '../../reuse-content/_all-features.md';
-
TapData Cloud offers seamless support for data synchronization and data transformation tasks using [BigQuery](https://cloud.google.com/bigquery/docs) as the target database. BigQuery is a highly efficient, serverless, and cost-effective enterprise data warehouse that provides extensive capabilities for BI (Business Intelligence), machine learning, and AI (Artificial Intelligence). With TapData Cloud, you can easily integrate BigQuery data sources into your workflows.
@@ -10,7 +8,7 @@ This article serves as a comprehensive guide, providing step-by-step instruction
## Precautions
-[Agent](../../installation/install-tapdata-agent.md)'s machine can access to Google Cloud Services.
+The machine hosting TapData Platform must be able to access Google Cloud Services.
@@ -92,7 +90,7 @@ This article serves as a comprehensive guide, providing step-by-step instruction
## Connect to BigQuery
-1. [Log in to TapData Platform](../../user-guide/log-in.md).
+1. Log in to TapData Platform.
2. In the left navigation panel, click **Connections**.
diff --git a/docs/prerequisites/warehouses-and-lake/clickhouse.md b/docs/connectors/warehouses-and-lake/clickhouse.md
similarity index 98%
rename from docs/prerequisites/warehouses-and-lake/clickhouse.md
rename to docs/connectors/warehouses-and-lake/clickhouse.md
index d157fd95..0e0f99e3 100644
--- a/docs/prerequisites/warehouses-and-lake/clickhouse.md
+++ b/docs/connectors/warehouses-and-lake/clickhouse.md
@@ -1,8 +1,6 @@
# ClickHouse
-import Content from '../../reuse-content/_all-features.md';
-
[ClickHouse](https://clickhouse.com/) is a high-performance, column-oriented SQL database management system (DBMS) for online analytical processing (OLAP). This document will guide you on how to add ClickHouse as a data source in TapData, enabling you to use it as either a **source** or **target database** for building real-time data pipelines.
@@ -98,7 +96,7 @@ GRANT SELECT, INSERT, CREATE TABLE, ALTER TABLE, ALTER UPDATE, DROP TABLE, TRUNC
## Connect to ClickHouse
-1. [Log in to TapData Platform](../../user-guide/log-in.md).
+1. Log in to TapData Platform.
2. In the left navigation bar, click **Connections**.
diff --git a/docs/prerequisites/warehouses-and-lake/databend.md b/docs/connectors/warehouses-and-lake/databend.md
similarity index 95%
rename from docs/prerequisites/warehouses-and-lake/databend.md
rename to docs/connectors/warehouses-and-lake/databend.md
index 7c9c004a..f6fe7fbf 100644
--- a/docs/prerequisites/warehouses-and-lake/databend.md
+++ b/docs/connectors/warehouses-and-lake/databend.md
@@ -1,8 +1,6 @@
# Databend
-import Content1 from '../../reuse-content/_enterprise-and-cloud-features.md';
-
Databend is an open-source, elastic, and workload-aware modern cloud data warehouse. Utilizing the latest vectorized query processing technology, Databend helps users perform rapid data analysis on object storage.
@@ -45,7 +43,7 @@ import Content from '../../reuse-content/_beta.md';
## Connecting to Databend
-1. [Log in to TapData Platform](../../user-guide/log-in.md).
+1. Log in to TapData Platform.
2. In the left navigation panel, click **Connections**.
diff --git a/docs/prerequisites/warehouses-and-lake/doris.md b/docs/connectors/warehouses-and-lake/doris.md
similarity index 98%
rename from docs/prerequisites/warehouses-and-lake/doris.md
rename to docs/connectors/warehouses-and-lake/doris.md
index 2d6aab7a..819f8771 100644
--- a/docs/prerequisites/warehouses-and-lake/doris.md
+++ b/docs/connectors/warehouses-and-lake/doris.md
@@ -1,9 +1,5 @@
# Doris
-import Content1 from '../../reuse-content/_all-features.md';
-
-
-
[Apache Doris](https://doris.apache.org/) is an MPP-based real-time data warehouse known for its high query speed. It can be used for report analysis, ad-hoc queries, unified data warehouse, and data lake query acceleration. Tapdata supports using Doris as both a source and a target database to build data pipelines, helping you quickly handle data flows for big data analysis scenarios. In this article, we will introduce how to connect to Doris on the Tapdata platform.
```mdx-code-block
@@ -101,7 +97,7 @@ Please replace the username, password, and host in the command above.
## Connect to Doris
-1. [Log in to Tapdata platform](../../user-guide/log-in.md).
+1. Log in to TapData Platform.
2. In the left navigation bar, click **Connections**.
diff --git a/docs/prerequisites/warehouses-and-lake/gaussdb.md b/docs/connectors/warehouses-and-lake/gaussdb.md
similarity index 98%
rename from docs/prerequisites/warehouses-and-lake/gaussdb.md
rename to docs/connectors/warehouses-and-lake/gaussdb.md
index e240ecf7..d2bf3616 100644
--- a/docs/prerequisites/warehouses-and-lake/gaussdb.md
+++ b/docs/connectors/warehouses-and-lake/gaussdb.md
@@ -1,8 +1,6 @@
# GaussDB (DWS)
-import Content from '../../reuse-content/_all-features.md';
-
GaussDB (DWS) is a fully managed, enterprise-grade cloud data warehouse service offering zero-maintenance, online scaling, and efficient multi-source data loading capabilities. It is compatible with the PostgreSQL ecosystem. TapData supports using GaussDB (DWS) as both a source and a target, enabling you to quickly build data pipelines. This guide explains how to connect GaussDB (DWS) in the TapData platform.
@@ -115,7 +113,7 @@ In GaussDB (DWS), distribution columns determine how data is distributed across
## Connect to GaussDB(DWS)
-1. [Log in to TapData Platform](../../user-guide/log-in.md).
+1. Log in to TapData Platform.
2. Click **Connections** in the left navigation bar.
diff --git a/docs/prerequisites/warehouses-and-lake/greenplum.md b/docs/connectors/warehouses-and-lake/greenplum.md
similarity index 95%
rename from docs/prerequisites/warehouses-and-lake/greenplum.md
rename to docs/connectors/warehouses-and-lake/greenplum.md
index 28d5f7e8..9eb594f7 100644
--- a/docs/prerequisites/warehouses-and-lake/greenplum.md
+++ b/docs/connectors/warehouses-and-lake/greenplum.md
@@ -1,8 +1,6 @@
# Greenplum
-import Content from '../../reuse-content/_all-features.md';
-
Greenplum Database is a massively parallel processing (MPP) database server with an architecture specially designed to manage large-scale analytic data warehouses and business intelligence workloads.
@@ -10,7 +8,7 @@ This article provides detailed instructions on adding a Greenplum database to Ta
## Connect to Greenplum
-1. [Log in to TapData Platform](../../user-guide/log-in.md).
+1. Log in to TapData Platform.
2. In the left navigation panel, click **Connections**.
diff --git a/docs/prerequisites/warehouses-and-lake/hudi.md b/docs/connectors/warehouses-and-lake/hudi.md
similarity index 95%
rename from docs/prerequisites/warehouses-and-lake/hudi.md
rename to docs/connectors/warehouses-and-lake/hudi.md
index 39bece57..98c182b0 100644
--- a/docs/prerequisites/warehouses-and-lake/hudi.md
+++ b/docs/connectors/warehouses-and-lake/hudi.md
@@ -1,8 +1,6 @@
# Hudi
-import Content from '../../reuse-content/_all-features.md';
-
[Apache Hudi](https://hudi.apache.org/cn/docs/overview) is a storage format for data lakes that provides the ability to update, delete, and consume change data on top of the Hadoop file system. TapData supports using Hudi as a **target database** to build data transfer pipelines.
diff --git a/docs/prerequisites/warehouses-and-lake/selectdb.md b/docs/connectors/warehouses-and-lake/selectdb.md
similarity index 96%
rename from docs/prerequisites/warehouses-and-lake/selectdb.md
rename to docs/connectors/warehouses-and-lake/selectdb.md
index 4bb39d59..e3667df8 100644
--- a/docs/prerequisites/warehouses-and-lake/selectdb.md
+++ b/docs/connectors/warehouses-and-lake/selectdb.md
@@ -10,9 +10,7 @@ keywords:
# SelectDB
-import Content from '../../reuse-content/_all-features.md';
-
[TapData](https://tapdata.io/) supports [SelectDB Cloud](https://www.selectdb.com/) as a real-time data source for analytics, reporting, and high-performance data warehouse integration.
@@ -46,7 +44,7 @@ GRANT PROCESS ON *.* TO 'tapdata' IDENTIFIED BY 'password';
## Connect to SelectDB
-1. [Log in to TapData Platform](../../user-guide/log-in.md).
+1. Log in to TapData Platform.
2. In the left navigation panel, click **Connections**.
diff --git a/docs/prerequisites/warehouses-and-lake/starrocks.md b/docs/connectors/warehouses-and-lake/starrocks.md
similarity index 97%
rename from docs/prerequisites/warehouses-and-lake/starrocks.md
rename to docs/connectors/warehouses-and-lake/starrocks.md
index 043b937e..e8ecc999 100644
--- a/docs/prerequisites/warehouses-and-lake/starrocks.md
+++ b/docs/connectors/warehouses-and-lake/starrocks.md
@@ -1,9 +1,5 @@
# StarRocks
-import Content1 from '../../reuse-content/_enterprise-features.md';
-
-
-
StarRocks is a high-performance data warehouse designed for real-time analytics. It features a vectorized execution engine and an MPP architecture, supporting high concurrency, multidimensional analysis, and real-time data updates. Tapdata supports using StarRocks as a target in data pipelines to enable real-time data ingestion and analytics acceleration at scale.
```mdx-code-block
@@ -94,7 +90,7 @@ GRANT ALL ON ALL TABLES IN ALL DATABASES TO USER your_username;
## Connect to StarRocks
-1. [Log in to Tapdata Platform](../../user-guide/log-in.md).
+1. Log in to TapData Platform.
2. In the left navigation bar, click **Connections**.
diff --git a/docs/prerequisites/warehouses-and-lake/tablestore.md b/docs/connectors/warehouses-and-lake/tablestore.md
similarity index 95%
rename from docs/prerequisites/warehouses-and-lake/tablestore.md
rename to docs/connectors/warehouses-and-lake/tablestore.md
index 205319e4..abdf6c4f 100644
--- a/docs/prerequisites/warehouses-and-lake/tablestore.md
+++ b/docs/connectors/warehouses-and-lake/tablestore.md
@@ -1,8 +1,6 @@
# Tablestore
-import Content from '../../reuse-content/_all-features.md';
-
[Alibaba Cloud Tablestore](https://www.alibabacloud.com/help/en/tablestore) is a serverless table storage service designed for handling large volumes of structured data. It also provides a comprehensive solution for IoT scenarios, offering optimized data storage capabilities. TapData Cloud supports data synchronization tasks with Tablestore as the target database.
@@ -33,7 +31,7 @@ This article provides instructions on how to add Tablestore data sources to TapD
## Connect to Tablestore
-1. [Log in to TapData Platform](../../user-guide/log-in.md).
+1. Log in to TapData Platform.
2. In the left navigation panel, click **Connections**.
diff --git a/docs/prerequisites/warehouses-and-lake/yashandb.md b/docs/connectors/warehouses-and-lake/yashandb.md
similarity index 97%
rename from docs/prerequisites/warehouses-and-lake/yashandb.md
rename to docs/connectors/warehouses-and-lake/yashandb.md
index da02c6c9..091c425a 100644
--- a/docs/prerequisites/warehouses-and-lake/yashandb.md
+++ b/docs/connectors/warehouses-and-lake/yashandb.md
@@ -1,9 +1,5 @@
# YashanDB
-import Content1 from '../../reuse-content/_all-features.md';
-
-
-
YashanDB is powered by Bounded Evaluation, which uniquely delivers Just in Time analytic capabilities focusing on conquering some of the challenges surrounding big data characterized by Volume, Velocity and Variety. TapData Cloud supports using YashanDB as a target database to build data pipelines, facilitating rapid data transfer.
This article will guide you through connecting a YashanDB data source on the TapData Cloud.
diff --git a/docs/user-guide/copy-data/README.md b/docs/data-replication/README.md
similarity index 84%
rename from docs/user-guide/copy-data/README.md
rename to docs/data-replication/README.md
index 2888b3d9..8fa06d60 100644
--- a/docs/user-guide/copy-data/README.md
+++ b/docs/data-replication/README.md
@@ -1,8 +1,6 @@
# Data Replication
-import Content from '../../reuse-content/_all-features.md';
-
Data replication involves creating a task to copy data from a source to a target database. This can be done through 1:1 replication, where all the data from the source is replicated to the target, or through incremental replication, where only the changes or updates since the last replication are copied to the target.
diff --git a/docs/user-guide/copy-data/create-task.md b/docs/data-replication/create-task.md
similarity index 85%
rename from docs/user-guide/copy-data/create-task.md
rename to docs/data-replication/create-task.md
index f4585f74..85b518a2 100644
--- a/docs/user-guide/copy-data/create-task.md
+++ b/docs/data-replication/create-task.md
@@ -1,19 +1,22 @@
# Create a Data Replication Task
-import Content from '../../reuse-content/_all-features.md';
-
-
-The data replication function can help you to achieve real-time synchronization between the same/heterogeneous data sources, which is suitable for data migration/synchronization, data disaster recovery, reading performance expansion, and other [business scenarios](../../introduction/use-cases.md).
+The data replication function helps you achieve real-time synchronization between homogeneous or heterogeneous data sources, which is suitable for data migration/synchronization, data disaster recovery, read performance scaling, and other [business scenarios](../introduction/use-cases.md).
This article explains the specific data replication process to help you quickly become familiar with creating, monitoring, and managing data replication tasks.
+:::tip
+
+TapData supports both one-way and bidirectional synchronization. For bidirectional synchronization, the following data source combinations are supported: MySQL ↔ MySQL, PostgreSQL ↔ PostgreSQL, MongoDB ↔ MongoDB, PostgreSQL ↔ MySQL, and SQL Server ↔ SQL Server. For detailed configuration steps, see [Bidirectional Synchronization Case](../case-practices/pipeline-tutorial/mysql-bi-directional-sync.md).
+
+:::
+
## Prerequisites
Before you create a data replication task, you need to perform the following preparations:
-* [Install TapData](../../quick-start/install.md)
-* [Connect to a Data Source](../../quick-start/connect-database.md)
+* [Install TapData](../getting-started/install-and-setup/README.md)
+* [Connect to a Data Source](../getting-started/connect-data-source.md)
## Procedure
@@ -21,10 +24,10 @@ As an example of creating a data replication task, the article demonstrates the
Best Practices
- To build efficient and reliable data replication tasks, it is recommended to read the Data Synchronization Best Practices before starting to configure tasks.
+ To build efficient and reliable data replication tasks, it is recommended to read the Data Synchronization Best Practices before starting to configure tasks.
-1. [Log in to TapData Platform](../log-in.md).
+1. Log in to TapData Platform.
2. In the left navigation panel, click **Data Replication**.
@@ -38,17 +41,17 @@ As an example of creating a data replication task, the article demonstrates the
4. On the left side of the page, you can drag and drop the source and destination data icons onto the right canvas. After placing them, you can connect them by drawing a line between them to establish the data flow for the replication task.
- 
+ 
:::tip
- In addition to adding data nodes, you can also add processing nodes to complete more complex tasks, such as filtering data, adding or subtracting fields, etc. For more information, see [processing nodes](../data-development/process-node.md).
+ In addition to adding data nodes, you can also add processing nodes to handle more complex tasks, such as filtering data or adding and removing fields. For more information, see [processing nodes](../data-transformation/process-node.md).
:::
5. Click the source node (MySQL in this example) to complete the parameter configuration of the right panel according to the following instructions.
- 
+ 
* **Basic Settings**
@@ -80,7 +83,7 @@ As an example of creating a data replication task, the article demonstrates the
6. Click on the target node, which in this example is MongoDB, to configure the parameters in the right panel based on the following instructions.
- 
+ 
* **Basic Settings**
* **Node Name**: Defaults to the connection name; you can also set a name that has business significance.
@@ -101,7 +104,7 @@ As an example of creating a data replication task, the article demonstrates the
* **Alert Settings**
Defaults as per source node alert settings.
-7. (Optional) Click the  icon above to configure the task properties.
+7. (Optional) Click the  icon above to configure the task properties.
* **Task name**: Fill in a name that has business significance.
* **Sync type**: You have the option to select **Full + incremental synchronization**, or you can choose to perform **Initial sync** and **CDC** (Change Data Capture) separately. In real-time data synchronization scenarios, using the combination of full and incremental data copying allows you to copy existing data from the source database to the target database.
@@ -110,11 +113,11 @@ As an example of creating a data replication task, the article demonstrates the
8. Click **Start**, and you will be able to view the performance of the task on the current page, including metrics such as RPS (Records Per Second), delay, and task event statistics.
- 
+ 
## See also
-[Monitor or Manage Tasks](manage-task.md)
+[Monitor or Manage Tasks](../data-transformation/manage-task.md)
diff --git a/docs/user-guide/incremental-check.md b/docs/data-replication/incremental-check.md
similarity index 95%
rename from docs/user-guide/incremental-check.md
rename to docs/data-replication/incremental-check.md
index c8e7a3ae..0b47baff 100644
--- a/docs/user-guide/incremental-check.md
+++ b/docs/data-replication/incremental-check.md
@@ -1,15 +1,7 @@
# Incremental Data Validation
-import Content from '../reuse-content/_enterprise-features.md';
-
-
-
Incremental data validation is a real-time mechanism designed to enhance data accuracy and consistency. It periodically samples and compares newly inserted or updated records between the source and target systems to detect and automatically correct inconsistencies.
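+
+Conceptually, each validation round amounts to a keyed comparison of recently changed rows between source and target. A simplified SQL sketch of the idea — the table, `updated_at` column, and checksum column here are illustrative assumptions, not TapData's internal implementation:
+
+```sql
+-- Find recently updated source rows that are missing from the target
+-- or differ from it; such rows are candidates for automatic correction.
+SELECT s.id
+FROM source_db.orders AS s
+LEFT JOIN target_db.orders AS t ON t.id = s.id
+WHERE s.updated_at >= NOW() - INTERVAL 5 MINUTE
+  AND (t.id IS NULL OR t.row_checksum <> s.row_checksum);
+```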
-## Prerequisites
-
-Tables selected for incremental validation must have a primary key or a unique index.
-
## Background
As real-time data synchronization and integration become core demands across modern enterprises, ensuring consistent data flow between heterogeneous systems has grown increasingly critical. However, real-time sync is often prone to issues such as network latency, system errors, and failed writes, which can result in inconsistencies between source and target databases—ultimately impacting business decisions and operational efficiency.
@@ -34,7 +26,7 @@ To address these challenges, TapData introduces **Incremental Data Validation**,
## Procedure
-1. [Log in to TapData Platform](log-in.md).
+1. Log in to TapData Platform.
2. Create a data replication or transformation task.
diff --git a/docs/user-guide/data-development/manage-task.md b/docs/data-replication/manage-task.md
similarity index 77%
rename from docs/user-guide/data-development/manage-task.md
rename to docs/data-replication/manage-task.md
index 49fe9e13..612156f0 100644
--- a/docs/user-guide/data-development/manage-task.md
+++ b/docs/data-replication/manage-task.md
@@ -1,15 +1,13 @@
-# Manage Data Transformation Task
-import Content from '../../reuse-content/_all-features.md';
+# Manage Data Replication Task
-
After the replication task is created, you can monitor and manage the task in the task list.
-
+
| Operation | Description |
| ----------------- | ------------------------------------------------------------ |
-| **Set Tag** | Click the  icon at the top left of the task list to expand the tag information. You can quickly set tags by clicking and dragging the task name to the desired tag. |
+| **Set Tag** | Click the  icon at the top left of the task list to expand the tag information. You can quickly set tags by clicking and dragging the task name to the desired tag. |
| **Set category** | Choose the target task and categorize it based on the business perspective. |
| **Start**/**Stop** | After stopping the task, the next start will continue to replicate the data based on the last stopped incremental point in time. |
| **Edit** | Configure the task, including node settings, synchronized tables, task start schedule, and other information. Please note that the task cannot be altered during execution. |
diff --git a/docs/user-guide/copy-data/monitor-task.md b/docs/data-replication/monitor-task.md
similarity index 84%
rename from docs/user-guide/copy-data/monitor-task.md
rename to docs/data-replication/monitor-task.md
index dfd7a110..fba689e6 100644
--- a/docs/user-guide/copy-data/monitor-task.md
+++ b/docs/data-replication/monitor-task.md
@@ -1,7 +1,5 @@
# Monitor Data Replication Task
-import Content from '../../reuse-content/_all-features.md';
-
Once the data replication task is started, the page will automatically redirect to the task monitoring page. From there, you can monitor the task's operation details, such as the status of the Agent, data synchronization progress, task progress, alert settings, and other relevant information.
@@ -11,7 +9,7 @@ By clicking the **monitor** button on the task list page, you can access the mon
:::
-
+
@@ -43,7 +41,7 @@ Displaying basic information and key monitoring indicators of the task, includin
## ③ Node Information Display Area
-Hover your mouse pointer over a node to display key metrics for that node, and click the  icon in the bottom right corner of the node to see more details.
+Hover your mouse pointer over a node to display key metrics for that node, and click the  icon in the bottom right corner of the node to see more details.
- **Full Sync Progress**: The progress report on the full data synchronization.
- **Incremental Data Synchronization**: The incremental log collection time point is represented as the relative time of (engine time - incremental time point of the node) in the floating window when hovering the mouse.
@@ -57,13 +55,13 @@ Hover your mouse pointer over a node to display key metrics for that node, and c
## ④ Task Log Display Area
-Click the  icon at the top of the page, then you can view the progress, logs, alert list, and associated task information for a task run. You can filter the logs using keywords, periods, and levels, or download them for local analysis on the **Log** tab.
+Click the  icon at the top of the page, then you can view the progress, logs, alert list, and associated task information for a task run. You can filter the logs using keywords, periods, and levels, or download them for local analysis on the **Log** tab.
## ⑤ Task/Alert Setting Area
-Click the  icon at the top of the page, which displays the task settings (not modifiable) and alert settings, you can set the alert rules:
+Click the  icon at the top of the page, which displays the task settings (not modifiable) and alert settings, you can set the alert rules:
* Task running error alert
* Notice of full completion of tasks
diff --git a/docs/data-transformation/README.md b/docs/data-transformation/README.md
new file mode 100644
index 00000000..81dbcbdd
--- /dev/null
+++ b/docs/data-transformation/README.md
@@ -0,0 +1,21 @@
+# Data Transformation
+
+TapData's Data Transformation capabilities enable you to build sophisticated data processing pipelines that go beyond simple replication. Whether you're creating real-time materialized views, merging multiple data sources, or applying complex business logic, our transformation tools provide the flexibility and performance you need.
+
+## Getting Started
+
+* **Data Processing Tasks**
+Create comprehensive [data transformation pipelines](create-task.md) with processing nodes for filtering, field modifications, data type conversions, and custom business logic. Perfect for ETL workflows and data preparation.
+
+
+* **Incremental Materialized Views (IMV)**
+Build [real-time, high-performance views](create-views/README.md) that automatically update as source data changes. Combine data from multiple sources to accelerate insights and decision-making.
+
+
+:::tip Enterprise Data Management
+For organizations seeking structured data platform governance, explore TapData's [Operational Data Hub (ODH)](../operational-data-hub/README.md) framework. ODH provides a layered architecture approach with dedicated zones for data ingestion (FDM), transformation (MDM), and delivery (ADM), offering proven patterns for enterprise data management and team collaboration.
+:::
+
+import DocCardList from '@theme/DocCardList';
+
+
\ No newline at end of file
diff --git a/docs/user-guide/data-development/create-task.md b/docs/data-transformation/create-task.md
similarity index 94%
rename from docs/user-guide/data-development/create-task.md
rename to docs/data-transformation/create-task.md
index 866c4322..ade14053 100644
--- a/docs/user-guide/data-development/create-task.md
+++ b/docs/data-transformation/create-task.md
@@ -1,8 +1,5 @@
# Create a Data Transformation Task
-import Content from '../../reuse-content/_all-features.md';
-
-
In TapData, data transformation tasks provide the capability to incorporate processing nodes between source and target data nodes. These processing nodes serve as valuable tools for efficiently carrying out data processing tasks, such as merging multiple tables, splitting data, and adding or removing fields.
@@ -12,7 +9,7 @@ The following article outlines the step-by-step process of creating data transfo
As an example, we will show how to change the **birthdate** field's data type from **STRING** to **DATE** in the table structure without modifying the source table (**customer** table) and simultaneously filter out users born after **1991-01-01**, a data transformation task is created. The resulting table, **customer_new**, will reflect the updated table structure and filtered data.
-1. [Log in to TapData Platform](../log-in.md).
+1. Log in to TapData Platform.
2. In the left navigation panel, click **Data Transformation**.
@@ -30,12 +27,12 @@ As an example, we will show how to change the **birthdate** field's data type fr
6. Connect the aforementioned four nodes in the order of data flow as shown below.
- 
+ 
7. Follow the instructions below to configure each node in sequence.
1. On the canvas, click the source node on the far left and complete the parameter configuration in the right panel according to the following instructions.
- 
+ 
* **Basic Settings**
@@ -66,11 +63,11 @@ As an example, we will show how to change the **birthdate** field's data type fr
By default, if the node's average processing time is equal to or greater than 5 seconds for a consecutive minute, system and email notifications are sent. You can also adjust the rules or turn off alerts according to business needs.
2. Click on the **Type Modification** node, and then in the right panel, modify the type of the **birthdate** field to **Date**.
- 
+ 
3. Click on the Row Filter node and complete the parameter configuration in the right panel according to the following instructions.
- 
+ 
* **Action**: Choose **Retain Matching Data**.
   * **Conditional Expression**: Enter the data matching expression, in this case `record.birthdate >= '1990-01-01'`. The supported symbols are:
@@ -81,7 +78,7 @@ As an example, we will show how to change the **birthdate** field's data type fr
4. Click the final target data node and complete the parameter configuration in the right panel according to the following instructions.
- 
+ 
* **Basic Settings**
* **Node Name**: Defaults to the connection name, but you can set a name with business meaning.
@@ -100,7 +97,7 @@ As an example, we will show how to change the **birthdate** field's data type fr
* **Alert Settings**
By default, if the node's average processing time is equal to or greater than 5 seconds for a consecutive minute, system and email notifications are sent. You can also adjust the rules or turn off alerts according to business needs.
-8. (Optional) Click the  icon above to configure the task properties.
+8. (Optional) Click the  icon above to configure the task properties.
* **Task name**: Fill in a name that has business significance.
* **Sync type**: You have the option to select **full + incremental synchronization**, or you can choose to perform an **initial sync** and use Change Data Capture (**CDC**) separately. In real-time data synchronization scenarios, the combination of full and incremental data copying is often used to transfer existing data from the source database to the target database.
@@ -109,7 +106,7 @@ As an example, we will show how to change the **birthdate** field's data type fr
9. Click **Save** or **Start**. As shown in the following figure, once the task starts successfully, you can view its RPS (Records Per Second), delay, task events, and other information.
- 
+ 
:::tip
diff --git a/docs/data-transformation/create-views/README.md b/docs/data-transformation/create-views/README.md
new file mode 100644
index 00000000..18b45146
--- /dev/null
+++ b/docs/data-transformation/create-views/README.md
@@ -0,0 +1,13 @@
+# Create Incremental Materialized Views
+
+This section shows you how to use TapData to build **Incremental Materialized Views (IMV)**—real-time, high-performance analytics tables that combine data from multiple sources to speed up your insights and decision-making.
+
+Before diving into view creation, we recommend starting with the [Overview](overview.md) to understand the design principles, data structure options, and use cases for Incremental Materialized Views.
+
+:::tip Learn About Operational Data Hub (ODH)
+For enterprise-grade data management, explore TapData's [Operational Data Hub](../../operational-data-hub/plan-data-platform.md) framework. ODH provides a structured approach with dedicated layers for data ingestion (FDM), transformation (MDM), and delivery (ADM), offering valuable insights into modern data architecture and team collaboration patterns.
+:::
+
+import DocCardList from '@theme/DocCardList';
+
+
diff --git a/docs/data-transformation/create-views/overview.md b/docs/data-transformation/create-views/overview.md
new file mode 100644
index 00000000..a2687eb8
--- /dev/null
+++ b/docs/data-transformation/create-views/overview.md
@@ -0,0 +1,50 @@
+# Overview
+
+This section will help you design real-time Incremental Materialized Views (IMVs) in TapData—tailored to your downstream use cases. You'll learn how to choose the right data structure for your needs, whether that's a flat table, embedded documents, or nested arrays, ensuring your data is clean, consistent, and ready to drive modern analytics and APIs.
+
+## Background
+
+In the **Getting Started** section, you learned how to quickly join your orders and users tables to build a simple, real-time view that helps analysts identify high-value customers and power marketing activities like coupons and loyalty programs.
+
+However, in real-world e-commerce environments, data requirements are rarely so simple. Different teams often need richer, more flexible insights from the same dataset:
+
+- **Analysts and marketing teams** want to segment customers by value, region, or product preferences without relying on complex SQL.
+- **BI and reporting teams** need detailed transaction-level data to analyze order quantities, categories, and product bundles.
+- **Developers and data engineers** are looking for ways to simplify ETL pipelines and API integrations, while keeping production systems performant.
+
+To support these diverse use cases, we’ll extend our earlier example and design a more advanced, business-ready view that combines multiple data sources:
+
+- Embedded user profiles to preserve full customer details in each record.
+- Nested arrays of order items to capture all products in a single purchase.
+- Flattened product attributes so each order line includes product names, categories, and pricing.
+
+import TapDataAnimation from '@site/src/components/Animation/TapDataAnimation';
+
+
+
+
+## Ways to Add Related Fields
+
+When designing your Incremental Materialized View, you can choose how data from related tables is included in your main record. TapData lets you customize this structure to match your analysis needs and downstream use cases:
+
+- **Flatten**: Pull selected columns directly into the top level of the main table. Ideal for simple attributes you want to filter or group by (e.g., user_level, country).
+- **Embedded Document**: Include all or selected fields as a nested object. Useful for preserving detailed context, such as a user profile with signup date, tier history, or calculated metrics.
+- **Embedded Array**: Aggregate multiple related records as an array of objects. Perfect for one-to-many relationships like order items, each enriched with product details.
+
+By combining these methods, you can design a single view that is analysis-ready, API-friendly, and tailored to your business questions—all without complex joins or heavy ETL processes.
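
As a quick sketch, here is how the same related data ends up shaped under each option (plain Python dicts with illustrative values from the running e-commerce example; the actual merging is performed by TapData):

```python
# Illustrative source rows (field names follow the e-commerce example)
order = {"order_id": "o2001", "user_id": "u001", "order_amount": 759.97}
user = {"user_id": "u001", "user_level": "GOLD", "country": "USA"}
items = [
    {"order_id": "o2001", "product_id": "p101", "quantity": 1},
    {"order_id": "o2001", "product_id": "p102", "quantity": 2},
]

# Flatten: selected user columns are pulled up to the top level of the order
flattened = {**order, "user_level": user["user_level"], "country": user["country"]}

# Embedded Document: the whole user record is nested under a single field
embedded_doc = {**order, "user_info": user}

# Embedded Array: all matching child rows are collected as an array of objects
embedded_array = {
    **order,
    "product_items": [i for i in items if i["order_id"] == order["order_id"]],
}

print(flattened["user_level"])                 # GOLD, ready to group by
print(embedded_doc["user_info"]["country"])    # USA, full profile preserved
print(len(embedded_array["product_items"]))    # 2, one entry per order item
```

The shapes differ, but the underlying relationships are the same; pick whichever form your downstream queries read most naturally.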
+
+
+
+With TapData’s Incremental Materialized Views, you can automatically join and transform data across tables in real time—no manual pipelines required. This approach gives your teams a single, always-up-to-date view ready for analytics, dashboards, and APIs, without overloading operational databases.
+
+In this section, we’ll also share tips for optimizing performance and designing views that scale with your business needs.
+
+## How to Create Your View
+
+TapData offers multiple ways to design and build your IMVs, so you can choose the approach that best suits your needs and technical comfort level:
+
+- [Using IMV Guide](using-imv-guide.md): A step-by-step wizard for quickly creating even complex joins, with a streamlined setup that doesn't require extra processing nodes.
+- [Using Data Pipeline](using-data-pipeline-ui.md): A visual, flow-based interface that lets you define joins, choose fields, and insert transformation nodes for cleaning or enriching your data.
+- [Using TapFlow](using-tapflow.md): A code-friendly approach designed for developers and advanced users who want full control and automation via API or CLI.
+
+Choose the approach that works best for you and start building real-time, analysis-ready data views tailored to your business.
diff --git a/docs/data-transformation/create-views/using-data-pipeline-ui.md b/docs/data-transformation/create-views/using-data-pipeline-ui.md
new file mode 100644
index 00000000..e02fd41c
--- /dev/null
+++ b/docs/data-transformation/create-views/using-data-pipeline-ui.md
@@ -0,0 +1,157 @@
+# Build View with Data Pipeline
+
+Use TapData’s Data Pipeline to build real-time, incremental materialized views with flexible, visual workflows—perfect for adding rich processing steps and custom transformations.
+
+:::tip
+This guide expands on the example in [Design Incremental Materialized Views](overview.md) and shows you how to build it using the visual Data Pipeline.
+:::
+
+TapData’s Data Pipeline also supports rich processing nodes between your source and target. You can add steps for field renaming, row filtering, and more—all without writing complex SQL or maintaining separate ETL jobs.
+
+## Prerequisites
+
+Make sure you have already connected your **source MySQL database** and **target MongoDB database** in TapData.
+
+If you haven’t set them up yet, see [Connect Data Sources](../../getting-started/connect-data-source.md).
+
+## Procedure
+
+Follow these steps to design an advanced real-time view with nested user profiles and detailed order items.
+
+1. Log in to TapData platform.
+
+2. In the left navigation panel, go to **Data Transformation** and open **Data Pipeline**.
+
+3. Add source nodes to the canvas.
+
+ 1. Drag your source MySQL connection onto the canvas and choose **orders** as the main table.
+
+ 2. Repeat this step to add **users**, **order_items**, and **products** as separate nodes.
+
+ 
+
+4. Set up the join logic.
+
+ 1. Drag a Master-Slave Merge node onto the canvas.
+
+ 2. Connect the **orders** node and all other source nodes to this merge node to define their relationships.
+
+ 
+
+5. Embed the user profile.
+
+ 1. Drag the **users** node onto the **orders** node.
+
+ 
+
+   2. Choose **Document** as the embed type, then set the field path to `user_info` and map the join key to `user_id`.
+
+ 
+
+ :::tip
+
+ Document embedding nests the entire user record inside each order. This is ideal for marketing segmentation and analysis without separate lookups.
+
+ :::
+
+6. Include order items as an array.
+
+ 1. Drag the **order_items** node onto the **orders** node.
+
+ 2. Choose **Array** as the embed type, then set the field path to `product_items` and map the join key to `order_id`.
+
+ 
+
+ :::tip
+
+ Arrays allow you to store all order items as a single field in the order document—perfect for representing one-to-many relationships natively.
+
+ :::
+
+7. Enrich **order_items** with product details.
+
+   1. Drag the **products** node onto the **order_items** node.
+
+ 2. Choose **Document** or **Flatten** depending on your data structure.
+
+ 3. Map the join key to `product_id` so product details are included in each order item.
+
+      **Document** nests the product data as an object, while **Flatten** merges product fields directly into **order_items**—choose the style that fits your target schema.
+
+8. Configure the target output.
+
+ 1. Drag your target MongoDB connection onto the canvas.
+
+ 2. Connect the Master-Slave Merge node to the MongoDB target.
+
+ 3. Select the target collection or enter a custom name. TapData will create it automatically if it doesn’t exist.
+
+ 4. Specify the **update key** to enable upsert behavior and automatic index creation.
+
+ 
+
+ Optional settings include:
+
+ - **Initial load behavior**: Clear existing data or append to it.
+ - **Batch size** and **max wait time**: Fine-tune performance for large loads.
+ - **Multithreaded writes**: Configure for higher throughput in full or incremental sync.
+
+9. Click **Start** to launch the real-time materialized view.
+
+ Once running, monitor throughput, latency, and event processing stats in real time.
+
+ 
+
+
+
+## Verify Results
+
+Once your task is running, log in to your target MongoDB to explore the new view. Here’s an example document illustrating the nested structure:
+
+```javascript
+{
+ _id: ObjectId('6868d470d9b9cd512feb6b69'),
+ order_id: 'o2001',
+ order_amount: Decimal128('759.97'),
+ order_status: 'PAID',
+  order_time: ISODate('2025-01-02T10:00:00.000Z'),
+ payment_method: 'CREDIT_CARD',
+ user_id: 'u001',
+ product_items: [
+ {
+ quantity: 1,
+ item_id: 'i3001',
+ product_id: 'p101',
+ order_id: 'o2001',
+ category: 'Electronics',
+ product_name: 'Smartphone',
+ unit_price: Decimal128('699.99')
+ },
+ {
+ quantity: 2,
+ item_id: 'i3002',
+ product_id: 'p102',
+ order_id: 'o2001',
+ category: 'Accessories',
+ product_name: 'Phone Case',
+ unit_price: Decimal128('29.99')
+ }
+ ],
+ user_info: {
+ city: 'New York',
+ country: 'USA',
+    signup_time: ISODate('2024-12-20T12:00:00.000Z'),
+ user_id: 'u001',
+ user_level: 'GOLD',
+ user_name: 'Alice'
+ }
+}
+```
+
+This structure is analysis-ready, API-friendly, and tailored for real-time use. Analysts can easily filter and aggregate orders, marketing can segment by user attributes, and developers can serve complete order details in a single API response without expensive joins.
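
As a quick sanity check, you can recompute the order total from the embedded `product_items` and compare it with `order_amount`. A plain-Python sketch using the values from the example document above:

```python
# Values copied from the example document above
order = {
    "order_amount": 759.97,
    "product_items": [
        {"quantity": 1, "unit_price": 699.99},
        {"quantity": 2, "unit_price": 29.99},
    ],
}

# Recompute the total from the embedded line items
computed = sum(i["quantity"] * i["unit_price"] for i in order["product_items"])

assert round(computed, 2) == order["order_amount"]
print(f"order total OK: {computed:.2f}")
```

Because the line items live inside the order document, this kind of check needs no join back to the source tables.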
+
+## What’s Next?
+
+- **Monitor your task** to track throughput and latency in real time.
+- **Validate data accuracy** using built-in tools or source queries.
+- **Publish the view as an API** so other teams or systems can consume fresh, structured order data via REST or GraphQL.
\ No newline at end of file
diff --git a/docs/data-transformation/create-views/using-imv-guide.md b/docs/data-transformation/create-views/using-imv-guide.md
new file mode 100644
index 00000000..118626a6
--- /dev/null
+++ b/docs/data-transformation/create-views/using-imv-guide.md
@@ -0,0 +1,133 @@
+# Build View with IMV Guide
+
+
+
+Use TapData’s IMV Guide to design real-time, incremental materialized views that deliver analysis-ready data without complex SQL or heavy ETL. Ideal for BI dashboards, marketing analytics, and API integrations—all with a simple, no-code experience.
+
+:::tip
+
+This step-by-step guide builds on the real-world e-commerce scenario described in [Design Incremental Materialized Views](overview.md) and shows you how to implement it with the IMV Guide.
+
+:::
+
+## Prerequisites
+
+Make sure you have already connected your **source MySQL database** and **target MongoDB database** in TapData.
+
+If you haven’t set them up yet, see [Connect Data Sources](../../getting-started/connect-data-source.md).
+
+## Procedure
+
+Follow these steps to design your advanced real-time view with nested user profiles and detailed order items.
+
+1. Log in to TapData platform.
+
+2. In the left navigation panel, go to **Data Transformation**.
+
+3. Click **Build Materialized View** to open the configuration workspace.
+
+4. **Select your main source table.**
+   For this scenario, choose **orders** as your primary data source. This is the top-level table we’ll enrich with user and item details.
+
+ 
+
+5. Add User Profile as Embedded Document.
+ To bring in rich user details:
+
+ - Click **+ Add Field** and choose **Embedded Document**.
+ - Enter a field name (e.g., `user_info`) for the embedded object in your main document.
+ - In the field editor, choose the **users** table and set the join condition on `user_id`.
+
+ Once configured, you’ll see `user_info` added on the right with all user fields nested. This preserves the full user profile in the order document—ideal for marketing or segmentation.
+
+ 
+
+6. Combine Order Items with Product Details as an Embedded Array.
+
+ Now let’s link detailed line items:
+
+ 1. In the main **orders** section, click **+ Add Field** and choose **Embedded Array**.
+
+ 2. Enter a field name (e.g., `product_items`) for the array.
+
+ 3. In the editor, choose the **order_items** table and set the join on `order_id`.
+
+ 
+
+ This aggregates all related items for each order as an array—perfect for one-to-many detail.
+
+ 4. Still within the **order_items** node, click **+ Add Field** and choose **Flatten**.
+
+ 5. Select the **products** table and join on `product_id`.
+
+ 
+
+ This step enriches each item with product details like name, category, and unit price.
+
+ Once complete, the right-side preview shows a fully nested order record with `user_info` and an array of enriched `product_items`.
+
+7. Configure Target Output.
+
+ 1. Click **+ Write Target** in the top right.
+
+ 2. Select your MongoDB connection, then enter a target collection name (e.g., `orders_advanced_imv`).
+
+ The preview on the right shows field mappings and types for the target collection.
+
+8. Click **Start** in the top right to launch your real-time materialized view.
+
+ After launch, you’ll see the task in the monitoring page with stats like records per second (RPS), latency, and processed event counts.
+
+ 
+
+## Verify Results
+
+Once your task is running, log in to your target MongoDB to explore the new view. Here’s an example document illustrating the nested structure:
+
+```javascript
+{
+ _id: ObjectId('6868d470d9b9cd512feb6b69'),
+ order_id: 'o2001',
+ order_amount: Decimal128('759.97'),
+ order_status: 'PAID',
+  order_time: ISODate('2025-01-02T10:00:00.000Z'),
+ payment_method: 'CREDIT_CARD',
+ user_id: 'u001',
+ product_items: [
+ {
+ quantity: 1,
+ item_id: 'i3001',
+ product_id: 'p101',
+ order_id: 'o2001',
+ category: 'Electronics',
+ product_name: 'Smartphone',
+ unit_price: Decimal128('699.99')
+ },
+ {
+ quantity: 2,
+ item_id: 'i3002',
+ product_id: 'p102',
+ order_id: 'o2001',
+ category: 'Accessories',
+ product_name: 'Phone Case',
+ unit_price: Decimal128('29.99')
+ }
+ ],
+ user_info: {
+ city: 'New York',
+ country: 'USA',
+    signup_time: ISODate('2024-12-20T12:00:00.000Z'),
+ user_id: 'u001',
+ user_level: 'GOLD',
+ user_name: 'Alice'
+ }
+}
+```
+
+This structure is analysis-ready, API-friendly, and tailored for real-time use. Analysts can easily filter and aggregate orders, marketing can segment by user attributes, and developers can serve complete order details in a single API response without expensive joins.
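
Because user attributes travel with each order, segmentation needs no join at query time. A plain-Python sketch over documents shaped like the example above (values are illustrative):

```python
# Two documents shaped like the example output (values are illustrative)
orders = [
    {"order_id": "o2001", "order_status": "PAID",
     "user_info": {"user_level": "GOLD", "country": "USA"}},
    {"order_id": "o2002", "order_status": "PAID",
     "user_info": {"user_level": "SILVER", "country": "USA"}},
]

# Segment in a single pass: user attributes are embedded in the order
gold_paid = [o["order_id"] for o in orders
             if o["order_status"] == "PAID"
             and o["user_info"]["user_level"] == "GOLD"]

print(gold_paid)  # ['o2001']
```

The same filter expressed against the normalized source tables would require a join between orders and users on every query.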
+
+## What’s Next?
+
+- **Monitor your task** to track throughput and latency in real time.
+- **Validate data accuracy** using built-in tools or source queries.
+- **Publish the view as an API** so other teams or systems can consume fresh, structured order data via REST or GraphQL.
\ No newline at end of file
diff --git a/docs/data-transformation/create-views/using-tapflow.md b/docs/data-transformation/create-views/using-tapflow.md
new file mode 100644
index 00000000..5a2db6e3
--- /dev/null
+++ b/docs/data-transformation/create-views/using-tapflow.md
@@ -0,0 +1,250 @@
+# Build View with TapFlow
+
+Use TapData’s TapFlow to build real-time, incremental materialized views with full code-level control—ideal for automation, advanced configurations, and developer workflows.
+
+:::tip
+This approach extends the scenario in [Design Incremental Materialized Views](overview.md) and shows you how to implement it programmatically with TapFlow.
+
+:::
+
+```mdx-code-block
+import Tabs from '@theme/Tabs';
+import TabItem from '@theme/TabItem';
+```
+
+## Prerequisites
+
+- Install TapShell and complete initialization.
+
+- Make sure you've already connected your **source MySQL database** and **target MongoDB database** in TapData.
+
+ If you need help setting up connections, see [Connect Data Sources](../../getting-started/connect-data-source.md) for detailed instructions.
+
+## Procedure
+
+```mdx-code-block
+
+
+```
+
+You can build the real-time, incremental materialized view directly in **TapShell** using these steps:
+
+1. Launch the TapShell interactive CLI:
+
+ ```bash
+ tap
+ ```
+
+2. Define the flow and set **orders** as the main source table:
+
+ ```python
+ # Create the Flow and set "orders" as the root table
+ orderView = Flow("Order_Advanced_View") \
+ .read_from("MySQL_Demo.orders")
+ ```
+
+3. Add **lookup** steps to join related tables. Each `.lookup()` call merges data into the main stream at the specified path using the provided join key.
+
+ ```python
+ # Embed 'users' as a nested document
+ orderView.lookup(
+ "MySQL_Demo.users",
+ path="user_info",
+ type="object",
+ relation=[["user_id", "user_id"]]
+ )
+
+ # Add 'order_items' as an array
+ orderView.lookup(
+ "MySQL_Demo.order_items",
+ path="product_items",
+ type="array",
+ relation=[["order_id", "order_id"]]
+ )
+
+ # Flatten 'products' into each order_item
+ orderView.lookup(
+ "MySQL_Demo.products",
+ path="product_items.product",
+ type="object",
+ relation=[["product_items.product_id", "product_id"]]
+ )
+ ```
+
+   **Understanding `type` and `path` in `.lookup()`**
+
+   These parameters control **how** related data is merged:
+
+ - **type="object"** – embeds the joined record as a nested document at `path`. Ideal for one-to-one enrichments like adding user profiles inside orders.
+ - **type="array"** – collects multiple matching records as an array of documents at `path`. Perfect for one-to-many relationships, such as order items.
+
+ :::tip
+
+ To flatten joined fields directly into the parent document—so they appear at the same level without any nesting—use `type="object"` and simply leave `path` empty (`path=""`). This creates a wide-table schema that merges all fields into the root level.
+
+ :::
+
+
+
+4. Specify the target MongoDB collection to store the resulting view:
+
+ ```python
+ # Define the MongoDB target collection
+ orderView.write_to("MongoDB_Demo.orders_advanced_imv")
+ # Save the task
+ orderView.save()
+ ```
+
+5. Start the flow and monitor its status:
+
+ ```python
+ orderView.start()
+ status orderView
+ ```
+
+ Example output:
+
+ ```
+ job current status is: running, qps is: 3521.2, total rows: 99441, delay is: 332ms
+ ```
+
+
+
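
The `type` and `path` options described in step 3 can be sketched in plain Python; the dicts below use illustrative values, and the real merge is performed by the TapData engine:

```python
# Illustrative rows; the real merge is performed by the TapData engine
order = {"order_id": "o2001", "user_id": "u001"}
user = {"user_id": "u001", "user_name": "Alice"}
items = [
    {"item_id": "i3001", "product_id": "p101"},
    {"item_id": "i3002", "product_id": "p102"},
]
products = {
    "p101": {"product_name": "Smartphone"},
    "p102": {"product_name": "Phone Case"},
}

# type="object", path="user_info": the matching record becomes a nested document
order["user_info"] = user

# type="array", path="product_items": all matching records become an array
order["product_items"] = items

# type="object", path="product_items.product": nested under each array element
for item in order["product_items"]:
    item["product"] = products[item["product_id"]]

print(order["product_items"][0]["product"]["product_name"])  # Smartphone
```
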
+
+Alternatively, you can define the same real-time materialized view as a standalone **Python script** that you can run with:
+
+```bash
+python real_time_order_view.py
+```
+
+Here's the complete example:
+
+```python title="real_time_order_view.py"
+"""
+Example: Building a real-time order-wide view (single view) with TapFlow Python SDK
+
+This script reads from multiple MySQL source tables in the 'MySQL_Demo' connection,
+joins them into a nested order document, and writes the result to a MongoDB collection.
+"""
+
+from tapflow.lib import *
+from tapflow.cli.cli import init
+
+# Initialize TapFlow configuration
+init()
+
+# Create a new data flow task
+orderView = Flow("Order_SingleView_Sync")
+
+# Set the main source table for orders
+orderView.read_from("MySQL_Demo.orders")
+
+# Add user profile as a nested document
+orderView.lookup(
+ "MySQL_Demo.users",
+ path="user_info",
+ type="object",
+ relation=[["user_id", "user_id"]]
+)
+
+# Add order items as an array
+orderView.lookup(
+ "MySQL_Demo.order_items",
+ path="product_items",
+ type="array",
+ relation=[["order_id", "order_id"]]
+)
+
+# Flatten product details into each order_item
+orderView.lookup(
+ "MySQL_Demo.products",
+ path="product_items.product",
+ type="object",
+ relation=[["product_items.product_id", "product_id"]]
+)
+
+# Define the MongoDB target collection
+orderView.write_to("MongoDB_Demo.orderSingleView")
+
+# Save and start the flow
+orderView.save()
+orderView.start()
+print("Real-time wide-table task started.")
+
+# Monitor task status
+import time  # imported here so this snippet is self-contained
+
+while True:
+    current_status = orderView.status()
+    if current_status == "running":
+        print(f"Task status: {current_status}")
+        break
+    elif current_status == "error":
+        print("Task failed to start. Please check your configuration or logs.")
+        break
+    time.sleep(1)  # wait before polling again instead of busy-looping
+
+**Example Output**
+
+When you run the script, you should see logs similar to:
+
+```
+Real-time wide-table task started.
+Task status: running
+```
+
+
+
+
+
+
+## Verify Results
+
+Once your task is running, log in to your target MongoDB to explore the new view. Here’s an example document illustrating the nested structure:
+
+```javascript
+{
+ _id: ObjectId('6868d470d9b9cd512feb6b69'),
+ order_id: 'o2001',
+ order_amount: Decimal128('759.97'),
+ order_status: 'PAID',
+ order_time: 2025-01-02T10:00:00.000Z,
+ payment_method: 'CREDIT_CARD',
+ user_id: 'u001',
+ product_items: [
+ {
+ quantity: 1,
+ item_id: 'i3001',
+ product_id: 'p101',
+ order_id: 'o2001',
+ category: 'Electronics',
+ product_name: 'Smartphone',
+ unit_price: Decimal128('699.99')
+ },
+ {
+ quantity: 2,
+ item_id: 'i3002',
+ product_id: 'p102',
+ order_id: 'o2001',
+ category: 'Accessories',
+ product_name: 'Phone Case',
+ unit_price: Decimal128('29.99')
+ }
+ ],
+ user_info: {
+ city: 'New York',
+ country: 'USA',
+ signup_time: 2024-12-20T12:00:00.000Z,
+ user_id: 'u001',
+ user_level: 'GOLD',
+ user_name: 'Alice'
+ }
+}
+```
+
+This structure is analysis-ready, API-friendly, and tailored for real-time use. Analysts can easily filter and aggregate orders, marketing can segment by user attributes, and developers can serve complete order details in a single API response without expensive joins.
+
+## What’s Next?
+
+- **Monitor your task** to track throughput and latency in real time.
+- **Validate data accuracy** using built-in tools or source queries.
+- **Publish the view as an API** so other teams or systems can consume fresh, structured order data via REST or GraphQL.
+
+
+
diff --git a/docs/data-transformation/design-considerations.md b/docs/data-transformation/design-considerations.md
new file mode 100644
index 00000000..e6a9a158
--- /dev/null
+++ b/docs/data-transformation/design-considerations.md
@@ -0,0 +1 @@
+# Design Considerations
\ No newline at end of file
diff --git a/docs/user-guide/copy-data/manage-task.md b/docs/data-transformation/manage-task.md
similarity index 68%
rename from docs/user-guide/copy-data/manage-task.md
rename to docs/data-transformation/manage-task.md
index 2289051e..65b7f0a4 100644
--- a/docs/user-guide/copy-data/manage-task.md
+++ b/docs/data-transformation/manage-task.md
@@ -1,20 +1,16 @@
-# Manage Data Replication Task
+# Manage Data Transformation Task
-import Content from '../../reuse-content/_all-features.md';
+Once the data transformation task is created, you can manage it in the task list.
-
-
-Once the replication task is created, you can monitor and manage it in the task list.
-
-
+
| Operation | Description |
| ----------------- | ------------------------------------------------------------ |
-| **Set Tag** | Click the  icon at the top left of the task list to expand the tag information. You can quickly set tags by clicking and dragging the task name to the desired tag. |
+| **Set Tag** | Click the  icon at the top left of the task list to expand the tag information. You can quickly set tags by clicking and dragging the task name to the desired tag. |
| **Set category** | Choose the target task and categorize it based on the business perspective. |
| **Start**/**Stop** | After stopping the task, the next start will continue to replicate the data based on the last stopped incremental point in time. |
| **Edit** | Configure the task, including node settings, synchronized tables, task start schedule, and other information. Please note that the task cannot be altered during execution. |
-| **Monitor** | View the running progress, running logs, connections, history, synchronized content, and more. For more information, see [monitor data replication task](monitor-task.md). |
+| **Monitor** | View the running progress, running logs, connections, history, synchronized content, and more. For more information, see [monitor task](monitor-view-tasks.md). |
| **Copy** | Duplicate the selected task, creating a new task with the same configuration that you can modify and run independently of the original. |
| **Reset** | Clear the data synchronization progress of the task, and the next start will restart the data synchronization task. |
| **Delete** | Please note that once a task is deleted, it cannot be recovered. Please proceed with caution when deleting tasks. |
diff --git a/docs/user-guide/data-development/monitor-task.md b/docs/data-transformation/monitor-view-tasks.md
similarity index 84%
rename from docs/user-guide/data-development/monitor-task.md
rename to docs/data-transformation/monitor-view-tasks.md
index 2689096b..05769504 100644
--- a/docs/user-guide/data-development/monitor-task.md
+++ b/docs/data-transformation/monitor-view-tasks.md
@@ -1,8 +1,6 @@
# Monitor Data Transformation Task
-import Content from '../../reuse-content/_all-features.md';
-
Once the data transformation task is started, the page will automatically redirect to the task monitoring page. From there, you can monitor the task's operation details, such as the status of the Agent, data synchronization progress, task progress, alert settings, and other relevant information.
@@ -12,7 +10,7 @@ By clicking the **monitor** button on the task list page, you can access the mon
:::
-
+
@@ -44,7 +42,7 @@ Displaying basic information and key monitoring indicators of the task, includin
## ③ Node Information Display Area
-Hover your mouse pointer over a node to display key metrics for that node, and click the  icon in the bottom right corner of the node to see more details.
+Hover your mouse pointer over a node to display key metrics for that node, and click the  icon in the bottom right corner of the node to see more details.
- **Full Sync Progress**: The progress report on the full data synchronization.
- **Incremental Data Synchronization**: When you hover over the node, the floating window shows the incremental log collection time point as a relative time (engine time minus the node's incremental time point).
@@ -58,13 +56,13 @@ Hover your mouse pointer over a node to display key metrics for that node, and c
## ④ Task Log Display Area
-Click the  icon at the top of the page, then you can view the progress, logs, alert list, and associated task information for a task run. You can filter the logs using keywords, periods, and levels, or download them for local analysis on the **Log** tab.
+Click the  icon at the top of the page, then you can view the progress, logs, alert list, and associated task information for a task run. You can filter the logs using keywords, periods, and levels, or download them for local analysis on the **Log** tab.
## ⑤ Task/Alert Setting Area
-Click the  icon at the top of the page, which displays the task settings (not modifiable) and alert settings, you can set the alert rules:
+Click the  icon at the top of the page, which displays the task settings (not modifiable) and alert settings, you can set the alert rules:
* Task running error alert
* Notice of full completion of tasks
diff --git a/docs/user-guide/data-development/process-node.md b/docs/data-transformation/process-node.md
similarity index 85%
rename from docs/user-guide/data-development/process-node.md
rename to docs/data-transformation/process-node.md
index ab46e852..5e44df07 100644
--- a/docs/user-guide/data-development/process-node.md
+++ b/docs/data-transformation/process-node.md
@@ -1,9 +1,5 @@
-# Add Processing Node
-import Content from '../../reuse-content/_all-features.md';
-
-
-
-TapData supports the addition of processing nodes to data replication or pipeline tasks, providing the flexibility to incorporate data filtering, field adjustments, and other processing operations as needed. This allows users to customize and enhance their data replication workflows based on specific requirements.
+# Supported Processing Nodes
+
+TapData supports the addition of processing nodes to data transformation tasks, providing the flexibility to incorporate data filtering, field adjustments, and other processing operations as needed. This allows you to customize and enhance your data transformation workflows based on specific requirements.
## Row Filter
@@ -14,7 +10,7 @@ The main usage of processing nodes in TapData is to filter table data, where use
* **Conditional expression**: An expression that sets a filter condition
* **Example expression**: Match individuals who are either men over 50 years old, or people under 30 years old with incomes of 10,000 or less. The filtering condition can be expressed as `(record.gender == 0 && record.age > 50) || (record.age < 30 && record.salary <= 10000)`.
-
+
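
The example condition can be sketched as a plain-Python predicate (field names follow the example above; the assumption that `gender == 0` denotes male comes from the sample expression, and TapData evaluates the real expression inside its own engine):

```python
# Plain-Python version of the example condition; assumes gender == 0 means male,
# as in the sample expression above
def keep(record):
    return (record["gender"] == 0 and record["age"] > 50) or (
        record["age"] < 30 and record["salary"] <= 10000
    )

rows = [
    {"gender": 0, "age": 55, "salary": 20000},  # man over 50: matched
    {"gender": 1, "age": 28, "salary": 8000},   # under 30, income <= 10,000: matched
    {"gender": 1, "age": 40, "salary": 8000},   # neither condition: not matched
]
print([keep(r) for r in rows])  # [True, True, False]
```
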
@@ -22,7 +18,7 @@ The main usage of processing nodes in TapData is to filter table data, where use
The **Add and Delete Fields** node can be added to the canvas and configured after connecting it to a data node. It can add new fields or delete existing ones; a deleted field is not passed to the next node. You can also adjust the field order.
-
+
@@ -30,7 +26,7 @@ The **Add and Delete Fields** node can be added to the canvas and its parameters
To rename or convert the case of a field, add the **Field Rename** node to the canvas, connect it to the data node in the desired processing order, and configure the node parameters accordingly.
-
+
@@ -38,7 +34,7 @@ To rename or convert the case of a field, add the **Field Rename** node to the c
To assign a value to a field by performing calculations between fields, add the **Field Calculation** node to the canvas. Connect the node to the data node in the desired processing order and configure the calculation rules for the target field using JavaScript (JS) expressions.
-
+
@@ -46,7 +42,7 @@ To assign a value to a field by performing calculations between fields, add the
The **Type Modification** node can be used to adjust the data type of a field.
-
+
@@ -56,14 +52,14 @@ In big data processing and analysis, merging and transforming data is a pivotal
:::tip
-- When using the Master Slave Merge, it's essential to [upgrade the Agent instance](../manage-agent.md) to version 3.5.1. Additionally, the target database should be either a self-deployed MongoDB or MongoDB Atlas.
+- When using the Master Slave Merge, it's essential to upgrade TapData to version 3.5.1. Additionally, the target database should be either a self-deployed MongoDB or MongoDB Atlas.
- Tables participating in the merge must contain one or more fields that form a logical unique identifier, ensuring data uniqueness.
:::
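The shape of the merged output can be sketched as follows, using the `lineorder`/`date` tables from the procedure below. This is only an illustration of the resulting document shape under assumed sample data; TapData's node maintains this merge incrementally in MongoDB.

```python
def master_slave_merge(masters, slaves, join_key, embed_as):
    """Embed matching slave rows into each master row under `embed_as`,
    keyed on `join_key`. Illustrative sketch of the merged shape only."""
    by_key = {}
    for s in slaves:
        by_key.setdefault(s[join_key], []).append(s)
    merged = []
    for m in masters:
        doc = dict(m)
        doc[embed_as] = by_key.get(m[join_key], [])
        merged.append(doc)
    return merged

# Assumed sample data for illustration.
lineorder = [{"orderkey": 1, "orderdate": 19940101}]
dates = [{"orderdate": 19940101, "weekday": "Saturday"}]
result = master_slave_merge(lineorder, dates, "orderdate", "date")
```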
**Procedure**:
-1. [Log in to TapData Platform](../log-in.md).
+1. Log in to the TapData platform.
2. On the left navigation bar, click **Data Transformation**.
@@ -71,13 +67,13 @@ In big data processing and analysis, merging and transforming data is a pivotal
4. Drag and drop the data sources you want to merge from the left side of the page to the right canvas. Then, drag the **Master Slave Merge** node from the bottom-left corner and connect them in the sequence shown below.
- 
+ 
5. Click on each of the data sources you want to merge sequentially and select the tables to be merged (**lineorder** / **date**) from the panel on the right.
6. Click the **Master Slave Merge** node, drag and drop the `date` table into the `lineorder` table to signify their relationship. Subsequently, you can view the merged table structure.
- 
+ 
:::tip
@@ -89,7 +85,7 @@ In big data processing and analysis, merging and transforming data is a pivotal
8. Click the data source you intend to store the merged table in, then, in the right panel, select a target table or enter a table name for TapData to create automatically. After setting up, specify the update condition.
- 
+ 
9. After confirming the configurations are correct, click **Start**.
@@ -111,13 +107,13 @@ The **Union** node in TapData merges multiple tables with the same or similar st
Assume that we want to merge (Union) the **student1** and **student2** tables, which share the same structure, into one table, and store the result in the **student_merge** table. The table structures and data are as follows:
-
+
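Conceptually, a Union appends rows from tables with the same or similar structure, padding any columns missing from one side. A minimal sketch under assumed sample data (not TapData internals):

```python
def union(*tables):
    """Append rows from tables with the same or similar structure.
    Columns missing from a given table are filled with None,
    mirroring how an append merge pads unmatched fields. Sketch only."""
    columns = []
    for table in tables:
        for col in table[0].keys():
            if col not in columns:
                columns.append(col)
    merged = []
    for table in tables:
        for row in table:
            merged.append({col: row.get(col) for col in columns})
    return merged

# Assumed sample rows for illustration.
student1 = [{"id": 1, "name": "Ada", "class": "A"}]
student2 = [{"id": 2, "name": "Bob"}]  # no "class" column
student_merge = union(student1, student2)
```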
**Operation**:
-1. [Log in to TapData Platform](../log-in.md).
+1. Log in to the TapData platform.
2. In the left navigation bar, click **Data Transformation**.
@@ -125,7 +121,7 @@ Assume that we want to merge(Union) **student1** and **student2** tables with th
4. Drag the desired data source from the left side of the page and place it onto the right canvas. Then, locate the **Union** node at the bottom left corner of the page and drag it onto the canvas. Finally, connect the data source node to the Union node to perform the append merge operation.
- 
+ 
5. Click the data source you want to include in the append merge. In the panel on the right side of the page, select the table to merge: **student1** or **student2**.
@@ -141,13 +137,13 @@ Assume that we want to merge(Union) **student1** and **student2** tables with th
:::
- 
+ 
9. After confirming that the configuration is correct, click **Start** to initiate the task.
After the operation is completed, you can observe the performance of the task on the current page. This includes metrics such as RPS (Records Per Second), delay, task time statistics, and more.
- 
+ 
@@ -176,13 +172,9 @@ mysql> select * from student_merge;
## Join
-import Content4 from '../../reuse-content/_enterprise-features.md';
-
-
-
The **Join Node** is used to configure joins between tables, supporting **Left Join** operations. You can simply select the relevant fields to perform the join and merge data from two tables.
-
+
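The Left Join behavior can be sketched as follows: every left-table row is kept, and matching right-table fields are merged in when the join field matches (assumed sample data; not TapData's join engine):

```python
def left_join(left, right, on):
    """Sketch of a Left Join on field `on`: every left row is kept;
    fields from the first matching right row are merged in when present."""
    index = {}
    for r in right:
        index.setdefault(r[on], r)
    joined = []
    for l in left:
        row = dict(index.get(l[on], {}))
        row.update(l)  # left-side values win on overlapping fields
        joined.append(row)
    return joined

# Assumed sample rows for illustration.
orders = [{"cust_id": 1, "amount": 50}, {"cust_id": 2, "amount": 75}]
customers = [{"cust_id": 1, "name": "Ada"}]
rows = left_join(orders, customers, "cust_id")
```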
:::tip
@@ -233,7 +225,7 @@ Currently, Python processing node does not support custom dependency packages an
:::
-
+
The Python processing node supports Python 2.7.3. The supported third-party packages include **requests-2.2.1**, **PyYAML-3.13**, and **setuptools-44.0.0**. The content description for `context` in the above image is as follows:
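As a rough sketch, a node script transforms each incoming record and returns it. The entry-point name, signature, and `context` fields below are assumptions for illustration; only the fact that the script ends with `return record` is shown in this document.

```python
# Hypothetical shape of a Python node script (the real entry point and
# the contents of `context` may differ; illustration only).
def process(record, context):
    # Drop a sensitive field and tag the record before passing it on.
    record.pop("ssn", None)
    record["processed"] = True
    return record

out = process({"id": 1, "ssn": "123-45-6789"}, context={})
```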
@@ -303,15 +295,11 @@ return record
## Unwind
-import Content2 from '../../reuse-content/_enterprise-features.md';
-
-
-
**Unwind** is specifically designed to handle array contents in JSON data structures, efficiently "unwinding" each element in an array and converting them into independent data rows. This approach is particularly suitable for scenarios that require deep analysis of array data, such as data normalization, personalized report generation, data transformation, and data quality cleaning. Additionally, when the target system or application does not support array formats, or for compatibility with other data structures, the Unwind node provides a ready-to-use solution, ensuring the efficiency and accuracy of the data processing and synchronization process.
Suppose there is a collection named `customer_json` that records the list of products purchased by each customer. To analyze the sales of each product in more detail, we want to convert the product list from an array format to separate data rows. In this way, each product will have a corresponding customer purchase record. To achieve this requirement, we can add an **Unwind** node when configuring the data transformation task. The node configuration example is as follows.
-
+
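The unwinding behavior can be sketched as follows. The `products` field name and sample data are assumptions for illustration; the doc's `customer_json` collection stores a product list per customer.

```python
def unwind(records, array_field):
    """Expand each element of records[array_field] into its own row,
    copying the remaining fields -- a sketch of Unwind semantics."""
    rows = []
    for record in records:
        base = {k: v for k, v in record.items() if k != array_field}
        for element in record.get(array_field, []):
            rows.append({**base, array_field: element})
    return rows

# Assumed sample document for illustration.
customers = [{"customer_id": "C001", "products": ["laptop", "mouse"]}]
flat = unwind(customers, "products")
# Each purchased product now has its own customer purchase record.
```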
:::tip
@@ -325,23 +313,20 @@ In real-time data integration and synchronization processes, capturing and synch
When the data source lacks complete CDC support, or permission controls block access to incremental logs, we can add a Time Field Injection node to the synchronization pipeline. This node automatically stamps the data read from the source table with timestamp information. Subsequently, in the target table's configuration, this DATETIME field can be selected for polling-based incremental data retrieval, further enhancing the flexibility of real-time data acquisition.
-
+
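Conceptually, the node stamps every record it reads with the current time, and the target side polls on that DATETIME field for increments. A rough sketch (the field name `_injected_ts` is made up for illustration):

```python
from datetime import datetime, timezone

def inject_time_field(record, field="_injected_ts"):
    """Stamp a record with a read-time timestamp so the target can
    poll on this DATETIME field for incremental rows. Sketch only."""
    record[field] = datetime.now(timezone.utc)
    return record

row = inject_time_field({"order_id": 1001})
```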
## JS Processing
-import Content3 from '../../reuse-content/_enterprise-features.md';
-
-
Support is provided for data processing through JavaScript or Java code. When writing the code, it is important to ensure that the source node and the target node are connected, enabling seamless data processing between the two nodes.
-
+
### Model Declarations
For JS nodes, TapData deduces the node's model information through a trial run on sampled data. If the deduced model is inaccurate or the number of fields changes, you can define the field information explicitly with a model declaration.
-
+
In a data transformation task, the model declaration supports the following methods:
@@ -391,6 +376,6 @@ Parameter Description
### JS Built-in Function Description
-* [Standard JS](../../appendix/standard-js.md): TapData supports processing and operating on data records, providing various functions and operations to manipulate and transform data. For example, you can use JavaScript or Java code to convert date strings to Date types. This allows you to perform date-related operations, comparisons, and formatting on the data records as needed. With this capability, you have flexibility in manipulating and transforming your data to meet your specific requirements.
-* [Enhanced JS (Beta)](../../appendix/enhanced-js.md): TapData supports making external calls in JavaScript code using standard built-in functions. This allows you to perform network requests, interact with databases, and perform other operations by utilizing the capabilities of JavaScript and its built-in functions.
+* [Standard JS](../appendix/standard-js.md): TapData supports processing and operating on data records, providing various functions and operations to manipulate and transform data. For example, you can use JavaScript or Java code to convert date strings to Date types. This allows you to perform date-related operations, comparisons, and formatting on the data records as needed. With this capability, you have flexibility in manipulating and transforming your data to meet your specific requirements.
+* [Enhanced JS (Beta)](../appendix/enhanced-js.md): TapData supports making external calls in JavaScript code using standard built-in functions. This allows you to perform network requests, interact with databases, and perform other operations by utilizing the capabilities of JavaScript and its built-in functions.
diff --git a/docs/experimental/README.md b/docs/experimental/README.md
new file mode 100644
index 00000000..8616beeb
--- /dev/null
+++ b/docs/experimental/README.md
@@ -0,0 +1,5 @@
+# Experimental Features
+
+import DocCardList from '@theme/DocCardList';
+
+
\ No newline at end of file
diff --git a/docs/mcp/README.md b/docs/experimental/mcp/README.md
similarity index 54%
rename from docs/mcp/README.md
rename to docs/experimental/mcp/README.md
index bbb608b3..f0006298 100644
--- a/docs/mcp/README.md
+++ b/docs/experimental/mcp/README.md
@@ -1,12 +1,7 @@
-# TapData MCP Server
-
-
-import Content from '../reuse-content/_enterprise-and-community-features.md';
-
-
+# AI Agent Integration via MCP (Preview)
:::tip
-The TapData MCP Server feature is currently in Beta. If you have any questions or feature requests during usage, feel free to contact the [Tapdata Support Team](../appendix/support.md).
+The TapData MCP Server feature is currently in Beta. If you have any questions or feature requests during usage, feel free to contact the [Tapdata Support Team](../../appendix/support.md).
:::
import DocCardList from '@theme/DocCardList';
diff --git a/docs/mcp/introduction.md b/docs/experimental/mcp/introduction.md
similarity index 72%
rename from docs/mcp/introduction.md
rename to docs/experimental/mcp/introduction.md
index 93b7ea37..48572229 100644
--- a/docs/mcp/introduction.md
+++ b/docs/experimental/mcp/introduction.md
@@ -1,9 +1,5 @@
# MCP Server Introduction
-import Content from '../reuse-content/_enterprise-and-community-features.md';
-
-
-
**MCP (Model Context Protocol)** is a protocol designed to provide structured business data in real time to AI models, enhancing their understanding of business context. With the **Tapdata MCP Server**, you can integrate, anonymize, and publish data from multiple heterogeneous systems as real-time contextual views that can be dynamically accessed by LLMs (Large Language Models) or AI Agents.
This solution is especially suitable for enterprise scenarios with high demands for data freshness and compliance, such as financial risk control, intelligent customer service, and personalized recommendation.
@@ -16,7 +12,7 @@ As digital transformation accelerates, more enterprises are leveraging AI models
- Enterprise data is typically scattered across systems like CRM, core banking, ERP, etc., creating data silos.
- Due to data security and compliance requirements, AI models are often prohibited from directly accessing raw databases.
-
+
To address these challenges, Tapdata provides the MCP service. It uses a standardized SSE protocol, along with real-time materialized views and data anonymization, to securely and efficiently deliver structured context to AI models. The model can access real-time business context **without** direct database connections, significantly improving inference accuracy and enabling trustworthy AI adoption in enterprises. This forms a unified AI Context Service Layer.
@@ -25,11 +21,11 @@ To address these challenges, Tapdata provides the MCP service. It uses a standar
- **Real-Time Acceleration with 100x Faster Query Performance**
Powered by TapData’s caching and [materialized view](../tapflow/tapflow-tutorial/build-real-time-wide-table.md) capabilities, MCP enables millisecond-level query responses without direct access to source systems—dramatically improving context retrieval and model inference speed.
- **Secure Access with Trusted, Controlled Context**
- Supports field-level [masking](../user-guide/advanced-settings/custom-node.md) and [role-based](../user-guide/manage-system/manage-role.md) permissions. Combined with real-time sync and incremental updates, it ensures AI models access only fresh, authorized data during inference.
+ Supports field-level [masking](../../operational-data-hub/advanced/custom-node.md) and [role-based](../../system-admin/manage-role.md) permissions. Combined with real-time sync and incremental updates, it ensures AI models access only fresh, authorized data during inference.
- **Connect 100+ Data Sources with a Single MCP**
- One MCP instance can connect to [over 100 heterogeneous data sources](../prerequisites/supported-databases.md), including major databases and SaaS platforms—breaking data silos and providing a unified foundation for context-aware AI.
+ One MCP instance can connect to [over 100 heterogeneous data sources](../../connectors/supported-data-sources.md), including major databases and SaaS platforms—breaking data silos and providing a unified foundation for context-aware AI.
- **Built for LLM Agents with Seamless Integration**
- Offers standardized SSE support and no-code [REST API](../user-guide/data-service/README.md) setup, fully compatible with tools like Cursor, Claude, and other popular agent frameworks—bridging enterprise data and AI with ease.
+ Offers standardized SSE support and no-code [REST API](../../publish-apis/README.md) setup, fully compatible with tools like Cursor, Claude, and other popular agent frameworks—bridging enterprise data and AI with ease.
## Learn More
diff --git a/docs/mcp/quick-start.md b/docs/experimental/mcp/quick-start.md
similarity index 91%
rename from docs/mcp/quick-start.md
rename to docs/experimental/mcp/quick-start.md
index 76a9328e..1edcb21e 100644
--- a/docs/mcp/quick-start.md
+++ b/docs/experimental/mcp/quick-start.md
@@ -1,39 +1,35 @@
# Quick Start
-import Content from '../reuse-content/_enterprise-and-community-features.md';
-
-
-
This guide walks you through enabling the [MCP (Model Context Protocol) service](introduction.md) in Tapdata and integrating it with AI agent tools that support the SSE protocol (e.g., Cursor). It enables real-time delivery of structured contextual data, helping large language models better understand business context.
## Prerequisites
-- Requires Tapdata Enterprise or Community Edition to be [deployed](../quick-start/install.md).
+- Requires Tapdata Enterprise or Community Edition to be [deployed](../../getting-started/install-and-setup/README.md).
- You have an AI model service or tool that supports MCP with SSE protocol (e.g., Cursor or Trae).
## Step 1: Set Up User and Get Access Code
To ensure platform security, you need to create and authorize a user account with permission to access data via the MCP protocol.
-1. [Log in to the Tapdata Platform](../user-guide/log-in.md).
+1. Log in to Tapdata Platform.
-2. Go to **System Settings** > **Role Management** and create a role named `mcp` (case-insensitive). See [Manage Roles](../user-guide/manage-system/manage-role.md).
+2. Go to **System Settings** > **Role Management** and create a role named `mcp` (case-insensitive). See [Manage Roles](../../system-admin/manage-role.md).
-3. Navigate to **System Settings** > **User Management**, and assign the `mcp` role to a user account. See [Manage Users](../user-guide/manage-system/manage-user.md).
+3. Navigate to **System Settings** > **User Management**, and assign the `mcp` role to a user account. See [Manage Users](../../system-admin/manage-user.md).
4. Log in to Tapdata using the authorized account, click your username in the top-right corner, and select **Account**. Copy the **Access Code**; you’ll need it in the next steps.
- 
+ 
## Step 2: Configure MCP Server in Agent Tool
Next, we’ll use **Cursor** as an example to show how to configure and connect to the Tapdata MCP Server:
-1. Open and log in to the Cursor app. Click the top-right  icon.
+1. Open and log in to the Cursor app. Click the top-right  icon.
2. Click **MCP** on the left menu, then click **Add new global MCP Server**.
- 
+ 
3. In the `mcp.json` config file that opens, add the Tapdata MCP service config using the structure below:
@@ -67,7 +63,7 @@ Next, we’ll use **Cursor** as an example to show how to configure and connect
4. Save and close the configuration file. Return to the MCP settings section. When the status light on the left turns green, the connection to Tapdata MCP Server is successful.
- 
+ 
5. When you interact with the AI model in Cursor, it will automatically fetch context data from Tapdata MCP Server. You can also guide the model via prompts to help it access data efficiently.
diff --git a/docs/tapflow/README.md b/docs/experimental/tapflow/README.md
similarity index 91%
rename from docs/tapflow/README.md
rename to docs/experimental/tapflow/README.md
index 403be85a..a1f3e260 100644
--- a/docs/tapflow/README.md
+++ b/docs/experimental/tapflow/README.md
@@ -1,4 +1,4 @@
-# TapFlow Developer Guide
+# TapFlow Developer Guide (Preview)
**Tap Flow** is an API framework for the TapData Live Data Platform, offering a programmable interface for tasks such as managing replication pipelines, building wide tables or materialized views, and performing general data integration. The framework currently includes a Python SDK and an interactive CLI for flexible development and management.
diff --git a/docs/tapflow/api-reference/README.md b/docs/experimental/tapflow/api-reference/README.md
similarity index 100%
rename from docs/tapflow/api-reference/README.md
rename to docs/experimental/tapflow/api-reference/README.md
diff --git a/docs/tapflow/api-reference/data-flow.md b/docs/experimental/tapflow/api-reference/data-flow.md
similarity index 98%
rename from docs/tapflow/api-reference/data-flow.md
rename to docs/experimental/tapflow/api-reference/data-flow.md
index daf9a808..7d69a9ea 100644
--- a/docs/tapflow/api-reference/data-flow.md
+++ b/docs/experimental/tapflow/api-reference/data-flow.md
@@ -271,7 +271,7 @@ source.enableDDL()
:::tip
-To enable DDL synchronization, the target database must also support DDL application. You can check the [list of supported data sources](../../prerequisites/supported-databases.md) for each database's support for DDL events. For more details, see [Best Practices for Handling Schema Changes](../../case-practices/best-practice/handle-schema-changes.md).
+To enable DDL synchronization, the target database must also support DDL application. You can check the [list of supported data sources](../../../connectors/supported-data-sources.md) for each database's support for DDL events. For more details, see [Best Practices for Handling Schema Changes](../../../case-practices/best-practice/handle-schema-changes.md).
:::
@@ -711,7 +711,7 @@ Here, `ecom_orders` is the main table, `order_payments` is the related table, jo
#### JS Processing
-**Node Description**: Embeds JavaScript code within the data flow task to allow custom processing of data from the source. For more details, refer to [Standard](../../appendix/standard-js.md) / [Enhanced](../../appendix/standard-js.md) JS built-in functions.
+**Node Description**: Embeds JavaScript code within the data flow task to allow custom processing of data from the source. For more details, refer to [Standard](../../../appendix/standard-js.md) / [Enhanced](../../../appendix/enhanced-js.md) JS built-in functions.
**Example**: The following example adds a confirmation status field to delivered orders in a JavaScript processing node. The processed records are then written to the `updatedCollection` collection in MongoDB.
diff --git a/docs/tapflow/api-reference/data-source.md b/docs/experimental/tapflow/api-reference/data-source.md
similarity index 92%
rename from docs/tapflow/api-reference/data-source.md
rename to docs/experimental/tapflow/api-reference/data-source.md
index 11da1469..3190c75d 100644
--- a/docs/tapflow/api-reference/data-source.md
+++ b/docs/experimental/tapflow/api-reference/data-source.md
@@ -1,10 +1,10 @@
# Data Source APIs
-This document explains how to create a new data source connection in TapFlow, which serves as the source and target for subsequent flow tasks. Additionally, you can manage data sources [through the interface](../../prerequisites/README.md) for convenience based on your preferences.
+This document explains how to create a new data source connection in TapFlow, which serves as the source and target for subsequent flow tasks. Additionally, you can manage data sources [through the interface](../../../connectors/README.md) for convenience based on your preferences.
:::tip
-TapFlow supports [dozens of common data sources](../../prerequisites/supported-databases.md). You can configure a data source by setting the `name`, `type`, and `config` parameters. Configuration details, required permissions, and parameter descriptions vary by data source type. For more information, see [Connect Data Sources](../../prerequisites/README.md).
+TapFlow supports [dozens of common data sources](../../../connectors/supported-data-sources.md). You can configure a data source by setting the `name`, `type`, and `config` parameters. Configuration details, required permissions, and parameter descriptions vary by data source type. For more information, see [Connect Data Sources](../../../connectors/README.md).
:::
diff --git a/docs/tapflow/introduction.md b/docs/experimental/tapflow/introduction.md
similarity index 74%
rename from docs/tapflow/introduction.md
rename to docs/experimental/tapflow/introduction.md
index 465263da..440f0b0d 100644
--- a/docs/tapflow/introduction.md
+++ b/docs/experimental/tapflow/introduction.md
@@ -4,10 +4,10 @@
## How It Works
-
+
Typical application scenarios for TapFlow involve the following main data processing steps:
-- **Data Collection**: Using Tap [Change Data Capture](../introduction/change-data-capture-mechanism.md) (CDC), it connects to and monitors update events (such as insert, update, and delete operations) in data sources, transforming them into data streams.
+- **Data Collection**: Using [Change Data Capture](../../introduction/change-data-capture-mechanism.md) (CDC), TapFlow connects to and monitors update events (such as insert, update, and delete operations) in data sources, transforming them into data streams.
- **Data Stream Processing**: Allows users to perform real-time processing on data streams via API or graphical interface, including complex operations like data merging, cleansing, and transformation.
- **Data Storage or Output**: The processed data streams can be saved to materialized views to support fast queries and application services or sent directly to downstream databases or message queues (such as Kafka) for rapid data transmission.
diff --git a/docs/tapflow/quick-start.md b/docs/experimental/tapflow/quick-start.md
similarity index 97%
rename from docs/tapflow/quick-start.md
rename to docs/experimental/tapflow/quick-start.md
index dd2fe5d6..a0d20c7b 100644
--- a/docs/tapflow/quick-start.md
+++ b/docs/experimental/tapflow/quick-start.md
@@ -89,7 +89,7 @@ import TabItem from '@theme/TabItem';
Register and log in to [TapData Cloud](https://cloud.tapdata.net/). Click your username in the upper-right corner and select **User Center** to obtain the Access Key and Secret Key.
- 
+ 
@@ -97,7 +97,7 @@ import TabItem from '@theme/TabItem';
Contact your administrator for the TapData Enterprise login address. After logging in, click your username in the upper-right corner and select **Personal Settings** to get the access code.
- 
+ 
@@ -178,7 +178,7 @@ Next, configure your data sources via TapShell. In this example, we’ll use MyS
:::tip
- - TapData supports [many popular data sources](../prerequisites/supported-databases.md), with slight configuration differences depending on the source. For more on permissions and parameters, see [Connecting Data Sources](../prerequisites/README.md).
+ - TapData supports [many popular data sources](../../connectors/supported-data-sources.md), with slight configuration differences depending on the source. For more on permissions and parameters, see [Connecting Data Sources](../../connectors/README.md).
- If you receive a “**load schema status: error**” error, it’s typically a permission or configuration issue. Retrying with the same name will overwrite the previous configuration with “**database MongoDB_ECommerce exists, will update its config**.”
:::
diff --git a/docs/tapflow/tapflow-tutorial/README.md b/docs/experimental/tapflow/tapflow-tutorial/README.md
similarity index 100%
rename from docs/tapflow/tapflow-tutorial/README.md
rename to docs/experimental/tapflow/tapflow-tutorial/README.md
diff --git a/docs/tapflow/tapflow-tutorial/build-real-time-wide-table.md b/docs/experimental/tapflow/tapflow-tutorial/build-real-time-wide-table.md
similarity index 98%
rename from docs/tapflow/tapflow-tutorial/build-real-time-wide-table.md
rename to docs/experimental/tapflow/tapflow-tutorial/build-real-time-wide-table.md
index 4d702d28..eb4d7005 100644
--- a/docs/tapflow/tapflow-tutorial/build-real-time-wide-table.md
+++ b/docs/experimental/tapflow/tapflow-tutorial/build-real-time-wide-table.md
@@ -15,7 +15,7 @@ As the business and data volume grows, the e-commerce company **XYZ** faces chal
- **Data Inconsistency**: Ensuring data consistency across multiple tables in high-concurrency scenarios is challenging, increasing the risk of inconsistencies.
- **Lack of Real-Time Updates**: Changes in order status or inventory are not reflected promptly, making it difficult for users to access up-to-date information.
-
+
To address these challenges, the company uses **TapFlow** to build a real-time wide table, consolidating order, customer, payment, and product data in MongoDB to support high-concurrency mobile API queries. Here’s an overview of the process:
@@ -23,7 +23,7 @@ To address these challenges, the company uses **TapFlow** to build a real-time w
2. **Wide Table Generation**: TapFlow's lookup feature combines data from multiple tables into a single wide table, embedding customer, product, and payment information into the order record for simplified queries.
3. **Real-Time Updates**: When source data changes, TapFlow synchronizes incremental updates to the MongoDB wide table, ensuring the query content is always up-to-date.
-
+
By using TapFlow, XYZ enables real-time synchronization and fast querying of order and inventory information. Operations staff can access the latest order data instantly, significantly improving the user experience. The wide table consolidates order, customer, product, and logistics information in MongoDB, reducing cross-table join resource consumption and improving query efficiency and system performance.
@@ -313,4 +313,4 @@ The following steps simulate data flow in a real-world business scenario by manu
## See also
-[Publish Data as API](../../user-guide/data-service/create-api-service.md)
\ No newline at end of file
+[Publish Data as API](../../../publish-apis/create-api-service.md)
\ No newline at end of file
diff --git a/docs/tapflow/tapflow-tutorial/expand-mogodb-array-to-mysql.md b/docs/experimental/tapflow/tapflow-tutorial/expand-mogodb-array-to-mysql.md
similarity index 98%
rename from docs/tapflow/tapflow-tutorial/expand-mogodb-array-to-mysql.md
rename to docs/experimental/tapflow/tapflow-tutorial/expand-mogodb-array-to-mysql.md
index 63bfb83d..67d88e22 100644
--- a/docs/tapflow/tapflow-tutorial/expand-mogodb-array-to-mysql.md
+++ b/docs/experimental/tapflow/tapflow-tutorial/expand-mogodb-array-to-mysql.md
@@ -35,7 +35,7 @@ In this example, an e-commerce company aims to conduct independent analysis of o
To support traditional relational analysis requirements, we propose a solution using TapFlow to expand MongoDB’s nested order arrays (e.g., `order_payments`) into independent rows in MySQL, ensuring that analytical teams can leverage SQL queries to generate efficient reports and perform data mining. The flow is as follows:
-
+
Additionally, TapFlow’s real-time sync capabilities ensure that MySQL reflects the latest data from MongoDB, helping the company improve query performance while maintaining data freshness, thus enabling the analytics team to access timely business data and make informed decisions.
@@ -103,7 +103,7 @@ Next, we demonstrate how to expand the `order_payments` array and rename fields
7. While the task runs, you can check the task status and statistics using the command `status MySQL_to_MongoDB_Order`.
- Additionally, you can [monitor the task status through the Web UI](../../user-guide/data-development/monitor-task).
+ Additionally, you can [monitor the task status through the Web UI](../../../data-transformation/monitor-view-tasks.md).
diff --git a/docs/tapflow/tapflow-tutorial/merge-inventory-to-mongodb.md b/docs/experimental/tapflow/tapflow-tutorial/merge-inventory-to-mongodb.md
similarity index 98%
rename from docs/tapflow/tapflow-tutorial/merge-inventory-to-mongodb.md
rename to docs/experimental/tapflow/tapflow-tutorial/merge-inventory-to-mongodb.md
index 6073f216..72c5ebaf 100644
--- a/docs/tapflow/tapflow-tutorial/merge-inventory-to-mongodb.md
+++ b/docs/experimental/tapflow/tapflow-tutorial/merge-inventory-to-mongodb.md
@@ -72,7 +72,7 @@ This guide demonstrates how to consolidate regional inventory data into MongoDB
2. Add JavaScript Processing Logic.
- Add a `FROM` field to identify the region (e.g., `usaWarehouse`) and standardize the `PK_CERT_NBR` field by extracting the substring before the `|` delimiter. If no delimiter exists, retain the original value. For more details, refer to [Standard JS](../../appendix/standard-js.md) and [Enhanced JS](../../appendix/standard-js.md).
+ Add a `FROM` field to identify the region (e.g., `usaWarehouse`) and standardize the `PK_CERT_NBR` field by extracting the substring before the `|` delimiter. If no delimiter exists, retain the original value. For more details, refer to [Standard JS](../../../appendix/standard-js.md) and [Enhanced JS](../../../appendix/enhanced-js.md).
```python
# Add region identifier "usaWarehouse" and standardize PK_CERT_NBR field
@@ -129,7 +129,7 @@ This guide demonstrates how to consolidate regional inventory data into MongoDB
inventoryFlow.start();
```
-6. (Optional) During task execution, use `status Inventory_Merge` to check the task's status and statistics or monitor the task via the [Web UI](../../user-guide/data-development/monitor-task).
+6. (Optional) During task execution, use `status Inventory_Merge` to check the task's status and statistics or monitor the task via the [Web UI](../../../data-transformation/monitor-view-tasks.md).
```python
# Example task status output
diff --git a/docs/tapflow/tapshell-reference.md b/docs/experimental/tapflow/tapshell-reference.md
similarity index 100%
rename from docs/tapflow/tapshell-reference.md
rename to docs/experimental/tapflow/tapshell-reference.md
diff --git a/docs/faq/agent-installation.md b/docs/faq/agent-installation.md
index b1087c74..94ae6402 100644
--- a/docs/faq/agent-installation.md
+++ b/docs/faq/agent-installation.md
@@ -1,9 +1,5 @@
# Deploy and Manage Agent
-import Content from '../reuse-content/_cloud-features.md';
-
-
-
This article lists common problems encountered by TapData Agent in deployment and operation.
## Deploy Agent
@@ -24,7 +20,7 @@ TapData Agent obtains data from the source, processes and transforms it, then se
The TapData Agent should be installed in the local network where the database is located since data flow is usually time-sensitive.
-See [Deploying TapData Agent](../installation/install-tapdata-agent.md) for more information.
+
### How many agents need to be deployed?
@@ -50,9 +46,7 @@ To change the Agent for a task when an exception occurs, you can edit the task a
When Oracle is in RAC mode with two nodes, the TapData Agent can be deployed on a separate device as long as it can establish a connection to the SCAN/VIP of the RAC environment. It is not necessary for the Agent to be deployed on the same device with Oracle.
-### What should I do if the test fails after installing Docker on Windows (64-bit)?
-The best way to [deploy Agent](../installation/install-tapdata-agent.md) is directly through Docker.
### How do I get the tokens needed for deployment again?
diff --git a/docs/faq/data-pipeline.md b/docs/faq/data-pipeline.md
index 647099ba..76cf4a23 100644
--- a/docs/faq/data-pipeline.md
+++ b/docs/faq/data-pipeline.md
@@ -1,8 +1,6 @@
# Data Pipelines
-import Content from '../reuse-content/_all-features.md';
-
This article lists potential issues and solutions encountered when constructing data pipelines, including data replication tasks, data transformation tasks, and data validation modules.
@@ -21,7 +19,7 @@ Yes, in most cases, tasks will resume automatically under the following conditio
- In the TapData Cloud, tasks continue running without impact. They will automatically resume after the network is restored.
- In the TapData Enterprise, short interruptions have no effect. If the disconnection lasts longer than 10 minutes, tasks will automatically switch to another available engine. Full sync tasks will restart from the beginning, while incremental sync tasks will resume from the last checkpoint.
- **Network interruption between the Engine and Data Source**:
- - The system will attempt to reconnect within the configured [retry interval](../user-guide/other-settings/system-settings.md#task-setting). Once the connection is restored, the task will continue.
+ - The system will attempt to reconnect within the configured [retry interval](../system-admin/other-settings/system-settings.md#task-setting). Once the connection is restored, the task will continue.
- If the network is not restored within the timeout period, the task will stop. After recovery, full sync tasks will restart from the beginning, and incremental tasks will resume from the last checkpoint.
### Does it support cross-region, cross-network data synchronization?
@@ -133,9 +131,7 @@ You need to add the following parameters to the PostgreSQL connection string dur
autosave=always&cleanupSavePoints=true
```
-
+
### What if testing the MySQL connection indicates: The server time zone value ' 'is unrecognized?
@@ -155,7 +151,7 @@ The topic expression is a regular expression used to match the name of the messa
### If Oracle synchronizes to SQL Server, is the Select permission sufficient?
-Oracle needs some additional permissions for CDC. For specific configuration and authorization, see [Oracle Preparation Work](../prerequisites/on-prem-databases/oracle.md).
+Oracle needs some additional permissions for CDC. For specific configuration and authorization, see [Oracle Preparation Work](../connectors/on-prem-databases/oracle.md).
### Can changes be made during task execution, such as adding tables to be synchronized?
@@ -291,7 +287,7 @@ Yes, you need to turn on the corresponding switch during task configuration. Add
### If manual deletion of a field in the target table causes an error during incremental synchronization, how can it be fixed?
-You can edit the task, add an [add/remove field node](../user-guide/data-development/process-node.md#add-and-del-cols) before the target node, filter out the deleted field, and then restart the task.
+You can edit the task, add an [add/remove field node](../data-transformation/process-node.md#add-and-del-cols) before the target node, filter out the deleted field, and then restart the task.
:::tip
@@ -305,7 +301,7 @@ During task configuration, you can open the advanced settings in the source node
:::tip
-This feature requires that the target node be a weak scheme-type data source (such as MongoDB/Kafka), etc. If you need to perform data filtering rules during both the full and incremental phases, you can add a [row filter](../user-guide/data-development/process-node.md) to achieve this.
+This feature requires that the target node be a weak scheme-type data source (such as MongoDB/Kafka), etc. If you need to perform data filtering rules during both the full and incremental phases, you can add a [row filter](../data-transformation/process-node.md) to achieve this.
:::
diff --git a/docs/faq/data-security.md b/docs/faq/data-security.md
index b6a92b9c..73d045d6 100644
--- a/docs/faq/data-security.md
+++ b/docs/faq/data-security.md
@@ -1,9 +1,5 @@
# Data Security and Network Configuration
-import Content from '../reuse-content/_cloud-features.md';
-
-
-
This article lists common problems related to data security and network configuration.
## How does data flow when using TapData Cloud?
diff --git a/docs/user-guide/error-code-solution.md b/docs/faq/error-code-solution.md
similarity index 93%
rename from docs/user-guide/error-code-solution.md
rename to docs/faq/error-code-solution.md
index ef8e1c2b..3303a5a6 100644
--- a/docs/user-guide/error-code-solution.md
+++ b/docs/faq/error-code-solution.md
@@ -1,10 +1,8 @@
# Task Error Codes and Solutions
-import Content from '../reuse-content/_all-features.md';
-
-If you encounter an exception with a task, you can view the relevant log information at the bottom of the task's [monitoring page](data-development/monitor-task.md). For common issues, TapData has codified them into specific error codes for easier lookup, and provides the cause of the error and its solution.
+If you encounter an exception with a task, you can view the relevant log information at the bottom of the task's [monitoring page](../data-transformation/monitor-view-tasks.md). For common issues, TapData has codified them into specific error codes for easier lookup, and provides the cause of the error and its solution.
## 10001
@@ -64,7 +62,7 @@ Before reading, the engine needs to locate the specific position in the logs to
**Solutions**:
* Refer to the error message below, compare the erroneous fields' types in the source and destination databases. If inconsistent, use database DDL or similar commands to correct it, then run the task again.
-* Use the [JS processing node](data-development/process-node.md#js-process) to filter out erroneous fields. For instance, if the problematic field is `field1`, the corresponding JS would be `record.remove('field1')`.
+* Use the [JS processing node](../data-transformation/process-node.md#js-process) to filter out erroneous fields. For instance, if the problematic field is `field1`, the corresponding JS would be `record.remove('field1')`.
* If the JS processing node changes the data type, the new type should be passed to TapData using the syntax provided below the JS editing box. Delete the target table and run the task again.
## 10007
diff --git a/docs/user-guide/no-supported-data-type.md b/docs/faq/no-supported-data-type.md
similarity index 98%
rename from docs/user-guide/no-supported-data-type.md
rename to docs/faq/no-supported-data-type.md
index 58722ffd..86110194 100644
--- a/docs/user-guide/no-supported-data-type.md
+++ b/docs/faq/no-supported-data-type.md
@@ -1,8 +1,6 @@
# Data Type Support Description
-import Content from '../reuse-content/_all-features.md';
-
:::tip
diff --git a/docs/faq/use-product.md b/docs/faq/use-product.md
index 965e07a9..41ca71d1 100644
--- a/docs/faq/use-product.md
+++ b/docs/faq/use-product.md
@@ -1,14 +1,12 @@
# Product Features/Usage
-import Content from '../reuse-content/_all-features.md';
-
This article lists common questions encountered while using TapData.
## What data sources does TapData support?
-TapData supports a wide range of databases, including common relational, non-relational, and queue-based data sources. For details, see [Supported Databases](../prerequisites/supported-databases.md).
+TapData supports a wide range of databases, including common relational, non-relational, and queue-based data sources. For details, see [Supported Databases](../connectors/supported-data-sources.md).
## Does TapData offer a trial?
@@ -20,12 +18,12 @@ TapData offers two deployment options, **Cloud** and **Enterprise**, to meet you
| Product | Applicable Scenarios | Pricing Information |
| ------------------ | ------------------------------------------------------------ | ------------------------------------------------------------ |
-| TapData Cloud | Register an account on [TapData Cloud](https://cloud.tapdata.net/console/v3/) to use, suitable for scenarios requiring quick deployment and low initial investment. Helps you focus more on business development rather than infrastructure management. | Provides one SMALL specification Agent instance for free (semi-managed mode). You can also subscribe to higher specifications or more Agent instances based on business needs. See more at [Product Billing](../billing/billing-overview.md). |
+| TapData Cloud | Register an account on [TapData Cloud](https://cloud.tapdata.net/console/v3/) to use, suitable for scenarios requiring quick deployment and low initial investment. Helps you focus more on business development rather than infrastructure management. | Provides one SMALL specification Agent instance for free (semi-managed mode). You can also subscribe to higher specifications or more Agent instances based on business needs. |
| TapData Enterprise | Supports deployment to local data centers, suitable for scenarios requiring sensitive data handling or strict network isolation, such as financial institutions, government departments, or large enterprises wanting full control over their data. | Based on the number of server nodes deployed, pay the corresponding subscription fees annually. Before making a purchase, you can click “[Apply for a Trial](https://tapdata.net/tapdata-on-prem/demo.html)” and a TapData engineer will assist you with the trial. See more at [Product Pricing](https://tapdata.net/pricing.html). |
## What to do if the connection test fails?
-When creating a data connection, refer to the connection configuration help on the right side of the page and complete the settings according to the guide. You can also refer to [Preparation Work](../prerequisites) to complete the setup.
+When creating a data connection, refer to the connection configuration help on the right side of the page and complete the settings according to the guide. You can also refer to [Preparation Work](../connectors) to complete the setup.
## When configuring a replication task, why is the target node inference result abnormal?
@@ -48,18 +46,10 @@ To avoid such issues, set the lifespan of uncommitted transactions in the source
## Does TapData support publishing tables as API services?
-import Content1 from '../reuse-content/_enterprise-features.md';
-
-
-
-Yes (for single tables), you can publish processed tables [as API services](../user-guide/data-service/create-api-service.md) to allow other applications to easily access and retrieve data.
+Yes (for single tables), you can publish processed tables [as API services](../publish-apis/create-api-service.md) to allow other applications to easily access and retrieve data.
## How to publish complex multi-table queries as API services?
-import Content2 from '../reuse-content/_enterprise-features.md';
-
-
-
For complex multi-table query scenarios, common solutions include materialized views and ad-hoc queries:
* **Materialized Views**: A materialized view is a pre-calculated and stored virtual table that provides high-performance data access when queried. By pre-executing multi-table join operations and storing the results as a materialized view, you can significantly improve query performance and response times. This approach is suitable for scenarios where data change frequency is low because the materialized view needs to be updated with each data change.
@@ -69,15 +59,15 @@ In TapData, you can solidify complex multi-table queries into a materialized vie
**Real-Time View Strategy**
-Suitable for scenarios where SQL statements are relatively simple and there is a high requirement for data timeliness. The core idea is to use various [process nodes](../user-guide/data-development/process-node.md) to implement specific operations in SQL statements (such as joins), ultimately synchronizing the processed data to a new table in real-time and then creating and publishing an API service based on that table.
+Suitable for scenarios where SQL statements are relatively simple and there is a high requirement for data timeliness. The core idea is to use various [process nodes](../data-transformation/process-node.md) to implement specific operations in SQL statements (such as joins), ultimately synchronizing the processed data to a new table in real-time and then creating and publishing an API service based on that table.
Steps include:
-1. [Create a data transformation task](../user-guide/data-development/create-task.md).
-2. Replace specific operations in the SQL statement with process nodes. For example, as shown in the image below, we pre-join **customer** and **company** tables (implemented through a [join node](../user-guide/data-development/process-node.md#join)) and store the results in the **join_result** table.
+1. [Create a data transformation task](../data-transformation/create-views/README.md).
+2. Replace specific operations in the SQL statement with process nodes. For example, as shown in the image below, we pre-join **customer** and **company** tables (implemented through a [join node](../data-transformation/process-node.md#join)) and store the results in the **join_result** table.

3. Start the task to implement real-time data synchronization.
-4. Based on the new table (join_result), [create and publish an API service](../user-guide/data-service/create-api-service.md).
+4. Based on the new table (join_result), [create and publish an API service](../publish-apis/create-api-service.md).
**Batch View Strategy**
@@ -85,7 +75,7 @@ For extremely complex SQL statements (e.g., SQL nesting, complex joins), you can
Steps include:
-1. [Create a data transformation task](../user-guide/data-development/create-task.md).
+1. [Create a data transformation task](../data-transformation/create-views/README.md).
2. Add source and target nodes on the canvas.
:::tip
The target node should be a weak Schema class data source, such as MongoDB or Kafka.
@@ -93,4 +83,4 @@ Steps include:
3. In the source node settings, turn on the custom query switch for full sync and add the SQL query statement needed during the full data sync phase (does not affect the incremental phase).

4. After setting up the target node, click the settings in the upper right corner of the page, set the synchronization type to **full**, and then set a regular scheduling strategy based on real-time requirements.
-5. Start the task and wait for it to run to completion before creating and publishing an API service based on the new table [create and publish an API service](../user-guide/data-service/create-api-service.md).
\ No newline at end of file
+5. Start the task and wait for it to run to completion, then [create and publish an API service](../publish-apis/create-api-service.md) based on the new table.
\ No newline at end of file
diff --git a/docs/getting-started/README.md b/docs/getting-started/README.md
new file mode 100644
index 00000000..00af4de2
--- /dev/null
+++ b/docs/getting-started/README.md
@@ -0,0 +1,34 @@
+# Getting Started
+
+Welcome to TapData! This quick-start guide walks you through building a real-time data pipeline—from connecting sources to publishing analysis-ready APIs.
+
+TapData’s **Incremental Materialized Views (IMVs)** let you define joins and transformations once, and automatically keep the results fresh in real time—ideal for analytics, applications, and APIs.
+
+:::tip
+If you’re looking to build a complete **Operational Data Hub (ODH)**, see the [ODH Handbook](../operational-data-hub/plan-data-platform.md) for the full architecture and design approach.
+For now, this guide gives you a fast way to experience TapData’s core capabilities.
+:::
+
+
+## What Will You Do?
+
+You'll learn how to:
+
+1. **[Install and Set Up TapData](install-and-setup/README.md)**
+ Deploy TapData and prepare your environment.
+
+2. **[Connect Your Data Sources](connect-data-source.md)**
+ Add connections to your **source databases** (like MySQL) and a **target MongoDB database** to sync and transform data.
+
+3. **[Create a Real-Time Incremental Materialized View](build-real-time-materialized-view.md)**
+ Use the visual designer to combine tables into a single, always-up-to-date view—ready for instant querying.
+
+4. **[Publish Your View as an API](publish-imv-as-api.md)**
+ Expose your real-time data as a secure API endpoint for dashboards, CRM systems, or downstream apps.
+
+Ready to try it out? Follow the steps below to create your first real-time data pipeline.
+
+import DocCardList from '@theme/DocCardList';
+
+
+
diff --git a/docs/getting-started/build-real-time-materialized-view.md b/docs/getting-started/build-real-time-materialized-view.md
new file mode 100644
index 00000000..6d6978c2
--- /dev/null
+++ b/docs/getting-started/build-real-time-materialized-view.md
@@ -0,0 +1,136 @@
+# Step 3: Build Real-time Materialized View
+
+This article shows you how to use TapData to build **Incremental Materialized Views (IMV)**—real-time, high-performance analytics tables that combine data from multiple sources to speed up your insights and decision-making.
+
+## Why Use Incremental Materialized Views?
+
+Imagine you're a data analyst at an e-commerce company. You need to quickly identify **high-value transactions over $300 in Q1 2025** and understand customer membership levels and regions to drive targeted marketing—like sending special coupons or offering tier upgrades to boost retention and sales.
+
+import TapDataFlowAnimation from '@site/src/components/Animation/TapDataFlowAnimation';
+
+
+
+In traditional databases, you'd have to run complex multi-table JOINs between your orders and users tables. When data volumes are large, these queries can be slow and put pressure on production systems. IT might even ask you to run them during off-peak hours—slowing down your entire analysis workflow and limiting your team's agility.
+
+
+
+With **TapData’s real-time materialized views**, you can automatically join your orders and users tables into a single, always-up-to-date view synced to MongoDB. Your BI tools or APIs can query this single, denormalized table to get the latest data instantly—no complex SQL, no load on your operational systems. It's fast, scalable, and designed for real-time analytics.
+
+## Prerequisites
+
+Make sure you have already connected your **source MySQL database** and **target MongoDB database** in TapData.
+
+If you haven't set up these connections yet, see [Connect Data Sources](connect-data-source.md) for detailed instructions.
+
+## Procedure
+
+1. Log in to TapData Platform.
+
+2. In the left navigation panel, go to **Data Transformation**.
+
+3. Click **Build Materialized View** to open the configuration workspace.
+
+ 1. Select your main source table.
+
+ For this example, choose the **orders** table as your primary data source.
+
+ 
+
+ 2. To bring in related user details, click **+ Add Field** and choose **Flatten**.
+
+ 3. In the field editor, pick the database and table you want to join. Set the join condition by selecting the key column. In this example, link the **users** table using **user_id**.
+
+ Once configured, the **orders** table will include user information as part of each record.
+
+ 
+
+4. Click **+ Write Target** in the top-right corner. Choose your MongoDB connection and enter a collection name where the view data will be stored.
+
+ On the right, you can preview field mappings and data types for the target collection (for example, **order_view**).
+
+ 
+
+5. When you’re ready, click **Start** in the top-right to launch your real-time materialized view.
+
+ After starting, you’ll be redirected to the task monitoring page, where you can track metrics such as records per second (RPS), latency, and event counts.
+
+ 
+
+
+
+## Verify Results
+
+Once your task is running successfully, TapData continuously performs real-time joins on both full and incremental data from your source tables, delivering an up-to-date view to your target MongoDB collection.
+
+Returning to our [high-value customer analysis](#why-use-incremental-materialized-views) example from the introduction, here's how you might run a traditional SQL join in your MySQL source:
+
+```sql
+SELECT
+ o.order_id,
+ o.user_id,
+ u.user_name,
+ u.user_level,
+ o.order_amount,
+ o.payment_method,
+ o.order_time
+FROM
+ orders o
+JOIN
+ users u ON o.user_id = u.user_id
+WHERE
+ o.order_time BETWEEN '2025-01-01' AND '2025-03-31'
+ AND o.order_amount >= 300;
+```
+
+**Example result (single row):**
+
+```sql
+order_id | user_id | user_name | user_level | order_amount | payment_method | order_time
+----------------------------------------------------------------------------------------
+o2005 | u004 | David | PLATINUM | 310.40 | PAYPAL | 2025-01-04 12:00:00
+```
+
+With your Incremental Materialized View in MongoDB, you don't need to maintain or run these joins manually. Instead, you can query a single, analysis-ready view:
+
+```javascript
+db.order_view.find(
+ {
+ order_time: { $gte: ISODate("2025-01-01"), $lte: ISODate("2025-03-31") },
+ order_amount: { $gte: 300 },
+ user_level: { $in: ["GOLD", "PLATINUM"] }
+ },
+ {
+ order_id: 1,
+ user_id: 1,
+ user_name: 1,
+ user_level: 1,
+ order_amount: 1,
+ payment_method: 1,
+ order_time: 1,
+ _id: 0
+ }
+).sort({ order_time: -1 });
+```
+
+**Example result (single document):**
+
+```javascript
+{
+ order_id: 'o2005',
+ order_amount: Decimal128('310.40'),
+ order_time: ISODate('2025-01-04T12:00:00Z'),
+ payment_method: 'PAYPAL',
+ user_id: 'u004',
+ user_level: 'PLATINUM',
+ user_name: 'David'
+}
+```
+
+Because the view updates in real time, any new orders from users will automatically appear in MongoDB within milliseconds—without extra configuration or manual queries. This ensures your BI dashboards and APIs always have access to the latest, fully joined and enriched data for your marketing and analysis needs.
+
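The incremental-join idea behind the view can be illustrated with a toy Python sketch. This is illustrative only, not TapData's actual engine; the names mirror the example data above:

```python
# Toy sketch of incremental view maintenance (illustrative only, not
# TapData's engine): each incoming change event is joined against a
# cached copy of the other table and upserted into the flattened view.
users = {"u004": {"user_name": "David", "user_level": "PLATINUM"}}
order_view = {}  # order_id -> flattened document

def on_order_event(order):
    """Apply one order insert/update event to the materialized view."""
    flat = {**order, **users.get(order["user_id"], {})}
    order_view[order["order_id"]] = flat  # upsert into the view
    return flat

doc = on_order_event({
    "order_id": "o2005",
    "user_id": "u004",
    "order_amount": 310.40,
    "payment_method": "PAYPAL",
})
print(doc["user_name"], doc["user_level"])  # David PLATINUM
```

Because only the changed row is re-joined, the view stays fresh without re-running the full join, which is what keeps latency low even at high event rates.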
+## See also
+
+* [Publish View as APIs](publish-imv-as-api.md)
+* [Validate View Results](../operational-data-hub/fdm-layer/validate-data-quality.md)
+* [View Design Considerations](../data-transformation/design-considerations.md)
\ No newline at end of file
diff --git a/docs/getting-started/connect-data-source.md b/docs/getting-started/connect-data-source.md
new file mode 100644
index 00000000..1ce77b45
--- /dev/null
+++ b/docs/getting-started/connect-data-source.md
@@ -0,0 +1,66 @@
+# Step 2: Connect a Data Source
+
+
+Once you have installed TapData, connect it to your data sources through the Agent. After the connections are established, you can create data pipelines.
+
+:::tip
+
+Before connecting to a data source, make sure the network is reachable and that the database account has been granted the required permissions. For more information, see [Preparation](../connectors/README.md).
+
+:::
+
+## Procedure
+
+1. Log in to TapData Platform.
+
+2. In the left navigation panel, click **Connections**.
+
+3. On the right side of the page, click **Create**. In the dialog box that appears, select the data source you want to connect.
+
+ 
+
+4. On the connection configuration page, fill in the required connection information.
+
+ The right panel of the page provides guidance for configuring the connection.
+
+ :::tip
+
+ This procedure uses MySQL as an example. For more examples, see [Connect Data Sources](../connectors/README.md).
+
+ :::
+
+ 
+
+ * **Connection Settings**
+ * **Name**: Enter a unique name with business significance.
+ * **Type**: MySQL can be used as either a source or a target database.
+ * **Deployment Mode**: Supports single-node and primary-replica architectures. For the primary-replica architecture, enter the primary and replica server addresses and service ports, with the primary server in the first row.
+ * **Server Address**: Database connection address.
+ * **Port**: Database service port.
+ * **Database**: The database name. Each connection corresponds to one database. If there are multiple databases, create multiple connections.
+ * **Username**: The database username.
+ * **Password**: The database password.
+ * **Advanced Settings**
+ * **Connection Parameter String**: Default is `useUnicode=yes&characterEncoding=UTF-8`, indicating that data transmission will use the UTF-8 encoded Unicode character set, which helps avoid character encoding issues.
+ * **Timezone**: Defaults to UTC+0. If set to another timezone, fields without timezone information (e.g., `datetime`) are affected, while fields with timezone information (e.g., `timestamp`, `date`, and `time`) are not.
+ * **CDC Log Caching**: Enables [shared mining](../operational-data-hub/advanced/share-mining.md) of the source database's incremental logs. This lets multiple tasks share the same incremental log mining process, reducing duplicate reads and minimizing the impact of incremental synchronization on the source database. After enabling this feature, you need to select an external storage for the incremental log information.
+ * **Contain Table**: The default option is **All**, which includes all tables. Alternatively, you can select **Custom** and manually specify the desired tables by separating their names with commas (,).
+ * **Exclude Tables**: Once the switch is enabled, you can specify tables to exclude, separating multiple table names with commas (,).
+ * **Agent Settings**: Defaults to **Platform automatic allocation**; you can also manually specify an agent.
+ * **Model Load Time**: If the data source contains fewer than 10,000 models, their schemas are refreshed every hour. If it contains more than 10,000 models, the refresh takes place daily at the time you specify.
+ * **Enable Heartbeat Table**: When the connection type is source or target, you can enable this switch. TapData will create a `_tapdata_heartbeat_table` heartbeat table in the source database and update it every 10 seconds (requires appropriate permissions) to monitor the health of the data source connection and tasks. The heartbeat task starts automatically after the data replication/development task starts, and you can view the heartbeat task in the data source editing page.
+ * **SSL Settings**: Choose whether to enable SSL for the data source connection to enhance data security. After enabling this feature, you need to upload CA files, client certificates, client key files, etc.
+
+5. Click **Test** at the bottom of the page; once the test passes, click **Save**.
+
+ :::tip
+
+ If the connection test fails, follow the prompts on the page to fix it.
+
+ :::
+
+
+
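The effect of the **Timezone** setting above can be sketched in a few lines of plain Python (not TapData code): a MySQL `datetime` value carries no offset, so the connection's timezone decides which absolute instant it represents.

```python
from datetime import datetime, timezone, timedelta

# A MySQL `datetime` value has no timezone attached.
raw = datetime(2025, 1, 4, 12, 0, 0)

# Default connection setting: interpret it as UTC+0.
as_utc = raw.replace(tzinfo=timezone.utc)

# If the connection timezone were set to UTC+8 instead,
# the same stored value would denote a different instant.
as_utc8 = raw.replace(tzinfo=timezone(timedelta(hours=8)))

print(as_utc.isoformat())                            # 2025-01-04T12:00:00+00:00
print(as_utc8.astimezone(timezone.utc).isoformat())  # 2025-01-04T04:00:00+00:00
```

This is why mismatched connection timezones can shift `datetime` values during synchronization while `timestamp` values stay correct.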
+## Next step
+
+[Build a Real-Time Materialized View](build-real-time-materialized-view.md)
\ No newline at end of file
diff --git a/docs/getting-started/install-and-setup/README.md b/docs/getting-started/install-and-setup/README.md
new file mode 100644
index 00000000..b1f17a8b
--- /dev/null
+++ b/docs/getting-started/install-and-setup/README.md
@@ -0,0 +1,20 @@
+# Step 1: Install and Setup
+
+TapData offers two deployment modes, **Enterprise** and **Community**, to meet your needs:
+
+| Product | Applicable Scenarios |
+| --------------------------------------------------- | ------------------------------------------------------------ |
+| [TapData Enterprise](install-enterprise-edition.md) | Supports deployment to local data centers. Suitable for scenarios with strict requirements on data sensitivity or network isolation, such as financial institutions, government departments, or large enterprises that want full control over their data. |
+| [TapData Community](install-community-edition.md) | An [open-source](https://github.com/tapdata/tapdata) data integration platform that provides basic data synchronization and transformation capabilities. This helps you quickly explore and implement data integration projects. As your project or business grows, you can seamlessly upgrade to TapData Cloud or TapData Enterprise to access more advanced features and service support. |
+
+:::tip
+
+For more information, see [Edition Comparison](../../introduction/compare-editions.md).
+
+:::
+
+Follow the docs below to deploy TapData:
+
+import DocCardList from '@theme/DocCardList';
+
+
diff --git a/docs/installation/install-tapdata-community.md b/docs/getting-started/install-and-setup/install-community-edition.md
similarity index 90%
rename from docs/installation/install-tapdata-community.md
rename to docs/getting-started/install-and-setup/install-community-edition.md
index da827c04..694032e1 100644
--- a/docs/installation/install-tapdata-community.md
+++ b/docs/getting-started/install-and-setup/install-community-edition.md
@@ -1,8 +1,4 @@
-# TapData Community
-
-import Content from '../reuse-content/_community-features.md';
-
-
+# Deploy TapData Community
TapData Community is an open-source real-time data platform that facilitates data synchronization and transformation. This guide demonstrates how to quickly install and start TapData Community.
@@ -17,7 +13,7 @@ Before you begin, ensure your environment meets the following requirements:
- Hardware specifications: 8-core CPU (x86 architecture), 16 GB of memory
- Storage specifications: 100 GB
-- Operating System: **CentOS 7+** , **Ubuntu 16.04+** or **Red Hat Enterprise Linux(RHEL)7.x/8.x**
+- Operating System: **CentOS 7+**, **Ubuntu 16.04+**, **Red Hat Enterprise Linux (RHEL) 7.x/8.x**, or **Windows OS (64-bit)**
## Component Overview
@@ -90,7 +86,15 @@ TapData Community includes the following main components:
For example, for version 3.5.16, the command would be: `tar -zxvf tapdata-v3.5.16-663b7b11.tar.gz && cd tapdata`
-3. [Install MongoDB](../administration/production-deploy/install-replica-mongodb.md) (version 4.0 or later). TapData will use it as an intermediary database to store tasks and metadata.
+3. Install the environment dependencies.
+
+ 1. Install Java 1.8.
+
+ ```bash
+ yum -y install java-1.8.0-openjdk
+ ```
+
+ 2. [Install MongoDB](../../platform-ops/production-deploy/install-replica-mongodb.md) (version 4.0 or later), which TapData uses to store runtime data such as logs and metadata.
3. Execute the following command to specify the [URI connection string](https://www.mongodb.com/docs/v5.0/reference/connection-string/#standard-connection-string-format) of the MongoDB instance you just deployed.
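A common pitfall when writing the URI is forgetting to percent-encode special characters in the credentials. A small sketch of building a valid URI (the hosts, credentials, and replica set name here are hypothetical; substitute your own):

```python
from urllib.parse import quote

# Hypothetical values; substitute your own hosts, credentials, and database.
user = "tapdata"
password = quote("p@ss/word", safe="")  # '@' and '/' must be percent-encoded
hosts = "mongo1:27017,mongo2:27017,mongo3:27017"
uri = f"mongodb://{user}:{password}@{hosts}/tapdata?replicaSet=rs0&authSource=admin"
print(uri)
```

For a single-node MongoDB, drop the extra hosts and the `replicaSet` parameter.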
@@ -126,4 +130,4 @@ TapData Community includes the following main components:
## Next Steps
-[Connect to a Database](../quick-start/connect-database.md)
\ No newline at end of file
+[Connect to a Database](../connect-data-source.md)
\ No newline at end of file
diff --git a/docs/getting-started/install-and-setup/install-enterprise-edition.md b/docs/getting-started/install-and-setup/install-enterprise-edition.md
new file mode 100644
index 00000000..d2a4d2de
--- /dev/null
+++ b/docs/getting-started/install-and-setup/install-enterprise-edition.md
@@ -0,0 +1,266 @@
+# Deploy TapData Enterprise
+
+The Enterprise Edition supports both single-node and high-availability deployments. This article explains how to quickly deploy it locally on Linux and Windows platforms (single-node architecture). For production environments, it is recommended to use the [high-availability deployment](../../platform-ops/production-deploy/install-tapdata-ha.md) approach.
+
+```mdx-code-block
+import Tabs from '@theme/Tabs';
+import TabItem from '@theme/TabItem';
+```
+
+## Prerequisites
+
+Before you begin, ensure your environment meets the following requirements:
+
+- Hardware specifications: 8-core CPU (x86 architecture), 16 GB of memory
+- Storage specifications: 100 GB
+- Operating System: **CentOS 7+**, **Ubuntu 16.04+**, **Red Hat Enterprise Linux (RHEL) 7.x/8.x**, or **Windows Server 2019**
+
+
+
+## Procedure
+
+```mdx-code-block
+
+
+```
+This guide uses CentOS 7 as an example to demonstrate the deployment process.
+
+1. Log in to the target device and execute the following commands to raise the system's open-file limit and disable the firewall and SELinux.
+
+ ```bash
+ ulimit -n 1024000
+ echo "* soft nofile 1024000" >> /etc/security/limits.conf
+ echo "* hard nofile 1024000" >> /etc/security/limits.conf
+ systemctl disable firewalld.service
+ systemctl stop firewalld.service
+ setenforce 0
+ sed -i "s/enforcing/disabled/g" /etc/selinux/config
+ ```
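+
+ After running these commands, you can verify that the new limits are in place. A quick sanity check (a sketch; open a new session so the `limits.conf` entries take effect):
+
+ ```bash
+ # Print the current shell's soft and hard open-file limits;
+ # both should report 1024000 in a new session after the settings above.
+ ulimit -Sn
+ ulimit -Hn
+ ```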
+
+2. Install environment dependencies.
+
+ 1. Install Java 1.8.
+
+ ```bash
+ yum -y install java-1.8.0-openjdk
+ ```
+
+ 2. [Install MongoDB](../../platform-ops/production-deploy/install-replica-mongodb.md) (version 4.0 or later), which TapData uses to store its operational data, such as logs and metadata.
+
+3. Download the TapData installation package (contact us at [team@tapdata.io](mailto:team@tapdata.io) to obtain it) and upload it to the target device.
+
+4. On the target device, execute the command below to unzip the package and enter the unzipped directory.
+
+ ```bash
+ tar -zxvf package_name && cd tapdata
+ ```
+
+ For example: `tar -zxvf tapdata-release-v2.14.tar.gz && cd tapdata`
+
+5. Prepare the License file.
+
+ 1. Execute the following command to obtain the SID information required for the License application.
+
+ ```bash
+ java -cp components/tm.jar -Dloader.main=com.tapdata.tm.license.util.SidGenerator org.springframework.boot.loader.PropertiesLauncher
+ ```
+
+ 2. Provide the printed SID information to the TapData support team to complete the License application process.
+
+ 3. Upload the acquired License file to the unzipped directory (**tapdata**).
+
+6. Execute `./tapdata start` and follow the command-line prompts to set TapData's login address, API service port, MongoDB connection information, etc. The example and explanation are as follows:
+
+ :::tip
+
+ If deploying with a non-root user, avoid using `sudo` to elevate privileges, which can cause the installation to fail. Before executing commands, run `sudo chown -R <user>:<group> <install_dir>` or `sudo chmod -R 777 <install_dir>` (placeholders for your user, group, and installation directory) to grant the current user full permissions on the installation directory.
+
+ :::
+
+ ```bash
+ ./tapdata start
+ _______ _____ _____ _______
+ |__ __|/\ | __ \| __ \ /\|__ __|/\
+ | | / \ | |__) | | | | / \ | | / \
+ | | / /\ \ | ___/| | | |/ /\ \ | | / /\ \
+ | |/ ____ \| | | |__| / ____ \| |/ ____ \
+ |_/_/ \_\_| |_____/_/ \_\_/_/ \_\
+
+ WORK DIR:/root/tapdata
+ Init tapdata...
+ ✔ Please enter backend url, comma separated list. e.g.:http://127.0.0.1:3030/ (Default: http://127.0.0.1:3030/): …
+ ✔ Please enter tapdata port. (Default: 3030): …
+ ✔ Please enter api server port. (Default: 3080): …
+ ✔ Does MongoDB require username/password?(y/n): … no
+ ✔ Does MongoDB require TLS/SSL?(y/n): … no
+ ✔ Please enter MongoDB host, port, database name(Default: 127.0.0.1:27017/tapdata): …
+ ✔ Does API Server response error code?(y/n): … yes
+ MongoDB uri: mongodb://127.0.0.1:27017/tapdata
+ MongoDB connection command: mongo mongodb://127.0.0.1:27017/tapdata
+ System initialized. To start TapData, run: tapdata start
+ WORK DIR:/root/tapdata
+ Testing JDK...
+ java version:1.8
+ Java environment OK.
+ Unpack the files...
+ Restart TapDataAgent ...:
+ TapDataAgent starting ...:
+ ```
+
+ * **Please enter backend url**: Set the login address for the TapData platform, by default `http://127.0.0.1:3030/`.
+ * **Please enter tapdata port**: Set the login port for the TapData platform, by default `3030`.
+ * **Please enter api server port**: Set the service port for the API Server, by default `3080`.
+ * **Does MongoDB require username/password?**: If the MongoDB database has security authentication enabled, enter **y**, then follow the prompts to enter the username, password, and authentication database (default `admin`).
+ * **Does MongoDB require TLS/SSL?(y/n)**: If the MongoDB database has TLS/SSL encryption enabled, enter **y**, then follow the prompts to enter the absolute paths of the CA certificate and certificate key files, as well as the password for the certificate key file.
+ * **Please enter MongoDB host, port, database name**: Set the URI connection information for the MongoDB database, by default `127.0.0.1:27017/tapdata`.
+ * **Does API Server response error code?**: Whether to enable error-code responses from the API Server.
+
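+ For reference, the prompt answers assemble into a standard MongoDB connection string. A minimal sketch of the mapping (using the default values; if authentication is enabled, the username and password are placed before the host):
+
+ ```bash
+ # Illustration only: how the host/port/database answer maps to the MongoDB URI.
+ HOST_PORT_DB="127.0.0.1:27017/tapdata"      # answer to the host, port, database name prompt
+ MONGO_URI="mongodb://${HOST_PORT_DB}"
+ echo "${MONGO_URI}"                          # mongodb://127.0.0.1:27017/tapdata
+ ```
+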
+ After successful deployment, the command line will return a message similar to the following:
+
+ ```bash
+ deployed connector.
+ Waiting for the flow engine to start \
+ FlowEngine is startup at : 2023-04-01 23:00
+ API service started
+ ```
+
+7. Log in to the TapData platform through a browser. The login address for this machine is [http://127.0.0.1:3030](http://127.0.0.1:3030).
+
+ Please change your password promptly upon first login to ensure security.
+
+ :::tip
+
+ If you need to access the TapData service from other devices on the same network, ensure that those devices can reach this machine over the network.
+
+ :::
+
+
+
+
+
+This example uses Windows Server 2019 to demonstrate the deployment process.
+
+1. [Install MongoDB](../../platform-ops/production-deploy/install-replica-mongodb.md) (version 4.0 or later), which TapData uses to store its operational data, such as logs and metadata.
+
+2. Log in to the target device, install Java 1.8 and set environment variables.
+
+ 1. [Download Java 1.8](https://www.oracle.com/java/technologies/javase/javase8-archive-downloads.html) and follow the prompts to complete the installation.
+
+ 2. Go to **Control Panel** > **System and Security** > **System**.
+
+ 3. Click **Advanced System Settings** on the left, then click **Environment Variables**.
+
+ 
+
+ 4. In the dialog that appears, click **New** under **System Variables**, fill in the variable name and value, and click **OK**.
+
+ 
+
+ - **Variable Name**: `JAVA_HOME`
+ - **Variable Value**: The installation path of JDK, for example, `C:\Program Files\Java\jdk1.8.0_202`
+
+ 5. In the **System Variables** area, find and double-click the **Path** variable, then in the dialog that appears, add the following environment variables, and click **OK**.
+
+ 
+
+ - `%JAVA_HOME%\bin`
+ - `%JAVA_HOME%\jre\bin`
+
+ 6. Following step 4, continue to add a system variable with the name and value as follows, then click **OK** after completing the setup.
+
+ - **Variable Name**: `CLASSPATH`
+ - **Variable Value**: `.;%JAVA_HOME%\lib;%JAVA_HOME%\lib\dt.jar;%JAVA_HOME%\lib\tools.jar`
+
+ 7. (Optional) Open the command line and execute `java -version` to verify that the environment variables have taken effect. Example of successful output:
+
+ ```bash
+ java version "1.8.0_202"
+ Java(TM) SE Runtime Environment (build 1.8.0_202-b08)
+ Java HotSpot(TM) 64-Bit Server VM (build 25.202-b08, mixed mode)
+ ```
+
+3. Download the TapData installation package (you can [contact us](mailto:team@tapdata.io) to obtain it) and unzip the package to the desired directory.
+
+4. Open the command line and navigate to the unzipped directory (in this example, `D:\tapdata`) by executing the following command.
+
+ ```bash
+ cd /d D:\tapdata
+ ```
+
+5. Prepare the License file.
+
+ 1. Execute the following command to obtain the SID information required for the License application.
+
+ ```bash
+ java -cp components/tm.jar -Dloader.main=com.tapdata.tm.license.util.SidGenerator org.springframework.boot.loader.PropertiesLauncher
+ ```
+
+ 2. Provide the printed SID information to the TapData support team to complete the License application process.
+
+ 3. Upload the obtained License file to the unzipped directory.
+
+6. Execute `./tapdata.exe start` and follow the command line prompts to set TapData's login address, API service port, MongoDB connection information, etc. Example and explanations are as follows:
+
+ ```bash
+ ./tapdata.exe start
+ _______ _____ _____ _______
+ |__ __|/\ | __ \| __ \ /\|__ __|/\
+ | | / \ | |__) | | | | / \ | | / \
+ | | / /\ \ | ___/| | |/ /\ \ | | / /\ \
+ | |/ ____ \| | | |__| / ____ \| |/ ____ \
+ |_/_/ \_\_| |_____/_/ \_\_/_/ \_\
+
+ WORK DIR:/root/tapdata
+ Init tapdata...
+ ✔ Please enter backend url, comma-separated list. e.g.:http://127.0.0.1:3030/ (Default: http://127.0.0.1:3030/): …
+ ✔ Please enter tapdata port. (Default: 3030): …
+ ✔ Please enter API server port. (Default: 3080): …
+ ✔ Does MongoDB require username/password?(y/n): … no
+ ✔ Does MongoDB require TLS/SSL?(y/n): … no
+ ✔ Please enter MongoDB host, port, database name(Default: 127.0.0.1:27017/tapdata): …
+ ✔ Does API Server response error code?(y/n): … yes
+ MongoDB URI: mongodb://127.0.0.1:27017/tapdata
+ MongoDB connection command: mongo mongodb://127.0.0.1:27017/tapdata
+ System initialized. To start TapData, run: tapdata start
+ WORK DIR:/root/tapdata
+ Testing JDK...
+ Java version:1.8
+ Java environment OK.
+ Unpack the files...
+ Restart TapDataAgent ...:
+ TapDataAgent starting ...:
+ ```
+
+ * **Please enter backend url**: Set the login address for the TapData platform, default is `http://127.0.0.1:3030/`.
+ * **Please enter tapdata port**: Set the login port for the TapData platform, default is `3030`.
+ * **Please enter API server port**: Set the service port for the API Server, default is `3080`.
+ * **Does MongoDB require username/password?**: If the MongoDB database has security authentication enabled, enter **y**, then follow the prompts to enter the username, password, and authentication database (default is `admin`).
+ * **Does MongoDB require TLS/SSL?(y/n)**: If the MongoDB database has TLS/SSL encryption enabled, enter **y**, then follow the prompts to enter the absolute paths of the CA certificate and certificate key files, as well as the password for the certificate key file.
+ * **Please enter MongoDB host, port, database name**: Set the URI connection information for the MongoDB database, default is `127.0.0.1:27017/tapdata`.
+ * **Does API Server response error code?**: Whether to enable error-code responses from the API Server.
+
+ After successful deployment, the command line returns the following example:
+
+ ```bash
+ deployed connector.
+ Waiting for the flow engine to start \
+ FlowEngine is startup at : 2023-04-01 23:00
+ API service started
+ ```
+
+7. Log in to the TapData platform through a browser. The local login address is [http://127.0.0.1:3030](http://127.0.0.1:3030). Please change the password promptly after the first login to ensure security.
+
+ :::tip
+
+ To access the TapData service from other devices on the same internal network, ensure the network is reachable, for example, by [configuring Windows Firewall](https://learn.microsoft.com/en-us/windows/security/threat-protection/windows-firewall/configure-the-windows-firewall-to-allow-sql-server-access) to allow access to ports 3030 and 3080 on the local machine.
+
+ :::
+
+
+
+
+
+
+## Next Steps
+
+[Connect a Data Source](../connect-data-source.md)
diff --git a/docs/installation/install-tapdata-agent.md b/docs/getting-started/install-and-setup/install-tapdata-agent.md
similarity index 86%
rename from docs/installation/install-tapdata-agent.md
rename to docs/getting-started/install-and-setup/install-tapdata-agent.md
index 8ee04418..4c2139c0 100644
--- a/docs/installation/install-tapdata-agent.md
+++ b/docs/getting-started/install-and-setup/install-tapdata-agent.md
@@ -1,9 +1,5 @@
# TapData Cloud
-import Content from '../reuse-content/_cloud-features.md';
-
-
-
The TapData Agent is an essential component for data synchronization, data heterogeneity, and data pipeline scenarios. While it is recommended to install the TapData Agent within the local network where the database is located for real-time processing, an alternative option is available. You can also install the TapData Agent on the TapData Cloud server, eliminating the need for setting up a machine locally. This provides flexibility and convenience for managing your data flow.
```mdx-code-block
@@ -19,11 +15,11 @@ import TabItem from '@theme/TabItem';
## Procedure
-TapData Cloud offers pricing based on the specifications and quantity of subscribed Agent instances. You have the option to create one free instance of the **SMALL** specification Agent, and if required, you can [purchase additional Agent instances](../billing/billing-overview.md) to align with your specific business requirements.
+TapData Cloud offers pricing based on the specifications and quantity of subscribed Agent instances. You have the option to create one free instance of the **SMALL** specification Agent, and if required, you can purchase additional Agent instances to align with your specific business requirements.
Next, let's create a free Agent instance.
-1. [Log in to TapData Platform](../user-guide/log-in.md).
+1. Log in to TapData Platform.
2. In the left navigation panel, click **Resource Management**.
@@ -31,7 +27,7 @@ Next, let's create a free Agent instance.
4. In the pop-up dialog, select deploy mode, spec and subscription period.
- 
+ 
* **Deploy Mode**
* **Self-Hosted Mode**: You need to provide the equipment for [deploying](#deploy-agent) and maintaining the Agent. This allows for the optimal utilization of existing hardware resources, resulting in lower costs and enhanced security.
@@ -39,7 +35,7 @@ Next, let's create a free Agent instance.
:::tip
When selecting the **Fully Managed Mode**, you also need to choose the cloud provider and region where the Agent will be deployed.
:::
- * **Agent Spec**: Select product specifications based on the number of tasks and performance requirements required for evaluation. You can create an example of **SMALL** specifications for free. For detailed descriptions of product pricing and specifications, see [Billing Overview](../billing/billing-overview.md).
+ * **Agent Spec**: Select a product specification based on the number of tasks and your performance requirements. You can create one instance of the **SMALL** specification for free.
* **Subscription Period**: Select the required subscription period. To avoid instance expiration interrupting task execution, it is recommended to choose Annually (**10% off**) or Monthly (**5% off**).
5. Click **Subscription**.
@@ -48,7 +44,7 @@ Next, let's create a free Agent instance.
1. Select the deployment platform on the redirected page.
- 
+ 
2. Click **Copy** to obtain the deployment command.
@@ -71,7 +67,7 @@ Next, let's create a free Agent instance.
1. Log in to the device where the Agent will be deployed (without root privileges), create a folder first (e.g., **tapdata**) and enter it for easier management of the Agent.
2. Paste and execute the installation command you copied before, which downloads, deploys, and launches the Agent; a successful launch is shown in the figure below.
- 
+ 
@@ -81,7 +77,7 @@ Next, let's create a free Agent instance.
2. Paste and execute the installation command that you copied before, which downloads, deploys, and launches the Agent. After a successful launch, you can retrieve the container ID, as shown in the picture below.
- 
+ 
@@ -97,7 +93,7 @@ Next, let's create a free Agent instance.
4. (Optional) Double-click the **status.bat** in the Agent installation directory to check the status of the Agent. The following is an example of a normal startup.
- 
+ 
@@ -144,19 +140,10 @@ Next, let's create a free Agent instance.
5. Return to the Deployment page on TapData Cloud, select **Linux(64 bit)** as the target operating system, and click **Copy**.
- 
+ 
6. In the Docker container's command line, paste the copied command, remove the content before `./tapdata`, and then execute it. A successful startup is shown in the figure below.
- 
-
-
-
-## Next step
-
-[Connect Data Sources](../quick-start/connect-database.md)
-
-## See also
+ 
-* [Manage Agent](../user-guide/manage-agent.md)
-* [FAQ about Agent](../faq/agent-installation.md)
+
\ No newline at end of file
diff --git a/docs/getting-started/publish-imv-as-api.md b/docs/getting-started/publish-imv-as-api.md
new file mode 100644
index 00000000..2f80b5bc
--- /dev/null
+++ b/docs/getting-started/publish-imv-as-api.md
@@ -0,0 +1,56 @@
+# Step 4: Publish Your View as an API
+
+After creating your Incremental Materialized View, you can easily publish APIs that provide flexible read-only access to your view. This lets your marketing platforms, BI dashboards, or any downstream services securely access up-to-date data in real time—without additional ETL or complex SQL queries.
+
+## Procedure
+
+Follow these steps to publish an API service that provides access to [your materialized view](build-real-time-materialized-view.md).
+
+1. Create an Application.
+
+ Applications help organize your APIs by business domain. In this example, you’ll create an app called **E-commerce Analytics** to keep all related services together.
+
+ 1. In the left navigation panel, go to **Data Services > Application List**.
+
+ 2. In the top right corner, click **Create Application**.
+
+ 3. In the dialog, enter a name and description, then click **Save**.
+
+ 
+
+2. Create an API Service.
+
+ 1. Go to **Data Services > Service Management** in the navigation panel.
+
+ 2. Click **Create Service** in the top right, then fill in the basic service details:
+
+ 
+
+ - **Service Name**: Enter a clear, meaningful name for easy identification.
+
+ - **Access Scope**: Define which roles can call this API.
+
+ Haven’t [set up roles](../system-admin/manage-role.md) yet? You can leave this blank for now and add roles later.
+
+ - **Own Application**: Select the application you just created, such as **E-commerce Analytics**.
+
+ - **Type**: Choose **MongoDB** as the data source type.
+
+ - **Name**: Select the Incremental Materialized View you created earlier—for example, `order_flat_view`.
+
+ - **API Path Settings**: You can keep the default endpoint or customize it as needed.
+
+ 3. Click **Save**.
+
+3. Publish the API.
+
+ Locate the API service you just created in the list. In the actions column, click **Publish** to make it available for downstream systems.
+
+### What’s next?
+
+Now that your API service is configured, complete the access and delivery setup:
+
+- [Create a Client](../publish-apis/create-api-client.md): Define access control by binding one or more **Roles**, and generate your authentication credentials (**token-based** or **basic auth**).
+- [Create a Server](../publish-apis/create-api-server.md) (API Endpoint): Define the public-facing API path for your data service, which downstream systems like BI tools or marketing platforms will call.
+
+Once configured, your downstream applications can securely fetch fresh, high-value data via REST or GraphQL—without ever querying the production database.
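+
+Once your Client and Server are set up, a typical REST call passes the issued token along with the request. A hypothetical sketch of assembling the request URL (host, path, and token are placeholders; substitute the values from your own API Server and Client configuration):
+
+```bash
+# Placeholder values; replace with your API Server address, service path, and client token.
+API_BASE="http://127.0.0.1:3080/api"
+SERVICE_PATH="order_flat_view"
+ACCESS_TOKEN="YOUR_TOKEN"
+REQUEST_URL="${API_BASE}/${SERVICE_PATH}?access_token=${ACCESS_TOKEN}"
+echo "${REQUEST_URL}"
+# Query it with, for example: curl -s "${REQUEST_URL}"
+```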
\ No newline at end of file
diff --git a/docs/images/add_columns.png b/docs/images/add_columns.png
index 2f93164a..8e1604db 100644
Binary files a/docs/images/add_columns.png and b/docs/images/add_columns.png differ
diff --git a/docs/images/add_field_rename_node.png b/docs/images/add_field_rename_node.png
new file mode 100644
index 00000000..973e6e06
Binary files /dev/null and b/docs/images/add_field_rename_node.png differ
diff --git a/docs/images/add_order_items_to_main_table.png b/docs/images/add_order_items_to_main_table.png
new file mode 100644
index 00000000..7de6d2ee
Binary files /dev/null and b/docs/images/add_order_items_to_main_table.png differ
diff --git a/docs/images/add_products_to_order_item.png b/docs/images/add_products_to_order_item.png
new file mode 100644
index 00000000..524fed34
Binary files /dev/null and b/docs/images/add_products_to_order_item.png differ
diff --git a/docs/images/add_tables_to_canvas.png b/docs/images/add_tables_to_canvas.png
new file mode 100644
index 00000000..05c8a945
Binary files /dev/null and b/docs/images/add_tables_to_canvas.png differ
diff --git a/docs/images/add_user_info_to_main_table.png b/docs/images/add_user_info_to_main_table.png
new file mode 100644
index 00000000..4c12d88d
Binary files /dev/null and b/docs/images/add_user_info_to_main_table.png differ
diff --git a/docs/images/api_versioning_demo.png b/docs/images/api_versioning_demo.png
new file mode 100644
index 00000000..f2638f33
Binary files /dev/null and b/docs/images/api_versioning_demo.png differ
diff --git a/docs/images/build_mv_in_mdm.png b/docs/images/build_mv_in_mdm.png
new file mode 100644
index 00000000..deeab22e
Binary files /dev/null and b/docs/images/build_mv_in_mdm.png differ
diff --git a/docs/images/check_data_result_en.png b/docs/images/check_data_result_en.png
index a5fe8737..f43ab56a 100644
Binary files a/docs/images/check_data_result_en.png and b/docs/images/check_data_result_en.png differ
diff --git a/docs/images/choose_replication_mode.png b/docs/images/choose_replication_mode.png
new file mode 100644
index 00000000..556f7ab7
Binary files /dev/null and b/docs/images/choose_replication_mode.png differ
diff --git a/docs/images/connect_metabase.png b/docs/images/connect_metabase.png
new file mode 100644
index 00000000..01d258f0
Binary files /dev/null and b/docs/images/connect_metabase.png differ
diff --git a/docs/images/connect_tables_with_merge_node.png b/docs/images/connect_tables_with_merge_node.png
new file mode 100644
index 00000000..ee06df5c
Binary files /dev/null and b/docs/images/connect_tables_with_merge_node.png differ
diff --git a/docs/images/create_api_for_order_flat_view.png b/docs/images/create_api_for_order_flat_view.png
new file mode 100644
index 00000000..c17dfca8
Binary files /dev/null and b/docs/images/create_api_for_order_flat_view.png differ
diff --git a/docs/images/create_api_group.png b/docs/images/create_api_group.png
new file mode 100644
index 00000000..441345c7
Binary files /dev/null and b/docs/images/create_api_group.png differ
diff --git a/docs/images/create_api_service.png b/docs/images/create_api_service.png
index 51ac591f..e1c7d851 100644
Binary files a/docs/images/create_api_service.png and b/docs/images/create_api_service.png differ
diff --git a/docs/images/create_category_in_mdm.png b/docs/images/create_category_in_mdm.png
new file mode 100644
index 00000000..d7804125
Binary files /dev/null and b/docs/images/create_category_in_mdm.png differ
diff --git a/docs/images/create_transformation_task_mdm.png b/docs/images/create_transformation_task_mdm.png
new file mode 100644
index 00000000..3bdfab50
Binary files /dev/null and b/docs/images/create_transformation_task_mdm.png differ
diff --git a/docs/images/delete_phone_field.png b/docs/images/delete_phone_field.png
new file mode 100644
index 00000000..7e05942d
Binary files /dev/null and b/docs/images/delete_phone_field.png differ
diff --git a/docs/images/design_imv_table_relations.png b/docs/images/design_imv_table_relations.png
new file mode 100644
index 00000000..82a6c1a4
Binary files /dev/null and b/docs/images/design_imv_table_relations.png differ
diff --git a/docs/images/drag_table_to_fdm.png b/docs/images/drag_table_to_fdm.png
new file mode 100644
index 00000000..5c5dbb53
Binary files /dev/null and b/docs/images/drag_table_to_fdm.png differ
diff --git a/docs/images/drag_user_on_orders.png b/docs/images/drag_user_on_orders.png
new file mode 100644
index 00000000..f9ba36b5
Binary files /dev/null and b/docs/images/drag_user_on_orders.png differ
diff --git a/docs/images/drag_view_to_api.png b/docs/images/drag_view_to_api.png
new file mode 100644
index 00000000..a54cb40d
Binary files /dev/null and b/docs/images/drag_view_to_api.png differ
diff --git a/docs/images/drag_view_to_downstream.png b/docs/images/drag_view_to_downstream.png
new file mode 100644
index 00000000..2b1750bc
Binary files /dev/null and b/docs/images/drag_view_to_downstream.png differ
diff --git a/docs/images/enable-odh.png b/docs/images/enable-odh.png
new file mode 100644
index 00000000..de8edd27
Binary files /dev/null and b/docs/images/enable-odh.png differ
diff --git a/docs/images/fdm_category.png b/docs/images/fdm_category.png
new file mode 100644
index 00000000..9f3b9caa
Binary files /dev/null and b/docs/images/fdm_category.png differ
diff --git a/docs/images/feishu-bitable_connection_setting.png b/docs/images/feishu-bitable_connection_setting.png
new file mode 100644
index 00000000..bfb2d172
Binary files /dev/null and b/docs/images/feishu-bitable_connection_setting.png differ
diff --git a/docs/images/fmd_users_table_demo.png b/docs/images/fmd_users_table_demo.png
new file mode 100644
index 00000000..848aa00c
Binary files /dev/null and b/docs/images/fmd_users_table_demo.png differ
diff --git a/docs/images/import_export_api.png b/docs/images/import_export_api.png
index 954845f9..ef12ce6f 100644
Binary files a/docs/images/import_export_api.png and b/docs/images/import_export_api.png differ
diff --git a/docs/images/imv-solution.gif b/docs/images/imv-solution.gif
new file mode 100644
index 00000000..c4625cfa
Binary files /dev/null and b/docs/images/imv-solution.gif differ
diff --git a/docs/images/join_order_items_and_order.png b/docs/images/join_order_items_and_order.png
new file mode 100644
index 00000000..6d9c2528
Binary files /dev/null and b/docs/images/join_order_items_and_order.png differ
diff --git a/docs/images/join_users_and_orders.png b/docs/images/join_users_and_orders.png
new file mode 100644
index 00000000..898d5f57
Binary files /dev/null and b/docs/images/join_users_and_orders.png differ
diff --git a/docs/images/lineage_for_fdm.png b/docs/images/lineage_for_fdm.png
new file mode 100644
index 00000000..ab76a4f8
Binary files /dev/null and b/docs/images/lineage_for_fdm.png differ
diff --git a/docs/images/mask_email_info.png b/docs/images/mask_email_info.png
new file mode 100644
index 00000000..027a92e5
Binary files /dev/null and b/docs/images/mask_email_info.png differ
diff --git a/docs/images/monitor_view_task.png b/docs/images/monitor_view_task.png
index 24b9521a..4e6fa9ef 100644
Binary files a/docs/images/monitor_view_task.png and b/docs/images/monitor_view_task.png differ
diff --git a/docs/images/obtain_access_token.png b/docs/images/obtain_access_token.png
index 25b0b2c8..5911cc2f 100644
Binary files a/docs/images/obtain_access_token.png and b/docs/images/obtain_access_token.png differ
diff --git a/docs/images/obtain_restful_address.png b/docs/images/obtain_restful_address.png
index edb833ee..87678ed9 100644
Binary files a/docs/images/obtain_restful_address.png and b/docs/images/obtain_restful_address.png differ
diff --git a/docs/images/odh-layer.png b/docs/images/odh-layer.png
new file mode 100644
index 00000000..0b0f8ccb
Binary files /dev/null and b/docs/images/odh-layer.png differ
diff --git a/docs/images/odh_architecture.png b/docs/images/odh_architecture.png
new file mode 100644
index 00000000..03f4ceba
Binary files /dev/null and b/docs/images/odh_architecture.png differ
diff --git a/docs/images/odh_architecture_bak.png b/docs/images/odh_architecture_bak.png
new file mode 100644
index 00000000..dc3bb92a
Binary files /dev/null and b/docs/images/odh_architecture_bak.png differ
diff --git a/docs/images/orders_enhanced_IMV_task.png b/docs/images/orders_enhanced_IMV_task.png
new file mode 100644
index 00000000..4a7d3099
Binary files /dev/null and b/docs/images/orders_enhanced_IMV_task.png differ
diff --git a/docs/images/performance_benchmark.png b/docs/images/performance_benchmark.png
new file mode 100644
index 00000000..0e322731
Binary files /dev/null and b/docs/images/performance_benchmark.png differ
diff --git a/docs/images/pre_build_in_mdm.png b/docs/images/pre_build_in_mdm.png
new file mode 100644
index 00000000..5c7329f9
Binary files /dev/null and b/docs/images/pre_build_in_mdm.png differ
diff --git a/docs/images/related_task_in_fdm.png b/docs/images/related_task_in_fdm.png
new file mode 100644
index 00000000..ab0109ee
Binary files /dev/null and b/docs/images/related_task_in_fdm.png differ
diff --git a/docs/images/schema_in_fdm.png b/docs/images/schema_in_fdm.png
new file mode 100644
index 00000000..48de5204
Binary files /dev/null and b/docs/images/schema_in_fdm.png differ
diff --git a/docs/images/select_main_table.png b/docs/images/select_main_table.png
index 89a9cf1d..c6d653b3 100644
Binary files a/docs/images/select_main_table.png and b/docs/images/select_main_table.png differ
diff --git a/docs/images/select_tables_from_fdm.png b/docs/images/select_tables_from_fdm.png
new file mode 100644
index 00000000..13dfdb76
Binary files /dev/null and b/docs/images/select_tables_from_fdm.png differ
diff --git a/docs/images/select_view_write_target.png b/docs/images/select_view_write_target.png
index 5f50645f..e18cb794 100644
Binary files a/docs/images/select_view_write_target.png and b/docs/images/select_view_write_target.png differ
diff --git a/docs/images/set_primary_key_for_user.png b/docs/images/set_primary_key_for_user.png
new file mode 100644
index 00000000..69bb0e72
Binary files /dev/null and b/docs/images/set_primary_key_for_user.png differ
diff --git a/docs/images/set_targe_mongodb.png b/docs/images/set_targe_mongodb.png
new file mode 100644
index 00000000..aff03680
Binary files /dev/null and b/docs/images/set_targe_mongodb.png differ
diff --git a/docs/images/table_overview_in_fdm.png b/docs/images/table_overview_in_fdm.png
new file mode 100644
index 00000000..c78ae781
Binary files /dev/null and b/docs/images/table_overview_in_fdm.png differ
diff --git a/docs/images/tapdata_architecture.png b/docs/images/tapdata_architecture.png
new file mode 100644
index 00000000..6e68d729
Binary files /dev/null and b/docs/images/tapdata_architecture.png differ
diff --git a/docs/images/tapdata_resource_efficiency.png b/docs/images/tapdata_resource_efficiency.png
new file mode 100644
index 00000000..3110e490
Binary files /dev/null and b/docs/images/tapdata_resource_efficiency.png differ
diff --git a/docs/images/tapdata_throughput.png b/docs/images/tapdata_throughput.png
new file mode 100644
index 00000000..592dc903
Binary files /dev/null and b/docs/images/tapdata_throughput.png differ
diff --git a/docs/images/tapdata_vs_informatica_pef.png b/docs/images/tapdata_vs_informatica_pef.png
new file mode 100644
index 00000000..bfcd5c84
Binary files /dev/null and b/docs/images/tapdata_vs_informatica_pef.png differ
diff --git a/docs/images/try_query_api.png b/docs/images/try_query_api.png
index 81180f02..0fcc4cc0 100644
Binary files a/docs/images/try_query_api.png and b/docs/images/try_query_api.png differ
diff --git a/docs/images/view_table_in_fdm.png b/docs/images/view_table_in_fdm.png
new file mode 100644
index 00000000..c7e0ad69
Binary files /dev/null and b/docs/images/view_table_in_fdm.png differ
diff --git a/docs/installation/README.md b/docs/installation/README.md
deleted file mode 100644
index 9f4d539f..00000000
--- a/docs/installation/README.md
+++ /dev/null
@@ -1,25 +0,0 @@
-# Installation
-
-import Content from '../reuse-content/_all-features.md';
-
-
-
-TapData offers three deployment modes, **Cloud** , **Enterprise** and **Community**, to meet your diversified needs:
-
-| Product | Applicable Scenarios |
-|-----------------|----------------------------------------------------------------------|
-| [TapData Cloud](install-tapdata-agent.md) | Sign up for a [TapData Cloud](https://cloud.tapdata.net/console/v3/) account for use. Suitable for scenarios requiring rapid deployment and low initial investment, helping you focus more on business development rather than infrastructure management. |
-| [TapData Enterprise](install-tapdata-enterprise/README.md) | Supports deployment to local data centers. Suitable for scenarios with strict requirements on data sensitivity or network isolation, such as financial institutions, government departments, or large enterprises that want full control over their data. |
-| [TapData Community](install-tapdata-community.md) | An [open-source](https://github.com/tapdata/tapdata) data integration platform that provides basic data synchronization and transformation capabilities. This helps you quickly explore and implement data integration projects. As your project or business grows, you can seamlessly upgrade to TapData Cloud or TapData Enterprise to access more advanced features and service support. |
-
-:::tip
-
-For more information, see [Edition Comparison](../introduction/compare-editions.md).
-
-:::
-
-Follow the docs below to deploy TapData:
-
-import DocCardList from '@theme/DocCardList';
-
-
diff --git a/docs/installation/install-tapdata-enterprise/README.md b/docs/installation/install-tapdata-enterprise/README.md
deleted file mode 100644
index e8bfeb0d..00000000
--- a/docs/installation/install-tapdata-enterprise/README.md
+++ /dev/null
@@ -1,11 +0,0 @@
-# TapData Enterprise
-
-import Content2 from '../../reuse-content/_enterprise-features.md';
-
-
-
-TapData Enterprise supports single-node or high-availability deployment. If you are deploying in a production environment, it is recommended to use the [high-availability deployment](../../administration/production-deploy/install-tapdata-ha.md) method.
-
-import DocCardList from '@theme/DocCardList';
-
-
\ No newline at end of file
diff --git a/docs/installation/install-tapdata-enterprise/install-on-windows.md b/docs/installation/install-tapdata-enterprise/install-on-windows.md
deleted file mode 100644
index ced1838d..00000000
--- a/docs/installation/install-tapdata-enterprise/install-on-windows.md
+++ /dev/null
@@ -1,154 +0,0 @@
-# Install on Windows
-
-import Content from '../../reuse-content/_enterprise-features.md';
-
-
-
-This guide explains how to quickly deploy TapData services on a Windows platform.
-
-:::tip
-
-Stand-alone deployment is suitable for functional testing scenarios. For production environments, it is recommended to use [high-availability deployment](../../administration/production-deploy/install-tapdata-ha.md).
-
-:::
-
-## Hardware & Software Requirements
-
-- CPU: 8 cores
-- Memory: 16 GB
-- Storage Space: 100 GB
-- Operating System: Windows OS (64-bit)
-
-## Preparation
-
-1. [Install MongoDB](../../administration/production-deploy/install-replica-mongodb.md) (version 4.0 and above), which will serve as the storage system for TapData to run related data, such as logs and metadata.
-
-2. Log in to the target device, install Java 1.8 and set environment variables.
-
- 1. [Download Java 1.8](https://www.oracle.com/java/technologies/javase/javase8-archive-downloads.html) and follow the prompts to complete the installation.
-
- 2. Go to **Control Panel** > **System and Security** > **System**.
-
- 3. Click **Advanced System Settings** on the left, then click **Environment Variables**.
-
- 
-
- 4. In the dialog that appears, click **New** under **System Variables**, fill in the variable name and value, and click **OK**.
-
- 
-
- - **Variable Name**: `JAVA_HOME`
- - **Variable Value**: The installation path of JDK, for example, `C:\Program Files\Java\jdk1.8.0_202`
-
- 5. In the **System Variables** area, find and double-click the **Path** variable, then in the dialog that appears, add the following environment variables, and click **OK**.
-
- 
-
- - `%JAVA_HOME%\bin`
- - `%JAVA_HOME%\jre\bin`
-
- 6. Following step 4, continue to add a system variable with the name and value as follows, then click **OK** after completing the setup.
-
- - **Variable Name**: `CLASSPATH`
- - **Variable Value**: `.;%JAVA_HOME%\lib;%JAVA_HOME%\lib\dt.jar;%JAVA_HOME%\lib\tools.jar`
-
- 7. (Optional) Open the command line, execute `java -version` to verify the effectiveness of the environment variable. Successful execution example:
-
- ```bash
- java version "1.8.0_202"
- Java(TM) SE Runtime Environment (build 1.8.0_202-b08)
- Java HotSpot(TM) 64-Bit Server VM (build 25.202-b08, mixed mode)
- ```
-
-
-
-## Procedure
-
-:::tip
-
-This example uses Windows Server 2019 to demonstrate the deployment process.
-
-:::
-
-1. Download the TapData installation package (you can [contact us](mailto:team@tapdata.io) to obtain it) and unzip the package to the desired directory.
-
-2. Open the command line, navigate to the unzipped directory by executing the following command, in this example, `D:\tapdata`.
-
- ```bash
- cd /d D:\tapdata
- ```
-
-3. Prepare the License file.
-
- 1. Execute the following command to obtain the SID information required for the application.
-
- ```bash
- java -cp components/tm.jar -Dloader.main=com.tapdata.tm.license.util.SidGenerator org.springframework.boot.loader.PropertiesLauncher
- ```
-
- 2. Provide the printed SID information to the TapData support team to complete the License application process.
-
- 3. Upload the obtained License file to the unzipped directory.
-
-2. Execute `./tapdata.exe start` and follow the command line prompts to set TapData's login address, API service port, MongoDB connection information, etc. Example and explanations are as follows:
-
- ```bash
- ./tapdata.exe start
- _______ _____ _____ _______
- |__ __|/\ | __ \| __ \ /\|__ __|/\
- | | / \ | |__) | | | | / \ | | / \
- | | / /\ \ | ___/| | |/ /\ \ | | / /\ \
- | |/ ____ \| | | |__| / ____ \| |/ ____ \
- |_/_/ \_\_| |_____/_/ \_\_/_/ \_\
-
- WORK DIR:/root/tapdata
- Init tapdata...
- ✔ Please enter backend url, comma-separated list. e.g.:http://127.0.0.1:3030/ (Default: http://127.0.0.1:3030/): …
- ✔ Please enter tapdata port. (Default: 3030): …
- ✔ Please enter API server port. (Default: 3080): …
- ✔ Does MongoDB require username/password?(y/n): … no
- ✔ Does MongoDB require TLS/SSL?(y/n): … no
- ✔ Please enter MongoDB host, port, database name(Default: 127.0.0.1:27017/tapdata): …
- ✔ Does API Server response error code?(y/n): … yes
- MongoDB URI: mongodb://127.0.0.1:27017/tapdata
- MongoDB connection command: mongo mongodb://127.0.0.1:27017/tapdata
- System initialized. To start TapData, run: tapdata start
- WORK DIR:/root/tapdata
- Testing JDK...
- Java version:1.8
- Java environment OK.
- Unpack the files...
- Restart TapDataAgent ...:
- TapDataAgent starting ...:
- ```
-
- * **Please enter backend url**: Set the login address for the TapData platform, default is `http://127.0.0.1:3030/`.
- * **Please enter tapdata port**: Set the login port for the TapData platform, default is `3030`.
- * **Please enter API server port**: Set the service port for the API Server, default is `3080`.
- * **Does MongoDB require username/password?**: If MongoDB database has enabled security authentication, enter **y** then follow prompts to enter username, password, and the authentication database (default is `admin`).
- * **Does MongoDB require TLS/SSL?(y/n)**: If MongoDB database has enabled TSL/SSL encryption, enter **y** then follow prompts to enter the absolute path of the CA certificate and Certificate Key file, and the file password of the Certificate Key.
- * **Please enter MongoDB host, port, database name**: Set the URI connection information for the MongoDB database, default is `127.0.0.1:27017/tapdata`.
- * **Does API Server response error code?**: Whether to enable API Server response error code function.
-
- After successful deployment, the command line returns the following example:
-
- ```bash
- deployed connector.
- Waiting for the flow engine to start \
- FlowEngine is startup at : 2023-04-01 23:00
- API service started
- ```
-
-3. Log in to the TapData platform through a browser. The local login address is [http://127.0.0.1:3030](http://127.0.0.1:3030). Please change the password promptly after the first login to ensure security.
-
- :::tip
-
- To access the TapData service from other devices on the same internal network, ensure the network is intercommunicable, for example, [setting Windows Firewall](https://learn.microsoft.com/en-us/windows/security/threat-protection/windows-firewall/configure-the-windows-firewall-to-allow-sql-server-access) to allow access to ports 3030 and 3080 on the local machine.
-
- :::
-
-
-
-## Next Steps
-
-[Connect to Databases](../../quick-start/connect-database.md)
\ No newline at end of file
diff --git a/docs/installation/install-tapdata-enterprise/install-tapdata-stand-alone.md b/docs/installation/install-tapdata-enterprise/install-tapdata-stand-alone.md
deleted file mode 100644
index b0f4911f..00000000
--- a/docs/installation/install-tapdata-enterprise/install-tapdata-stand-alone.md
+++ /dev/null
@@ -1,139 +0,0 @@
-# Install on Linux
-
-import Content from '../../reuse-content/_enterprise-features.md';
-
-
-
-This document explains how to quickly deploy TapData service on a Linux platform.
-
-:::tip
-
-Stand-alone deployment is suitable for functional testing scenarios. For production environments, it is recommended to use [high availability deployment](../../administration/production-deploy/install-tapdata-ha.md).
-
-:::
-
-## Hardware & Software Requirements
-
-* CPU: 8 cores
-* Memory: 16 GB
-* Storage Space: 100 GB
-* Operating System: **CentOS 7+** , **Ubuntu 16.04+** or **Red Hat Enterprise Linux(RHEL)7.x/8.x**
-
-## Procedure
-
-This guide uses CentOS 7 as an example to demonstrate the deployment process.
-
-1. Log in to the target device and execute the following commands to set system parameters such as file access numbers and firewall.
-
- ```bash
- ulimit -n 1024000
- echo "* soft nofile 1024000" >> /etc/security/limits.conf
- echo "* hard nofile 1024000" >> /etc/security/limits.conf
- systemctl disable firewalld.service
- systemctl stop firewalld.service
- setenforce 0
- sed -i "s/enforcing/disabled/g" /etc/selinux/config
- ```
-
-2. Install environmental dependencies.
-
- 1. Install Java 1.8 version.
-
- ```bash
- yum -y install java-1.8.0-openjdk
- ```
-
- 2. [Install MongoDB](../../administration/production-deploy/install-replica-mongodb.md) (version 4.0 and above), which will serve as the storage system for TapData to run related data, such as logs and metadata.
-
-3. Download the TapData installation package (contact us at [team@tapdata.io](mailto:team@tapdata.io) to obtain it) and upload it to the target device.
-
-4. On the target device, execute the command below to unzip the package and enter the unzipped directory.
-
- ```bash
- tar -zxvf package_name && cd tapdata
- ```
-
- For example: `tar -zxvf tapdata-release-v2.14.tar.gz && cd tapdata`
-
-5. Prepare the License file.
-
- 1. Execute the following command to obtain the SID information required for the application.
-
- ```bash
- java -cp components/tm.jar -Dloader.main=com.tapdata.tm.license.util.SidGenerator org.springframework.boot.loader.PropertiesLauncher
- ```
-
- 2. Provide the printed SID information to the TapData support team to complete the License application process.
-
- 3. Upload the acquired License file to the unzipped directory (**tapdata**).
-
-6. Execute `./tapdata start` and follow the command-line prompts to set TapData's login address, API service port, MongoDB connection information, etc. The example and explanation are as follows:
-
- :::tip
-
- If deploying with a non-root user, avoid using `sudo` to elevate privileges to prevent installation failure. Before executing commands, use `sudo chown -R :` or `sudo chmod -R 777 ` to grant full permissions to the installation directory for the current user.
-
- :::
-
- ```bash
- ./tapdata start
- _______ _____ _____ _______
- |__ __|/\ | __ \| __ \ /\|__ __|/\
- | | / \ | |__) | | | | / \ | | / \
- | | / /\ \ | ___/| | | |/ /\ \ | | / /\ \
- | |/ ____ \| | | |__| / ____ \| |/ ____ \
- |_/_/ \_\_| |_____/_/ \_\_/_/ \_\
-
- WORK DIR:/root/tapdata
- Init tapdata...
- ✔ Please enter backend url, comma separated list. e.g.:http://127.0.0.1:3030/ (Default: http://127.0.0.1:3030/): …
- ✔ Please enter tapdata port. (Default: 3030): …
- ✔ Please enter api server port. (Default: 3080): …
- ✔ Does MongoDB require username/password?(y/n): … no
- ✔ Does MongoDB require TLS/SSL?(y/n): … no
- ✔ Please enter MongoDB host, port, database name(Default: 127.0.0.1:27017/tapdata): …
- ✔ Does API Server response error code?(y/n): … yes
- MongoDB uri: mongodb://127.0.0.1:27017/tapdata
- MongoDB connection command: mongo mongodb://127.0.0.1:27017/tapdata
- System initialized. To start TapData, run: tapdata start
- WORK DIR:/root/tapdata
- Testing JDK...
- java version:1.8
- Java environment OK.
- Unpack the files...
- Restart TapDataAgent ...:
- TapDataAgent starting ...:
- ```
-
- * **Please enter backend url**: Set the login address for the TapData platform, by default `http://127.0.0.1:3030/`
- * **Please enter tapdata port**: Set the login port for the TapData platform, by default `3030`.
- * **Please enter api server port**: Set the service port for the API Server, by default `3080`.
- * **Does MongoDB require username/password?**: If MongoDB database has security authentication enabled, enter **y** then follow the prompts to enter the username, password, and the authentication database (default `admin`).
- * **Does MongoDB require TLS/SSL?(y/n)**: If MongoDB database has TLS/SSL encryption enabled, enter **y** then follow the prompts to enter the absolute path addresses of the CA certificate and Certificate Key files, as well as the file password for the Certificate Key.
- * **Please enter MongoDB host, port, database name**: Set the URI connection information for the MongoDB database, by default `127.0.0.1:27017/tapdata`.
- * **Does API Server response error code?**: Whether to enable the API Server to respond with error codes.
-
- After successful deployment, the command line will return a message similar to the following:
-
- ```bash
- deployed connector.
- Waiting for the flow engine to start \
- FlowEngine is startup at : 2023-04-01 23:00
- API service started
- ```
-
-7. Log in to the TapData platform through a browser. The login address for this machine is [http://127.0.0.1:3030](http://127.0.0.1:3030).
-
-Please change your password promptly upon first login to ensure security.
-
-:::tip
-
-If you need to access the TapData service from other devices in the same network, ensure network interoperability.
-
-:::
-
-
-
-## Next Steps
-
-[Connect to Databases](../../quick-start/connect-database.md)
\ No newline at end of file
diff --git a/docs/introduction/README.md b/docs/introduction/README.md
index 3c93e553..33ce2d89 100644
--- a/docs/introduction/README.md
+++ b/docs/introduction/README.md
@@ -1,8 +1,6 @@
# Product Introduction
-import Content from '../reuse-content/_all-features.md';
-
import DocCardList from '@theme/DocCardList';
diff --git a/docs/introduction/benefits.md b/docs/introduction/_benefits.md
similarity index 97%
rename from docs/introduction/benefits.md
rename to docs/introduction/_benefits.md
index aabcff48..f8d84d37 100644
--- a/docs/introduction/benefits.md
+++ b/docs/introduction/_benefits.md
@@ -1,8 +1,6 @@
# Benefits
-import Content from '../reuse-content/_all-features.md';
-
## Innovative Real-time Data Synchronization
diff --git a/docs/introduction/architecture.md b/docs/introduction/architecture.md
index 3a5769be..47062daf 100644
--- a/docs/introduction/architecture.md
+++ b/docs/introduction/architecture.md
@@ -1,41 +1,74 @@
# Architecture and Workflow
-import Content from '../reuse-content/_all-features.md';
+Discover how TapData’s unified, real-time architecture brings together data integration, transformation, and delivery—making high-quality, always-fresh data available wherever your business needs it.
-
+## Live Data Platform Overview
-As a new-generation real-time data service platform, TapData enables enterprises to easily break the limitations of data silos. It provides real-time, accurate data for analytical and transactional business workloads through real-time data collection technology, flexible data processing methods, comprehensive data governance capabilities, and convenient data publishing methods, supporting businesses in achieving more agile innovation.
+The TapData Live Data Platform turns fragmented data into real-time, actionable data services in three simple steps:
-## TapData Cloud Architecture
-TapData Cloud components include TapData Cloud Manager and TapData Agent:
+
-* **TapData Cloud Manager** (TCM): Responsible for installing and configuring agents, as well as designing data tasks and monitoring the status of tasks.
-* **TapData Agent:** Obtain task information from the TapData Cloud Manager (TCM), processing and converting the data to be sent to the target, and reporting the task status back to the TCM during the execution of the task.
+1. Connect
+   Seamlessly ingest data from any source—databases (Oracle, MSSQL, MySQL, etc.) or event streams (Kafka)—using built-in no-code connectors based on CDC (Change Data Capture) technology.
+2. Transform
+   Design data flows with drag-and-drop transformations: merge, clean, or enrich data without writing code. Need custom logic? Add a JavaScript custom script node to the pipeline in seconds.
+3. Serve
+ Instantly expose data as REST/GraphQL APIs, sync to downstream systems, or push to analytics platforms—all with sub-second latency.
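The three steps above can be sketched as a toy flow (illustrative Python only — every function name here is invented for the sketch and is not part of the TapData SDK):

```python
# Illustrative only: a toy connect -> transform -> serve flow.

def connect():
    # "Connect": pretend these rows were captured from a source via CDC.
    return [
        {"op": "insert", "id": 1, "name": "Ada", "country": "UK"},
        {"op": "insert", "id": 2, "name": "Bob", "country": "US"},
        {"op": "update", "id": 1, "name": "Ada L.", "country": "UK"},
    ]

def transform(events):
    # "Transform": enrich each event in flight, without touching the source.
    for e in events:
        e["name_upper"] = e["name"].upper()
        yield e

def serve(events):
    # "Serve": materialize the latest state per key, ready to expose via an API.
    view = {}
    for e in events:
        view[e["id"]] = {k: v for k, v in e.items() if k != "op"}
    return view

api_view = serve(transform(connect()))
print(api_view[1]["name"])  # latest value wins: "Ada L."
```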
-
+### Tap CDC: Connector Layer (Ingest)
+- **100+ Ready-to-Use Connectors**
+ Connect to databases (PostgreSQL, MySQL, Oracle with CDC), SaaS apps (Salesforce, Workday), and event streams (Kafka, Debezium).
+- **Change Data Capture (CDC)**
+ Log-based (binlog, WAL) sync with sub-second latency for critical systems.
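The log-based pattern can be illustrated with a minimal event-apply loop — a generic sketch of how a CDC consumer upserts and deletes rows from full row images, not TapData's internal event format:

```python
# Generic CDC-apply sketch: replay row-level change events onto a replica.
replica = {}  # target table keyed by primary key

def apply_event(event):
    op, key = event["op"], event["key"]
    if op == "delete":
        replica.pop(key, None)        # tombstone: drop the row
    else:                             # "insert" or "update"
        replica[key] = event["row"]   # upsert the full row image

log = [
    {"op": "insert", "key": 1, "row": {"id": 1, "amount": 10}},
    {"op": "update", "key": 1, "row": {"id": 1, "amount": 25}},
    {"op": "insert", "key": 2, "row": {"id": 2, "amount": 5}},
    {"op": "delete", "key": 2, "row": None},
]
for e in log:
    apply_event(e)

print(replica)  # {1: {'id': 1, 'amount': 25}}
```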
-TapData employs a range of cyber-security measures to ensure the protection and security of user data and information.
+### Tap Store: Storage Layer (Persist & Model)
-* **One-way Connection**: The TapData Agent instance does not actively expose network information, and only connects to the TCM management service to obtain task information and report status information.
-* **HTTPS Protocol**: TapData Agent instances establish communication with TCM using the HTTPS protocol, ensuring protection against information theft and tampering.
-* **Trusted Environment**: In self-built mode, all data is transmitted exclusively within the user's server and networking environment, ensuring that there is no risk of data leakage.
+- **High-Availability Storage**
+ Built on MongoDB’s replica set architecture, Tap Store ensures data durability and automatic failover—keeping your pipelines running even if a node goes down.
+- **Materialized Views & Data Models**
+ Persist processed datasets and Incremental Materialized Views (IMVs) for instant query access.
+- **Schema Flexibility**
+ Store structured, semi-structured, or nested JSON data without complex schema migrations.
+- **Query-Ready**
+ Expose stored data via SQL, REST, or GraphQL APIs, enabling fast lookups for applications, analytics, and AI/ML workloads.
+### Tap Flow: Processing Layer (Transform)
-## TapData Enterprise Architecture
+- **Streaming-Native Pipelines**
+ In-memory execution enables millisecond-level transformations.
+- **Visual Orchestration**
+ Drag-and-drop operators to filter, join, mask—no code required.
-
+### Serving Layer (Deliver)
-TapData Enterprise is structured into four layers, from left to right:
+- **Multi-Model Data Serving / Delivering**
+  Deliver real-time data via APIs, support reverse sync to databases, and integrate with LLMs through MCP Server.
+- **Virtual Data Products**
+ Package and expose trusted datasets (e.g., `UserProfile`) with built-in access control.
-- **Data Collection Layer**: Based on log parsing capabilities and through established plugin data connectors, it collects changes in data sources in real-time and standardizes them. The standardized data then enters the stream processing framework.
-- **Stream Data Processing Layer**: Through TapData's proprietary solution, data computation, modeling, and transformation are completed within the process, quickly yielding results that move into the storage layer.
-- **Storage Layer**: By the time data is placed into the storage layer, a logical model has already been formed. Users only need to focus on the data required for their business, without concern for the storage location.
-- **Service Layer**: In the service layer, there are two mainstream data service models: Pull and Push. The API supports low-code publishing and can be released according to specific needs. When the required data is already stored in the business system, it can be pushed to users through REVERSE ETL after being organized with governance applied.
-:::tip
+## Key Concepts
-If you do not need to deploy TapData locally, you can choose to use TapData Cloud. For more introduction, see [Version Comparison](https://tapdata.net/pricing.html).
+### Data Pipeline
-:::
+A real-time, always-on dataflow that continuously ingests, transforms, and delivers updates—unlike traditional batch-based ETL.
+### Change Data Capture (CDC)
+
+TapData captures row-level changes (inserts, updates, deletes) at the source to eliminate full-table scans. [Learn more →](change-data-capture-mechanism.md)
+
+### Incremental Materialized View (IMV)
+
+A continuously updated view that processes only changes—ideal for low-latency analytics without expensive refreshes.
+
+#### Key Benefits
+
+- **CDC-Powered Updates**
+ Keeps views fresh by applying only new changes via binlog or WAL—no batch jobs required.
+- **Sub-Second Freshness**
+ Each change is processed in real time, with typical latency under 500ms.
+- **Optimized for Query Performance**
+ Results are persisted in high-speed storage (e.g., Redis, PostgreSQL), accessible via SQL, REST, or GraphQL.
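The delta-only idea behind an IMV can be sketched as follows (a hand-rolled Python illustration of incremental aggregate maintenance, not the platform's implementation):

```python
# Keep an aggregate fresh by applying deltas from change events
# instead of rescanning the base table.
totals = {}  # materialized view: revenue per region

def on_change(event):
    # For an update, subtract the old row image and add the new one.
    old, new = event.get("old"), event.get("new")
    if old:
        totals[old["region"]] = totals.get(old["region"], 0) - old["amount"]
    if new:
        totals[new["region"]] = totals.get(new["region"], 0) + new["amount"]

on_change({"new": {"region": "EU", "amount": 100}})            # insert
on_change({"new": {"region": "US", "amount": 40}})             # insert
on_change({"old": {"region": "EU", "amount": 100},
           "new": {"region": "EU", "amount": 120}})            # update
print(totals)  # {'EU': 120, 'US': 40}
```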
+
+[Learn more about IMV →](../getting-started/build-real-time-materialized-view.md)
\ No newline at end of file
diff --git a/docs/introduction/change-data-capture-mechanism.md b/docs/introduction/change-data-capture-mechanism.md
index a0ce4c44..171f224d 100644
--- a/docs/introduction/change-data-capture-mechanism.md
+++ b/docs/introduction/change-data-capture-mechanism.md
@@ -1,7 +1,5 @@
# Change Data Capture (CDC)
-import Content from '../reuse-content/_all-features.md';
-
Change Data Capture (CDC) is a method for capturing and tracking data changes in a database. It plays a crucial role in data synchronization and integration, enabling incremental data synchronization. This document provides a detailed overview of the various CDC methods, helping you understand their working principles, advantages, and disadvantages, and offering specific usage instructions.
@@ -30,7 +28,7 @@ binlog_format = row
binlog_row_image = full
```
-After completing [permission granting and data source connection](../prerequisites/on-prem-databases/mysql.md), you can configure it as a data source in Tapdata's task configuration to achieve full and incremental data synchronization (default).
+After completing [permission granting and data source connection](../connectors/on-prem-databases/mysql.md), you can configure it as a data source in Tapdata's task configuration to achieve full and incremental data synchronization (default).

@@ -39,7 +37,7 @@ After completing [permission granting and data source connection](../prerequisit
For Oracle and Db2 data sources, Tapdata provides raw log parsing capability in addition to the traditional LogMiner-based CDC. This approach directly parses the native binary log files, achieving more efficient event capture with higher collection performance (Records Per Second, RPS, over 20,000), and reduces the impact on the source database.
-This solution requires the additional installation of a log parsing plugin. For example, with Oracle, after [contacting Tapdata technical support](../appendix/support.md) to complete the plugin deployment, you can choose the log plugin as **bridge** when [configuring the Oracle connection](../prerequisites/on-prem-databases/oracle.md). Then, fill in the IP address of the raw log service, with the default service port of **8190**.
+This solution requires the additional installation of a log parsing plugin. For example, with Oracle, after [contacting Tapdata technical support](../appendix/support.md) to complete the plugin deployment, you can choose the log plugin as **bridge** when [configuring the Oracle connection](../connectors/on-prem-databases/oracle.md). Then, fill in the IP address of the raw log service, with the default service port of **8190**.

@@ -55,7 +53,7 @@ For example, in MySQL, suppose there is a table `orders` where the `last_updated
SELECT * FROM orders WHERE last_updated > '2024-06-01 00:00:00';
```
-After completing [permission granting and data source connection](../prerequisites/on-prem-databases/mysql.md), you can set the incremental synchronization method to **Polling** for the source node and select the target field (`last_updated`) in Tapdata when [configuring the data transformation task](../user-guide/data-development/create-task.md).
+After completing [permission granting and data source connection](../connectors/on-prem-databases/mysql.md), you can set the incremental synchronization method to **Polling** for the source node and select the target field (`last_updated`) in Tapdata when [configuring the data transformation task](../data-transformation/create-views/README.md).

@@ -142,11 +140,11 @@ This method is not optimal and increases maintenance costs, so Tapdata does not
* Q: Which data sources does Tapdata support CDC capture for?
- A: Please refer to the tables in [Supported Data Sources](../prerequisites/supported-databases.md). If incremental data is supported as a data source, CDC information can be obtained.
+ A: Please refer to the tables in [Supported Data Sources](../connectors/supported-data-sources.md). If incremental data is supported as a data source, CDC information can be obtained.
* Q: If my data source supports CDC, how do I choose the CDC collection method?
A: To maximize compatibility and collection performance, Tapdata supports the following CDC collection methods:
* **Database Log API**: The default collection method, supported by most databases. If permission restrictions prevent log access or for certain SaaS data sources, choose the **Field Polling** method.
* **Database Log File**: Currently supported only for Oracle and Db2 data sources.
- * **Field Polling**: Set the incremental synchronization method for the source node in Tapdata when [configuring the data transformation task](../user-guide/data-development/create-task.md).
\ No newline at end of file
+ * **Field Polling**: Set the incremental synchronization method for the source node in Tapdata when [configuring the data transformation task](../data-transformation/create-views/README.md).
\ No newline at end of file
diff --git a/docs/introduction/compare-editions.md b/docs/introduction/compare-editions.md
index 6863edc0..938e28c8 100644
--- a/docs/introduction/compare-editions.md
+++ b/docs/introduction/compare-editions.md
@@ -1,7 +1,5 @@
# Editions Comparison
-import Content from '../reuse-content/_all-features.md';
-
TapData offers three different editions of its product: TapData Cloud, TapData Enterprise, and TapData Community, catering to different user needs and scenarios. Below is a detailed introduction to their features and applicable scenarios.
@@ -41,7 +39,7 @@ TapData Cloud adopts a SaaS (Software as a Service) model. Register for a [TapDa
**Features:**
- **Quick Deployment**: No complex installation and configuration; get up and running in minutes.
-- **Low Upfront Investment**: No need to purchase and maintain hardware; provides one free Agent instance and pay-as-you-go [pricing](../billing/billing-overview.md).
+- **Low Upfront Investment**: No need to purchase and maintain hardware; provides one free Agent instance and pay-as-you-go pricing.
- **Automated Operations**: The system automatically updates and maintains versions, allowing you to focus on business development.
- **High Availability**: Cloud architecture provides high availability and scalability, ensuring the continuity and security of data integration services.
@@ -88,112 +86,112 @@ Building on the free offerings of TapData Community for developers, TapData Ente
diff --git a/docs/introduction/features.md b/docs/introduction/features.md
index 78cfe7d4..8745dea3 100644
--- a/docs/introduction/features.md
+++ b/docs/introduction/features.md
@@ -1,36 +1,78 @@
# Features
-import Content from '../reuse-content/_all-features.md';
+Deliver trusted, real-time data across your systems with a unified, low-latency pipeline. This page outlines the core capabilities that make TapData a powerful live data platform.
-
+## Data Ingestion: Capture at the Source
-This article introduces the features of TapData to help you quickly understand its core capabilities.
+
-## Data Replication
+TapData supports both **real-time** and **historical** data ingestion, ensuring a complete and continuously updated view of your business data.
-TapData offers support for both full data synchronization and real-time incremental data synchronization. TapData can help you to quickly achieve real-time synchronization between the same/heterogeneous data sources, which is suitable for data migration/synchronization, data disaster recovery, reading performance expansion, and other [business scenarios](use-cases.md).
+- **[Change Data Capture (CDC)](change-data-capture-mechanism.md)**
+ Log-based CDC (e.g., MySQL binlog, PostgreSQL WAL) and trigger-based sync methods ensure sub-second latency and zero data loss with automatic retries.
+- [**Broad Connector Ecosystem**](../connectors/supported-data-sources.md)
+ 100+ prebuilt connectors covering relational databases (Oracle, PostgreSQL, Sybase), SaaS platforms (Salesforce, Shopify), cloud services (S3, BigQuery), and legacy systems (Mainframe, FTP).
+- [**Hybrid Pipeline Support**](../data-replication/create-task.md)
+ Combine historical backfill with ongoing streaming sync in a single unified pipeline—no need to separate batch and real-time workflows.
-
+## Transformation: Prepare and Shape Data in Real Time
+
+Transform your data on the fly with a zero-code interface or flexible custom logic.
-## Data transformation
+- **[Visual Pipeline Builder](../operational-data-hub/mdm-layer/prepare-and-transform.md)**
+  Drag-and-drop operators such as `filter`, `join`, `aggregate`, and `mask` for rapid flow creation. Power users can inject JavaScript for advanced logic.
+- **Incremental Materialized Views (IMV)**
+ Maintain fresh, pre-computed results using CDC—no full-table recalculations. Ideal for building analytics-ready views with millisecond latency.
+- **Built-in Data Quality**
+ Apply schema validation and anomaly detection (e.g., null spike, value drift). Pipelines can auto-pause on quality failures to prevent bad data propagation.
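A null-spike check of this kind might look like the following sketch (the threshold and field name are illustrative, not TapData defaults):

```python
# Flag a batch whose NULL ratio for a field jumps past a threshold,
# so the pipeline can pause before bad data propagates downstream.
def null_ratio(batch, field):
    return sum(1 for row in batch if row.get(field) is None) / len(batch)

def should_pause(batch, field, threshold=0.5):
    return null_ratio(batch, field) > threshold

healthy = [{"email": "a@x.io"}, {"email": "b@x.io"},
           {"email": "c@x.io"}, {"email": "d@x.io"}]
broken = [{"email": None}, {"email": None}, {"email": "e@x.io"}]

print(should_pause(healthy, "email"))  # False
print(should_pause(broken, "email"))   # True
```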
-Aiming at complex data processing needs, TapData supports a variety of [processing nodes](../user-guide/data-development/process-node.md) between data sources based on data replication capabilities. These nodes provide advanced data processing capabilities such as data splitting, field addition, and deletion, and shared mining.
+## Data Delivery: Serve Real-Time Data Anywhere
-
+Distribute trusted, real-time data to [downstream systems](../operational-data-hub/adm-layer/sync-downstream.md), [APIs](../operational-data-hub/adm-layer/integrate-apis.md), and apps—no ETL required.
+
+- **Multi-Protocol Outputs**
+ Automatically expose your datasets as [REST](../publish-apis/query/query-via-restful.md)/[GraphQL](../publish-apis/query/query-via-graphql.md) APIs, publish to Kafka, or sync directly to data warehouses like Snowflake or Delta Lake.
+- **Virtual Data Products**
+ Publish curated business views (e.g., `user_profile`, `finance.revenue`) with access controls, lineage, and usage monitoring—ideal for MDM, analytics, or API-based consumption.
+
+## Operational Control: Build with Confidence
+
+[Enterprise-grade governance](../operational-data-hub/plan-data-platform.md) and deployment options let you manage pipelines at scale with full observability.
+
+- **Governance & Security**
+ Track full [lineage](../operational-data-hub/fdm-layer/explore-fdm-tables.md) from source to consumer. Protect sensitive fields with masking and hashing (GDPR/CCPA ready). RBAC and Kubernetes-native deployment supported.
+- **[Pipeline Monitoring](../data-replication/monitor-task.md)**
+ Built-in dashboards show lag, throughput, and error metrics. Proactive alerts via Slack, email, or webhook.
+- **Flexible Deployment**
+ Run on-premises, in the cloud, or across hybrid environments. Cross-cloud sync supported (e.g., RDS → Synapse).
+
+## Developer Experience: Built for Extensibility
+Whether you're building pipelines or extending the platform, TapData offers tools for full lifecycle automation.
+- **[API-First Control](../experimental/tapflow/introduction.md)**
+ Manage pipelines via declarative configs (YAML/JSON), CLI, or Terraform.
+- [**Custom Extensions**](../operational-data-hub/advanced/README.md)
+ Build your own connectors in Java/JS. Extend pipeline logic with plugin architecture.
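As a rough illustration of the declarative style, a pipeline definition might look like the fragment below. The keys are hypothetical and do not reflect TapData's actual configuration schema.

```yaml
# Hypothetical pipeline definition -- key names are illustrative only,
# not TapData's published config schema.
pipeline:
  name: orders-to-warehouse
  source:
    connection: mysql-prod
    table: orders
    mode: cdc            # historical backfill + incremental sync
  transform:
    - filter: "status != 'draft'"
    - mask: [customer_email]
  target:
    connection: snowflake-dwh
    table: orders_live
```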
-## Real-Time Data Hub
+## AI Agent Integration: Connect LLMs to Live Data (Preview)
-With TapData's [Real-Time Data Hub](../user-guide/real-time-data-hub/daas-mode/enable-daas-mode.md), you can synchronize data scattered in different business systems to a unified platform cache layer, which can provide basic data for subsequent data processing, while avoiding the performance impact of directly reading/manipulating the source database. This helps create a consistent, real-time data platform and connects disparate data silos.
+
-
+Empower AI models and agents with real-time business context through standardized protocols.
+- **LLM & Agent Integration**
+ Connect popular AI tools like Cursor, Claude, and custom agents to your live data through Model Context Protocol (MCP).
+- **Real-Time Context Delivery**
+ MCP provides structured, real-time business data to AI models, enhancing inference accuracy and reducing hallucinations.
+- **Enterprise-Ready Security**
+ Field-level masking, role-based permissions, and controlled access ensure AI models get only authorized, fresh data during inference.
+:::tip
-## Supported sources and targets
+[Try AI Agent Integration via MCP (Preview) →](../experimental/mcp/introduction.md)
-TapData supports mainstream databases, including commercial databases, open-source databases, cloud databases, SaaS platform data sources, file data sources, and allows for custom data sources. For more information, see [Supported Data Sources](../prerequisites/supported-databases.md).
+:::
diff --git a/docs/introduction/performance.md b/docs/introduction/performance.md
new file mode 100644
index 00000000..f8bb6383
--- /dev/null
+++ b/docs/introduction/performance.md
@@ -0,0 +1,112 @@
+# Performance Benchmarks
+
+This benchmark summarizes TapData's real-world performance across mainstream data sources. It reflects production-like workloads—including full sync, incremental CDC, and mixed writes—validating the platform’s ability to handle high-throughput, low-latency pipelines at scale.
+
+## Overview
+
+A high-level snapshot of TapData’s performance across full sync, incremental CDC, and mixed DML workloads.
+
+
+
+:::tip
+
+ClickHouse supports append-only operations and is excluded from CDC and DML tests.
+
+:::
+
+## Full Sync Throughput
+
+TapData achieves strong full-sync performance across both structured and unstructured data systems. The following table shows read and write throughput measured in records per second (RPS), based on 1KB records with \~50 fields.
+
+| Data Source | Full Read RPS | Full Write RPS |
+| -------------- | ------------- | -------------- |
+| **Oracle** | 300,000 | 240,000 |
+| **MySQL** | 86,000 | 32,000 |
+| **Kafka** | 330,000 | 110,000 |
+| **MongoDB** | 450,000 | 95,000 |
+| **PostgreSQL** | 102,000 | 31,000 |
+| **ClickHouse** | 280,000 | 250,000 |
+
+## Incremental Sync (CDC)
+
+TapData supports change data capture (CDC) with consistently high throughput and low latency across traditional and NoSQL systems.
+
+| Data Source | CDC Read RPS | P99 latency |
+|---------------------------| ------------ |--------------|
+| **Oracle (Direct Parse)** | 62,000 | < 1s |
+| **Oracle (LogMiner)** | 19,000 | < 3s |
+| **MySQL** | 22,000 | < 1s |
+| **MongoDB** | 19,000 | < 1s |
+| **PostgreSQL** | 22,000 | < 1s |
+
+:::tip
+- **P99 latency** indicates the maximum delay experienced by 99% of change events.
+- TapData supports both Oracle LogMiner and native log parsing; the latter delivers higher throughput for high-frequency scenarios. [Learn more](../connectors/on-prem-databases/oracle.md#incremental-data-capture-methods).
+:::
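The P99 figures above are percentiles over observed event delays. A minimal nearest-rank computation, shown here on synthetic samples rather than benchmark data:

```python
# P99 latency: the value at or below which 99% of observations fall.
# Synthetic samples only -- not the benchmark's measured data.
def percentile(samples, p):
    """Nearest-rank percentile: smallest value covering p% of samples."""
    ordered = sorted(samples)
    rank = max(1, -(-len(ordered) * p // 100))  # ceil(n * p / 100)
    return ordered[rank - 1]

latencies_ms = [12, 15, 11, 14, 900, 13] * 20 + [2500]  # one extreme outlier
p99 = percentile(latencies_ms, 99)      # 900
p50 = percentile(latencies_ms, 50)      # 14
```

Note how the 900 ms tail values dominate P99 while the median stays at 14 ms; that is why P99, not the average, is the headline latency metric here.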
+
+
+## Mixed Load Performance
+
+To simulate transactional behavior, we issued INSERT, UPDATE, and DELETE operations in a **1:1:1 ratio**. TapData sustained stable throughput across major targets.
+
+
+| Data Target | Mixed Write RPS |
+| -------------- | --------------- |
+| **Oracle** | 12,000 |
+| **MySQL** | 13,000 |
+| **MongoDB** | 2,500 |
+| **PostgreSQL** | 8,000 |
+
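The 1:1:1 mix above can be generated with a simple round-robin over the three DML types. A sketch of such a workload generator (the statement strings are placeholders, not the actual benchmark driver):

```python
# Generate DML statements in a strict 1:1:1 INSERT/UPDATE/DELETE ratio.
# Placeholder statements only -- not TapData's benchmark harness.
from itertools import cycle

def mixed_workload(n):
    """Yield n statements cycling INSERT -> UPDATE -> DELETE."""
    ops = cycle(["INSERT", "UPDATE", "DELETE"])
    for i, op in zip(range(n), ops):
        yield f"{op} /* row {i} */"

stmts = list(mixed_workload(9))
counts = {op: sum(s.startswith(op) for s in stmts)
          for op in ("INSERT", "UPDATE", "DELETE")}
# counts == {"INSERT": 3, "UPDATE": 3, "DELETE": 3}
```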
+## Engine-Level Capacity
+
+Under synthetic loads with 1KB records, a single TapData engine instance sustained up to **450,000 RPS**—demonstrating strong performance and scalability in high-concurrency scenarios.
+
+## Key Takeaways
+
+- **Real-time CDC** with low-latency support across relational and NoSQL databases
+- **High throughput**: Up to 450K RPS for full sync, 60K+ RPS for incremental reads
+- **Mixed DML support**: Stable performance for real-world transactional workloads
+- **Flexible CDC modes**: Choose Oracle parsing mode based on performance/ops needs
+
+
+## About This Report
+
+These results are based on **TapData v3.7.0**, tested under typical enterprise configurations.
+
+This benchmark evaluates TapData’s performance across full sync, incremental CDC, and mixed DML workloads—validating its ability to handle high-throughput, low-latency pipelines at scale.
+
+**Test Scope**
+- Full sync throughput (initial batch replication)
+- Incremental CDC read/write performance
+- Mixed write throughput (DML: INSERT, UPDATE, DELETE in a 1:1:1 ratio)
+- Engine scalability under concurrent workloads
+
+**Test Environment**
+- **TapData Version**: v3.7.0
+- **TapData Server**: Alibaba Cloud ecs.u1-c1m2.4xlarge (16 vCPU, 32 GB RAM, 300 GB ESSD)
+- **TapData Memory Allocation**: JVM configured via -Xmx with 16 GB for engine, 8 GB for management; metadata database allocated 4 GB via --wiredTigerCacheSizeGB
+- **Database Servers**: Most databases use ecs.u1-c1m2.2xlarge (8 vCPU, 16 GB RAM) except Oracle which uses ecs.ebmhfc6.20xlarge (80 vCPU, 192 GB RAM)
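For reference, the memory allocation above corresponds to startup flags along these lines. The jar names and launch commands are illustrative, not TapData's actual startup scripts; `--wiredTigerCacheSizeGB` is the standard `mongod` option.

```bash
# Illustrative flags matching the allocation above (jar names are
# placeholders, not TapData's real startup scripts)
java -Xmx16g -jar tapdata-engine.jar        # sync engine: 16 GB heap
java -Xmx8g  -jar tapdata-management.jar    # management: 8 GB heap
mongod --wiredTigerCacheSizeGB 4            # metadata DB cache: 4 GB
```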
+
+
+**Database Configurations**
+
+All databases use ESSD storage and are configured with the following system optimizations:
+```bash
+ulimit -n 655350
+sysctl -w net.core.somaxconn=1024
+sysctl -w net.core.netdev_max_backlog=5000
+sysctl -w net.ipv4.tcp_max_syn_backlog=8192
+```
+
+**Database-Specific Settings:**
+- **Oracle 11g** (Single node): Online redo log files increased to 4GB (vs. default 512MB) for smoother writes and stable incremental read performance
+- **ClickHouse 24.5.3.5** (Single node): Memory usage limit increased to 10GB
+- **MongoDB 6.0.15** (Replica set, single node): Storage cache memory set to 12GB
+- **MySQL 8.0** (Single node): InnoDB buffer pool set to 8GB with optimized log and flush settings
+- **Kafka 3.6** (Single broker with ZooKeeper): Compression set to Snappy, PLAIN transport (no encryption), producer acknowledgments from the partition leader only (`acks=1`)
+- **SQL Server 2016** (Windows OS, Single node): Default configuration
+- **PostgreSQL 12** (Single node): Default configuration with Pgoutput as CDC plugin
+- **Elasticsearch 7** (Single node): Default configuration
+- **Redis 7** (Single node): Default configuration
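A `my.cnf` fragment matching the MySQL tuning described above might look like the following. The exact log and flush values used in the benchmark are not published, so the last two settings are representative examples only.

```ini
# Example my.cnf matching the MySQL tuning above; log/flush values
# are representative, not the benchmark's exact configuration.
[mysqld]
innodb_buffer_pool_size = 8G
innodb_log_file_size    = 2G
innodb_flush_log_at_trx_commit = 2
```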
+
+Results may vary depending on hardware, deployment mode, and connector settings.
diff --git a/docs/introduction/security.md b/docs/introduction/security.md
index 5c744d9b..c8531a55 100644
--- a/docs/introduction/security.md
+++ b/docs/introduction/security.md
@@ -1,8 +1,6 @@
# Data Security
-import Content from '../reuse-content/_all-features.md';
-
As we embrace cloud services, the safety of our data has become a top priority. This concern not only relates to the regulatory compliance of enterprise data services, but more crucially, to the protection of vital business data. Recognizing this, TapData was designed with security at its core. From architectural design, technical implementation, and operational procedures, strict safeguards have been put in place, ensuring a safe and secure user experience.
@@ -14,11 +12,6 @@ As we embrace cloud services, the safety of our data has become a top priority.
What is the role of Agent?
The TapData Agent plays a crucial role in data synchronization, handling data heterogeneity, and supporting data transformation scenarios. It is responsible for extracting data from the source system, performing necessary processing, and transmitting it to the target system. The TapData Agent is centrally managed by TapData Cloud.
-
----
-
-
-
## Systematic Security Design
### Account Access Control
@@ -66,9 +59,6 @@ Clear guidelines have been established for the usage and retention of user data.
- All database and API credentials you provide are encrypted stringently. Apart from the application, no one has access to these details.
- Support for SSL or SSH tunnel encrypted connections to data sources, safeguarding data connectivity and transmission. HTTPS encrypted connections to SaaS-type data sources are also available.
-- Both fully managed and semi-managed [Agent deployment modes](../billing/purchase.md) are available to meet diverse data transfer requirements:
- - *Semi-Managed:* All of your data, whether in its raw form or has been processed, is stored and managed within your private environment exclusively. The Agent handles data orchestration and processing tasks in-house, ensuring that no data is ever uploaded to TapData Cloud.
- - *Fully Managed:* During any task execution, your data only travels between the source database, the Agent, and the destination database. At no point will data be uploaded to TapData Cloud. The Agent provides a securely managed external service address, allowing you to bolster security measures through database whitelists or specific firewall rules.
### Account Password Security Policies
diff --git a/docs/introduction/tapdata-vs-informatica.md b/docs/introduction/tapdata-vs-informatica.md
new file mode 100644
index 00000000..fdb992a8
--- /dev/null
+++ b/docs/introduction/tapdata-vs-informatica.md
@@ -0,0 +1,54 @@
+# TapData vs. Informatica MDM
+
+Both TapData and Informatica help you manage master data—but with very different goals in mind. Informatica MDM is built for batch-oriented, governance-heavy environments. TapData offers a real-time, low-latency platform for delivering “always accurate” golden records—ideal for operational use and cloud-native teams.
+
+This guide helps you compare both, and shows when TapData’s **Active MDM** offers the modern alternative.
+
+## Comparing TapData and Informatica MDM
+
+| Capability | **TapData Active MDM** | **Informatica MDM** |
+| ----------------------- | ------------------------------------------ | ------------------------------------------ |
+| **Data Freshness** | Sub-second CDC updates | Hourly/daily batch jobs |
+| **Golden Record Logic** | Real-time merge with conflict resolution | Scheduled merges |
+| **Source Connectivity** | 100+ CDC connectors, zero-code setup | ETL tools and PowerExchange required |
+| **Change Propagation** | <500ms latency to downstream systems | Delayed by next batch cycle (hours) |
+| **API Access** | Auto-generated REST & GraphQL APIs | Requires separate CIAM setup |
+| **Cloud Architecture** | Unified control plane across cloud/on-prem | Separate modules for each deployment model |
+
+
+
+## What Makes TapData “Active” MDM
+
+### Real-Time Golden Records
+
+TapData merges incoming changes from 10+ systems via CDC, keeping master data always current.
+
+*Example: A customer address update in SAP triggers a golden record refresh in <1s.*
+
+### Built-In Data Quality
+
+Validate values like email formats before merging. Resolve conflicts using rules, timestamps, or priority.
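The validate-then-resolve flow can be sketched as follows. This is a simplified illustration, not a TapData API, and the email regex is deliberately permissive.

```python
# Validate candidate values, then resolve conflicts by recency.
# Illustrative sketch only -- not TapData's merge engine.
import re

EMAIL_RE = re.compile(r"^[\w.+-]+@[\w-]+\.[\w.-]+$")

def resolve_email(candidates):
    """candidates: [(email, updated_at)] -> latest valid email, or None."""
    valid = [(e, ts) for e, ts in candidates if EMAIL_RE.match(e)]
    return max(valid, key=lambda c: c[1])[0] if valid else None

golden = resolve_email([
    ("a@example.com", "2024-01-01"),
    ("not-an-email", "2024-06-01"),   # fails validation, ignored
    ("b@example.com", "2024-03-15"),
])
# golden == "b@example.com"
```

Swapping the `max` key for a source-priority lookup gives rule- or priority-based resolution instead of timestamp-based.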
+
+### Instant API Access
+
+Expose golden records as secure REST/GraphQL endpoints—instantly consumable by downstream apps.
+
+### Event-Driven by Default
+
+Push Kafka/webhook notifications on golden record changes (e.g., customer tier upgrade).
+
+
+
+## When TapData Makes Sense
+
+Use TapData when:
+
+- You need sub-second golden record refreshes
+- You want APIs out of the box, no backend needed
+- You’re building cloud-native apps with real-time needs
+- You prefer low maintenance, no batch jobs or ETL
+
+**Example:**
+  “Unify customer records across Salesforce and MongoDB, validate the data, resolve conflicts, and publish a GraphQL API.”
+
+> With traditional tools, delivering the same result could take **weeks of work**—and a cross-functional team juggling ETL tools, job schedulers, and backend API code.
\ No newline at end of file
diff --git a/docs/introduction/tapdata-vs-kafka.md b/docs/introduction/tapdata-vs-kafka.md
new file mode 100644
index 00000000..010a0e67
--- /dev/null
+++ b/docs/introduction/tapdata-vs-kafka.md
@@ -0,0 +1,101 @@
+# TapData vs. Kafka
+
+Both TapData and Apache Kafka enable real-time data movement—but they’re designed for different audiences and use cases.
+
+**Kafka** is a powerful foundation for custom, developer-built event systems, but it typically requires significant engineering effort. **TapData** is a data integration platform designed with no-code/low-code data engineering in mind.
+
+This guide helps you decide which fits your goals—or how they can work together.
+
+## Comparing TapData and Apache Kafka
+
+| Capability | **TapData** | **Apache Kafka** |
+| ----- | --- | ----- |
+| **Purpose** | Unified platform for real-time CDC, transform, APIs | Event streaming backbone |
+| **Setup Time** | Minutes (no code, UI-driven) | Weeks (custom setup & plugins) |
+| **Change Data Capture** | Built-in for 100+ sources | Requires Debezium / custom dev |
+| **Transformation** | UI + SQL/JS logic + IMV support | Requires Flink/Kafka Streams |
+| **Serving Layer** | Real-time REST/GraphQL APIs | Requires additional systems |
+| **Schema Handling** | Auto-detect, versioned, GUI-managed | Manual Registry config |
+| **Ops & Scaling** | Built-in auto-scaling and alerting | Requires manual tuning + external tooling |
+| **Learning Curve** | Low-code, team-friendly | Steep (Java/Scala required) |
+| **Pricing Model** | Predictable SaaS pricing (pipeline + volume) | Open source core, but hidden enterprise costs (Confluent, infra, ops) |
+
+
+
+## What Makes TapData Different
+
+### Built for Speed, Not Complexity
+
+Launch pipelines across 100+ sources in minutes. No need to wire together Kafka + Debezium + Flink—TapData does it all in one place.
+
+### Transformations Made Easy
+
+Join, clean, deduplicate, mask, and enrich data—**visually** or with lightweight JS/SQL. Ideal for both operational and analytical use cases.
+
+### Incremental Materialized Views (IMV)
+
+Skip nightly rebuilds. TapData lets you cache joined, aggregated, or filtered views that auto-refresh as source data changes.
+
+### API-Ready by Design
+
+Publish any pipeline output as a versioned REST or GraphQL API, complete with row/column-level permissions. No extra backend needed.
+
+### Layered Architecture
+
+TapData promotes reusability and control through layered design:
+
+- **FDM**: Mirror raw source tables via CDC
+- **MDM**: Transform into wide, analytics-ready business entities
+- **ADM**: Deliver via APIs, pipelines, or data sync
+
+
+
+## When TapData Makes Sense
+
+Use TapData when:
+
+- You need real-time pipelines **ready in days**, not weeks
+- You want **CDC + transform + APIs in one UI**
+- You support **business apps** that rely on fresh, accurate data
+- You want **low maintenance**—no Kafka tuning, no Flink jobs
+
+**Example:**
+ “Sync Salesforce, PostgreSQL, and MongoDB into a real-time user view, apply masking, and publish as an API—in 2 hours.”
+
+
+
+## When Kafka Excels
+
+Kafka is your best bet when:
+
+- You’re building **custom event architectures**
+- You need **ultra-high throughput** (1M+ events/sec)
+- Your team has **deep streaming expertise**
+
+**Example:**
+ “Build a fraud detection engine with Flink jobs on raw Kafka streams.”
+
+
+
+## TapData + Kafka: Better Together
+
+TapData can simplify and supercharge your Kafka stack.
+
+| Use Case | Flow Diagram | TapData Benefits |
+| --------------------------------- | ------------------------------------------ | ------------------------------------------------------------ |
+| **CDC Frontend for Kafka** | DBs → TapData → Kafka Topics | Visual UI for schema handling, masking, filtering, and deduplication |
+| **Serving Layer on Top of Kafka** | Kafka Topics → TapData → REST/GraphQL APIs | Exposes topics as secure, queryable APIs—no need to build API layers manually |
+
+**Why this hybrid works:**
+Kafka excels at raw event distribution. TapData brings **developer experience, governance, and business access**—all while staying real-time.
+
+
+
+## Final Takeaways
+
+1. **TapData = Real-Time Simplicity**
+ Pipelines, transformations, IMVs, and APIs—fully managed, no code required.
+2. **Kafka = Custom-Built Power**
+ Ideal for teams building large-scale, event-first architectures from the ground up.
+3. **Together = The Best of Both Worlds**
+ TapData reduces the engineering effort to adopt, extend, and operationalize Kafka.
\ No newline at end of file
diff --git a/docs/introduction/terms.md b/docs/introduction/terms.md
index 3f40c35d..e0e3bbc5 100644
--- a/docs/introduction/terms.md
+++ b/docs/introduction/terms.md
@@ -1,62 +1,79 @@
# Terminology
-import Content from '../reuse-content/_all-features.md';
+This article introduces common terms used in TapData to help you quickly understand product and feature concepts.
-
+## Data Source
-This article introduces common terms used in TapData to help you quickly understand product and feature concepts.
+A system or platform from which TapData can ingest data. This includes relational databases (MySQL, PostgreSQL, Oracle), NoSQL databases (MongoDB, Redis), SaaS platforms (Salesforce), message queues (Kafka), and more. Support for additional types such as files, GridFS, UDP, and custom plugins is planned.
-## Full Data Synchronization
+## Connection
-Database migration or cloning, within the data flow task, is ideal for business scenarios involving complete data migration between different library-level data sources. This includes instances where data needs to be migrated, moved up or down the cloud, or when databases need to be split and expanded.
+A configured instance of a data source, including credentials, host information, and metadata access. Connections are the entry point for TapData to interact with external systems.
-## Incremental Data Synchronization
+## ODH (Operational Data Hub)
-In the data flow task, the real-time synchronization of data among multiple data sources through specific association relationships or processing is suitable for meeting user scenarios such as data analysis, processing, and disaster recovery without impacting user business operations.
+A real-time architecture pattern built atop FDM and MDM. TapData’s ODH delivers continuously updated, query-ready views of key business entities, breaking down data silos. It powers analytics, APIs, and business logic with a live, unified source of truth.
-## Data Source
+## CDC (Change Data Capture)
-The data sources that can be connected to the TapData system from external sources include databases, and in the future, there are plans to gradually expand the support for other types such as files, GridFS, RestAPI, Dummy, Custom, UDP, Cache, and more.
+A technique that captures insert, update, and delete operations from source systems in real time. TapData supports both log-based (e.g., binlog, WAL) and trigger-based CDC methods, enabling sub-second latency. CDC serves as the foundation for all downstream layers: FDM, MDM, and ADM.
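In miniature, CDC replay looks like this (an illustrative sketch of the concept, not TapData internals): a stream of insert/update/delete events is applied in order to keep a target copy in sync.

```python
# Replay a stream of change events against a target keyed by primary key.
# Conceptual sketch only -- not TapData's replication engine.
def apply_cdc(target, events):
    for ev in events:
        if ev["op"] == "insert":
            target[ev["pk"]] = ev["row"]
        elif ev["op"] == "update":
            target[ev["pk"]].update(ev["row"])
        elif ev["op"] == "delete":
            target.pop(ev["pk"], None)
    return target

replica = apply_cdc({}, [
    {"op": "insert", "pk": 1, "row": {"name": "Ada", "tier": "gold"}},
    {"op": "update", "pk": 1, "row": {"tier": "platinum"}},
    {"op": "insert", "pk": 2, "row": {"name": "Bob", "tier": "basic"}},
    {"op": "delete", "pk": 2},
])
# replica == {1: {"name": "Ada", "tier": "platinum"}}
```

Log-based CDC reads these events from the database's own log (binlog, WAL), so the source tables are never re-queried.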
-## Data Replication
-Also known as database replication/cloning, involves full or real-time incremental migration of data between various levels of data sources in data flow tasks. Applicable for instance data migration, cloud migration, database splitting, and expansion scenarios.
+## FDM (Foundational Data Model)
-## Data Transformation
-In data flow tasks, real-time synchronization of data between multiple tables or other types of data through specific association or processing. Suitable for scenarios such as data analysis, processing, and disaster recovery without affecting user operations.
+Also known as the **Platform Cache Layer**, FDM mirrors raw source tables using CDC. It reduces the load on operational databases while preserving data fidelity. The FDM layer maintains source-like schemas and provides a real-time foundation for modeling and transformation.
-## Data Service
-In data flow tasks, generating a new model from one or more tables' different fields and publishing it externally via an API. Users can obtain data content through the API.
+## MDM (Master Data Model)
+A Master Data Model in TapData is a standardized data model that defines the structure, attributes, and relationships of core business entities (e.g., Customer, Product), ensuring consistency, accuracy, and reusability across integrated systems. It serves as the single source of truth for master data in real-time synchronization, data pipelines, and API-based data services.
+A master data model is typically built from FDM models using TapData's real-time pipelines. TapData stores master data models as JSON, so these pipelines typically read from multiple FDM tables and merge the results into a rich, structured master data model.
-## Connection
-Also known as a data source, it refers to the database that connects externally to the TapData system. Currently supported connections include: MySQL, Oracle, MongoDB, SQL Server, PostgreSQL, Kafka, Redis, etc.
+The master data model in TapData is continuously updated with every insert/update/delete change from each contributing table.
-## Node
-Refers to the general term for data sources and processing methods selected in the data task arrangement page.
+## ADM (Application Data Model)
+
+The delivery layer where curated data is served to consuming systems. TapData supports low-latency delivery via REST/GraphQL APIs, Kafka streams, and direct sync to analytical databases. ADM enables real-time consumption of cleaned and modeled data across platforms.
+
+## Initialization
+
+The process of synchronizing historical (existing) data before switching to real-time incremental sync (CDC). Initialization typically uses full data replication.
+
+## Full Data Synchronization
+
+A one-time or scheduled process that copies all data from a source to a target. Often used for initial loads, cloud migrations, database sharding, or offline backups.
+
+## Incremental Data Synchronization
+
+Continuously captures changes (insert/update/delete) from source systems and applies them to the target in real time—typically powered by Change Data Capture (CDC). Enables real-time analytics, disaster recovery, and low-latency data integration.
+
+## Data Replication
+
+A general term for both full and incremental data synchronization. It refers to the process of copying data from one system to another, either completely (full load) or incrementally (CDC-based).
+
+## Data Transformation
+
+The process of modifying, enriching, or reshaping data in motion—between ingestion and delivery. Includes operations like field mapping, joins, filtering, deduplication, data masking, and schema validation.
## Processing Node
-Refers to nodes for various processing functions to meet data synchronization needs. Currently supported processing nodes include: JavaScript/Java processing, database table filtering, field processing, row-level processing, etc.
-## Source Node
-In data tasks, among any two adjacent connected nodes, it refers to the node that is at the source/end generating the connection.
+A logical unit in the pipeline used to apply transformations or rules. Examples include JavaScript/Java processors, field mappers, row-level filters, deduplicators, and anomaly detectors.
-## Target Node
-In data tasks, among any two adjacent connected nodes, it refers to the node that is at the target/end being pointed to by the connection.
+## Source Node / Target Node
+
+In any pipeline, the source node provides the data, while the target node receives it after transformation. Every connection between two nodes is directional and reflects this source-target relationship.
## Shared Mining
-Refers to the sharing of incremental logs. When the feature is enabled, shared mining extracts incremental logs, eliminating the need for multiple incremental tasks to start a log collection process from the same source, significantly alleviating resource consumption and wastage on the source database.
+
+A feature that allows multiple pipelines to reuse a single CDC log extraction stream from the same source database. Reduces performance overhead and avoids redundant log parsing.
## Shared Cache
-Refers to storing some commonly used data from tables into the cache for different tasks to call and process, eliminating the need to retrieve data from the source, thereby improving efficiency.
-## Initialization
-In data migration or synchronization tasks, the mode of migrating or synchronizing existing data in the data source node.
+Caches frequently accessed lookup tables or static datasets across pipelines to reduce source load and improve processing speed.
## TapData Agent
-Refers to the execution program that runs the synchronization task, and is responsible for obtaining the task from the management side, connecting the source data source, performing data conversion, and outputting to the target data source.
+A lightweight runtime component that executes pipelines. It connects to data sources and targets, performs transformations, and ensures reliable data delivery.
-## TCM Management Side
+## TCM (TapData Control Manager)
-The TapData management console enables users to define custom orchestration synchronization tasks and deploy these tasks to synchronization instances for execution.
\ No newline at end of file
+The centralized management plane for pipeline orchestration, configuration, monitoring, and deployment. Users interact with TCM to create, modify, and observe pipelines.
\ No newline at end of file
diff --git a/docs/introduction/use-cases.md b/docs/introduction/use-cases.md
index 386b24f8..9b134e4d 100644
--- a/docs/introduction/use-cases.md
+++ b/docs/introduction/use-cases.md
@@ -1,110 +1,112 @@
# Application Scenarios
+TapData is a real-time data platform that unifies change capture, in-memory processing, and API/service delivery. Below are the most common scenarios, organized by audience and the outcome each needs.
-import Content from '../reuse-content/_all-features.md';
-
+## Technical Use Cases
-TapData is a next-generation real-time data platform that centralizes core enterprise data in real-time into a centralized data platform. It supports downstream interactive applications, microservices, or interactive analytics by providing real-time data through APIs or reverse synchronization.
+*(What engineers can build and accelerate using TapData’s real-time architecture)*
-## Build Real-time Data Pipelines
+### Active Master Data & Operational Data Hub
-Traditional master data management retrieves source data from business systems in a T+1 manner, processes it into standard enterprise data, and delivers it to business systems via export. This approach's limitation lies in the lag in data updates. Challenges such as CDC data collection errors and Kafka blockages make troubleshooting difficult in real-time data pipeline constructions using CDC + Kafka + Flink.
+- **Real-Time CDC Merge Across Databases**
+  Sync core entities (e.g., customers, products) across heterogeneous systems, powering a continuously updated MDM or ODH layer.
+- **Quality Gates at Ingest**
+ Apply validation, standardization, deduplication, and schema checks as data flows in—no downstream surprises.
+- **Schema Version Tracking & Drift Detection**
+ Track schema changes and detect metric drift before it affects downstream analytics.
-TapData offers a one-stop real-time data synchronization experience, enabling you to build complete data collection and flow pipelines in just a few steps, with the following advantages:
+### Real-Time Integration
-* Supports a rich array of [data sources](../prerequisites/supported-databases.md) for data synchronization between homogeneous/heterogeneous data sources.
-* Supports event-triggered data processing logic and multiple data verification methods to ensure high reliability and low latency.
-* Enables deduplication, rule judgment, and other master data governance functions through powerful UDF capabilities.
-* Supports low-code [API services](../user-guide/data-service/README.md) deployment for end-to-end data consumption.
+- **Zero/Low-Code Pipeline Builder**
+ Drag-and-drop to connect 100+ sources (DBs, SaaS, APIs) with built-in CDC and sub-second sync.
+- **Unified Streaming + Batch**
+ Seamlessly combine historical backfills with CDC updates in the same logical pipeline—no need to juggle Airflow + Kafka.
+- **One-Stop Schema & Transformation Handling**
+ Eliminate the need for Kafka, Flink, and schema registries—TapData handles the full transformation lifecycle natively.
-## Extract/Transform/Load Data (ETL)
+### Query Acceleration with Incremental Materialized Views
-Traditional approaches use tools like Kettle, Informatica, or Python to process and transfer data to new business system databases. These ETL solutions often have complex links, are non-reusable, and can significantly impact source system performance.
+- **Hot Path Joins**
+ Pre-join operational tables (e.g., Orders + Customers) to reduce OLAP query times.
+- **IMV (Incremental Materialized Views)**
+ Cache pre-defined aggregations (e.g., revenue by day), auto-refreshed on source change—define once, no orchestration needed.
+- **Optional Federated Pushdown**
+ Push filters and joins to source systems to reduce duplication and latency.
-TapData's real-time data services can perform a final ETL of data, synchronizing it to a distributed data platform based on MongoDB. Combined with no-code APIs, it provides quick data API support directly from the data platform for many downstream businesses, offering the following advantages:
+### API Services & Data Productization
-* Drag-and-drop-based next-generation data development simplifies processes.
-* Distributed deployment capabilities provide higher processing performance.
-* JS or Python-based UDF features infinitely extend processing capabilities.
-* Supports rapid expansion of data processing and refining capabilities through custom operators.
+- **Auto-Generated REST & GraphQL APIs**
+ Expose curated views or datasets as APIs with Swagger/OpenAPI—no backend code required.
+- **Modernize Legacy with JSON Wrappers**
+ Wrap mainframes, COBOL, or flat-file systems with real-time APIs—avoid risky rewrites.
+- **Row/Field-Level Access Control**
+ Enforce granular ACLs on exposed APIs to protect sensitive data while enabling secure sharing.
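+Conceptually, row/field-level enforcement filters rows by a predicate and projects away fields the caller may not see. The sketch below uses an invented policy shape and role names purely for illustration; in TapData this is configured per API, not hand-coded:

```python
# Hypothetical ACL policy: which fields a role may see, plus a row predicate.
POLICIES = {
    "support": {
        "fields": {"customer_id", "name", "order_status"},
        "row_filter": lambda row: row["region"] == "EU",
    },
    "admin": {
        "fields": None,                  # None = all fields visible
        "row_filter": lambda row: True,  # no row restriction
    },
}

def apply_acl(role, rows):
    """Filter rows and project fields according to the role's policy."""
    policy = POLICIES[role]
    visible = [r for r in rows if policy["row_filter"](r)]
    if policy["fields"] is None:
        return visible
    return [{k: v for k, v in r.items() if k in policy["fields"]} for r in visible]

rows = [
    {"customer_id": 1, "name": "Ada", "ssn": "xxx", "order_status": "paid", "region": "EU"},
    {"customer_id": 2, "name": "Bob", "ssn": "yyy", "order_status": "open", "region": "US"},
]
# The "support" role sees only the EU row, with "ssn" and "region" stripped.
print(apply_acl("support", rows))
```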
+### Zero-Downtime Migration & Multi-Cloud Sync
-## Seamless Database Migration with Zero Downtime
+- **Full + Incremental Sync for Seamless Cutovers**
+ Migrate data across systems or clouds with parallel real-time sync and instant switch-over.
+- **Hybrid & Cross-Region Deployments**
+ Keep databases in sync across regions, on-prem to cloud, or cloud to cloud—ideal for HA, DR, or modernization projects.
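+The cutover pattern above boils down to two phases: copy a consistent snapshot, then replay changes captured while (and after) the copy ran. A minimal, database-free sketch using plain dicts as stand-ins for the source and target systems:

```python
def full_sync(source, target):
    """Phase 1: copy the snapshot of the source into the target."""
    target.update({k: dict(v) for k, v in source.items()})

def apply_cdc(target, events):
    """Phase 2: replay captured change events so the target stays current."""
    for ev in events:
        if ev["op"] == "delete":
            target.pop(ev["key"], None)
        else:  # insert and update are both upserts here
            target[ev["key"]] = ev["row"]

source = {1: {"name": "Ada"}, 2: {"name": "Bob"}}
target = {}
full_sync(source, target)

# Changes that happened during/after the snapshot copy:
events = [
    {"op": "update", "key": 1, "row": {"name": "Ada L."}},
    {"op": "insert", "key": 3, "row": {"name": "Eve"}},
    {"op": "delete", "key": 2},
]
apply_cdc(target, events)
print(sorted(target))  # target now matches the live source: [1, 3]
```

Because phase 2 runs continuously, the target is verifiably up to date at cutover time, which is what shrinks downtime to the moment of the traffic switch.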
-Traditional migration methods often require stopping data writing to the source database, resulting in downtime during the migration process to ensure data consistency. This downtime can last for hours or even days, significantly impacting business operations.
+## Business Use Cases
-TapData offers a downtime-free migration solution that minimizes the impact on your business. The downtime only occurs when switching from the source instance to the target instance, and the rest of the time your business can continue to operate normally, with downtime reduced to the minute level. The migration process consists of two stages: full data synchronization and incremental data synchronization. During the incremental data synchronization stage, data from the source instance is continuously synchronized to the target instance in real-time. You can validate your business in the target database and once verified, smoothly switch your operations to the target database for a seamless migration.
+*(Outcome-focused: what the platform delivers for ops, product & execs)*
-## Cloud Migration/Cross-cloud Synchronization
+### Unified Customer Operations (Customer 360)
-For scenarios from offline to cloud, cloud to offline, or across cloud platforms, TapData can provide seamless data migration and synchronization.
+- Merge CRM, ticketing, and order systems into one live API.
-## Enhance Query Performance
+- Trigger personalization within **milliseconds** based on user actions.
-In scenarios with heavy read and light write operations, a single database may not handle all read pressures. By synchronizing data to another database and routing read requests to these read-only databases, you can horizontally expand overall read performance and relieve pressure on the primary database.
+### Real-Time Risk & Transaction Monitoring
-Moreover, you can choose to synchronize data to Redis, MongoDB, ElasticSearch, and other next-generation NoSQL databases to provide high-concurrency, low-latency query capabilities for your system.
+- Payment/fintech: update balances, detect fraud, and block suspicious transactions instantly.
-## Accelerate Full-text Searching
+- IT/production: stream metrics to alerting systems for immediate anomaly detection.
-Traditional relational databases accelerate data retrieval by indexing, but cannot support the need for full-text data retrieval. TapData offers a solution by enabling seamless data synchronization from relational databases to Elastic-Search, empowering users to effortlessly retrieve data using full-text search capabilities.
+### Omni-Channel Inventory & Order Visibility
-## Cache Update Without Development
+- Sync ERP/WMS across regions; prevent overselling with live stock updates.
-To enhance business efficiency and optimize user experience, it is a common practice to introduce a cache layer in the business architecture, improving access speed and read concurrency. However, as cache data cannot be permanently stored, abnormal cache exits can result in data loss, impacting business stability and reliability. TapData's data synchronization function addresses this challenge by enabling real-time synchronization from the business database to the cached database. This facilitates a lightweight cache update strategy, simplifies the application architecture, and ensures both simplicity and safety.
+- Push disruption alerts (Slack/Kafka) on stock-outs or fulfillment delays.
+### AI/ML Feature Freshness
-## Accelerate Access with Read/Write Separation
+- Stream user events to feature stores (e.g., Feast/Tecton) for model retraining.
-In cross-regional/cross-border businesses, relying on a single-region deployment in traditional architectures leads to significant access delays and poor user experiences when users access services from different regions. To address this, TapData optimizes the deployment architecture and adjusts access logic. All write requests from users across regions are directed to the main business center, while real-time synchronization via TapData ensures that the data is replicated to the respective sub-business centers. Furthermore, read requests from users in various regions are routed to the nearest sub-business center, eliminating remote access and greatly enhancing the speed of business access.
+- Align batch vs. online feature generation to avoid serving/training skew.
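+A common way to avoid serving/training skew is to define each feature exactly once and reuse that definition in both the batch (training) and streaming (serving) paths. A hypothetical sketch of the idea:

```python
def purchases_last_n(events, n=3):
    """Single feature definition shared by batch and online paths:
    total spend over the user's last n purchase events."""
    recent = sorted(events, key=lambda e: e["ts"])[-n:]
    return sum(e["amount"] for e in recent)

# Batch path: compute training features from historical events.
history = [{"ts": t, "amount": a} for t, a in [(1, 10), (2, 20), (3, 30), (4, 40)]]
training_value = purchases_last_n(history)

# Online path: the same function applied to the live event buffer,
# so the served value cannot drift from the training definition.
serving_value = purchases_last_n(history)
assert training_value == serving_value == 90  # 20 + 30 + 40
```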
+### Geo-Redundancy & Disaster Recovery
+- Real-time replication across regions/clouds to ensure continuity.
-## Empower Reading Capacity with Horizontal Scaling
+- Automatic failover by redirecting traffic when a primary site fails.
-In scenarios with a high volume of read requests, a single database instance may not be able to handle the entire read load effectively. To address this, you can utilize the real-time synchronization feature of DFS (Distributed File System) to establish read-only instances. By redirecting read requests to these read-only instances, you can achieve elastic scalability of the read capacity while reducing the load on the primary database instance.
-## Offsite Data Disaster Recovery
+## Technical Differentiation
-In order to mitigate the risk of business unavailability resulting from service disruptions, many enterprises are adopting a multi-region or multi-cloud deployment strategy. By spreading their business across different regions or public clouds, they can minimize the impact of any single point of failure. To further enhance service availability and mitigate risks at the Availability Zone level, establishing off-site Disaster Recovery Centers is a recommended approach. These centers serve as backup locations and are equipped to quickly restore service in the event of a failure at the primary business center. Real-time data synchronization through DFS ensures data consistency between the disaster recovery center and the primary business center.
+| Use Case | TapData Approach | Legacy Alternative |
+| ------------------ | ------------------------------------------------- | --------------------------------------- |
+| Master Data Sync | CDC-based merge with SCD2 support | Nightly batch reconciliation |
+| API Services | Auto-generated APIs from live DB schemas | Hand-coded API middleware |
+| Query Acceleration | In-memory pre-joins + incremental materialization | ETL to DWH + scheduled aggregation jobs |
-You can seamlessly redirect the business traffic to the Disaster Recovery Center, enabling a swift restoration of services. This proactive measure helps minimize downtime and ensures uninterrupted service delivery.
+→ [Explore Architecture](architecture.md) | [Talk to Solutions Engineers](https://tapdata.feishu.cn/share/base/form/shrcnoYXtxkXe7L4wu3vKDYzUUc)
-## Geo-redundancy
+**Why It Matters**
-With the rapid development of the business and the growth of the number of users, if the business is deployed in a single region, it may face the following problems:
+- **Engineer-Centric Design**
-- The user is widely distributed in the geographical location, and the user access delay is higher in the geographical distance, which affects the user experience.
-- The capacity of the infrastructure of a single geography limits business expansion, such as power supply capacity, network bandwidth building capacity, and so on.
+ Uses real-world patterns like `SCD2`, `Feast/Tecton`, `pushdown`, and `materialization`—resonating with modern data engineers.
-To solve the above problems, you can use TapData to synchronize data in real time between multiple business units built in the same city/off-site to ensure global data consistency. When any unit fails, just switch the traffic to other available units automatically, effectively guaranteeing the high availability of the service.
+- **Business-to-Tech Mapping**
-## Build Materialized Views (Wide Tables)
+ Each use case links to clear business value:
+ _e.g., MDM → real-time compliance, API services → product agility_.
-From big data analysis to data warehouse construction to data dashboards, data engineers often need to use batch processing tasks to display and analyze wide tables or views, consuming significant resources and causing data updates to lag. TapData supports incremental wide table construction capabilities to provide the latest data at minimal cost.
-
-## Real-time Metrics Calculation
-
-Utilize TapData's real-time aggregation calculation capabilities for statistical calculations on logs, click streams, or database events in a streaming manner, producing various operational metrics such as login counts and conversion funnels.
-
-
-
-
-
-## **Balance Updates in Financial Transaction Systems**
-TapData enables real-time updates to account balances upon transaction completion, allowing users to instantly view the latest account status, meeting high consistency requirements.
-
-## Real-Time Inventory Management Systems
-E-commerce platforms use TapData to manage cross-platform inventory updates, ensuring that displayed inventory reflects recent sales or returns, preventing overselling.
-
-## Real-Time Monitoring and Alert Systems
-In IT and production monitoring systems, TapData promptly synchronizes monitoring metrics, ensuring that alert systems operate on the latest data to quickly respond to anomalies.
-
-## Customer Real-Time Status in CRM Systems
-TapData enables real-time updates of customer interactions and order status in CRM systems, allowing sales teams to make timely decisions and respond based on up-to-date customer information.
-
-## User Behavior Updates in Recommendation Systems
-E-commerce and content recommendation platforms use TapData to process real-time user behavior data, dynamically generating personalized recommendations that reflect users’ current interests and preferences.
\ No newline at end of file
+- **Real-Time Advantage**
+ Outperforms batch-based stacks (like Kafka + Flink + DWH) by simplifying architecture and minimizing latency (<500ms typical).
diff --git a/docs/operational-data-hub/README.md b/docs/operational-data-hub/README.md
new file mode 100644
index 00000000..2df8eb42
--- /dev/null
+++ b/docs/operational-data-hub/README.md
@@ -0,0 +1,6 @@
+# Build Operational Data Hub
+
+import DocCardList from '@theme/DocCardList';
+
+
+
diff --git a/docs/operational-data-hub/adm-layer/README.md b/docs/operational-data-hub/adm-layer/README.md
new file mode 100644
index 00000000..f25035b8
--- /dev/null
+++ b/docs/operational-data-hub/adm-layer/README.md
@@ -0,0 +1,6 @@
+# Deliver Data
+
+import DocCardList from '@theme/DocCardList';
+
+
+
diff --git a/docs/operational-data-hub/adm-layer/integrate-apis.md b/docs/operational-data-hub/adm-layer/integrate-apis.md
new file mode 100644
index 00000000..d542ae7c
--- /dev/null
+++ b/docs/operational-data-hub/adm-layer/integrate-apis.md
@@ -0,0 +1,48 @@
+# Integrate with APIs
+
+TapData makes it easy to turn your core data models into secure, reusable API services. Whether from the Foundation Data Model (FDM) or Master Data Model (MDM), your curated tables can be exposed as low-latency APIs—ideal for powering apps, microservices, and real-time integrations without point-to-point complexity.
+
+## Why Publish APIs
+
+Modern data teams often struggle with fragmented systems, slow batch interfaces, and duplicated integration logic. Publishing APIs directly from trusted data models addresses these challenges by:
+
+- Delivering always-fresh data from a single source of truth
+- Enabling secure, scalable access with role-based controls
+- Simplifying integration with apps, partners, and internal tools
+- Reducing the need to replicate business logic in every service
+- Accelerating innovation with reusable, governed data services
+
+## Procedure
+
+There’s no need to build an API manually. Instead, simply link a curated table to an API application with a quick drag-and-drop:
+
+1. Log in to TapData Platform.
+
+2. In the left sidebar, select **Real-time Data Center**.
+
+3. In the **Foundation** or **Master Data Model**, locate the table you want to expose.
+
+4. Drag the table into the **API** application under **Targets & Services** on the right-hand side.
+
+ 
+
+5. A configuration dialog will appear, allowing you to customize:
+
+ - Which fields are exposed in the API
+ - Role-based access and security scope
+ - API endpoint naming and versioning
+
+:::tip
+
+For detailed instructions on configuring API fields, access policies, and security settings, see [Create Data API](../../publish-apis/create-api-service.md).
+
+:::
+
+## Next Steps
+
+At this stage, the API application is linked to your table—but it is not yet published or accessible. To activate it, you’ll need to define access credentials and endpoints:
+
+- [Create a Client](../../publish-apis/create-api-client.md): Define who can access the API and how. Set up authentication via access tokens or basic auth, and assign the appropriate roles.
+- [Create a Server](../../publish-apis/create-api-server.md): Define the public-facing endpoint that client applications will call. You can customize the path, method, and exposure scope.
+
+Once published, your API delivers secure, real-time access to curated business data—ideal for dashboards, embedded analytics, mobile apps, or enterprise integrations.
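+Once a client and server exist, consuming the published API is an ordinary authenticated HTTP call. The endpoint and token below are placeholders for the values from your own API server and client credentials; the request is built but deliberately not sent:

```python
import urllib.request

# Hypothetical endpoint and token — substitute the values from your
# published API server and client credentials.
BASE_URL = "https://tapdata.example.com/api/v1/customers"
ACCESS_TOKEN = "YOUR_ACCESS_TOKEN"

req = urllib.request.Request(
    BASE_URL + "?limit=10",
    headers={
        "Authorization": f"Bearer {ACCESS_TOKEN}",
        "Accept": "application/json",
    },
)

# In a real integration you would now execute the call, e.g.:
#   with urllib.request.urlopen(req) as resp:
#       data = resp.read()
print(req.get_header("Authorization"))  # Bearer YOUR_ACCESS_TOKEN
```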
diff --git a/docs/operational-data-hub/adm-layer/sync-downstream.md b/docs/operational-data-hub/adm-layer/sync-downstream.md
new file mode 100644
index 00000000..182019b3
--- /dev/null
+++ b/docs/operational-data-hub/adm-layer/sync-downstream.md
@@ -0,0 +1,49 @@
+# Sync Data to Downstream Targets
+
+TapData empowers you to distribute real-time, processed data to a wide range of downstream systems—without building fragile, point-to-point pipelines. Whether you’re connecting cloud warehouses, on-prem databases, or messaging systems, TapData provides a unified and scalable way to deliver fresh, trusted data where it’s needed most.
+
+## Why Sync to Downstream Targets
+
+Organizations today rarely operate on a single data system. You might be running operational dashboards from PostgreSQL, triggering real-time alerts through Kafka, and analyzing large datasets in ClickHouse or a cloud warehouse (e.g., Snowflake). Without a centralized distribution layer, this often results in complex ETL chains, data silos, and inconsistent views across teams.
+
+TapData acts as a real-time data hub, allowing you to:
+
+- Streamline cross-system delivery from one trusted source
+- Power real-time analytics and automation with sub-second freshness
+- Distribute curated data views across databases, queues, and APIs
+- Trigger downstream workflows (e.g. Kafka, webhooks) on every change
+- Connect to anything with open, pluggable integration
+
+For example, a retail company might consolidate customer and order data into a [unified view via MDM](../mdm-layer/build-view-in-odh.md), and then:
+
+- **Push it to ClickHouse** to accelerate analytical queries on billions of rows
+- **Stream it to Kafka** to power real-time recommendation engines
+- **Sync it to PostgreSQL** for operational BI dashboards used by sales and operations teams
+
+TapData ensures all downstream systems stay continuously in sync through [Change Data Capture](../../introduction/change-data-capture-mechanism.md) (CDC) and intelligent transformation, so you can build once and deliver everywhere.
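+"Build once and deliver everywhere" can be pictured as a single change stream fanned out to several target handlers. A simplified, in-memory sketch (the target names are illustrative, not TapData internals):

```python
class ChangeHub:
    """Minimal fan-out: one CDC stream, many downstream subscribers."""

    def __init__(self):
        self.subscribers = {}

    def subscribe(self, name, handler):
        self.subscribers[name] = handler

    def publish(self, event):
        # Every registered target receives every change event.
        for handler in self.subscribers.values():
            handler(event)

clickhouse_rows, kafka_msgs = [], []

hub = ChangeHub()
hub.subscribe("clickhouse", clickhouse_rows.append)
hub.subscribe("kafka", kafka_msgs.append)

hub.publish({"table": "orders", "op": "insert", "id": 42})
print(len(clickhouse_rows), len(kafka_msgs))  # 1 1
```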
+
+## Procedure
+
+From either the **Foundation Data Model (FDM)** or the **Master Data Model (MDM)**, you can deliver curated data to any downstream system using a simple drag-and-drop interface:
+
+1. Log in to the TapData platform.
+
+2. In the left sidebar, go to **Real-Time Data Center**.
+
+3. Locate the table you want to sync in your data model panel.
+
+4. Drag it into your desired **target** under **Targets & Services** on the right.
+
+ Any destination previously configured as a **Target**—whether a relational database, document store, message queue, or cloud platform—will appear here.
+
+ 
+
+5. In the pop-up dialog, choose your sync strategy (e.g., **Full + Incremental** or **Full Only**), then click **Save and Run** to start the pipeline.
+
+ Once configured, TapData handles everything automatically—schema tracking, change detection, and real-time sync—so your data stays fresh and consistent across all systems.
+
+:::tip
+
+TapData supports dozens of downstream systems across databases, message queues, warehouses, and SaaS platforms. [See full list ›](../../connectors/supported-data-sources.md)
+
+:::
\ No newline at end of file
diff --git a/docs/operational-data-hub/advanced/README.md b/docs/operational-data-hub/advanced/README.md
new file mode 100644
index 00000000..3c757f7e
--- /dev/null
+++ b/docs/operational-data-hub/advanced/README.md
@@ -0,0 +1,6 @@
+# Advanced Features & Extensions
+
+import DocCardList from '@theme/DocCardList';
+
+
+
diff --git a/docs/user-guide/advanced-settings/custom-node.md b/docs/operational-data-hub/advanced/custom-node.md
similarity index 89%
rename from docs/user-guide/advanced-settings/custom-node.md
rename to docs/operational-data-hub/advanced/custom-node.md
index 4842a3b0..ae52bb04 100644
--- a/docs/user-guide/advanced-settings/custom-node.md
+++ b/docs/operational-data-hub/advanced/custom-node.md
@@ -1,9 +1,5 @@
# User Defined Processors
-import Content from '../../reuse-content/_enterprise-and-community-features.md';
-
-
-
By customizing the node function, you have the flexibility to organize your JavaScript script into reusable processing nodes. Once created, these custom nodes can be easily referenced in your data transformation tasks without the need for rewriting the script. This significantly reduces the development workload.
In this article, we will guide you on how to use custom nodes effectively and provide use cases as examples for your reference.
@@ -12,7 +8,7 @@ In this article, we will guide you on how to use custom nodes effectively and pr
## Create Custom Processors
-1. [Log in to TapData Platform](../../user-guide/log-in.md).
+1. Log in to TapData platform.
2. In the left navigation bar, select **Advanced** > **User Defined Processors**.
@@ -54,7 +50,7 @@ To ensure information security, if you need to desensitize certain mobile phone
**Procedure:**
-1. [Log in to TapData Platform](../log-in.md).
+1. Log in to TapData platform.
2. In the left navigation bar, select **Advanced** > **User Defined Processors**.
@@ -86,7 +82,7 @@ To ensure information security, if you need to desensitize certain mobile phone
6. Click the **Save** in the top right corner.
-7. [Create a data transformation task](../data-development/create-task.md). Add the phone number desensitization node between the source and target nodes in the data development task, and specify the field **mobile** as the input for the desensitization process.
+7. [Create a data transformation task](../../data-transformation/create-views/README.md). Add the phone number desensitization node between the source and target nodes in the data development task, and specify the field **mobile** as the input for the desensitization process.

diff --git a/docs/user-guide/advanced-settings/manage-external-storage.md b/docs/operational-data-hub/advanced/manage-external-storage.md
similarity index 89%
rename from docs/user-guide/advanced-settings/manage-external-storage.md
rename to docs/operational-data-hub/advanced/manage-external-storage.md
index 1cb063e5..2d023792 100644
--- a/docs/user-guide/advanced-settings/manage-external-storage.md
+++ b/docs/operational-data-hub/advanced/manage-external-storage.md
@@ -1,9 +1,5 @@
# Manage External Storage
-import Content from '../../reuse-content/_enterprise-and-community-features.md';
-
-
-
To facilitate the quick reading of task-related information subsequently, TapData stores necessary configurations, incremental logs of source tables, and other information related to the task in its internal MongoDB database. To store more data, you can create an external database to store relevant data.
## Prerequisites
@@ -12,7 +8,7 @@ An external database intended for data storage has been created. Currently, Mong
## Create External Storage
-1. [Log in to TapData Platform](../log-in.md).
+1. Log in to TapData platform.
2. In the left navigation bar, select **Advanced** > **External Storage**.
@@ -38,7 +34,7 @@ An external database intended for data storage has been created. Currently, Mong
## Use External Storage
-You can enable the Shared Mining feature and select the recently configured external storage when [creating a connection](../../prerequisites/README.md), as shown in the example below:
+You can enable the Shared Mining feature and select the recently configured external storage when [creating a connection](../../connectors/README.md), as shown in the example below:

diff --git a/docs/user-guide/advanced-settings/manage-function.md b/docs/operational-data-hub/advanced/manage-function.md
similarity index 82%
rename from docs/user-guide/advanced-settings/manage-function.md
rename to docs/operational-data-hub/advanced/manage-function.md
index a2199e13..45977118 100644
--- a/docs/user-guide/advanced-settings/manage-function.md
+++ b/docs/operational-data-hub/advanced/manage-function.md
@@ -1,13 +1,10 @@
# Manage Functions
-import Content from '../../reuse-content/_enterprise-and-community-features.md';
-
-
-TapData supports a wide range of functions to facilitate the definition of processing steps, allowing for their use in [JavaScript (JS) nodes](../data-development/process-node.md#js-process). Additionally, you can freely define custom functions or import third-party JAR packages to introduce new functions as needed.
+TapData supports a wide range of functions to facilitate the definition of processing steps, allowing for their use in [JavaScript (JS) nodes](../../data-transformation/process-node.md#js-process). Additionally, you can freely define custom functions or import third-party JAR packages to introduce new functions as needed.
## Procedure
-1. [Log in to TapData Platform](../log-in.md).
+1. Log in to TapData platform.
2. In the left navigation bar, select **Advanced** > **Function List**.
diff --git a/docs/user-guide/advanced-settings/share-cache.md b/docs/operational-data-hub/advanced/share-cache.md
similarity index 95%
rename from docs/user-guide/advanced-settings/share-cache.md
rename to docs/operational-data-hub/advanced/share-cache.md
index 163926a7..c43f0a00 100644
--- a/docs/user-guide/advanced-settings/share-cache.md
+++ b/docs/operational-data-hub/advanced/share-cache.md
@@ -1,14 +1,10 @@
# Create Live Cache
-import Content from '../../reuse-content/_enterprise-and-community-features.md';
-
-
-
Live Cache is primarily designed to alleviate the pressure on the source database by multiple tasks processing some hot data. By placing these data in the cache, it can be used by multiple tasks.
## Create Live Cache
-1. [Log in to TapData Platform](../log-in.md).
+1. Log in to TapData platform.
2. In the left navigation bar, select **Advanced** > **Live Cache**.
@@ -24,7 +20,7 @@ Live Cache is primarily designed to alleviate the pressure on the source databas
   * **Automatic Index Creation**: Turning on this feature will automatically create indexes for cache keys in the source table, which may impact the source database's performance.
* **Cache Keys**: Choose one or more fields as the primary key to identify data for caching.
* **Cache Fields**: Select the commonly used fields you need to cache.
- * **External Storage Configuration**: Choose external storage, you can [create external storage](../advanced-settings/manage-external-storage.md) separately for the cache to store related data.
+ * **External Storage Configuration**: Choose external storage, you can [create external storage](manage-external-storage.md) separately for the cache to store related data.
* **Maximum Memory**: The maximum memory amount the system will save, exceeding it will delete the least frequently used data based on call time.
* **Use CDC log Caching**: CDC log Caching digs into incremental logs, eliminating the need to start multiple log collection processes for multiple incremental tasks. This significantly reduces the resource usage and waste of the source database.
* **Maximum Cache Memory**: The default is 500 MB. TapData will save up to the maximum memory amount, and if exceeded, it will delete the least frequently used data based on the call time.
diff --git a/docs/user-guide/advanced-settings/share-mining.md b/docs/operational-data-hub/advanced/share-mining.md
similarity index 88%
rename from docs/user-guide/advanced-settings/share-mining.md
rename to docs/operational-data-hub/advanced/share-mining.md
index 17a93deb..7111c312 100644
--- a/docs/user-guide/advanced-settings/share-mining.md
+++ b/docs/operational-data-hub/advanced/share-mining.md
@@ -1,20 +1,17 @@
# Manage CDC Log Cache
-import Content from '../../reuse-content/_enterprise-features.md';
-
-
To alleviate the pressure on the source database during increments, TapData supports shared mining of **change data capture** (CDC) logs. Once the CDC Log Cache is activated, it will not start mining immediately. Instead, it begins when you create a task for the table belonging to that data source. Regardless of whether the mining task is paused or encounters errors, it will not affect the normal operation of the synchronization task.
## Enable CDC Log Cache
-You can enable CDC Log Cache when [creating a connection](../../prerequisites/README.md), which allows for the collection of the source database's incremental logs into external storage. Once enabled, these logs can be used by multiple tasks, eliminating the need for redundant reads of the source database's incremental logs.
+You can enable CDC Log Cache when [creating a connection](../../connectors/README.md), which allows for the collection of the source database's incremental logs into external storage. Once enabled, these logs can be used by multiple tasks, eliminating the need for redundant reads of the source database's incremental logs.

## Use CDC Log Cache
-Create a data transformation or data replication task. When the task contains incremental tasks and the data source has cached the CDC logs, you can use this feature in the task settings. For more introductions about task configuration, see [create data replication/transformation task](../../quick-start/create-task.md).
+Create a data transformation or data replication task. When the task contains incremental tasks and the data source has cached the CDC logs, you can use this feature in the task settings. For more introductions about task configuration, see [create task](../../data-replication/create-task.md).

diff --git a/docs/operational-data-hub/fdm-layer/README.md b/docs/operational-data-hub/fdm-layer/README.md
new file mode 100644
index 00000000..6ad3df4f
--- /dev/null
+++ b/docs/operational-data-hub/fdm-layer/README.md
@@ -0,0 +1,6 @@
+# Ingest and Sync Data (FDM Layer)
+
+import DocCardList from '@theme/DocCardList';
+
+
+
diff --git a/docs/operational-data-hub/fdm-layer/explore-fdm-tables.md b/docs/operational-data-hub/fdm-layer/explore-fdm-tables.md
new file mode 100644
index 00000000..89625919
--- /dev/null
+++ b/docs/operational-data-hub/fdm-layer/explore-fdm-tables.md
@@ -0,0 +1,33 @@
+# Explore FDM Table Details
+
+Explore comprehensive details for any table in the Platform Cache (FDM Layer), including metadata, schema, related tasks, and data lineage—all designed to help you manage synchronized data with confidence.
+
+## Procedure
+
+1. Log in to TapData Platform.
+
+2. In the left sidebar, select **Real-time Data Center**.
+
+3. On this page, you’ll see a clear overview of all tables currently synchronized into the **Foundation Data Model (FDM Layer)**.
+
+ 
+
+4. Click the name of the table you want to inspect. This opens a detailed view with the following sections:
+
+ - **Overview**: See essential metadata for your table—including size, row count, column types, column comments/descriptions (by default sourced from your schema), and sample data.
+
+ You can also edit the business description of the table or individual fields here. This lets you assign clear, meaningful names and notes that make it easier for teams to understand, identify, and manage data in future workflows.
+
+ 
+
+ - **Schema**: View detailed column definitions including types, primary keys, foreign keys, and default values.
+
+ 
+
+ - **Tasks**: Review all replication tasks associated with this table, along with their current statuses. Click any task name to open its detailed monitoring page with metrics like sync performance, incremental delay, and logs. For more details, see [Monitor Tasks](../../data-transformation/monitor-view-tasks.md).
+
+ 
+
+ - **Lineage**: Visualize data lineage as an interactive graph. This helps you track and manage data quality across your pipelines. Clicking a task node in the lineage view will take you directly to its monitoring page.
+
+ 
\ No newline at end of file
diff --git a/docs/operational-data-hub/fdm-layer/replicate-data.md b/docs/operational-data-hub/fdm-layer/replicate-data.md
new file mode 100644
index 00000000..3ae5a3b5
--- /dev/null
+++ b/docs/operational-data-hub/fdm-layer/replicate-data.md
@@ -0,0 +1,85 @@
+# Replicate Data from Sources
+
+Learn how to set up real-time data replication from your source systems into the FDM (Platform Cache) layer using TapData’s flexible, CDC-powered engine. This ensures your business always works with fresh, reliable data—without burdening production databases.
+
+## Why Use the FDM Layer?
+
+The FDM (Foundation Data Model) layer acts as a high-performance, real-time cache for your most critical business data.
+
+- Decouples analytics and downstream processes from production databases
+- Boosts security and performance by limiting access to sensitive source systems
+- Enables scalable, flexible data delivery for multiple teams and use cases
+
+TapData’s replication is powered by **Change Data Capture (CDC)** technology, allowing you to sync data with low latency and minimal impact on source systems. TapData supports a wide range of popular databases and cloud data sources—so you can unify your data, no matter where it lives.
+
+## Prerequisites
+
+Before you start, make sure you’re set up for a smooth experience:
+
+- [ODH mode is enabled](../set-up-odh.md) on your TapData workspace.
+- Your source database connections are configured and tested; for details, see [Connect Data Sources](../../connectors/README.md).
+- Your source tables must have a primary key or unique index.
+
+## Procedure
+
+1. Log in to TapData Platform.
+
+2. In the left sidebar, select **Real-time Data Center**.
+
+3. On this page, you’ll see all connected data sources, displayed across four platform layers for clear governance and data flow management.
+
+ For details on each layer, see [ODH Architecture Overview](https://docs.tapdata.net/user-guide/real-time-data-hub/daas-mode/enable-daas-mode).
+
+4. In the **Source Data Layer**, find the database and tables you want to replicate (or use the search icon to locate them), then drag the desired table(s) to the **Platform Cache Layer**.
+
+ 
+
+ :::tip Alternative Approach
+
+ You can also replicate data using the traditional [Data Replication](../../data-replication/create-task.md) approach, which provides more granular control over source-to-target configurations and processing nodes. However, the FDM layer approach shown here is specifically optimized for building an operational data hub with standardized naming and governance.
+
+ :::
+
+5. In the configuration dialog, set a **table prefix** and choose the replication mode (**Full and Incremental Sync** or **Full Sync**).
+
+ 
+
+ Tapdata automatically generates the FDM-layer naming pattern for you, which typically includes the `FDM_` prefix plus your table name.
+ Here you can add your **source system identifier** to make it easier for business teams to recognize the origin of the data.
+
+ For more naming best practices, see [Plan Your Operational Data Hub](../plan-data-platform.md).
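+
+   Conceptually, the generated name is a simple composition of the prefix, your source system identifier, and the table name. The sketch below is only our illustration of that convention for planning purposes, not Tapdata's actual generator:
+
+   ```js
+   // Illustrative sketch (assumption): compose an FDM-layer table name
+   // from the FDM_ prefix, an optional source system identifier, and
+   // the source table name.
+   function fdmTableName(sourceId, tableName) {
+     return "FDM_" + (sourceId ? sourceId + "_" : "") + tableName;
+   }
+
+   console.log(fdmTableName("ECommerce", "user_registration"));
+   // FDM_ECommerce_user_registration
+   ```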
+
+6. Click **Save & Run** to immediately start the replication.
+
+ Tapdata will automatically create and launch a real-time replication task, continuously syncing your selected tables into the **FDM (Platform Cache) Layer** with built-in validation.
+
+ To monitor the task, click the icon next to the table name—this opens the job monitoring page with live status and performance metrics.
+
+ 
+
+ :::tip
+
+ When multiple tables are selected from the same database source, Tapdata will group them into a single replication task by default. This makes it easier to manage schema changes and monitor task status consistently.
+
+ :::
+
+ Tapdata also helps keep things organized by automatically creating a **folder** in the Platform Cache named after your source connection. Your new replication task will appear inside this folder, making it easier to manage and find related tables from the same source.
+
+ If you want to rename the folder, simply **hover over it**, click the vertical three-dot (**⋮**) icon, and select **Edit** to change its name.
+
+
+
+
+:::tip
+
+**Need more control?**
+If you want to adjust advanced options—like read concurrency, batch size, hash-based sharding (for splitting large tables during full sync), or index replication—click **Only Save** instead of **Save & Run**.
+After saving, locate your new task in the list, click its name, and configure these settings before running it.
+
+:::
+
+## Next Step
+
+- [Explore FDM Table Details](explore-fdm-tables.md)
+- [Data Validation](validate-data-quality.md)
\ No newline at end of file
diff --git a/docs/operational-data-hub/fdm-layer/validate-data-quality.md b/docs/operational-data-hub/fdm-layer/validate-data-quality.md
new file mode 100644
index 00000000..90752acf
--- /dev/null
+++ b/docs/operational-data-hub/fdm-layer/validate-data-quality.md
@@ -0,0 +1,71 @@
+# Validate Data Quality
+
+Tapdata’s data validation feature helps you ensure data integrity throughout your real-time pipelines. Built on robust in-house technology, it verifies that your replicated data remains consistent between source and target, helping you meet strict production-quality requirements. This guide explains how to configure and manage data validation tasks.
+
+## Background
+
+Maintaining trust in your data is critical—especially when delivering real-time insights or meeting strict compliance requirements. Tapdata’s CDC-powered pipelines deliver high-quality replication to the Platform Cache (FDM Layer), while built-in validation tools ensure your data remains consistent, reliable, and fully traceable at every step.
+
+Tapdata provides a flexible data validation framework that helps you:
+
+- Confidently verify record counts and field-level accuracy between your source systems and the Platform Cache (FDM Layer)
+- Quickly detect and analyze discrepancies to support real-time operations
+- Configure alerts and outputs to fit your team’s workflow
+
+Whether you’re managing a single replication job or hundreds of pipelines across teams, Tapdata’s validation features give you complete confidence in your data integrity.
+
+## Limitation
+
+Data validation is **not supported** for queue-based sources like Kafka.
+
+## Configure a Validation Task
+
+1. Log in to TapData Platform.
+
+2. In the left sidebar, select **Data Validation**.
+
+3. On the validation page, click the button in the top-right corner to choose your validation type:
+
+ - **Task Consistency Validation**: Use this to validate data consistency for a specific replication task between your source and FDM layer.
+ - **Any Table Data Validation**: Use this option to validate data between any two tables you have access to in Tapdata, even if they aren’t linked to a specific replication task. This is useful for ad hoc checks or legacy system comparisons.
+
+4. In the setup page, complete the following fields:
+
+ 
+
+ - **Choose Task**: Select the replication task you want to validate. Tasks created when syncing to the FDM layer follow the naming pattern `TableName_Clone_To_FDM_`.
+ Tip: You can also see the linked task by clicking the table name within the FDM layer view.
+ - **Validation Name**: Enter a meaningful name for your validation task that reflects the business context.
+ - **Type**: Choose the validation method that matches your needs:
+ - **Count Validation**: Compares row counts only. Fastest, but doesn’t show detailed differences.
+ - **All Fields Validation**: Compares all fields row by row. Shows all differences but is slower.
+ - **Related Fields Validation**: Compares only key fields that have sortable indexes. Medium speed, good for partial checks.
+ - **Hash Validation**: Computes and compares hashes between source and target tables. Faster than row-by-row but only supported for homogeneous data sources.
+ - **Advanced Configuration (Optional)**: Click **Advanced Settings** to customize:
+ - **Result Output**: Choose whether to output all mismatched records or only those missing in the source.
+ - **Validation Frequency**: By default, validation runs once. You can schedule repeated validations with a start time, end time, and interval.
+ - **Validation Task Alert**: Set up rules for sending alerts when the task fails or when discrepancies are found.
+     - **Number of Saved Errors**: Set the maximum number of inconsistent records to save (default is 100, max 10,000). We recommend a higher limit for better traceability.
+ - **Table Configuration**: By default, Tapdata automatically loads the source and target tables from your replication or development task. You can enable **Data Filtering** to validate only a subset of data with custom SQL or aggregation queries. Advanced users can also add **custom JavaScript validation logic**.
+
+5. Click **Save** to add the validation task to your list.
+ Then, in the task list, click **Run** next to your new validation task to start validation.
+
+## Manage Validation Results
+
+Click the **Result** link for any validation task to view its results in depth. For any inconsistencies, you can:
+
+- Use **One-Click Repair** to automatically align data between source and target.
+- Download detailed mismatch reports for further analysis.
+
+Additionally, you can **generate repair SQL** for in-depth analysis or execute it manually in the database to fix the issue.
+
+
+
+For **All Fields Validation** or **Related Fields Validation**, you can also click **Diff Validation** in the top-right corner to re-run validation only on the previously identified differences, confirming whether they have been resolved.
+
+## FAQ
+
+**Q: Why might my validation task fail or report differences?**
+A: See our troubleshooting guide here: [Data Validation FAQs](https://docs.tapdata.net/faq/data-pipeline#check-data).
+
diff --git a/docs/user-guide/verify-data.md b/docs/operational-data-hub/fdm-layer/validate-views.md
similarity index 93%
rename from docs/user-guide/verify-data.md
rename to docs/operational-data-hub/fdm-layer/validate-views.md
index 65523320..cb8e2329 100644
--- a/docs/user-guide/verify-data.md
+++ b/docs/operational-data-hub/fdm-layer/validate-views.md
@@ -1,8 +1,6 @@
-# Data Validation
+# Validate View Results
-import Content from '../reuse-content/_all-features.md';
-
Leveraging various proprietary technologies, TapData ensures maximum data consistency. In addition, TapData supports data table validation to further verify and ensure the correctness of data flow, meeting the stringent requirements of production environments. This document introduces the configuration process for data validation tasks.
@@ -13,7 +11,7 @@ import TabItem from '@theme/TabItem';
## Procedure
-1. [Log in to TapData Platform](log-in.md).
+1. Log in to TapData Platform.
2. In the left navigation bar, click **Data Validation**.
@@ -24,7 +22,7 @@ import TabItem from '@theme/TabItem';
```
-
+
- **Choose Job**: Choose the data replication/data transformation task to verify.
- **Verify Task Name**: Enter a meaningful name for the task.
@@ -44,7 +42,7 @@ import TabItem from '@theme/TabItem';
-
+
@@ -70,7 +68,7 @@ import TabItem from '@theme/TabItem';
5. (Optional) Click on **Result** for the verification task to view detailed verification results. For discrepancies, you can click **Data Correction** to align the data or **Download** for in-depth analysis.
- 
+ 
:::tip
@@ -84,8 +82,8 @@ import TabItem from '@theme/TabItem';
## Common Issues
-For troubleshooting methods regarding failed validation tasks or inconsistent validation data, see [Common Questions on Data validation](../faq/data-pipeline.md#check-data).
+For troubleshooting methods regarding failed validation tasks or inconsistent validation data, see [Common Questions on Data validation](../../faq/data-pipeline.md#check-data).
## See also
-[Incremental Data Validation](incremental-check.md)
\ No newline at end of file
+[Incremental Data Validation](../../data-replication/incremental-check.md)
\ No newline at end of file
diff --git a/docs/operational-data-hub/mdm-layer/README.md b/docs/operational-data-hub/mdm-layer/README.md
new file mode 100644
index 00000000..3460660e
--- /dev/null
+++ b/docs/operational-data-hub/mdm-layer/README.md
@@ -0,0 +1,6 @@
+# Design and Transform Data (MDM Layer)
+
+import DocCardList from '@theme/DocCardList';
+
+
+
diff --git a/docs/operational-data-hub/mdm-layer/build-view-in-odh.md b/docs/operational-data-hub/mdm-layer/build-view-in-odh.md
new file mode 100644
index 00000000..fc2f017b
--- /dev/null
+++ b/docs/operational-data-hub/mdm-layer/build-view-in-odh.md
@@ -0,0 +1,38 @@
+# Create Real-Time Views in the MDM Layer
+
+The MDM layer is where you transform and serve business-ready data. You can create materialized views directly from raw tables in the FDM layer or build on top of existing standardized models—such as the *[Unified User View](prepare-and-transform.md)* we created earlier.
+
+Defining views at this stage helps you clearly separate **core data modeling** from **downstream usage**, making your pipeline easier to manage, reuse, and scale.
+
+## Why Build Materialized Views in MDM?
+
+- **Layered Governance**
+ The FDM layer handles raw [1:1 replication](../fdm-layer/replicate-data.md). The MDM layer focuses on cleaning, enriching, and modeling. Creating materialized views here helps you track lineage and control logic in one place.
+
+- **Reusable Data Models**
+ Define core business entities—like [users](prepare-and-transform.md), products, or transactions—and reuse them across wide-table views, API responses, or ML pipelines.
+
+- **Security & Compliance**
+ Since sensitive fields are masked or removed in the MDM layer, downstream teams can safely access only the curated outputs without needing raw data access.
+
+## How to Create a View
+
+The creation process is the same as described in [Build Incremental Materialized Views](../../data-transformation/create-views/using-data-pipeline-ui.md), with only the **starting point** and **source tables** being different:
+
+- **Start here**: Go to **Real-Time Data Hub > MDM**, then click the  icon to create a new Materialized View.
+
+ 
+
+- **Choose your sources**: Use FDM tables or existing MDM models—TapData lets you flexibly combine and transform both.
+
+ 
+
+## Where to Use These Views
+
+Once your view is running in real time, you can:
+
+- **Expose it via API** with TapData’s built-in API Service
+- **Sync it to the Downstream Services** for BI dashboards or reporting
+- **Use it as input** in additional MDM pipelines
+
+For example, the *[Unified User View](prepare-and-transform.md)* we created earlier can now be enriched with order data to produce a wide, real-time customer profile—ideal for dashboards, personalization, or marketing automation.
diff --git a/docs/operational-data-hub/mdm-layer/define-data-categories.md b/docs/operational-data-hub/mdm-layer/define-data-categories.md
new file mode 100644
index 00000000..60bc021a
--- /dev/null
+++ b/docs/operational-data-hub/mdm-layer/define-data-categories.md
@@ -0,0 +1,35 @@
+# Define Data Categories
+
+Data categories help teams organize and standardize business entities within the **Master Data Model (MDM)**. By defining categories, you can clearly structure key data like user profiles, order details, payment transactions, and risk scoring models.
+
+## Why Define Data Categories?
+
+Defining categories for your data brings several key benefits:
+
+- **Clear business context:** Everyone knows what each table is for.
+- **Simpler collaboration:** New team members can get up to speed faster.
+- **Better governance and access control:** Easily assign ownership and usage boundaries.
+- **Standardization:** Supports consistent naming, documentation, and data lineage tracking.
+
+## Procedure
+
+In an e-commerce scenario, you might organize your Master Data Model around **core business domains**. For example, you could create categories (folders) like *Orders*, *Customers*, *Marketing*, and *Payments* to mirror your business architecture and make collaboration more intuitive.
+
+1. Log in to TapData platform.
+
+2. In the **Master Data Model** section, click the **New Category** icon.
+
+ 
+
+3. In the dialog that appears, fill in the **Directory Name** and **Catalog Description** fields.
+
+ - **Directory Name** is your category’s business-facing name—for example, you might use **Marketing** to group all tables supporting marketing campaigns.
+ - **Catalog Description** explains its purpose—for example, *“Data supporting marketing analysis and campaign optimization.”*
+
+4. Click **OK** to create the new category.
+
+## Next Steps
+
+- [Prepare and Transform Data](prepare-and-transform.md)
+- [Create Real-Time Materialized Views](build-view-in-odh.md)
+
diff --git a/docs/operational-data-hub/mdm-layer/prepare-and-transform.md b/docs/operational-data-hub/mdm-layer/prepare-and-transform.md
new file mode 100644
index 00000000..e774e103
--- /dev/null
+++ b/docs/operational-data-hub/mdm-layer/prepare-and-transform.md
@@ -0,0 +1,131 @@
+# Prepare and Transform Data
+
+Transform scattered data into a clean, standardized, and privacy-safe foundation for master data modeling. TapData lets you visually enrich and align records across multiple tables—ready for analytics, personalization, and operational use.
+
+## Why Transform Data
+
+In most organizations, business data is scattered across many systems—CRMs, e-commerce platforms, loyalty programs, marketing tools, and more. These systems use different naming conventions, formats, and standards, making the raw data inconsistent, redundant, or even privacy-sensitive. It’s not immediately usable for downstream analytics, applications, or AI.
+
+To turn this fragmented data into trusted, analysis-ready information, you need to standardize, clean, and organize it into unified structures. That’s where the **MDM (Master Data Management) Layer** in TapData comes into play.
+
+Once your raw data is replicated into the FDM (Platform Cache) Layer, you can design transformation pipelines in the MDM layer to shape it for downstream use. Typical tasks include removing or renaming fields, masking sensitive values, merging records from multiple sources, or applying custom rules and calculations. You have full flexibility to define logic that matches your business requirements—whether simple or complex.
+
+The MDM layer supports two key roles:
+
+- Building **reusable, clean data models**—like standardized user profiles or product views—that serve as shared building blocks across projects.
+- Creating **real-time materialized views** that are optimized for direct consumption by BI dashboards, API integrations, or other applications.
+
+These two roles often work together: a reusable user model can be further enriched or joined with transaction data to produce a wide table materialized view for reporting or personalization. In this example, we’ll walk through the first step—building a reusable user view.
+
+## Example: Build a Unified User View
+
+Let’s walk through a practical case: creating a privacy-compliant, reusable user view by cleaning and merging two FDM tables:
+
+
+
+In this scenario, we’ll:
+
+- Clean the `user_registration` table by removing phone numbers, masking email addresses, and extracting email domains for future segmentation.
+- Standardize the field names by renaming `userId` to `user_id`, aligning it with the membership table.
+- Use a **Master-Detail Merge** node to combine both tables by `user_id` into a unified profile.
+
+The result is a well-structured user view that can be reused in later transformation tasks—for example, joining with order data to build a real-time wide table view for marketing, dashboards, or machine learning models.
+
+If the output view is likely to be reused across teams or workflows, we recommend creating it as a standalone MDM transformation task. If it’s a one-off integration, you can embed the logic directly into a downstream pipeline for better performance and simpler management.
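+
+Conceptually, the Master-Detail Merge folds membership fields into each registration record by the shared `user_id` key. The sketch below illustrates only the shape of the output; the merge itself is performed by Tapdata's Master-Detail Merge node, and the `level` field is a hypothetical membership attribute:
+
+```js
+// Conceptual sketch of a master-detail merge on user_id. Not how Tapdata
+// implements it internally; it only shows the shape of the result.
+function mergeByKey(master, detail, key) {
+  var index = {};
+  detail.forEach(function (d) { index[d[key]] = d; });
+  return master.map(function (m) {
+    // master fields win on conflicts; detail fields are folded in
+    return Object.assign({}, index[m[key]] || {}, m);
+  });
+}
+
+var registration = [{ user_id: 1, email: "a****@example.com" }];
+var membership = [{ user_id: 1, level: "gold" }]; // level: hypothetical field
+
+console.log(mergeByKey(registration, membership, "user_id"));
+```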
+
+## Procedure
+
+1. Start a New Data Transformation Task.
+
+ 1. Log in to TapData platform.
+
+ 2. In the left navigation menu, go to **Real-Time Data Hub**.
+
+ 3. Under the **MDM** section, click the  icon.
+
+ 
+
+ 4. In the pop-up dialog, name your view (e.g., `users_main_view`). TapData will automatically add the `MDM_` prefix.
+
+ 5. Click **OK** to create the task. You’ll be taken to the pipeline configuration page.
+
+ Here, you’ll see that TapData has pre-built the basic structure for you—a visual pipeline with a **Master-Detail Merge** node and a **Target Node** already in place.
+
+ 
+
+2. Add Source Tables from FDM.
+
+ 1. In the **Connections** panel on the left, find your source connection (e.g., an FDM storage engine).
+
+ 
+
+ 2. Drag the following two tables onto the canvas:
+
+ - `FDM_ECommerce_user_registration`
+ - `FDM_ECommerce_user_membership`
+
+ 3. Connect both tables to the **Master-Detail Merge** node by dragging from the **+** handle on each table node to the merge node.
+
+3. Standardize Fields in the Membership Table.
+
+ 1. Hover over the line between `FDM_ECommerce_user_membership` and the merge node, and click the **+** icon.
+
+ 
+
+ 2. Select the **Field Rename** node.
+
+ 3. Click the new node to configure it. Rename the `userId` field to `user_id` to match the schema used in the registration table.
+
+4. Clean and Transform the Registration Table.
+
+ We’ll now remove the phone number and mask the email address to meet privacy and compliance standards.
+
+ 1. Hover over the line from `FDM_ECommerce_user_registration` to the merge node and click the **+** icon.
+
+ 2. Add an **Add and Delete Fields** node.
+
+ 3. In the configuration panel, delete the `phone` field.
+
+ 
+
+ 4. Hover over the line from this node to the merge node again, click **+**, and add a **Standard JS** node.
+
+ 5. In the **Script** box, paste the following JavaScript to mask the email address:
+
+      ```js
+      // Mask the local part of the address, keep the domain
+      // (e.g. alice@example.com -> a****@example.com)
+      if (record.email && record.email.indexOf("@") > -1) {
+        var parts = record.email.split("@");
+        record.email = parts[0].substring(0, 1) + "****@" + parts[1];
+      }
+      return record;
+      ```
+
+ 
+
+ :::tip
+
+ TapData offers many prebuilt nodes, but you can also use JS nodes for advanced customization. See [Appendix: JS Examples](../../appendix/standard-js.md) for more.
+
+ :::
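+
+   The scenario above also calls for extracting email domains for later segmentation. A possible companion node is sketched below; the `email_domain` field name and the wrapper function are our own assumptions, since inside Tapdata's Standard JS node only the body operating on `record` is needed:
+
+   ```js
+   // Hypothetical sketch: derive an email_domain field for segmentation.
+   // The field name email_domain is our own choice. Wrapped in a function
+   // here so the logic can be tried standalone.
+   function process(record) {
+     if (record.email && record.email.indexOf("@") > -1) {
+       record.email_domain = record.email.split("@")[1];
+     }
+     return record;
+   }
+
+   console.log(process({ email: "a****@example.com" }).email_domain);
+   // example.com
+   ```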
+
+5. Configure the Merge Logic.
+
+ Click on the **Master-Detail Merge** node and set the join key to `user_id` for both tables.
+
+ 
+
+6. Run the Task.
+
+ Click **Start** in the upper-right corner. TapData will launch the real-time transformation task and take you to the monitoring page, where you can track data volume, processing status, and latency in real time.
+
+## What’s Next?
+
+This new Unified User View is a powerful, reusable asset. You can now:
+
+- Join it with transaction or behavior tables to enrich order-level analytics
+- Use it as input for marketing segmentation or churn prediction
+- Power AI-driven recommendation systems with high-quality, unified profiles
+
+If you don’t plan to reuse this view across multiple flows, consider performing all processing steps within a single task to reduce transformation latency.
+
+## See also
+
+[Supported Processing Node](../../data-transformation/process-node.md)
diff --git a/docs/operational-data-hub/plan-data-platform.md b/docs/operational-data-hub/plan-data-platform.md
new file mode 100644
index 00000000..cb376151
--- /dev/null
+++ b/docs/operational-data-hub/plan-data-platform.md
@@ -0,0 +1,105 @@
+# Plan Your Data Platform
+
+Build a unified, real-time data foundation that connects all your systems, breaks down silos, and delivers consistent, high-quality data to power your business. This guide explains why you need an Operational Data Hub (ODH) and how to plan and implement one step by step.
+
+## Why Plan Your Data Platform
+
+As businesses grow, data often ends up scattered across siloed systems—making it hard to share, analyze, or act on in real time.
+
+In e-commerce or fintech, functions like fraud detection, inventory checks, or customer segmentation all depend on fast, reliable data. But when systems are fragmented, timely decision-making becomes difficult.
+
+Traditional data warehouses help with historical analysis but fall short for real-time needs:
+
+- T+1 latency isn’t fast enough
+- High complexity adds engineering burden
+- Hard to adapt as needs change
+
+Some teams try building real-time pipelines with stream tools, but these often come with high complexity and steep learning curves.
+
+A better solution is an Operational Data Hub (ODH)—a lightweight, real-time layer that unifies data across systems and makes it instantly usable across teams and applications.
+
+## What is an Operational Data Hub (ODH)?
+
+An Operational Data Hub is a real-time data integration and delivery layer that sits between your source systems and consuming applications. It is designed to unify fragmented data landscapes, reduce integration complexity, and deliver consistent, standardized data in real time.
+
+At its core, an ODH is about connecting, transforming, and delivering data:
+
+- **Connect:** Seamlessly integrate diverse data sources—databases, APIs, event streams—without requiring major system changes.
+- **Transform:** Clean, standardize, and model data into consistent formats and business entities that everyone can understand.
+- **Deliver:** Make high-quality, up-to-date data available to consuming systems and teams via APIs or downstream databases.
+
+With an ODH, you move from siloed, hard-to-manage data flows to a single, unified, reusable data service that powers real-time operations and decision-making.
+
+Tapdata's ODH design breaks this journey into clear, manageable layers:
+
+
+
+| Layer | Purpose |
+| --------------------------------------------------- | ------------------------------------------------------------ |
+| **[Source Data Layer](../connectors/README.md)** | Connect to and abstract data from all business systems and sources, without disrupting existing operations. |
+| **[Platform Cache (FDM)](fdm-layer/README.md)** | Use real-time change data capture (CDC) to [mirror source tables](fdm-layer/replicate-data.md) safely, reducing load on critical systems. |
+| **[Curated Data Layer (MDM)](mdm-layer/README.md)** | [Transform, clean, and model data](mdm-layer/prepare-and-transform.md) into standardized business entities and wide tables for consistent consumption. |
+
+**Delivering Real-time Data to Business Systems**
+
+Once your data is processed and modeled in the MDM layer, you can deliver it to downstream business systems through [API Services](../publish-apis/README.md), [Data Replication](../data-replication/README.md), [Data Transformation](../data-transformation/README.md), Event Streaming, or Direct Database Access.
+
+This approach aligns with best practices for **Master Data Management (MDM)** as defined by Gartner: enabling IT and business teams to work together to ensure consistency, accuracy, governance, and shared understanding of core business data.
+
+## How to Plan Your Data Platform
+
+Once you understand the why and what of an ODH, the next question is: **How do you actually build it?**
+
+Below is a practical roadmap, based on proven best practices and real-world implementations, to help you plan and implement your own operational data platform.
+
+### 1. Define Goals and Priorities
+
+Start with business needs, not just technical architecture.
+
+- Identify critical use cases (e.g., real-time fraud scoring, customer segmentation).
+- List the core data assets required to enable these scenarios.
+
+### 2. Audit Existing Data Assets
+
+- Map out data sources, formats, and update frequencies.
+- Document owners and integration points.
+- Build an asset inventory or data catalog to clarify what's available.
+
+### 3. Establish Standards and Governance
+
+- Define unified data models and clear, agreed-upon metrics.
+
+- Standardize naming conventions and security classifications.
+
+ *Examples:*
+
+ - `FDM_SourceSystem_TableName` for raw mirrors
+ - `MDM_Domain_BusinessLogic` for processed wide tables
+ - `ADM_Domain_Metric_Frequency` for business-facing aggregates
+
+- Document data definitions, lineage, and ownership so everyone understands what's being delivered.
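+
+Conventions like these are easy to enforce with a simple automated check. The patterns below are our own illustration based on the examples above, not a Tapdata feature; adapt them to whatever standard your team adopts:
+
+```js
+// Illustrative sketch: classify table names by the layer conventions
+// above. The regular expressions are assumptions derived from the
+// example patterns.
+var layerPatterns = {
+  FDM: /^FDM_[A-Za-z0-9]+_\w+$/,    // FDM_SourceSystem_TableName
+  MDM: /^MDM_[A-Za-z0-9]+_\w+$/,    // MDM_Domain_BusinessLogic
+  ADM: /^ADM_[A-Za-z0-9]+_\w+_\w+$/ // ADM_Domain_Metric_Frequency
+};
+
+function classify(tableName) {
+  for (var layer in layerPatterns) {
+    if (layerPatterns[layer].test(tableName)) return layer;
+  }
+  return null; // does not conform; flag for review
+}
+
+console.log(classify("FDM_ECommerce_user_registration")); // FDM
+console.log(classify("random_table")); // null
+```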
+
+### 4. Design and Build Data Pipelines
+
+Follow a layered approach:
+
+- **FDM:** Mirror source data in real time without overloading production systems.
+- **MDM:** Clean, enrich, and model data into consistent, business-friendly forms.
+- **ADM:** Create ready-to-use data services or tables tailored to specific use cases.
+
+*Example:*
+
+> For fraud risk, replicate transactions and user profiles into FDM. Merge and enrich them in MDM to create a real-time user risk profile table for the scoring engine.
+
+### 5. Deploy Monitoring and Quality Checks
+
+- Set up automated monitoring and alerts for data pipelines.
+- Conduct regular quality reviews to ensure data freshness, accuracy, and availability.
+
+### 6. Iterate and Improve
+
+- Start with a pilot project for one high-impact use case.
+- Gather feedback, improve models and processes.
+- Gradually scale to additional teams and data domains.
+
+By following this approach, you can turn fragmented, hard-to-use data into a **single, standardized, reusable data service** that fuels real-time decision-making and enables your entire organization to move faster and smarter.
diff --git a/docs/operational-data-hub/set-up-odh.md b/docs/operational-data-hub/set-up-odh.md
new file mode 100644
index 00000000..a8a0f8fd
--- /dev/null
+++ b/docs/operational-data-hub/set-up-odh.md
@@ -0,0 +1,59 @@
+# Enable Operational Data Hub
+
+Now that you've [planned your data platform](plan-data-platform.md) strategy, it's time to set up your Operational Data Hub (ODH) in Tapdata. This guide will walk you through setting up the core storage engine and activating ODH mode in Tapdata.
+
+## Preparation
+
+To enable the Operational Data Hub (ODH) in Tapdata, you’ll first need to connect a MongoDB database (version 4.0 or above) as the core storage engine.
+
+This MongoDB instance will serve two critical roles:
+
+- **Platform Cache (FDM Layer):** Stores real-time mirrored data from source systems.
+- **Processing Layer (MDM Layer):** Hosts cleaned, structured business data for analytics and API access.
+
+:::tip
+
+When creating the MongoDB connection, be sure to set the **connection role as both Source and Target**, so it can support full read/write capabilities across Tapdata’s layers. For step-by-step setup, refer to [Connect On-Premises MongoDB](../connectors/on-prem-databases/mongodb.md).
+:::
+
+**Recommended best practices:**
+
+- You can use one shared database for both FDM and MDM layers, or create dedicated databases for better isolation and scalability.
+- Deploy MongoDB as a [replica set](../platform-ops/production-deploy/install-replica-mongodb.md) to ensure high availability and fault tolerance.
+- Ensure the MongoDB instance has sufficient disk space and at least 14 days of Oplog retention to support stable real-time synchronization and CDC.
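+
+To turn the 14-day guideline into a concrete Oplog size, start from your observed average change rate. The sketch below is a back-of-envelope estimate under our own assumptions, not a Tapdata formula; measure the real rate with `rs.printReplicationInfo()` in mongosh before sizing production storage:
+
+```js
+// Rough sizing sketch: Oplog capacity needed to retain N days of changes
+// at an observed average write rate, with headroom for bursts. All
+// numbers here are assumptions; measure your actual rate first.
+function oplogSizeGB(mbPerHour, retentionDays, headroom) {
+  headroom = headroom || 1.5; // safety margin for burst traffic
+  return Math.ceil((mbPerHour * 24 * retentionDays * headroom) / 1024);
+}
+
+// e.g. ~200 MB/hour of changes with the recommended 14-day retention:
+console.log(oplogSizeGB(200, 14) + " GB"); // 99 GB
+```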
+
+
+
+
+## Enable ODH in Tapdata
+
+Once your MongoDB database is ready and connected, follow these steps to enable the Operational Data Hub mode in Tapdata:
+
+1. Log in to TapData Platform.
+
+2. Click **Real-time Data Center** in the left sidebar.
+
+3. On the right side of the screen, click the  icon.
+
+4. Choose the **Data Service Platform** mode.
+
+5. Specify the MongoDB data source(s) you prepared for the **FDM** and **MDM** layers.
+
+ 
+
+ :::tip
+
+ Once saved, the selected storage engine can’t be changed later. Review your choice carefully before confirming.
+
+ :::
+
+6. Click **Save** to apply the configuration.
+
+After you complete this setup, Tapdata will automatically present the **ODH layered view** you saw in the planning section.
+
+
+
+## Next Step
+
+You’re now ready to start [syncing your source data into the FDM layer](fdm-layer/replicate-data.md), enabling real-time, consistent data delivery across your business systems.
\ No newline at end of file
diff --git a/docs/user-guide/manage-agent.md b/docs/other/manage-agent.md
similarity index 96%
rename from docs/user-guide/manage-agent.md
rename to docs/other/manage-agent.md
index 1f5c0a12..cff9c4ab 100644
--- a/docs/user-guide/manage-agent.md
+++ b/docs/other/manage-agent.md
@@ -1,15 +1,11 @@
# Manage Agent
-import Content from '../reuse-content/_cloud-features.md';
-
TapData Cloud provides visual management and maintenance capabilities for Agents. You can manage installed Agents through the dedicated page or by executing commands.
-
-
## Manage Agent by Web
-1. [Log in to TapData Platform](log-in.md).
+1. Log in to TapData Platform.
2. Click **Resource Management** in the left navigation panel, and then choose which operation to perform.
@@ -22,7 +18,7 @@ import TabItem from '@theme/TabItem';
-
Agent support multi-platform installation, see Install Agent.
+
Agents support multi-platform installation; see Install Agent.
Click Stop to pause the Agent, which can be used for temporary maintenance scenarios, to restart the Agent later, you should run it from the command line.
diff --git a/docs/billing/refund.md b/docs/other/refund.md
similarity index 93%
rename from docs/billing/refund.md
rename to docs/other/refund.md
index fba014e1..55ee587d 100644
--- a/docs/billing/refund.md
+++ b/docs/other/refund.md
@@ -1,14 +1,12 @@
# Unsubscribe Instance
-import Content from '../reuse-content/_cloud-features.md';
-
If you no longer need to use the Agent instance, you can follow the process to unsubscribe from the instance in this article while ensuring that its associated tasks do not affect your business.
## Procedure
-1. Log in to [TapData Cloud](https://cloud.tapdata.io/).
+1. Log in to TapData Cloud.
2. In the left navigation panel, click **Resource Management**.
diff --git a/docs/platform-ops/README.md b/docs/platform-ops/README.md
new file mode 100644
index 00000000..7c6fea55
--- /dev/null
+++ b/docs/platform-ops/README.md
@@ -0,0 +1,7 @@
+# Platform Operations
+
+
+
+import DocCardList from '@theme/DocCardList';
+
+
\ No newline at end of file
diff --git a/docs/administration/emergency-plan.md b/docs/platform-ops/emergency-plan.md
similarity index 99%
rename from docs/administration/emergency-plan.md
rename to docs/platform-ops/emergency-plan.md
index 35cbbe7e..6cdad01f 100644
--- a/docs/administration/emergency-plan.md
+++ b/docs/platform-ops/emergency-plan.md
@@ -1,9 +1,5 @@
# Emergency Plans
-import Content from '../reuse-content/_enterprise-features.md';
-
-
-
This document provides a comprehensive emergency handling process and contingency strategies for TapData products, aiming to help you respond quickly and effectively in the event of an emergency or product issue, thereby mitigating the impact of failures and enhancing the overall stability and security of the product.
:::tip
diff --git a/docs/administration/operation.md b/docs/platform-ops/operation.md
similarity index 97%
rename from docs/administration/operation.md
rename to docs/platform-ops/operation.md
index 8cbd5830..3590e487 100644
--- a/docs/administration/operation.md
+++ b/docs/platform-ops/operation.md
@@ -1,9 +1,5 @@
# Maintenance
-import Content from '../reuse-content/_enterprise-features.md';
-
-
-
This article lists common issues related to TapData maintenance.
## How to Start or Stop Services?
@@ -366,10 +362,10 @@ By closely reviewing system high-risk operations, timely identification and resp
Common high-risk operations include:
-* [Connection Management](../prerequisites/README.md)
+* [Connection Management](../connectors/README.md)
* **Deleting data source connections**: To avoid accidental deletion, when performing a deletion operation, a prompt will appear if the connection is referenced by a task.
* **Editing data source connections**: If the parameters of the data source are set incorrectly, it may cause the connection to fail. Tasks referencing this data source will use the previous parameters and will not be affected, but new tasks or tasks reset afterwards may trigger errors.
-* [Data Replication](../user-guide/copy-data/create-task.md)/[Data Transformation](../user-guide/data-development/create-task.md) Tasks
+* [Data Replication](../data-replication/create-task.md)/[Data Transformation](../data-transformation/create-views/README.md) Tasks
* **Resetting tasks**: This operation will reset the task to its initial state, clearing historical monitoring data. Subsequent task starts will require re-executing full data synchronization.
* **Data duplication processing strategy**: In the target node settings, setting different data duplication strategies will affect the structure and data of the target table. For example, selecting **Clear existing target table structure and data** will clear the target table's structure and all data upon task start, synchronizing new table structures and data from the source.
* **Setting data write strategy**: In the advanced settings of the target node, if append write is selected, TapData will only process insert events, discarding update and delete events. Choose carefully based on business needs to avoid the risk of data inconsistency.
@@ -378,7 +374,7 @@ Common high-risk operations include:
a data replication task is used for scenarios that only synchronize incremental data, i.e., retaining target table data, if the target table's data scale is large, the synchronization index operation may affect the overall performance of the target database.
* **Setting update condition fields**: If there is no index on the target, an index will be created based on the update condition fields.
* **Task Agent settings**: In the task settings in the upper right corner, if an Agent is manually specified, this configuration item will remain unchanged when the task is copied, which may cause excessive pressure on a single Agent. It is recommended to set it to **Automatically assigned by the platform**.
-* [Data Services](../user-guide/data-service/README.md)
+* [Data Services](../publish-apis/README.md)
* Deleting or taking an API offline will render it unavailable.
-* [System Management](../user-guide/manage-system/README.md)
- * When [managing a cluster](../user-guide/manage-system/manage-cluster.md), only perform close or restart operations on related services when they are experiencing anomalies.
\ No newline at end of file
+* [System Management](../system-admin/other-settings/system-settings.md)
+ * When [managing a cluster](../system-admin/manage-cluster.md), only perform close or restart operations on related services when they are experiencing anomalies.
\ No newline at end of file
diff --git a/docs/administration/production-deploy/README.md b/docs/platform-ops/production-deploy/README.md
similarity index 55%
rename from docs/administration/production-deploy/README.md
rename to docs/platform-ops/production-deploy/README.md
index 1babcb9e..85564730 100644
--- a/docs/administration/production-deploy/README.md
+++ b/docs/platform-ops/production-deploy/README.md
@@ -1,9 +1,5 @@
# Production Deployment & Maintenance
-import Content from '../../reuse-content/_enterprise-features.md';
-
-
-
import DocCardList from '@theme/DocCardList';
\ No newline at end of file
diff --git a/docs/platform-ops/production-deploy/capacity-planning.md b/docs/platform-ops/production-deploy/capacity-planning.md
new file mode 100644
index 00000000..87eaaee6
--- /dev/null
+++ b/docs/platform-ops/production-deploy/capacity-planning.md
@@ -0,0 +1,78 @@
+# Capacity Planning
+
+This document provides a comprehensive capacity planning reference to help users effectively allocate resources based on specific requirements in their environment. The actual system requirements may vary due to workload characteristics, network conditions, server specifications, and other factors. Therefore, we recommend conducting performance tests in the specific environment to obtain more accurate configuration data.
+
+
+## Terminology
+
+* **Data Pipeline**: A data pipeline can replicate one or multiple tables from a source database to a target database. During the synchronization process, data can also be [transformed and processed](../../data-transformation/process-node.md) (e.g., data filtering) to ensure that the target database receives accurate and optimized data.
+* **RPS (Records Per Second)**: A metric that measures data transfer speed and system processing capability, reflecting the number of records the system processes per second.
+
+## Pipeline Resource Requirements
+
+* **Memory Requirements**:
+`(read_batch * 8 + 10240 + write_batch * (2 + threads)) * (10 * row_size + 5 KB) + log buffer ≈ 1 GB per 1 KB of row size`
+
+ :::tip
+ The read and write batch sizes can be adjusted in the [data pipeline configuration](../../data-replication/create-task.md) through the basic parameters of the source and target nodes.
+ :::
+
+* **CPU Requirements**: The requirements for computing resources vary under different business load scenarios, as a general reference:
+ - **Total Threads**: `Server Cores * 2`
+ - **Average Threads Required per Data Pipeline**: 1 ~ 8
+ - **CPU Cores Required per Data Pipeline**: 0.5 ~ 4
+
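As a rough sanity check, the memory formula above can be turned into a small calculator. This is an illustrative sketch only: the function name and the 64 MB log-buffer figure are assumptions for the example, not TapData configuration parameters.

```python
# Illustrative sketch of the memory formula above; the function name and the
# 64 MB log-buffer value are assumptions for this example, not TapData settings.
def pipeline_memory_bytes(read_batch, write_batch, threads, row_size_bytes,
                          log_buffer_bytes=64 * 1024 * 1024):
    KB = 1024
    per_record_bytes = 10 * row_size_bytes + 5 * KB                 # 10 * row_size + 5 KB
    record_slots = read_batch * 8 + 10240 + write_batch * (2 + threads)
    return record_slots * per_record_bytes + log_buffer_bytes

# Example: 5,000-record read batches, 2,000-record write batches, 8 threads, 1 KB rows
est = pipeline_memory_bytes(5000, 2000, 8, 1024)
print(f"{est / 1024**3:.2f} GiB")  # on the order of 1 GiB for a 1 KB row size
```

Plugging in your own batch sizes and row width gives a starting point, which should still be validated with a performance test in your environment.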
+## Quick Reference Table
+
+| Category | Business Load | CPU Cores Required | Memory Required | Number of Pipelines per 16-core Server |
+| -------- | ------------- | ------------------ | --------------- | -------------------------------------- |
+| Full Synchronization | Large Data Volume (Table Data > 1 TB) | 4 | 1 GB per 1 KB row size | 8 |
+| Full Synchronization | Medium/Small Data Volume (Table Data < 1 TB) | 2 | 1 GB per 1 KB row size | 16 |
+| Incremental Replication | High Throughput (RPS > 10,000) | 2 | 1 GB per 1 KB row size | 8 |
+| Incremental Replication | Medium Throughput (1,000 ~ 9,999 RPS) | 1 | 1 GB per 1 KB row size | 16 |
+| Incremental Replication | Low Throughput (RPS < 1,000) | 0.5 | 1 GB per 1 KB row size | 32 |
+
+## High Availability Configuration Recommendations
+
+In [High Availability (HA) deployment](install-tapdata-ha.md) scenarios, at least two TapData instances are typically deployed to ensure failover and business continuity. During failover, all pipelines from one instance will automatically transfer to the other instance. In this case, the remaining instance will bear additional load. To avoid excessive load, it is recommended to configure the number of pipelines at 50% ~ 75% of the server capacity to maintain the necessary performance buffer.
+
+For example, if a 16-core server is configured to run 16 pipelines, in an HA setup, it is advisable to run only 8 ~ 12 pipelines to ensure system stability and high availability.
+
+## Performance Monitoring and Adjustment
+
+* [Real-Time Task Monitoring](../../data-replication/monitor-task.md): Observe task operation details, such as synchronization rate and latency during full/incremental phases, through the task monitoring page.
+* [Cluster Metrics Monitoring](../../system-admin/manage-cluster.md): Monitor the operating status of all components within the cluster and the number of external connections through the cluster management page. Use third-party performance monitoring tools to track CPU, memory, network, and other resource usage of the cluster.
+
+Based on the above monitoring data, dynamically adjust pipeline configuration and resource allocation to ensure the system remains stable and efficient under high load conditions.
\ No newline at end of file
diff --git a/docs/administration/production-deploy/install-replica-mongodb.md b/docs/platform-ops/production-deploy/install-replica-mongodb.md
similarity index 98%
rename from docs/administration/production-deploy/install-replica-mongodb.md
rename to docs/platform-ops/production-deploy/install-replica-mongodb.md
index 4d339615..9d9a7510 100644
--- a/docs/administration/production-deploy/install-replica-mongodb.md
+++ b/docs/platform-ops/production-deploy/install-replica-mongodb.md
@@ -1,9 +1,5 @@
# Deploy MongoDB Replica Set
-import Content from '../../reuse-content/_enterprise-and-community-features.md';
-
-
-
To ensure high availability in production environments, deploying a MongoDB replica set is required before deploying TapData, as it stores essential configurations, shared cache, and other information in MongoDB databases. This document outlines the deployment process.
## Deployment Architecture
diff --git a/docs/administration/production-deploy/install-tapdata-ha-with-3-node.md b/docs/platform-ops/production-deploy/install-tapdata-ha-with-3-node.md
similarity index 98%
rename from docs/administration/production-deploy/install-tapdata-ha-with-3-node.md
rename to docs/platform-ops/production-deploy/install-tapdata-ha-with-3-node.md
index 5b0db42e..63fd2265 100644
--- a/docs/administration/production-deploy/install-tapdata-ha-with-3-node.md
+++ b/docs/platform-ops/production-deploy/install-tapdata-ha-with-3-node.md
@@ -1,11 +1,12 @@
# Deploy HA TapData Enterprise (3 Nodes)
-import Content from '../../reuse-content/_enterprise-features.md';
-
-
To ensure the reliability of business operations in a production environment, a high availability (HA) deployment is recommended. This guide explains how to deploy TapData services using three servers to achieve high availability.
+## Preparation
+
+Before deployment, review [Capacity Planning](capacity-planning.md) to size your servers for the expected number of pipelines and workload.
+
## Deployment Architecture
In this example, we have three servers (as illustrated in the architecture below), each configured with an IP address or hostname. We will deploy **MongoDB services** (to store information required for TapData operations) and complete **TapData services** (including management services, data synchronization governance services, and API services) on each of these servers to achieve overall service high availability.
@@ -190,4 +191,4 @@ Upon successful login, you can view the status of the TapData services on all th
## Next Steps
-[Connect to a Database](../../quick-start/connect-database.md)
\ No newline at end of file
+[Connect to a Database](../../getting-started/connect-data-source.md)
\ No newline at end of file
diff --git a/docs/administration/production-deploy/install-tapdata-ha.md b/docs/platform-ops/production-deploy/install-tapdata-ha.md
similarity index 98%
rename from docs/administration/production-deploy/install-tapdata-ha.md
rename to docs/platform-ops/production-deploy/install-tapdata-ha.md
index 19dab69e..520143df 100644
--- a/docs/administration/production-deploy/install-tapdata-ha.md
+++ b/docs/platform-ops/production-deploy/install-tapdata-ha.md
@@ -1,10 +1,10 @@
# Deploy HA TapData Enterprise (2 Nodes)
-import Content from '../../reuse-content/_enterprise-features.md';
+To ensure reliability in production environments, we recommend deploying TapData in a high-availability setup. This guide will show you how to quickly deploy a high-availability TapData service on a local Linux platform.
-
+## Preparation
-To ensure reliability in production environments, we recommend deploying TapData in a high-availability setup. This guide will show you how to quickly deploy a high-availability TapData service on a local Linux platform.
+Before deployment, review [Capacity Planning](capacity-planning.md) to size your servers for the expected number of pipelines and workload.
## Software & Hardware Requirements
@@ -298,4 +298,4 @@ Before deployment, we need to perform the following operations on both servers.
## Next Steps
-[Connect to a Database](../../quick-start/connect-database.md)
\ No newline at end of file
+[Connect to a Database](../../getting-started/connect-data-source.md)
\ No newline at end of file
diff --git a/docs/administration/troubleshooting/README.md b/docs/platform-ops/troubleshooting/README.md
similarity index 71%
rename from docs/administration/troubleshooting/README.md
rename to docs/platform-ops/troubleshooting/README.md
index 302a4d1f..6439597b 100644
--- a/docs/administration/troubleshooting/README.md
+++ b/docs/platform-ops/troubleshooting/README.md
@@ -1,8 +1,6 @@
# Troubleshooting
-import Content from '../../reuse-content/_all-features.md';
-
This section offers guidance and reference materials for troubleshooting issues with TapData.
diff --git a/docs/administration/troubleshooting/error-and-solutions.md b/docs/platform-ops/troubleshooting/error-and-solutions.md
similarity index 90%
rename from docs/administration/troubleshooting/error-and-solutions.md
rename to docs/platform-ops/troubleshooting/error-and-solutions.md
index 56c8e71e..7d7b722a 100644
--- a/docs/administration/troubleshooting/error-and-solutions.md
+++ b/docs/platform-ops/troubleshooting/error-and-solutions.md
@@ -1,14 +1,12 @@
# Task Log Error and Troubleshooting Guide
-import Content from '../../reuse-content/_all-features.md';
-
This document aims to provide you with a detailed guide for identifying and resolving common errors found in the logs of data synchronization tasks. We delve into the causes of various common errors and offer clear, practical troubleshooting steps to help users quickly locate and solve issues.
## Viewing Task Logs
-Task runtime logs can be viewed at the bottom of the [task monitoring page](../../user-guide/copy-data/monitor-task.md#error-code). For common issues, TapData has solidified them into specific [error codes](error-code.md) for your convenience, along with their causes and solutions. If the relevant error code is not found, you can also troubleshoot based on the log keywords provided in this document or contact technical support.
+Task runtime logs can be viewed at the bottom of the [task monitoring page](../../data-replication/monitor-task.md#error-code). For common issues, TapData has solidified them into specific [error codes](error-code.md) for your convenience, along with their causes and solutions. If the relevant error code is not found, you can also troubleshoot based on the log keywords provided in this document or contact technical support.
## Oracle
@@ -22,7 +20,7 @@ Task runtime logs can be viewed at the bottom of the [task monitoring page](../.
**Scenario**: This error occurs when Oracle is used as the source and incremental data synchronization fails.
-**Solution**: Enable Oracle's archive logging. For details, see [Preparation for Oracle Data Source](../../prerequisites/on-prem-databases/oracle.md).
+**Solution**: Enable Oracle's archive logging. For details, see [Preparation for Oracle Data Source](../../connectors/on-prem-databases/oracle.md).
### ORA-00257
@@ -53,7 +51,7 @@ Task runtime logs can be viewed at the bottom of the [task monitoring page](../.
**Scenario**: Oracle is used as the source and fails to perform full or incremental synchronization.
-**Solution**: Typically, this is a permission assignment issue. For authorization methods, see [Preparation for Oracle Data Source](../../prerequisites/on-prem-databases/oracle.md).
+**Solution**: Typically, this is a permission assignment issue. For authorization methods, see [Preparation for Oracle Data Source](../../connectors/on-prem-databases/oracle.md).
### ORA-01400: cannot insert NULL into...
@@ -143,4 +141,4 @@ Host:9200/_all/_settings -d '{"index.mapping.total_fields.limit": 5000}'
**Scenario**: When syncing from MongoDB to ElasticSearch, after running for a while, the task stops with the above error message in the logs.
-**Solution**: In MongoDB, the same field has NaN and float types, and ElasticSearch cannot complete the data write causing the error. Add a [JS Processing Node](../../user-guide/data-development/process-node.md#js-process) to the task's pipeline with content `MapUtils.removeNullValue(record);`, then restart the task.
\ No newline at end of file
+**Solution**: In MongoDB, the same field has NaN and float types, and ElasticSearch cannot complete the data write causing the error. Add a [JS Processing Node](../../data-transformation/process-node.md#js-process) to the task's pipeline with content `MapUtils.removeNullValue(record);`, then restart the task.
\ No newline at end of file
diff --git a/docs/administration/troubleshooting/error-code.md b/docs/platform-ops/troubleshooting/error-code.md
similarity index 98%
rename from docs/administration/troubleshooting/error-code.md
rename to docs/platform-ops/troubleshooting/error-code.md
index 9086e02c..10ce541f 100644
--- a/docs/administration/troubleshooting/error-code.md
+++ b/docs/platform-ops/troubleshooting/error-code.md
@@ -1,10 +1,8 @@
# Task Error Codes and Solutions
-import Content from '../../reuse-content/_all-features.md';
-
-If you encounter an issue with a task, you can view related log information at the bottom of the task's [monitoring page](../../user-guide/data-development/monitor-task.md). For common issues, TapData has defined specific error codes for easier identification, along with their causes and solutions.
+If you encounter an issue with a task, you can view related log information at the bottom of the task's [monitoring page](../../data-transformation/monitor-view-tasks.md). For common issues, TapData has defined specific error codes for easier identification, along with their causes and solutions.
## Common Processor
diff --git a/docs/prerequisites/README.md b/docs/prerequisites/README.md
deleted file mode 100644
index ee2b4aab..00000000
--- a/docs/prerequisites/README.md
+++ /dev/null
@@ -1,11 +0,0 @@
-# Preparation
-
-import Content from '../reuse-content/_all-features.md';
-
-
-
-TapData supports nearly a hundred [diverse data sources](supported-databases.md), including commercial databases, open-source databases, cloud databases, data warehouses, data lakes, message queues, SaaS platforms, files, and custom data sources.
-
-import DocCardList from '@theme/DocCardList';
-
-
diff --git a/docs/prerequisites/allow-access-network.md b/docs/prerequisites/allow-access-network.md
deleted file mode 100644
index 50e9441f..00000000
--- a/docs/prerequisites/allow-access-network.md
+++ /dev/null
@@ -1,77 +0,0 @@
-# Configure Network Access
-
-import Content from '../reuse-content/_cloud-features.md';
-
-
-
-Before deploying the Agent, you need to refer to the requirements in this document and adjust the relevant firewall to ensure its communication ability. The workflow of the Agent is shown below:
-
-
-
-
-
-| Requirements | Description |
-| ---------------------------------- | ------------------------------------------------------------ |
-| Agent can connect to source database's port. | Ensure that the Agent can read data from the source database. |
-| Agent can connect to target database's port. | Ensure that the Agent can write data to the target database. |
-| Agent can connect to extranet. | Ensure the Agent can report task status, retrieve configuration, and execute tasks to/from TapData Cloud. |
-
-
-
-If you have subscribed to the [Fully Managed Agent](../billing/purchase.md#hosted-mode) and the data source you intend to connect with only allows connections from specific IP addresses, then you will need to add the server address of the Agent to the appropriate security settings of the data source. For instance, you might need to add it to the whitelist rules of your self-hosted database's firewall. This ensures that the Agent can establish communication and transfer data with your data source. The server addresses for Agent in various regions are as follows:
-
-
-| Cloud Provider | Region | Server Address of Agent |
-| -------------- | ------ | ----------------------- |
-| Google Cloud | Hong Kong | 34.92.78.86 |
-| Google Cloud | Sydney | 34.87.244.166 |
-| Google Cloud | Singapore | 35.240.192.89 |
-| Google Cloud | Taiwan | 35.221.187.67 |
-| Google Cloud | Tokyo | 34.146.223.25 |
-| Google Cloud | London | 35.246.16.216 |
-| Google Cloud | Frankfurt | 34.159.220.196 |
-| Google Cloud | N. Virginia | 34.145.229.212 |
-| Google Cloud | Oregon | 34.83.4.199 |
-| Alibaba Cloud | Hong Kong | 47.242.251.110 |
-
diff --git a/docs/prerequisites/crm-and-sales-analytics/metabase.md b/docs/prerequisites/crm-and-sales-analytics/metabase.md
deleted file mode 100644
index 99e2879e..00000000
--- a/docs/prerequisites/crm-and-sales-analytics/metabase.md
+++ /dev/null
@@ -1,19 +0,0 @@
-# Metabase
-
-import Content from '../../reuse-content/_enterprise-and-cloud-features.md';
-
-
-
-This article serves as a comprehensive guide, providing step-by-step instructions on adding Metabase to TapData Cloud, enabling efficient data synchronization and development for your projects.
-
-## Fill in the connection name
-
-The first but not necessarily the first or the last step is to fill in the connection name, because that is the first required field.
-
-## Enter Metabase login account and password
-
-The account must be granted administrator rights.
-
-## Enter Metabase's HTTP address
-
-For example: http://36.134.131.166:1234
\ No newline at end of file
diff --git a/docs/prerequisites/saas-and-api/feishu-bitable.md b/docs/prerequisites/saas-and-api/feishu-bitable.md
deleted file mode 100644
index 3bcb2df8..00000000
--- a/docs/prerequisites/saas-and-api/feishu-bitable.md
+++ /dev/null
@@ -1,14 +0,0 @@
-# Feishu Bitable
-
-import Content from '../../reuse-content/_enterprise-and-cloud-features.md';
-
-
-
-[Feishu Bitable](https://open.feishu.cn/document/server-docs/docs/bitable-v1/bitable-overview) is a new business management tool that helps you restructure your work applications and team collaboration models. It enables efficient online data collaboration, allows you to build personalized applications at will, and easily manage all your business data.
-
-## Parameter Descriptions
-
-- **App ID**: When creating an app in Feishu, you will receive the App ID information upon completion. For details on how to obtain this, see [Bitable Development Overview](https://open.feishu.cn/document/home/app-types-introduction/overview).
-- **App Secret**: When creating an app in Feishu, you will receive the App Secret information upon completion. For details on how to obtain this, see [Bitable Development Overview](https://open.feishu.cn/document/home/app-types-introduction/overview).
-- **App Token**: Each Bitable can be considered an application, and the unique identifier for this application is called app_token. For details on how to obtain this, see [Integration Guide](https://open.feishu.cn/document/server-docs/docs/bitable-v1/notification).
-- **Table ID**: Each Bitable consists of multiple tables, and the unique identifier for each table is called table_id. For details on how to obtain this, see [Integration Guide](https://open.feishu.cn/document/server-docs/docs/bitable-v1/notification).
\ No newline at end of file
diff --git a/docs/user-guide/data-service/README.md b/docs/publish-apis/README.md
similarity index 88%
rename from docs/user-guide/data-service/README.md
rename to docs/publish-apis/README.md
index 82f1ff5f..aab6414a 100644
--- a/docs/user-guide/data-service/README.md
+++ b/docs/publish-apis/README.md
@@ -1,7 +1,5 @@
# Publish Data API
-import Content from '../../reuse-content/_enterprise-features.md';
-
TapData supports publishing table data as APIs, aiding enterprises in building a unified data services platform. Various applications can use these APIs to provide support for services such as push notifications. The recommended sequence of use is as follows:
@@ -10,7 +8,7 @@ TapData supports publishing table data as APIs, aiding enterprises in building a
| [Create an API Application](manage-app.md) | Manage based on the purpose of the API in groups. |
| [Create an API Service](create-api-service.md) | Select the tables to associate, set the API's name, version, access path, permission scope, etc. Once set up, publish it online. |
| [Create an API Client](create-api-client.md) | Set the scope of permissions and authentication methods based on business needs to ensure the security of the API service. |
-| Invoke API Service | Supports [RESTful](query-via-restful.md) and [GraphQL](query-via-graphql.md) access methods. |
+| Invoke API Service | Supports [RESTful](query/query-via-restful.md) and [GraphQL](query/query-via-graphql.md) access methods. |
| [Audit](audit-api.md) and [Monitor](monitor-api-request.md) | Audit and monitor API usage to meet compliance and security requirements. |
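The RESTful access method in the table above boils down to a token-bearing GET request. The sketch below only builds the request URL; the host, path, and `access_token` are placeholders for values from your own deployment, not real TapData endpoints.

```python
import json
import urllib.request
from urllib.parse import urlencode

# Hypothetical RESTful query against a published API; the host, path, and
# access_token are placeholders, not real TapData endpoints.
base = "http://tapdata.example.com:3080/api/v1/customers"
params = {"access_token": "<your_token>", "limit": 10, "skip": 0}
url = f"{base}?{urlencode(params)}"
print(url)

# Sending the request (left commented out because the endpoint is a placeholder):
# with urllib.request.urlopen(url) as resp:
#     rows = json.load(resp)
```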
import DocCardList from '@theme/DocCardList';
diff --git a/docs/publish-apis/api-design-considerations.md b/docs/publish-apis/api-design-considerations.md
new file mode 100644
index 00000000..eaf1a838
--- /dev/null
+++ b/docs/publish-apis/api-design-considerations.md
@@ -0,0 +1,98 @@
+# API Design Considerations
+
+Designing APIs isn’t just about making data accessible—it’s about ensuring stability, flexibility, and future-proof integrations for your business. Here are the pillars for robust API design in modern SaaS environments.
+
+## 1. Design for Change: Embrace API Versioning
+
+Change is inevitable. A great API strategy acknowledges that both the data and business requirements will evolve. Versioning is your insurance policy:
+
+- **Version every public API** (e.g., `/v1/orders`), even if you only have one client at the start. This future-proofs your integration and makes upgrades predictable.
+- With explicit versioning, you can launch new features, fix bugs, or refactor without breaking live applications.
+- Multiple versions running in parallel let client teams migrate on their own schedule and safely roll back if issues arise.
+
+:::tip
+
+**Best Practice:** Only introduce a new major version for breaking changes. Additive, backward-compatible enhancements can stay in the same version. For more information, see [Manage API Versions](manage-api-versions.md).
+
+:::
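The parallel-versions idea can be sketched as a routing table: both paths stay live, and only `/v2` consumers see the new field. The handlers and router below are illustrative, not TapData internals.

```python
# Illustrative sketch of serving /v1 and /v2 side by side; the handlers and
# router are hypothetical, not part of TapData.
def orders_v1(_req=None):
    return {"orders": [], "total": 0}

def orders_v2(_req=None):
    # The new field lives in v2 only; v1 clients keep getting exactly
    # the shape they integrated against.
    return {"orders": [], "total": 0, "page": 1}

ROUTES = {
    "/v1/orders": orders_v1,
    "/v2/orders": orders_v2,
}

def dispatch(path, req=None):
    return ROUTES[path](req)

print(dispatch("/v1/orders"))  # {'orders': [], 'total': 0}
```

Because `/v1/orders` is untouched when `/v2/orders` ships, client teams migrate on their own schedule and can roll back by simply calling the old path.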
+
+## 2. Deliver Upgrades Without Disruption
+
+Zero-downtime is the standard for today’s APIs. Achieve this by:
+
+- **Publishing new versions** without taking old ones offline—run them side by side.
+- Rolling out client updates gradually (blue-green, canary, or rolling deployments).
+- Always upgrade the server (API) before the client apps.
+- Monitoring real usage—only retire old versions once you’re certain all clients have moved.
+
+**Why?** This approach lets you experiment, validate, and recover quickly—no more coordinated late-night upgrades or risky one-shot deployments.
+
+
+
+## 3. Speak a Common Language: Consistent Naming & Structure
+
+Naming isn’t just cosmetic—it creates consistency across APIs and teams, making it easier to understand, maintain, and scale.
+
+- Use plural nouns for endpoints that return collections (e.g., `/v1/customers`, `/v1/product_metrics`)
+
+- **Reflect real business entities** in endpoint names—avoid technical or implementation-based terms like `List`, `Collection`, or `Document`
+
+ e.g. `/v1/customer_list` ❌ `/v1/customers` ✅
+
+- Use consistent naming patterns across all APIs. TapData follows `snake_case` for both endpoint paths and response fields to align with common data conventions
+
+- Ensure term consistency—if you use `customer_id` in one API, don’t rename it `client_id` in another unless they truly represent different things
+
+
+
+## 4. Minimize Surprises: Protect Integrations from Breaking Changes
+
+Don’t break your users’ apps unexpectedly. Guard against disruptions by:
+
+- Never removing or renaming fields, changing their types, or altering default outputs in a live version
+- Adding new parameters only as optional, with safe defaults, so old clients remain unaffected
+- Using explicit filter and fields parameters so consumers control what they get—and aren’t surprised by new fields
+
+If a breaking change is truly required, launch it as a new major version and follow your rollout plan.
+
+
+
+## 5. API Flexibility: Serve Many Use Cases, Without Sprawl
+
+Your data models will support many user journeys. Design your APIs to reflect this:
+
+- Separate **detail** and **list** endpoints, with appropriate default fields for each
+- Allow field selection, so clients can get only what they need
+- Create specialized endpoints for historical records or nested/embedded data if required—but avoid unnecessary duplication
+
+:::tip
+
+Well-designed APIs let clients flexibly query what they need, but keep a sensible, opinionated default for each use case.
+
+:::
+
+
+
+## 6. Build Resilient APIs: Rate Limiting & Access Control
+
+A good API is not just usable, but also **robust**:
+
+- Apply rate limits to every endpoint based on business impact, not just raw traffic
+- Ensure that resource spikes or abusive clients on one API can’t degrade the experience for everyone else
+- Use [role-based access](../system-admin/manage-role.md) and [client-level](create-api-client.md) permissions for sensitive or premium data
+
+
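One common way to implement the per-endpoint rate limits described above is a token bucket. The sketch below is illustrative; the rate and capacity values are assumptions for the example, not TapData defaults.

```python
import time

# Minimal token-bucket sketch of per-endpoint rate limiting; the rate and
# capacity figures are illustrative, not TapData defaults.
class TokenBucket:
    def __init__(self, rate, capacity):
        self.rate = rate              # tokens refilled per second
        self.capacity = capacity      # maximum burst size
        self.tokens = capacity
        self.last = time.monotonic()

    def allow(self):
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

bucket = TokenBucket(rate=5, capacity=10)        # ~5 req/s, bursts of 10
results = [bucket.allow() for _ in range(12)]    # 12 back-to-back requests
print(results.count(True))
```

Sizing `rate` and `capacity` per endpoint (by business impact, not just raw traffic) keeps one noisy client from degrading the experience for everyone else.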
+
+## Design Checklist
+
+| Principle | Why It Matters |
+| --------------------------- | ----------------------------------------------------- |
+| Explicit versioning | Enables safe, staged upgrades & independent rollbacks |
+| Zero-downtime deployment | Keeps systems online and users happy |
+| Consistent naming/structure | Makes APIs discoverable and reduces onboarding time |
+| Backward compatibility | Prevents outages, protects integrations |
+| Flexibility for use cases | Delivers value to many teams, reduces API sprawl |
+| Rate limiting & security | Keeps the platform stable, safe, and scalable |
+
+**Bottom line:**
+Great API design is as much about the developer and business experience as it is about the code. Prioritize clarity, stability, and evolvability. When in doubt—design for the next team (or your future self) who will need to use and extend your APIs.
\ No newline at end of file
diff --git a/docs/user-guide/data-service/audit-api.md b/docs/publish-apis/audit-api.md
similarity index 74%
rename from docs/user-guide/data-service/audit-api.md
rename to docs/publish-apis/audit-api.md
index 1610672b..91e3522f 100644
--- a/docs/user-guide/data-service/audit-api.md
+++ b/docs/publish-apis/audit-api.md
@@ -1,17 +1,15 @@
-# Service Auditing
-import Content from '../../reuse-content/_enterprise-features.md';
+# Check API Audit Logs
-
Service auditing is primarily used to view the access records of APIs. You can view the records of each request, including access type, personnel, IP address, access time, access result, failure reason, etc. You can also filter according to different conditions.
-[Log in to TapData Platform](../log-in.md) and select **Data Services** > **Service Audit** on the left side of the page to view.
+Log in to TapData Platform and select **Data Services** > **Service Audit** on the left side of the page to view the access records.
-
+
Click **Details** to view detailed information about the corresponding request, as follows:
-
+
- **Log Details**: Includes basic information and various metrics of the access, such as API ID, name, IP address of the visitor, etc.
- **Number of Access Records**: The total number of records for this access (entries).
diff --git a/docs/publish-apis/create-api-client.md b/docs/publish-apis/create-api-client.md
new file mode 100644
index 00000000..35f20f13
--- /dev/null
+++ b/docs/publish-apis/create-api-client.md
@@ -0,0 +1,32 @@
+# Create a Client
+
+
+Calling published APIs requires an API client. Any application that needs to call the API interfaces, whether built by your own developers or by a third party (collectively referred to as client applications), must first register with the data publishing system. Upon registration, you will receive a unique client ID (client_id) and client secret (client_secret).
+
+## Procedure
+
+1. Log in to TapData Platform.
+
+2. In the left navigation bar, select **Data Services** > **API Clients**.
+
+3. Click **Create a Client** in the top right corner, fill in the relevant information, and click **OK**.
+
+ 
+
+ - **Client name**: A meaningful name to identify the client. For example, `Client_for_BI`.
+
+   - **Grant Type**: The OAuth2 grant types supported by this client. You can select one or more of **Implicit**, **Client Credentials**, and **Refresh Token**, depending on how the client will authenticate.
+
+ - **Client Secret**: Auto-generated credential used by the client to authenticate. Click **Generate** to create one.
+
+ :::tip
+
+   The client secret is the credential that client applications use to obtain API access authorization. Store it securely and avoid transmitting it over public networks.
+
+ :::
+
+   - **Permission scope**: Assign the [role(s)](../system-admin/manage-role.md) this client should inherit (e.g., `DefaultRoleForNewUser`, `admin`). This determines which APIs and resources the client can access.
+
+   - **Redirect URI**: The URI the system redirects to after a successful authorization. Typically used in OAuth2 flows.
+
+   - **Show to the menu**: Choose **Yes** to display this client in the app’s client list; choose **No** to hide it from the default view.
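Once registered, a client application exchanges its credentials for an access token (see [API Authentication](query/api-auth.md)). As a minimal sketch, with placeholder credentials and a hypothetical token endpoint path (use the actual endpoint from your API server), a `client_credentials` token request can be composed like this:

```python
from urllib.parse import urlencode

# Placeholder values; use the client_id/client_secret issued at registration.
CLIENT_ID = "my-client-id"
CLIENT_SECRET = "my-client-secret"

def build_token_request(base_url: str) -> tuple[str, str]:
    """Compose the URL and form body of an OAuth2 client_credentials token request.

    The /oauth/token path is a hypothetical example; substitute your
    server's actual token endpoint.
    """
    body = urlencode({
        "grant_type": "client_credentials",
        "client_id": CLIENT_ID,
        "client_secret": CLIENT_SECRET,
    })
    return f"{base_url}/oauth/token", body

url, body = build_token_request("https://tapdata.example.com")
print(url)
print(body)
```

The returned body is sent as an `application/x-www-form-urlencoded` POST; the response contains the access token used in later API calls.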
diff --git a/docs/user-guide/data-service/create-api-server.md b/docs/publish-apis/create-api-server.md
similarity index 79%
rename from docs/user-guide/data-service/create-api-server.md
rename to docs/publish-apis/create-api-server.md
index 39259376..6de6b39f 100644
--- a/docs/user-guide/data-service/create-api-server.md
+++ b/docs/publish-apis/create-api-server.md
@@ -1,13 +1,11 @@
# Create a Server
-import Content from '../../reuse-content/_enterprise-features.md';
-
API servers can be configured to expose API server addresses externally, and multiple servers can also be added.
## Procedure
-1. [Log in to TapData Platform](../log-in.md).
+1. Log in to TapData Platform.
2. In the left navigation bar, select **Data Services** > **API Servers**.
@@ -15,7 +13,7 @@ API servers can be configured to expose API server addresses externally, and mul
4. In the pop-up dialog box, enter the server name and access address, and then click **OK**.
- 
+ 
:::tip
diff --git a/docs/publish-apis/create-api-service.md b/docs/publish-apis/create-api-service.md
new file mode 100644
index 00000000..fc097e1a
--- /dev/null
+++ b/docs/publish-apis/create-api-service.md
@@ -0,0 +1,51 @@
+# Create Data API
+
+
+To help developers easily integrate with APIs and conveniently view the API information published through TapData, we offer a data services feature.
+
+## Supported Data Sources
+
+Currently, the following data sources can be published as APIs: Doris, MongoDB, MySQL, Oracle, PostgreSQL, SQL Server, and TiDB.
+
+## Procedure
+
+1. Log in to TapData Platform.
+
+2. In the left navigation bar, choose **Data Services** > **API List**.
+
+3. Click **Create API** at the top right of the page, then complete the settings on the right panel according to the instructions below.
+
+ 
+
+ * **Key Configuration Fields**
+ * **Service Name**: Give your API a meaningful name for easier identification and management.
+ * **Owner Application**: Select the business application this API belongs to. This helps categorize your APIs clearly. See [Application Management](manage-app.md) for more details.
+ * **Connection Type**, **Connection Name**, **Object Name**: Choose the data source and object (e.g. a view like `orders-wide-view`) that the API will query.
+ - **Interface Type**: TapData provides two modes for querying data via APIs:
+ - **Default Query**: A general-purpose mode with built-in pagination and filtering, suitable for client-driven access.
+ - **Custom Query**: A structured mode that enables domain-specific APIs with full control over query logic, sorting, and inputs.
+ - **API Path Settings**: Your API path follows the format `/api/{version}/{prefix}/{base_path}`.
+ - `version` and `prefix` are optional and can be used for versioning or business labeling (e.g., `/api/v1/orders/summary`).
+ - `base_path` is required and uniquely identifies the endpoint. It is auto-generated if left blank.
+ - **Input Parameters**: Define the parameters clients can pass when calling this API.
+ - For **Default Query**, the platform automatically includes three built-in parameters: `page`, `limit`, and `filter`. This allows dynamic pagination and filtering by the client; custom parameters are **not** supported.
+ - For **Custom Query**, you can define your own parameters (such as `region`, `startDate`, or `userLevel`), and map them to specific filter or sort conditions in the UI. In this mode, all filtering is managed server-side; the `filter` parameter is not included unless you explicitly add it. For supported types and configuration rules, see [API Query Parameters](query/api-query-params.md).
+ - **Output Results**: By default, all fields from the selected object are returned. You can manually adjust the list to return only selected fields.
+
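   The path rules above can be sketched as a small helper (illustrative only; the example paths are hypothetical):

```python
def build_api_path(base_path: str, version: str = "", prefix: str = "") -> str:
    """Compose an endpoint per the /api/{version}/{prefix}/{base_path} format.

    version and prefix are optional; base_path is required and must be unique.
    """
    if not base_path:
        raise ValueError("base_path is required")
    segments = [seg for seg in (version, prefix, base_path) if seg]
    return "/api/" + "/".join(segments)

print(build_api_path("summary", version="v1", prefix="orders"))  # /api/v1/orders/summary
print(build_api_path("customer"))                                # /api/customer
```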
+4. Click **Save** at the top right of the page.
+
+5. Find the service you just created and click **Publish** on its right to use the related service.
+
+6. (Optional) Click the service you just created, select the **Debug** tab in the right panel, enter request parameters, and click **Submit** to verify service availability.
+
+ 
+
+7. (Optional) For the data services you have created, you can select and export them for backup or sharing with other team members. You can also import data services.
+
+ 
+
+   Additionally, for published data services, you can select them and click **API Document Export** to quickly build API usage documentation for your team. The exported file is a Word document (docx format) that includes the data service name, API description, and GET/POST parameter descriptions.
+
+## See also
+
+[Managing API Versions](manage-api-versions.md)
diff --git a/docs/publish-apis/manage-api-versions.md b/docs/publish-apis/manage-api-versions.md
new file mode 100644
index 00000000..7a8424aa
--- /dev/null
+++ b/docs/publish-apis/manage-api-versions.md
@@ -0,0 +1,85 @@
+# Manage API Versions
+
+Managing API versions is essential when multiple applications rely on your APIs. Proper versioning ensures stability, prevents downtime, and facilitates smooth updates across your ecosystem.
+
+This guide addresses three key questions:
+
+- When should you increment your API version?
+- How can you support multiple API versions concurrently?
+- When should you retire older API versions?
+
+## Why API Versioning Matters
+
+API versioning helps you:
+
+- Safely introduce changes without breaking existing integrations.
+- Support rollback of client applications independently.
+- Enable gradual deployments and zero-downtime updates (such as blue-green deployments).
+
+**Example:**
+If your original API endpoint is `/e_commerce_orders`, by assigning version `v1`, the endpoint becomes `/v1/e_commerce_orders`. Later, introducing a new version (`v2`) allows simultaneous access to both `/v1/e_commerce_orders` and `/v2/e_commerce_orders`. Applications can upgrade independently and roll back safely if necessary.
+
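Conceptually, running two versions side by side is just a routing table with both endpoints live. A sketch (not TapData internals; the handler payloads are made up):

```python
# Both versions stay published while client apps migrate at their own pace.
routes = {
    "/v1/e_commerce_orders": lambda: {"schema": "v1", "user_id": 1001},
    "/v2/e_commerce_orders": lambda: {"schema": "v2", "customer_id": 1001},
}

def resolve(path: str) -> dict:
    """Dispatch a request path to whichever version the client asked for."""
    if path not in routes:
        raise KeyError(f"404: {path} is not published")
    return routes[path]()

print(resolve("/v1/e_commerce_orders"))  # old clients keep working
print(resolve("/v2/e_commerce_orders"))  # upgraded clients get the new shape
```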
+
+
+:::tip
+
+You can create versioned APIs by [dragging a target table](../operational-data-hub/adm-layer/integrate-apis.md) from the FDM or MDM Layer into the API builder, or by opening the Data Service section and clicking [Create API](create-api-service.md). From there, you’ll define the endpoint path and assign a version to it.
+
+:::
+
+
+
+## When to Increment API Versions
+
+Not every change to an API warrants a version bump. You should only increment the version when the change may **break existing clients** or alter the expected behavior of existing integrations.
+
+The table below helps clarify which types of changes require versioning and why:
+
+| Change Type | Example | Version Increment Required | Why? |
+| ------------------------------------------- | -------------------------------------------------- | -------------------------- | ------------------------------------------------------ |
+| Add new optional parameters (with defaults) | New query param like `region=us` | ❌ No | Existing clients continue working without changes |
+| Add new fields (returned only on request) | New field shown only if `fields` param is used | ❌ No | Output remains identical unless new field is requested |
+| Rename existing fields | Renaming `user_id` to `customer_id` | ✅ Yes | Breaks client-side parsing and integrations |
+| Remove fields from response | Dropping `email` from response payload | ✅ Yes | Clients expecting the field will fail or misbehave |
+| Add fields to default response | Automatically including `status` in default output | ✅ Yes | Changes parsing logic and response size unexpectedly |
+| Change parameter behavior or defaults | Default filter changes from "all" to "active only" | ✅ Yes | Alters business logic and returned data |
+| Remove or modify existing parameters | Removing required parameter `type` | ✅ Yes | Clients relying on it will break or get errors |
+
+As a best practice, treat **breaking changes** as major version increments (e.g., `/v1/resource` → `/v2/resource`). This protects existing consumers and allows parallel upgrades without disruption.
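The default-response rules from the table can be encoded as a quick pre-release check. A sketch with hypothetical field names:

```python
def needs_version_bump(old_default: set, new_default: set) -> bool:
    """Per the table above, any change to the fields returned by default
    (a removal, a rename, or a new field included automatically) is breaking.
    Fields returned only on explicit request never touch the default set."""
    return old_default != new_default

v1_fields = {"order_id", "user_id", "amount"}

assert needs_version_bump(v1_fields, {"order_id", "customer_id", "amount"})  # rename
assert needs_version_bump(v1_fields, {"order_id", "amount"})                 # removal
assert needs_version_bump(v1_fields, v1_fields | {"status"})                 # added by default
assert not needs_version_bump(v1_fields, set(v1_fields))                     # default unchanged
```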
+
+
+
+
+
+## Manage Version Rollouts & Retirement
+
+Supporting multiple API versions enables safe, staged upgrades without disrupting existing applications.
+
+### Roll Out New Versions Safely
+
+To introduce a new API version:
+
+1. [Create and publish](create-api-service.md) the new version (e.g., v2) alongside the existing version (e.g., v1).
+2. Update and test client apps to consume the new version.
+3. Roll out app changes gradually, allowing fallbacks if needed.
+4. [Monitor usage logs](audit-api.md) during a burn-in period (typically several weeks).
+5. Identify any apps still using v1 and complete their migration.
+
+### Retire Old Versions Responsibly
+
+Once all clients have moved to the new version:
+
+1. Verify v1 is no longer receiving traffic via logs (Client ID, IP, etc.).
+2. Unpublish the old API version, making it inactive but not deleted.
+3. Archive the old definition for audit or rollback purposes. You can [export and store](create-api-service.md#release330-export-api) it safely in case future recovery is needed.
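Step 1 can be automated against exported audit records. A sketch with a simplified log shape (real audit entries carry more fields, such as IP address and timestamps):

```python
from collections import Counter

def clients_still_on(audit_log: list, version_prefix: str) -> Counter:
    """Tally requests per client ID for endpoints under the given version."""
    hits = Counter()
    for entry in audit_log:
        if entry["path"].startswith(version_prefix):
            hits[entry["client_id"]] += 1
    return hits

log = [
    {"path": "/v2/e_commerce_orders", "client_id": "bi-dashboard"},
    {"path": "/v1/e_commerce_orders", "client_id": "legacy-erp"},
    {"path": "/v2/e_commerce_orders", "client_id": "legacy-erp"},
]

stragglers = clients_still_on(log, "/v1/")
print(stragglers)  # legacy-erp still calls v1, so don't unpublish yet
```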
+
+
+
+## Summary of Best Practices
+
+- Increment versions only for breaking changes.
+- Support concurrent API versions to facilitate gradual client upgrades and independent rollbacks.
+- Monitor and retire older versions after a confirmed stability period and successful client migration.
+- Maintain archived versions for audit, compliance, or potential future reference.
+
+Following these guidelines helps ensure stable, reliable API lifecycle management, delivering a better experience for your developers and end-users.
\ No newline at end of file
diff --git a/docs/user-guide/data-service/manage-app.md b/docs/publish-apis/manage-app.md
similarity index 81%
rename from docs/user-guide/data-service/manage-app.md
rename to docs/publish-apis/manage-app.md
index 5fe2af0e..bb7c6b4f 100644
--- a/docs/user-guide/data-service/manage-app.md
+++ b/docs/publish-apis/manage-app.md
@@ -1,13 +1,11 @@
# Manage Application
-import Content from '../../reuse-content/_enterprise-features.md';
-
To better manage and distinguish API services, we can categorize them based on business needs, assigning different APIs into different applications. This article introduces the specific operation process.
## Procedure
-1. [Log in to TapData Platform](../log-in.md).
+1. Log in to TapData Platform.
2. In the left navigation bar, select **Data Services** > **Application List**.
@@ -15,7 +13,7 @@ To better manage and distinguish API services, we can categorize them based on b
4. In the pop-up dialog, enter the application name and description, then click **Save**.
- 
+ 
:::tip
@@ -25,7 +23,7 @@ To better manage and distinguish API services, we can categorize them based on b
5. (Optional) Manage existing applications.
- 
+ 
* **Edit**: You can edit the application's name and description information.
* **Details**: You can view detailed information about the API services contained in the application, such as the publication status, etc.
diff --git a/docs/user-guide/data-service/monitor-api-request.md b/docs/publish-apis/monitor-api-request.md
similarity index 79%
rename from docs/user-guide/data-service/monitor-api-request.md
rename to docs/publish-apis/monitor-api-request.md
index b976b6ec..0769a222 100644
--- a/docs/user-guide/data-service/monitor-api-request.md
+++ b/docs/publish-apis/monitor-api-request.md
@@ -1,11 +1,9 @@
# API Status Monitor
-import Content from '../../reuse-content/_enterprise-features.md';
-
Service Monitoring allows for the oversight and management of API requests on the platform, providing a view into global statistics and the status of each API.
-
+
The top section displays overview data, which includes the total number of APIs, the total number of rows accessed, etc.
@@ -15,4 +13,4 @@ The bottom section shows an API list where you can view the status of each API.
You can click on the left expand button to view detailed information.
-
\ No newline at end of file
+
\ No newline at end of file
diff --git a/docs/user-guide/advanced-settings/README.md b/docs/publish-apis/query/README.md
similarity index 75%
rename from docs/user-guide/advanced-settings/README.md
rename to docs/publish-apis/query/README.md
index ee510ac7..a07d7354 100644
--- a/docs/user-guide/advanced-settings/README.md
+++ b/docs/publish-apis/query/README.md
@@ -1,5 +1,4 @@
-# Advanced Settings
-
+# Query APIs
+
import DocCardList from '@theme/DocCardList';
diff --git a/docs/user-guide/data-service/api-auth.md b/docs/publish-apis/query/api-auth.md
similarity index 82%
rename from docs/user-guide/data-service/api-auth.md
rename to docs/publish-apis/query/api-auth.md
index 94c69fbf..a5549858 100644
--- a/docs/user-guide/data-service/api-auth.md
+++ b/docs/publish-apis/query/api-auth.md
@@ -1,9 +1,7 @@
# API Authentication
-import Content from '../../reuse-content/_enterprise-features.md';
-
-TapData's API authentication service is based on the OAuth 2.0 mechanism, with default support for `client credentials` and `implicit` authorization methods. You can select the authorization method when [creating a client](create-api-client.md). This article introduces the API authentication process, including how to obtain access tokens, to help you quickly utilize the API service.
+TapData's API authentication service is based on the OAuth 2.0 mechanism, with default support for `client credentials` and `implicit` authorization methods. You can select the authorization method when [creating a client](../create-api-client.md). This article introduces the API authentication process, including how to obtain access tokens, to help you quickly utilize the API service.
## Obtaining Access Tokens
@@ -52,13 +50,12 @@ Authorization: bearer eyJhbGciOiJIUzI1NiJ9.eyJjbGllbnRJ********
## Common Response Status Codes
-| Response Code | Description |
-| ------------- | --------------------------------------------------------------------------- |
+| Response Code | Description |
+| ------------- | ------------------------------------------------------------ |
| 200 | Successful return for findById, findPage, create, custom methods, and requests. |
-| 204 | Successful return for updateById, deleteById requests. |
| 500 | Internal server error, common errors include violating unique constraints, MongoDB Validate failure, etc. |
-| 401 | Authentication failure, access token expired or not provided. |
-| 404 | Operation data does not exist, such as deleting, updating, or querying non-existent records. |
+| 401 | Authentication failure, access token expired or not provided. |
+| 404 | Operation data does not exist, such as querying non-existent records. |
## Recommended Reading
diff --git a/docs/user-guide/data-service/api-query-params.md b/docs/publish-apis/query/api-query-params.md
similarity index 96%
rename from docs/user-guide/data-service/api-query-params.md
rename to docs/publish-apis/query/api-query-params.md
index 500fcad8..67b1ede6 100644
--- a/docs/user-guide/data-service/api-query-params.md
+++ b/docs/publish-apis/query/api-query-params.md
@@ -1,7 +1,5 @@
# API Query Parameters
-import Content from '../../reuse-content/_enterprise-features.md';
-
When invoking published API interfaces, it's possible to add query conditions in the URL query string to quickly filter the query results. This article introduces supported filters and provides related usage examples.
@@ -11,7 +9,7 @@ When invoking published API interfaces, it's possible to add query conditions in
- **[Skip Filter (Skip Specified Record Count Filter)](#skip)**: Skips a specified number of rows in the returned data.
- **[Where Filter (Query Condition Filter)](#where)**: Queries and returns data based on a set of logically related conditions, similar to SQL's WHERE clause.
-In this case, we have published the `customer` table [as an API service](create-api-service.md), and the data comes from a randomly generated source. Its table structure and data sample are as follows:
+In this case, we have published the `customer` table [as an API service](../create-api-service.md), and the data comes from a randomly generated source. Its table structure and data sample are as follows:
```sql
mysql> SELECT * FROM customer LIMIT 1\G;
diff --git a/docs/user-guide/data-service/query-via-graphql.md b/docs/publish-apis/query/query-via-graphql.md
similarity index 95%
rename from docs/user-guide/data-service/query-via-graphql.md
rename to docs/publish-apis/query/query-via-graphql.md
index 8f3f0d49..3e5c4b61 100644
--- a/docs/user-guide/data-service/query-via-graphql.md
+++ b/docs/publish-apis/query/query-via-graphql.md
@@ -1,7 +1,5 @@
# Query API via GraphQL
-import Content from '../../reuse-content/_enterprise-features.md';
-
GraphQL provides a query language that allows you to request data from the server in a declarative way, such as specific data in a schema. TapData has integrated GraphQL, allowing you to execute requests through the API service address.
@@ -9,7 +7,7 @@ In this article, we will introduce how to use the Postman to view API data servi
## Procedure
-1. [Log in to TapData Platform](../log-in.md).
+1. Log in to TapData Platform.
2. Retrieve the GraphQL query request address.
diff --git a/docs/publish-apis/query/query-via-restful.md b/docs/publish-apis/query/query-via-restful.md
new file mode 100644
index 00000000..e843c35b
--- /dev/null
+++ b/docs/publish-apis/query/query-via-restful.md
@@ -0,0 +1,59 @@
+# Query API via REST
+
+TapData allows you to expose real-time data as secure RESTful APIs. Once an API service is published, you can query it directly from within the platform using the built-in debugger, or externally using tools like Postman. This guide walks you through both methods.
+
+## Before You Begin
+
+Make sure the API you want to query has already been [created and published](../create-api-service.md).
+
+## Query via Built-in Debugger
+
+You can test and preview your API directly in TapData—no need for external tools.
+
+1. Log in to TapData Platform.
+
+2. Go to **Data Services** > **API List** in the left navigation menu.
+
+3. Find your published API and click the service name.
+
+4. In the right panel, scroll to the **Access URL** section to copy the service endpoint.
+
+ 
+
+5. Click the **Debug** tab.
+
+6. Scroll to the **Example Code** section to get a sample request and the authentication token.
+
+ 
+
+7. Click the **Query** button to test your API.
+
+:::tip
+Need to filter results? You can add query parameters to the request URL. See [API Query Parameters](api-query-params.md) for details.
+:::
+
+
+
+## Query via Postman (Optional)
+
+If you'd prefer to use an external tool or automate API testing, [Postman](https://www.postman.com/) is a great option.
+
+1. Open Postman and select your **Workspace** at the top.
+
+2. Click **New** and choose **HTTP Request**.
+
+ 
+
+3. In the request URL field, paste the API endpoint you copied from TapData.
+
+4. (Optional) Click **Query Params** to add filter conditions to your request.
+
+ For supported query parameters, see [API Query Parameters](api-query-params.md).
+
+5. Click **Authorization**, select **Bearer Token**, and paste the Access Token you got from TapData.
+
+ 
+
+6. Click **Send**. You’ll get a real-time response from the API.
+
+ 
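The same request Postman sends can be composed in a few lines of Python. A sketch with a placeholder URL and token (swap in your Access URL and the token from the Debug tab):

```python
from urllib.parse import urlencode
from urllib.request import Request

ACCESS_TOKEN = "eyJhbGciOiJIUzI1NiJ9.placeholder"          # token from TapData
BASE_URL = "https://tapdata.example.com/api/v1/customer"   # your Access URL

# Optional filters, e.g. limit/skip (see API Query Parameters).
params = urlencode({"limit": 10, "skip": 0})

req = Request(f"{BASE_URL}?{params}",
              headers={"Authorization": f"Bearer {ACCESS_TOKEN}"})

# urllib.request.urlopen(req) would send it; here we just inspect what goes out.
print(req.full_url)
print(req.get_header("Authorization"))
```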
diff --git a/docs/quick-start/README.md b/docs/quick-start/README.md
deleted file mode 100644
index 524657a3..00000000
--- a/docs/quick-start/README.md
+++ /dev/null
@@ -1,17 +0,0 @@
-# Quick Start
-
-import Content from '../reuse-content/_all-features.md';
-
-
-
-With the digital transformation of enterprises, traditional applications and architectures are no longer able to meet the needs of businesses, and the issue of data silos has emerged as the most significant challenge.
-
-In the past, delivering data often required substantial time and effort to customize ETL logic or scripts. However, these capabilities were often not easily reusable in other business scenarios and necessitated repetition when requirements changed. Consequently, enterprises now seek a user-friendly approach to uniformly manage their data pipelines and achieve efficient data flow.
-
-TapData offers real-time data services that seamlessly integrate data replication and data transformation. It enables millisecond-level real-time data synchronization and data fusion across clouds, regions, and multiple types of databases.
-
-Follow our tutorial to experience the convenience, power, security, and reliability of TapData Cloud's data flow in just a few simple steps.
-
-import DocCardList from '@theme/DocCardList';
-
-
diff --git a/docs/quick-start/connect-database.md b/docs/quick-start/connect-database.md
deleted file mode 100644
index fa3f13f5..00000000
--- a/docs/quick-start/connect-database.md
+++ /dev/null
@@ -1,62 +0,0 @@
-# Step 2: Connect Data Sources
-
-import Content from '../reuse-content/_all-features.md';
-
-
-
-Once you have [installed the Agent](install.md), you need to connect the Agent to the data sources through TapData, and you can create a data pipeline once the connection has been established.
-
-:::tip
-
-Before connecting to the data sources, you also need to ensure that the network environment is accessed properly and complete the authorization of the database account. For more information, see [Preparation](../prerequisites/README.md).
-
-:::
-
-## Procedure
-
-1. [Log in to TapData Platform](../user-guide/log-in.md).
-
-2. In the left navigation panel, click **Connections**.
-
-3. On the right side of the page, click on **Create**. A dialog box will appear, where you can select the desired data source to establish a connection with.
-
- 
-
-4. After being redirected to the connection configuration page, proceed to fill in the required data source connection information.
-
- On the right panel of the page, you will find helpful information and guidance regarding the configuration of the connection.
-
- :::tip
-
- The operation process will be demonstrated using MySQL as an example. For more examples, see [Connect Data Sources](../prerequisites/README.md).
-
- :::
-
- 
-
- * **Connection name**: Enter a unique name that holds business significance.
- * **Connection type**: Select Source, Target, or Source&Target.
- * **Host**: The database connection address.
- * **Port**: The service port of database.
- * **Database**: database name, a connection corresponding to a database, if there are multiple databases, you need to create multiple connections.
- * **username**: Enter database server username.
- * **Password**: The database password.
- * **Connection Parameter String**: Additional connection parameters, default empty.
- * **timezone**: Defaults to the time zone used by the database, which you can also manually specify according to your business needs.
- * **Contain table**: The default option is **All**, but you also have the choice to select **Custom** and specify the included tables. If there are multiple tables, separate them by commas (,) when filling in the table names.
- * **Exclude tables**: Once you have enabled the switch, you can configure the tables to be excluded by specifying their names, separated by commas (,) if there are multiple tables.
- * **Agent settings**: Defaults to **Platform automatic allocation**, you can also manually specify an agent.
-
-5. Click **Connection Test** at the bottom of the page, and when passed the check, click **Save**.
-
- :::tip
-
- If the connection test fails, follow the prompts on the page to fix it.
-
- :::
-
-
-
-## Next step
-
-[Create a Data Pipeline](create-task.md)
\ No newline at end of file
diff --git a/docs/quick-start/create-task.md b/docs/quick-start/create-task.md
deleted file mode 100644
index 3c7734e0..00000000
--- a/docs/quick-start/create-task.md
+++ /dev/null
@@ -1,18 +0,0 @@
-# Step 3: Create a Data Pipeline
-
-import Content from '../reuse-content/_all-features.md';
-
-
-
-TapData allows you to synchronize data from various sources and process it during the data flow. Based on your business needs, you can create different tasks such as:
-
-| Task Type | Applicable scenario |
-| ------------------------------------------------------------ | ------------------------------------------------------------ |
-| [Create a data replication task](../user-guide/copy-data/create-task.md) | Real-time synchronization between similar or heterogeneous data sources can be achieved easily in a few simple steps. This capability is well-suited for various business scenarios such as data migration/synchronization, data disaster recovery, and improving reading performance. |
-| [Create a data transformation task](../user-guide/data-development/create-task.md) | A variety of processing nodes can be added between source/target data sources. These nodes provide advanced data processing capabilities such as data splitting, merging, field addition, and deletion, and shared mining. |
-
-## See also
-
-* [Enable Data Service Platform](../user-guide/real-time-data-hub/daas-mode/enable-daas-mode.md)
-* [Supported Data Sources](../prerequisites/supported-databases.md)
-* [Best Practices](../case-practices/best-practice/README.md)
\ No newline at end of file
diff --git a/docs/quick-start/install.md b/docs/quick-start/install.md
deleted file mode 100644
index 0fd4085d..00000000
--- a/docs/quick-start/install.md
+++ /dev/null
@@ -1,36 +0,0 @@
-# Step 1: Launch TapData
-
-import Content from '../reuse-content/_all-features.md';
-
-
-
-TapData offers various deployment options to cater to diverse needs, from quick validation to highly secure on-premises deployment. This guide uses **TapData Cloud** as an example to help you get started quickly while briefly introducing other deployment methods.
-
-## Try with TapData Cloud
-
-**TapData Cloud** is the fully managed version of TapData, designed to let you explore real-time data synchronization effortlessly. With no installation or infrastructure setup required, you can start creating your data flows in just a few clicks. Here’s why TapData Cloud is perfect for getting started:
-
-- **No Infrastructure Required**: Jump in and start syncing data—no servers, no installations.
-- **Scale with Ease**: Need more power? [Add dedicated Agent instances](../billing/billing-overview.md) for high-throughput scenarios or [deploy Agents locally](../installation/install-tapdata-agent.md) for complex network environments.
-- **Focus on Your Data, Not Maintenance**: TapData takes care of the heavy lifting, so you can stay focused on data development and business insights.
-
-**Get Started in Minutes**
-
-1. Visit [TapData Cloud](https://cloud.tapdata.io/console/v3/) and sign up for free.
-2. Log in to your account to access your shared TapData Agent.
-3. Follow these quick guides to explore TapData’s features:
- - **[Connect a Data Source](connect-database.md)**: Link your databases or data sources to the TapData platform.
- - **[Create Your First Data Pipeline](connect-database.md)**: Start syncing or transforming data in real time.
-
-## More Deployment Options
-
-If your project requires more control or custom setups, TapData has you covered with additional deployment methods tailored to your needs:
-
-- **[TapData Enterprise](../installation/install-tapdata-enterprise/README.md)**
- Designed for organizations with strict data governance and security requirements, TapData Enterprise is ideal for industries like finance, government, or large enterprises. Deploy TapData in your own data center to ensure full control over your data.
-- **[TapData Community](../installation/install-tapdata-community.md)**
- An open-source version of TapData, perfect for small projects or technical teams. Quickly deploy via Docker to access core data synchronization and transformation capabilities, with the option to upgrade to Enterprise or Cloud for additional features.
-
-## See also
-
- [Edition Comparison](../introduction/compare-editions.md)
\ No newline at end of file
diff --git a/docs/release-notes/release-notes-on-prem.md b/docs/release-notes-on-prem.md
similarity index 63%
rename from docs/release-notes/release-notes-on-prem.md
rename to docs/release-notes-on-prem.md
index 9a6ea1e6..854d0340 100644
--- a/docs/release-notes/release-notes-on-prem.md
+++ b/docs/release-notes-on-prem.md
@@ -1,8 +1,4 @@
-# TapData Enterprise Release Notes
-
-import Content from '../reuse-content/_enterprise-features.md';
-
-
+# What's New
This article provides release notes for TapData Enterprise, including new features, improvements, and bug fixes.
@@ -18,13 +14,65 @@ import TabItem from '@theme/TabItem';
```
+## 4.6.0
+
+### New Features
+
+* Added support for defining business aliases for fields when [creating APIs](publish-apis/create-api-service.md), improving semantic readability and standardization of field names.
+* When publishing a Custom Query API, you can now preview and edit the generated SQL statement and reference user-defined parameters, offering a more flexible API configuration experience.
+
+### Enhancements
+
+* Added an **“Ignore Update Events”** option for child table merge logic, reducing redundant triggers and minimizing unnecessary data updates.
+* Support for customizing license alert levels and email templates, helping enterprises meet compliance and notification requirements.
+* Added a **Set API Access** option on the role management page, enabling quick assignment of API access permissions to roles and improving authorization efficiency.
+
+### Bug Fixes
+
+* Fixed an issue where the maximum number of rows returned by an API did not take effect in system settings.
+* Fixed a display error in the API audit page.
+
+## 4.5.0
+
+### Enhancements
+
+* Improved write performance when using [Feishu Bitable](connectors/saas-and-api/feishu-bitable.md) as a target database.
+* Optimized the source and target configuration interfaces to make interactions clearer, improving usability and configuration efficiency.
+
+### Bug Fixes
+
+* Fixed an issue where **specified return fields for APIs** were not applied correctly.
+
+## 4.4.0
+
+### New Features
+
+- [Incremental data validation](data-replication/incremental-check.md) now supports tables without primary keys, along with manual repair. This expands validation coverage and helps ensure data consistency.
+- Added [alert notifications](system-admin/other-settings/notification.md#alert-settings) during task retries to improve visibility and responsiveness in failure scenarios.
+- Introduced "Node Data Preview" in the task editor, allowing you to view sample fields instantly while configuring tasks—speeding up design and troubleshooting.
+
+### Enhancements
+
+- Improved data source monitoring: Added connection status and performance metrics for Sybase and PostgreSQL, enhancing system observability.
+
+## 4.3.0
+
+### New Features
+
+- [Data validation](operational-data-hub/fdm-layer/validate-data-quality.md) now supports exporting repair SQL for inconsistent results, with compatibility for PostgreSQL, Sybase, and **SQL Server**. This makes offline auditing and rollback easier for ops teams.
+- JS Processor Node now supports the `unset` operation, enabling targeted field removal in MongoDB. Ideal for cleanup and payload optimization.
+
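The `unset` support described above can be pictured with a minimal sketch. The `process` function and field names below are illustrative, not Tapdata's actual JS Processor API: the point is that removing a field from the record causes the target MongoDB document to drop it, rather than writing it back as `null`.

```javascript
// Hypothetical sketch of an unset-style field removal in a JS processing
// step. Function and field names are assumptions for illustration only.
function process(record) {
  // Drop a temporary payload field so it never reaches the target document
  delete record.tempPayload;
  return record;
}

const out = process({ _id: 1, name: "alice", tempPayload: "debug-only" });
console.log(out); // { _id: 1, name: 'alice' }
```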
+### Enhancements
+
+- Refined the logic for marking full sync tasks as completed: a task is now marked complete only after all writes to the target have finished, which better reflects actual business completion.
+
## 4.2.0
### New Features
-- Added status monitoring for incremental log mining plugins in the [Cluster Management](../user-guide/manage-system/manage-cluster.md) module to improve observability and troubleshooting efficiency.
-- Support for [customizing alert email content](../user-guide/other-settings/notification.md#mail-alert), enhancing readability and flexibility of notifications.
-- Support for importing/exporting [data verification task configurations](../user-guide/verify-data.md), making it easier to migrate or reuse configurations between test and production environments.
+- Added status monitoring for incremental log mining plugins in the [Cluster Management](system-admin/manage-cluster.md) module to improve observability and troubleshooting efficiency.
+- Support for [customizing alert email content](system-admin/other-settings/notification.md#mail-alert), enhancing readability and flexibility of notifications.
+- Support for importing/exporting [data verification task configurations](operational-data-hub/fdm-layer/validate-data-quality.md), making it easier to migrate or reuse configurations between test and production environments.
### Enhancements
@@ -38,7 +86,7 @@ import TabItem from '@theme/TabItem';
### New Features
-- Added support for [Incremental Data Validation](../user-guide/incremental-check.md) within tasks. This feature continuously verifies target-side data consistency during synchronization, improving validation efficiency and enhancing overall data reliability.
+- Added support for [Incremental Data Validation](data-replication/incremental-check.md) within tasks. This feature continuously verifies target-side data consistency during synchronization, improving validation efficiency and enhancing overall data reliability.
### Enhancements
@@ -52,9 +100,9 @@ import TabItem from '@theme/TabItem';
### New Features
-* Introduced [Tapdata MCP (Model Context Protocol)](../mcp/introduction.md), enabling integration of multi-source data into real-time contextual views consumable by LLMs and AI Agents. This feature is ideal for scenarios with high demands on data freshness and compliance, such as financial risk control.
+* Introduced [Tapdata MCP (Model Context Protocol)](experimental/mcp/introduction.md), enabling integration of multi-source data into real-time contextual views consumable by LLMs and AI Agents. This feature is ideal for scenarios with high demands on data freshness and compliance, such as financial risk control.
* Added support for using **StarRocks** as a target database, allowing faster construction of real-time data warehouses for high-concurrency, multi-dimensional analytics use cases.
-* Added the ability to choose from multiple data structures(e.g. Flink) when syncing to **[Kafka-Enhanced](../prerequisites/mq-and-middleware/kafka-enhanced.md)**, enhancing compatibility and integration efficiency with downstream systems.
+* Added the ability to choose from multiple data structures (e.g., Flink) when syncing to **[Kafka-Enhanced](connectors/mq-and-middleware/kafka-enhanced.md)**, enhancing compatibility and integration efficiency with downstream systems.
### Enhancements
@@ -75,8 +123,8 @@ import TabItem from '@theme/TabItem';
### New Features
-- The [Cluster Overview](../user-guide/workshop.md) page on the homepage now displays task distribution by node, helping you better understand cluster workload.
-- [OceanBase (MySQL Mode)](../prerequisites/on-prem-databases/oceanbase.md), [OceanBase (Oracle Mode)](../prerequisites/on-prem-databases/oceanbase-oracle.md), and [GaussDB (DWS)](../prerequisites/warehouses-and-lake/gaussdb.md) have passed Tapdata certification and are now classified as [Certified Data Sources](../prerequisites/supported-databases.md), offering enhanced features and improved production-level stability.
+- The Cluster Overview page on the homepage now displays task distribution by node, helping you better understand cluster workload.
+- [OceanBase (MySQL Mode)](connectors/on-prem-databases/oceanbase.md), [OceanBase (Oracle Mode)](connectors/on-prem-databases/oceanbase-oracle.md), and [GaussDB (DWS)](connectors/warehouses-and-lake/gaussdb.md) have passed Tapdata certification and are now classified as [Certified Data Sources](connectors/supported-data-sources.md), offering enhanced features and improved production-level stability.
- Data replication tasks now support writing multiple tables to the same Kafka topic, expanding compatibility with more write scenarios.
### Enhancements
@@ -91,8 +139,8 @@ import TabItem from '@theme/TabItem';
### New Features
-- Added support for syncing tables with auto-increment primary keys in [SQL Server](../prerequisites/on-prem-databases/sqlserver.md).
-- Added support for syncing default values and foreign keys in [PostgreSQL](../prerequisites/on-prem-databases/postgresql.md) to SQL Server sync scenarios.
+- Added support for syncing tables with auto-increment primary keys in [SQL Server](connectors/on-prem-databases/sqlserver.md).
+- Added support for syncing default values and foreign keys in [PostgreSQL](connectors/on-prem-databases/postgresql.md) to SQL Server sync scenarios.
### Enhancements
@@ -102,9 +150,9 @@ import TabItem from '@theme/TabItem';
### New Features
-- Added support for synchronizing **column default values**, **auto-increment columns**, and **foreign key constraints** in [MySQL](../prerequisites/on-prem-databases/mysql.md)-to-MySQL, [PostgreSQL](../prerequisites/on-prem-databases/postgresql.md)-to-PostgreSQL, and [SQL Server](../prerequisites/on-prem-databases/sqlserver.md)-to-PostgreSQL scenarios, ensuring data structure consistency.
-- Enabled foreign key constraint synchronization in [Sybase](../prerequisites/on-prem-databases/sybase.md)-to-PostgreSQL tasks, further enhancing data consistency.
-- Enhanced the **[primary-secondary merge node](../user-guide/data-development/process-node.md#pri-sec-merged)** functionality to allow subsequent connections with other processing nodes (including JS nodes), improving workflow flexibility.
+- Added support for synchronizing **column default values**, **auto-increment columns**, and **foreign key constraints** in [MySQL](connectors/on-prem-databases/mysql.md)-to-MySQL, [PostgreSQL](connectors/on-prem-databases/postgresql.md)-to-PostgreSQL, and [SQL Server](connectors/on-prem-databases/sqlserver.md)-to-PostgreSQL scenarios, ensuring data structure consistency.
+- Enabled foreign key constraint synchronization in [Sybase](connectors/on-prem-databases/sybase.md)-to-PostgreSQL tasks, further enhancing data consistency.
+- Enhanced the **[primary-secondary merge node](data-transformation/process-node.md#pri-sec-merged)** functionality to allow subsequent connections with other processing nodes (including JS nodes), improving workflow flexibility.
### Enhancements
@@ -134,8 +182,8 @@ import TabItem from '@theme/TabItem';
### New Features
-- Added support for the [Sybase to PostgreSQL](../prerequisites/on-prem-databases/sybase.md) sync scenario, now supporting synchronization of default values, enumerated types, and sequences.
-- Enabled the ability to define a primary key for tables without a primary key when configuring [Primary-Secondary Merge Nodes](../user-guide/data-development/process-node.md#pri-sec-merged), ensuring data synchronization consistency and improving merge efficiency.
+- Added support for the [Sybase to PostgreSQL](connectors/on-prem-databases/sybase.md) sync scenario, now supporting synchronization of default values, enumerated types, and sequences.
+- Enabled the ability to define a primary key for tables without a primary key when configuring [Primary-Secondary Merge Nodes](data-transformation/process-node.md#pri-sec-merged), ensuring data synchronization consistency and improving merge efficiency.
### Enhancements
@@ -161,7 +209,7 @@ import TabItem from '@theme/TabItem';
### New Features
-- Enhanced [Sybase](../prerequisites/on-prem-databases/sybase.md)-to-PostgreSQL synchronization scenario, adding **index** migration and **sequence** synchronization features, further improving migration automation and ensuring sequence data consistency.
+- Enhanced [Sybase](connectors/on-prem-databases/sybase.md)-to-PostgreSQL synchronization scenario, adding **index** migration and **sequence** synchronization features, further improving migration automation and ensuring sequence data consistency.
### Feature Optimizations
@@ -176,7 +224,7 @@ import TabItem from '@theme/TabItem';
### New Features
-- Added support for restricting [single-session login](../user-guide/other-settings/system-settings.md#login) per account to enhance login security.
+- Added support for restricting [single-session login](system-admin/other-settings/system-settings.md#login) per account to enhance login security.
### Enhancements
@@ -195,7 +243,7 @@ import TabItem from '@theme/TabItem';
### New Features
-- In the **[Data Verification](../user-guide/verify-data.md)** task's advanced settings, a "**Custom Collate**" option has been added, allowing you to specify the sorting rules for both the source and target databases to ensure consistent character sorting during verification.
+- In the **[Data Verification](operational-data-hub/fdm-layer/validate-data-quality.md)** task's advanced settings, a "**Custom Collate**" option has been added, allowing you to specify the sorting rules for both the source and target databases to ensure consistent character sorting during verification.
### Enhancements
@@ -245,7 +293,7 @@ import TabItem from '@theme/TabItem';
### New Features
-- Added HTTPS connection support for [Elasticsearch data sources](../prerequisites/on-prem-databases/elasticsearch.md), enhancing data transmission security to meet more stringent data security and compliance requirements.
+- Added HTTPS connection support for [Elasticsearch data sources](connectors/on-prem-databases/elasticsearch.md), enhancing data transmission security to meet more stringent data security and compliance requirements.
- Enabled support for synchronizing tables without primary keys by adding a hash field (default name: `_no_pk_hash`), ensuring data consistency and stable synchronization in non-primary key scenarios.
### Enhancements
@@ -263,11 +311,11 @@ import TabItem from '@theme/TabItem';
### New Features
-* Kafka-Enhanced and TiDB have passed the TapData certification testing process and have been upgraded to [Certified Data Sources](../prerequisites/supported-databases.md), providing more advanced features and enhanced production stability.
+* Kafka-Enhanced and TiDB have passed the TapData certification testing process and have been upgraded to [Certified Data Sources](connectors/supported-data-sources.md), providing more advanced features and enhanced production stability.
### Enhancements
-- Added a [Multi-threaded CT Table Polling](../prerequisites/on-prem-databases/sqlserver.md#advanced-settings) option to improve incremental data collection performance for SQL Server environments with a large number of tables (over 500), significantly increasing synchronization efficiency.
+- Added a [Multi-threaded CT Table Polling](connectors/on-prem-databases/sqlserver.md#advanced-settings) option to improve incremental data collection performance for SQL Server environments with a large number of tables (over 500), significantly increasing synchronization efficiency.
- Optimized the cache management logic for processing nodes, enhancing resource usage efficiency and improving task execution speed.
- Introduced an automatic retry mechanism for Oracle LogMiner errors caused by exceeding PGA limits, improving fault tolerance.
@@ -284,8 +332,8 @@ import TabItem from '@theme/TabItem';
### New Features
-* Doris, ClickHouse, KingBaseES-R6, PostgreSQL, SQL Server, and MongoDB have passed the TapData certification testing process and have been upgraded to [Certified Data Sources](../prerequisites/supported-databases.md), providing more advanced features and enhanced production stability.
-* Support for [user login authentication via LDAP](../user-guide/other-settings/system-settings.md#ldap) integration with Active Directory (AD), enabling unified user identity management.
+* Doris, ClickHouse, KingBaseES-R6, PostgreSQL, SQL Server, and MongoDB have passed the TapData certification testing process and have been upgraded to [Certified Data Sources](connectors/supported-data-sources.md), providing more advanced features and enhanced production stability.
+* Support for [user login authentication via LDAP](system-admin/other-settings/system-settings.md#ldap) integration with Active Directory (AD), enabling unified user identity management.
* When using PostgreSQL as a source, it is now possible to specify the time point for incremental data in task settings.
### Enhancements
@@ -304,7 +352,7 @@ import TabItem from '@theme/TabItem';
### New Features
-* MySQL has passed the TapData certification testing process, upgrading it to a [certified data source](../prerequisites/supported-databases.md), providing more comprehensive features and enhanced production stability.
+* MySQL has passed the TapData certification testing process, upgrading it to a [certified data source](connectors/supported-data-sources.md), providing more comprehensive features and enhanced production stability.
### Enhancements
@@ -321,9 +369,9 @@ import TabItem from '@theme/TabItem';
### New Features
-- Oracle, Dameng, and Db2 have passed the TapData certification testing process and have been upgraded to [Certified Data Sources](../prerequisites/supported-databases.md), offering richer features and higher production stability.
-- When configuring [alert recipient email](../case-practices/best-practice/alert-via-qqmail.md), support for using proxy services has been added, allowing for timely alert notifications even in restricted network environments.
-- For [PostgreSQL](../prerequisites/on-prem-databases/postgresql.md) data sources, incremental data synchronization is now supported using the walminer plugin, catering to more use cases.
+- Oracle, Dameng, and Db2 have passed the TapData certification testing process and have been upgraded to [Certified Data Sources](connectors/supported-data-sources.md), offering richer features and higher production stability.
+- When configuring [alert recipient email](case-practices/best-practice/alert-via-qqmail.md), support for using proxy services has been added, allowing for timely alert notifications even in restricted network environments.
+- For [PostgreSQL](connectors/on-prem-databases/postgresql.md) data sources, incremental data synchronization is now supported using the walminer plugin, catering to more use cases.
- Data replication tasks now support reading from multiple tables simultaneously, improving parallel processing capabilities and task execution efficiency.
- Added support for batch API publishing, simplifying multi-interface management and enhancing publishing efficiency.
@@ -359,8 +407,8 @@ import TabItem from '@theme/TabItem';
### New Features
-- Added table name and API address display functionality in the [Service Management List Page](../user-guide/data-service/create-api-service.md), supporting quick search and filtering by keywords.
-- Enhanced [Data Transformation Task Configuration](../user-guide/data-development/create-task.md) to support reloading of single table models in the source node model preview area, improving loading efficiency.
+- Added table name and API address display functionality in the [Service Management List Page](publish-apis/create-api-service.md), supporting quick search and filtering by keywords.
+- Enhanced [Data Transformation Task Configuration](data-transformation/create-views/README.md) to support reloading of single table models in the source node model preview area, improving loading efficiency.
- Introduced time detection, which automatically measures the time difference between the engine deployment server and the database server and displays it on the task monitoring page.
### Enhancements
@@ -380,11 +428,11 @@ import TabItem from '@theme/TabItem';
### New Features
-- [Data Verification](../user-guide/verify-data.md) feature now allows downloading detailed discrepancy data from the verification task details page for in-depth analysis.
-- Added a [Union Node](../user-guide/copy-data/process-node.md#union-node) to data replication tasks, enabling the merging (UNION) of multiple tables within the same database. This is useful for data integration and analysis scenarios.
-- [Doris](../prerequisites/warehouses-and-lake/doris.md) data source now supports certificate-free HTTPS connections.
+- [Data Verification](operational-data-hub/fdm-layer/validate-data-quality.md) feature now allows downloading detailed discrepancy data from the verification task details page for in-depth analysis.
+- Added a [Union Node](data-transformation/process-node.md#union-node) to data replication tasks, enabling the merging (UNION) of multiple tables within the same database. This is useful for data integration and analysis scenarios.
+- [Doris](connectors/warehouses-and-lake/doris.md) data source now supports certificate-free HTTPS connections.
- MySQL, Oracle, OpenGauss, SQL Server, and PostgreSQL data sources now support enabling the **Hash Sharding** feature in the advanced settings of nodes during task configuration, significantly improving the full data sync speed for large tables.
-- Added support for [VastBase](../prerequisites/on-prem-databases/vastbase.md) data source, with a maturity level of Beta, further enriching the variety of data sources.
+- Added support for [VastBase](connectors/on-prem-databases/vastbase.md) data source, with a maturity level of Beta, further enriching the variety of data sources.
### Enhancements
@@ -409,12 +457,12 @@ import TabItem from '@theme/TabItem';
### New Features
-* [Data Verification](../user-guide/verify-data.md) now includes differential data repair capabilities, enhancing data consistency and accuracy.
-* Added a new button for using CDC log Caching when creating [Live Cache](../user-guide/advanced-settings/share-cache.md), simplifying cache task configuration and improving the efficiency and flexibility of cache sharing.
+* [Data Verification](operational-data-hub/fdm-layer/validate-data-quality.md) now includes differential data repair capabilities, enhancing data consistency and accuracy.
+* Added a new button for using CDC log caching when creating a [Live Cache](operational-data-hub/advanced/share-cache.md), simplifying cache task configuration and improving the efficiency and flexibility of cache sharing.
### Enhancements
-* Optimized features in the [Real-Time Data Hub](../user-guide/real-time-data-hub/README.md):
+* Optimized features in the [Real-Time Data Hub](operational-data-hub/plan-data-platform.md):
* The data processing layer now displays all models in the database.
* The platform cache layer and platform processing layer can be configured with different connections, which cannot be adjusted after setting.
* Added an API publishing entry.
@@ -439,8 +487,8 @@ import TabItem from '@theme/TabItem';
### New Features
-* Enhanced [TiDB](../prerequisites/on-prem-databases/tidb.md) data source capabilities with support for real-time incremental synchronization.
-* [Data Validation](../user-guide/verify-data.md) now supports automatic difference checking, allowing real-time tasks to automatically perform difference checks based on incremental delay.
+* Enhanced [TiDB](connectors/on-prem-databases/tidb.md) data source capabilities with support for real-time incremental synchronization.
+* [Data Validation](operational-data-hub/fdm-layer/validate-data-quality.md) now supports automatic difference checking, allowing real-time tasks to automatically perform difference checks based on incremental delay.
### Enhancements
@@ -458,7 +506,7 @@ import TabItem from '@theme/TabItem';
### New Features
-* Added support for [granting data verification permissions](../user-guide/manage-system/manage-role.md) to users, enhancing permission management granularity.
+* Added support for [granting data verification permissions](system-admin/manage-role.md) to users, enhancing permission management granularity.
* Introduced Mock Source and Mock Target data sources for data migration testing scenarios.
### Enhancements
@@ -477,9 +525,9 @@ import TabItem from '@theme/TabItem';
### New Features
-* Added support for dynamically generating date suffixes for target table names when [configuring data transformation tasks](../user-guide/data-development/create-task.md#target-node-set), suitable for daily batch processing scenarios.
-* Added support for [integrating with third-party platforms via Webhook](../user-guide/other-settings/notification.md) to enable more alert notification channels.
-* Added support for performing Hash validation between MySQL, Oracle, SQL Server, PostgreSQL, and GaussDB data sources when [configuring data validation tasks](../user-guide/verify-data.md), improving validation efficiency.
+* Added support for dynamically generating date suffixes for target table names when [configuring data transformation tasks](data-transformation/create-views/README.md#target-node-set), suitable for daily batch processing scenarios.
+* Added support for [integrating with third-party platforms via Webhook](system-admin/other-settings/notification.md) to enable more alert notification channels.
+* Added support for performing Hash validation between MySQL, Oracle, SQL Server, PostgreSQL, and GaussDB data sources when [configuring data validation tasks](operational-data-hub/fdm-layer/validate-data-quality.md), improving validation efficiency.
* Added support for setting partitions when configuring Doris data sources.
* Added support for the Oracle mode of OceanBase data sources, with the data source name OceanBase (Oracle).
@@ -501,12 +549,12 @@ import TabItem from '@theme/TabItem';
* Support for bidirectional data synchronization between MySQL instances and between PostgreSQL instances, better meeting the needs of active-active and disaster recovery scenarios.
* Support for importing files from [MongoDB Relmig](https://www.mongodb.com/docs/relational-migrator/) version 1.3.0 and above, further enhancing ecosystem integration capabilities.
* Support for synchronizing MongoDB [Oplog](https://www.mongodb.com/docs/manual/core/replica-set-oplog/) (operation log) data.
-* Support for filtering the time field of tables in the source node’s **[Advanced Settings](../user-guide/data-development/create-task.md#full-sql-query)** when configuring data transformation tasks (e.g., relative dates).
-* Display milestone information for tasks on the [Task List](../user-guide/copy-data/manage-task.md) page, helping users quickly understand key progress statuses.
+* Support for filtering the time field of tables in the source node’s **[Advanced Settings](data-transformation/create-views/README.md#full-sql-query)** when configuring data transformation tasks (e.g., relative dates).
+* Display milestone information for tasks on the [Task List](data-transformation/manage-task.md) page, helping users quickly understand key progress statuses.
### Enhancements
-* Improved [Unwind Node](../user-guide/data-development/process-node.md#unwind) functionality, allowing users to set expansion modes, such as **Embedded Objects** or **Flatten Fields**.
+* Improved [Unwind Node](data-transformation/process-node.md#unwind) functionality, allowing users to set expansion modes, such as **Embedded Objects** or **Flatten Fields**.
* Enhanced full synchronization detail page display, supporting quick table name filtering.
### Bug Fixes
@@ -520,8 +568,8 @@ import TabItem from '@theme/TabItem';
### New Features
-* [Data replication tasks](../user-guide/copy-data/create-task.md) now support table-level checkpoint resumption, allowing tasks to continue syncing from the last incomplete table upon restart.
-* Added the ability to quickly [set task labels](../user-guide/copy-data/manage-task.md) by dragging and dropping.
+* [Data replication tasks](data-replication/create-task.md) now support table-level checkpoint resumption, allowing tasks to continue syncing from the last incomplete table upon restart.
+* Added the ability to quickly [set task labels](data-transformation/manage-task.md) by dragging and dropping.
* Added support for MySQL replica architecture, ensuring tasks continue to sync data normally after a failover event.
### Bug Fixes
@@ -533,14 +581,14 @@ import TabItem from '@theme/TabItem';
### New Features
-* Support for [assigning labels](../user-guide/manage-system/manage-cluster.md) to **sync governance services** (Agents), allowing for subsequent assignment of tasks to agents with specific labels.
-* Supported real-time log parsing for [TiDB data sources](../prerequisites/on-prem-databases/tidb.md), meeting the needs for incremental data synchronization.
+* Support for [assigning labels](system-admin/manage-cluster.md) to **sync governance services** (Agents), allowing for subsequent assignment of tasks to agents with specific labels.
+* Supported real-time log parsing for [TiDB data sources](connectors/on-prem-databases/tidb.md), meeting the needs for incremental data synchronization.
* Added support for syncing unique indexes and regular indexes (excluding function-based indexes) during the full synchronization phase from Oracle to MySQL.
* Added the ability to skip errors encountered during the last run when starting tasks.
### Enhancements
-* Optimized the data synchronization task scenario, allowing source nodes to [configure DDL synchronization settings](../case-practices/best-practice/handle-schema-changes.md) and specify DDL statements to ignore (based on regular expressions) in case of DDL errors.
+* Optimized the data synchronization task scenario, allowing source nodes to [configure DDL synchronization settings](case-practices/best-practice/handle-schema-changes.md) and specify DDL statements to ignore (based on regular expressions) in case of DDL errors.
* Enhanced data verification capabilities to support tasks that include processing nodes.
* Improved the data verification results page display, enabling quick filtering of consistent and inconsistent tables.
@@ -556,8 +604,8 @@ import TabItem from '@theme/TabItem';
### New Features
-* When [configuring data verification tasks](../user-guide/verify-data.md), custom filtering based on time fields is now available for MongoDB aggregation queries.
-* Supported [hash verification](../user-guide/verify-data.md) for MySQL/Oracle homogeneous data source synchronization.
+* When [configuring data verification tasks](operational-data-hub/fdm-layer/validate-data-quality.md), custom filtering based on time fields is now available for MongoDB aggregation queries.
+* Supported [hash verification](operational-data-hub/fdm-layer/validate-data-quality.md) for MySQL/Oracle homogeneous data source synchronization.
### Bug Fixes
@@ -569,9 +617,9 @@ import TabItem from '@theme/TabItem';
### New Features
-* Support for sending email reminders one week before the license expires (once a day), which can be combined with [configuring SMTP email services](../case-practices/best-practice/alert-via-qqmail.md) to enhance operational convenience.
-* New options in [DDL synchronization settings](../case-practices/best-practice/handle-schema-changes.md): **Stop Task on DDL Error** and **Automatically Ignore All DDLs**, catering to different business scenario needs.
-* Added a [time field injection](../user-guide/data-development/process-node.md#time_injection) node, allowing the addition of a custom timestamp field to data during synchronization. This provides a more flexible way to capture incremental changes from the source database.
+* Support for sending email reminders one week before the license expires (once a day), which can be combined with [configuring SMTP email services](case-practices/best-practice/alert-via-qqmail.md) to enhance operational convenience.
+* New options in [DDL synchronization settings](case-practices/best-practice/handle-schema-changes.md): **Stop Task on DDL Error** and **Automatically Ignore All DDLs**, catering to different business scenario needs.
+* Added a [time field injection](data-transformation/process-node.md#time_injection) node, allowing the addition of a custom timestamp field to data during synchronization. This provides a more flexible way to capture incremental changes from the source database.
* Support for setting the expiration time and size of engine logs, enabling automatic log cleanup.
### Enhancements
@@ -586,17 +634,17 @@ import TabItem from '@theme/TabItem';
### New Features
-- [Shared Mining](../user-guide/advanced-settings/share-mining.md) functionality supports using RocksDB as local external storage for incremental log storage expansion.
-- [TDengine Connector](../prerequisites/on-prem-databases/tdengine.md) supports using multiple databases as incremental sources.
+- [Shared Mining](operational-data-hub/advanced/share-mining.md) functionality supports using RocksDB as local external storage for incremental log storage expansion.
+- [TDengine Connector](connectors/on-prem-databases/tdengine.md) supports using multiple databases as incremental sources.
### Enhancements
-- [Task Monitoring Page](../user-guide/copy-data/monitor-task.md) adds a time filter option for the incremental phase to quickly observe the RPS (Records Per Second) of the incremental phase.
+- [Task Monitoring Page](data-replication/monitor-task.md) adds a time filter option for the incremental phase to quickly observe the RPS (Records Per Second) of the incremental phase.
- Added warning prompts for key operations that may affect the database (such as filtering source table data).
### Bug Fixes
-* Fixed the issue where the final data does not match the expectation when the primary and secondary table key conditions change in [Primary-Secondary Merge Node](../user-guide/data-development/process-node.md#pri-sec-merged).
+* Fixed the issue where the final data does not match the expectation when the primary and secondary table key conditions change in [Primary-Secondary Merge Node](data-transformation/process-node.md#pri-sec-merged).
## V3.5.10
@@ -629,12 +677,12 @@ import TabItem from '@theme/TabItem';
### New Features
-- Newly supports [Azure Cosmos DB](../prerequisites/cloud-databases/azure-cosmos-db.md) as a data source, capable of synchronizing full data from the source to help facilitate rapid data flow in the cloud.
+- Newly supports [Azure Cosmos DB](connectors/cloud-databases/azure-cosmos-db.md) as a data source, capable of synchronizing full data from the source to help facilitate rapid data flow in the cloud.
### Enhancements
-- Enhanced data source connection methods, [SQL Server](../prerequisites/on-prem-databases/sqlserver.md) supports SSL connections to further enhance data security.
-- Optimized the method of adjusting field types in [data replication tasks](../user-guide/copy-data/create-task.md); in addition to manual input, it now supports direct selection of common types from the target database.
+- Enhanced data source connection methods: [SQL Server](connectors/on-prem-databases/sqlserver.md) now supports SSL connections to further enhance data security.
+- Optimized the method of adjusting field types in [data replication tasks](data-replication/create-task.md); in addition to manual input, it now supports direct selection of common types from the target database.
- The task's source node settings now support setting the number of records read per batch during the incremental phase, to better meet the performance requirements of incremental synchronization.
### Bug Fixes
@@ -646,9 +694,9 @@ import TabItem from '@theme/TabItem';
### New Features
-- Supports loading table comments for [Oracle data sources](../prerequisites/on-prem-databases/oracle.md#advanced), which can be enabled in the advanced options during data source configuration, allowing quick identification of tables' business meanings through comments.
-- Supports deployment of TapData on [Windows platform](../installation/install-tapdata-enterprise/install-on-windows.md), further expanding the range of supported deployment platforms.
-- In the task operation [monitoring page](../user-guide/copy-data/monitor-task.md), supports viewing RPS (Records Per Second) information based on the dimension of event size.
+- Supports loading table comments for [Oracle data sources](connectors/on-prem-databases/oracle.md#advanced), which can be enabled in the advanced options during data source configuration, allowing quick identification of tables' business meanings through comments.
+- Supports deployment of TapData on [Windows platform](getting-started/install-and-setup/install-enterprise-edition.md), further expanding the range of supported deployment platforms.
+- In the task operation [monitoring page](data-replication/monitor-task.md), supports viewing RPS (Records Per Second) information based on the dimension of event size.
### Bug Fixes
@@ -659,8 +707,8 @@ import TabItem from '@theme/TabItem';
### Enhancements
-- Optimized [data source connections](../prerequisites/README.md), with MySQL, PostgreSQL, Kafka, TiDB, MariaDB, etc., supporting SSL connections to further enhance data security.
-- Enhanced the filtering function of [data verification](../user-guide/verify-data.md), supporting custom query and aggregation query filtering through SQL.
+- Optimized [data source connections](connectors/README.md), with MySQL, PostgreSQL, Kafka, TiDB, MariaDB, etc., supporting SSL connections to further enhance data security.
+- Enhanced the filtering function of [data verification](operational-data-hub/fdm-layer/validate-data-quality.md), supporting custom query and aggregation query filtering through SQL.
- Optimized interface interaction logic.
- For non-primary key update conditions, created a unique index to solve the problem of data duplication.
@@ -674,14 +722,14 @@ import TabItem from '@theme/TabItem';
### New Features
- Newly supports Hive3 as a target.
-- When MongoDB is the target, newly supports [automatic creation of sharded collections](../user-guide/copy-data/create-task.md#advanced-settings).
-- Newly added [Unwind Processing Node](../user-guide/data-development/process-node.md#Unwind), helping you efficiently "unwind" elements in an array, converting each element into a separate data row.
+- When MongoDB is the target, newly supports [automatic creation of sharded collections](data-replication/create-task.md#advanced-settings).
+- Newly added [Unwind Processing Node](data-transformation/process-node.md#Unwind), helping you efficiently "unwind" elements in an array, converting each element into a separate data row.
- When configuring tasks, nodes can now be disabled; hovering over a node offers this option, helping reduce the cost of data flow in the process.
### Enhancements
-- Optimized the setting of [published API scope](../user-guide/data-service/create-api-service.md#settings), allowing adjustments without needing to publish.
-- When [configuring data replication tasks](../user-guide/copy-data/create-task.md), the **selectable table range** dropdown box allows quick filtering of tables with or without primary keys, where tables with primary keys include those without primary keys but with unique indexes.
+- Optimized the setting of [published API scope](publish-apis/create-api-service.md#settings), allowing adjustments without needing to publish.
+- When [configuring data replication tasks](data-replication/create-task.md), the **selectable table range** dropdown box allows quick filtering of tables with or without primary keys, where tables with primary keys include those without primary keys but with unique indexes.
### Bug Fixes
@@ -692,31 +740,31 @@ import TabItem from '@theme/TabItem';
### New Features
-- Added [building materialized views](../user-guide/data-development/create-materialized-view.md) feature, enabling quick construction of real-time data models.
-- Added support for configuring source nodes of [shared mining](../user-guide/advanced-settings/share-mining.md) tasks, including settings for enabling **incremental multi-threaded writing** and **supplementing updated data with complete fields**.
-- Kafka data source added support for [setting the number of replicas and partitions](../case-practices/pipeline-tutorial/oracle-to-kafka.md).
+- Added [building materialized views](data-transformation/create-views/using-imv-guide.md) feature, enabling quick construction of real-time data models.
+- Added support for configuring source nodes of [shared mining](operational-data-hub/advanced/share-mining.md) tasks, including settings for enabling **incremental multi-threaded writing** and **supplementing updated data with complete fields**.
+- Kafka data source added support for [setting the number of replicas and partitions](case-practices/pipeline-tutorial/oracle-to-kafka.md).
- Added support for the `$unset` operation during synchronization between MongoDB instances.
### Enhancements
-- [Data verification](../user-guide/verify-data.md) feature field filtering experience optimization.
+- Optimized the field filtering experience of the [data verification](operational-data-hub/fdm-layer/validate-data-quality.md) feature.
- Supported quick node targeting at the top of the data replication/data transformation configuration page through node search.
## V3.5.2
### New Features
-* Added [Python Processing Node](../user-guide/data-development/process-node.md#python), supporting custom data processing logic through Python scripts, offering performance improvements compared to JS processing nodes.
+* Added [Python Processing Node](data-transformation/process-node.md#python), supporting custom data processing logic through Python scripts, offering performance improvements compared to JS processing nodes.
* Added support for data synchronization between Redis instances.
### Enhancements
-* Enhanced [data source error codes](../administration/troubleshooting/error-code.md), covering more scenarios and providing solutions.
+* Enhanced [data source error codes](platform-ops/troubleshooting/error-code.md), covering more scenarios and providing solutions.
## V3.5.1
### New Features
-- Now when [creating a role](../user-guide/manage-system/manage-role.md), it supports the granular granting of functional and data rights.
+- Now when [creating a role](system-admin/manage-role.md), it supports the granular granting of functional and data rights.
### Enhancements
- Enhanced the UI prompts and guidance when setting up core data sources like PostgreSQL, Redis, etc.
@@ -730,10 +778,10 @@ import TabItem from '@theme/TabItem';
## V3.4
### New Features
-- When task configurations are set for full + incremental sync, there's now support to turn on the [scheduled periodic task feature](../user-guide/copy-data/create-task.md#task-attr). The task will automatically stop, reset, and run again at the set time.
-- For the [add/remove field node](../user-guide/data-development/process-node.md#add-and-del-cols), field order adjustment is now supported.
-- A new feature to [dynamically adjust memory](../user-guide/copy-data/create-task.md#task-attr) has been introduced (enabled by default). During the full synchronization phase, it identifies memory usage and auto-adjusts the memory queue, effectively preventing memory overflow scenarios.
-- The data panel has been renamed to the [Real-time Data Center](../user-guide/real-time-data-hub/README.md), with added guidance on usage and task creation.
+- When task configurations are set for full + incremental sync, there's now support to turn on the [scheduled periodic task feature](data-replication/create-task.md#task-attr). The task will automatically stop, reset, and run again at the set time.
+- For the [add/remove field node](data-transformation/process-node.md#add-and-del-cols), field order adjustment is now supported.
+- A new feature to [dynamically adjust memory](data-replication/create-task.md#task-attr) has been introduced (enabled by default). During the full synchronization phase, it identifies memory usage and auto-adjusts the memory queue, effectively preventing memory overflow scenarios.
+- The data panel has been renamed to the [Real-time Data Center](operational-data-hub/plan-data-platform.md), with added guidance on usage and task creation.
- Introduced a target write strategy: if the record targeted by an update event does not exist, the event can be written to a local log.
### Enhancements
@@ -752,17 +800,17 @@ import TabItem from '@theme/TabItem';
## V3.3
### New Features
-- [Kafka data source](../prerequisites/mq-and-middleware/kafka.md) now supports custom message body formats.
-- Added the [API interface documentation export feature](../user-guide/data-service/create-api-service.md#release330-export-api) to help teams quickly establish and enhance API usage documents.
-- Shared mining functionality supports [configuring task alerts](../user-guide/advanced-settings/share-mining.md#release330-alert), allowing alerts via system notifications or emails for better task monitoring.
-- The [data validation function](../user-guide/verify-data.md) allows setting data filters, enabling validation of specific conditional data only, reducing validation scope and increasing efficiency.
-- In data service platform mode, when dragging a data table to the platform cache layer to generate a task, it supports [setting the synchronization type of the task to be full or incremental](../user-guide/real-time-data-hub/daas-mode/create-daas-task.md#release330-task).
+- [Kafka data source](connectors/mq-and-middleware/kafka.md) now supports custom message body formats.
+- Added the [API interface documentation export feature](publish-apis/create-api-service.md#release330-export-api) to help teams quickly establish and enhance API usage documents.
+- Shared mining functionality supports [configuring task alerts](operational-data-hub/advanced/share-mining.md#release330-alert), allowing alerts via system notifications or emails for better task monitoring.
+- The [data validation function](operational-data-hub/fdm-layer/validate-data-quality.md) allows setting data filters, enabling validation of specific conditional data only, reducing validation scope and increasing efficiency.
+- In data service platform mode, when dragging a data table to the platform cache layer to generate a task, it supports [setting the synchronization type of the task to be full or incremental](operational-data-hub/set-up-odh.md).
### Enhancements
-- Introduced [rolling upgrades](../administration/operation.md#release330-upgrade), which, compared to the downtime upgrade method, further reduces business impacts.
-- Post-error in [shared mining tasks](../user-guide/advanced-settings/share-mining.md), associated tasks now include alert prompts.
-- In the [row filter processing node](../user-guide/data-development/process-node.md), added usage examples when filtering with the DATE type.
-- [Time operation node](../user-guide/data-development/process-node.md#date-calculation) now displays adjusted fields.
+- Introduced [rolling upgrades](platform-ops/operation.md#release330-upgrade), which, compared to the downtime upgrade method, further reduces business impacts.
+- Post-error in [shared mining tasks](operational-data-hub/advanced/share-mining.md), associated tasks now include alert prompts.
+- In the [row filter processing node](data-transformation/process-node.md), added usage examples when filtering with the DATE type.
+- [Time operation node](data-transformation/process-node.md#date-calculation) now displays adjusted fields.
- Optimized algorithm for estimating remaining time for full synchronization.
- Field processing nodes now support one-click copy and paste for configurations.
@@ -778,22 +826,22 @@ import TabItem from '@theme/TabItem';
### New Features
-- In the data platform mode, it can directly [display the relationship of table-level traceability](../user-guide/real-time-data-hub/daas-mode/daas-mode-dashboard.md#release320-daas), helping you to visually show the link relationship of data tables.
-- In the data platform mode, it supports [deleting tables from the platform processing layer](../user-guide/real-time-data-hub/daas-mode/daas-mode-dashboard.md#release320-daas).
-- When configuring the target node of a task, it supports [adjusting field length by a coefficient](../user-guide/copy-data/create-task.md#release320-col-length) to avoid data write failures due to different character encodings.
-- [Data verification](../user-guide/verify-data.md) feature supports SelectDB data source.
-- In scenarios where Redis is the target node, and data is stored in List or Hash format with a single key, it [supports writing the source table schema into a Hash key](../case-practices/pipeline-tutorial/mysql-to-redis.md) (default name is `-schema-key-`). The value is used to store the source table's table name and column name information.
-- Added [**type filter**](../user-guide/data-development/process-node.md#release320-type-filter) processing node, which can quickly filter columns of the same type. Filtered fields will not be passed to the next node.
+- In the data platform mode, it can directly display table-level lineage relationships, helping you visualize how data tables are linked.
+- In the data platform mode, it supports deleting tables from the platform processing layer.
+- When configuring the target node of a task, it supports [adjusting field length by a coefficient](data-replication/create-task.md#release320-col-length) to avoid data write failures due to different character encodings.
+- [Data verification](operational-data-hub/fdm-layer/validate-data-quality.md) feature supports SelectDB data source.
+- In scenarios where Redis is the target node, and data is stored in List or Hash format with a single key, it [supports writing the source table schema into a Hash key](case-practices/pipeline-tutorial/mysql-to-redis.md) (default name is `-schema-key-`). The value is used to store the source table's table name and column name information.
+- Added [**type filter**](data-transformation/process-node.md#release320-type-filter) processing node, which can quickly filter columns of the same type. Filtered fields will not be passed to the next node.
- **Field editing** processing node supports conversion between snake_case and camelCase naming.
-- Data copy tasks, data conversion tasks, data panels, and caching creation support [displaying table description information](../user-guide/copy-data/create-task.md#310-table-model), defaulting to table comment information.
+- Data copy tasks, data conversion tasks, data panels, and caching creation support [displaying table description information](data-replication/create-task.md#310-table-model), defaulting to table comment information.
### Enhancements
-- Product menu adjustments: data development is renamed to [data conversion](../user-guide/data-development/). Some functions have been moved to [advanced settings](../user-guide/advanced-settings/) (e.g., shared cache).
-- Improved interaction for tables without primary keys, e.g., [support for filtering non-primary key tables and adding primary key table identification](../user-guide/copy-data/create-task.md#310-table-model) when configuring data copy tasks.
-- For external storage configurations of MongoDB data sources, [connection testing capability](../user-guide/advanced-settings/manage-external-storage.md#320-external-storage) has been added.
-- When creating a new external storage and choosing MongoDB, it supports [using SSL connections](../user-guide/advanced-settings/manage-external-storage.md#320-external-storage).
-- Creating an HttpReceiver data source now [supports script trial runs](../prerequisites/others/http-receiver.md) and [access authentication functionality](../prerequisites/others/http-receiver.md).
+- Product menu adjustments: data development is renamed to [data conversion](data-transformation/README.md). Some functions have been moved to [advanced settings](operational-data-hub/advanced/manage-function.md) (e.g., shared cache).
+- Improved interaction for tables without primary keys, e.g., [support for filtering non-primary key tables and adding primary key table identification](data-replication/create-task.md#310-table-model) when configuring data copy tasks.
+- For external storage configurations of MongoDB data sources, [connection testing capability](operational-data-hub/advanced/manage-external-storage.md#320-external-storage) has been added.
+- When creating a new external storage and choosing MongoDB, it supports [using SSL connections](operational-data-hub/advanced/manage-external-storage.md#320-external-storage).
+- Creating an HttpReceiver data source now [supports script trial runs](connectors/others/http-receiver.md) and [access authentication functionality](connectors/others/http-receiver.md).
- Standard JS node capabilities adjusted, adding [Linked HashMap data structure](appendix/standard-js.md#linkedhashmap) and [context.global object](appendix/standard-js.md#global).
- **Field editing** processing node's UI interaction has been improved.
- Redundant prompts for task startup and schema reload have been optimized.
@@ -817,17 +865,17 @@ import TabItem from '@theme/TabItem';
### New Features
-- [Data panel functionality](../user-guide/real-time-data-hub/etl-mode) now supports table-level traceability capabilities. You can view data lineage relationships through table details.
-- When [configuring data copy tasks](../user-guide/copy-data/create-task.md#310-table-model), you can view the table model in the processing node.
-- Supports publishing API data services based on Doris data source [Release API Data Services](../user-guide/data-service/create-api-service.md).
-- [Cluster management](../user-guide/manage-system/manage-cluster.md) page allows downloading thread resource monitoring and data source usage data.
+- Data panel functionality now supports table-level lineage; you can view data lineage relationships through table details.
+- When [configuring data copy tasks](data-replication/create-task.md#310-table-model), you can view the table model in the processing node.
+- Supports [publishing API data services](publish-apis/create-api-service.md) based on the Doris data source.
+- [Cluster management](system-admin/manage-cluster.md) page allows downloading thread resource monitoring and data source usage data.
### Enhancements
-- Shared mining task management improved, supporting [starting/stopping mining tasks for individual tables](../user-guide/advanced-settings/share-mining.md#release310-share-mining).
-- [Shared cache](../user-guide/advanced-settings/share-cache.md), [functions](../user-guide/advanced-settings/manage-function.md), [API data services](../user-guide/data-service/create-api-service.md) support import/export functions.
-- [Data verification](../user-guide/verify-data.md) supports configuring alert rules and notification methods.
-- Auto-fill table logic for [data verification](../user-guide/verify-data.md) has been optimized.
+- Shared mining task management improved, supporting [starting/stopping mining tasks for individual tables](operational-data-hub/advanced/share-mining.md#release310-share-mining).
+- [Shared cache](operational-data-hub/advanced/share-cache.md), [functions](operational-data-hub/advanced/manage-function.md), [API data services](publish-apis/create-api-service.md) support import/export functions.
+- [Data verification](operational-data-hub/fdm-layer/validate-data-quality.md) supports configuring alert rules and notification methods.
+- Auto-fill table logic for [data verification](operational-data-hub/fdm-layer/validate-data-quality.md) has been optimized.
- Frontend added explanations for the distinction between [standard JS](appendix/standard-js.md) and [enhanced JS](appendix/enhanced-js.md).
- JS processor standardization, JS usage, and trial run have been restructured.
- In all processing nodes supporting JS scripting, typing `record.` automatically prompts for the current model's field names.
@@ -858,16 +906,16 @@ import TabItem from '@theme/TabItem';
### New Features
-- [Integrated GraphQL capability](../user-guide/data-service/query-via-graphql.md), enriching API query methods.
-- Added [application categorization capability for APIs](../user-guide/data-service/create-api-service.md), facilitating categorization based on business.
-- Introduced [time calculation processing node](../user-guide/data-development/process-node.md#time-calculation) for flexible handling of discrepancies in source and destination database time zones.
-- Introduced [full-scale partitioning capability](../case-practices/best-practice/full-breakpoint-resumption.md), currently only supported for MongoDB.
+- [Integrated GraphQL capability](publish-apis/query/query-via-graphql.md), enriching API query methods.
+- Added [application categorization capability for APIs](publish-apis/create-api-service.md), facilitating categorization based on business.
+- Introduced [time calculation processing node](data-transformation/process-node.md#time-calculation) for flexible handling of discrepancies in source and destination database time zones.
+- Introduced [full-scale partitioning capability](case-practices/best-practice/full-breakpoint-resumption.md), currently only supported for MongoDB.
### Enhancements
-- [Shared cache function](../user-guide/advanced-settings/share-mining.md) improved, offering an observable page to monitor mining progress and troubleshoot failures.
-- [Full custom query function](../user-guide/data-development/create-task.md#full-sql-query) relaxed the restriction of only using JS nodes, now allowing the addition of other processing nodes with the node model directly utilizing the source table's model.
-- The field [processing node](../user-guide/data-development/process-node.md) supporting operations like adding/deleting fields, type modifications, and renaming fields now includes a field search function.
+- Improved the [shared cache function](operational-data-hub/advanced/share-mining.md), which now offers an observability page for monitoring mining progress and troubleshooting failures.
+- The [full custom query function](data-transformation/create-views/README.md#full-sql-query) no longer restricts you to JS nodes; other processing nodes can now be added, with the node model directly using the source table's model.
+- The field [processing node](data-transformation/process-node.md) supporting operations like adding/deleting fields, type modifications, and renaming fields now includes a field search function.
- Adjusted wording for Schema loading frequency configuration in connection settings.
- Optimized the table name modification logic in the **Table Editing Node**: removed the apply button so that configurations take effect directly.
- During the startup of the management process (frontend), it now includes heapDump and stackTrace parameters, similar to the synchronization governance process.
diff --git a/docs/release-notes/README.md b/docs/release-notes/README.md
deleted file mode 100644
index ab7b6eaa..00000000
--- a/docs/release-notes/README.md
+++ /dev/null
@@ -1,11 +0,0 @@
-# Release Notes
-
-import Content from '../reuse-content/_all-features.md';
-
-
-
-Based on the product family you use, select the documents below to stay up-to-date on the latest product developments:
-
-import DocCardList from '@theme/DocCardList';
-
-
diff --git a/docs/release-notes/release-notes-cloud.md b/docs/release-notes/release-notes-cloud.md
deleted file mode 100644
index 459498ad..00000000
--- a/docs/release-notes/release-notes-cloud.md
+++ /dev/null
@@ -1,691 +0,0 @@
-# TapData Cloud Release Notes
-
-import Content from '../reuse-content/_cloud-features.md';
-
-
-
-To enhance the user experience, TapData Cloud continuously enriches and optimizes product features and rectifies known defects by releasing new versions. This article provides an update log for TapData Cloud, helping you grasp the new feature specifications more effectively.
-
-```mdx-code-block
-import Tabs from '@theme/Tabs';
-import TabItem from '@theme/TabItem';
-```
-
-
-```mdx-code-block
-
-
-```
-
-### 2025-04-14
-
-#### New Features
-
-- The [Cluster Overview](../user-guide/workshop.md) page on the homepage now displays task distribution by node, helping you better understand cluster workload.
-- [OceanBase (MySQL Mode)](../prerequisites/on-prem-databases/oceanbase.md), [OceanBase (Oracle Mode)](../prerequisites/on-prem-databases/oceanbase-oracle.md), and [GaussDB (DWS)](../prerequisites/warehouses-and-lake/gaussdb.md) have passed Tapdata certification and are now classified as [Certified Data Sources](../prerequisites/supported-databases.md), offering enhanced features and improved production-level stability.
-- Data replication tasks now support writing multiple tables to the same Kafka topic, expanding compatibility with more write scenarios.
-
-#### Enhancements
-
-- Improved model visualization by adjusting how primary keys, foreign keys, and unique indexes are displayed, making models more readable and easier to edit.
-
-#### Bug Fixes
-
-- Fixed an issue where connection requests were not evenly distributed across multiple `mongos` nodes, eliminating potential single-node performance bottlenecks.
-
-### 2025-04-02
-
-#### New Features
-
-- Added support for syncing tables with auto-increment primary keys in [SQL Server](../prerequisites/on-prem-databases/sqlserver.md).
-- Added support for syncing default values and foreign keys in [PostgreSQL](../prerequisites/on-prem-databases/postgresql.md) to SQL Server sync scenarios.
-
-### 2025-03-19
-
-#### New Features
-
-- Added support for synchronizing **column default values**, **auto-increment columns**, and **foreign key constraints** in [MySQL](../prerequisites/on-prem-databases/mysql.md)-to-MySQL, [PostgreSQL](../prerequisites/on-prem-databases/postgresql.md)-to-PostgreSQL, and [SQL Server](../prerequisites/on-prem-databases/sqlserver.md)-to-PostgreSQL scenarios, ensuring data structure consistency.
-- Enabled foreign key constraint synchronization in [Sybase](../prerequisites/on-prem-databases/sybase.md)-to-PostgreSQL tasks, further enhancing data consistency.
-- Enhanced the **[primary-secondary merge node](../user-guide/data-development/process-node.md#pri-sec-merged)** functionality to allow subsequent connections with other processing nodes (including JS nodes), improving workflow flexibility.
-
-#### Enhancements
-
-- Improved Kafka connector capabilities.
-- Optimized the display of task milestones for better clarity.
-
-#### Bug Fixes
-
-- Fixed an issue where MongoDB sharded configurations could not be automatically synchronized.
-- Resolved a problem where MongoDB capped collections failed to sync correctly.
-- Fixed an issue where merged embedded arrays behaved unexpectedly when modifying relationship keys.
-
-### 2025-03-06
-
-#### Bug Fixes
-
-- Fixed an issue where incremental data lost time precision when synchronizing from Oracle to PostgreSQL.
-- Fixed an issue in primary-secondary merge tasks where changes to the primary table's association conditions caused extra pre-update records in the target data.
-
-### 2025-02-21
-
-#### New Features
-
-- Added support for the [Sybase to PostgreSQL](../prerequisites/on-prem-databases/sybase.md) sync scenario, now supporting synchronization of default values, enumerated types, and sequences.
-- Enabled the ability to define a primary key for tables without a primary key when configuring [Primary-Secondary Merge Nodes](../user-guide/data-development/process-node.md#pri-sec-merged), ensuring data synchronization consistency and improving merge efficiency.
-
-#### Enhancements
-
-- Improved field derivation logic for the Sybase to PostgreSQL sync scenario.
-
-#### Bug Fixes
-
-- Fixed an issue with multi-level associated keys in primary-secondary merges, preventing incorrect merging of child table data.
-
-### 2025-01-24
-
-#### Bug Fixes
-
-- Fixed an issue where heartbeat task startup failures prevented data synchronization tasks from starting properly.
-- Fixed a problem where notification settings were not applied after saving.
-
-### 2025-01-15
-
-#### New Features
-
-- Enhanced [Sybase](../prerequisites/on-prem-databases/sybase.md)-to-PostgreSQL synchronization scenario, adding index migration and **sequence** synchronization features, further improving migration automation and ensuring sequence data consistency.
-
-#### Feature Optimizations
-
-- Optimized the Oracle connection test feature, adding prompts for mismatched case sensitivity between username and schema to improve user experience.
-
-#### Bug Fixes
-
-- Fixed the issue where shared data mining tasks initiated by the admin user could not be used properly by other users.
-
-
-
-
-
-
-### 2024-12-30
-
-#### Enhancements
-
-- Added the ability to download log files from the task monitoring page for easier fault diagnosis.
-- Optimized engine startup to eliminate the need for MongoDB configuration during initialization.
-- Expanded error code coverage and provided more detailed solution hints.
-
-#### Bug Fixes
-
-- Fixed a problem where tasks synchronizing only primary key tables using regex continued to log "new table detected" after adding non-primary key tables.
-
-### 2024-12-17
-
-#### Enhancements
-
-- Optimized and added new engine error codes to help users quickly locate the cause of issues.
-
-#### Bug Fixes
-
-- Fixed an issue where the system failed to start when configuring SSL connections for MongoDB as an intermediate database.
-- Fixed an issue where data was not updated to the target during incremental synchronization when synchronizing Oracle tables with multi-column composite primary keys to GaussDB (DWS).
-- Fixed an issue where the task incorrectly reported missing table creation privileges after synchronizing some tables to MySQL.
-- Fixed a system error that occurred when viewing the source node list of a data mining task.
-- Fixed an issue where the row count displayed in the real-time data platform's table details was inconsistent with the actual row count.
-
-### 2024-11-29
-
-#### Enhancements
-
-- Enabled copy all selected table names during task configuration, improving operational efficiency.
-- Expanded the range of built-in error codes for better issue identification and diagnosis.
-- Enhanced milestone tracking and display logic during task execution.
-- Improved log viewing experience for script processing nodes by supporting split log display.
-
-#### Bug Fixes
-
-- Fixed an issue where syncing PostgreSQL to SQL Server failed to sync newly added partitioned child tables if the parent table’s partitions were not created before task execution.
-- Resolved an issue where MongoDB indexes were not correctly loaded, causing schema loading failures.
-- Fixed an issue where data extraction tasks could get stuck at the table structure replication stage.
-
-
-### 2024-11-15
-
-#### New Features
-
-- Added support for real-time synchronization of PostgreSQL partitioned tables to SQL Server.
-
-#### Enhancements
-
-- Expanded the range of built-in error codes for faster issue identification and diagnosis.
-
-#### Bug Fixes
-
-- Fixed an issue where resetting tasks on the edit page failed, causing a “current status not allowed” error when saving the task.
-- Resolved an issue where removing and re-adding a table being synchronized in a replication task failed to resume synchronization correctly.
-
-### 2024-10-30
-
-#### New Features
-
-- Added HTTPS connection support for [Elasticsearch data sources](../prerequisites/on-prem-databases/elasticsearch.md), enhancing data transmission security to meet more stringent data security and compliance requirements.
-- Enabled support for synchronizing tables without primary keys by adding a hash field (default name: `_no_pk_hash`), ensuring data consistency and stable synchronization in non-primary key scenarios.
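The second feature above works by deriving a synthetic key from the row's contents. As a rough sketch (the hash algorithm and column handling TapData actually uses are not documented here, so treat the details as assumptions):

```python
import hashlib

def no_pk_hash(row):
    """Derive a stable synthetic key for a row in a table without a
    primary key by hashing all column values in a fixed column order.
    Illustrative only: TapData's actual algorithm may differ."""
    canonical = "|".join(f"{k}={row[k]}" for k in sorted(row))
    return hashlib.md5(canonical.encode("utf-8")).hexdigest()

# The same logical row always produces the same hash, regardless of
# the order in which columns arrive:
a = no_pk_hash({"name": "alice", "city": "NYC"})
b = no_pk_hash({"city": "NYC", "name": "alice"})
print(a == b)  # True
```

Because the hash is deterministic over the full row, the target can match updates and deletes for rows that have no natural identifier.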
-
-#### Enhancements
-
-- Enhanced the data filtering logic in Row Filter nodes, ensuring the target is updated to stay consistent when a row changes from matching to no longer matching the filter conditions.
-- Improved support for handling changes in composite primary keys in Oracle.
-
-#### Bug Fixes
-
-- Fixed an issue preventing the display of all tables (completed, in progress, and not started) in full sync details.
-- Corrected inaccuracies in time and milestone statistics.
-- Resolved an issue with MongoDB Atlas functionality when DNS resolution fails.
-
-### 2024-10-17
-
-#### New Features
-
-* Kafka-Enhanced and TiDB have passed the TapData certification testing process and have been upgraded to [Certified Data Sources](../prerequisites/supported-databases.md), providing more advanced features and enhanced production stability.
-
-#### Enhancements
-
-- Added a [Multi-threaded CT Table Polling](../prerequisites/on-prem-databases/sqlserver.md#advanced-settings) option to improve incremental data collection performance for SQL Server environments with a large number of tables (over 500), significantly increasing synchronization efficiency.
-- Optimized the cache management logic for processing nodes, enhancing resource usage efficiency and improving task execution speed.
-- Introduced an automatic retry mechanism for Oracle LogMiner errors caused by exceeding PGA limits, improving fault tolerance.
-
-#### Bug Fixes
-
-- Fixed an issue where, after enabling the heartbeat table, tasks displayed no delay but data was not synchronized.
-- Fixed an issue where not all tags could be viewed when setting tags.
-- Fixed an issue where the task retry start time was incorrectly displayed as 1970.
-- Fixed an issue where index creation failed when Elasticsearch was used as the target database.
-
-### 2024-10-10
-
-#### New Features
-
-* Doris, ClickHouse, KingBaseES-R6, PostgreSQL, SQL Server, and MongoDB have passed the TapData certification testing process and have been upgraded to [Certified Data Sources](../prerequisites/supported-databases.md), providing more advanced features and enhanced production stability.
-* When using PostgreSQL as a source, it is now possible to specify the time point for incremental data in task settings.
-
-#### Enhancements
-
-* When configuring an Elasticsearch data source, the task setup now allows you to select an update strategy for data writing.
-* For data replication tasks, the source node's table selection defaults to primary key tables, with an added prompt message.
-
-#### Bug Fixes
-
-- Fixed an issue where tasks would encounter errors during the incremental phase after enabling the heartbeat table in new tasks.
-- Fixed the issue where tasks got stuck in the full phase and could not move to the incremental phase after a reset.
-
-### 2024-09-20
-
-#### New Features
-
-* MySQL has passed the TapData certification testing process, upgrading it to a [certified data source](../prerequisites/supported-databases.md), providing more comprehensive features and enhanced production stability.
-* Added a [form-based mode](../user-guide/copy-data/quick-create-task.md) for building replication tasks, simplifying the task creation process and improving operational convenience.
-
-#### Enhancements
-
-- Added a new sorting feature for mining tasks based on today's mined volume, making task management and filtering more convenient.
-
-#### Bug Fixes
-
-- Fixed an issue where regular indexes were not properly synchronized after enabling the **Sync Indexes on Table Creation** option, ensuring data synchronization integrity.
-
-### 2024-08-21
-
-#### New Features
-
-- Oracle, Dameng, and Db2 have passed the TapData certification testing process and have been upgraded to [Certified Data Sources](../prerequisites/supported-databases.md), offering richer features and higher production stability.
-- Added [traffic billing](../billing/billing-overview.md) feature for fully managed instances, supporting [traffic bill viewing and payment](../billing/renew-subscribe.md), enabling users to easily monitor traffic usage and manage bills conveniently.
-- For [PostgreSQL](../prerequisites/on-prem-databases/postgresql.md) data sources, incremental data synchronization is now supported using the walminer plugin, catering to more use cases.
-- Data replication tasks now support reading from multiple tables simultaneously, improving parallel processing capabilities and task execution efficiency.
-
-#### Enhancements
-
-- Significantly enhanced data synchronization performance.
-- Improved error messages and high-risk operation warnings.
-- For data sources that do not support hash validation, hash validation is now disabled by default.
-- After full sync tasks are completed, restarting the task will trigger a full resynchronization to ensure data consistency.
-- The Agent deployment page now includes a network whitelist configuration guide, making it easier to configure communication between the Agent and the management console.
-
-#### Bug Fixes
-
-- Fixed an issue where some task monitoring metrics were lost after task completion.
-- Fixed a query efficiency issue caused by missing necessary indexes in the intermediate database, reducing data scan volume.
-- Fixed an issue where selecting "Show only different fields" when downloading data validation discrepancies resulted in downloading all fields.
-- Fixed an issue where the old engine name still appeared in task settings after changing the engine name in cluster management.
-- Fixed a problem where task editing could get stuck during model generation, improving the task editing experience.
-- Fixed an issue where large-scale Agents could not start due to insufficient memory on low-configuration servers.
-- Fixed possible OOM error problems with Agents, enhancing memory management and stability.
-- Fixed an issue where full sync tasks in the cloud version sometimes got stuck in a running state, improving task execution smoothness.
-- Fixed an issue where editing an API showed a duplicate name warning.
-- Fixed an issue where, after stopping a data replication task in the incremental phase and restarting it, the full completion time displayed incorrectly.
-- Fixed an issue with TDengine where SQL statement length exceeded limits when writing to super tables with many fields.
-- Fixed an error occurring in data transformation tasks using TDengine as a source when the table name contained Chinese characters.
-- Fixed potential exceptions when running mining tasks on PostgreSQL data sources.
-- Fixed an issue in Oracle to Doris shared mining tasks where source table DDL events could not be parsed.
-- Fixed issues with insert and delete operations when syncing tables without primary keys from Oracle to PostgreSQL, enhancing synchronization reliability.
-- Fixed specific exception issues during the incremental phase of MongoDB to Kafka data transformation tasks.
-- Fixed an issue where an unexpected `_id` field appeared in the model when synchronizing MongoDB oplog to Kafka.
-- Fixed an issue where MongoDB oplog data replication tasks could not replicate properly during synchronization.
-
-### 2024-08-06
-
-#### New Features
-
-- Enhanced [Data Transformation Task Configuration](../user-guide/data-development/create-task.md) to support reloading of single table models in the source node model preview area, improving loading efficiency.
-- Introduced time detection functionality that automatically detects the time difference between the engine deployment server and the database server and displays it on the task monitoring page.
-
-#### Enhancements
-
-* User-defined business descriptions for fields can now be displayed directly in place of column names in the table sample data.
-
-#### Bug Fixes
-
-- Fixed an issue where some table data counts in the real-time data platform were empty.
-- Fixed an issue where the host was not displayed in the path when publishing APIs in the real-time data platform.
-- Fixed an issue where MongoDB database cursor timeout prevented normal full synchronization.
-- Fixed an issue where the custom SQL filter switch could not be turned on in the source node data filtering settings.
-- Fixed a formatting error in email alerts for the fully managed Agent.
-
-### 2024-07-20
-
-#### New Features
-
-- Added a [Union Node](../user-guide/copy-data/process-node.md) to data replication tasks, enabling the merging (UNION) of multiple tables within the same database. This is useful for data integration and analysis scenarios.
-- [Doris](../prerequisites/warehouses-and-lake/doris.md) data source now supports certificate-free HTTPS connections.
-- MySQL, Oracle, OpenGauss, SQL Server, and PostgreSQL data sources now support enabling the **Hash Sharding** feature in the advanced settings of nodes during task configuration, significantly improving the full data sync speed for large tables.
-- Added support for [VastBase](../prerequisites/on-prem-databases/vastbase.md) data source, with a maturity level of Beta, further enriching the variety of data sources.
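The Hash Sharding feature mentioned above splits a large table's full read into parallel scans keyed on a hash of the primary key. A minimal sketch of the idea (the hash function and split strategy TapData uses are assumptions here):

```python
import zlib

def shard_of(pk_value, num_shards=8):
    """Map a primary-key value to one of num_shards buckets with a
    stable hash. Each of num_shards parallel readers then fetches only
    the rows in its own bucket, so a large table is scanned
    concurrently instead of in one long sequential pass."""
    return zlib.crc32(str(pk_value).encode("utf-8")) % num_shards

# Reader i handles rows where shard_of(pk) == i:
print(shard_of(12345, num_shards=8))
```

A stable hash (here CRC32, rather than Python's seeded `hash()`) matters so that every reader agrees on the bucket assignment across processes.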
-
-#### Enhancements
-
-- Improved synchronization logic for time zone fields.
-- Optimized the display logic of continuous log mining settings, automatically hiding related buttons if the database version does not support this feature.
-
-#### Bug Fixes
-
-- Fixed an issue with incremental sync delay display in MongoDB sync tasks that used shared mining.
-- Addressed unclear error messages and missing detail in error codes when the source MySQL database does not support incremental synchronization.
-- Corrected the format of task warning alerts.
-- Resolved an issue where imported tasks showed running records and the current running record status appeared as "deleting."
-- Fixed an issue with incorrect display when renaming tables in FDM replication tasks.
-- Addressed an issue where editing tasks incorrectly modified the association key when the target table association key was set.
-- Fixed a potential failure when removing fields in Python nodes.
-- Resolved an issue where deleting the primary node in master-slave merge operations caused configuration errors in the master-slave merge node, leading to task errors.
-- Corrected an issue where creating an application in application management incorrectly prompted that the tag name already existed.
-- Fixed garbled text issue with Chinese node names in tasks when a source-side DDL occurs and the engine server is not set to UTF character encoding.
-
-### 2024-07-05
-
-#### Enhancements
-
-* Optimized features in the [Real-Time Data Hub](../user-guide/real-time-data-hub/README.md):
- * The data processing layer now displays all models in the database.
- * The platform cache layer and platform processing layer can be configured with different connections, which cannot be adjusted after setting.
- * Added an API publishing entry.
- * Improved the display of model details.
-* Added field restriction configuration parameters for the ElasticSearch data source.
-* Optimized exception handling logic when enabling the preimage capability for the MongoDB data source.
-
-#### Bug Fixes
-
-- Fixed an issue where some task event statistics might occasionally be missing when reported.
-- Fixed an issue where shared cache tasks without shared mining might encounter errors due to exceeding the log time window if data does not change upon restarting or upgrading the engine.
-- Fixed an issue where disabling the slave node under the MDM_model led to task startup failures.
-- Fixed an issue where the lineage graph in the real-time data hub occasionally failed to display.
-- Fixed an issue where the unset operation on the source table could cause task errors in scenarios where the write mode is updating sub-documents.
-- Fixed an issue where joining collections with time types in MongoDB and MySQL caused errors.
-- Fixed an issue where tasks created in the real-time data hub could not add master-slave merge nodes.
-- Fixed an issue where incremental update events unexpectedly performed lookups in master-slave merge scenarios.
-- Fixed conflict errors when modifying columns in master-slave merge nodes.
-
-### 2024-06-21
-
-#### New Features
-
-* Enhanced [TiDB](../prerequisites/on-prem-databases/tidb.md) data source capabilities with support for real-time incremental synchronization.
-
-#### Enhancements
-
-* Improved the display of primary keys and indexes in the task's table model.
-* Enhanced the model deduction logic, supporting model deduction directly in the engine.
-
-#### Bug Fixes
-
-* Fixed an issue where some exceptions were ignored during data source error handling.
-* Fixed an issue where aggregation tasks using time fields as join keys could not backtrack data.
-* Fixed an issue where delay times were reported incorrectly in mining tasks.
-* Fixed an issue where MySQL as a source would consume a large amount of database memory during initial synchronization of large tables.
-
-### 2024-06-07
-
-#### New Features
-
-* Introduced Mock Source and Mock Target data sources for data migration testing scenarios.
-
-#### Enhancements
-
-* Improved the interaction logic for skipping errors when starting tasks.
-* Improved the loading speed of the connection list.
-
-#### Bug Fixes
-
-* Fixed inconsistencies between the task runtime model and configuration model.
-* Fixed inaccurate task event statistics after filtering source data.
-* Fixed timezone handling issues in Oracle and PostgreSQL synchronization scenarios.
-* Fixed an issue where heartbeat task reset failures could prevent related tasks from starting.
-
-### 2024-05-21
-
-#### New Features
-
-* Added support for dynamically generating date suffixes for target table names when [configuring data transformation tasks](../user-guide/data-development/create-task.md#target-node-set), suitable for daily batch processing scenarios.
-* Added support for setting partitions when configuring Doris data sources.
-* Added support for the Oracle mode of OceanBase data sources, with the data source name OceanBase(Oracle).
-
-#### Enhancements
-
-* Optimized data handling logic when syncing MongoDB to relational databases (e.g., MySQL).
-* Enhanced the Dummy data source to support quickly adding large fields for performance testing scenarios.
-
-#### Bug Fixes
-
-* Fixed an issue where MariaDB could not write data in the `0000-00-00 00:00:00` format to the target.
-* Fixed an issue where heartbeat tasks could not automatically recover after the heartbeat table was mistakenly deleted.
-* Fixed an issue where shared extraction tasks could not be serialized after an error occurred.
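The MariaDB fix above concerns the zero date `0000-00-00 00:00:00`, which MariaDB/MySQL permit but which has no valid calendar representation and is rejected by most targets. One common handling strategy, sketched here as an assumption rather than TapData's documented behavior, is to map it to NULL before writing:

```python
from datetime import datetime

ZERO_DATETIME = "0000-00-00 00:00:00"

def normalize_datetime(value):
    """Map the MariaDB/MySQL zero date to None (NULL on the target);
    parse everything else as a real timestamp. Illustrative sketch."""
    if value == ZERO_DATETIME:
        return None
    return datetime.strptime(value, "%Y-%m-%d %H:%M:%S")

print(normalize_datetime("0000-00-00 00:00:00"))  # None
print(normalize_datetime("2024-05-21 08:30:00"))
```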
-
-### 2024-05-06
-
-#### New Features
-
-* Support for bidirectional data synchronization between MySQL instances and between PostgreSQL instances, better meeting the needs of active-active and disaster recovery scenarios.
-* Support for importing files from [MongoDB Relmig](https://www.mongodb.com/docs/relational-migrator/) version 1.3.0 and above, further enhancing ecosystem integration capabilities.
-* Support for synchronizing MongoDB [Oplog](https://www.mongodb.com/docs/manual/core/replica-set-oplog/) (operation log) data.
-* Support for filtering the time field of tables in the source node’s **[Advanced Settings](../user-guide/data-development/create-task.md#full-sql-query)** when configuring data transformation tasks (e.g., relative dates).
-* Display milestone information for tasks on the [Task List](../user-guide/copy-data/manage-task.md) page, helping users quickly understand key progress statuses.
-
-#### Enhancements
-
-* Improved [Unwind Node](../user-guide/data-development/process-node.md#unwind) functionality, allowing users to set expansion modes, such as **Embedded Objects** or **Flatten Fields**.
-* Enhanced full synchronization detail page display, supporting quick table name filtering.
-
-#### Bug Fixes
-
-* Fixed an issue where adjusting alarm settings could affect normal task operations in certain scenarios.
-* Fixed an issue where adding a new table to a mining task caused the task to display mining task errors.
-
-### 2024-04-26
-
-#### New Features
-
-* [Data replication tasks](../user-guide/copy-data/create-task.md) now support table-level checkpoint resumption, allowing tasks to continue syncing from the last incomplete table upon restart.
-* Added the ability to quickly [set task labels](../user-guide/copy-data/manage-task.md) by dragging and dropping.
-* Added support for MySQL replica architecture, ensuring tasks continue to sync data normally after a failover event.
-
-#### Enhancements
-
-* The Windows version of the Cloud Agent now includes digital certificate signing to avoid installation delays caused by system security prompts.
-* Improved the User Center page layout.
-
-#### Bug Fixes
-
-* Fixed an issue where tasks were failing with Aliyun PolarDB MySQL data sources due to unsupported event types.
-* Corrected a statistical progress display error in the completion metrics of full data synchronization tasks.
-
-### 2024-04-12
-
-#### New Features
-
-* Added support for real-time log parsing of [TiDB data sources](../prerequisites/on-prem-databases/tidb.md), fulfilling incremental data synchronization needs.
-* During the full sync phase from Oracle to MySQL, support has been added for syncing unique and normal indexes that do not utilize functions.
-* Enhanced the task start process to include an option to skip errors encountered during the last run.
-
-#### Enhancements
-
-* Improved DDL synchronization settings in data sync tasks by allowing users to configure DDL statements to ignore (based on regular expressions) when DDL errors occur.
-* Enhanced data verification capabilities to support tasks that include processing nodes.
-* Optimized the data verification results page to quickly filter between consistent and inconsistent tables.
-
-#### Bug Fixes
-
-* Fixed an issue where MongoDB used as external storage failed when storing values in a Map format with keys containing the `.` character.
-* Addressed a looping error that occurred during connection tests for Kafka data sources containing non-JSON topics.
-* Resolved a bug where JS nodes reported errors during trial runs under specific conditions.
-* Fixed an issue with incorrect data results when changing join keys in master-slave merge nodes.
-* Fixed a problem where using RocksDB as cache storage could cause task errors.
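The first bug fix above stems from a MongoDB constraint: field names have historically not been allowed to contain `.` (it is reserved for path traversal), so map keys with dots must be escaped before storage. A round-trip escaping sketch, where the `~`-based scheme is an illustrative assumption rather than TapData's actual encoding:

```python
def escape_key(key):
    """Escape '~' first so that unescaping is unambiguous, then
    replace the '.' that MongoDB rejects in field names."""
    return key.replace("~", "~t").replace(".", "~d")

def unescape_key(key):
    # Reverse order: restore dots before collapsing the '~' escape.
    return key.replace("~d", ".").replace("~t", "~")

print(unescape_key(escape_key("spring.profiles.active")))  # spring.profiles.active
```

Escaping the escape character itself is what makes the mapping invertible even for keys that already contain `~d`.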
-
-### 2024-03-29
-
-#### Enhancements
-
-* To further enhance user experience, Beta and Alpha [data sources](../prerequisites/README.md) now require an application for use, allowing TapData to provide better technical support based on your business scenarios.
-
-#### Bug Fixes
-
-* Resolved an issue where Agents crashed under specific circumstances.
-* Fixed a bug related to importing RM files in MongoDB.
-
-## 2024-03-08
-
-### New Features
-
-* Support for setting [default alarm recipients](../user-guide/workshop.md), allowing customization of alarm receipt email addresses (supports multiple addresses).
-* New options in [DDL synchronization settings](../case-practices/best-practice/handle-schema-changes.md): **Stop Task on DDL Error** and **Automatically Ignore All DDLs**, catering to different business scenario needs.
-* Added a [time field injection](../user-guide/data-development/process-node.md#time_injection) node, allowing the addition of a custom timestamp field to data during synchronization. This provides a more flexible way to capture incremental changes from the source database.
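The time field injection node described above stamps each record with a custom timestamp so downstream consumers can capture incremental changes by filtering on that field. A minimal sketch of the effect (the field name and ISO format are illustrative assumptions; the real node lets you configure them):

```python
from datetime import datetime, timezone

def inject_time_field(record, field_name="_sync_ts"):
    """Add a processing timestamp to a record; "_sync_ts" is a
    hypothetical field name chosen for this example."""
    record[field_name] = datetime.now(timezone.utc).isoformat()
    return record

r = inject_time_field({"id": 1, "name": "alice"})
print(sorted(r))  # ['_sync_ts', 'id', 'name']
```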
-
-### Enhancements
-
-* Optimized task retry logic and interface prompt information.
-* Enhanced the setting for incremental collection timing, supporting quick selection of the incremental time point from the last incremental run.
-* Improved the interaction logic for using external storage with the master-slave merge node.
-
-## 2024-01-26
-
-### New Features
-
-- Added support for [Shared Mining](../user-guide/advanced-settings/share-mining.md), allowing multiple tasks to share incremental logs from the source database, thus avoiding redundant reads and significantly reducing the load on the source database during incremental synchronization.
-- The Shared Mining feature now supports using RocksDB as [local external storage](../user-guide/advanced-settings/manage-external-storage.md) to extend storage for incremental logs.
-
-### Enhancements
-
-- Improved the onboarding process for users from the [Google Cloud Marketplace](https://console.cloud.google.com/marketplace/product/tapdata-public/detail).
-- Added a time filter option for the incremental phase in the [Task Monitoring Page](../user-guide/copy-data/monitor-task.md) to quickly observe RPS (Records Per Second) during the incremental phase.
-- Added warning messages for critical operations that might impact the database (e.g., filtering source table data).
-- Refined the logic for unsubscribing after instance subscription expiration.
-
-### Bug Fixes
-
-- Fixed an issue with the [Primary-Secondary Merge Node](../user-guide/data-development/process-node.md#pri-sec-merged) where changes in the key conditions between the primary and secondary tables resulted in data not matching expectations.
-
-## 2024-01-12
-
-### New Features
-
-* Added support for [Capped Collections](https://www.mongodb.com/docs/manual/core/capped-collections/) in data synchronization between MongoDB databases.
-* Data replication/transformation tasks now have import capabilities. Design your data flow process on [MongoDB Relational Migrator](https://www.mongodb.com/docs/relational-migrator/), export it, and then directly import it into TapData data pipelines from the top right corner, enhancing the convenience of data pipeline design.
-
-### Enhancements
-
-* Enhanced the new user onboarding process, including the ability to collapse prompts and return to previous steps.
-
-### Bug Fixes
-
-* Fixed an issue where JS node model declaration settings showed incorrect prompts on the task editing page.
-* Fixed an issue where the DROP COLUMN operation in Oracle to MySQL synchronization was not syncing correctly.
-* Addressed an issue causing DDL errors when syncing from MySQL to ClickHouse.
-* Fixed instability in tasks due to frequent WebSocket reconnections.
-* Corrected several UI interaction experience issues.
-
-## 2023-12-26
-
-### New Features
-
-* Added support for [Time Series collections](https://www.mongodb.com/docs/manual/core/timeseries-collections/) in MongoDB 5.x and above versions.
-* Added support for [preImage](https://www.mongodb.com/docs/manual/changeStreams/#change-streams-with-document-pre--and-post-images) in MongoDB 6.x and above versions.
-
-### Enhancements
-
-* Improved system prompts when enabling scheduled tasks while reaching the task limit.
-
-### Bug Fixes
-
-* Fixed inaccuracies in checkpoints in multi-table data replication scenarios.
-* Resolved issues with unsubscribed and deleted Agent instances continuing to report heartbeat information.
-* Addressed known UI interaction experience issues.
-
-## 2023-12-08
-
-### New Features
-
-- Added [Azure Cosmos DB](../prerequisites/cloud-databases/azure-cosmos-db.md) as a new data source, enabling full data synchronization to facilitate quick cloud data transfers.
-
-### Enhancements
-
-- Upgraded data source connections, with [SQL Server](../prerequisites/on-prem-databases/sqlserver.md) now supporting SSL connections, enhancing data security.
-- Optimized field type adjustments in [data replication tasks](../user-guide/copy-data/create-task.md), allowing for direct selection of common types from the target database.
-- Improved task source node settings, enabling customization of the number of rows read per batch in the incremental phase, catering to performance needs of incremental synchronization.
-
-### Bug Fixes
-
-- Addressed issues with enhanced JS nodes failing or causing exceptions under certain scenarios.
-- Corrected several UI interaction experience issues for better usability.
-
-
-
-## 2023-11-24
-
-### New Features
-
-* Support for loading table comments on [Oracle data sources](../prerequisites/on-prem-databases/oracle.md#advanced), which can be enabled in the **Advanced Settings** when configuring the data source. This makes it easier to quickly identify the business meaning of tables through their comments.
-* In the task [monitoring page](../user-guide/copy-data/monitor-task.md), support viewing RPS (Records Per Second) information based on the size of events.
-
-### Enhancements
-
-* Enhanced the display effects of resource management and the subscription center pages.
-* When performing data source connection tests, support for displaying connector download progress is now available, helping to quickly grasp connection progress and pinpoint timeout issues.
-
-### Bug Fixes
-
-* Fixed an issue where incremental information was not successfully cleared after resetting and rerunning a task.
-* Fixed an issue where some SaaS data sources displayed incremental timestamps during full data synchronization.
-
-## 2023-11-03
-
-### Enhancements
-
-- Enhanced [Data Source Connection](../prerequisites/README.md) methods, supporting SSL connections for data sources like MySQL, PostgreSQL, Kafka, TiDB, MariaDB, etc., to further enhance data security.
-- Improved user interface interaction logic.
-- To better manage data duplication for updates on non-primary keys, TapData Cloud now supports creating unique indexes.
-
-### Bug Fixes
-
-- Fixed an issue where data synchronization could fail when table names contain `.`.
-- Fixed an issue where task exception messages did not include table names.
-- Fixed an issue with incorrect judgment of task quotas and task count limits when specifying an Agent for a task.
-
-
-## 2023-10-20
-
-### New Features
-
-- Added support for [automatically creating sharded collections](../user-guide/copy-data/create-task.md#advanced-settings) when MongoDB Cluster is set as the target.
-- Added support for the [Unwind Processing Node](../user-guide/data-development/process-node.md#Unwind), which helps you efficiently "unwind" each element in an array, converting each element into an independent data row.
-- Added support for disabling node capabilities when configuring tasks. You can access this feature by hovering over a node, which can help reduce the cost of data flow during processing.
-
-### Enhancements
-
-- When [configuring data replication tasks](../user-guide/copy-data/create-task.md), you can now quickly filter tables with or without primary keys through the "**Selectable table range**" dropdown. Tables with primary keys include those without primary keys but with unique indexes.
-- Added a Demo data source to the onboarding guide flow for new users, helping you quickly complete the tutorial and set up your first data flow task.
-- Optimized the front-end display effects of operation buttons on the engine interface.
-
-### Bug Fixes
-
-- Fixed an issue where an error occurred in MongoDB as a target during an INSERT operation when there was no shard key.
-- Fixed an issue where MongoDB did not support REPLACE properly, and the fields deleted by REPLACE could not be properly removed.
-
-## 2023-10-08
-
-### New Features
-
-- Introduced the [Create Materialized View](../user-guide/data-development/create-materialized-view.md) feature for swift construction of real-time data models.
-- Added capability to fetch read-only access information of [subscribed MongoDB Atlas](../user-guide/real-time-data-hub/daas-mode/enable-daas-mode.md#Procedure).
-- Kafka data source now supports settings for replication factor and partition count.
-- For synchronization between MongoDB instances, added support for `$unset` operations.
-
-### Enhancements
-
-- During the task guidance process, when creating a connection for a fully managed Agent, instructions about the public IP address of the fully managed Agent have been added.
-- Enabled rapid target node location through node search at the top of the data replication/data transformation configuration page.
-
-### Bug Fixes
-
-* Fixed an issue where the wrong category of operation logs was recorded when restarting the Agent via the webpage.
-
-
-
-## 2023-09-20
-
-### New Features
-
-- Added [Python processing node](../user-guide/data-development/process-node.md#python), which supports customizing data processing logic through Python scripts. This offers improved performance compared to the JS processing node.
-- Added a "**Contact Us**" entry point, making it easier for users to quickly reach out to technical support when faced with issues.
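A hypothetical per-record hook in the spirit of the Python processing node above (the real node's function signature and context object are not shown in these notes, so treat this shape as an assumption): the script receives a record, may transform it, and returns it, or returns `None` to drop it.

```python
def process(record, context):
    """Illustrative record-processing hook: enrich or filter each
    record. Returning None drops the record from the pipeline."""
    if record.get("deleted"):
        return None  # filter out soft-deleted rows
    first = record.get("first", "")
    last = record.get("last", "")
    record["full_name"] = f"{first} {last}".strip()
    return record

print(process({"first": "Ada", "last": "Lovelace"}, {})["full_name"])  # Ada Lovelace
```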
-
-### Feature Improvements
-
-- Enhanced [error codes for data sources](../user-guide/error-code-solution.md), covering more scenarios and providing solutions.
-- While setting up email alert notifications, added guidance for binding email addresses.
-- Improved reminders and easy upgrade guide for when the task count reaches its limit.
-
-
-## 2023-08-28
-
-### New Features
-
-- Introduced the [Primary-Secondary Merge Node](../user-guide/data-development/process-node.md#pri-sec-merged), enabling quick construction and real-time updates of wide tables, assisting you in achieving better data analysis.
-- The [Real-Time Data Hub](../user-guide/real-time-data-hub/daas-mode/enable-daas-mode.md) now offers storage instances for free trial, with more new specifications available, including M10, M20, and M30.
-- Added support for connecting [existing MongoDB Atlas instances](../user-guide/real-time-data-hub/daas-mode/enable-daas-mode.md#atlas) as data storage for the Real-Time Data Hub.
-
-### Feature Improvements
-
-- The help documentation shown on the right side when connecting a data source is now embedded online documentation, helping users access the most recent help information.
-- For core data sources (such as Oracle, PostgreSQL, etc.), improved the page parameter descriptions and guidance when creating connections.
-
-### Bug Fixes
-
-- Fixed the issue where users couldn't view the monitoring page for previously run tasks.
-
-
-
\ No newline at end of file
diff --git a/docs/release-notes/release-notes-community.md b/docs/release-notes/release-notes-community.md
deleted file mode 100644
index 2a9aed38..00000000
--- a/docs/release-notes/release-notes-community.md
+++ /dev/null
@@ -1,311 +0,0 @@
-# TapData Community Release Notes
-
-import Content from '../reuse-content/_community-features.md';
-
-
-
-This document introduces the recent release notes for TapData Community. For more information on earlier versions, please refer to the [GitHub Release Page](https://github.com/tapdata/tapdata/releases).
-
-## 3.27.0
-
-### New Features
-
-- The [Cluster Overview](../user-guide/workshop.md) page on the homepage now displays task distribution by node, helping you better understand cluster workload.
-- [OceanBase (MySQL Mode)](../prerequisites/on-prem-databases/oceanbase.md) and [GaussDB (DWS)](../prerequisites/warehouses-and-lake/gaussdb.md) have passed Tapdata certification and are now classified as [Certified Data Sources](../prerequisites/supported-databases.md), offering enhanced features and improved production-level stability.
-- Data replication tasks now support writing multiple tables to the same Kafka topic, expanding compatibility with more write scenarios.
-
-### Enhancements
-
-- Improved model visualization by adjusting how primary keys, foreign keys, and unique indexes are displayed, making models more readable and easier to edit.
-
-### Bug Fixes
-
-- Fixed an issue where connection requests were not evenly distributed across multiple `mongos` nodes, eliminating potential single-node performance bottlenecks.
-
-## 3.26.0
-
-### New Features
-
-- Added support for syncing default values and foreign keys in [PostgreSQL](../prerequisites/on-prem-databases/postgresql.md) to SQL Server sync scenarios.
-
-## 3.25.0
-
-### New Features
-
-- Added support for synchronizing **column default values**, **auto-increment columns**, and **foreign key constraints** in [MySQL](../prerequisites/on-prem-databases/mysql.md)-to-MySQL and [PostgreSQL](../prerequisites/on-prem-databases/postgresql.md)-to-PostgreSQL scenarios, ensuring data structure consistency.
-- Enhanced the **[primary-secondary merge node](../user-guide/data-development/process-node.md#pri-sec-merged)** functionality to allow subsequent connections with other processing nodes (including JS nodes), improving workflow flexibility.
-
-### Enhancements
-
-- Improved Kafka connector capabilities.
-- Optimized the display of task milestones for better clarity.
-
-### Bug Fixes
-
-- Fixed an issue where MongoDB sharded configurations could not be automatically synchronized.
-- Resolved a problem where MongoDB capped collections failed to sync correctly.
-- Fixed an issue where merged embedded arrays behaved unexpectedly when modifying relationship keys.
-
-## 3.24.0
-
-### Bug Fixes
-
-- Fixed an issue in primary-secondary merge tasks where changes to the primary table's association conditions caused extra pre-update records in the target data.
-
-## 3.23.0
-
-### New Features
-
-- Enabled the ability to define a primary key for tables without a primary key when configuring [Primary-Secondary Merge Nodes](../user-guide/data-development/process-node.md#pri-sec-merged), ensuring data synchronization consistency and improving merge efficiency.
-
-### Bug Fixes
-
-- Fixed an issue with multi-level associated keys in primary-secondary merges, preventing incorrect merging of child table data.
-
-## 3.22.0
-
-### Enhancements
-
-- Improved the method for retrieving software version information to prevent version discrepancies caused by page caching.
-
-### Bug Fixes
-
-- Fixed an issue where heartbeat task startup failures prevented data synchronization tasks from starting properly.
-
-## 3.21.0
-
-### Bug Fixes
-
-- Fixed the issue where webhook alerts configured by the admin user could not retrieve all alert data.
-
-## 3.20.0
-
-### Enhancements
-
-- Added the ability to download log files from the task monitoring page for easier fault diagnosis.
-- Optimized engine startup to eliminate the need for MongoDB configuration during initialization.
-- Expanded error code coverage and provided more detailed solution hints.
-
-### Bug Fixes
-
-- Fixed a problem where tasks synchronizing only primary key tables using regex continued to log "new table detected" after adding non-primary key tables.
-
-## 3.19.0
-
-### Enhancements
-
-- Optimized and added new engine error codes to help users quickly locate the cause of issues.
-
-### Bug Fixes
-
-- Fixed an issue where the system failed to start when configuring SSL connections for MongoDB as an intermediate database.
-- Fixed an issue where data was not updated to the target during incremental synchronization when synchronizing Oracle tables with multi-column composite primary keys to GaussDB (DWS).
-- Fixed an issue where the task incorrectly reported missing table creation privileges after synchronizing some tables to MySQL.
-
-## 3.18.0
-
-### Enhancements
-
-- Enabled copying all selected table names during task configuration, improving operational efficiency.
-- Expanded the range of built-in error codes for better issue identification and diagnosis.
-- Enhanced milestone tracking and display logic during task execution.
-- Improved log viewing experience for script processing nodes by supporting split log display.
-
-### Bug Fixes
-
-- Resolved an issue where MongoDB indexes were not correctly loaded, causing schema loading failures.
-- Fixed an issue where data extraction tasks could get stuck at the table structure replication stage.
-
-## 3.17.0
-
-### Enhancements
-
-- Expanded the range of built-in error codes for faster issue identification and diagnosis.
-
-### Bug Fixes
-
-- Fixed an issue where resetting tasks on the edit page failed, causing a “current status not allowed” error when saving the task.
-- Resolved an issue where removing and re-adding a table being synchronized in a replication task failed to resume synchronization correctly.
-
-## 3.16.0
-
-### New Features
-
-- Added HTTPS connection support for [Elasticsearch data sources](../prerequisites/on-prem-databases/elasticsearch.md), enhancing data transmission security to meet more stringent data security and compliance requirements.
-- Enabled support for synchronizing tables without primary keys by adding a hash field (default name: `_no_pk_hash`), ensuring data consistency and stable synchronization in non-primary key scenarios.
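The idea behind the hash key can be sketched as follows. This is a minimal illustration only: TapData's actual hashing algorithm is internal, and the only detail taken from the note above is the default field name `_no_pk_hash`.

```python
import hashlib

def no_pk_hash(row: dict) -> str:
    # Build a canonical "col=value" string in a stable column order,
    # then hash it so identical rows always map to the same key.
    canonical = "|".join(f"{col}={row[col]}" for col in sorted(row))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

row = {"stu_id": 2201101, "name": "Lily"}
row["_no_pk_hash"] = no_pk_hash(row)  # synthetic key for upserts/deletes
```

Because the key is derived deterministically from the row's values, updates and deletes on the source can be matched to the corresponding target row even without a real primary key.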
-
-### Enhancements
-
-- Enhanced data filtering logic in Row Filter nodes, ensuring that target data is updated to maintain consistency when data status changes from meeting to not meeting filter conditions.
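The corrected behavior can be sketched as a small routing function. This is an illustrative model of the semantics described above, not TapData's implementation: when a row's new image stops matching the filter, the stale copy must be deleted from the target.

```python
from typing import Callable, Optional

def route_update(old: Optional[dict], new: dict,
                 predicate: Callable[[dict], bool]) -> str:
    # Decide what a row-filter node should emit downstream for an
    # update event so the target stays consistent with the filter.
    if predicate(new):
        return "upsert"   # row (still) matches: write the new image
    if old is not None and predicate(old):
        return "delete"   # row stopped matching: remove the stale copy
    return "skip"         # row never matched: nothing to replicate

is_active = lambda r: r.get("status") == "active"
```

The "delete" branch is exactly the case this enhancement covers; without it, rows that leave the filter condition would linger in the target.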
-
-### Bug Fixes
-
-- Fixed an issue preventing the display of all tables (completed, in progress, and not started) in full sync details.
-- Corrected inaccuracies in time and milestone statistics.
-- Resolved an issue with MongoDB Atlas functionality when DNS resolution fails.
-
-## 3.15.0
-
-### New Features
-
-* Kafka-Enhanced and TiDB have passed the TapData certification testing process and have been upgraded to [Certified Data Sources](../prerequisites/supported-databases.md), providing more advanced features and enhanced production stability.
-
-### Enhancements
-
-- Optimized the cache management logic for processing nodes, enhancing resource usage efficiency and improving task execution speed.
-
-### Bug Fixes
-
-- Fixed an issue where, after enabling the heartbeat table, tasks displayed no delay but data was not synchronized.
-- Fixed an issue where not all tags could be viewed when setting tags.
-- Fixed an issue where the task retry start time was incorrectly displayed as 1970.
-- Fixed an issue where index creation failed when Elasticsearch was used as the target database.
-
-## 3.14.0
-
-### New Features
-
-* Doris, ClickHouse, PostgreSQL, and MongoDB have passed the TapData certification testing process and have been upgraded to [Certified Data Sources](../prerequisites/supported-databases.md), providing more advanced features and enhanced production stability.
-* When using PostgreSQL as a source, it is now possible to specify the time point for incremental data in task settings.
-
-### Enhancements
-
-* When configuring an Elasticsearch data source, the task setup now allows you to select an update strategy for data writing.
-* For data replication tasks, the source node's table selection defaults to primary key tables, with an added prompt message.
-
-### Bug Fixes
-
-- Fixed an issue where tasks would encounter errors during the incremental phase after enabling the heartbeat table in new tasks.
-- Fixed the issue where tasks got stuck in the full phase and could not move to the incremental phase after a reset.
-
-## 3.13.0
-
-### New Features
-
-* MySQL has passed the TapData certification testing process, upgrading it to a [certified data source](../prerequisites/supported-databases.md), providing more comprehensive features and enhanced production stability.
-
-### Bug Fixes
-
-- Fixed an issue where regular indexes were not properly synchronized after enabling the **Sync Indexes on Table Creation** option, ensuring data synchronization integrity.
-
-## 3.12.0
-
-### New Features
-
-- Dameng has passed the TapData certification testing process and has been upgraded to a [Certified Data Source](../prerequisites/supported-databases.md), offering richer features and higher production stability.
-- For [PostgreSQL](../prerequisites/on-prem-databases/postgresql.md) data sources, incremental data synchronization is now supported using the walminer plugin, catering to more use cases.
-- Data replication tasks now support reading from multiple tables simultaneously, improving parallel processing capabilities and task execution efficiency.
-
-### Feature Enhancements
-
-- Significantly enhanced data synchronization performance.
-- Optimized the layout and structure of menu entries.
-- Improved error messages and high-risk operation warnings.
-- For data sources that do not support hash validation, hash validation is now disabled by default.
-- After full sync tasks are completed, restarting the task will trigger a full resynchronization to ensure data consistency.
-
-### Bug Fixes
-
-- Fixed an issue where some task monitoring metrics were lost after task completion.
-- Fixed a query efficiency issue caused by missing necessary indexes in the intermediate database, reducing data scan volume.
-- Fixed an issue where selecting "Show only different fields" when downloading data validation discrepancies resulted in downloading all fields.
-- Fixed a problem where task editing could get stuck during model generation, improving the task editing experience.
-- Fixed an issue where, after stopping a data replication task in the incremental phase and restarting it, the full completion time displayed incorrectly.
-- Fixed an issue with TDengine where SQL statement length exceeded limits when writing to super tables with many fields.
-- Fixed an error occurring in data transformation tasks using TDengine as a source when the table name contained Chinese characters.
-- Fixed potential exceptions when running mining tasks on PostgreSQL data sources.
-- Fixed an issue in Oracle to Doris shared mining tasks where source table DDL events could not be parsed.
-- Fixed specific exception issues during the incremental phase of MongoDB to Kafka data transformation tasks.
-- Fixed an issue where an unexpected `_id` field appeared in the model when synchronizing MongoDB oplog to Kafka.
-- Fixed an issue where MongoDB oplog data replication tasks could not replicate properly during synchronization.
-
-## 3.11.0
-
-### New Features
-
-- Enhanced [Data Transformation Task Configuration](../user-guide/data-development/create-task.md) to support reloading of single table models in the source node model preview area, improving loading efficiency.
-- Introduced time detection functionality that automatically detects the time difference between the engine deployment server and the database server and displays it on the task monitoring page.
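The detection can be sketched as comparing a server timestamp fetched from the database (for example via `SELECT NOW()`) with the engine host's own clock. This is an assumption-laden illustration of the idea; how TapData measures it internally is not documented here.

```python
from datetime import datetime, timedelta, timezone

def clock_skew(db_now: datetime) -> float:
    # Positive result: the database clock runs ahead of the engine host.
    return (db_now - datetime.now(timezone.utc)).total_seconds()

# Simulate a database whose clock runs 5 seconds ahead of the engine.
simulated_db_time = datetime.now(timezone.utc) + timedelta(seconds=5)
skew = clock_skew(simulated_db_time)
```

A non-trivial skew explains otherwise puzzling "delay" figures on the monitoring page, which is why surfacing it there is useful.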
-
-### Optimizations
-
-* User-defined field business descriptions can now be directly displayed in the column name position of the table sample data.
-
-### Bug Fixes
-
-- Fixed an issue where MongoDB database cursor timeout prevented normal full synchronization.
-- Fixed an issue where the custom SQL filter switch could not be turned on in the source node data filtering settings.
-
-## 3.10.0
-
-### New Features
-
-- Added a [Union Node](../user-guide/copy-data/process-node.md#union-node) to data replication tasks, enabling the merging (UNION) of multiple tables within the same database. This is useful for data integration and analysis scenarios.
-- [Doris](../prerequisites/warehouses-and-lake/doris.md) data source now supports certificate-free HTTPS connections.
-- MySQL, Oracle, OpenGauss, SQL Server, and PostgreSQL data sources now support enabling the **Hash Sharding** feature in the advanced settings of nodes during task configuration, significantly improving the full data sync speed for large tables.
-- Added support for [VastBase](../prerequisites/on-prem-databases/vastbase.md) data source, with a maturity level of Beta, further enriching the variety of data sources.
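The hash-sharding idea from the list above can be sketched like this: hash each primary key into one of N buckets so that N parallel readers each scan a disjoint, roughly equal slice of a large table. The hash function and SQL predicate shown are illustrative assumptions, not TapData's actual implementation.

```python
import hashlib

def shard_of(pk_value, num_shards: int) -> int:
    # Hash the primary key and take it modulo the shard count.
    digest = hashlib.md5(str(pk_value).encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards

# Reader i would then issue something along the lines of:
#   SELECT * FROM big_table WHERE <hash_expr>(pk) % 4 = i
assignments = [shard_of(pk, 4) for pk in range(1, 1001)]
```

Because the buckets partition the key space, the readers never overlap, which is what allows the full sync to scale with the number of readers.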
-
-### Enhancements
-
-- Improved synchronization logic for time zone fields.
-
-### Bug Fixes
-
-- Addressed the unclear error messages and lack of detailed information in the error codes when the source MySQL does not support incremental synchronization.
-- Corrected the format of task warning alerts.
-- Resolved an issue where imported tasks showed running records and the current running record status appeared as "deleting."
-- Addressed an issue where editing tasks incorrectly modified the association key when the target table association key was set.
-- Fixed a potential failure when removing fields in Python nodes.
-- Resolved an issue where deleting the primary node in primary-secondary merge operations caused configuration errors in the primary-secondary merge node, leading to task errors.
-- Fixed garbled text issue with Chinese node names in tasks when a source-side DDL occurs and the engine server is not set to UTF character encoding.
-
-## 3.9.0
-
-### New Features
-
-* Added a new button for using shared mining when creating [Shared Caches](../user-guide/advanced-settings/share-cache.md), simplifying cache task configuration and improving the efficiency and flexibility of cache sharing.
-
-### Enhancements
-
-* Added field restriction configuration parameters for the Elasticsearch data source.
-* Optimized exception handling logic when enabling the preimage capability for the MongoDB data source.
-
-### Bug Fixes
-
-- Fixed an issue where some task event statistics might occasionally be missing when reported.
-- Fixed an issue where shared cache tasks without shared mining might encounter errors due to exceeding the log time window if data does not change upon restarting or upgrading the engine.
-- Fixed an issue where the unset operation on the source table could cause task errors in scenarios where the write mode is updating sub-documents.
-- Fixed an issue where joining collections with time types in MongoDB and MySQL caused errors.
-- Fixed an issue where incremental update events unexpectedly performed lookups in primary-secondary merge scenarios.
-- Fixed conflict errors when modifying columns in primary-secondary merge nodes.
-
-## 3.8.0
-
-### Enhancements
-
-* Improved the display of primary keys and indexes in the task's table model.
-* Enhanced the model deduction logic, supporting model deduction directly in the engine.
-
-### Bug Fixes
-
-* Fixed an issue where some exceptions were ignored during data source error handling.
-* Fixed an issue where aggregation tasks using time fields as join keys could not backtrack data.
-* Fixed an issue with delayed times in mining tasks.
-* Fixed an issue where MySQL as a source would consume a large amount of database memory during initial synchronization of large tables.
-
-## 3.7.0
-
-### New Features
-
-* Introduced Mock Source and Mock Target data sources for data migration testing scenarios.
-
-### Enhancements
-
-* Improved the interaction logic for skipping errors when starting tasks.
-* Improved the loading speed of the connection list.
-
-### Bug Fixes
-
-* Fixed inconsistencies between the task runtime model and configuration model.
-* Fixed inaccurate task event statistics after filtering source data.
-* Fixed timezone handling issues in Oracle and PostgreSQL synchronization scenarios.
-* Fixed an issue where heartbeat task reset failures could prevent related tasks from starting.
\ No newline at end of file
diff --git a/docs/system-admin/README.md b/docs/system-admin/README.md
new file mode 100644
index 00000000..bd2b7c3c
--- /dev/null
+++ b/docs/system-admin/README.md
@@ -0,0 +1,5 @@
+# System Admin
+
+import DocCardList from '@theme/DocCardList';
+
+
\ No newline at end of file
diff --git a/docs/user-guide/manage-system/manage-cluster.md b/docs/system-admin/manage-cluster.md
similarity index 54%
rename from docs/user-guide/manage-system/manage-cluster.md
rename to docs/system-admin/manage-cluster.md
index 78c52c58..43edf00b 100644
--- a/docs/user-guide/manage-system/manage-cluster.md
+++ b/docs/system-admin/manage-cluster.md
@@ -1,27 +1,25 @@
# Manage Clusters
-import Content from '../../reuse-content/_enterprise-features.md';
-
Through the Cluster Management page, you can view the running status of all components within the current cluster, the number of external connections established, and other information. It also supports management operations such as starting, stopping, and restarting services.
## Procedure
-1. [Log in to TapData Platform](../log-in.md) as a system administrator.
+1. Log in to TapData Platform as a system administrator.
2. In the left navigation bar, select **System** > **Cluster**. The default view is **Cluster View**, where you can see the operational status and connection information of each component.
You can also start/stop, and restart services. Note that stopping and restarting operations will affect the normal operation of related services, so please operate during maintenance windows or during business off-peak periods.
- 
+ 
3. On this page, choose the following operations according to business needs.
- * Click  to download the thread resource usage details of the current engine, in JSON format.
+ * Click  to download the thread resource usage details of the current engine, in JSON format.
- * Click  to download the data source usage details of the current engine, in JSON format.
+ * Click  to download the data source usage details of the current engine, in JSON format.
- * Click  to adjust the server name and switch the network card display information.
+ * Click  to adjust the server name and switch the network card display information.
:::tip
@@ -29,10 +27,10 @@ Through the Cluster Management page, you can view the running status of all comp
:::
- * Click  to add custom service monitoring.
+ * Click  to add custom service monitoring.
4. Click **Component View** in the upper right corner, and the page will display the status information of components by category. Additionally, you can assign different tags to multiple synchronization governance services (Agents). These tags can then be specified when configuring data synchronization or transformation tasks.
- 
+ 
-5. If you have deployed the [Raw Log Parsing Service](../../case-practices/best-practice/raw-logs-solution.md), you can click **Log Mining Monitor** to view the resource usage (such as CPU, memory, etc.) of the server where the service is running. This helps you gain a comprehensive understanding of the service’s operational status.
\ No newline at end of file
+5. If you have deployed the [Raw Log Parsing Service](../case-practices/best-practice/raw-logs-solution.md), you can click **Log Mining Monitor** to view the resource usage (such as CPU, memory, etc.) of the server where the service is running. This helps you gain a comprehensive understanding of the service’s operational status.
\ No newline at end of file
diff --git a/docs/user-guide/manage-system/manage-role.md b/docs/system-admin/manage-role.md
similarity index 68%
rename from docs/user-guide/manage-system/manage-role.md
rename to docs/system-admin/manage-role.md
index 29c0ff5f..9b20911d 100644
--- a/docs/user-guide/manage-system/manage-role.md
+++ b/docs/system-admin/manage-role.md
@@ -1,13 +1,16 @@
# Manage Roles
-import Content from '../../reuse-content/_enterprise-features.md';
-
+A role is a collection of one or more permissions. It controls access to features and data across the TapData platform.
+Roles can be assigned to both:
-A role is a collection of one or more permissions. You can grant multiple operation permissions to a role, and then grant the role to a [user](manage-user.md), who will inherit all the permissions within that role. Based on this design, you can pre-create roles based on business needs and then directly assign roles to users when creating them, without the need to configure permissions for each user, thereby simplifying operational management and enhancing security.
+- **[Users](manage-user.md)**, to control what operations they can perform in the TapData UI.
+- **[Clients](../publish-apis/create-api-client.md)**, to control which APIs they are authorized to access under API Services.
+
+By pre-defining roles for typical use cases, you can quickly assign them to users or clients without configuring individual permissions each time, streamlining management and enhancing security.
## Procedure
-1. [Log in to TapData Platform](../log-in.md) as a system administrator.
+1. Log in to TapData Platform as a system administrator.
2. In the left navigation bar, select **System** > **Roles**.
@@ -24,7 +27,7 @@ A role is a collection of one or more permissions. You can grant multiple operat
:::
- 
+ 
* **Associate Users**: Click **Associate Users** for the target role. In the pop-up dialog, select the target user(s) (multiple selections allowed) and click **Confirm**. The user(s) will automatically inherit all permissions of the current role.
diff --git a/docs/user-guide/manage-system/manage-user.md b/docs/system-admin/manage-user.md
similarity index 91%
rename from docs/user-guide/manage-system/manage-user.md
rename to docs/system-admin/manage-user.md
index 2d124090..c9745508 100644
--- a/docs/user-guide/manage-system/manage-user.md
+++ b/docs/system-admin/manage-user.md
@@ -1,13 +1,10 @@
# Manage Users
-import Content from '../../reuse-content/_enterprise-and-community-features.md';
-
-
After TapData is deployed, a system administrator named `admin@admin.com` will be automatically created. To better manage platform operation permissions, you can log into the TapData platform with this account and perform management operations, such as creating users and granting permissions, for other members within the organization.
## Procedure
-1. [Log in to TapData Platform](../log-in.md) as a system administrator.
+1. Log in to TapData Platform as a system administrator.
2. In the left navigation bar, select **System** > **Users**.
diff --git a/docs/user-guide/operation-log.md b/docs/system-admin/operation-log.md
similarity index 77%
rename from docs/user-guide/operation-log.md
rename to docs/system-admin/operation-log.md
index 50fe634c..bedd1b39 100644
--- a/docs/user-guide/operation-log.md
+++ b/docs/system-admin/operation-log.md
@@ -1,16 +1,12 @@
# Operation Log
-import Content from '../reuse-content/_cloud-features.md';
-
-
-
TapData Cloud records user actions in the operation log. You can view recent activity and narrow the results by applying filters.
## Procedure
-1. [Log in to TapData Platform](log-in.md) as a system administrator.
+1. Log in to TapData Platform as a system administrator.
2. In the left navigation panel, click **Operation log**.
diff --git a/docs/system-admin/other-settings/README.md b/docs/system-admin/other-settings/README.md
new file mode 100644
index 00000000..2dabf35a
--- /dev/null
+++ b/docs/system-admin/other-settings/README.md
@@ -0,0 +1,5 @@
+# Other Settings
+
+import DocCardList from '@theme/DocCardList';
+
+
\ No newline at end of file
diff --git a/docs/user-guide/other-settings/check-version.md b/docs/system-admin/other-settings/check-version.md
similarity index 68%
rename from docs/user-guide/other-settings/check-version.md
rename to docs/system-admin/other-settings/check-version.md
index c8b645dc..75af722d 100644
--- a/docs/user-guide/other-settings/check-version.md
+++ b/docs/system-admin/other-settings/check-version.md
@@ -1,9 +1,5 @@
# View System Version
-import Content from '../../reuse-content/_enterprise-and-community-features.md';
-
-
-
In the upper right corner of the TapData platform, click on your account, and then click on **Version** to view the current version.

\ No newline at end of file
diff --git a/docs/user-guide/other-settings/manage-license.md b/docs/system-admin/other-settings/manage-license.md
similarity index 75%
rename from docs/user-guide/other-settings/manage-license.md
rename to docs/system-admin/other-settings/manage-license.md
index a6eff940..1e11ce59 100644
--- a/docs/user-guide/other-settings/manage-license.md
+++ b/docs/system-admin/other-settings/manage-license.md
@@ -1,9 +1,5 @@
# Manage License
-import Content from '../../reuse-content/_enterprise-features.md';
-
-
-
For partners of the TapData platform, you can view and manage your license by clicking on the username and selecting "**License**". Here, you can copy and update your license.

diff --git a/docs/user-guide/other-settings/notification.md b/docs/system-admin/other-settings/notification.md
similarity index 82%
rename from docs/user-guide/other-settings/notification.md
rename to docs/system-admin/other-settings/notification.md
index ea278588..8fb3f656 100644
--- a/docs/user-guide/other-settings/notification.md
+++ b/docs/system-admin/other-settings/notification.md
@@ -1,9 +1,5 @@
# Notification and Alert Settings
-import Content from '../../reuse-content/_enterprise-features.md';
-
-
-
:::tip
If you are using TapData Cloud, notification messages and configuration entry points are located at the top right of the page. You can set notification rules and alert recipients.
@@ -14,7 +10,7 @@ TapData supports custom system and alert settings and integrates with third-part
## Notification Settings
-After [logging into the TapData platform](../log-in.md), click the  > **Notification Settings** at the top right corner. You can set up custom notification rules to automatically trigger notification processes. The main types are task operation notifications and Agent notifications. The specific notification items include:
+After logging into the TapData platform, click the  > **Notification Settings** at the top right corner. You can set up custom notification rules to automatically trigger notification processes. The main types are task operation notifications and Agent notifications. The specific notification items include:

diff --git a/docs/user-guide/other-settings/system-settings.md b/docs/system-admin/other-settings/system-settings.md
similarity index 99%
rename from docs/user-guide/other-settings/system-settings.md
rename to docs/system-admin/other-settings/system-settings.md
index f9972dfc..55259b07 100644
--- a/docs/user-guide/other-settings/system-settings.md
+++ b/docs/system-admin/other-settings/system-settings.md
@@ -1,7 +1,4 @@
# System Settings
-import Content from '../../reuse-content/_enterprise-features.md';
-
-
The system settings feature is mainly used to configure some parameters of the system, such as logging, SMTP, API distribution, and more.
diff --git a/docs/user-guide/README.md b/docs/user-guide/README.md
deleted file mode 100644
index 07755503..00000000
--- a/docs/user-guide/README.md
+++ /dev/null
@@ -1,9 +0,0 @@
-# User Guide
-
-import Content from '../reuse-content/_all-features.md';
-
-
-
-import DocCardList from '@theme/DocCardList';
-
-
diff --git a/docs/user-guide/copy-data/create-task-via-drag.md b/docs/user-guide/copy-data/create-task-via-drag.md
deleted file mode 100644
index e44c72c6..00000000
--- a/docs/user-guide/copy-data/create-task-via-drag.md
+++ /dev/null
@@ -1,32 +0,0 @@
-# Generate Data Pipeline with One Click
-
-import Content from '../../reuse-content/_all-features.md';
-
-
-
-In the **Board** view mode, you can simply drag the source table to the target database to generate a data pipeline with one click, greatly simplifying task configuration and enabling real-time synchronization of source data. This article introduces how to generate a data pipeline.
-
-## Procedure
-
-1. [Log in to TapData Platform](../log-in.md).
-
-2. In the left navigation panel, click **Data Replications**.
-
-3. In the upper right corner of the page, click on **Board** to switch to the Data Board view.
-
-4. On this page, you can conveniently view the data source information you have entered. The page is divided into two columns labeled **Sources** and **Targets & Services** by TapData. This helps you distinguish between the source and target data sources and provides a clear overview of your data connections.
-
- 
-
-5. (Optional) Click the 🔍 icon to find the source table you want to synchronize, then drag it to the target data source on the right.
-
-6. In the pop-up dialog box, fill in a task name that is meaningful for your business, select the synchronization type, and choose whether to run the task.
-
- 
-
- - **Only Save**: Save the task without running it. You can now click on the task name in the target data card to customize the task further. On the redirected task configuration page, you can add [processing nodes](../data-development/process-node.md) to meet requirements such as table structure adjustment (e.g., adding fields), table merging, and building wide tables. Once the setup is complete, click **Start** in the upper right corner of the page.
-
- - **Save and Run**: No additional action is required. TapData will automatically create a data transformation task and run it to synchronize your source tables in real-time to the selected target data source. In this case, the **customer** table in the source MySQL will be synchronized to MongoDB in real-time.
-
- You can also click the task name in the target data card to enter the task monitoring page to see the detailed operation status. For more information, see [Monitoring Tasks](monitor-task.md).
-
diff --git a/docs/user-guide/copy-data/process-node.md b/docs/user-guide/copy-data/process-node.md
deleted file mode 100644
index b0d3b9d7..00000000
--- a/docs/user-guide/copy-data/process-node.md
+++ /dev/null
@@ -1,205 +0,0 @@
-# Adding Processing Nodes to Replication Tasks
-import Content from '../../reuse-content/_all-features.md';
-
-
-
-TapData supports integrating processing nodes into data replication tasks for requirements like data filtering or field adjustments.
-
-## Union Node
-
-With the **Union** node, you can merge multiple tables with similar or identical structures into one table. TapData will combine data with consistent field names, following the rules below:
-
-- If the inferred type length and precision differ, the maximum length and precision are selected.
-- If the inferred types are different, they are converted to a common type.
-- When all source tables have consistent primary key fields, the primary key is retained; otherwise, it is removed.
-- When all source tables have the same field with non-null constraints, the non-null constraint is retained; otherwise, it is removed.
-- Unique indexes from the source tables are not transferred to the target table.
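The type-merging rules above can be sketched with a toy column-merge function. The widening table here is a simplified assumption for illustration; TapData's real inference covers many more types.

```python
def merge_column(a: dict, b: dict) -> dict:
    # Toy widening table for the "convert to a common type" rule.
    widen = {
        frozenset({"int", "decimal"}): "decimal",
        frozenset({"int", "varchar"}): "varchar",
        frozenset({"decimal", "varchar"}): "varchar",
    }
    if a["type"] == b["type"]:
        merged_type = a["type"]
    else:
        merged_type = widen[frozenset({a["type"], b["type"]})]
    return {
        "type": merged_type,
        # Differing lengths/precision: keep the maximum.
        "length": max(a.get("length", 0), b.get("length", 0)),
        # A non-null constraint survives only if every source column has it.
        "nullable": a.get("nullable", True) or b.get("nullable", True),
    }
```

For example, merging a `varchar(20)` with a `varchar(50)` yields `varchar(50)`, and merging an `int` with a `decimal` widens to `decimal`, mirroring the rules listed above.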
-
-**Scenario Example:**
-
-Suppose you want to perform a union operation on two tables, **student1** and **student2**, with the same structure and then store the results in the **student_merge** table. The structure and data of the tables are as follows:
-
-
-
-**Operation Steps:**
-
-1. [Log in to the TapData platform](../log-in.md).
-
-2. In the left navigation bar, click **Data Replication**.
-
-3. Click **Create** on the right side of the page. Drag in the source node, union node, table editor, and target node in sequence from the left side of the page, and then connect them.
-
- 
-
- :::tip
-
- In this scenario, we use the table editor node to specify a new name for the merged table to avoid overwriting the original table data.
-
- :::
-
-4. Click the first node (source node) and select the tables to be merged (**student1** / **student2**) in the right-side panel.
-
-5. Click the **Union** node and choose the name of the merged table.
-
- 
-
-6. Click the **Table Editor** node and specify a unique new name for the table in the database, such as **student_merge**.
-
-7. Click the target node, preview the table structure, and confirm it is correct. Click **Start** in the upper right corner.
-
-**Result Verification:**
-
-Query the **student_merge** table, and the result is as follows:
-
-```sql
-mysql> select * from student_merge;
-+---------+------+--------+------+-------+--------+
-| stu_id | name | gender | age | class | scores |
-+---------+------+--------+------+-------+--------+
-| 2201101 | Lily | F | 18 | NULL | NULL |
-| 2201102 | Lucy | F | 18 | NULL | NULL |
-| 2201103 | Tom | M | 18 | NULL | NULL |
-| 2202101 | Lily | F | 18 | 2 | 632 |
-| 2202102 | Lucy | F | 18 | 2 | 636 |
-| 2202103 | Tom | M | 18 | 2 | 532 |
-+---------+------+--------+------+-------+--------+
-6 rows in set (0.00 sec)
-```
-
-## Table Edit Node
-
-The table edit node is mainly used to adjust table names. Add a **Table Edit** node to the canvas and connect it to the data source, then click the node to select an operation (applied to all tables):
-
-- Rename Tables
-- Adjust Table Name Case
-- Add Prefix/Suffix to Table Names
-
-Additionally, you can directly specify a new name for individual target tables.
-
-
-
-## Column Edit Node
-
-The column edit node is mainly used to rename table fields or change their case. Add a **Column Edit** node to the canvas and connect it to the data source, then click the node to select a uniform field-name handling method (applied to all tables). You can also click a target field name directly to adjust individual fields:
-
-- Convert to Upper Case: e.g., from `claim_id` to `CLAIM_ID`
-- Convert to Lower Case: e.g., from `CLAIM_ID` to `claim_id`
-- Convert Snake Case to Camel Case: e.g., from `CLAIM_ID` to `claimId`
-- Convert Camel Case to Snake Case: e.g., from `claimId` to `CLAIM_ID`
-
-Additionally, you can select a target field and click **Mask** to exclude it from being passed to the next node.
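The case conversions above can be sketched with two small helpers. This is an illustrative sketch, not the node's internal code; it is shown here producing lowercase snake case:

```javascript
// Camel case -> snake case: insert "_" before each uppercase letter, lowercase all
const toSnakeCase = (name) =>
  name.replace(/([a-z0-9])([A-Z])/g, "$1_$2").toLowerCase();

// Snake case -> camel case: lowercase all, then uppercase the letter after each "_"
const toCamelCase = (name) =>
  name.toLowerCase().replace(/_([a-z0-9])/g, (_, c) => c.toUpperCase());

toCamelCase("CLAIM_ID"); // -> "claimId"
toSnakeCase("claimId");  // -> "claim_id"
```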
-
-
-
-## JS Node
-
-The JS node supports manipulating data with JavaScript scripts or Java code. Connect the node to both the source and target nodes before editing the code; editing is unavailable until the node is connected.
-
-
-
-After scripting, use the test button below the node to view inputs and outputs, aiding in debugging.
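A minimal sketch of what a JS node script might do: drop unwanted records and derive a field before passing each record on. The function name and signature here are illustrative; the actual entry point may differ by TapData version.

```javascript
// Illustrative JS node logic: filter and enrich one record at a time.
function processRecord(record) {
  // Drop records without a customer id
  if (!record.customer_id) return null;

  // Derive a normalized field before passing the record to the next node
  record.email_lower = String(record.email || "").toLowerCase();
  return record;
}

processRecord({ customer_id: 1, email: "Lily@Example.COM" });
// -> { customer_id: 1, email: 'Lily@Example.COM', email_lower: 'lily@example.com' }
```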
-
-### JS Node Model Declaration
-
-For JS nodes, TapData infers the node's model by running a trial with sample data. If the inferred model is inaccurate or missing fields, you can use a model declaration to explicitly define the field information.
-
-
-
-The model declaration in replication tasks supports the following methods:
-
-```javascript
-// Add a field if it doesn't exist
-TapModelDeclare.addField(schemaApplyResultList, 'fieldName', 'TapString')
-// Remove an existing field
-TapModelDeclare.removeField(schemaApplyResultList, 'fieldName')
-// Update an existing field
-TapModelDeclare.updateField(schemaApplyResultList, 'fieldName', 'TapString')
-// Update or add a field
-TapModelDeclare.upsertField(schemaApplyResultList, 'fieldName', 'TapString')
-// Set a field as primary key
-TapModelDeclare.setPk(schemaApplyResultList, 'fieldName')
-// Unset a primary key
-TapModelDeclare.unsetPk(schemaApplyResultList, 'fieldName')
-// Add an index
-TapModelDeclare.addIndex(schemaApplyResultList, 'indexName', [{'fieldName':'fieldName1', 'order': 'asc'}])
-// Remove an index
-TapModelDeclare.removeIndex(schemaApplyResultList, 'indexName')
-```
-
-### JS Built-in Function Explanation
-
-- [Standard JS Built-in Functions](../../appendix/standard-js.md): Useful for data record manipulation and calculation, such as converting date strings into Date objects.
-- [Enhanced JS Built-in Functions (Beta)](../../appendix/enhanced-js.md): On top of standard JS functions, supports external calls (e.g., network, database).
-
-## Time Operations
-
-In scenarios where source and target databases are in different time zones, operations on date/time fields, like adjusting hours, are necessary. This requirement can be fulfilled using a time operation node.
-
-**Scenario Example**:
-
-In this case, the source database is in UTC+8, and the target database is in UTC+0, with an 8-hour difference.
-
-**Operational Process**:
-
-1. Log into the TapData platform.
-2. Navigate to **Data Pipeline** > **Data Replication** and click **Create**.
-3. Drag the source and target data sources to the canvas, followed by a time operation node, and connect them sequentially.
-4. Configure the source node and select the tables.
-5. Click on the **Time Operation** node and in the right panel, select the time type and operation method.
-
- 
-
- - **Node Name**: Defaults to the connection name, but you can set a meaningful name.
-   - **Select the time type to operate on**: TapData automatically detects the supported time types; choose based on your business requirements. You can also click the **Model** tab to see the relationship between time types and column names.
- - **Select the operation method**: Supports adding or subtracting time, in integer hours. In this case, we choose to subtract 8 hours.
-
-6. Complete the configuration for the target node and the task. For specific steps, see [Creating a Data Replication Task](create-task.md).
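The subtraction performed by the node can be sketched in plain JavaScript (illustrative only, not the node's internal code): shift a DATETIME value by a fixed number of hours, here UTC+8 to UTC+0.

```javascript
// Shift a DATETIME value by an integer number of hours.
function shiftHours(isoString, hours) {
  const d = new Date(isoString);
  d.setUTCHours(d.getUTCHours() + hours); // Date handles day rollover
  return d.toISOString();
}

shiftHours("2021-09-01T09:10:00.000Z", -8); // -> "2021-09-01T01:10:00.000Z"
```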
-
-**Result Verification**:
-
-Query the same ID data from both the source and target tables, and you'll notice that the time has been adjusted by 8 hours as set.
-
-```sql
--- Source table query result
-SELECT birthdate FROM customer_new WHERE id="00027f47eef64717aa8ffb8115f1e66a";
-+-------------------------+
-| birthdate |
-+-------------------------+
-| 2021-09-01 09:10:00.000 |
-+-------------------------+
-1 row in set (0.00 sec)
-
--- Target table query result
-SELECT birthdate FROM customer_new_time WHERE id="00027f47eef64717aa8ffb8115f1e66a";
-+-------------------------+
-| birthdate |
-+-------------------------+
-| 2021-09-01 01:10:00.000 |
-+-------------------------+
-```
-
-## Type Filtering
-
-When synchronizing data between heterogeneous data sources, some data types that the target database does not support may also have no business value. In such cases, the **Type Filtering** node can quickly filter out all columns of the unwanted types; the filtered fields are not passed to the next node.
-
-Operation: Add the **Type Filtering** node to the canvas and connect it to the data source, then click the node and select the field types to filter.
-
-
-
-:::tip
-
-Precision specification for filtered field types is not yet supported. For instance, if the field type to be filtered is **varchar**, then **varchar(16)**, **varchar(12)**, etc., will all be filtered.
-
-:::
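The base-type matching described in the tip can be sketched as follows (an illustrative simulation, not TapData's implementation): precision is stripped before comparison, so filtering `varchar` removes `varchar(12)`, `varchar(16)`, and so on.

```javascript
// Reduce a column type to its base type, ignoring precision, e.g. "varchar(16)" -> "varchar"
function baseType(columnType) {
  return columnType.replace(/\(.*\)$/, "").trim().toLowerCase();
}

// Keep only columns whose base type is not in the drop list
function filterColumns(columns, typesToDrop) {
  const drop = new Set(typesToDrop.map((t) => t.toLowerCase()));
  return columns.filter((c) => !drop.has(baseType(c.type)));
}

filterColumns(
  [
    { name: "id", type: "int" },
    { name: "name", type: "varchar(16)" },
    { name: "memo", type: "VARCHAR(255)" },
  ],
  ["varchar"]
);
// -> [{ name: 'id', type: 'int' }]
```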
-
-
-
-## Time Field Injection
-
-In real-time data integration and synchronization processes, capturing and synchronizing incremental data is key to ensuring data consistency and timeliness.
-
-When the data source lacks full CDC support, or permission restrictions prevent access to incremental logs, you can add a Time Field Injection node to the data synchronization pipeline. This node automatically adds a timestamp field to the data read from the source table. In the target table's configuration, this field (of DATETIME type) can then be selected for polling to retrieve incremental data, further increasing the flexibility of real-time data acquisition.
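The injection step can be sketched as stamping each record read from the source. The field name below is illustrative, not TapData's actual injected field name:

```javascript
// Stamp each record with a DATETIME field the target can later poll on.
function injectTimestamp(record, fieldName = "_sync_time") {
  return { ...record, [fieldName]: new Date() };
}

const stamped = injectTimestamp({ id: 1 });
// stamped._sync_time is a Date value suitable for polling-based
// incremental retrieval on the target side.
```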
-
-
-
\ No newline at end of file
diff --git a/docs/user-guide/copy-data/quick-create-task.md b/docs/user-guide/copy-data/quick-create-task.md
deleted file mode 100644
index b1088385..00000000
--- a/docs/user-guide/copy-data/quick-create-task.md
+++ /dev/null
@@ -1,81 +0,0 @@
-# Quickly Create Data Replication Task
-
-import Content from '../../reuse-content/_cloud-features.md';
-
-
-
-The data replication feature helps you achieve real-time synchronization between homogeneous or heterogeneous data sources. It is suitable for various business scenarios such as data migration/synchronization, disaster recovery, and read performance expansion. TapData supports a form-based wizard to guide you through creating replication tasks quickly. This article explains the detailed process.
-
-## Steps
-
-:::tip Best Practices
-To build efficient and reliable data replication tasks, it is recommended to read Data Synchronization Best Practices before starting to configure tasks.
-:::
-
-1. [Log in to TapData platform](../log-in.md).
-
-2. In the left navigation bar, click **Data Replication**.
-
-3. Click **Quickly Create Task** on the right side of the page to go to the task creation form.
-
-4. First, select the data source as the source database. You can choose to **Create New Connection** or **Select Existing Connection**.
-
- 
-
- :::tip
-
- The following steps demonstrate how to synchronize MySQL to MongoDB in real-time using the **Select Existing Connection** option. The process is similar for other data sources. For information on how to create data sources in advance, see [Connecting Data Sources](../../prerequisites/README.md).
-
- :::
-
-5. Select **Existing Connection** and choose MySQL as the source database, then click **Next**.
-
-6. Select **Existing Connection** and choose MongoDB as the target database, then click **Next**.
-
-7. In the **Configure Task** step, configure the task details according to the instructions below.
-
- 
-
- * **Task Name**: Enter a meaningful name.
- * **Sync Type**: The default is **Full and Incremental Sync**, but you can also choose **Full Sync** or **Incremental Sync** separately.
- - **Full Sync**: Copies the existing data from the source to the target.
- - **Incremental Sync**: Copies new data or data changes in real-time from the source to the target.
- - The combination of both can be used for real-time data synchronization scenarios.
- * **Duplicate Handling Policy**: Choose based on your business needs. The default is to **Preserve the Original Table Structure and Data on the Target Side**.
- * **Select Tables**: Choose based on your business needs.
- - **Select by Table Name**: Select the tables in the area to be replicated, then click the right arrow to complete the setup.
- - **Select by Regular Expression**: Enter the table name's regular expression. Additionally, if a new table in the source database matches the expression, it will be automatically synchronized to the target database.
-   * **Selectable table range**: By default, all tables are displayed; you can also filter for only tables with primary keys or only tables without them. Because tables without primary keys use all of their columns as a substitute key to apply data updates, they may fail when the combined index exceeds the length limit, and update performance may suffer. It is therefore recommended to replicate tables without primary keys in a separate task to avoid errors and improve update performance.
-
-8. Click **Next** to configure more task settings.
-
- 
-
- * **Basic Settings**
- - **Full Multi-thread Writing**: The number of concurrent threads for writing full data. It is disabled by default.
- - **Multi-threaded Incremental Write**: The number of concurrent threads for incremental data writing. It is disabled by default. You can enable and adjust based on the write performance of the target database.
- - **Number of Writes Per Batch**: The number of entries per batch during full synchronization.
- - **Write The Maximum Waiting Time for Each Batch**: Set the maximum wait time based on the target database performance and network latency. The unit is milliseconds.
- - **DDL Synchronization Settings**: Select the DDL event handling strategy. The default is **Automatically Ignore All DDL**. You can choose **Sync DDL Events** and select the DDL events to capture, typically including **Add Column**, **Change Column Name**, **Modify Column Attribute**, **Drop Column**. For more information, see [Handling Schema Changes](../../case-practices/best-practice/handle-schema-changes.md).
- - **Data Read Settings**: Choose the number of entries to read per batch in the full and incremental stages. The default values are 500 and 1, respectively. You can also choose to enable **Enable Concurrent Table Reading** (suitable for scenarios with many small tables).
- - **Data Write Settings**: Choose the data write strategy:
-      - **Process by Event Type**: After selecting this option, you need to choose the data write strategy for insert, update, and delete events.
- - **Append Write**: Only handles insert events, discards update and delete events.
- * **Advanced Settings**
- - **When the Event Processing is Abnormal**: The default is to retry, but you can choose to skip the erroneous event and continue the task.
- - **Other Settings**: Set the task start time, shared mining, periodic scheduling, dynamic memory adjustment, incremental data processing mode, processor threads, Agent, etc.
- * **Alert Settings**: By default, if the average processing time of the node is greater than or equal to 5 seconds for 1 minute, a system and email notification will be sent. You can adjust the rules or disable alerts based on your business needs.
-
-9. After configuration, click **Start Task**.
-
-10. After the task starts successfully, it will automatically jump to the task monitoring page, where you can view information such as RPS (records per second), latency, and task events.
-
- Additionally, to ensure the normal operation of the task, TapData will perform a pre-check based on node configuration and data source characteristics. You can view the printed log information at the bottom of the page.
-
- 
-
-## See also
-
-* [Processing Nodes](process-node.md): Combine multiple processing nodes and data sources to achieve more complex and customized data flow capabilities.
-* [FAQ](../../faq/data-pipeline.md): Common issues and solutions when using the data replication feature.
diff --git a/docs/user-guide/data-development/README.md b/docs/user-guide/data-development/README.md
deleted file mode 100644
index 8290de95..00000000
--- a/docs/user-guide/data-development/README.md
+++ /dev/null
@@ -1,11 +0,0 @@
-# Data Transformation
-
-import Content from '../../reuse-content/_all-features.md';
-
-
-
-To configure data transformation tasks with advanced data processing requirements such as table merging, data splitting, field adding/deleting, and data sharing without impacting the user's business, you can add processing nodes. This enables data processing, disaster recovery, and analysis scenarios.
-
-import DocCardList from '@theme/DocCardList';
-
-
\ No newline at end of file
diff --git a/docs/user-guide/data-development/create-materialized-view.md b/docs/user-guide/data-development/create-materialized-view.md
deleted file mode 100644
index 40f2d6eb..00000000
--- a/docs/user-guide/data-development/create-materialized-view.md
+++ /dev/null
@@ -1,53 +0,0 @@
-# Build Real-Time Materialized Views (Beta)
-
-import Content from '../../reuse-content/_all-features.md';
-
-
-
-Materialized views are specialized database objects that cache the results of intricate queries, thereby accelerating data retrieval. With TapData, you can build real-time materialized views across diverse data sources, ensuring data accuracy and freshness while streamlining data management and application development.
-
-## Background
-
-In the era of exponential data growth, enterprises and developers grapple with complex data management challenges. Conventional data handling, such as manually managing and syncing various related tables, is inefficient and poses risks to data consistency. Thus, effective and real-time data integration tools become paramount.
-
-TapData's real-time materialized view feature is designed to address these challenges, seamlessly integrating varied data sources. It ensures that the view is auto-updated whenever there's a change in the source data, preserving its timeliness and accuracy. This real-time and automated nature greatly diminishes data management complexities while boosting query efficiency.
-
-To demonstrate its practicality, let's consider an e-commerce platform. Order management is central to such platforms, with critical tables like orders, sub-orders, products, user info, and logistics. Assuming the team chooses MongoDB, and aims to merge the data from these tables into a new 'order' table, TapData makes this task effortless.
-
-Next, we'll detail how to employ TapData's real-time materialized view feature in an e-commerce context, to grasp its potency.
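To make the target concrete, the view built in the steps below produces one self-contained document per order: user info as an embedded document, sub-orders as an embedded array with flattened product and logistics fields. The field names in this sketch are illustrative:

```javascript
// Illustrative shape of one document in the resulting order view.
const orderView = {
  order_id: "o-1001",
  user: { user_id: "u-1", name: "Lily" }, // Embedded Document (step 3.2-3.3)
  sub_orders: [                           // Embedded Array (step 3.4)
    {
      sub_order_id: "s-1",
      product_name: "Phone",   // flattened product info (step 3.5)
      logistics_no: "LG-9",    // flattened logistics info (step 3.5)
    },
  ],
};
```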
-
-## Procedure
-
-1. [Log in to TapData Platform](../log-in.md).
-2. In the left navigation panel, click **Data Transformation**.
-3. Click **Build Materialized View** on the right, leading you to the task configuration page.
-
- 1. Select the database and table for your materialized view. In this case, choose the **order** table.
-
- 
-
- 2. As we intend to include user info, product tables, etc., first click **+ Add Field** and select **Embedded Document**, naming the field **user**.
- 3. In the popped field editor, sequentially set the associated database, table, and relation conditions. In our case, link to the **users** table via **user_id**.
-
- After setup, the **orders** table will feature an embedded document field named **user**.
-
- 
-
- 4. Add a **sub_orders** field for storing sub-order info. Click **+ Add Field** on the **orders** table, choose **Embedded Array**, name it **sub_orders**, and follow the previous step for table and relation conditions.
- 5. Add product and logistics info inside the **sub_orders** field. This time, click **+ Add Field** on the **sub_orders** table, select **Flatten**, then complete the table and relation conditions.
-
- Once all setups are done, the relationships appear as depicted below. The **orders** table now encapsulates all the table info.
-
- 
-
-4. Click the **+ Write Target** at the top right of the **orders** table editor, then select the MongoDB data source and collection name.
-
- As shown below, on the right, you can view the field types and details of the target collection **order_view**.
-
- 
-
-5. Click the **X** icon at the top left to return to the task configuration page. Click **Start** at the top right to finalize the real-time materialized view setup.
-
- Once initiated, we'll be redirected to the task monitor page, where you can observe the task's RPS (Records Per Second), latency, events, and more.
-
- 
\ No newline at end of file
diff --git a/docs/user-guide/data-service/create-api-client.md b/docs/user-guide/data-service/create-api-client.md
deleted file mode 100644
index 8b6b0ff8..00000000
--- a/docs/user-guide/data-service/create-api-client.md
+++ /dev/null
@@ -1,22 +0,0 @@
-# Create a Client
-import Content from '../../reuse-content/_enterprise-features.md';
-
-
-
-To manage and create API calls, an API client is required. Applications that developers design and develop, or any other applications needing to call API interfaces (referred to collectively as client applications), must register with the data publishing system. Upon registration, you will receive a unique client ID (client_id) and client secret (client_secret).
-
-## Procedure
-
-1. [Log in to TapData Platform](../log-in.md).
-
-2. In the left navigation bar, select **Data Services** > **API Clients**.
-
-3. Click **Create a Client** in the top right corner, fill in the relevant information, and click **OK**.
-
- 
-
- :::tip
-
-   The client secret is the key credential that client applications use to obtain API access authorization. Store it securely and avoid transmitting it over public networks.
-
- :::
diff --git a/docs/user-guide/data-service/create-api-service.md b/docs/user-guide/data-service/create-api-service.md
deleted file mode 100644
index 2a5444be..00000000
--- a/docs/user-guide/data-service/create-api-service.md
+++ /dev/null
@@ -1,44 +0,0 @@
-# Create Data API
-import Content from '../../reuse-content/_enterprise-features.md';
-
-
-
-To help developers easily integrate with interfaces and view API information published through TapData, we offer the data services feature.
-
-## Supported Data Sources
-
-Currently, it supports Doris, MongoDB, MySQL, Oracle, PostgreSQL, SQL Server, and TiDB.
-
-## Procedure
-
-1. [Log in to TapData Platform](../log-in.md).
-
-2. In the left navigation bar, choose **Data Services** > **API List**.
-
-3. Click **Create API** at the top right of the page, then complete the settings on the right panel according to the instructions below.
-
- 
-
- * **Service Name**: Enter a service name with business significance for easy identification in the future.
- * **Owner Application**: Select the affiliated application for convenient business category management. For more introduction, see [Application Management](manage-app.md).
- * **Connection Type**, **Connection Name**, **Object Name**: Choose the object to query based on business needs.
-   * **Interface Type**: Choose **Default Query** or **Custom Query**. When selecting **Custom Query**, you can set filtering and sorting conditions at the bottom of the page.
- * **API Path Settings**: Choose according to business needs.
- * **Default Path**: TapData randomly generates a unique access address.
- * **Custom Path**: The access path consists of **Version**, **Prefix**, and **Basic Path**, formatted as `/api/version/prefix/basic_path`. It supports letters, numbers, underscores (_), and dollar signs ($), but cannot start with a number.
- * **Input Parameters**: Allows modification of parameter default values.
- * **Output Results**: Supports setting the fields contained in the output results.
-
-4. Click **Save** at the top right of the page, then click **Generate** at the bottom right of the page.
-
-5. Find the service you just created and click **Publish** on its right to use the related service.
-
-6. (Optional) Click the service you just created, select the **Debug** tab in the right panel, enter request parameters, and click **Submit** to verify service availability.
-
- 
-
-7. (Optional) For the data services you have created, you can select and export them for backup or sharing with other team members. You can also import data services.
-
- 
-
-   Additionally, for published data services, you can select them and click **API Document Export** to quickly create API usage documentation for your team. The exported Word file (docx format) includes the data service name, API description, and GET/POST parameter descriptions.
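The custom path rule above (letters, digits, underscores, and dollar signs; segments may not start with a digit) can be checked with a small validator. This helper is illustrative, not part of TapData:

```javascript
// One path segment: letters, digits, "_" and "$", not starting with a digit.
const SEGMENT = /^[A-Za-z_$][A-Za-z0-9_$]*$/;

// Expected shape: /api/version/prefix/basic_path
function isValidApiPath(path) {
  const parts = path.split("/").filter(Boolean);
  return parts.length >= 2 && parts.every((p) => SEGMENT.test(p));
}

isValidApiPath("/api/v1/crm/customer");  // -> true
isValidApiPath("/api/v1/crm/1customer"); // -> false (segment starts with a digit)
```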
diff --git a/docs/user-guide/data-service/query-via-restful.md b/docs/user-guide/data-service/query-via-restful.md
deleted file mode 100644
index 1802efea..00000000
--- a/docs/user-guide/data-service/query-via-restful.md
+++ /dev/null
@@ -1,50 +0,0 @@
-# Query API through RESTful
-import Content from '../../reuse-content/_enterprise-features.md';
-
-
-
-RESTful API is an application programming interface (API or Web API) that adheres to REST architectural specifications. TapData supports integrated RESTful API services, allowing you to execute requests through the API service address and obtain managed data information.
-
-In this article, we will introduce how to use Postman to invoke API requests.
-
-## Procedure
-
-1. [Log in to TapData Platform](../log-in.md).
-
-2. In the left navigation bar, select **Data Services** > **API List**.
-
-3. Obtain the service access address and Access Token authentication information.
-
- 1. Locate and click on the target service name.
-
- 2. Scroll down to the service access area in the right-hand panel and get the address for service access. In this case, we will demonstrate the procedure using a **GET** type service as an example.
-
- 
-
- 3. Click the **Debug** tab, scroll down to **Example Code**, and obtain the Access Token authentication information.
-
- 
-
-4. Open the [Postman tool](https://www.postman.com/), and click **Workspaces** at the top of the software page, and select your Workspace.
-
-5. Click **New**, and in the pop-up dialog box, select **HTTP Request**.
-
- 
-
-6. In the Request URL text box, enter the API query request address you obtained in step 3.
-
-7. (Optional) Click **Query Params** below the text box and set the query request parameters. For an introduction to the supported request parameters, please refer to step 3.
-
-8. Click **Authorization** below the text box, select **Type** as **Bearer Token**, and fill in the Access Token authentication information obtained in step 3.
-
- 
-
-9. Click **Send**; a sample response is shown below.
-
- 
-
- :::tip
-
- TapData supports adding query conditions to the URL query string to quickly filter query results. For specific operations, see [API Query Parameter Description](api-query-params.md).
-
- :::
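The same request can also be issued from code instead of Postman. The service URL, query parameter, and token below are placeholders for the values obtained in step 3:

```javascript
// Placeholders for the values obtained in step 3.
const serviceUrl = "https://tapdata.example.com/api/v1/demo/customer";
const accessToken = "<your-access-token>";

// Optional query parameters (step 7)
const url = new URL(serviceUrl);
url.searchParams.set("limit", "10");

// Bearer Token authorization (step 8)
const request = {
  method: "GET",
  headers: { Authorization: `Bearer ${accessToken}` },
};

// Uncomment to actually send the request:
// fetch(url, request).then((r) => r.json()).then(console.log);
```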
diff --git a/docs/user-guide/log-in.md b/docs/user-guide/log-in.md
deleted file mode 100644
index 859dde75..00000000
--- a/docs/user-guide/log-in.md
+++ /dev/null
@@ -1,24 +0,0 @@
-# Log in to TapData Platform
-import Content from '../reuse-content/_all-features.md';
-
-
-
-TapData provides a user-friendly interface, allowing you to set up and manage data pipelines easily through simple drag-and-drop actions. Before starting, you need to log in to the TapData platform according to the product series you have chosen by following the guidelines below.
-
-## TapData Cloud
-
-TapData Cloud is ideal for scenarios requiring quick deployment and low initial investment, helping you focus more on business development rather than infrastructure management. You can simply visit the [TapData Cloud](https://cloud.tapdata.io) platform and sign up to log in, with support for registration and login via email/phone number, WeChat QR code, and third-party accounts (GitHub/Google).
-
-
-
-## TapData Enterprise
-
-TapData Enterprise supports deployment to local data centers and is suitable for scenarios with strict requirements on data sensitivity or network isolation. The TapData Enterprise platform is set up by administrators [following deployment steps](../installation/install-tapdata-enterprise/README.md), who then [assign accounts](../user-guide/manage-system/manage-user.md) and [grant permissions](../user-guide/manage-system/manage-role.md) based on business needs for users within the enterprise. Regular users need to contact their administrators to obtain the login URL and credentials.
-
-
-
-## TapData Community
-
-TapData Community is an open-source data integration platform that offers basic data synchronization and transformation capabilities. It can be deployed with a single command using Docker, helping you quickly explore and implement data integration projects. The default login is **admin@admin.com** with the password **admin**; change the password promptly after logging in to ensure security. Based on your business needs, you can also [assign accounts](../user-guide/manage-system/manage-user.md) to other users.
-
-
\ No newline at end of file
diff --git a/docs/user-guide/manage-system/README.md b/docs/user-guide/manage-system/README.md
deleted file mode 100644
index bd760ce0..00000000
--- a/docs/user-guide/manage-system/README.md
+++ /dev/null
@@ -1,8 +0,0 @@
-# Manage System
-import Content from '../../reuse-content/_enterprise-and-community-features.md';
-
-
-
-import DocCardList from '@theme/DocCardList';
-
-
\ No newline at end of file
diff --git a/docs/user-guide/manage-system/_manage-external-storage.md b/docs/user-guide/manage-system/_manage-external-storage.md
deleted file mode 100644
index 8ddb5d47..00000000
--- a/docs/user-guide/manage-system/_manage-external-storage.md
+++ /dev/null
@@ -1,48 +0,0 @@
-# Manage External Storage
-import Content from '../../reuse-content/_enterprise-and-community-features.md';
-
-
-
-To speed up subsequent reads of task-related information, TapData stores necessary configurations, incremental logs of source tables, and other task-related data in its internal MongoDB database. To store more data, you can create an external database for this purpose.
-
-:::tip
-
-If you are using TapData Cloud, you can access this feature by navigating to **Advanced** > **External Storage** in the menu.
-
-:::
-
-## Prerequisites
-
-An external database intended for data storage has been created. Currently, MongoDB and RocksDB are supported.
-
-## Create External Storage
-
-1. [Log in to TapData Platform](../log-in.md).
-
-2. In the left navigation bar, select **System** > **External Storage**.
-
-3. On the right side of the page, click **Create External Storage**.
-
-4. In the pop-up dialog, complete the configuration according to the instructions below.
-
- 
-
-   * **Storage Name**: Enter a storage name with business significance for easy identification later.
-   * **External Memory Type**: Supports **MongoDB** and **RocksDB**.
-   * **Storage Path**: Enter the database connection address; for MongoDB, for example: `mongodb://admin:password@127.0.0.1:27017/mydb?replicaSet=xxx&authSource=admin`.
-   * **Connect Using TLS/SSL**: Choose whether to enable TLS/SSL encryption; if enabled, you will also need to upload the client's private key.
- * **Set as Default**: Choose whether to use it as the default external storage.
-
-5. Click **Test**; after passing the test, click **Save**.
-
- :::tip
-
- If the connection test fails as indicated, please fix it according to the page prompt.
-
- :::
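The storage path entered in step 4 is a standard MongoDB connection string, whose shape can be sanity-checked before saving. The helper and credentials below are illustrative, not TapData's own validation:

```javascript
// Rough shape check for a MongoDB connection string:
// scheme, host part, database name, optional query string.
function looksLikeMongoUri(uri) {
  return /^mongodb(\+srv)?:\/\/[^/]+\/[^?]+/.test(uri);
}

looksLikeMongoUri(
  "mongodb://admin:password@127.0.0.1:27017/mydb?replicaSet=xxx&authSource=admin"
); // -> true
looksLikeMongoUri("mongodb:/admin:password@127.0.0.1:27017/mydb"); // -> false (missing slash)
```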
-
-## Use External Storage
-
-You can enable the Shared Mining feature and select the recently configured external storage when [creating a connection](../../prerequisites/README.md), as shown in the example below:
-
-
\ No newline at end of file
diff --git a/docs/user-guide/other-settings/README.md b/docs/user-guide/other-settings/README.md
deleted file mode 100644
index dab86945..00000000
--- a/docs/user-guide/other-settings/README.md
+++ /dev/null
@@ -1,8 +0,0 @@
-# Other Settings
-import Content from '../../reuse-content/_enterprise-and-community-features.md';
-
-
-
-import DocCardList from '@theme/DocCardList';
-
-
\ No newline at end of file
diff --git a/docs/user-guide/real-time-data-hub/README.md b/docs/user-guide/real-time-data-hub/README.md
deleted file mode 100644
index 461cbcae..00000000
--- a/docs/user-guide/real-time-data-hub/README.md
+++ /dev/null
@@ -1,16 +0,0 @@
-# Real-Time Data Hub
-import Content from '../../reuse-content/_enterprise-and-cloud-features.md';
-
-
-
-The Real-Time Data Hub supports two modes, catering to different data governance needs.
-
-* **[Data Integration Platform Mode](etl-mode)**: Suitable for data replication/synchronization, cloud migration, or building ETL pipelines. You simply drag and drop source tables to the target to automatically create data replication tasks.
-
-* **[Data Service Platform Mode](daas-mode)**: Based on the concept of data layer governance, this mode synchronizes data scattered across different business systems to a unified platform cache layer, minimizing the impact of data extraction on operations. It provides foundational data for subsequent data processing and business operations, thus building a consistent, real-time data platform and connecting data silos.
-
-
-
-import DocCardList from '@theme/DocCardList';
-
-
\ No newline at end of file
diff --git a/docs/user-guide/real-time-data-hub/_create-task.md b/docs/user-guide/real-time-data-hub/_create-task.md
deleted file mode 100644
index 86a836a5..00000000
--- a/docs/user-guide/real-time-data-hub/_create-task.md
+++ /dev/null
@@ -1,89 +0,0 @@
-# Automatic Data Flow with One Click
-
-import Content from '../../reuse-content/_enterprise-and-cloud-features.md';
-
-
-
-In the Data Service Platform Mode, you can simply drag the source table to the required level to generate a data pipeline and automatically start the task, greatly simplifying the task configuration process. This article describes how to move data between the different levels and ultimately deliver it to your business applications.
-
-```mdx-code-block
-import Tabs from '@theme/Tabs';
-import TabItem from '@theme/TabItem';
-```
-
-
-## Procedure
-
-1. Log in to [TapData Cloud](https://cloud.tapdata.io/).
-
-2. In the left navigation panel, click **Real-Time Data Hub**.
-
-3. On this page, you can conveniently access the data source information you have entered. TapData organizes the data governance and flow order into four distinct levels, providing a clear view of the data hierarchy.
-
- 
-
- :::tip
-
- For detailed descriptions of each layer, see [Real-Time Data Hub Introduction](daas-mode/enable-daas-mode.md).
-
- :::
-
-4. Follow the process below to complete the data flow with one click.
-
- :::tip
- Through the **Master Data Model**, you can adjust the table structure (such as adding fields), merge tables, build wide tables, etc. If the table of the **Foundation Data Model** already meets your business needs, you can directly publish the API or drag the table of the cache layer to the **Target & Service**.
- :::
-
-```mdx-code-block
-
-
-```
-1. At the Sources, click the icon to find the table you want to synchronize and drag it to the **Foundation Data Model**.
-
-2. In the pop-up dialog box, fill in the table prefix that is meaningful for your business, select the synchronization type, and choose whether to run the task.
-
- In this case, the table we want to synchronize is **customer**. Filling in the prefix **FDM_demo** means that, in the Foundation Data Model, the table is named **FDM_demo_customer**.
-
- 
-
- * **Only Save**: Save the task without running it. You can now click on the task name in the Foundation Data Model to customize the task further.
- * **Save and Run**: No additional action is required. TapData will automatically create a data replication task to synchronize your selected tables in real-time to the Master Data Model and automatically verify. You can click the icon on the right side of the table name in the Foundation Data Model and jump to the task monitoring page to see the task operation details.
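The prefix-based naming above can be sketched as a one-line helper. This is a hypothetical illustration (not TapData's actual code) of how the prefix and the source table name combine:

```python
def fdm_table_name(prefix: str, source_table: str) -> str:
    """Combine a business-meaningful prefix with the source table name,
    mirroring how the Foundation Data Model names synchronized tables."""
    # A single underscore joins the prefix and the table name; the prefix
    # itself typically carries an FDM marker, e.g. "FDM_demo".
    return f"{prefix}_{source_table}"

# With the prefix "FDM_demo", the source table "customer" lands in the
# Foundation Data Model as "FDM_demo_customer".
print(fdm_table_name("FDM_demo", "customer"))  # FDM_demo_customer
```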
-
-
-
-
-
-
-
-1. At the **Foundation Data Model**, click the icon to find the table you want to process and drag it to the **Master Data Model**.
-
-2. In the pop-up dialog, fill in the table name and select whether to start the task.
-
- 
-
- * **Only Save**: Save the task without running it. You can now click on the task name in the target data card to customize the task further. On the redirected task configuration page, you can add processing nodes to meet requirements such as table structure adjustment (e.g., adding fields), table merging, and building wide tables. Once the setup is complete, click **Start** in the upper right corner of the page.
- * **Save and Run**: No further action is necessary as TapData automatically generates a data transformation task and initiates it to synchronize the table in real-time with the Data Processing Layer.
-
-3. At the Data Processing Layer, find the target table; clicking the icon shown below allows you to explore the lineage of the table, revealing the chain of relationships that led to the creation of this data table. This feature assists you in effectively managing your tables.
-
- 
-
- In this case, we constructed a wide table called **MDM_demo_customer** based on the initial **customer** and **lineorder** tables.
-
-
-
-
-
-1. From either the **Foundation Data Model** or the **Master Data Model** , locate the desired table that you wish to synchronize. Then, simply drag and drop the table onto the target data source within the **Targets & Service**.
- 
-
-2. In the pop-up dialog, enter the task name and choose whether to run the task.
-
- * **Only Save**: Save the task without running it. You can now click on the task name in the target data card to customize the task further. On the redirected task configuration page, you can add [processing nodes](../data-development/process-node.md) to meet requirements such as table structure adjustment (e.g., adding fields), table merging, and building wide tables. Once the setup is complete, click **Start** in the upper right corner of the page.
- * **Save and Run**: No further action is necessary as TapData automatically generates a data transformation task and initiates it to synchronize the table in real-time with the Data Processing Layer.
-
- Once setup is complete, TapData will automatically create a data transformation task to synchronize your source tables in real-time to the selected target data source and make them available to the downstream business. You can also click the task name in the target data card to open the task monitoring page and view the detailed operation status. For more information, see Monitor Task.
-
-
-
-
diff --git a/docs/user-guide/real-time-data-hub/_dashboard.md b/docs/user-guide/real-time-data-hub/_dashboard.md
deleted file mode 100644
index c2cd3c0f..00000000
--- a/docs/user-guide/real-time-data-hub/_dashboard.md
+++ /dev/null
@@ -1,66 +0,0 @@
-# Real-Time Data Hub Dashboard
-
-import Content from '../../reuse-content/_enterprise-and-cloud-features.md';
-
-
-
-Once the data service platform mode is activated, the page will be organized according to the previously mentioned [hierarchy](daas-mode/enable-daas-mode.md). You can effortlessly drag the table to the next level, which will automatically create data replication tasks and streamline the data flow.
-
-This article provides a comprehensive guide on utilizing the Data Service Platform Mode interface, enabling you to swiftly grasp the functionality of the various modules.
-
-
-```mdx-code-block
-import Tabs from '@theme/Tabs';
-import TabItem from '@theme/TabItem';
-```
-
-## Procedure
-
-1. Log in to [TapData Cloud](https://cloud.tapdata.io/).
-
-2. In the left navigation panel, click **Real-Time Data Hub**.
-
-3. On this page, you can conveniently view the information you have entered for your data source. In the following sections, we will explain the functions of each button available.
-
- 
-
-```mdx-code-block
-
-
-```
-Clicking the icon allows you to view the data source information in the form of a directory structure. You can navigate through the directory structure to select a specific table.
-
-In the catalog view, if you select a specific table, you can also see table details on the right-hand side of the page. The introduction to each tab is as follows:
-
-
-* **Overview**: Provides essential information about tables, including table size, row count, column types, column descriptions (sourced from comments by default), sample data, and more.
-* **Schema**: Offers in-depth insights into table columns, encompassing details like column types, primary keys, foreign keys, default values, and more.
-* **Tasks**: Displays associated tasks for the table, along with their respective statuses. This tab also enables the creation of new tasks.
-* **Lineage**: Presents data lineage relationships visually through a graph format, aiding in effective data quality management. Clicking on a task node allows direct navigation to the monitoring page of the relevant task.
-
-
-
-
-Clicking the icon opens a dialog where you can add a data source. Selecting a data source takes you to the connection configuration page. For more information, see Connect Data Sources.
-
-
-
-Clicking the icon allows you to enter a keyword for the table name, enabling you to quickly navigate to the specific table. This feature is also supported in other Layers.
-
-
-
-On the right side of the data connection, clicking the icon displays the connection information and associated tasks of that data source on the right side of the page.
-
-
-
-
-
-Clicking on this icon allows you to explore the lineage of the table, revealing the chain of relationships that led to the creation of this data table. This feature assists you in effectively managing your tables.
-
-
-
-
-
-Clicking on the icon, you can view all related data synchronization tasks for this data source, along with their operational statuses and other relevant information.
-
-
diff --git a/docs/user-guide/real-time-data-hub/daas-mode/README.md b/docs/user-guide/real-time-data-hub/daas-mode/README.md
deleted file mode 100644
index c105cdeb..00000000
--- a/docs/user-guide/real-time-data-hub/daas-mode/README.md
+++ /dev/null
@@ -1,9 +0,0 @@
-# Data as Service Mode
-import Content from '../../../reuse-content/_enterprise-and-cloud-features.md';
-
-
-
-
-import DocCardList from '@theme/DocCardList';
-
-
\ No newline at end of file
diff --git a/docs/user-guide/real-time-data-hub/daas-mode/create-daas-task.md b/docs/user-guide/real-time-data-hub/daas-mode/create-daas-task.md
deleted file mode 100644
index 3608ac93..00000000
--- a/docs/user-guide/real-time-data-hub/daas-mode/create-daas-task.md
+++ /dev/null
@@ -1,88 +0,0 @@
-# Automatic Data Flow with One Click
-import Content from '../../../reuse-content/_enterprise-and-cloud-features.md';
-
-
-
-In the Data Service Platform Mode, you can simply drag the source table to the required level to generate a data pipeline and automatically start the task, greatly simplifying the task configuration process. This article describes how to move data between the different levels and ultimately deliver it to your business applications.
-
-```mdx-code-block
-import Tabs from '@theme/Tabs';
-import TabItem from '@theme/TabItem';
-```
-
-
-## Procedure
-
-1. [Log in to TapData Platform](../../log-in.md).
-
-2. In the left navigation panel, click **Real-Time Data Hub**.
-
-3. On this page, you can conveniently access the data source information you have entered. TapData organizes the data governance and flow order into four distinct levels, providing a clear view of the data hierarchy.
-
- 
-
- :::tip
-
- For detailed descriptions of each layer, see [Real-Time Data Hub Introduction](enable-daas-mode.md).
-
- :::
-
-4. Follow the process below to complete the data flow with one click.
-
- :::tip
- Through the **Master Data Model**, you can adjust the table structure (such as adding fields), merge tables, build wide tables, etc. If the table of the **Foundation Data Model** already meets your business needs, you can directly publish the API or drag the table of the cache layer to the **Target & Service**.
- :::
-
-```mdx-code-block
-
-
-```
-1. At the Sources, click the icon to find the table you want to synchronize and drag it to the **Foundation Data Model**.
-
-2. In the pop-up dialog box, fill in the table prefix that is meaningful for your business, select the synchronization type, and choose whether to run the task.
-
- In this case, the table we want to synchronize is **customer**. Filling in the prefix **FDM_demo** means that, in the Foundation Data Model, the table is named **FDM_demo_customer**.
-
- 
-
- * **Only Save**: Save the task without running it. You can now click on the task name in the Foundation Data Model to customize the task further.
- * **Save and Run**: No additional action is required. TapData will automatically create a data replication task to synchronize your selected tables in real-time to the Master Data Model and automatically verify. You can click the icon on the right side of the table name in the Foundation Data Model and jump to the task monitoring page to see the task operation details.
-
-
-
-
-
-
-
-1. At the **Foundation Data Model**, click the icon to find the table you want to process and drag it to the **Master Data Model**.
-
-2. In the pop-up dialog, fill in the table name and select whether to start the task.
-
- 
-
- * **Only Save**: Save the task without running it. You can now click on the task name in the target data card to customize the task further. On the redirected task configuration page, you can add processing nodes to meet requirements such as table structure adjustment (e.g., adding fields), table merging, and building wide tables. Once the setup is complete, click **Start** in the upper right corner of the page.
- * **Save and Run**: No further action is necessary as TapData automatically generates a data transformation task and initiates it to synchronize the table in real-time with the Data Processing Layer.
-
-3. At the Data Processing Layer, find the target table; clicking the icon shown below allows you to explore the lineage of the table, revealing the chain of relationships that led to the creation of this data table. This feature assists you in effectively managing your tables.
-
- 
-
- In this case, we constructed a wide table called **MDM_demo_customer** based on the initial **customer** and **lineorder** tables.
-
-
-
-
-
-1. From either the **Foundation Data Model** or the **Master Data Model** , locate the desired table that you wish to synchronize. Then, simply drag and drop the table onto the target data source within the **Targets & Service**.
- 
-
-2. In the pop-up dialog, enter the task name and choose whether to run the task.
-
- * **Only Save**: Save the task without running it. You can now click on the task name in the target data card to customize the task further. On the redirected task configuration page, you can add [processing nodes](../../data-development/process-node.md) to meet requirements such as table structure adjustment (e.g., adding fields), table merging, and building wide tables. Once the setup is complete, click **Start** in the upper right corner of the page.
- * **Save and Run**: No further action is necessary as TapData automatically generates a data transformation task and initiates it to synchronize the table in real-time with the Data Processing Layer.
-
- Once setup is complete, TapData will automatically create a data transformation task to synchronize your source tables in real-time to the selected target data source and make them available to the downstream business. You can also click the task name in the target data card to open the task monitoring page and view the detailed operation status. For more information, see Monitor Task.
-
-
-
-
diff --git a/docs/user-guide/real-time-data-hub/daas-mode/daas-mode-dashboard.md b/docs/user-guide/real-time-data-hub/daas-mode/daas-mode-dashboard.md
deleted file mode 100644
index 7053f73e..00000000
--- a/docs/user-guide/real-time-data-hub/daas-mode/daas-mode-dashboard.md
+++ /dev/null
@@ -1,66 +0,0 @@
-# DaaS Dashboard
-import Content from '../../../reuse-content/_enterprise-and-cloud-features.md';
-
-
-
-Once the data service platform mode is activated, the page will be organized according to the previously mentioned [hierarchy](enable-daas-mode.md). You can effortlessly drag the table to the next level, which will automatically create data replication tasks and streamline the data flow.
-
-This article provides a comprehensive guide on utilizing the Data Service Platform Mode interface, enabling you to swiftly grasp the functionality of the various modules.
-
-
-```mdx-code-block
-import Tabs from '@theme/Tabs';
-import TabItem from '@theme/TabItem';
-```
-
-## Procedure
-
-1. [Log in to TapData Platform](../../log-in.md).
-
-2. In the left navigation panel, click **Real-Time Data Hub**.
-
-3. On this page, you can conveniently view the information you have entered for your data source. In the following sections, we will explain the functions of each button available.
-
- 
-
-```mdx-code-block
-
-
-```
-Clicking the icon allows you to view the data source information in the form of a directory structure. You can navigate through the directory structure to select a specific table.
-
-In the catalog view, if you select a specific table, you can also see table details on the right-hand side of the page. The introduction to each tab is as follows:
-
-
-* **Overview**: Provides essential information about tables, including table size, row count, column types, column descriptions (sourced from comments by default), sample data, and more.
-* **Schema**: Offers in-depth insights into table columns, encompassing details like column types, primary keys, foreign keys, default values, and more.
-* **Tasks**: Displays associated tasks for the table, along with their respective statuses. This tab also enables the creation of new tasks.
-* **Lineage**: Presents data lineage relationships visually through a graph format, aiding in effective data quality management. Clicking on a task node allows direct navigation to the monitoring page of the relevant task.
-
-
-
-
-
-Clicking the icon opens a dialog where you can add a data source. Selecting a data source takes you to the connection configuration page. For more information, see Connect Data Sources.
-
-
-
-Clicking the icon allows you to enter a keyword for the table name, enabling you to quickly navigate to the specific table. This feature is also supported in other Layers.
-
-
-
-On the right side of the data connection, clicking the icon displays the connection information and associated tasks of that data source on the right side of the page.
-
-
-
-
-
-Clicking on this icon allows you to explore the lineage of the table, revealing the chain of relationships that led to the creation of this data table. This feature assists you in effectively managing your tables.
-
-
-
-
-
-Clicking on the icon, you can view all related data synchronization tasks for this data source, along with their operational statuses and other relevant information.
-
-
diff --git a/docs/user-guide/real-time-data-hub/daas-mode/enable-daas-mode.md b/docs/user-guide/real-time-data-hub/daas-mode/enable-daas-mode.md
deleted file mode 100644
index 0d841cd4..00000000
--- a/docs/user-guide/real-time-data-hub/daas-mode/enable-daas-mode.md
+++ /dev/null
@@ -1,119 +0,0 @@
-# Enable Real-Time Data Hub
-
-import Content from '../../../reuse-content/_enterprise-and-cloud-features.md';
-
-
-
-Due to digital transformation, the presence of isolated data, data fragmentation, or data silos has emerged as a significant challenge. Moreover, there is a growing demand for data in business operations. However, traditional data delivery methods pose limitations, such as lengthy processes and substantial resource requirements. This situation calls for a solution that enables organizations to swiftly establish data flow pipelines and unlock the value of their data.
-
-TapData Cloud's Real-Time Data Hub offers a powerful solution. By synchronizing data from diverse business systems to a unified platform cache layer, it enables the consolidation of data sources and facilitates seamless data processing and analysis. This unified and real-time data platform helps enterprises overcome data silos and promotes data-driven decision-making, ultimately enhancing their competitiveness in the market.
-
-
-```mdx-code-block
-import Tabs from '@theme/Tabs';
-import TabItem from '@theme/TabItem';
-```
-
-## Background
-
-In today's digital age, one of the greatest challenges for enterprises is how to efficiently process and analyze vast amounts of real-time data. Traditional methods of data handling, such as batch processing or manually writing data ETL scripts, often fail to provide timely data analysis and processing. This limitation restricts businesses' ability to make prompt decisions in a rapidly changing market environment. Moreover, performing data operations directly on production databases can impact their stability and security, affecting overall business efficiency.
-
-The introduction of a Real-Time Data Hub aims to resolve these issues. It provides an efficient and reliable platform that helps businesses process and analyze data in real time, quickly responding to market and customer demands. For example:
-
-* By integrating TapData's Real-Time Data Hub, a company successfully built a data dashboard to monitor cloud-based user behavior. They streamed database data in real time to TapData’s platform cache layer, allowing real-time processing of cache layer data to generate key business metrics without affecting the source databases. This provided the freshest data for necessary BI reports, offering immediate business insights and analysis.
-* In another case, a retail enterprise utilized the Real-Time Data Hub to build a data portal. This portal enabled front-end business developers to quickly discover and process data through self-service, allowing them to build and publish APIs. Using TapData's data catalog, they could rapidly locate necessary data, enabling self-service processing and modeling. This not only enhanced development efficiency but also reduced reliance on specialized data teams, saving the enterprise substantial costs.
-
-These cases collectively demonstrate how the Real-Time Data Hub can help businesses overcome the limitations of traditional data handling, offering more efficient and flexible data management solutions. Through real-time data processing, enterprises can better grasp market dynamics, quickly respond to customer needs, and maintain a competitive edge.
-
-## Real-Time Data Hub Introduction
-
-With the increase in the tasks carried by the source database, in order to minimize the impact of data extraction on the source database and adhere to the organization's concept of data hierarchical governance, TapData organizes the data service platform in a layered manner based on the data flow order. This hierarchical arrangement ensures efficient and structured data processing, allowing for better data management and seamless integration across different systems.
-
-
-
-| Hierarchy | Description |
-| -------------------- |------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
-| **Sources** | TapData consolidates data sources from various business systems into a centralized data source layer, which serves as the initial step in bridging data silos. This abstraction of data sources enables a unified and streamlined approach to accessing and utilizing data. For more detailed instructions, please refer to the [Connect Data Sources](../../../prerequisites/README.md) section for comprehensive information on establishing connections with your data sources. |
-| **FDM (Foundation Data Model)** | By synchronizing the table from the source database to the **FDM** beforehand, the data can be readily accessed by the business through the FDM, thus eliminating the need to directly read or manipulate the data in the source database, such as performing union operations, during data processing. This approach significantly minimizes the impact on the business operations of the source database. |
-| **MDM (Master Data Model)** | If there is a need for extensive customization of data processing or operations, such as generating a wide table, it is possible to extract the data table from the FDM and perform the required operations within the **MDM**. This allows for the generation of model data that can be used in the final business processes. |
-| **Targets & Service** | TapData provides a centralized platform that aggregates and presents various data sources, allowing them to be utilized as targets for data processing. This enables the provision of processed data to the business, facilitating the creation of a unified data service platform for enterprises. |
-
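To make the layering concrete, here is a minimal sketch of how a table flows through the layers: a source table is mirrored into the FDM under a prefix, then FDM tables are joined into an MDM wide table. This is purely illustrative, with hypothetical helper names, not TapData code:

```python
def to_fdm(source_table: str, prefix: str = "FDM_demo") -> str:
    """Mirror a source table into the Foundation Data Model under a prefix."""
    return f"{prefix}_{source_table}"

def to_mdm_wide_table(fdm_tables: list, name: str) -> dict:
    """Build a Master Data Model wide table from one or more FDM tables."""
    return {"name": f"MDM_{name}", "built_from": fdm_tables}

# customer and lineorder are mirrored into the FDM, then combined into a
# wide table in the MDM, ready to serve the Targets & Service layer.
fdm = [to_fdm(t) for t in ["customer", "lineorder"]]
wide = to_mdm_wide_table(fdm, "demo_customer")
print(wide["name"])        # MDM_demo_customer
print(wide["built_from"])  # ['FDM_demo_customer', 'FDM_demo_lineorder']
```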
-
-
-## Procedure
-
-In the Real-Time Data Hub, we need to prepare a MongoDB database as a data repository for the Data Cache Layer and Data Processing Layer.
-
-1. [Log in to TapData Platform](../../log-in.md).
-
-2. In the left navigation panel, click **Real-Time Data Hub**.
-
-3. Choose the steps based on your product series:
-
-
-
-
-1. View the introduction to the Real-Time Data Hub and scroll down to the bottom of the page, then click **Subscribe Storage**.
-
-2. Choose the provider for MongoDB Atlas services, deployment region, specification, and subscription period as follows:
- 
-
- * **Cloud Provider**: Currently supported: Google Cloud.
- * **Region**: Select the deployment region. Choose a region close to your data source for minimal network latency.
- * **Specification**: Pick the **specification** and **storage size** for MongoDB Atlas.
- :::tip
- TapData offers a free trial option with specifications that you can select. You can choose the **Free Trial** option to get started.
- :::
-
-    Specifications Description:
-
-    * M10: 2 vCPUs, 2 GB RAM
-    * M20: 2 vCPUs, 4 GB RAM
-    * M30: 2 vCPUs, 8 GB RAM
-    * M40: 4 vCPUs, 16 GB RAM
-    * M50: 8 vCPUs, 32 GB RAM
-    * M60: 16 vCPUs, 64 GB RAM
-
-
- * **Subscription Period**: Select the desired subscription period.
- Want to use an existing MongoDB Atlas?
- At the top of the page, click **click here** to provide the connection information, and fill in the MongoDB Atlas connection URL.
-
-
-3. Click **Subscription**. On the following page, carefully review and confirm the specifications you wish to purchase. Ensure that the selected billing method aligns with your preferences, and verify that the email address provided is accurate and is where you would like to receive the bill.
-
-4. Once you have double-checked all the information, click on the **Pay Now** button to proceed with the purchase.
-
-5. You will be redirected to the payment page. Please follow the instructions on the payment page to complete the payment process.
-
- After the payment is completed, the page will return to the **Real-Time Data Hub** page. Once the instance is automatically deployed, the page will be organized and displayed according to the [hierarchy we introduced before](#intro). For information on how to use it, see [Real-Time Data Hub Dashboard Introduction](daas-mode-dashboard.md).
-
-
-
-
-
-1. Prepare a MongoDB database (version 4.0 or above), then [connect this database](../../../prerequisites/on-prem-databases/mongodb.md) on the TapData platform, using it as the storage engine for the platform cache layer/platform processing layer. Deployment details can be seen in [deployment examples](../../../administration/production-deploy/install-replica-mongodb.md) or on the [MongoDB official website](https://www.mongodb.com/docs/manual/administration/install-on-linux/).
-
- :::tip
-
- To ensure business high availability, it is recommended that MongoDB uses a replica set/sharded cluster architecture. Additionally, based on the data scale of the source layer, sufficient storage space and Oplog space (recommended 14 days or more) should be reserved.
-
- :::
-
-2. On the right side of the TapData platform page, click the  icon.
-
-3. Select Data Services Platform Mode, then set the storage engine used for the platform cache layer/platform processing layer, which we have prepared as the MongoDB data source.
-
- 
-
- :::tip
-
- Once the storage engine is selected and saved, it cannot be modified later, so please operate with caution.
-
- :::
-
-4. Click **Save**.
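The 14-day oplog recommendation above can be sanity-checked with simple arithmetic. This is a hypothetical sketch, assuming you have already obtained the timestamps of the first and last oplog entries (for example, from `rs.printReplicationInfo()` in mongosh):

```python
from datetime import datetime, timedelta

def oplog_window_days(first_entry: datetime, last_entry: datetime) -> float:
    """Return the oplog retention window in days, given the timestamps of
    the oldest and newest oplog entries."""
    return (last_entry - first_entry) / timedelta(days=1)

def meets_recommendation(first_entry: datetime, last_entry: datetime,
                         minimum_days: float = 14.0) -> bool:
    """Check the current window against the 14-day recommendation."""
    return oplog_window_days(first_entry, last_entry) >= minimum_days

# Example: an oplog spanning 21 days comfortably meets the recommendation.
first = datetime(2024, 1, 1)
last = datetime(2024, 1, 22)
print(meets_recommendation(first, last))  # True
```

If the window falls short, resize the oplog or provision more storage before pointing the platform cache layer at the deployment.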
-
-
-
-
diff --git a/docs/user-guide/real-time-data-hub/etl-mode/README.md b/docs/user-guide/real-time-data-hub/etl-mode/README.md
deleted file mode 100644
index 4ab319f4..00000000
--- a/docs/user-guide/real-time-data-hub/etl-mode/README.md
+++ /dev/null
@@ -1,8 +0,0 @@
-# Data Integration Mode
-import Content from '../../../reuse-content/_enterprise-and-cloud-features.md';
-
-
-
-import DocCardList from '@theme/DocCardList';
-
-
\ No newline at end of file
diff --git a/docs/user-guide/real-time-data-hub/etl-mode/create-etl-task.md b/docs/user-guide/real-time-data-hub/etl-mode/create-etl-task.md
deleted file mode 100644
index f1812877..00000000
--- a/docs/user-guide/real-time-data-hub/etl-mode/create-etl-task.md
+++ /dev/null
@@ -1,30 +0,0 @@
-# Generate Data Pipeline with One Click
-
-import Content from '../../../reuse-content/_enterprise-and-cloud-features.md';
-
-
-
-In the Data Integration Mode, you can simply drag the source table to the target database to generate a data pipeline with one click, greatly simplifying task configuration and enabling real-time synchronization of source data. This article introduces how to generate a data pipeline.
-
-## Procedure
-
-1. [Log in to TapData Platform](../../log-in.md).
-
-2. In the left navigation panel, click **Real-Time Data Hub**.
-
-3. On this page, you can conveniently view the data source information you have entered. The page is divided into two columns labeled **Sources** and **Targets & Services** by TapData Cloud. This helps you distinguish between the source and target data sources and provides a clear overview of your data connections.
-
- 
-
-4. (Optional) Click the 🔍 icon to find the source table you want to synchronize and drag it to the right target data source.
-
-5. In the pop-up dialog enter the task name and choose whether to run the task.
-
- 
-
- - **Only Save**: Save the task without running it. You can now click on the task name in the target data card to customize the task further. On the redirected task configuration page, you can add [processing nodes](../../data-development/process-node.md) to meet requirements such as table structure adjustment (e.g., adding fields), table merging, and building wide tables. Once the setup is complete, click **Start** in the upper right corner of the page.
-
- - **Save and Run**: No additional action is required. TapData will automatically create a data transformation task and run it to synchronize your source tables in real-time to the selected target data source. In this case, the **customer** table in the source MySQL will be synchronized to MongoDB in real-time.
-
- You can also click the task name in the target data card to enter the task monitoring page to see the detailed operation status. For more information, see [Monitoring Tasks](../../data-development/monitor-task.md).
-
diff --git a/docs/user-guide/real-time-data-hub/etl-mode/etl-mode-dashboard.md b/docs/user-guide/real-time-data-hub/etl-mode/etl-mode-dashboard.md
deleted file mode 100644
index 7e4d8e75..00000000
--- a/docs/user-guide/real-time-data-hub/etl-mode/etl-mode-dashboard.md
+++ /dev/null
@@ -1,64 +0,0 @@
-# Data Integration Dashboard
-
-import Content from '../../../reuse-content/_enterprise-and-cloud-features.md';
-
-
-
-TapData's data console is designed with Data Integration Mode as its default setting. This mode is specifically designed for tasks such as data replication, synchronization, migrating data to the cloud, and building ETL pipelines. It offers a user-friendly interface where you can easily drag and drop the source table onto the target, allowing for the automatic creation of data replication tasks.
-
-In this article, we will provide a comprehensive guide on utilizing the Data Integration Mode dashboard. It will walk you through the various functional modules, helping you gain a better understanding of how to effectively leverage this powerful tool.
-
-:::tip
-
-To minimize the impact on source databases and align with the data hierarchical governance concept, you can switch to the [Data Service Platform Model](../daas-mode/enable-daas-mode.md) in TapData. This model enables real-time data synchronization to the Data Cache Layer, ensuring up-to-date and consistent data across systems.
-
-:::
-
-## Procedure
-
-1. [Log in to TapData Platform](../../log-in.md).
-
-2. In the left navigation panel, click **Real-Time Data Hub**.
-
-3. On this page, you can conveniently view the information you have entered for your data source. In the following sections, we will explain the functions of each button available.
-
- 
-
-
-
-import Tabs from '@theme/Tabs';
-import TabItem from '@theme/TabItem';
-
-
-
-
-
-Click the icon to display the data source information in the form of a directory structure (click again to switch back to the Console view). Select a specific table to view its associated tasks and basic information, including table size, number of rows, column information, sample data, schema (such as primary key/foreign key), etc.
-
-
-
-
-
-Clicking the icon opens a dialog where you can add a data source. Selecting a data source will take you to the connection configuration page. For more information, see Connect Data Sources.
-
-
-
-
-Clicking the icon allows you to enter a keyword for the table name, enabling you to quickly navigate to the specific table. This feature is also supported in other Layers.
-
-
-
-
-
-
-On the right side of the data connection, clicking the icon will display the connection information and associated tasks of the data source on the right side of the page.
-
-
-
-
-
-On the right side of the table name, click the icon. On the right side of the page, the basic information of the tasks and tables associated with the table will be displayed, including table size, number of rows, column information, sample data, schema (such as primary key/foreign key), etc.