Merged
7 changes: 7 additions & 0 deletions .github/actions/spelling/allow.txt
Original file line number Diff line number Diff line change
@@ -720,4 +720,11 @@ topmetrics
tsdb
xzvf
bson
fileformat
hangzhou
HCFS
namenode
paimon
upserts


9 changes: 9 additions & 0 deletions docs/connectors/supported-data-sources.md
@@ -385,6 +385,15 @@ The beta version of the data sources is in public preview and has passed the bas
<td>➖</td>
<td>0.11.0</td>
</tr>
<tr>
<td>Paimon</td>
<td>➖</td>
<td>➖</td>
<td>➖</td>
<td>✅</td>
<td>➖</td>
<td>0.6 and above</td>
</tr>
<tr>
<td>SelectDB</td>
<td>➖</td>
106 changes: 106 additions & 0 deletions docs/connectors/warehouses-and-lake/paimon.md
@@ -0,0 +1,106 @@
# Paimon

import Content1 from '../../reuse-content/_enterprise-and-community-features.md';

<Content1 />

Apache Paimon is a lake format that lets you build a real-time Lakehouse with Flink and Spark. TapData can stream data into Paimon tables for an always-up-to-date data lake.

```mdx-code-block
import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';
```

## Supported versions

Paimon 0.6 and later (0.8.2+ recommended)

## Supported operations

DML only: INSERT, UPDATE, DELETE

## Supported data types

All Paimon 0.6+ types. To preserve precision, follow the [official docs](https://paimon.apache.org/docs/master/concepts/spec/fileformat/) when mapping columns—for example, use INT32 for DATE in Parquet files.

:::tip
Add a [Type Modification Processor](../../data-transformation/process-node.md#type-modification) to the job if you need to cast columns to a different Paimon type.
:::
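To make the Parquet example above concrete, the sketch below illustrates a few logical-to-physical type mappings taken from the Parquet format specification. The lookup table and helper are hypothetical illustrations, not part of TapData or Paimon:

```python
# Illustrative mapping of a few Paimon/Parquet logical types to Parquet
# physical types, per the Parquet format spec. This is a teaching aid,
# not TapData's actual implementation.
PAIMON_TO_PARQUET = {
    "DATE": "INT32",         # days since the Unix epoch stored as INT32
    "TIMESTAMP(3)": "INT64", # millisecond timestamps stored as INT64
    "STRING": "BYTE_ARRAY",
    "BOOLEAN": "BOOLEAN",
}

def parquet_physical_type(paimon_type: str) -> str:
    """Return the Parquet physical type for a Paimon column type."""
    try:
        return PAIMON_TO_PARQUET[paimon_type]
    except KeyError:
        raise ValueError(f"no mapping defined for {paimon_type!r}")
```

A column declared `DATE` in Paimon, for example, resolves to `INT32` in the Parquet file, which is why precision-aware mapping matters when casting columns.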

## Considerations

- To avoid write conflicts and reduce compaction pressure, disable multi-threaded writes on the target node, set the batch size to 1,000 rows, and set the timeout to 1,000 ms.
- Always define a primary key for efficient upserts and deletes; for large tables, use partitioning to speed up queries and writes.
- Paimon supports primary keys only (no secondary indexes) and does not allow runtime schema evolution.

## Connect to Paimon

1. Log in to TapData platform.
2. In the left navigation bar, click **Connections**.
3. On the right side of the page, click **Create**.
4. In the pop-up dialog, search for and select **Paimon**.
5. Fill in the connection details as shown below.

![Connect to Paimon](../../images/connect_paimon.png)

**Basic Settings**
- **Name**: Enter a meaningful and unique name.
- **Type**: Paimon can be used only as a target database.
- **Warehouse Path**: Enter the root path for Paimon data based on the storage type.
- S3: `s3://bucket/path`
- HDFS: `hdfs://namenode:port/path`
- OSS: `oss://bucket/path`
- Local FS: `/local/path/to/warehouse`
- **Storage Type**: TapData supports S3, HDFS, OSS, and Local FS; each storage type has its own connection settings.
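As an aside, the warehouse-path formats listed above follow a simple prefix convention per storage type. The helper below is a hypothetical validator for illustration only, not part of the TapData product:

```python
# Hypothetical check that a warehouse path matches the prefix expected
# for its storage type, mirroring the formats documented above.
def validate_warehouse_path(storage_type: str, path: str) -> bool:
    prefixes = {
        "S3": "s3://",
        "HDFS": "hdfs://",
        "OSS": "oss://",
        "Local FS": "/",   # local paths are absolute filesystem paths
    }
    prefix = prefixes.get(storage_type)
    if prefix is None:
        raise ValueError(f"unknown storage type: {storage_type!r}")
    return path.startswith(prefix)
```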

```mdx-code-block
<Tabs className="unique-tabs">
<TabItem value="S3" default>
```
Use this option for any S3-compatible object store—AWS S3, MinIO, or private-cloud solutions. Supply the endpoint, keys, and region (if required) so TapData can write Paimon data directly to your bucket.
- **S3 Endpoint**: full URL including protocol and port, e.g. `http://192.168.1.57:9000/`
- **S3 Access Key**: the Access-Key ID that has read/write permission on the bucket/path
- **S3 Secret Key**: the corresponding Secret-Access-Key
- **S3 Region**: the region where the bucket was created, e.g. `us-east-1`

</TabItem>

<TabItem value="HDFS">
Choose this when your warehouse sits on Hadoop HDFS or any HCFS-compatible cluster. TapData writes through the standard HDFS client, so give it the NameNode host/port and the OS user it should impersonate.

- **HDFS Host**: NameNode hostname or IP, e.g. `192.168.1.57`
- **HDFS Port**: NameNode RPC port, e.g. `9000` or `8020`
- **HDFS User**: OS user that TapData will impersonate when writing, e.g. `hadoop`

</TabItem>

<TabItem value="OSS">
Pick this for Alibaba Cloud OSS or any other OSS-compatible provider. Enter the public or VPC endpoint and the access key pair; TapData will create Paimon files inside the bucket you specify.

- **OSS Endpoint**: VPC or public endpoint, e.g. `https://oss-cn-hangzhou.aliyuncs.com` (do **not** include the bucket name)
- **OSS Access Key**: Access-Key ID that has read/write permission on the bucket/path
- **OSS Secret Key**: the corresponding Access-Key Secret

</TabItem>

<TabItem value="Local">

**Local filesystem**:
Select this option if you want to store the Paimon warehouse on a local disk or an NFS mount that is visible to the TapData server. Make sure the directory is writable by the TapData OS user and that enough free space is available for both data and compaction temporary files.

</TabItem>
</Tabs>

- **Database Name**: one connection maps to one database (default is `default`). Create extra connections for additional databases.

**Advanced Settings**
- **Agent Settings**: Defaults to **Platform automatic allocation**; you can also manually specify an agent.
- **Model Load Time**: If the data source contains fewer than 10,000 models, their schemas are refreshed every hour; if it exceeds 10,000, the refresh runs daily at the time you specify.

6. Click **Test** at the bottom; after it passes, click **Save**.

:::tip

If the test fails, follow the on-screen hints to fix the issue.

:::
4 changes: 3 additions & 1 deletion docs/data-replication/create-task.md
@@ -93,7 +93,9 @@ As an example of creating a data replication task, the article demonstrates the
* **Incremental Multi-thread Writing**: The number of concurrent threads for writing incremental data.
* **Batch Write Item Quantity**: The number of items written per batch during full synchronization.
* **Max Wait Time per Batch Write**: Set the maximum waiting time per batch write, evaluated based on the target database’s performance and network latency, in milliseconds.
* <span id="advanced-settings">**Advanced Settings**</span>

* **<span id="advanced-settings">Advanced Settings</span>**

* **Data Writing Mode**: Select according to business needs.
* **Process by Event Type**: If you choose this, you also need to select data writing strategies for insert, update, and delete events.
* **Statistical Append Write**: Only processes insert events, discarding update and delete events.
Binary file added docs/images/connect_paimon.png
15 changes: 14 additions & 1 deletion docs/introduction/terms.md
@@ -76,4 +76,17 @@ A lightweight runtime component that executes pipelines. It connects to data sou

## TCM (TapData Control Manager)

The centralized management plane for pipeline orchestration, configuration, monitoring, and deployment. Users interact with TCM to create, modify, and observe pipelines.
The centralized management plane for pipeline orchestration, configuration, monitoring, and deployment. Users interact with TCM to create, modify, and observe pipelines.


## QPS

Queries Per Second. The average number of change events the sync task processes every second. It shows how fast data is replicated from the source to the target.
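The arithmetic behind the metric is simply events divided by elapsed time, as in this minimal sketch (illustrative only):

```python
# QPS as defined above: average change events processed per second
# over a sampling window.
def qps(event_count: int, window_seconds: float) -> float:
    if window_seconds <= 0:
        raise ValueError("window must be positive")
    return event_count / window_seconds
```

For example, 3,000 change events processed over a 60-second window yields a QPS of 50.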

## Incremental Validation

While the task is running, TapData randomly compares rows in the target with the source to make sure they match. The check keeps going as long as the sync is active. See [Incremental Data Check](../data-replication/incremental-check.md).

## API Server

TapData’s built-in publishing layer. Pick any table and expose it as a [RESTful API endpoint](../publish-apis/README.md). Teams use it to share clean, governed data with mobile apps, third-party systems, or any client that speaks HTTP.
58 changes: 57 additions & 1 deletion docs/platform-ops/operation.md
@@ -377,4 +377,60 @@ a data replication task is used for scenarios that only synchronize incremental
* [Data Services](../publish-apis/README.md)
* Deleting or taking an API offline will render it unavailable.
* [System Management](../system-admin/other-settings/system-settings.md)
* When [managing a cluster](../system-admin/manage-cluster.md), only perform close or restart operations on related services when they are experiencing anomalies.
* When [managing a cluster](../system-admin/manage-cluster.md), only perform close or restart operations on related services when they are experiencing anomalies.

## How to run a TapData health check

Use this checklist to confirm TapData is running normally.

1. Log in to TapData.

2. In the left menu choose **System Management > Cluster Management** and verify [component status](../system-admin/manage-cluster.md):
- TapData Manager, Engine, and API Server are all **Running**.
- CPU and memory usage are below 70%.

3. Open **Data Replication** or **Data Transformation** and scan the task list:
- Every task should show **Running**.
- Click a task name and check [metrics](../data-replication/monitor-task.md): lag is acceptable and QPS > 0.

If a task is unhealthy:
- **Read the error log** at the bottom of the monitor page and follow the hints. See [troubleshooting](../platform-ops/troubleshooting/README.md).
- **Test the connection**: open **Connections**, click **Test** on the related source/target and fix any auth or network issues.
- **Check incremental lag**: if QPS stays elevated for more than 30 minutes, the source may be in a batch window; consider scaling the task. If the target receives no changes, verify the CDC prerequisites (e.g., MySQL binlog format set to ROW). Primary-key conflicts in the log usually indicate a configuration change.

Still stuck? [Contact support](../appendix/support.md).


## How to handle TapData alerts

TapData sends alerts by [email](../case-practices/best-practice/alert-via-qqmail.md). Use the subject line to pick the right playbook below.

**Task-state alerts**

| Alert | What it means | What to do |
| --- | --- | --- |
| **Task error** | Task stopped; replication is down. | Open the task → Logs, fix the issue, restart. Escalate if stuck. |
| **Full load finished** | Bulk copy is done. | Info only. Run a data-validate task if you need a checksum. |
| **Incremental started** | Task is now streaming changes. | Info only. |
| **Task stopped** | Someone clicked Stop. | Restart if it was accidental. |

**Replication-lag alert**

Lag exceeds the threshold you set. Open the task monitor and look for:

- **Slow source reads** – “Read time” is high → ask the DBA to check load or network.
- **Slow target writes / high QPS** – raise “Incremental read size” (≤1,000) and “Batch write size” (≤10,000); keep Agent memory below 70%.
- **False lag** – QPS is 0 but lag still climbs → enable [heartbeat table](../case-practices/best-practice/heart-beat-task.md) on the source.
- **Slow engine** – “Process time” keeps rising → optimise JS code or open a ticket.
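The triage order above can be sketched as a small decision function. The field names and the read-time threshold are assumptions chosen for illustration; they are not actual TapData metrics or defaults:

```python
# Hedged sketch of the lag-triage checklist above. Inputs and the
# 1000 ms read-time threshold are illustrative assumptions.
def diagnose_lag(read_time_ms: float, process_time_trend: str,
                 qps: float, lag_seconds: float) -> str:
    if qps == 0 and lag_seconds > 0:
        # QPS is 0 but lag still climbs: likely false lag
        return "false lag: enable a heartbeat table on the source"
    if read_time_ms > 1000:
        return "slow source reads: ask the DBA to check load or network"
    if process_time_trend == "rising":
        return "slow engine: optimise JS code or open a ticket"
    return "slow target writes: tune read/write batch sizes"
```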

**Validation & performance alerts**

| Alert | What it means | What to do |
| --- | --- | --- |
| **Validation diff** | Incremental compare found mismatches. | If auto-repair is enabled, no action is needed; otherwise open the task and click **Repair**. |
| **Data-source node slow** | Source/target latency high. | If a lag alert also fired, treat it as “slow source reads” above; otherwise keep watching and involve the DBA if lag appears. |
| **Process node slow** | JS node is the bottleneck. | Optimise logic or open a ticket if lag follows. |
| **Validation job error** | Compare task crashed. | Doesn’t affect replication; restart the validation job. Escalate if it keeps failing. |
| **Count diff limit exceeded** | Row counts don’t match. | **Full-sync task**: switch to full-field compare to pinpoint rows. **Incremental task**: wait 1–2 lag cycles and re-validate; repair if the gap remains. |
| **Field diff limit exceeded** | Same as above but field-level. | Same playbook. |
| **Task retry limit** | Task retried and still failed. | Open the task, follow the error message; escalate if you can’t clear it. |
20 changes: 20 additions & 0 deletions docs/publish-apis/query/query-via-restful.md
@@ -57,3 +57,23 @@ If you'd prefer to use an external tool or automate API testing, [Postman](https
6. Click **Send**. You’ll get a real-time response from the API.

![Query Result](../../images/restful_api_query_result.png)


## Common response codes

| Code | Message | Meaning |
| --- | --- | --- |
| 200 | OK | Request succeeded |
| 401 | Unauthorized error: token expired | Token expired; generate a new one |
| 404 | Not Found error: endpoint not found | API does not exist or is not yet published—check the URL or wait for the publish to finish |
| 429 | Rate limit exceeded. Maximum \${api limit} requests per second allowed | You hit the rate limit; retry later or raise the limit in the API settings |
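A client can act on these codes mechanically. The sketch below shows one possible retry policy; the exponential backoff and attempt limit are assumptions for illustration, not behavior mandated by the API:

```python
# Illustrative handler for the response codes documented above.
# The backoff policy and max_attempts default are assumptions.
def next_action(status: int, attempt: int, max_attempts: int = 5) -> str:
    if status == 200:
        return "done"
    if status == 401:
        return "refresh token and retry"
    if status == 404:
        return "check the endpoint URL or wait for publishing"
    if status == 429 and attempt < max_attempts:
        # exponential backoff: 1 s, 2 s, 4 s, ...
        return f"retry after {2 ** attempt}s"
    return "give up"
```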

## FAQ

* Q: The API takes too long to return data or times out

A: Add indexes on every column used in `WHERE`, `ORDER BY`, or joins. If the delay persists, enable response caching or increase the query timeout in the API settings.

* Q: The payload doesn’t look right

A: Check the data-source model and the underlying table; make sure the data is current and that any field-merging logic matches what you expect.
1 change: 1 addition & 0 deletions sidebars.js
@@ -73,6 +73,7 @@ const sidebars = {
'connectors/warehouses-and-lake/gaussdb',
'connectors/warehouses-and-lake/greenplum',
'connectors/warehouses-and-lake/hudi',
'connectors/warehouses-and-lake/paimon',
'connectors/warehouses-and-lake/selectdb',
'connectors/warehouses-and-lake/starrocks',
'connectors/warehouses-and-lake/tablestore',