diff --git a/docs/products/clickhouse.md b/docs/products/clickhouse.md index 5bc7e3a5b..0dec3c93b 100644 --- a/docs/products/clickhouse.md +++ b/docs/products/clickhouse.md @@ -2,118 +2,113 @@ title: Aiven for ClickHouse® --- -import DocCardList from '@theme/DocCardList'; import RelatedPages from "@site/src/components/RelatedPages"; -Aiven for ClickHouse® is a fully managed distributed columnar database based on open source ClickHouse - a fast, resource effective solution tailored for data warehouse and generation of real-time analytical data reports using advanced SQL queries. - -Discover Aiven for ClickHouse's key features and attributes which let you focus -on turning business data into actionable insights. - -ClickHouse is a highly scalable fault-tolerant database designed for online analytical -processing (OLAP) and data warehousing. Aiven for ClickHouse enables you -to execute complex SQL queries on large datasets effectively -to process large amounts of data in real time. On top of that, it -supports built-in data integrations for [Aiven for Kafka®](/docs/products/kafka) and [Aiven for -PostgreSQL®](/docs/products/postgresql). - -## Effortless setup - -With the managed ClickHouse service, you can offload on Aiven multiple -time-consuming and laborious operations on your data infrastructure: -database initialization and configuration, cluster provisioning and -management, or your infrastructure maintenance and monitoring are off -your shoulders. - -**Pre-configured settings:** The managed ClickHouse service is -pre-configured with a rational set of parameters and settings -appropriate for the plan you have selected. - -## Easy management - -- **Scalability:** You can seamlessly - [scale your ClickHouse cluster](/docs/platform/howto/scale-services) horizontally or vertically as your data and needs change - using the pre-packaged plans. 
Aiven for ClickHouse also supports - [sharding](/docs/products/clickhouse/howto/use-shards-with-distributed-table) as a horizontal cluster scaling strategy. -- **Resource tags:** You can assign metadata to your services in the - form of tags. They help you organize, search, and filter Aiven - resources. You can - [tag your service](/docs/platform/howto/tag-resources) by purpose, owner, environment, or any other criteria. -- **Forking:** Forking an Aiven for ClickHouse service creates a new - database service containing the latest snapshot of an existing - service. Forks don't stay up-to-date with the parent database, but - you can write to them. It provides a risk-free way of working with - your production data and schema. For example, you can use them to - test upgrades, new schema migrations, or load test your app with a - different plan. Learn how to - [fork an Aiven service](/docs/platform/concepts/service-forking). - -## Effective maintenance - -- **Automatic maintenance updates:** With 99.99% SLA, Aiven makes sure - that the ClickHouse software and the underlying platform stays - up-to-date with the latest patches and updates with zero downtime. - You can set - [maintenance windows](/docs/platform/concepts/maintenance-window) for your service to make sure the changes occur during - times that do not affect productivity. -- **Backups and disaster recovery:** Aiven for ClickHouse has - automatic backups taken every 24 hours. The retention period depends - on your plan tier. See the details on [Plan - comparison](https://aiven.io/pricing?product=clickhouse&tab=plan-comparison). - -## Intelligent observability - -- **Service health monitoring:** Aiven for ClickHouse provides metrics - and logs for your cluster at no additional charge. 
You can enable - pre-integrated Aiven observability services, such as Aiven for Grafana®, Aiven for - Metrics, or Aiven for OpenSearch® or push available metrics and logs to external - observability tools, such as Prometheus, AWS CloudWatch, or Google - Cloud Logging. For more details, see - [Monitor Aiven for ClickHouse metrics](/docs/products/clickhouse/howto/monitor-performance). -- **Notifications and alerts:** The service is pre-configured to alert - you on, for example, your disk running out of space or CPU - consumption running high when resource usage thresholds are - exceeded. Email notifications are sent to admins and technical - contacts of the project under which your service is created. Check - [Receive technical notifications](/docs/platform/howto/technical-emails) to learn how you can sign up for such alerts. - -## Security and compliance - -- **Single tenancy:** Your service runs on dedicated instances. - This offers true data isolation that contributes to the optimal - protection and an increased security. -- **Network isolation:** Aiven platform supports VPC peering as a - mechanism for connecting directly to your ClickHouse service via - private IP. This provides a more secure network setup. The platform - also supports PrivateLink connectivity. -- **Regulatory compliance:** ClickHouse runs on Aiven platform that is - ISO 27001:2013, SOC2, GDPR, HIPAA, and PCI/DSS compliant. - -- **Role based Access Control (RBAC)**. To learn what kind of granular access - is possible in Aiven for ClickHouse, see - [RBAC with Zookeeper](/docs/products/clickhouse/concepts/service-architecture#zookeeper). - -- **Zero lock-in:** Aiven for ClickHouse offers compatibility with - open source software (OSS), which protects you from software and - vendor lock-in. You can migrate between clouds and regions. - -See more details on security and compliance in Aiven for -ClickHouse in -[Secure a managed ClickHouse® service](/docs/products/clickhouse/howto/secure-service). 
- -## Devops-friendly tools - -- **Automation:** [Aiven Provider for - Terraform](https://registry.terraform.io/providers/aiven/aiven/latest/docs) - helps you automate the orchestration of your ClickHouse clusters. -- **Command-line tooling:** - [Aiven CLI](/docs/tools/cli) client - provides greater flexibility of use for proficient administrators - allowing scripting repetitive actions with ease. -- **REST APIs:** [Aiven APIs](/docs/tools/api) allow you to manage Aiven resources in a programmatic - way using HTTP requests. The whole functionality available via Aiven - Console is also available via APIs enabling you to build custom - integrations with ClickHouse and the Aiven platform. +Aiven for ClickHouse® is a fully managed distributed columnar database service based on the open source ClickHouse engine. +It is designed for online analytical processing (OLAP), data warehousing, and real-time +analytics that require fast SQL queries on large datasets. + +ClickHouse is optimized for analytical workloads. Unlike transactional (OLTP) databases +that prioritize frequent row-level updates, ClickHouse is built for complex read queries +and large-scale aggregations across millions or billions of rows. + +ClickHouse uses a columnar storage model. Data is stored by column instead of by row, +so queries read only the columns required for a specific operation. This reduces disk +I/O, improves compression efficiency, and accelerates aggregation queries. For more on +how ClickHouse indexes and processes data, see [Indexing and data processing](/docs/products/clickhouse/concepts/indexing). + +Aiven manages infrastructure, configuration, upgrades, and maintenance so you can focus +on working with your data. 
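+
+As an illustration of the analytical queries this storage model favors, consider an
+aggregation that reads only two columns from a wide table. This is a sketch only; the
+`events` table and its columns are hypothetical:
+
+```sql
+-- Scans only the `country` and `revenue` columns, not entire rows,
+-- so disk I/O stays proportional to the columns used.
+SELECT
+    country,
+    sum(revenue) AS total_revenue
+FROM events
+GROUP BY country
+ORDER BY total_revenue DESC;
+```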
+
+## When to use Aiven for ClickHouse®
+
+Aiven for ClickHouse is suitable for workloads such as:
+
+- Event and log analytics
+- Time-series data analysis
+- Reporting and dashboards
+- Large-scale aggregations
+- Real-time analytics pipelines
+- Data warehousing
+
+These workloads typically involve infrequent updates and large read and
+aggregation operations. Aiven for ClickHouse is not intended for
+high-frequency transactional applications that require frequent
+row-level updates.
+
+## Service management
+
+Aiven manages your ClickHouse cluster, including:
+
+- Service provisioning
+- Cluster configuration
+- Software updates
+- Infrastructure maintenance
+
+Each service runs as a distributed, fault-tolerant cluster. Configuration defaults are
+applied based on the selected service plan.
+
+## Scalability
+
+You can scale your service as data volume and query requirements grow:
+
+- Scale vertically by changing the service plan
+- Scale horizontally using shards and distributed tables
+
+Aiven supports rolling maintenance updates. You can configure maintenance windows to
+control when updates occur.
+
+For more information, see:
+
+- [Scale services](/docs/platform/howto/scale-services)
+- [Use shards with distributed tables](/docs/products/clickhouse/howto/use-shards-with-distributed-table)
+
+## Backups and recovery
+
+Aiven for ClickHouse includes automatic backups. Backup retention depends on the selected
+service plan.
+
+You can restore a service from a backup or fork a service to create a new independent
+service from the latest backup. Forking is useful for testing upgrades, schema changes,
+or performance validation.
+
+For details, see [Service forking](/docs/platform/concepts/service-forking).
+
+## Monitoring
+
+Aiven provides service metrics and logs.
+ +You can integrate with: + +- Aiven for Grafana® +- Aiven for Metrics +- Aiven for OpenSearch® +- Prometheus +- AWS CloudWatch +- Google Cloud Logging + +For details, see [Monitor Aiven for ClickHouse metrics](/docs/products/clickhouse/howto/monitor-performance). + +## Security and network access + +Each service runs on isolated infrastructure. + +The Aiven platform supports: + +- TLS-encrypted connections +- VPC peering +- PrivateLink connectivity + +For configuration steps, see [Secure a managed ClickHouse® service](/docs/products/clickhouse/howto/secure-service). + +## Automation + +You can manage Aiven for ClickHouse using: + +- [Aiven Provider for Terraform](https://registry.terraform.io/providers/aiven/aiven/latest/docs) +- [Aiven CLI](/docs/tools/cli) +- [Aiven API](/docs/tools/api) + +Most operations available in the Aiven Console are also available through the CLI and API. diff --git a/docs/products/clickhouse/concepts/columnar-databases.md b/docs/products/clickhouse/concepts/columnar-databases.md deleted file mode 100644 index c9f55c998..000000000 --- a/docs/products/clickhouse/concepts/columnar-databases.md +++ /dev/null @@ -1,38 +0,0 @@ ---- -title: ClickHouse® as a columnar database ---- - -ClickHouse® is a columnar databases that handles data with specific benefits. - -## Fast data reading - -Compared to traditional row-oriented solutions, columnar database -management systems store data tables by columns to provide better -performance and efficiency in certain applications. As a truly columnar -database, ClickHouse® also stores the values of the same column -physically next to each other. This further increases the speed of -retrieving the values of a column. However, it also makes it slower to -retrieve complete rows, as the values of a single row are stored across -different physical locations. 
- -## Enhanced query performance - -Storing the data of each column independently minimizes disk access and -improves query performance by reading only the data columns that are -relevant to a specific query. - -## Data compression and queries aggregation - -This storage approach also provides better options for data compression, -for example, by the ability to better utilize similarities between -adjacent data. Columnar databases are also better at aggregating queries -involving large data sets. - -## Massive and complex read operations - -Columnar databases such as ClickHouse are therefore best suited for -analytical applications that require big data processing or data -warehousing, as these usually involve fewer write operations but more - -or more complex - read operations that focus on subsets of the stored -data. However, applications where queries mainly affect entire rows in -the data tables are less efficient in columnar databases. diff --git a/docs/products/clickhouse/concepts/data-integration-overview.md b/docs/products/clickhouse/concepts/data-integration-overview.md index 94f762ff9..da786187e 100644 --- a/docs/products/clickhouse/concepts/data-integration-overview.md +++ b/docs/products/clickhouse/concepts/data-integration-overview.md @@ -24,7 +24,7 @@ There are a few ways of classifying integration types supported in Aiven for Cli Aiven for ClickHouse supports observability integrations and data source integrations, which have different purposes: -- [Observability integration](/docs/products/clickhouse/howto/list-integrations) is +- [Observability integration](/docs/products/clickhouse/howto/list-integrate) is connecting to other services (either Aiven-managed or external ones) to expose and process logs and metrics. - Data service integration is connecting to other services (either Aiven-managed or external) @@ -133,4 +133,4 @@ needed in real-time or near-real-time. 
- [Set up Aiven for ClickHouse® data service integration](/docs/products/clickhouse/howto/data-service-integration) - [Manage Aiven for ClickHouse® integration databases](/docs/products/clickhouse/howto/integration-databases) -- [Integrate your Aiven for ClickHouse® service](/docs/products/clickhouse/howto/list-integrations) +- [Integrate your Aiven for ClickHouse® service](/docs/products/clickhouse/howto/list-integrate) diff --git a/docs/products/clickhouse/concepts/federated-queries.md b/docs/products/clickhouse/concepts/federated-queries.md deleted file mode 100644 index 6547d61bc..000000000 --- a/docs/products/clickhouse/concepts/federated-queries.md +++ /dev/null @@ -1,81 +0,0 @@ ---- -title: Querying external data in Aiven for ClickHouse® -sidebar_label: Federated queries ---- - -import RelatedPages from "@site/src/components/RelatedPages"; - -Discover federated queries and their capabilities in Aiven for ClickHouse® and how they simplify and speed up migrating into Aiven from external data sources. - -Federated queries allow communication between Aiven for ClickHouse and -S3-compatible object storages and web resources. The federated queries -feature in Aiven for ClickHouse enables you to read and pull data from -an external object storage that uses the S3 integration engine, or any -web resource accessible over HTTP. - -:::note -The federated queries feature in Aiven for ClickHouse is enabled by -default. -::: - -## Why use federated queries - -There are a few reasons why you might want to use federated queries: - -- Query remote data from your ClickHouse service. Ingest it into Aiven - for ClickHouse or only reference external data sources as part of an - analytics query. In the context of an increasing footprint of - connected data sources, federated queries can help you better - understand how your customers use your products. 
-- Simplify and speed up the import of your data into the Aiven for - ClickHouse instance from a legacy data source, avoiding a long and - sometimes complex migration path. -- Improve the migration of data in Aiven for ClickHouse, and extend - analysis over external data sources with a relatively low effort in - comparison to enabling distributed tables and [the remote and - remoteSecure - functionalities](https://clickhouse.com/docs/en/sql-reference/table-functions/remote). - -:::note -The `remote()` and `remoteSecure()` features are designed to read from -remote data sources or provide the ability to create a distributed table -across remote data sources but they are not designed to read from an -external S3 storage. -::: - -## How it works - -To run a federated query, the ClickHouse service user connecting to the -cluster requires grants to the S3 and/or URL sources. The main service -user is granted access to the sources by default, and new users can be -allowed to use the sources via the CREATE TEMPORARY TABLE grant, which -is required for both sources. - -For more information on how to enable new users to use the sources, -see [Prerequisites](/docs/products/clickhouse/howto/run-federated-queries#prerequisites). - -Federated queries read from external S3-compatible object storage -utilizing the ClickHouse S3 engine. Once you read from a remote -S3-compatible storage, you can select from that storage and insert into -a table in the Aiven local instance, enabling migration of data into -Aiven. - -For more details on how to run federated queries in Aiven for ClickHouse, -see -[Read and pull data from S3 object storages and web resources over HTTP](/docs/products/clickhouse/howto/run-federated-queries). - -## Limitations - -- Federated queries in Aiven for ClickHouse only support S3-compatible - object storage providers for the time being. More external data - sources coming soon. -- Virtual tables are only supported for URL sources, using the URL - table engine. 
Stay tuned for us supporting the S3 table engine in - the future. - - - -- [Read and pull data from S3 object storages and web resources over HTTP](/docs/products/clickhouse/howto/run-federated-queries) -- [Integrating S3 | ClickHouse Docs](https://clickhouse.com/docs/en/integrations/s3) -- [remote, remoteSecure | ClickHouse Docs](https://clickhouse.com/docs/en/sql-reference/table-functions/remote) -- [Cloud Compatibility | ClickHouse Docs](https://clickhouse.com/docs/en/whats-new/cloud-compatibility#federated-queries) diff --git a/docs/products/clickhouse/concepts/indexing.md b/docs/products/clickhouse/concepts/indexing.md index b0df12b8f..3af4395e9 100644 --- a/docs/products/clickhouse/concepts/indexing.md +++ b/docs/products/clickhouse/concepts/indexing.md @@ -1,5 +1,6 @@ --- title: Indexing and data processing in ClickHouse® +sidebar_label: Indexes --- import RelatedPages from "@site/src/components/RelatedPages"; @@ -15,8 +16,6 @@ the columns you read, the faster and more efficient the performance of the request. If you have to read many or all columns, using a columnar database becomes a less effective approach. -Read more about characteristics of columnar databases and their features -in [Columnar databases](columnar-databases). ## Reading data in blocks @@ -57,11 +56,11 @@ limitations of ClickHouse. A primary key, as used in ClickHouse, does not ensure uniqueness for a single searched item since only every ten thousandth item is indexed. You need to iterate over thousands of items to find a specific row, which makes this approach inadequate when -working with individual rows and suitable for processing millions or +working with individual rows, and suitable for processing millions or trillions of items. :::note[Example] -When analysing error rates based on a server log analysis, you don't +When analyzing error rates based on server log analysis, you don't focus on individual lines but look at the overall picture to see trends. 
Such requests allow approximate calculations using only a sample of data to draw conclusions. @@ -79,18 +78,18 @@ ClickHouse](https://clickhouse.com/docs/en/engines/table-engines/mergetree-famil ## ClickHouse data skipping indexes Although skipping indexes are used in ClickHouse as secondary indexes, -they work quite differently to secondary indexes used in other DBMSs. +they work differently from secondary indexes used in other DBMSs. Skipping indexes help boost performance by skipping some irrelevant rows in advance, when it can be predicted that these rows do not satisfy query conditions. :::note[Example] -You have numeric column *number of page visits* and run a query to +You have a numeric column `number of page visits` and run a query to select all rows where page visits are over 10000. To speed up such a query, you can add a skipping index to store extremes of the field and -help ClickHouse to skip in advance values that do not satisfy the -request condition. +help ClickHouse skip in advance values that do not satisfy the query +condition. ::: diff --git a/docs/products/clickhouse/concepts/olap.md b/docs/products/clickhouse/concepts/olap.md deleted file mode 100644 index af9ead654..000000000 --- a/docs/products/clickhouse/concepts/olap.md +++ /dev/null @@ -1,20 +0,0 @@ ---- -title: Online analytical processing ---- - -Online analytical processing (OLAP) is an approach to producing -real-time reports and insights, usually based on large amounts of source -data. - -A major function of OLAP tools is to provide data-based intelligence to -inform and support business decisions. The target data usually includes -operational activities that are then analyzed for use by various -functions, such as sales, marketing, and finance. - -From a technical perspective, most traditional, row-oriented database -systems are better suited to online transactional processing (OLTP), but -are not very efficient for OLAP scenarios. 
Column-oriented databases are -more efficient, as they provide quicker access to subsets of the stored -data. However, most database systems aim for a hybrid approach that -primarily focuses on optimizing either OLAP or OLTP scenarios, while -offering solutions to support the alternative scenario. diff --git a/docs/products/clickhouse/concepts/service-architecture.md b/docs/products/clickhouse/concepts/service-architecture.md index 348e9bc19..c5297e7a1 100644 --- a/docs/products/clickhouse/concepts/service-architecture.md +++ b/docs/products/clickhouse/concepts/service-architecture.md @@ -3,8 +3,9 @@ title: Aiven for ClickHouse® service architecture sidebar_label: Service architecture --- -Aiven for ClickHouse® is implemented as a multi-master cluster where the replication of data is managed by ClickHouse itself, the replication of schema and users is managed by ZooKeeper, and data backup and restoration is managed by Astacus. -Discover the technical design behind Aiven for ClickHouse. +Aiven for ClickHouse® runs as a multi-master cluster. ClickHouse manages +data replication, ZooKeeper manages replication of schema and users, and +Astacus manages backup and restore. ## Deployment modes @@ -17,12 +18,10 @@ shard of three nodes, or multiple shards of three nodes each. - With multiple shards, the data is split between all shards and the data of each shard is present in all nodes of the shard. -Each Aiven for ClickHouse service is exposed as a single server URL -pointing to all servers with connections going randomly to any of the -servers. ClickHouse is responsible for replicating the writes between -the servers. For synchronizing critical low-volume information between -servers, Aiven for ClickHouse relies on -[ZooKeeper](#zookeeper), which runs on +Aiven exposes each Aiven for ClickHouse service as a single server URL. Connections +go to any of the servers in the cluster. ClickHouse replicates writes +between servers. 
For synchronizing schema and similar metadata between
+servers, Aiven for ClickHouse uses [ZooKeeper](#zookeeper), which runs on
 each ClickHouse server.

 ## Coordinating services

@@ -31,81 +30,70 @@
 Each Aiven for ClickHouse node runs ClickHouse, ZooKeeper, and Astacus.

 ### ZooKeeper

-ZooKeeper is responsible for the cross-nodes coordination and
-synchronization of the following replication processes:
+ZooKeeper coordinates and synchronizes the following across nodes:

-- Replication of database changes across the cluster: CREATE, UPDATE,
-  or ALTER TABLE (by ClickHouse's `Replicated`
-  [database engine](/docs/products/clickhouse/concepts/service-architecture#replicated-database-engine))
+- **Database schema**: DDL statements such as CREATE TABLE and ALTER TABLE,
+  via ClickHouse's `Replicated`
+  [database engine](/docs/products/clickhouse/concepts/service-architecture#replicated-database-engine).

-- Replication of table data across the cluster (by ClickHouse's
-  `ReplicatedMergeTree`
-  [table engine](/docs/products/clickhouse/concepts/service-architecture#replicated-table-engine)). Data itself is not written to ZooKeeper but
-  transferred directly between ClickHouse servers.
+- **Table data**: Via ClickHouse's `ReplicatedMergeTree`
+  [table engine](/docs/products/clickhouse/concepts/service-architecture#replicated-table-engine).
+  Data is transferred directly between ClickHouse servers, not through
+  ZooKeeper.

-- Replication of the storage of Users, Roles, Quotas, Row Policies for
-  the whole cluster (by ClickHouse).
+- **Users, roles, quotas, and row policies**: Stored and replicated
+  across the cluster by ClickHouse.

 :::note
-  Storing entities such as Users, Roles, Quotas, and Row Policies in
-  ZooKeeper ensures that Role Based Access Control (RBAC) is applied
-  consistently over the entire cluster. This type of entity storage
-  was developed at Aiven and is now part of the upstream ClickHouse.
+ Storing users, roles, quotas, and row policies in ZooKeeper ensures + that role-based access control (RBAC) is applied consistently across + the cluster. This approach was developed at Aiven and is now part of + upstream ClickHouse. ::: -ZooKeeper handles one process per node and is accessible only from -within the cluster. +ZooKeeper runs one process per node and is only accessible from within the cluster. ### Astacus {#astacus-os} -[Astacus](https://github.com/aiven/astacus) is an open-source project -originated at Aiven for coordinating backups of cluster databases, -including ClickHouse. +[Astacus](https://github.com/aiven/astacus) is an open-source project from +Aiven that coordinates backups of cluster databases, including ClickHouse. ## Data architecture -Aiven for ClickHouse enforces +Aiven for ClickHouse enforces: -- Full schema replication: all databases, tables, users, and grants - are the same on all nodes. -- Full data replication: table rows are the same on all nodes within a - shard. +- **Full schema replication**: All databases, tables, users, and + grants are the same on all nodes. +- **Full data replication**: Table rows are the same on all nodes + within a shard. -[Astacus](/docs/products/clickhouse/concepts/service-architecture#astacus-os) does most of the -backup and restore operations. +[Astacus](/docs/products/clickhouse/concepts/service-architecture#astacus-os) +performs most backup and restore operations. ## Engines: database and table -ClickHouse has engines in two flavors: table engines and database -engines. +ClickHouse uses two kinds of engines: database engines and table engines. -- Database engine is responsible for manipulating tables and decides - what happens when you try to list, create, or delete a table. It can - also be used to restrict a database to specific table engine or to - manage replication. 
-- Table engine decides how to store data on a disk or how to enable
-  reading data from outside the disk to expose it as a virtual table.
+- **Database engine**: Controls how tables are listed, created, and
+  deleted. It can restrict which table engines a database allows and
+  can manage replication.
+- **Table engine**: Controls how data is stored on disk or read from
+  external sources (exposed as a virtual table).

 ### `Replicated` database engine

-The default ClickHouse database engine is the Atomic engine, responsible
-for creating table metadata on the disk and configuring which table
-engines are allowed in each database.
+The default ClickHouse database engine is Atomic. It creates table
+metadata on disk and configures which table engines each database
+allows.

-Aiven for ClickHouse uses the `Replicated` database engine, which is a
-variant of Atomic. With this engine variant, queries for creating,
-updating, or altering tables are replicated to all other servers using
-[ZooKeeper](#zookeeper). As a result, all
-servers can have the same table schema, which makes them an actual data
-cluster and not multiple independent servers that can talk to each
-other.
+Aiven for ClickHouse uses the `Replicated` database engine, a variant of
+Atomic. This engine replicates DDL statements such as CREATE TABLE and
+ALTER TABLE to all servers via [ZooKeeper](#zookeeper). All servers share
+the same table schema, so they act as one cluster rather than separate
+servers.

 ### `Replicated` table engine

-The table engine is responsible for the INSERT and SELECT queries. From
-a wide variety of available table engines, the most common ones belong
-to the `MergeTree` engines family, which is supported in Aiven for
-ClickHouse.
+The table engine handles INSERT and SELECT. The most common engines in
+Aiven for ClickHouse are in the `MergeTree` family.
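+
+For example, you create a `MergeTree` table with an ordinary DDL statement, and
+Aiven for ClickHouse replicates it for you. The table name and columns below are
+illustrative only:
+
+```sql
+-- Hypothetical example table. Aiven for ClickHouse rewrites the
+-- MergeTree engine to ReplicatedMergeTree behind the scenes, so the
+-- table is replicated across all nodes of the shard.
+CREATE TABLE page_views (
+    event_time DateTime,
+    url String,
+    visits UInt64
+)
+ENGINE = MergeTree
+ORDER BY event_time;
+```
+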
For a list of all the table engines that you can use in Aiven for ClickHouse, see @@ -113,31 +101,25 @@ ClickHouse, see #### `MergeTree` engine -With the `MergeTree` engine, at least one new file is created for each -INSERT query and each new file is written once and never modified. In -the background, new files (called *parts*) are re-read, merged, and -rewritten into compact form. Writing data in parts determines the -performance profile of ClickHouse. +With the `MergeTree` engine, each INSERT creates at least one new file. +Each file is written once and never modified. In the background, these +files (called _parts_) are merged and rewritten into a compact form. This +write pattern drives ClickHouse performance. -- INSERT queries need to be batched to avoid handling a number of - small parts. -- UPDATE and DELETE queries need to be batched. Removing or updating a - single row requires rewriting an entire part with all the rows - except the one we want to remove or update. -- SELECT queries are executed rapidly because all the data found in a - part is valid and all files can be cached since they never change. +- **INSERT**: Batch inserts to avoid creating many small parts. +- **UPDATE and DELETE**: Batch these operations. Updating or removing a + single row requires rewriting the whole part that contains it (all + other rows in that part are kept). +- **SELECT**: Runs quickly because data in a part is immutable and + files can be cached. #### `ReplicatedMergeTree` engine -Each engine of the `MergeTree` family has a matching -`ReplicatedMergeTree` engine, which additionally enables the replication -of all writes using [ZooKeeper](#zookeeper). The data itself doesn't travel through ZooKeeper and is -actually fetched from one ClickHouse server to another. A shared log of -update queries is maintained with ZooKeeper. All nodes add entries to -the queue and watch for changes to execute the queries. 
- -When a query to create a table using the `MergeTree` engine arrives, -Aiven for ClickHouse automatically rewrites the query to use the -`ReplicatedMergeTree` engine so that all tables are replicated and all -servers have the same table data, which in fact makes the group of -servers a high-availability cluster. +Each `MergeTree` engine has a matching `ReplicatedMergeTree` engine that +replicates writes using [ZooKeeper](#zookeeper). Data is copied directly +between ClickHouse servers; ZooKeeper only keeps a shared log of update +operations. Nodes add entries to the log and watch for changes to apply. + +When you create a table with a `MergeTree` engine, Aiven for ClickHouse +rewrites the query to use `ReplicatedMergeTree`. All tables are replicated +so every server has the same data, forming a high-availability cluster. diff --git a/docs/products/clickhouse/concepts/strings.md b/docs/products/clickhouse/concepts/strings.md index d4b1d34e9..01a213fad 100644 --- a/docs/products/clickhouse/concepts/strings.md +++ b/docs/products/clickhouse/concepts/strings.md @@ -1,5 +1,6 @@ --- title: String data type in Aiven for ClickHouse® +sidebar_label: String data type --- Aiven for ClickHouse® uses ClickHouse® databases, which can store diverse types of data, such as strings, decimals, booleans, or arrays. 
diff --git a/docs/products/clickhouse/get-started.md b/docs/products/clickhouse/get-started.md index fa1f64463..4c154a0a6 100644 --- a/docs/products/clickhouse/get-started.md +++ b/docs/products/clickhouse/get-started.md @@ -717,4 +717,4 @@ Once the data is loaded, you can run queries against the sample data you importe - [Secure an Aiven for ClickHouse® service](/docs/products/clickhouse/howto/secure-service) - [Manage Aiven for ClickHouse® users and roles](/docs/products/clickhouse/howto/manage-users-roles) - [Manage Aiven for ClickHouse® database and tables](/docs/products/clickhouse/howto/manage-databases-tables) -- [Integrate an Aiven for ClickHouse® service](/docs/products/clickhouse/howto/list-integrations) +- [Integrate an Aiven for ClickHouse® service](/docs/products/clickhouse/howto/list-integrate) diff --git a/docs/products/clickhouse/howto/clickhouse-query-cache.md b/docs/products/clickhouse/howto/clickhouse-query-cache.md index 3c894f0c4..e7b2a3737 100644 --- a/docs/products/clickhouse/howto/clickhouse-query-cache.md +++ b/docs/products/clickhouse/howto/clickhouse-query-cache.md @@ -1,6 +1,6 @@ --- title: Use query cache in Aiven for ClickHouse® -sidebar_label: Use query cache +sidebar_label: Query cache --- import RelatedPages from "@site/src/components/RelatedPages"; @@ -75,7 +75,7 @@ You can configure the following query cache settings: -- [Querying external data in Aiven for ClickHouse®](/docs/products/clickhouse/concepts/federated-queries) +- [Query external data using federated queries in Aiven for ClickHouse®](/docs/products/clickhouse/howto/run-federated-queries) - [Query Aiven for ClickHouse® databases](/docs/products/clickhouse/howto/query-databases) - [Fetch query statistics for Aiven for ClickHouse®](/docs/products/clickhouse/howto/fetch-query-statistics) - [Create dictionaries in Aiven for ClickHouse®](/docs/products/clickhouse/howto/create-dictionary) diff --git a/docs/products/clickhouse/howto/connect-to-grafana.md 
b/docs/products/clickhouse/howto/connect-to-grafana.md index bd92f4fbc..f9a9ad2a7 100644 --- a/docs/products/clickhouse/howto/connect-to-grafana.md +++ b/docs/products/clickhouse/howto/connect-to-grafana.md @@ -1,5 +1,6 @@ --- title: Visualize ClickHouse® data with Grafana® +sidebar_label: Visualize data with Grafana --- You can visualise your ClickHouse® data using Grafana® and Aiven can diff --git a/docs/products/clickhouse/howto/connect-with-clickhouse-cli.md b/docs/products/clickhouse/howto/connect-with-clickhouse-cli.md index 61a07e931..4ea06ab51 100644 --- a/docs/products/clickhouse/howto/connect-with-clickhouse-cli.md +++ b/docs/products/clickhouse/howto/connect-with-clickhouse-cli.md @@ -2,92 +2,75 @@ title: Connect to Aiven for ClickHouse® with clickhouse-client sidebar_label: clickhouse-client --- +import ConsoleLabel from "@site/src/components/ConsoleIcons"; -It's recommended to connect to a ClickHouse® cluster with the ClickHouse® client. +Connect to Aiven for ClickHouse® with the `clickhouse-client` CLI. Install via Docker, connect with your credentials, and run SQL interactively or with `--query`. -## Use the ClickHouse® client +## Before you begin -To use the ClickHouse® client across different operating systems, we -recommend utilizing [Docker](https://www.docker.com/). You can get the -latest image of the ClickHouse server which contains the most recent -ClickHouse client directly from [the dedicated page in Docker -hub](https://hub.docker.com/r/clickhouse/clickhouse-server). +Get the following connection details from the **Connection information** +section on the page of your service in the +[Aiven Console](https://console.aiven.io/): -:::note -There are other installation options available for ClickHouse clients -for different operating systems. 
See them in [ClickHouse -local](https://clickhouse.com/docs/en/operations/utilities/clickhouse-local) -and [Install ClickHouse](https://clickhouse.com/docs/en/install) in the -official ClickHouse documentation. -::: +- **Host** +- **Port** +- **User** +- **Password** + +## Install the ClickHouse client -## Connection properties +You can use Docker to run `clickhouse-client` on any operating system. The +ClickHouse server image includes the client. -You will need to know the following properties to establish a secure -connection with your Aiven for ClickHouse service: **Host**, **Port**, -**User** and **Password**. You will find these in the **Connection -information** section on the **Overview** page of your service in the -[Aiven Console](https://console.aiven.io/). +Pull the latest [ClickHouse server image from Docker Hub](https://hub.docker.com/r/clickhouse/clickhouse-server). + +:::note +To install the client locally instead of using Docker, see the ClickHouse +documentation: [Install ClickHouse](https://clickhouse.com/docs/en/install) and +[clickhouse-local utility](https://clickhouse.com/docs/en/operations/utilities/clickhouse-local). +::: -## Command template +## Connect to your service -The command to connect to the service looks like this, substitute the -placeholders for `USERNAME`, `PASSWORD`, `HOST` and `PORT`: +Run the following command. Replace `USERNAME`, `PASSWORD`, `HOST`, and `PORT` +with your connection details: ```bash -docker run -it \ ---rm clickhouse/clickhouse-server clickhouse-client \ ---user USERNAME \ ---password PASSWORD \ ---host HOST \ ---port PORT \ ---secure +docker run -it --rm clickhouse/clickhouse-server clickhouse-client \ + --host HOST \ + --port PORT \ + --user USERNAME \ + --password PASSWORD \ + --secure ``` -This example includes the `-it` option (a combination of `--interactive` -and `--tty`) to take you inside the container and the `--rm` option to -automatically remove the container after exiting. 
+Options used:

-The other parameters, such as `--user`, `--password`, `--host`,
-`--port`, `--secure`, and `--query` are arguments accepted by the
-ClickHouse client. You can see the full list of command line options in
-[the ClickHouse CLI
-documentation](https://clickhouse.com/docs/en/interfaces/cli/#command-line-options).
+- `-it`: Runs the container in interactive mode.
+- `--rm`: Removes the container after exit.
+- `--secure`: Enables TLS (required for Aiven services).

-Once you're connected to the server, you can type queries directly
-within the client, for example, to see the list of existing databases,
-run
+For the full list of client options, see
+[ClickHouse CLI command-line options](https://clickhouse.com/docs/en/interfaces/cli/#command-line-options).

-```sql
-SHOW DATABASES
-```
+## Run queries

-Alternatively, sometimes you might want to run individual queries and be
-able to access the command prompt outside the docker container. In this
-case you can set `--interactive` and use `--query` parameter without
-entering the docker container:
+After you connect, you can run SQL statements directly in the client.
For example: -```bash -docker run --interactive \ ---rm clickhouse/clickhouse-server clickhouse-client \ ---user USERNAME \ ---password PASSWORD \ ---host HOST \ ---port PORT \ ---secure \ ---query="YOUR SQL QUERY GOES HERE" +```sql +SHOW DATABASES; ``` -Similar to above example, you can request the list of present databases -directly: +## Run a single query + +To run a single query and exit to your shell immediately, add the `--query` option: ```bash -docker run --interactive \ ---rm clickhouse/clickhouse-server clickhouse-client \ ---user USERNAME \ ---password PASSWORD \ ---host HOST \ ---port PORT \ ---secure \ ---query="SHOW DATABASES" +docker run --rm clickhouse/clickhouse-server clickhouse-client \ + --host HOST \ + --port PORT \ + --user USERNAME \ + --password PASSWORD \ + --secure \ + --query="SHOW DATABASES" ``` diff --git a/docs/products/clickhouse/howto/connect-with-jdbc.md b/docs/products/clickhouse/howto/connect-with-jdbc.md index beafc9639..603b01c74 100644 --- a/docs/products/clickhouse/howto/connect-with-jdbc.md +++ b/docs/products/clickhouse/howto/connect-with-jdbc.md @@ -1,5 +1,6 @@ --- title: Connect Aiven for ClickHouse® to external databases via JDBC +sidebar_label: External databases with JDBC --- You can use [ClickHouse JDBC diff --git a/docs/products/clickhouse/howto/copy-data-across-instances.md b/docs/products/clickhouse/howto/copy-data-across-instances.md index 481ce9073..4d1a5c078 100644 --- a/docs/products/clickhouse/howto/copy-data-across-instances.md +++ b/docs/products/clickhouse/howto/copy-data-across-instances.md @@ -1,5 +1,6 @@ --- title: Copy data from one ClickHouse® server to another +sidebar_label: Copy data between servers --- You can copy data from one ClickHouse® server to another using the `remoteSecure()` function. 
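+As a rough sketch of such a copy, assuming a target table that already exists with a
+matching structure (host, port, credentials, and table names below are placeholders):
+
+```sql
+-- Placeholder names throughout: copy all rows from a table on the source
+-- server into a local table of the same structure, over a TLS connection.
+INSERT INTO local_db.events
+SELECT *
+FROM remoteSecure('SOURCE_HOST:SOURCE_PORT', 'source_db.events', 'USERNAME', 'PASSWORD');
+```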
diff --git a/docs/products/clickhouse/howto/create-dictionary.md b/docs/products/clickhouse/howto/create-dictionary.md index ae232fa39..6c595d398 100644 --- a/docs/products/clickhouse/howto/create-dictionary.md +++ b/docs/products/clickhouse/howto/create-dictionary.md @@ -1,6 +1,6 @@ --- title: Create dictionaries in Aiven for ClickHouse® -sidebar_label: Create dictionaries +sidebar_label: Dictionaries --- Create dictionaries in Aiven for ClickHouse® to accelerate queries for better efficiency and performance. diff --git a/docs/products/clickhouse/howto/data-service-integration.md b/docs/products/clickhouse/howto/data-service-integration.md index 32dc53eb5..f18a5e685 100644 --- a/docs/products/clickhouse/howto/data-service-integration.md +++ b/docs/products/clickhouse/howto/data-service-integration.md @@ -1,6 +1,6 @@ --- title: Set up Aiven for ClickHouse® data source integrations -sidebar_label: Integrate with data source +sidebar_label: Integrate data sources --- import RelatedPages from "@site/src/components/RelatedPages"; @@ -9,16 +9,16 @@ import ConsoleLabel from "@site/src/components/ConsoleIcons"; import Tabs from '@theme/Tabs'; import TabItem from '@theme/TabItem'; -Connect your Aiven for ClickHouse® service with another Aiven-managed service or external data source to make your data available in the Aiven for ClickHouse service. +Connect your Aiven for ClickHouse® service to another Aiven service or an external data source to make data available in ClickHouse. ## Prerequisites - You are familiar with the limitations listed in [About Aiven for ClickHouse® data service integration](/docs/products/clickhouse/concepts/data-integration-overview#supported-data-source-types). -- You have an organization, a project, and an Aiven for ClickHouse service in Aiven. +- You have an organization, a project, and an Aiven for ClickHouse service. - You have access to the [Aiven Console](https://console.aiven.io/). 
-## Create Apache Kafka integrations +## Create an Apache Kafka integration :::tip Learn about [managed databases integrations](/docs/products/clickhouse/concepts/data-integration-overview#managed-databases-integration). @@ -35,21 +35,19 @@ Make Apache Kafka data available in Aiven for ClickHouse using the Kafka engine: 1. On the **Integrations** page, go to the **Data sources** section and click **Apache Kafka**. - The **Apache Kafka data source integration** wizard opens and displays a list of external - data sources or Aiven-managed data services available for integration. If there are - no data sources to integrate with, the wizard allows you to create them either by clicking - **Create service** (for Aiven-managed sources) or **Add external endpoint** (for external - sources). + The **Apache Kafka data source integration** wizard opens and displays available + data sources. If no data sources are listed, click **Create service** (for + Aiven-managed sources) or **Add external endpoint** (for external sources) to + create one. 1. In the **Apache Kafka data source integration** wizard: 1. Select a data source to integrate with, and click **Continue**. :::note - If a data source to integrate with is not available on the list, click one of the - following: - - **Create service**: to create an Aiven-managed data service to integrate with - - **Create external endpoint**: to make your external data source available for + If the data source is not in the list, click one of the following: + - **Create service**: Creates an Aiven-managed data service for integration + - **Create external endpoint**: Makes your external data source available for integration 1. Create tables where your Apache Kafka data will be available in Aiven @@ -74,7 +72,7 @@ Make Apache Kafka data available in Aiven for ClickHouse using the Kafka engine: 1. Click **Enable integration** > **Close**. 
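+Once the integration is enabled, the connector tables can be queried like regular
+tables. A hedged sketch follows; the database and table names are hypothetical, as the
+real names come from the integration wizard:
+
+```sql
+-- Hypothetical names: read a batch of messages from a Kafka-backed
+-- connector table created by the integration wizard. Reading from a
+-- Kafka engine table consumes the messages.
+SELECT *
+FROM service_my_kafka.my_topic_table
+LIMIT 10;
+```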
-## Create PostgreSQL integrations +## Create a PostgreSQL integration :::tip Learn about [managed databases integrations](/docs/products/clickhouse/concepts/data-integration-overview#managed-databases-integration). @@ -84,28 +82,26 @@ Make PostgreSQL data available in Aiven for ClickHouse using the PostgreSQL engi 1. Log in to the [Aiven Console](https://console.aiven.io/), and go to an organization and a project. -1. From , select an Aiven for ClickHouse service to integrate - with a data source. +1. From , select the Aiven for ClickHouse service to + integrate with a data source. 1. On the service's page, click in the sidebar. 1. On the **Integrations** page, go to the **Data sources** section and click **PostgreSQL**. The **PostgreSQL data source integration** wizard opens and displays a list of external - data sources or Aiven-managed data services available for integration. If there are no - data sources to integrate with, the wizard allows you to create them either by clicking - **Create service** (for Aiven-managed sources) or **Add external endpoint** (for external - sources). + data sources or Aiven-managed data services available for integration. If no data + sources are listed, click **Create service** (for Aiven-managed sources) or + **Add external endpoint** (for external sources) to create one. 1. In the **PostgreSQL data source integration** wizard: 1. Select a data source to integrate with, and click **Continue**. :::note - If a data source to integrate with is not available on the list, click one of the - following: - - **Create service**: to create an Aiven-managed data service to integrate with - - **Create external endpoint**: to make your external data source available for + If the data source is not in the list, click one of the following: + - **Create service**: Creates an Aiven-managed data service for integration + - **Create external endpoint**: Makes your external data source available for integration 1. 
Optionally, create databases where your PostgreSQL data will be available in Aiven
@@ -123,10 +119,10 @@ Make PostgreSQL data available in Aiven for ClickHouse using the PostgreSQL engi

     :::note
     You can
-     [create such integration databases](/docs/products/clickhouse/howto/integration-databases)
-     any time later, for example, by finding your integration on the **Integrations** page
-     and clicking  > .
-     ::::
+     [create integration databases](/docs/products/clickhouse/howto/integration-databases)
+     later. For example, find your integration on the **Integrations** page and
+     click  > .
+     :::

 1. Click **Enable integration** > **Close**.

@@ -139,7 +135,7 @@ Learn about [managed credentials integrations](/docs/products/clickhouse/concept

 [Set up a managed-credentials integration](/docs/products/clickhouse/howto/data-service-integration#create-managed-credentials-integrations)
 and
 [create tables](/docs/products/clickhouse/howto/data-service-integration#create-tables)
-for the data to be made available through the integration.
+to make data available through the integration.

 [Access your stored credentials](/docs/products/clickhouse/howto/data-service-integration#access-credentials-storage).

 ### Create managed-credentials integrations
@@ -153,30 +149,28 @@ for the data to be made available through the integration.

 1. On the **Integrations** page, go to the **Data sources** section and click
    **ClickHouse Credentials**.

-    The **ClickHouse credentials integration** wizard opens and displays a list of external
-    data sources or Aiven-managed data services available for integration. If there are no
-    data sources to integrate with, the wizard allows you to create them either by
-    clicking **Create service** (for Aiven-managed sources) or **Add external endpoint**
-    (for external sources).
+    The **ClickHouse credentials integration** wizard opens and displays a list of
+    external data sources or Aiven-managed data services available for integration.
If + no data sources are listed, click **Create service** (for Aiven-managed sources) + or **Add external endpoint** (for external sources) to create one. 1. In the **ClickHouse credentials integration** wizard: 1. Select a data source to integrate with. :::note - If a data source to integrate with is not available on the list, click one of the - following: - - **Create service**: to create an Aiven-managed data service to integrate with - - **Create external endpoint**: to make your external data source available for + If the data source is not in the list, click one of the following: + - **Create service**: Creates an Aiven-managed data service for integration + - **Create external endpoint**: Makes your external data source available for integration 1. Click **Enable integration**. 1. Optionally, click **Test connection** > **Open in query editor** > **Execute**. :::note[Alternative] - You can test the connection any time later by going to your Aiven for ClickHouse - service's **Integrations** page, finding the credentials integration, and clicking - > . + You can test the connection later from your Aiven for ClickHouse service's + **Integrations** page. Find the credentials integration and click + > . ::: 1. Click **Close**. @@ -185,7 +179,7 @@ for the data to be made available through the integration. 
Create tables using [table engines](/docs/products/clickhouse/reference/supported-table-engines), for -example the PostgreSQL engine: +example, the PostgreSQL engine: ```sql CREATE TABLE default.POSTGRESQL_TABLE_NAME @@ -205,8 +199,8 @@ systems, see the ### Access credentials storage -Depending on the type of data source you are integrated with, you can access your credentials -storage by passing your data source name in the following query: +Depending on your data source type, you can access your credentials storage by passing +your data source name in the following query: ```sql title="PostgreSQL data source" SELECT * @@ -235,16 +229,15 @@ SELECT * FROM s3( ``` :::warning -When you try to run a managed credentials query with a typo, the query fails with an -error message related to grants. +When you run a managed-credentials query with a typo, the query fails with an error +message related to grants. ::: ## View data source integrations 1. Log in to the [Aiven Console](https://console.aiven.io/), and go to an organization and a project. -1. From , select an Aiven for ClickHouse service to display - integrations for. +1. From , select an Aiven for ClickHouse service. 1. On the service's page, go to one of the following: - in the sidebar > **Integrations** @@ -253,24 +246,24 @@ error message related to grants. ## Stop data source integrations :::warning -By terminating a data source integration, you disconnect from the data source, which erases -all databases and configuration information from Aiven for ClickHouse. +When you terminate a data source integration, you disconnect from the data source. Aiven +for ClickHouse removes all related databases and configuration. ::: 1. Log in to the [Aiven Console](https://console.aiven.io/), and go to an organization and a project. -1. From , select an Aiven for ClickHouse service you - want to stop integrations for. -1. On the service's page, take one of the following courses of action: +1. 
From , select the Aiven for ClickHouse service where + you want to stop the integration. +1. On the service's page, do one of the following: - - Click > **Integrations**, find an - integration to be stopped, and click > + - Click > **Integrations**, find the + integration to stop, and click > . - - Click , find an integration to be stopped, + - Click , find the integration to stop, and click > . -Your integration is terminated and all the corresponding databases and configuration -information are deleted. +This terminates the integration and deletes all corresponding databases and +configuration. @@ -278,4 +271,4 @@ information are deleted. - [Managed credentials integration](/docs/products/clickhouse/concepts/data-integration-overview#managed-credentials-integration) - [Managed databases integration](/docs/products/clickhouse/concepts/data-integration-overview#managed-databases-integration) - [Manage Aiven for ClickHouse® integration databases](/docs/products/clickhouse/howto/integration-databases) -- [Integrate your Aiven for ClickHouse® service](/docs/products/clickhouse/howto/list-integrations) +- [Integrate your Aiven for ClickHouse® service](/docs/products/clickhouse/howto/list-integrate) diff --git a/docs/products/clickhouse/howto/fetch-query-statistics.md b/docs/products/clickhouse/howto/fetch-query-statistics.md index 9bc902f9d..782648711 100644 --- a/docs/products/clickhouse/howto/fetch-query-statistics.md +++ b/docs/products/clickhouse/howto/fetch-query-statistics.md @@ -1,6 +1,6 @@ --- title: Fetch query statistics for Aiven for ClickHouse® -sidebar_label: Fetch query statistics +sidebar_label: Query statistics --- import RelatedPages from "@site/src/components/RelatedPages"; diff --git a/docs/products/clickhouse/howto/integrate-kafka.md b/docs/products/clickhouse/howto/integrate-kafka.md index a245a81db..072f5f756 100644 --- a/docs/products/clickhouse/howto/integrate-kafka.md +++ b/docs/products/clickhouse/howto/integrate-kafka.md @@ -1,5 +1,6 @@ --- 
-title: Connect Apache Kafka® to Aiven for ClickHouse® +title: Integrate Apache Kafka® with Aiven for ClickHouse® +sidebar_label: Integrate with Kafka --- import Tabs from '@theme/Tabs'; @@ -61,13 +62,13 @@ Variables used to set up and configure the integration: To connect Aiven for ClickHouse and Aiven for Apache Kafka by enabling a data service integration, see -[Create data service integrations](/docs/products/clickhouse/howto/data-service-integration#create-apache-kafka-integrations). +[Create data service integrations](/docs/products/clickhouse/howto/data-service-integration#create-an-apache-kafka-integration). When you create the integration, a database is automatically added in your Aiven for ClickHouse. Its name is `service_KAFKA_SERVICE_NAME`, where `KAFKA_SERVICE_NAME` is the name of your Apache Kafka service. In this database, you create virtual connector tables, which is also a part of the -[integration creation in the Aiven Console](/docs/products/clickhouse/howto/data-service-integration#create-apache-kafka-integrations). +[integration creation in the Aiven Console](/docs/products/clickhouse/howto/data-service-integration#create-an-apache-kafka-integration). You can have up to 400 such tables for receiving and sending messages from multiple topics. ## Update integration settings diff --git a/docs/products/clickhouse/howto/integrate-postgresql.md b/docs/products/clickhouse/howto/integrate-postgresql.md index 16c5081fe..a4393af08 100644 --- a/docs/products/clickhouse/howto/integrate-postgresql.md +++ b/docs/products/clickhouse/howto/integrate-postgresql.md @@ -1,5 +1,6 @@ --- -title: Connect PostgreSQL® to Aiven for ClickHouse® +title: Integrate PostgreSQL® with Aiven for ClickHouse® +sidebar_label: Integrate with PostgreSQL --- You can integrate Aiven for ClickHouse® with either *Aiven for PostgreSQL* service located in the same project, or *an external PostgreSQL endpoint*. 
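+Either way, once the integration is in place, the PostgreSQL tables can be queried from
+ClickHouse with ordinary SQL. A minimal sketch, with placeholder database, table, and
+column names:
+
+```sql
+-- Placeholder names: count recent rows in a PostgreSQL table exposed
+-- through the integration database.
+SELECT count(*)
+FROM service_my_pg_defaultdb_public.orders
+WHERE created_at > now() - INTERVAL 7 DAY;
+```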
@@ -42,7 +43,7 @@ The following variables will be used later in the code snippets: To connect Aiven for ClickHouse and Aiven for PostgreSQL by enabling a data service integration, see -[Create data service integrations](/docs/products/clickhouse/howto/data-service-integration#create-postgresql-integrations). +[Create data service integrations](/docs/products/clickhouse/howto/data-service-integration#create-a-postgresql-integration). The newly created database name has the following format: `service_PG_SERVICE_NAME_PG_DATABASE_PG_SCHEMA`, for example, diff --git a/docs/products/clickhouse/howto/integration-databases.md b/docs/products/clickhouse/howto/integration-databases.md index d77b64482..2412f8363 100644 --- a/docs/products/clickhouse/howto/integration-databases.md +++ b/docs/products/clickhouse/howto/integration-databases.md @@ -1,5 +1,6 @@ --- title: Manage Aiven for ClickHouse® integration databases +sidebar_label: Manage integration databases --- import ConsoleLabel from "@site/src/components/ConsoleIcons"; @@ -181,4 +182,4 @@ Depending on what you intend to delete, select **Database** or **Table**. - [Manage Aiven for ClickHouse® data service integrations](/docs/products/clickhouse/howto/data-service-integration) -- [Integrate your Aiven for ClickHouse® service](/docs/products/clickhouse/howto/list-integrations) +- [Integrate your Aiven for ClickHouse® service](/docs/products/clickhouse/howto/list-integrate) diff --git a/docs/products/clickhouse/howto/list-backups-recovery.md b/docs/products/clickhouse/howto/list-backups-recovery.md new file mode 100644 index 000000000..a30674264 --- /dev/null +++ b/docs/products/clickhouse/howto/list-backups-recovery.md @@ -0,0 +1,10 @@ +--- +title: Backups and recovery in Aiven for ClickHouse® +--- + +Configure backups, restore from a backup, copy data between servers, and plan +for disaster recovery. 
+ +import DocCardList from '@theme/DocCardList'; + + diff --git a/docs/products/clickhouse/howto/list-connect-to-service.md b/docs/products/clickhouse/howto/list-connect-to-service.md index 23d60e538..83355e4ce 100644 --- a/docs/products/clickhouse/howto/list-connect-to-service.md +++ b/docs/products/clickhouse/howto/list-connect-to-service.md @@ -4,6 +4,7 @@ title: Connect to Aiven for ClickHouse® import DocCardList from '@theme/DocCardList'; -Connect to the Aiven for ClickHouse® service using various programming languages or tools. +Connect applications and tools to Aiven for ClickHouse®. +Configure authentication, network access, and client settings to establish a secure connection. diff --git a/docs/products/clickhouse/howto/list-integrate.md b/docs/products/clickhouse/howto/list-integrate.md new file mode 100644 index 000000000..6af55a32a --- /dev/null +++ b/docs/products/clickhouse/howto/list-integrate.md @@ -0,0 +1,14 @@ +--- +title: Integrate with Aiven for ClickHouse® +--- + +Connect ClickHouse to Grafana, Kafka, PostgreSQL, and other data sources. Use +data source integrations and integration databases. + +To manage integrations with Terraform, see [ClickHouse +examples](https://github.com/aiven/terraform-provider-aiven/tree/main/examples/clickhouse) +in the Aiven Terraform Provider repository. + +import DocCardList from '@theme/DocCardList'; + + diff --git a/docs/products/clickhouse/howto/list-integrations.md b/docs/products/clickhouse/howto/list-integrations.md deleted file mode 100644 index c6cff69f6..000000000 --- a/docs/products/clickhouse/howto/list-integrations.md +++ /dev/null @@ -1,13 +0,0 @@ ---- -title: Integrate your Aiven for ClickHouse® service ---- - -This section provides instructions on how to integrate your Aiven for -ClickHouse® service with other services or external databases. 
If you'd -like to use Terraform for this purpose, see integration examples in -Aiven Terraform Provider's -[GitHub repository](https://github.com/aiven/terraform-provider-aiven/tree/main/examples/clickhouse). - -import DocCardList from '@theme/DocCardList'; - - diff --git a/docs/products/clickhouse/howto/list-manage-cluster.md b/docs/products/clickhouse/howto/list-manage-cluster.md index c8070bebc..c56d8632f 100644 --- a/docs/products/clickhouse/howto/list-manage-cluster.md +++ b/docs/products/clickhouse/howto/list-manage-cluster.md @@ -1,5 +1,6 @@ --- title: Manage your Aiven for ClickHouse® cluster +sidebar_label: Manage cluster --- [Monitor a managed service](/docs/platform/howto/list-monitoring) diff --git a/docs/products/clickhouse/howto/list-manage-service.md b/docs/products/clickhouse/howto/list-manage-service.md new file mode 100644 index 000000000..903c81eb2 --- /dev/null +++ b/docs/products/clickhouse/howto/list-manage-service.md @@ -0,0 +1,10 @@ +--- +title: Manage service in Aiven for ClickHouse® +--- + +Secure your service, manage users and roles, operate your cluster, and understand +service limits. + +import DocCardList from '@theme/DocCardList'; + + diff --git a/docs/products/clickhouse/howto/list-monitor-performance.md b/docs/products/clickhouse/howto/list-monitor-performance.md new file mode 100644 index 000000000..5ac3969fc --- /dev/null +++ b/docs/products/clickhouse/howto/list-monitor-performance.md @@ -0,0 +1,10 @@ +--- +title: Monitor performance in Aiven for ClickHouse® +--- + +Monitor your service, fetch query statistics, and use metrics with Datadog, +Prometheus, or system tables. 
+ +import DocCardList from '@theme/DocCardList'; + + diff --git a/docs/products/clickhouse/howto/list-work-with-data.md b/docs/products/clickhouse/howto/list-work-with-data.md new file mode 100644 index 000000000..8a972e259 --- /dev/null +++ b/docs/products/clickhouse/howto/list-work-with-data.md @@ -0,0 +1,11 @@ +--- +title: Work with data in Aiven for ClickHouse® +--- + +Query, manage, and optimize data in Aiven for ClickHouse®. Run queries, design tables +and table structure, and use advanced features such as dictionaries, federated +queries, and the query cache. + +import DocCardList from '@theme/DocCardList'; + + diff --git a/docs/products/clickhouse/howto/materialized-views.md b/docs/products/clickhouse/howto/materialized-views.md index b0198d756..777d7f127 100644 --- a/docs/products/clickhouse/howto/materialized-views.md +++ b/docs/products/clickhouse/howto/materialized-views.md @@ -1,11 +1,13 @@ --- title: Create materialized views in ClickHouse® +sidebar_label: Materialized views --- import RelatedPages from "@site/src/components/RelatedPages"; Use materialized views to persist data from the Kafka® table engine. -One way of integrating your ClickHouse® service with Kafka® is using the Kafka® table engine, which enables, for example, inserting data into ClickHouse® from Kafka. +One way of integrating your ClickHouse® service with Kafka® is using the Kafka® table +engine, which enables, for example, inserting data into ClickHouse® from Kafka. In such a scenario, ClickHouse can read from a Kafka® topic directly. This is, however, one-time retrieval so the data cannot be re-read. 
When diff --git a/docs/products/clickhouse/howto/query-databases.md b/docs/products/clickhouse/howto/query-databases.md index 4e7ece548..0f035d98f 100644 --- a/docs/products/clickhouse/howto/query-databases.md +++ b/docs/products/clickhouse/howto/query-databases.md @@ -1,50 +1,39 @@ --- -title: Query Aiven for ClickHouse® databases -sidebar_label: Query a database +title: Run queries on Aiven for ClickHouse® +sidebar_label: Run queries keywords: [query log, query_log, log table] --- -Run a query against an Aiven for ClickHouse® database using a tool of your choice. +Run queries against an Aiven for ClickHouse® database using the query editor, the Play UI, or the [ClickHouse® client](/docs/products/clickhouse/howto/connect-with-clickhouse-cli). -To ensure data security, stability, and its proper replication, we equip -our managed Aiven for ClickHouse® service with specific features, some -of them missing from the standard ClickHouse offer. Aiven for -ClickHouse® takes care of running queries in the distributed mode over -the entire cluster. In the standard ClickHouse, the queries `CREATE`, -`ALTER`, `RENAME` and `DROP` only affect the server where they are run. -In contrast, we ensure the proper distribution across all cluster -machines behind the scenes. You don't need to remember using -`ON CLUSTER` for every query. +Aiven for ClickHouse runs queries in distributed mode across the entire cluster. In +standard ClickHouse, queries such as `CREATE`, `ALTER`, `RENAME`, and `DROP` affect only +the server where they run unless you explicitly add the `ON CLUSTER` clause. In +Aiven for ClickHouse, these queries are automatically distributed across the cluster. +You do not need to include `ON CLUSTER` in your queries. :::important -There are limitations on the number of concurrent queries and the number of concurrent -connections in Aiven for ClickHouse: +Aiven for ClickHouse limits concurrent queries and connections: -- `max_concurrent_queries` ranges from `25` to `400`. 
-- `max_concurrent_connections` ranges from `1000` to `4000`. +- `max_concurrent_queries`: `25` to `400` +- `max_concurrent_connections`: `1000` to `4000` -See -[Aiven for ClickHouse® limits and limitations](/docs/products/clickhouse/reference/limitations) +See [Aiven for ClickHouse® limits and limitations](/docs/products/clickhouse/reference/limitations) for details. ::: -For querying your ClickHouse® databases, you can choose between our -query editor, the Play UI, and -[the ClickHouse® client](/docs/products/clickhouse/howto/connect-with-clickhouse-cli). - ## Query a database with a selected tool ### Query editor {#use-query-editor} -Aiven for ClickHouse® includes a web-based query editor, which you can -find in [Aiven Console](https://console.aiven.io/) by selecting **Query -editor** from the sidebar of your service's page. +Aiven for ClickHouse® includes a web-based query editor. In +[Aiven Console](https://console.aiven.io/), open your service and click **Query editor** +in the sidebar. #### When to use the query editor -The query editor is convenient to run queries directly from -the console on behalf of the default user. The requests that you run -through the query editor rely on the permissions granted to this user. +Use the query editor to run queries directly from the console as the default user. +Queries run with the permissions assigned to this user. #### Examples of queries @@ -66,49 +55,45 @@ Create a role: CREATE ROLE accountant ``` -### Play UI {#play-iu} +### Play UI {#play-ui} + +ClickHouse provides a built-in web interface called the Play UI for running SQL queries. -ClickHouse® includes a built-in user interface for running SQL queries. -You can access it from a web browser over the HTTPS protocol. +#### When to use the Play UI -#### When to use the play UI +Use the Play UI when you need to: -Use the play UI to run requests using a non-default user or -if you expect a large size of the response. 
+- Run queries as a non-default user +- Work with large query results -#### Use the play UI +#### Use the Play UI -1. Log in to [Aiven Console](https://console.aiven.io/), choose the - right project, and select your Aiven for ClickHouse service. -1. In the **Overview** page of your service, find the **Connection - information** section and select **ClickHouse HTTPS & JDBC**. -1. Copy **Service URI** and go to `YOUR_SERVICE_URI/play` from a - web browser. -1. Set the name and the password of the user on whose behalf you want - to run the queries. -1. Enter the body of the query. -1. Select **Run**. +1. Log in to [Aiven Console](https://console.aiven.io/), choose the + right project, and select your Aiven for ClickHouse service. +1. In the **Overview** page of your service, find the **Connection + information** section and select **ClickHouse HTTPS & JDBC**. +1. Copy **Service URI** and go to `YOUR_SERVICE_URI/play` from a + web browser. +1. Set the name and the password of the user on whose behalf you want + to run the queries. +1. Enter the body of the query. +1. Select **Run**. :::note -The play interface is only available if you can connect directly to -ClickHouse from your browser. If the service is -[restricted by IP addresses](/docs/platform/howto/restrict-access) or in a -[VPC without public access](/docs/platform/howto/public-access-in-vpc), you can use the -[query editor](/docs/products/clickhouse/howto/query-databases#use-query-editor) instead. -The query editor can be accessed directly from the console to run -requests on behalf of the default user. +The Play UI works only when your browser can reach ClickHouse directly. If you +[restrict access by IP](/docs/platform/howto/restrict-access) or the service is in a +[VPC without public access](/docs/platform/howto/public-access-in-vpc), use the +[query editor](/docs/products/clickhouse/howto/query-databases#use-query-editor) in the +console to run queries as the default user. 
::: ## Query a non-replicated table -Behind the DNS name of your Aiven for ClickHouse service, there are multiple nodes. When -you query a non-replicated table, for example a log table, requests are routed randomly to -one of the nodes regardless of how data is distributed across them. A particular row is -found only if your `SELECT` query is directed to the node which executed a `WRITE` on -this row. +Your Aiven for ClickHouse® service has multiple nodes behind one DNS name. For +non-replicated tables, for example log tables, each request goes to one node. You get a +row only if the `SELECT` hits the node that wrote that row. -To query a non-replicated table across all the service nodes, use `clusterAllReplicas` -as follows: +To read from a non-replicated table across all nodes, use `clusterAllReplicas`: ```sql SELECT * diff --git a/docs/products/clickhouse/howto/restore-backup.md b/docs/products/clickhouse/howto/restore-backup.md index 14b3704c6..96bc86931 100644 --- a/docs/products/clickhouse/howto/restore-backup.md +++ b/docs/products/clickhouse/howto/restore-backup.md @@ -1,13 +1,13 @@ --- title: Fork and restore from Aiven for ClickHouse® backups -sidebar_label: Fork & restore from backups +sidebar_label: Fork and restore from backups --- import ConsoleLabel from "@site/src/components/ConsoleIcons"; import RelatedPages from "@site/src/components/RelatedPages"; import ForkService from "@site/static/includes/fork-service-console.md"; -Choose a service [backup](/docs/products/clickhouse/concepts/disaster-recovery#service-backup) to fork from and restore your Aiven for ClickHouse® service. +Select a [service backup](/docs/products/clickhouse/concepts/disaster-recovery#service-backup) to fork from and restore your Aiven for ClickHouse® service. :::important You cannot fork Aiven for ClickHouse services to a fewer number of nodes. 
diff --git a/docs/products/clickhouse/howto/run-federated-queries.md b/docs/products/clickhouse/howto/run-federated-queries.md index b574b3775..f9cadf588 100644 --- a/docs/products/clickhouse/howto/run-federated-queries.md +++ b/docs/products/clickhouse/howto/run-federated-queries.md @@ -1,94 +1,108 @@ --- -title: Read and pull data from S3 object storages and web resources over HTTP +title: Query external data using federated queries in Aiven for ClickHouse® +sidebar_label: Federated queries --- import RelatedPages from "@site/src/components/RelatedPages"; -With federated queries in Aiven for ClickHouse®, you can read and pull data from an external S3-compatible object storage or any web resource accessible over HTTP. +Federated queries let you read and write data in external S3-compatible object storages and web resources directly from Aiven for ClickHouse®. +Use them to query remote data in place, ingest it into your ClickHouse service, +or simplify migration from a legacy data source. -Learn more about capabilities and applications of federated queries in -[About querying external data in Aiven for ClickHouse®](/docs/products/clickhouse/concepts/federated-queries). +:::note +Federated queries are enabled by default in Aiven for ClickHouse. +::: -## About running federated queries +## Why use federated queries -Federated queries are written using specific SQL statements and can be -run from CLI, for instance. To run a federated query, just send a query -over an external S3-compatible object storage including relevant S3 -bucket details. A properly constructed federated query returns a -specific output. +- Query remote data from your ClickHouse service: ingest it into Aiven for + ClickHouse or reference external sources in analytics queries. +- Simplify data import from legacy sources and avoid a long or complex + migration path. 
+- Extend analysis over external data with less effort than enabling + distributed tables or using + [remote() / remoteSecure()](https://clickhouse.com/docs/en/sql-reference/table-functions/remote). -## Prerequisites +## How federated queries work + +Federated queries use specific SQL statements to read from external sources +through the ClickHouse S3 engine or URL table function. Once you read from a +remote source, you can select from it and insert into a local table in Aiven +for ClickHouse. + +To run a federated query, the ClickHouse service user connecting to the cluster +requires grants to the S3 and/or URL sources. The main service user has access +by default. + +Federated queries support the following external sources: -The prerequisites depend on the table function or table engine used in your federated query. +- S3-compatible object storage (including Azure Blob Storage), using the S3 + engine +- Web resources accessible over HTTP, using the URL table function -### Access to S3 and URL sources +:::note +The `remote()` and `remoteSecure()` functions query remote ClickHouse servers +or create distributed tables. They cannot read data from external object +storage such as S3. +::: -To run a federated query, the ClickHouse service user connecting to the -cluster requires grants to the S3 and/or URL sources. The main service -user is granted access to the sources by default, and new users can be -allowed to use the sources with the following query: +## Limitations + +- Only S3-compatible object storage providers are supported. More external data + sources are planned. +- Virtual tables are supported only for URL sources using the URL table engine. + Support for the S3 table engine is planned. + +## Prerequisites + +### Grant access to S3 and URL sources + +The main service user has access to S3 and URL sources by default. 
To grant +another user access, run the following query: ```sql GRANT CREATE TEMPORARY TABLE, S3, URL ON *.* TO user_name [WITH GRANT OPTION] ``` -The CREATE TEMPORARY TABLE grant is required for both sources. Adding -WITH GRANT OPTION allows the user to further transfer the privileges. +The `CREATE TEMPORARY TABLE` grant is required for both sources. Adding +`WITH GRANT OPTION` allows the user to pass the privileges to others. -### Azure Blob Storage access keys +### Get Azure Blob Storage access keys To run federated queries using the `azureBlobStorage` table function or the -`AzureBlobStorage` table engine, get your Azure Blob Storage keys using one of the -following tools: - -- [Azure portal](https://portal.azure.com/) - - From the portal menu, select **Storage accounts**, go to your account, and click - **Security + Networking** > **Access keys**. View and copy your account access keys and - connection strings. +`AzureBlobStorage` table engine, obtain your Azure Blob Storage keys from one of +the following: +- [Azure portal](https://portal.azure.com/): From the portal menu, select + **Storage accounts**, go to your account, and click + **Security + Networking** > **Access keys**. View and copy your account + access keys and connection strings. 
- [PowerShell](https://learn.microsoft.com/en-us/powershell/scripting/install/installing-powershell?view=powershell-7.4) - [Azure CLI](https://learn.microsoft.com/en-us/cli/azure/install-azure-cli#install) -### Managed credentials for Azure Blob Storage +### Set up managed credentials for Azure Blob Storage [Managed credentials integration](/docs/products/clickhouse/concepts/data-integration-overview#managed-credentials-integration) is: - Required to - [run federated queries using the AzureBlobStorage table engine](/docs/products/clickhouse/howto/run-federated-queries#query-using-the-azureblobstorage-table-engine) + [run federated queries using the AzureBlobStorage table engine](#query-using-the-azureblobstorage-table-engine) - Optional to - [run federated queries using the azureBlobStorage table function](/docs/products/clickhouse/howto/run-federated-queries#query-using-the-azureblobstorage-table-function) + [run federated queries using the azureBlobStorage table function](#query-using-the-azureblobstorage-table-function) [Set up a managed credentials integration](/docs/products/clickhouse/howto/data-service-integration#create-managed-credentials-integrations) as needed. -## Limitations - -- Federated queries in Aiven for ClickHouse only support S3-compatible - object storage providers for the time being. -- Virtual tables are only supported for URL sources, using the URL - table engine. - ## Run a federated query -See some examples of running federated queries to read and pull -data from external S3-compatible object storages. 
- ### Query using the `azureBlobStorage` table function -Depending on how you choose to handle passing connection parameters in your queries, you -can run federated queries using the `azureBlobStorage` table function: - -- [With managed credentials integration](/docs/products/clickhouse/howto/run-federated-queries#azureblobstorage-table-function-without-managed-credentials) -- [Without managed credentials integration](/docs/products/clickhouse/howto/run-federated-queries#azureblobstorage-table-function-with-managed-credentials) +Depending on how you handle connection parameters, you can run federated +queries with or without managed credentials integration. -Before you start, fulfill relevant -[prerequisites](/docs/products/clickhouse/howto/run-federated-queries#prerequisites), if any. +#### Without managed credentials -#### `azureBlobStorage` table function without managed credentials - -##### SELECT +**SELECT:** ```sql SELECT * @@ -103,7 +117,7 @@ FROM azureBlobStorage( LIMIT 5 ``` -##### INSERT +**INSERT:** ```sql INSERT INTO FUNCTION @@ -118,7 +132,7 @@ INSERT INTO FUNCTION VALUES ('column1-value', 'column2-value'); ``` -#### `azureBlobStorage` table function with managed credentials +#### With managed credentials ```sql azureBlobStorage( @@ -130,9 +144,6 @@ azureBlobStorage( ### Query using the `AzureBlobStorage` table engine -Before you start, fulfill relevant -[prerequisites](/docs/products/clickhouse/howto/run-federated-queries#prerequisites), if any. - 1. Create a table: ```sql @@ -141,10 +152,15 @@ Before you start, fulfill relevant `Low` Float64, `High` Float64 ) - ENGINE = AzureBlobStorage(`endpoint_azure-blob-storage-datasets`, blob_path = 'data.csv', compression = 'auto', format = 'CSV') + ENGINE = AzureBlobStorage( + `endpoint_azure-blob-storage-datasets`, + blob_path = 'data.csv', + compression = 'auto', + format = 'CSV' + ) ``` -1. Query from the `AzureBlobStorage` table engine: +1. 
Query the table: ```sql SELECT avg(Low) FROM test_azure_table @@ -152,15 +168,11 @@ Before you start, fulfill relevant ### Query using the `s3` table function -Before you start, fulfill relevant -[prerequisites](/docs/products/clickhouse/howto/run-federated-queries#prerequisites), if any. - #### SELECT and `s3` -SQL SELECT statements using the S3 and URL functions are able to query -public resources using the URL of the resource. For instance, let's -explore the network connectivity measurement data provided by the [Open -Observatory of Network Interference (OONI)](https://ooni.org/data/). +`SELECT` statements using the `s3` function can query public resources by URL. +The following example uses network connectivity measurement data from the +[Open Observatory of Network Interference (OONI)](https://ooni.org/data/): ```sql WITH ooni_data_sample AS @@ -184,22 +196,20 @@ LIMIT 50 #### INSERT and `s3` -When executing an INSERT statement into the S3 function, the rows are -appended to the corresponding object if the table structure matches: +When you run an `INSERT` statement into the `s3` function, rows are appended to +the corresponding object if the table structure matches: ```sql -INSERT INTO FUNCTION - s3('https://bucket-name.s3.region-name.amazonaws.com/dataset-name/landing/raw-data.csv', 'CSVWithNames') +INSERT INTO FUNCTION s3( + 'https://bucket-name.s3.region-name.amazonaws.com/dataset-name/landing/raw-data.csv', + 'CSVWithNames' +) VALUES ('column1-value', 'column2-value'); ``` ### Query a private S3 bucket -Before you start, fulfill relevant -[prerequisites](/docs/products/clickhouse/howto/run-federated-queries#prerequisites), if any. - -Private buckets can be accessed by providing the access token and secret -as function parameters. +Private buckets require the access key ID and secret as function parameters: ```sql SELECT * @@ -210,8 +220,8 @@ FROM s3( ) ``` -Depending on the format, the schema can be automatically detected. 
If it -isn't, you may also provide the column types as function parameters. +If the schema is not detected automatically, provide column types as +additional parameters: ```sql SELECT * @@ -226,12 +236,9 @@ FROM s3( ### Query using the `s3Cluster` table function -Before you start, fulfill relevant -[prerequisites](/docs/products/clickhouse/howto/run-federated-queries#prerequisites), if any. - -The `s3Cluster` function allows all cluster nodes to participate in the -query execution. Using `default` for the cluster name parameter, we can -compute the same aggregations as above as follows: +The `s3Cluster` function distributes query execution across all cluster nodes. +Using `default` for the cluster name, you can run the same aggregations as in +the `s3` example: ```sql WITH ooni_clustered_data_sample AS @@ -255,15 +262,12 @@ LIMIT 50 ### Query using the `url` table function -Before you start, fulfill relevant -[prerequisites](/docs/products/clickhouse/howto/run-federated-queries#prerequisites), if any. - #### SELECT and `url` -Let's query the [Growth Projections and Complexity -Rankings](https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/XTAQMC&version=4.0) -dataset, courtesy of the [Atlas of Economic -Complexity](https://atlas.cid.harvard.edu/) project. +The following example queries the +[Growth Projections and Complexity Rankings](https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/XTAQMC&version=4.0) +dataset from the [Atlas of Economic Complexity](https://atlas.cid.harvard.edu/) +project: ```sql WITH economic_complexity_ranking AS @@ -283,10 +287,9 @@ LIMIT 20 #### INSERT and `url` -With the URL function, INSERT statements generate a POST request, which -can be used to interact with APIs having public endpoints. For instance, -if your application has a `ingest-csv` endpoint accepting CSV data, you -can insert a row using the following statement: +With the `url` function, `INSERT` statements send a POST request. 
For +example, if your application has an `ingest-csv` endpoint that accepts CSV +data: ```sql INSERT INTO FUNCTION @@ -296,12 +299,8 @@ VALUES ('column1-value', 'column2-value'); ### Query a virtual table -Before you start, fulfill relevant -[prerequisites](/docs/products/clickhouse/howto/run-federated-queries#prerequisites), if any. - -Instead of specifying the URL of the resource in every query, it's -possible to create a virtual table using the URL table engine. This can -be achieved by running a DDL CREATE statement similar to the following: +Instead of specifying the URL in every query, create a virtual table using the +URL table engine. Run a `CREATE` statement to define the table: ```sql CREATE TABLE trips_export_endpoint_table @@ -316,8 +315,8 @@ CREATE TABLE trips_export_endpoint_table ENGINE = URL('https://app-name.company-name.cloud/api/trip-csv-export', CSV) ``` -Once the table is defined, SELECT and INSERT statements execute GET and -POST requests to the URL respectively: +Once defined, `SELECT` statements send a GET request and `INSERT` statements +send a POST request to the URL: ```sql SELECT @@ -326,17 +325,15 @@ median(fare_amount) AS median_fare_amount, max(fare_amount) AS max_fare_amount FROM trips_export_endpoint_table GROUP BY pickup_date +``` +```sql INSERT INTO trips_export_endpoint_table VALUES (8765, 10, now() - INTERVAL 15 MINUTE, now(), 50, 20) ``` -- [About querying external data in Aiven for ClickHouse®](/docs/products/clickhouse/concepts/federated-queries) -- [Cloud Compatibility \| ClickHouse - Docs](https://clickhouse.com/docs/en/whats-new/cloud-compatibility#federated-queries) -- [Integrating S3 with - ClickHouse](https://clickhouse.com/docs/en/integrations/s3) -- [remote, remoteSecure \| ClickHouse - Docs](https://clickhouse.com/docs/en/sql-reference/table-functions/remote) +- [Cloud Compatibility | ClickHouse Docs](https://clickhouse.com/docs/en/whats-new/cloud-compatibility#federated-queries) +- [Integrating S3 | ClickHouse 
Docs](https://clickhouse.com/docs/en/integrations/s3) +- [remote, remoteSecure | ClickHouse Docs](https://clickhouse.com/docs/en/sql-reference/table-functions/remote) diff --git a/docs/products/clickhouse/howto/secure-service.md b/docs/products/clickhouse/howto/secure-service.md index 38d3f9b92..70d32cd28 100644 --- a/docs/products/clickhouse/howto/secure-service.md +++ b/docs/products/clickhouse/howto/secure-service.md @@ -1,5 +1,6 @@ --- title: Secure a managed ClickHouse® service +sidebar_label: Secure service --- You can secure your Aiven for ClickHouse® service in a few different ways, for example by restricting network access, using Virtual Private Cloud (VPC), and enabling service termination protection. diff --git a/docs/products/clickhouse/howto/use-shards-with-distributed-table.md b/docs/products/clickhouse/howto/use-shards-with-distributed-table.md index 9255a4d05..07b49b1fc 100644 --- a/docs/products/clickhouse/howto/use-shards-with-distributed-table.md +++ b/docs/products/clickhouse/howto/use-shards-with-distributed-table.md @@ -1,5 +1,6 @@ --- title: Enable reading and writing data across shards in Aiven for ClickHouse® +sidebar_label: Shards and distributed tables --- If your Aiven for ClickHouse® service uses multiple shards, the data is replicated only between nodes of the same shard. 
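Reading and writing across shards, as described in the sharded setup above, goes through a table that uses the Distributed engine on top of a local table present on every shard. A minimal sketch, with hypothetical table and column names, and assuming the cluster is named `default` as in the `s3Cluster` example:

```sql
-- Assumes a local table `events_local` already exists on every shard.
CREATE TABLE events_distributed AS events_local
ENGINE = Distributed(default, currentDatabase(), events_local, rand())
```

A `SELECT` on `events_distributed` then aggregates rows from all shards, while an `INSERT` routes each row to a shard according to the `rand()` sharding key.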
diff --git a/docs/products/clickhouse/reference/clickhouse-metrics-datadog.md b/docs/products/clickhouse/reference/clickhouse-metrics-datadog.md index cfec06f8e..4b7067e4b 100644 --- a/docs/products/clickhouse/reference/clickhouse-metrics-datadog.md +++ b/docs/products/clickhouse/reference/clickhouse-metrics-datadog.md @@ -1,5 +1,6 @@ --- title: Aiven for ClickHouse® metrics available via Datadog +sidebar_label: Datadog metrics --- import RelatedPages from "@site/src/components/RelatedPages"; diff --git a/docs/products/clickhouse/reference/clickhouse-metrics-prometheus.md b/docs/products/clickhouse/reference/clickhouse-metrics-prometheus.md index 9721ef02a..777f9dab1 100644 --- a/docs/products/clickhouse/reference/clickhouse-metrics-prometheus.md +++ b/docs/products/clickhouse/reference/clickhouse-metrics-prometheus.md @@ -1,5 +1,6 @@ --- title: Aiven for ClickHouse® metrics available via Prometheus +sidebar_label: Prometheus metrics --- List of all metrics available via Prometheus for Aiven for ClickHouse® services. diff --git a/docs/products/clickhouse/reference/limitations.md b/docs/products/clickhouse/reference/limitations.md index b683b4145..ad80a0dd1 100644 --- a/docs/products/clickhouse/reference/limitations.md +++ b/docs/products/clickhouse/reference/limitations.md @@ -1,6 +1,6 @@ --- title: Aiven for ClickHouse® limits and limitations -sidebar_label: Limits and limitations +sidebar_label: Limitations --- import ClickHouseTotalStorageLimitation from '@site/static/includes/clickhouse-storage-limitation.md'; @@ -14,7 +14,7 @@ From the information about restrictions on using Aiven for ClickHouse, you can draw conclusions on how to get your service to operate closer to its full potential. Use **Recommended approach** as guidelines on how to work around specific restrictions. 
-a + diff --git a/docs/products/clickhouse/reference/metrics-list.md b/docs/products/clickhouse/reference/metrics-list.md index 7c095a6e3..36ed4f31c 100644 --- a/docs/products/clickhouse/reference/metrics-list.md +++ b/docs/products/clickhouse/reference/metrics-list.md @@ -1,5 +1,6 @@ --- title: Aiven for ClickHouse® metrics exposed in Grafana® +sidebar_label: Grafana metrics --- Browse through metrics that are available via Grafana® for Aiven for diff --git a/docs/products/clickhouse/reference/s3-supported-file-formats.md b/docs/products/clickhouse/reference/s3-supported-file-formats.md index bbb974a4d..cb2f9d441 100644 --- a/docs/products/clickhouse/reference/s3-supported-file-formats.md +++ b/docs/products/clickhouse/reference/s3-supported-file-formats.md @@ -1,14 +1,13 @@ --- -title: File formats for the S3 table function in Aiven for ClickHouse® +title: Supported file formats for S3 in Aiven for ClickHouse® +sidebar_label: S3 file formats --- import RelatedPages from "@site/src/components/RelatedPages"; -The [S3 table -function](https://clickhouse.com/docs/en/sql-reference/table-functions/s3) -allows you to select and insert data in S3-compatible storages. The S3 -table function in Aiven for ClickHouse® can be used with the following -file formats: +The [S3 table function](https://clickhouse.com/docs/en/sql-reference/table-functions/s3) +lets you select and insert data in S3-compatible object storage. 
In Aiven for +ClickHouse®, it supports the following file formats: - `Arrow` - `CSV` @@ -21,4 +20,4 @@ file formats: - [Table functions supported in Aiven for ClickHouse®](/docs/products/clickhouse/reference/supported-table-functions) -- [Read and pull data from S3 object storages and web resources over HTTP](/docs/products/clickhouse/howto/run-federated-queries) +- [Query external data using federated queries in Aiven for ClickHouse®](/docs/products/clickhouse/howto/run-federated-queries) diff --git a/docs/products/clickhouse/reference/supported-input-output-formats.md b/docs/products/clickhouse/reference/supported-input-output-formats.md index 64f06b899..27b97bb95 100644 --- a/docs/products/clickhouse/reference/supported-input-output-formats.md +++ b/docs/products/clickhouse/reference/supported-input-output-formats.md @@ -1,6 +1,6 @@ --- -title: Formats for Aiven for ClickHouse® - Aiven for Apache Kafka® data exchange -sidebar_label: ClickHouse-Kafka data exchange formats +title: Data exchange formats for Aiven for ClickHouse® and Aiven for Apache Kafka® +sidebar_label: Kafka data formats --- When connecting Aiven for ClickHouse® to Aiven for Apache Kafka® using Aiven integrations, data exchange is possible with the following formats only: diff --git a/docs/products/clickhouse/reference/supported-interfaces-drivers.md b/docs/products/clickhouse/reference/supported-interfaces-drivers.md index a1fa47d60..a9d1973d5 100644 --- a/docs/products/clickhouse/reference/supported-interfaces-drivers.md +++ b/docs/products/clickhouse/reference/supported-interfaces-drivers.md @@ -1,5 +1,6 @@ --- title: Interfaces and drivers supported in Aiven for ClickHouse® +sidebar_label: Interfaces and drivers --- import RelatedPages from "@site/src/components/RelatedPages"; diff --git a/sidebars.ts b/sidebars.ts index 59bcf072c..5497e1973 100644 --- a/sidebars.ts +++ b/sidebars.ts @@ -1223,7 +1223,6 @@ const sidebars: SidebarsConfig = { }, ], }, - { type: 'category', label: 'Aiven for 
ClickHouse®', @@ -1233,115 +1232,162 @@ const sidebars: SidebarsConfig = { }, items: [ 'products/clickhouse/get-started', + 'products/clickhouse/concepts/service-architecture', + { type: 'category', - label: 'Concepts', + label: 'Connect', + link: { + type: 'doc', + id: 'products/clickhouse/howto/list-connect-to-service', + }, items: [ - 'products/clickhouse/concepts/service-architecture', - 'products/clickhouse/concepts/olap', - 'products/clickhouse/concepts/columnar-databases', - 'products/clickhouse/concepts/indexing', - 'products/clickhouse/concepts/disaster-recovery', - 'products/clickhouse/concepts/strings', - 'products/clickhouse/concepts/federated-queries', - 'products/clickhouse/concepts/clickhouse-tiered-storage', - 'products/clickhouse/concepts/data-integration-overview', + 'products/clickhouse/reference/supported-interfaces-drivers', + 'products/clickhouse/howto/connect-with-clickhouse-cli', + 'products/clickhouse/howto/connect-with-go', + 'products/clickhouse/howto/connect-with-python', + 'products/clickhouse/howto/connect-with-nodejs', + 'products/clickhouse/howto/connect-with-php', + 'products/clickhouse/howto/connect-with-java', + 'products/clickhouse/howto/connect-with-jdbc', ], }, + { type: 'category', - label: 'How to', + label: 'Work with data', + link: { + type: 'doc', + id: 'products/clickhouse/howto/list-work-with-data', + }, items: [ { type: 'category', - label: 'Connect to Aiven for ClickHouse®', - link: { - type: 'doc', - id: 'products/clickhouse/howto/list-connect-to-service', - }, + label: 'Query data', items: [ - 'products/clickhouse/howto/connect-with-clickhouse-cli', - 'products/clickhouse/howto/connect-with-go', - 'products/clickhouse/howto/connect-with-python', - 'products/clickhouse/howto/connect-with-nodejs', - 'products/clickhouse/howto/connect-with-php', - 'products/clickhouse/howto/connect-with-java', + 'products/clickhouse/howto/query-databases', + 'products/clickhouse/howto/create-dictionary', + 
'products/clickhouse/howto/run-federated-queries', + 'products/clickhouse/howto/clickhouse-query-cache', + 'products/clickhouse/howto/sql-user-defined-functions', ], }, { type: 'category', - label: 'Manage service', + label: 'Tables and table structure', items: [ - 'products/clickhouse/howto/secure-service', - 'products/clickhouse/howto/restore-backup', - 'products/clickhouse/howto/configure-backup', - 'products/clickhouse/howto/manage-users-roles', 'products/clickhouse/howto/manage-databases-tables', - 'products/clickhouse/howto/query-databases', 'products/clickhouse/howto/materialized-views', - 'products/clickhouse/howto/monitor-performance', 'products/clickhouse/howto/use-shards-with-distributed-table', - 'products/clickhouse/howto/copy-data-across-instances', - 'products/clickhouse/howto/fetch-query-statistics', - 'products/clickhouse/howto/run-federated-queries', - 'products/clickhouse/howto/create-dictionary', - 'products/clickhouse/howto/sql-user-defined-functions', - 'products/clickhouse/howto/clickhouse-query-cache', ], }, - 'products/clickhouse/howto/list-manage-cluster', { type: 'category', - label: 'Integrate service', - link: { - type: 'doc', - id: 'products/clickhouse/howto/list-integrations', - }, + label: 'Data types and indexing', items: [ - 'products/clickhouse/howto/connect-to-grafana', - 'products/clickhouse/howto/integrate-kafka', - 'products/clickhouse/howto/integrate-postgresql', - 'products/clickhouse/howto/data-service-integration', - 'products/clickhouse/howto/integration-databases', - 'products/clickhouse/howto/connect-with-jdbc', + 'products/clickhouse/concepts/indexing', + 'products/clickhouse/concepts/strings', ], }, { type: 'category', - label: 'Tiered storage', - link: { - type: 'doc', - id: 'products/clickhouse/howto/list-tiered-storage', - }, + label: 'Table engines and formats', items: [ - 'products/clickhouse/howto/enable-tiered-storage', - 'products/clickhouse/howto/configure-tiered-storage', - 
'products/clickhouse/howto/check-data-tiered-storage', - 'products/clickhouse/howto/transfer-data-tiered-storage', - 'products/clickhouse/howto/local-cache-tiered-storage', + 'products/clickhouse/reference/supported-table-engines', + 'products/clickhouse/reference/supported-table-functions', + 'products/clickhouse/reference/supported-input-output-formats', + 'products/clickhouse/reference/s3-supported-file-formats', ], }, ], }, { type: 'category', - label: 'Reference', + label: 'Integrate', + link: { + type: 'doc', + id: 'products/clickhouse/howto/list-integrate', + }, items: [ - 'products/clickhouse/reference/supported-table-engines', - 'products/clickhouse/reference/supported-interfaces-drivers', + 'products/clickhouse/concepts/data-integration-overview', + 'products/clickhouse/howto/connect-to-grafana', + 'products/clickhouse/howto/integrate-kafka', + 'products/clickhouse/howto/integrate-postgresql', + 'products/clickhouse/howto/data-service-integration', + 'products/clickhouse/howto/integration-databases', + ], + }, + + { + type: 'category', + label: 'Manage service', + link: { + type: 'doc', + id: 'products/clickhouse/howto/list-manage-service', + }, + items: [ + 'products/clickhouse/howto/secure-service', + 'products/clickhouse/howto/manage-users-roles', + 'products/clickhouse/howto/list-manage-cluster', + 'products/clickhouse/reference/limitations', + ], + }, + + { + type: 'category', + label: 'Tiered storage', + link: { + type: 'doc', + id: 'products/clickhouse/howto/list-tiered-storage', + }, + items: [ + 'products/clickhouse/concepts/clickhouse-tiered-storage', + 'products/clickhouse/howto/enable-tiered-storage', + 'products/clickhouse/howto/configure-tiered-storage', + 'products/clickhouse/howto/check-data-tiered-storage', + 'products/clickhouse/howto/transfer-data-tiered-storage', + 'products/clickhouse/howto/local-cache-tiered-storage', + ], + }, + { + type: 'category', + label: 'Backups and recovery', + link: { + type: 'doc', + id: 
'products/clickhouse/howto/list-backups-recovery', + }, + items: [ + 'products/clickhouse/howto/configure-backup', + 'products/clickhouse/howto/restore-backup', + 'products/clickhouse/howto/copy-data-across-instances', + 'products/clickhouse/concepts/disaster-recovery', + ], + }, + + { + type: 'category', + label: 'Monitor performance', + link: { + type: 'doc', + id: 'products/clickhouse/howto/list-monitor-performance', + }, + items: [ + 'products/clickhouse/howto/monitor-performance', + 'products/clickhouse/howto/fetch-query-statistics', 'products/clickhouse/reference/metrics-list', 'products/clickhouse/reference/clickhouse-metrics-datadog', 'products/clickhouse/reference/clickhouse-metrics-prometheus', - 'products/clickhouse/reference/supported-table-functions', - 'products/clickhouse/reference/s3-supported-file-formats', - 'products/clickhouse/reference/supported-input-output-formats', - 'products/clickhouse/reference/advanced-params', 'products/clickhouse/reference/clickhouse-system-tables', - 'products/clickhouse/reference/limitations', ], }, + { + type: 'category', + label: 'Reference', + items: ['products/clickhouse/reference/advanced-params'], + }, ], }, + { type: 'category', label: 'Aiven for Dragonfly', diff --git a/static/_redirects b/static/_redirects index e76b8a036..a69dd8ac3 100644 --- a/static/_redirects +++ b/static/_redirects @@ -145,6 +145,8 @@ /products/cassandra/overview /docs/products/services 301 /products/cassandra/reference /docs/products/services 301 /products/clickhouse/concepts /docs/products/clickhouse/concepts/service-architecture 301 +/products/clickhouse/concepts/olap /docs/products/clickhouse 301 +/products/clickhouse/concepts/columnar-databases /docs/products/clickhouse 301 /products/clickhouse/concepts/features-overview /docs/products/clickhouse 301 /products/clickhouse/howto /docs/products/clickhouse/howto/list-connect-to-service 301 /products/clickhouse/howto/list-get-started /docs/products/clickhouse/get-started 301