diff --git a/.gitignore b/.gitignore index 1a861c45..7a69f296 100644 --- a/.gitignore +++ b/.gitignore @@ -1,3 +1,4 @@ __pycache__/ site/ - +docs/assets/.DS_Store +.DS_Store \ No newline at end of file diff --git a/docs/architecture.md b/docs/architecture.md index 2d9a8ac9..1ea04164 100644 --- a/docs/architecture.md +++ b/docs/architecture.md @@ -1,12 +1,10 @@ -# Design overview +# Architecture -The Percona Operator for PostgreSQL automates and simplifies -deploying and managing open source PostgreSQL clusters on Kubernetes. -The Operator is based on [CrunchyData’s PostgreSQL Operator :octicons-link-external-16:](https://access.crunchydata.com/documentation/postgres-operator/v5/). +This document provides a high-level overview of the Percona Operator for PostgreSQL architecture, explaining how the various components connect to create a production-ready PostgreSQL cluster on Kubernetes. See also [How the Operator works](operator-how-it-works.md). -![image](assets/images/pgo.svg) +## Components -PostgreSQL containers deployed with the Operator include the following components: +The StatefulSet deployed with the Operator includes the following components: * The [PostgreSQL :octicons-link-external-16:](https://www.postgresql.org/) database management system, including: @@ -22,46 +20,70 @@ PostgreSQL containers deployed with the Operator include the following component * The [pgBouncer :octicons-link-external-16:](http://pgbouncer.github.io/) connection pooler for PostgreSQL, -* The PostgreSQL high-availability implementation based on the [Patroni template :octicons-link-external-16:](https://patroni.readthedocs.io/), +* The PostgreSQL high-availability implementation based on the [Patroni :octicons-link-external-16:](https://patroni.readthedocs.io/) template, -* the [pg_stat_monitor :octicons-link-external-16:](https://github.com/percona/pg_stat_monitor/) PostgreSQL Query Performance Monitoring utility, +* The [pg_stat_monitor 
:octicons-link-external-16:](https://github.com/percona/pg_stat_monitor/) PostgreSQL query performance monitoring utility, * LLVM (for JIT compilation). +* PMM Client for observability -Each PostgreSQL cluster includes one member available for read/write transactions (PostgreSQL primary instance, or leader in terms of Patroni) and a number of replicas which can serve read requests only (standby members of the cluster). +![Operator overview](assets/images/pgo.svg) -To provide high availability from the Kubernetes side the Operator involves [node affinity :octicons-link-external-16:](https://kubernetes.io/docs/concepts/configuration/assign-pod-node/#affinity-and-anti-affinity) -to run PostgreSQL Cluster instances on separate worker nodes if possible. If -some node fails, the Pod with it is automatically re-created on another node. +### Role of each component -![image](assets/images/operator.svg) +* **Percona Distribution for PostgreSQL** is a suite of open source software, tools and services required to deploy and maintain a reliable production cluster for PostgreSQL. + +* **Patroni** — a high-availability solution for PostgreSQL that automates replication and **failover**. It maintains the cluster state and coordinates leader election to ensure that a healthy primary node is always available. Patroni simplifies building and operating resilient PostgreSQL clusters by handling node monitoring, failover, and recovery automatically. See the [Patroni documentation :octicons-link-external-16:](https://patroni.readthedocs.io/) for how it integrates with your environment. -To provide data storage for stateful applications, Kubernetes uses -Persistent Volumes. A *PersistentVolumeClaim* (PVC) is used to implement -the automatic storage provisioning to pods. If a failure occurs, the -Container Storage Interface (CSI) should be able to re-mount storage on -a different node. +* **pgBouncer** — A **lightweight connection pooler** in front of PostgreSQL. 
It sits between client applications and the database server to manage and reuse connections efficiently. Instead of each client opening its own database connection, pgBouncer maintains a pool of connections and serves them to clients on demand, significantly reducing connection overhead and improving performance, especially for applications with many short-lived or concurrent connections. -The Operator functionality extends the Kubernetes API with [Custom Resources -Definitions :octicons-link-external-16:](https://kubernetes.io/docs/concepts/extend-kubernetes/api-extension/custom-resources/#customresourcedefinitions). -These CRDs provide extensions to the Kubernetes API, and, in the case of the -Operator, allow you to perform actions such as creating a PostgreSQL Cluster, -updating PostgreSQL Cluster resource allocations, adding additional utilities to -a PostgreSQL cluster, e.g. [pgBouncer :octicons-link-external-16:](https://www.pgbouncer.org/) for -connection pooling and more. +* **pgBackRest** — Handles **full, incremental, and differential** backups, compression and encryption, parallel processing, and point-in-time recovery using WAL archives. pgBackRest helps ensure data safety by providing efficient, consistent backups and fast restores for both small and large PostgreSQL environments. Backup and restore are integrated with Custom Resources (`PerconaPGBackup`, `PerconaPGRestore`). See [About backups](backups.md) to learn more. -When a new Custom Resource is created or an existing one undergoes some changes -or deletion, the Operator automatically creates/changes/deletes all needed -Kubernetes objects with the appropriate settings to provide a proper Percona -PostgreSQL Cluster operation. +* **pg_stat_monitor** — Collects **query performance** statistics. +* **PMM Client** - a lightweight agent installed on database hosts to collect metrics, logs, and performance data and send them to Percona Monitoring and Management (PMM) Server. 
It gathers detailed insights from databases and the operating system such as query performance, resource usage, and health metrics. It enables centralized monitoring, troubleshooting, and performance optimization for PostgreSQL clusters. -Following CRDs are created while the Operator installation: +### How components work together -* `perconapgclusters` stores information required to manage a PostgreSQL cluster. -This includes things like the cluster name, what storage and resource classes -to use, which version of PostgreSQL to run, information about how to maintain -a high-availability cluster, etc. +This workflow shows how cluster components work together: -* `perconapgbackups` and `perconapgrestores` are in charge for making backups - and restore them. +1. Your **application** connects through a Kubernetes **Service** that points to pgBouncer. +2. **pgBouncer** accepts many client connections and forwards work through a smaller set of server connections to PostgreSQL Pods. +3. **PostgreSQL** executes queries. **Writes** go to the **primary**. **Reads** can target the primary or **replicas**. +4. The primary streams WAL to replicas via instance Services. +5. Patroni monitors the cluster state and coordinates leader election if the primary node fails. +6. pgBackRest makes backups according to the schedule that you defined or when you manually create a backup object. pgBackRest saves backups to the backup storage that you configured. To learn more about backups, their workflow, and setup, refer to [About backups](backups.md). +7. PMM Client collects performance metrics and sends them to the PMM Server for you to see and analyze. See [Monitor the database with PMM](monitoring.md) to learn more. + +## Default cluster configuration + +The default Percona Distribution for PostgreSQL configuration includes: + +* 3 PostgreSQL servers, one primary and two replicas. +* 3 pgBouncer instances. 
+* a pgBackRest repository host instance - a dedicated instance in your cluster that stores filesystem backups made with pgBackRest. +* a PMM client instance - a monitoring and management tool for PostgreSQL that provides a way to monitor and manage your database. It runs as a sidecar container in the database Pods. + +### Primary, replicas, and high availability + +Each PostgreSQL cluster has **one primary** instance that accepts read/write transactions. **Replicas** are standbys: they replicate from the primary and typically serve **read-only** traffic (depending on how you expose them). + +The Operator provides high availability through multiple layers of protection: + +#### Pod distribution + +The Operator uses [node affinity and anti-affinity :octicons-link-external-16:](https://kubernetes.io/docs/concepts/configuration/assign-pod-node/#affinity-and-anti-affinity) to distribute PostgreSQL instances across separate worker nodes when possible. This prevents a single node failure from taking down multiple database instances. + +#### Automatic recovery + +If a node fails, Kubernetes automatically reschedules the affected Pod on another healthy node. Patroni handles which PostgreSQL instance is primary and ensures replication continuity. For more on HA behavior and operations, see [High-availability](ha-deploy.md). + +## Storage and persistent volumes + +Stateful workloads rely on durable disk. Kubernetes attaches storage through **PersistentVolumeClaims (PVCs)**; the cluster’s CSI driver provisions **PersistentVolumes** and can **reattach** storage if a Pod moves to another node (subject to your storage class and platform). + +If a node fails, the expectation is that the volume can be mounted elsewhere and the Pod recreated, while Patroni and PostgreSQL recover the database layer. For storage troubleshooting, see [Check storage](debug-storage.md). 
+ +## Next steps + +For a comparison of Percona’s approach with other deployment models, see [Comparison with other solutions](compare.md). diff --git a/docs/assets/.DS_Store b/docs/assets/.DS_Store deleted file mode 100644 index ee175dff..00000000 Binary files a/docs/assets/.DS_Store and /dev/null differ diff --git a/docs/assets/diagrams/operator_flow.py b/docs/assets/diagrams/operator_flow.py new file mode 100644 index 00000000..3a4514d2 --- /dev/null +++ b/docs/assets/diagrams/operator_flow.py @@ -0,0 +1,169 @@ +#!/usr/bin/env python3 +""" +Uses the mingrammer *diagrams* library with Kubernetes node icons: +https://diagrams.mingrammer.com/docs/nodes/k8s + +Prerequisites: + pip install diagrams + # Graphviz must be installed (provides the `dot` binary): + # macOS: brew install graphviz + # Debian/Ubuntu: apt-get install graphviz + +Run from repo root: + python docs/assets/diagrams/operator_flow.py + python docs/assets/diagrams/operator_flow.py --format png + python docs/assets/diagrams/operator_flow.py --format both + +Default output: docs/assets/images/operator-flow-diagram.png. + +Colors and cluster framing follow docs/assets/images/operator.svg (light blue fill #d4edfb, +cluster stroke #729fcf, connector blue #3465a4, label text #092256). 
+""" + +from __future__ import annotations + +import argparse +from pathlib import Path + +from diagrams import Cluster, Diagram, Edge +from diagrams.k8s.compute import Pod +from diagrams.k8s.controlplane import CM +from diagrams.k8s.others import CRD + +# Resolve paths so the script works when run from any cwd +_HERE = Path(__file__).resolve().parent +_REPO_ROOT = _HERE.parents[2] +_OUT_DIR = _REPO_ROOT / "docs" / "assets" / "images" +_OUT_NAME = "operator-flow-diagram" + +# Palette aligned with docs/assets/images/operator.svg (Percona operator diagram) +_BG = "#ffffff" +_CLUSTER_FILL = "#d4edfb" # rgb(212, 237, 251) +_CLUSTER_BORDER = "#729fcf" # rgb(114, 159, 207) +# Primary UI blue in operator.svg (trapezoids / hex icons): rgb(50, 108, 229) — built-in K8s PNG icons match this. +_EDGE = "#3465a4" # rgb(52, 101, 164) — connectors and arrows +_TEXT = "#092256" # rgb(9, 34, 87) — titles and labels + + +def _cluster_graph_attr() -> dict[str, str]: + """Rounded cluster box like the large light-blue area in operator.svg.""" + return { + "bgcolor": _CLUSTER_FILL, + "style": "rounded", + "color": _CLUSTER_BORDER, + "penwidth": "2", + "fontcolor": _TEXT, + "fontsize": "13", + } + + +def _graph_attr_for_format(outformat: str) -> dict[str, str]: + """Graphviz graph attributes; PNG sets dpi for readable raster output.""" + ga: dict[str, str] = { + "bgcolor": _BG, + "pad": "0.45", + "fontsize": "13", + "fontname": "Helvetica", + } + if outformat == "png": + ga["dpi"] = "150" + return ga + + +def _build(outformat: str) -> Path: + """Write operator-flow-diagram.{svg|png} to docs/assets/images/.""" + if outformat not in ("svg", "png"): + raise ValueError(f"unsupported format: {outformat!r}") + + _OUT_DIR.mkdir(parents=True, exist_ok=True) + out_path = str(_OUT_DIR / _OUT_NAME) + + graph_attr = _graph_attr_for_format(outformat) + + node_attr = { + "fontcolor": _TEXT, + "fontsize": "12", + "fontname": "Helvetica", + } + + edge_attr = { + "color": _EDGE, + "fontcolor": _TEXT, + 
"fontsize": "10", + "fontname": "Helvetica", + "penwidth": "1.5", + } + + # Left-to-right main flow (CRD → … → Application); colors match operator.svg + with Diagram( + "", + filename=out_path, + show=False, + direction="LR", + graph_attr=graph_attr, + node_attr=node_attr, + edge_attr=edge_attr, + outformat=outformat, + ): + with Cluster("Kubernetes Cluster", graph_attr=_cluster_graph_attr()): + crd = CRD("Custom Resource\nDefinition (CRD)") + # No dedicated "Custom Resource instance" icon in k8s provider; Pod denotes workload API objects. + custom_resources = Pod("Custom Resources") + operator = CM("Operator Deployment") + + # Stack pods horizontally inside the group. + with Cluster( + "Application", + direction="LR", + graph_attr=_cluster_graph_attr(), + ): + # Three workload replicas (labels echo operator.svg DB Pod 1 / 2 / N pattern). + app_pods = [ + Pod("DB Pod 1"), + Pod("DB Pod 2"), + Pod("DB Pod N"), + ] + + # CRD defines schema available to the API; users then create CR instances. + crd >> Edge(label="defines schema", color=_EDGE) >> custom_resources + + # Bidirectional: Operator watches CRs and reconciles cluster state. + custom_resources << Edge( + forward=True, + reverse=True, + label="watch / reconcile", + color=_EDGE, + ) >> operator + + # Operator drives the managed application (StatefulSets, Services, etc.). 
+ operator >> Edge(label="manages", color=_EDGE) >> app_pods[1] + + return _OUT_DIR / f"{_OUT_NAME}.{outformat}" + + +def _parse_args() -> argparse.Namespace: + p = argparse.ArgumentParser( + description="Render the operator control-flow diagram (CRD → … → Application).", + ) + p.add_argument( + "-f", + "--format", + choices=("svg", "png", "both"), + default="png", + help="Output format: vector SVG, raster PNG, or both (default: png).", + ) + return p.parse_args() + + +def main() -> None: + args = _parse_args() + if args.format == "both": + paths = [_build("svg"), _build("png")] + else: + paths = [_build(args.format)] + for path in paths: + print(f"Wrote {path}") + + +if __name__ == "__main__": + main() diff --git a/docs/assets/fragments/what-you-install.txt b/docs/assets/fragments/what-you-install.txt index 0a0c04ce..718521b6 100644 --- a/docs/assets/fragments/what-you-install.txt +++ b/docs/assets/fragments/what-you-install.txt @@ -15,4 +15,4 @@ The default Percona Distribution for PostgreSQL configuration includes: * a pgBackRest repository host instance – a dedicated instance in your cluster that stores filesystem backups made with pgBackRest - a backup and restore utility. * a PMM client instance - a monitoring and management tool for PostgreSQL that provides a way to monitor and manage your database. It runs as a sidecar container in the database Pods. -Read more about the default components in the [Architecture](architecture.md) section. +Read more about the default components in [Architecture](architecture.md). 
diff --git a/docs/assets/images/operator-flow-diagram.png b/docs/assets/images/operator-flow-diagram.png new file mode 100644 index 00000000..8801ee28 Binary files /dev/null and b/docs/assets/images/operator-flow-diagram.png differ diff --git a/docs/features.md b/docs/features.md new file mode 100644 index 00000000..ea893096 --- /dev/null +++ b/docs/features.md @@ -0,0 +1,90 @@ +# Features and capabilities + +Percona Operator for PostgreSQL is a Kubernetes-native controller that automatically manages the full lifecycle of [Percona Distribution for PostgreSQL :octicons-link-external-16:](https://www.percona.com/software/postgresql-distribution) clusters. The Operator takes manual day-to-day database management operations off your team, so you can focus on the tasks that matter. To learn how the Operator fits into Kubernetes, see [Kubernetes Operator concepts](operator-concepts.md). + +Here’s what the Operator brings to your infrastructure: + +## High availability and failover + +Run PostgreSQL with confidence: [Patroni :octicons-link-external-16:](https://patroni.readthedocs.io/) provides automatic leader election, failover, and coordination so your cluster stays available through node and Pod failures. For architecture details, see [Cluster architecture](architecture.md). + +* **Automatic failover** — Patroni manages leader election and failover to ensure the cluster always has a healthy primary. See [High availability](ha-deploy.md). +* **Zero data loss failover** — WAL-based replication limits data loss during failover; synchronous replication is available when you need stronger guarantees. +* **Health monitoring** — Continuous health checks trigger failover when PostgreSQL is not ready to serve traffic. +* **Manual switchover** — [Promote a replica to primary](change-primary.md) in a controlled way for maintenance. 
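As a rough illustration of the failover idea in the list above, the following toy sketch picks a leader from a set of members: a healthy primary stays in place, otherwise the healthy replica with the least replication lag is chosen. This is not Patroni's actual election protocol, which relies on a leader lock held in a distributed configuration store and is considerably more involved. The `Member` class, its fields, and `pick_new_primary` are hypothetical names invented for this example.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Member:
    """Hypothetical, simplified view of one PostgreSQL instance."""

    name: str
    role: str           # "primary" or "replica"
    healthy: bool
    lag_bytes: int = 0  # assumed replication lag behind the last primary position


def pick_new_primary(members: list[Member]) -> str | None:
    """Return the name of the member that should lead, or None if nobody qualifies."""
    # A healthy primary keeps its role; no failover is needed.
    for member in members:
        if member.role == "primary" and member.healthy:
            return member.name
    # Otherwise prefer the healthy replica that is least behind.
    candidates = [m for m in members if m.role == "replica" and m.healthy]
    if not candidates:
        return None
    return min(candidates, key=lambda m: m.lag_bytes).name
```

Preferring the replica with the smallest lag is one plausible way to limit data loss on promotion; a real election also has to guard against split-brain, which this sketch ignores.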
+ +## Automated backup and restore flows + +Safeguard your data at any scale: the Operator automates backups and restores using [pgBackRest :octicons-link-external-16:](https://pgbackrest.org/), a robust open source solution trusted for PostgreSQL in production. Read [About backups](backups.md) for the full workflow. + +Also, leverage Kubernetes [PersistentVolumeClaim snapshots](backups-snapshots.md) for rapid, consistent backup and restore operations. It is especially valuable for large database clusters. + +* **Full, incremental, and differential backups** — Select the backup strategy that matches your recovery objectives and storage requirements. +* **Point-in-time recovery (PITR)** — Achieve low Recovery Point Objectives (RPO) by [restoring to any specific time](backups-restore-inplace.md#restore-the-cluster-with-point-in-time-recovery) using WAL archives. +* **Scheduled backups** — Automate backups on your chosen [schedule with cron-like expressions](backups-schedule.md). +* **Flexible storage** — Store backups in S3-compatible object storage or on local PersistentVolumes for hybrid strategies. +* **PVC snapshot support** — Boost backup and restore performance for large datasets with a point-in-time snapshot of your data volume. +* **Encryption** — [Secure backups at rest](backup-encryption.md) where your storage backends and configuration allow it. +* **Retention** — Manage backup lifecycle and [automate old backup cleanup](backup-retention.md) to prevent storage sprawl. + +## Connection pooling with pgBouncer + +Reduce connection churn and spread read load without extra operational burden. 
+ +* **Efficient pooling** — Lower PostgreSQL connection overhead by pooling client connections +* **Transaction-level pooling** — Manage connections at the transaction level efficiently +* **Read balancing** — Distribute read queries across replicas where configured +* **High availability** — Replica pgBouncer instances provide high availability +* **Integrated lifecycle** — Automatically configured and managed by the Operator + +## Automated scaling and resource management + +Scale your cluster up or down to match demand while keeping changes declarative. + +* **Declarative clusters** — Describe desired cluster state in YAML; the Operator automatically reconciles Kubernetes resources to match. +* **Replica scaling** — [Adjust replica count](ha-deploy.md#adding-nodes-to-a-cluster) in the Custom Resource to scale horizontally. +* **Dynamic configuration** — [Update PostgreSQL parameters](options.md) without a full cluster restart. +* **Self-healing** — The Operator automatically detects and recovers from Pod crashes, node issues, and common network problems. +* **Rolling updates** — Apply configuration and image updates with controlled rollouts. +* **Storage expansion** — Automatically [increase storage size](scaling.md#scale-storage) for PostgreSQL instances when supported by your environment and configuration. + +## PostgreSQL-specific features + +Use PostgreSQL capabilities that operators expect in production. + +* **WAL storage** — Optional dedicated volumes for Write-Ahead Logs when you want to separate I/O. +* **Tablespaces** — Custom [tablespaces](tablespaces.md) with dedicated storage. +* **Extensions** — Built-in support for extensions such as pg_stat_monitor, pgAudit, set_user, wal2json, plus ability to extend PostgreSQL with [custom extensions](custom-extensions.md). +* **Users and databases** — Automatically create users, databases, and manage credentials. +* **Init SQL** — Execute [custom SQL scripts during cluster initialization](initsql.md). 
+ +## Standby clusters for disaster recovery + +Leverage disaster-recovery topologies that fit your RTO and RPO. + +* **Backups or streaming** — Deploy your standby cluster based on backups or streaming replication, depending on your architecture +* **Cross-namespace or cross-cluster** — Primary and standby clusters can run in different namespaces or Kubernetes clusters +* **Promotion** — Promote a standby to primary when you need to recover from an outage or drill a failover + +## Security and compliance + +Keep traffic and data protected with encryption and flexible TLS workflows. + +* **TLS for connections** — Encrypt client traffic and traffic between cluster components +* **Certificates** — Comply with your security policy via [custom certificates](tls-manual.md) or automated certificate generation [with cert-manager](tls-cert-manager.md) with configurable lifecycle management. + +## Monitoring and observability + +Understand performance and troubleshoot faster with metrics and optional Percona tooling. + +* **PMM integration** — Connect the cluster to [Percona Monitoring and Management (PMM) :octicons-link-external-16:](https://www.percona.com/software/database-tools/percona-monitoring-and-management) for dashboards and alerting. +* **pg_stat_monitor** — Get query performance insights with fingerprinting when you enable the extension. +* **Broad metrics** — Track connection counts, transaction rates, cache hit ratios, replication lag, and more. +* **Query analytics** — Deeper query analysis in PMM. See [Query Analytics :octicons-link-external-16:](https://docs.percona.com/percona-monitoring-and-management/3/use/qan/index.html#__tabbed_1_2) in the PMM documentation. + +## Operator capabilities + +Operate at the scale of your platform with flexible reconciliation scope. + +* **Selective namespaces** — Reconcile clusters in a single namespace or multi-namespace mode. See [cluster-wide deployment](cluster-wide.md) to learn more. 
+* **Concurrent reconciliation** — Run [concurrent reconciliations](reconciliation-concurrency.md) to manage many clusters efficiently. diff --git a/docs/operator-concepts.md b/docs/operator-concepts.md new file mode 100644 index 00000000..40c6bf3f --- /dev/null +++ b/docs/operator-concepts.md @@ -0,0 +1,85 @@ +# Kubernetes Operator concepts + +If you already run workloads on Kubernetes, you are used to declaring *desired state* in YAML and letting controllers create Pods, Services, and other objects. A **Kubernetes Operator** applies the same idea to a whole application stack—such as a database cluster—so you do not have to assemble every low-level object by hand. + +This page explains what Operators are in general and how **Percona Operator for PostgreSQL** fits in that model. For how reconciliation and Custom Resources work in practice, see [How the Operator works](operator-how-it-works.md). To learn more about what the Operator manages, see [Architecture](architecture.md). + +## What is a Kubernetes Operator? + +An Operator is a way to **package, deploy, and manage** a complex application like databases on Kubernetes. It extends the Kubernetes API with custom resources so you can create and update instances of that application using the same `kubectl` and GitOps workflows you use elsewhere. + +At its core, an Operator is a **custom controller**. It watches [custom resources](#custom-resources-explained), compares what *should* exist (desired state) with what *does* exist (current state), and takes steps to bring them in line. That pattern is often called a **control loop** or **reconciliation loop**. + +**Percona Operator for PostgreSQL** builds on patterns and components from [CrunchyData’s PostgreSQL Operator :octicons-link-external-16:](https://access.crunchydata.com/documentation/postgres-operator/v5/) and Percona’s PostgreSQL distribution practices. Percona continues to develop the Operator as open source software alongside the database and tooling stack. 
+ +Percona Operator for PostgreSQL is particularly valuable for stateful applications like databases, which require complex lifecycle management, including initialization, scaling, backups, updates, disaster recovery and more. + +## Custom Resources explained + +How Percona Operator for PostgreSQL works in Kubernetes is defined by the relationship between these core components: + +* Custom Resource Definition +* Custom Resource +* Operator Deployment + +### Custom Resource Definition (CRD) + +A **Custom Resource Definition** is a schema that defines a new type of resource in Kubernetes. It tells Kubernetes: + +- What fields the custom resource will have +- What types of values those fields can contain +- How the resource should be validated + +For example, a Custom Resource Definition for a database cluster called `PostgreSQLCluster` has fields like `replicas`, `storageSize`, `postgresVersion`, etc. + +### Custom Resource (CR) + +A **Custom Resource** is an instance of a CRD. It represents a specific deployment of your application. When you create a Custom Resource, you're essentially declaring your desired state for that application. + +For instance, you might create a Custom Resource named `my-postgres-cluster` that specifies that you want a PostgreSQL cluster with 3 replicas, 100GB of storage, and PostgreSQL version 18. + +### Operator Deployment + +The **Operator** itself is deployed as a standard Kubernetes Deployment. It runs as a Pod (or set of Pods) in your cluster and contains the controller logic that: + +1. Watches for Custom Resources of the type defined by the CRD +2. Reads the desired state from the Custom Resource +3. Compares it with the current state of the cluster +4. Takes actions to reconcile any differences - to bring the current cluster state to the desired state defined in the Custom Resource + +In short: the CRD defines the *shape* of the API, the CR is *your* declaration, and the Operator is the automation that *implements* it. 
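The four controller steps above can be sketched as a single function that compares desired and current state and returns the actions needed to close the gap. This is a minimal illustration of the reconciliation pattern, not the Operator's implementation: `ClusterSpec`, its `replicas` and `storage_gb` fields, and the action strings are hypothetical and only echo the example fields mentioned earlier.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class ClusterSpec:
    """Hypothetical, simplified state of a PostgreSQL cluster."""

    replicas: int
    storage_gb: int


def reconcile(desired: ClusterSpec, current: ClusterSpec | None) -> list[str]:
    """Return the actions that move the current state toward the desired state."""
    # Nothing exists yet: the whole cluster has to be created.
    if current is None:
        return [f"create cluster with {desired.replicas} instance(s)"]

    actions: list[str] = []
    if current.replicas < desired.replicas:
        actions.append(f"add {desired.replicas - current.replicas} instance(s)")
    elif current.replicas > desired.replicas:
        actions.append(f"remove {current.replicas - desired.replicas} instance(s)")
    if current.storage_gb != desired.storage_gb:
        actions.append(f"resize storage {current.storage_gb} -> {desired.storage_gb} GB")
    # An empty list means the two states already match.
    return actions
```

A real controller runs this comparison repeatedly, triggered by changes to the Custom Resource or to the objects it owns, so the function is written to be safe to call again when nothing has changed.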
+ +The components interact in the following workflow: + + +![image](assets/images/operator-flow-diagram.png) + +1. **Install CRDs**: The new API kinds appear in the cluster. + +2. **Install Operator Deployment**: The Operator Pod is deployed, and it starts watching for resources. + +3. **Create Custom Resource**: You create and apply a Custom Resource YAML file that describes your desired cluster state. + +4. **Reconcile**: The Operator detects the new Custom Resource, reads its specification, and creates the necessary Kubernetes resources (StatefulSets, Services, ConfigMaps, Secrets, etc.) to bring the PostgreSQL cluster into existence. + +5. **Monitor continuously**: The Operator continuously monitors both the Custom Resource and the actual cluster state, making adjustments whenever they diverge. + +## Why use an Operator for a database? + +The Operator is a game-changer for database management on Kubernetes: it frees your team from tedious, error-prone infrastructure work, so you can focus on building the applications that matter. + +Traditionally, deploying PostgreSQL in Kubernetes means you must create, wire together, and continuously maintain many objects like StatefulSets, Services, PVCs, ConfigMaps, Secrets, backup Jobs, monitoring hooks, and more. Every change or upgrade requires manual intervention and careful coordination. + +With an Operator, all you do is describe your desired cluster in a Custom Resource. The Operator reads it and does the work for you: + +* Translates your intent into concrete, production-ready objects +* Integrates trusted open source tools like Patroni for high availability, pgBackRest for backup/restore operations, and pgBouncer for connection pooling +* Keeps your cluster healthy and up to date, even as requirements change + +Day-to-day, the Operator automates the most challenging and repetitive database tasks: seamless failovers, backup management, rolling upgrades, scaling, and self-healing. 
+ +By trusting the Operator to manage your database infrastructure, your team can focus on building features and improving your applications instead of managing YAML or dealing with unexpected outages. + +## Next steps + +[Features and capabilities](features.md){.md-button} diff --git a/docs/operator-how-it-works.md b/docs/operator-how-it-works.md new file mode 100644 index 00000000..160dd0dd --- /dev/null +++ b/docs/operator-how-it-works.md @@ -0,0 +1,32 @@ +# How the Operator works + +This page describes how Percona Operator for PostgreSQL plugs into Kubernetes and how it keeps your database cluster in the state you define. + +If you are new to the Operator pattern itself, start with [Kubernetes Operator concepts](operator-concepts.md). For the runtime stack (database processes, Patroni, backups), see [Cluster architecture](architecture.md). + +## How the Operator extends the Kubernetes API + +Standard Kubernetes resources cover Deployments, Services, and many other kinds. The Operator registers additional Custom Resources, defined via [Custom Resource Definitions (CRDs) :octicons-link-external-16:](https://kubernetes.io/docs/concepts/extend-kubernetes/api-extension/custom-resources/#customresourcedefinitions). + +These Custom Resources are: + +* `PerconaPGCluster` for the database cluster +* `PerconaPGBackup` for backups +* `PerconaPGRestore` for restores +* `PerconaPGUpgrade` for major upgrades + +The Operator itself runs as a **Deployment** in the Kubernetes cluster. It has controller logic that watches the Custom Resources that the CRDs define. 
Whenever you create or change a relevant Custom Resource, the Operator's reconciliation loop automatically does the following: + +* Creates and manages the necessary Kubernetes resources (StatefulSets, Services, Pods) +* Ensures your cluster matches the desired state you’ve defined +* Monitors the cluster health and automatically recovers from failures +* Coordinates upgrades and scaling operations + +These operations ensure that your actual database environment always matches your request. + +To learn more about features and capabilities that the Operator brings in, see [Features and capabilities](features.md). + +## Next steps + +* [Cluster architecture](architecture.md){.md-button} + diff --git a/docs/scaling.md b/docs/scaling.md index ef094166..21c55cef 100644 --- a/docs/scaling.md +++ b/docs/scaling.md @@ -10,7 +10,7 @@ This document focuses on vertical scaling. For deploying high-availability, see ### Scale compute -There are multiple components that the Operator deploys and manages: PostgreSQL instances, pgBouncer connection pooler, pgBackRest and others (See [Architecture](architecture.md) for the full list of components.) +There are multiple components that the Operator deploys and manages: PostgreSQL instances, pgBouncer connection pooler, pgBackRest and others (see [Architecture](architecture.md) for the full list of components). You can manage compute resources for a specific component using the corresponding section in the Custom Resource manifest. We follow the structure for [requests and limits :octicons-link-external-16:](https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/) that Kubernetes provides. 
diff --git a/mkdocs-base.yml b/mkdocs-base.yml index aafc9a7f..f33dae58 100644 --- a/mkdocs-base.yml +++ b/mkdocs-base.yml @@ -172,9 +172,12 @@ extra: # Used in main.html template and can't be externalized nav: - Home: index.md - - Discover the Operator: + - Understand the Operator: + - "Kubernetes Operator concepts": operator-concepts.md + - "Features and capabilities": features.md + - "How the Operator works": operator-how-it-works.md + - "Cluster architecture": architecture.md - "Comparison with other solutions": compare.md - - "Design and architecture": architecture.md - get-help.md - Quickstart guide: - "Overview": quickstart.md diff --git a/requirements.txt b/requirements.txt index c35e3960..b1692f1f 100644 --- a/requirements.txt +++ b/requirements.txt @@ -12,4 +12,5 @@ mkdocs-section-index mkdocs-htmlproofer-plugin mkdocs-meta-descriptions-plugin mike -mkdocs-git-committers-plugin-2 \ No newline at end of file +mkdocs-git-committers-plugin-2 +diagrams>=0.23.0 \ No newline at end of file