From bad522aa67bc5a71188807246d5d70e45c6867bf Mon Sep 17 00:00:00 2001 From: ian-flores Date: Mon, 2 Mar 2026 12:19:22 -0800 Subject: [PATCH 1/4] docs: tighten prose across documentation --- CONTRIBUTING.md | 36 +-- README.md | 14 +- docs/README.md | 14 +- docs/api-reference.md | 30 +-- docs/architecture.md | 78 +++--- docs/guides/adding-config-options.md | 24 +- docs/guides/authentication-setup.md | 90 +++---- docs/guides/connect-configuration.md | 66 ++--- docs/guides/packagemanager-configuration.md | 26 +- docs/guides/product-team-site-management.md | 22 +- docs/guides/troubleshooting.md | 268 ++++++++++---------- docs/guides/upgrading.md | 40 +-- docs/guides/workbench-configuration.md | 88 +++---- docs/testing.md | 16 +- 14 files changed, 406 insertions(+), 406 deletions(-) diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md index 55d6745..f9c43df 100644 --- a/CONTRIBUTING.md +++ b/CONTRIBUTING.md @@ -1,6 +1,6 @@ # Contributing to Team Operator -Welcome! We appreciate your interest in contributing to the Team Operator project. This guide will help you get started with development and understand our contribution workflow. +This guide covers development setup and contribution workflow for the Team Operator project. ## Table of Contents @@ -16,7 +16,7 @@ Welcome! We appreciate your interest in contributing to the Team Operator projec ## Project Overview -Team Operator is a Kubernetes operator built with [Kubebuilder](https://book.kubebuilder.io/) that automates the deployment, configuration, and management of Posit Team products (Workbench, Connect, Package Manager, and Chronicle) within Kubernetes clusters. +Team Operator is a Kubernetes operator built with [Kubebuilder](https://book.kubebuilder.io/) that automates deployment, configuration, and management of Posit Team products (Workbench, Connect, Package Manager, and Chronicle) within Kubernetes clusters. > **Note**: This repository is under active development and is not yet ready for production use. 
@@ -149,13 +149,13 @@ just mtest git checkout -b your-feature-name ``` -2. Keep branch names descriptive and use hyphens (avoid slashes): +2. Use descriptive branch names with hyphens (avoid slashes): - Good: `add-workbench-scaling`, `fix-database-connection` - Avoid: `feature/workbench`, `fix/db` ### PR Title Conventions (Required) -We use [Conventional Commits](https://www.conventionalcommits.org/) and **squash merging**. Your PR title becomes the commit message, so it must follow this format: +We use [Conventional Commits](https://www.conventionalcommits.org/) and squash merging. Your PR title becomes the commit message and must follow this format: ``` (): @@ -203,11 +203,11 @@ feat!: change API response format go vet ./... ``` -3. **Follow existing patterns** - New code should look like it belongs in the codebase. +3. **Follow existing patterns.** New code should match the codebase. -4. **Keep functions focused** - Each function should do one thing well. +4. **Keep functions focused.** Each function should do one thing well. -5. **Use descriptive names** - Names should reveal intent. +5. **Use descriptive names.** Names should reveal intent. ### Adding New CRD Fields @@ -343,13 +343,13 @@ The following checks must pass: ### Merging -This repository uses **squash and merge**. Your PR title becomes the final commit message on `main`. This is why the PR title format is enforced - semantic-release analyzes these commit messages to determine version bumps. +This repository uses squash and merge. Your PR title becomes the final commit message on `main`. The PR title format is enforced because semantic-release analyzes commit messages to determine version bumps. ### Review Expectations - PRs require at least one approval before merging -- Address all review comments or explain why you disagree -- Keep PRs focused - smaller PRs are easier to review +- Address all review comments or explain disagreements +- Keep PRs focused. Smaller PRs are easier to review. 
- Respond to feedback promptly ## Code Review Guidelines @@ -358,9 +358,9 @@ We follow specific guidelines for code review. For detailed review standards, se ### Core Principles -- **Simplicity** - Prefer explicit over clever -- **Maintainability** - Follow existing patterns in the codebase -- **Security** - Extra scrutiny for credential handling, RBAC, and network operations +- **Simplicity:** Prefer explicit over clever +- **Maintainability:** Follow existing patterns in the codebase +- **Security:** Extra scrutiny for credential handling, RBAC, and network operations ### Review Checklist by Area @@ -419,12 +419,12 @@ Once installed, roborev will run after each `git commit` and submit a review for ## Getting Help -If you have questions or need help: +If you have questions: -1. **Check existing documentation** - README.md, this guide, and inline code comments -2. **Search existing issues** - Your question may have been answered before -3. **Open an issue** - For bugs, feature requests, or questions -4. **Contact Posit** - For production use inquiries, [contact Posit](https://posit.co/schedule-a-call/) +1. **Check existing documentation:** README.md, this guide, and inline code comments +2. **Search existing issues:** Your question may have been answered +3. **Open an issue:** For bugs, feature requests, or questions +4. **Contact Posit:** For production use inquiries, [contact Posit](https://posit.co/schedule-a-call/) ## Quick Reference diff --git a/README.md b/README.md index f0ae0a7..13cfd26 100644 --- a/README.md +++ b/README.md @@ -3,7 +3,7 @@ > **Warning** > This repository is under active development and is not yet ready for production use. Please [contact Posit](https://posit.co/schedule-a-call/) before using this operator. -A Kubernetes operator that manages the deployment and lifecycle of Posit Team products (Workbench, Connect, Package Manager, and Chronicle) within Kubernetes clusters. 
+A Kubernetes operator that manages deployment and lifecycle of Posit Team products (Workbench, Connect, Package Manager, and Chronicle) within Kubernetes clusters. ## Table of Contents @@ -22,7 +22,7 @@ A Kubernetes operator that manages the deployment and lifecycle of Posit Team pr ## Overview -The Team Operator is a Kubernetes controller built using [Kubebuilder](https://book.kubebuilder.io/) that automates the deployment, configuration, and management of Posit Team products. It handles: +The Team Operator is a Kubernetes controller built with [Kubebuilder](https://book.kubebuilder.io/) that automates deployment, configuration, and management of Posit Team products. It handles: - Multi-product Posit Team deployments through a single `Site` Custom Resource - Database provisioning and management for each product @@ -59,7 +59,7 @@ spec: ``` **Layout behavior:** -- When Academy is hidden (default), the three core products (Workbench, Connect, Package Manager) are displayed with Workbench and Connect in the first row, and Package Manager centered in the second row +- When Academy is hidden (default), the three core products (Workbench, Connect, Package Manager) display with Workbench and Connect in the first row, Package Manager centered in the second row - When Academy is shown, all four products display in a 2x2 grid **Static assets:** @@ -131,16 +131,16 @@ just helm-uninstall # Uninstall via Helm make go-test ``` -**Integration tests** — two workflows: +**Integration tests** use two workflows: -*One-shot (CI-style):* creates a cluster, runs all tests, and tears everything down. +*One-shot (CI-style):* creates a cluster, runs all tests, tears everything down. ```bash make test-kind # create → deploy → test → destroy make test-kind-full # same, but forces a clean cluster first ``` -*Dev loop (recommended for iterative development):* keep the cluster running between test runs. 
+*Dev loop (recommended for iterative development):* keeps the cluster running between test runs. ```bash # One-time setup: create cluster and deploy operator @@ -220,7 +220,7 @@ Site CR (single source of truth) └── PostgreSQL schemas, credentials, migrations ``` -Each product has dedicated database schemas and isolated credentials. Workbench and Connect support off-host execution where user workloads run in separate Kubernetes Jobs. Chronicle collects telemetry via sidecars injected into product pods. +Each product has dedicated database schemas and isolated credentials. Workbench and Connect support off-host execution with user workloads in separate Kubernetes Jobs. Chronicle collects telemetry via sidecars injected into product pods. For detailed architecture diagrams with component explanations, see the [Architecture Documentation](docs/architecture.md). diff --git a/docs/README.md b/docs/README.md index 63bd177..4333cb5 100644 --- a/docs/README.md +++ b/docs/README.md @@ -1,10 +1,10 @@ # Team Operator -The Team Operator is a Kubernetes operator that manages the deployment and configuration of Posit Team products within a Kubernetes cluster. +The Team Operator manages deployment and configuration of Posit Team products in Kubernetes. ## Overview -The Team Operator automates the deployment and lifecycle management of: +The operator automates deployment and lifecycle management of: - **Posit Workbench** - Interactive development environment - **Posit Connect** - Publishing and sharing platform - **Posit Package Manager** - Package repository management @@ -24,7 +24,7 @@ Site CRD (single source of truth) └── Keycloak configuration ``` -The Site controller watches for Site resources and reconciles product-specific Custom Resources for each enabled product. +The Site controller watches Site resources and creates product-specific Custom Resources for enabled products. 
### Overall System Architecture @@ -286,14 +286,14 @@ The `Site` Custom Resource is the primary configuration point. It contains: ### Configuration Propagation -Configuration flows from Site CRD to individual product CRDs: +Configuration flows from Site CRD to product CRDs: 1. User edits Site spec 2. Site controller detects change 3. Site controller updates product CRs 4. Product controllers reconcile deployments -See [Adding Config Options](../guides/adding-config-options.md) for details on extending configuration. +See [Adding Config Options](../guides/adding-config-options.md) for extending configuration. ## Quick Start @@ -322,8 +322,8 @@ Team Operator uses two namespaces: | Namespace | Purpose | |-----------|---------| -| `posit-team-system` | Where the operator controller runs | -| `posit-team` (or configured `watchNamespace`) | Where Site CRs and deployed products live | +| `posit-team-system` | Operator controller runs here | +| `posit-team` (or configured `watchNamespace`) | Site CRs and deployed products run here | ## Related Documentation diff --git a/docs/api-reference.md b/docs/api-reference.md index 5616194..67e38cd 100644 --- a/docs/api-reference.md +++ b/docs/api-reference.md @@ -1,6 +1,6 @@ # Team Operator API Reference -This document provides a comprehensive reference for the Custom Resource Definitions (CRDs) provided by the Team Operator. +This document covers the Custom Resource Definitions (CRDs) the Team Operator provides. **API Group:** `team.posit.co/v1beta1` @@ -26,7 +26,7 @@ This document provides a comprehensive reference for the Custom Resource Definit ## Site -The Site CRD is the primary resource for managing a complete Posit Team deployment. It orchestrates all product components (Connect, Workbench, Package Manager, Chronicle) within a single site. +The Site CRD is the primary resource for managing a complete Posit Team deployment. 
It orchestrates all product components (Connect, Workbench, Package Manager, Chronicle) in a single site. **Kind:** `Site` **Plural:** `sites` @@ -36,16 +36,16 @@ The Site CRD is the primary resource for managing a complete Posit Team deployme | Field | Type | Required | Description | |-------|------|----------|-------------| -| `.spec.domain` | `string` | **Yes** | The core domain name associated with the Posit Team Site | +| `.spec.domain` | `string` | **Yes** | The core domain name for the Posit Team Site | | `.spec.awsAccountId` | `string` | No | AWS Account ID used for EKS-to-IAM annotations | | `.spec.clusterDate` | `string` | No | Cluster date ID (YYYYmmdd) used for EKS-to-IAM annotations | | `.spec.workloadCompoundName` | `string` | No | Name for the workload | | `.spec.secretType` | `SiteSecretType` | No | **DEPRECATED** - Type of secret management to use | | `.spec.ingressClass` | `string` | No | Ingress class for creating ingress routes | | `.spec.ingressAnnotations` | `map[string]string` | No | Annotations applied to all ingress routes | -| `.spec.imagePullSecrets` | `[]string` | No | Image pull secrets for all image pulls (must exist in namespace) | -| `.spec.volumeSource` | [`VolumeSource`](#volumesource) | No | Definition of where volumes should be created from | -| `.spec.sharedDirectory` | `string` | No | Name of directory mounted into Workbench and Connect at `/mnt/` (no slashes) | +| `.spec.imagePullSecrets` | `[]string` | No | Image pull secrets for all image pulls (secrets must exist in namespace) | +| `.spec.volumeSource` | [`VolumeSource`](#volumesource) | No | Where volumes are created from | +| `.spec.sharedDirectory` | `string` | No | Directory name mounted into Workbench and Connect at `/mnt/` (no slashes) | | `.spec.volumeSubdirJobOff` | `bool` | No | Disables VolumeSubdir provisioning Kubernetes job | | `.spec.extraSiteServiceAccounts` | `[]ServiceAccountConfig` | No | Additional service accounts prefixed by `-` | | `.spec.secret` | 
[`SecretConfig`](#secretconfig) | No | Secret management configuration for this Site | @@ -57,7 +57,7 @@ The Site CRD is the primary resource for managing a complete Posit Team deployme | `.spec.logFormat` | `LogFormat` | No | Log output format | | `.spec.networkTrust` | `NetworkTrust` | No | Network trust level (0-100, default: 100) | | `.spec.packageManagerUrl` | `string` | No | Package Manager URL for Workbench (defaults to local Package Manager) | -| `.spec.efsEnabled` | `bool` | No | Enable EFS for this site (allows workbench sessions to access EFS mount targets) | +| `.spec.efsEnabled` | `bool` | No | Enable EFS for this site (workbench sessions can access EFS mount targets) | | `.spec.vpcCIDR` | `string` | No | VPC CIDR block for EFS network policies | | `.spec.enableFqdnHealthChecks` | `*bool` | No | Enable FQDN-based health check targets for Grafana Alloy (default: true) | @@ -119,7 +119,7 @@ spec: ## Connect -The Connect CRD manages standalone Posit Connect deployments. When using the Site CRD, Connect configuration is typically specified via `.spec.connect` rather than creating a separate Connect resource. +The Connect CRD manages standalone Posit Connect deployments. When using the Site CRD, specify Connect configuration through `.spec.connect` instead of creating a separate Connect resource. **Kind:** `Connect` **Plural:** `connects` @@ -203,7 +203,7 @@ spec: ## Workbench -The Workbench CRD manages standalone Posit Workbench deployments. When using the Site CRD, Workbench configuration is typically specified via `.spec.workbench` rather than creating a separate Workbench resource. +The Workbench CRD manages standalone Posit Workbench deployments. When using the Site CRD, specify Workbench configuration through `.spec.workbench` instead of creating a separate Workbench resource. **Kind:** `Workbench` **Plural:** `workbenches` @@ -292,7 +292,7 @@ spec: ## PackageManager -The PackageManager CRD manages standalone Posit Package Manager deployments. 
When using the Site CRD, Package Manager configuration is typically specified via `.spec.packageManager` rather than creating a separate PackageManager resource. +The PackageManager CRD manages standalone Posit Package Manager deployments. When using the Site CRD, specify Package Manager configuration through `.spec.packageManager` instead of creating a separate PackageManager resource. **Kind:** `PackageManager` **Plural:** `packagemanagers` @@ -581,7 +581,7 @@ Authentication configuration used by Connect and Workbench. ### SecretConfig -Configuration for secret management. +Secret management configuration. | Field | Type | Description | |-------|------|-------------| @@ -598,7 +598,7 @@ Configuration for secret management. ### VolumeSource -Configuration for the source of persistent volumes. +Source configuration for persistent volumes. | Field | Type | Description | |-------|------|-------------| @@ -649,7 +649,7 @@ Product license configuration. ### SessionConfig -Configuration for session pods (Connect and Workbench). +Session pod configuration (Connect and Workbench). | Field | Type | Description | |-------|------|-------------| @@ -743,8 +743,8 @@ These types are used within the Site CRD for product configuration. | Field | Type | Description | |-------|------|-------------| -| `.enabled` | `*bool` | Controls whether Connect is running (default: true). Setting to `false` suspends Connect: stops pods and removes ingress/service, but preserves PVC, database, and secrets. Re-enabling restores full service without data loss. See [Connect Configuration Guide](guides/connect-configuration.md#enablingdisabling-connect). | -| `.teardown` | `*bool` | When `true` and `enabled` is `false`, permanently destroys all Connect resources including the database, secrets, and PVC. Re-enabling after teardown starts fresh with an empty database. Defaults to `false`. | +| `.enabled` | `*bool` | Controls whether Connect runs (default: true). 
Setting to `false` suspends Connect: stops pods and removes ingress/service, but preserves PVC, database, and secrets. Re-enabling restores full service without data loss. See [Connect Configuration Guide](guides/connect-configuration.md#enablingdisabling-connect). | +| `.teardown` | `*bool` | When `true` and `enabled` is `false`, destroys all Connect resources including database, secrets, and PVC. Re-enabling after teardown starts fresh with an empty database. Defaults to `false`. | | `.license` | `LicenseSpec` | License configuration | | `.volume` | `*VolumeSpec` | Data volume | | `.nodeSelector` | `map[string]string` | Node selector | diff --git a/docs/architecture.md b/docs/architecture.md index eee51ef..4a6648f 100644 --- a/docs/architecture.md +++ b/docs/architecture.md @@ -1,6 +1,6 @@ # Team Operator Architecture -This document provides detailed architecture diagrams and explanations for the Team Operator and its managed products. +This document covers architecture diagrams and explanations for the Team Operator and its managed products. ## Table of Contents @@ -16,7 +16,7 @@ This document provides detailed architecture diagrams and explanations for the T ## System Overview -The Team Operator follows the Kubernetes operator pattern: a Site Custom Resource (CR) serves as the single source of truth, and controllers reconcile the desired state into running Kubernetes resources. +The Team Operator follows the Kubernetes operator pattern. A Site Custom Resource (CR) is the single source of truth. Controllers reconcile the desired state into running Kubernetes resources. ``` User creates Site CR @@ -43,7 +43,7 @@ Kubernetes resources created (Deployments, Services, Ingress, etc.) ## Database Architecture -Each Posit Team product requires database storage. The operator provisions separate databases with dedicated users and schemas. +Each Posit Team product needs database storage. The operator provisions separate databases with dedicated users and schemas. 
```mermaid flowchart TB @@ -91,14 +91,14 @@ flowchart TB Each product gets a dedicated database user with access only to its own schemas. This provides: - **Security isolation**: Products cannot access each other's data -- **Resource tracking**: Database connections can be attributed to specific products +- **Resource tracking**: You can attribute database connections to specific products - **Independent credentials**: Rotating one product's credentials doesn't affect others --- ## Connect Architecture -Posit Connect is a publishing platform for data science content. The operator manages its deployment including off-host content execution. +Posit Connect is a publishing platform for data science content. The operator manages deployment, including off-host content execution. ```mermaid flowchart TB @@ -176,7 +176,7 @@ flowchart TB | Component | Description | |-----------|-------------| -| **Manual Setup** | One-time configuration performed by the administrator before deployment | +| **Manual Setup** | One-time configuration an administrator performs before deployment | | **License** | Posit Connect license file or activation key, stored in a Kubernetes Secret or AWS Secrets Manager | | **Auth Client Secret** | OIDC/SAML client credentials for SSO integration (client ID and secret from your IdP) | | **Main DB Connection** | PostgreSQL connection string for the external database server | @@ -186,27 +186,27 @@ flowchart TB | Component | Description | |-----------|-------------| | **Site Controller** | Watches Site CRs and creates product-specific CRs (Connect, Workbench, etc.). Manages shared resources like PersistentVolumes. | -| **Database Controller** | Creates databases and schemas within the PostgreSQL server. Generates credentials and stores them in Secrets. | -| **Connect Controller** | Watches Connect CRs and creates all Kubernetes resources needed to run Connect. | +| **Database Controller** | Creates databases and schemas in the PostgreSQL server. 
Generates credentials and stores them in Secrets. | +| **Connect Controller** | Watches Connect CRs and creates all Kubernetes resources Connect needs. | #### Kubernetes Resources (Green) | Component | Description | |-----------|-------------| -| **PersistentVolume (PV)** | Cluster-level storage resource representing physical storage (NFS, FSx, Azure NetApp) | -| **PersistentVolumeClaim (PVC)** | Namespace-scoped claim that binds to a PV. Mounted into the Connect pod for content storage. | +| **PersistentVolume (PV)** | Cluster-level storage resource that represents physical storage (NFS, FSx, Azure NetApp) | +| **PersistentVolumeClaim (PVC)** | Namespace-scoped claim that binds to a PV. Mounts into the Connect pod for content storage. | | **ConfigMaps** | Connect configuration files (`rstudio-connect.gcfg`) generated from the CR spec | -| **DB Password Secret** | Auto-generated database credentials created by the Database Controller | +| **DB Password Secret** | Auto-generated database credentials the Database Controller creates | | **Secret Key** | Encryption key for Connect's internal data encryption | -| **Connect Pod** | The main Connect server container running the publishing platform | -| **Ingress** | Routes external traffic to the Connect Service based on hostname | -| **Service** | Kubernetes Service providing stable networking for the Connect Pod | +| **Connect Pod** | The main Connect server container that runs the publishing platform | +| **Ingress** | Routes external traffic to the Connect Service by hostname | +| **Service** | Kubernetes Service that provides stable networking for the Connect Pod | ### Off-Host Execution -When off-host execution is enabled, Connect runs content (Shiny apps, APIs, reports) in separate Kubernetes Jobs rather than in the main Connect pod. This provides: +When off-host execution is enabled, Connect runs content (Shiny apps, APIs, reports) in separate Kubernetes Jobs instead of the main Connect pod. 
This provides: - **Resource isolation**: Content processes don't compete with the Connect server -- **Scalability**: Content can scale independently +- **Scalability**: Content scales independently - **Security**: Content runs with minimal privileges See the [Connect Configuration Guide](guides/connect-configuration.md) for details. @@ -215,7 +215,7 @@ See the [Connect Configuration Guide](guides/connect-configuration.md) for detai ## Workbench Architecture -Posit Workbench provides IDE environments (RStudio, VS Code, Jupyter) for data scientists. The operator manages both the main server and user session pods. +Posit Workbench provides IDE environments (RStudio, VS Code, Jupyter) for data scientists. The operator manages the main server and user session pods. ```mermaid flowchart TB @@ -328,28 +328,28 @@ Same as Connect - see [Connect Architecture](#component-descriptions) above. | Component | Description | |-----------|-------------| -| **PersistentVolume / PVC** | Shared project storage accessible by both the server and all session pods | +| **PersistentVolume / PVC** | Shared project storage the server and all session pods can access | | **Home Directory PVC** | User home directories, mounted into session pods at `/home/{username}` | -| **ConfigMaps** | Workbench configuration files including `rserver.conf`, `launcher.conf`, and IDE settings | -| **Job Templates** | Kubernetes Job/Service templates used by the Launcher to create session pods | -| **Workbench Pod** | The main Workbench server handling authentication, the web UI, and session management | +| **ConfigMaps** | Workbench configuration files: `rserver.conf`, `launcher.conf`, and IDE settings | +| **Job Templates** | Kubernetes Job/Service templates the Launcher uses to create session pods | +| **Workbench Pod** | The main Workbench server that handles authentication, the web UI, and session management | | **Ingress / Service** | Network routing for external access to Workbench | #### Session 
Infrastructure (Orange)

| Component | Description |
|-----------|-------------|
-| **Job Launcher** | Component within Workbench that creates Kubernetes Jobs for user sessions |
+| **Job Launcher** | Component in Workbench that creates Kubernetes Jobs for user sessions |
| **Session Pod** | Individual IDE sessions (RStudio, VS Code, Jupyter) running as Kubernetes Jobs. Each user session gets its own pod with dedicated resources. |

### Session Lifecycle

1. User logs into Workbench and requests a new session
-2. Job Launcher creates a Kubernetes Job using the configured template
+2. Job Launcher creates a Kubernetes Job from the configured template
-3. Session Pod starts with the selected IDE and mounts user's home directory
+3. Session Pod starts with the selected IDE and mounts the user's home directory
-4. User works in the session; all files are saved to persistent storage
+4. User works in the session. All files are saved to persistent storage
5. When the session ends, the Job completes and the Pod is cleaned up
-6. User's work persists in the Home Directory PVC for the next session
+6. The user's work persists in the Home Directory PVC for the next session

### Storage Architecture

@@ -367,7 +367,7 @@ See the [Workbench Configuration Guide](guides/workbench-configuration.md) for d

## Package Manager Architecture

-Posit Package Manager provides a local repository for R and Python packages. It can mirror public repositories and host private packages.
+Posit Package Manager provides a local repository for R and Python packages. It mirrors public repositories and hosts private packages.
```mermaid flowchart TB @@ -471,10 +471,10 @@ flowchart TB | Component | Description | |-----------|-------------| -| **S3 Bucket** | AWS S3 storage for package binaries (recommended for AWS deployments) | -| **Azure Files** | Azure file storage for package binaries (recommended for Azure deployments) | +| **S3 Bucket** | AWS S3 storage for package binaries (recommended for AWS) | +| **Azure Files** | Azure file storage for package binaries (recommended for Azure) | -Package Manager can use either cloud storage backend. The choice typically depends on your cloud provider: +Package Manager can use either cloud storage backend. The choice depends on your cloud provider: - **AWS**: Use S3 for best performance and cost - **Azure**: Use Azure Files with the CSI driver - **On-premises**: Use the local PVC for package storage @@ -494,7 +494,7 @@ Package Manager can use either cloud storage backend. The choice typically depen | **PersistentVolume / PVC** | Local storage for temporary files and cache (when not using cloud storage) | | **ConfigMaps** | Package Manager configuration (`rstudio-pm.gcfg`) | | **SSH Key Secret** | Mounted SSH keys for Git authentication during package builds | -| **Package Manager Pod** | The main server handling package requests, sync operations, and builds | +| **Package Manager Pod** | The main server that handles package requests, sync operations, and builds | | **Ingress / Service** | Network routing for package installation requests | ### Package Storage Options @@ -519,7 +519,7 @@ See the [Package Manager Configuration Guide](guides/packagemanager-configuratio ## Flightdeck Architecture -Flightdeck is the landing page and navigation hub for Posit Team deployments. It provides a simple dashboard for users to access the various products. +Flightdeck is the landing page and navigation hub for Posit Team deployments. It provides a dashboard for users to access the products. 
```mermaid flowchart TB @@ -593,8 +593,8 @@ flowchart TB | Component | Description | |-----------|-------------| -| **ConfigMap** | Configuration for Flightdeck including enabled features and product URLs | -| **Flightdeck Pod** | Static web server serving the landing page HTML/CSS/JS | +| **ConfigMap** | Configuration for Flightdeck: enabled features and product URLs | +| **Flightdeck Pod** | Static web server that serves the landing page HTML/CSS/JS | | **Ingress** | Routes traffic from the base domain to Flightdeck | | **Service** | Kubernetes Service for the Flightdeck Pod | @@ -608,12 +608,12 @@ flowchart TB ### Features -Flightdeck is intentionally simple: +Flightdeck is simple by design: - **No database**: Serves static content only - **No authentication**: Relies on product-level authentication - **Configurable layout**: Shows only enabled products -- **Optional Academy**: Can display a fourth card for Posit Academy +- **Optional Academy**: Displays a fourth card for Posit Academy ### Configuration Options @@ -627,7 +627,7 @@ Flightdeck is intentionally simple: ## Chronicle Architecture -Chronicle is the telemetry and usage tracking service for Posit Team. It collects metrics from Connect and Workbench via sidecar containers. +Chronicle is the telemetry and usage tracking service for Posit Team. It collects metrics from Connect and Workbench through sidecar containers. 
```mermaid
flowchart TB

@@ -730,7 +730,7 @@ flowchart TB

| Component | Description |
|-----------|-------------|
| **Connect/Workbench Container** | Main product container that generates usage metrics |
-| **Chronicle Sidecar** | Lightweight agent that collects metrics from the main container and forwards them to the Chronicle service |
+| **Chronicle Sidecar** | Lightweight agent that collects metrics from the main container and forwards them to Chronicle |

#### Telemetry Storage (Light Blue)

| Component | Description |

@@ -749,7 +749,7 @@ flowchart TB

### Sidecar Injection

-The Chronicle sidecar is automatically injected into product pods when:
+The Chronicle sidecar is injected automatically into product pods when:
- Chronicle is enabled in the Site spec (`spec.chronicle.enabled: true`)
- The product has Chronicle integration enabled
diff --git a/docs/guides/adding-config-options.md b/docs/guides/adding-config-options.md
index b1cf8f1..77ae2ab 100644
--- a/docs/guides/adding-config-options.md
+++ b/docs/guides/adding-config-options.md
@@ -1,6 +1,6 @@
 # Adding Configuration Options to Team Operator

-This guide walks through the process of adding new configuration options to Posit Team products managed by the Team Operator. It covers the complete flow from Site CRD to product-specific configuration.
+This guide shows how to add configuration options to Posit Team products managed by the Team Operator. Configuration flows from the Site CRD through to product-specific settings.

## Configuration Architecture Overview

@@ -24,19 +24,19 @@ Product Controller (generates actual config files)

### Key Concepts

-1. **Site CRD**: The primary user-facing resource. Users configure their entire Posit Team deployment through a single Site resource.
+1. **Site CRD**: The user-facing resource. Users configure their Posit Team deployment through a single Site resource.

-2. 
**Internal{Product}Spec**: Nested structs within SiteSpec that hold product-specific configuration at the Site level. -3. **Product CRs**: Individual Custom Resources (Connect, Workbench, etc.) created by the Site controller. These are implementation details users typically don't interact with directly. +3. **Product CRs**: Individual Custom Resources (Connect, Workbench, etc.) created by the Site controller. Users don't interact with these directly. -4. **Propagation**: The Site controller maps Site-level configuration to the appropriate Product CR fields. +4. **Propagation**: The Site controller maps Site-level configuration to Product CR fields. ## Step-by-Step: Adding a New Config Option ### Prerequisites -Before adding a config option, gather the following: +Before adding a config option, gather this information: | Item | Description | Example | |------|-------------|---------| @@ -190,7 +190,7 @@ func (r *ConnectReconciler) generateConfig(connect *v1beta1.Connect) string { **File**: `internal/controller/core/site_test.go` -Add tests to verify propagation works correctly. +Add tests to verify propagation. ```go func TestSiteReconciler_MaxConnections(t *testing.T) { @@ -238,7 +238,7 @@ just manifests ### Optional Integer (Pointer) -Use pointers for optional numeric values where zero is a valid setting: +Use pointers for optional numeric values where zero is valid: ```go // MaxWorkers sets the maximum number of workers @@ -273,7 +273,7 @@ LogLevel string `json:"logLevel,omitempty"` EnableFeatureX bool `json:"enableFeatureX,omitempty"` ``` -**Note**: With `omitempty`, false values are omitted from JSON. Only propagate when explicitly true: +**Note**: With `omitempty`, false values are omitted from JSON. Propagate only when explicitly true: ```go if site.Spec.Product.EnableFeatureX { @@ -397,14 +397,14 @@ Enabled bool `json:"enabled,omitempty"` ### 4. 
Product Config Name Mismatch -Always verify the product config path matches what the product expects: +Verify the product config path matches what the product expects: - Check product admin guides - Look at existing examples in the codebase - Test with the actual product ### 5. Not Regenerating CRDs -After modifying types, always run: +After modifying types, run: ```bash just generate just manifests @@ -412,7 +412,7 @@ just manifests ### 6. Overwriting Existing Config -Be careful not to overwrite configuration that may have been set elsewhere: +Don't overwrite configuration that may have been set elsewhere: **Wrong**: ```go diff --git a/docs/guides/authentication-setup.md b/docs/guides/authentication-setup.md index 4ca7f4d..c453cca 100644 --- a/docs/guides/authentication-setup.md +++ b/docs/guides/authentication-setup.md @@ -1,6 +1,6 @@ # Authentication Setup Guide -This guide provides comprehensive documentation for configuring authentication in Posit Team Operator. Team Operator supports multiple authentication methods for both Posit Connect and Posit Workbench. +This guide documents authentication configuration in Posit Team Operator. Team Operator supports multiple authentication methods for both Posit Connect and Posit Workbench. ## Table of Contents @@ -58,7 +58,7 @@ Team Operator supports three authentication types: ## OIDC Configuration -OpenID Connect (OIDC) is the recommended authentication method for enterprise deployments. +OpenID Connect (OIDC) is recommended for enterprise deployments. ### Basic OIDC Configuration @@ -78,14 +78,14 @@ spec: ### Required IdP Settings -Before configuring OIDC in Team Operator, you must configure your Identity Provider: +Before configuring OIDC in Team Operator, configure your Identity Provider: -1. **Create an OAuth2/OIDC Application** in your IdP -2. **Configure Redirect URIs**: +1. Create an OAuth2/OIDC Application in your IdP +2. 
Configure Redirect URIs: - Connect: `https://connect.example.com/__login__/callback` - Workbench: `https://workbench.example.com/oidc/callback` -3. **Note the Client ID** (provided in the spec) -4. **Generate a Client Secret** (stored in secrets) +3. Note the Client ID (used in the spec) +4. Generate a Client Secret (stored in secrets) ### Client Secret Configuration @@ -141,7 +141,7 @@ spec: **Disabling Groups Claim:** -Some IdPs do not support a groups claim. To explicitly disable it: +Some IdPs do not support a groups claim. Disable it explicitly: ```yaml spec: @@ -257,7 +257,7 @@ spec: ## SAML Configuration -SAML 2.0 authentication is supported for enterprise environments using SAML-based IdPs. +SAML 2.0 authentication is supported for enterprise environments. ### Basic SAML Configuration @@ -273,11 +273,11 @@ spec: ### Attribute Profiles -Team Operator supports two approaches for SAML attribute mapping: +Team Operator supports two approaches for SAML attribute mapping. #### 1. Using IdP Attribute Profiles -Use a predefined attribute profile that matches your IdP: +Use a predefined attribute profile matching your IdP: ```yaml spec: @@ -312,7 +312,7 @@ spec: ### SAML Service Provider (SP) Configuration -Your IdP needs to be configured with the following Service Provider details: +Configure your IdP with these Service Provider details: **Connect:** - Entity ID: `https://connect.example.com/__login__` @@ -378,7 +378,7 @@ spec: ## Password Authentication -Password authentication is the simplest authentication method, suitable for development environments. +Password authentication is the simplest method, suitable for development environments. ### Configuration @@ -407,7 +407,7 @@ spec: ## Role-Based Access Control -Team Operator supports automatic role mapping based on group membership from your IdP. +Team Operator supports automatic role mapping based on IdP group membership. 
### Connect Role Mappings @@ -500,11 +500,11 @@ spec: ### Keycloak Features -When enabled, Team Operator: +When enabled, Team Operator does the following: - Deploys a Keycloak instance in the namespace - Creates a PostgreSQL database for Keycloak - Configures ingress routing to `key.` -- Sets up necessary service accounts and RBAC +- Sets up service accounts and RBAC ### Using Keycloak with Products @@ -525,8 +525,8 @@ spec: ### Keycloak Realm Configuration -After Keycloak is deployed, you'll need to: -1. Access Keycloak admin console at `https://key.` +After Keycloak deploys, complete these steps: +1. Access the Keycloak admin console at `https://key.` 2. Create a realm (e.g., "posit") 3. Create clients for each product 4. Configure client credentials and redirect URIs @@ -534,7 +534,7 @@ After Keycloak is deployed, you'll need to: ## Secrets Management -Authentication requires secrets to be properly configured in your secrets provider. +Authentication requires properly configured secrets in your secrets provider. ### Kubernetes Secrets @@ -594,9 +594,9 @@ For `secret.type: aws`, store secrets in AWS Secrets Manager: **Cause:** Groups claim not configured or not included in token. **Debug steps:** -1. Check if `groups: true` is set +1. Verify `groups: true` is set 2. Verify `groupsClaim` matches what your IdP sends -3. Ensure the `groups` scope is requested +3. Verify the `groups` scope is requested 4. Check if your IdP requires special configuration for group claims **Enable OIDC logging for Connect:** @@ -626,8 +626,8 @@ spec: **Cause:** The SAML metadata URL is unreachable from the cluster. **Solutions:** -- Ensure the metadata URL is accessible from pods -- Check network policies allow outbound connections +- Verify the metadata URL is accessible from pods +- Verify network policies allow outbound connections - Verify DNS resolution works #### 2. "IdPAttributeProfile Cannot Be Specified Together..." @@ -658,36 +658,36 @@ samlEmailAttribute: "..." 
To debug OIDC token claims: -1. **Enable Debug Logging:** - ```yaml - spec: - connect: - debug: true - ``` +Enable Debug Logging: +```yaml +spec: + connect: + debug: true +``` -2. **Check Pod Logs:** - ```bash - kubectl logs -n posit-team deploy/-connect -f - ``` +Check Pod Logs: +```bash +kubectl logs -n posit-team deploy/-connect -f +``` -3. **Decode JWT Tokens:** - Use [jwt.io](https://jwt.io) to inspect tokens and verify claims. +Decode JWT Tokens: +Use [jwt.io](https://jwt.io) to inspect tokens and verify claims. ### Group Membership Issues If users aren't getting the correct roles: -1. **Verify group claim is present:** - - Check the `groupsClaim` field matches your IdP - - Some IdPs use nested claims (e.g., `realm_access.roles`) +Verify group claim is present: +- Check the `groupsClaim` field matches your IdP +- Some IdPs use nested claims (e.g., `realm_access.roles`) -2. **Check group name matching:** - - Group names in role mappings must match exactly - - Group names are case-sensitive +Check group name matching: +- Group names in role mappings must match exactly +- Group names are case-sensitive -3. **Verify IdP configuration:** - - Ensure groups are included in the token - - Check token size limits (large group lists may be truncated) +Verify IdP configuration: +- Verify groups are included in the token +- Check token size limits (large group lists may be truncated) ### Workbench-Specific Issues @@ -698,7 +698,7 @@ Workbench may include port numbers in redirect URIs. The operator sets a header X-Rstudio-Request: https:// ``` -If you see port 443 in redirect URIs, ensure Traefik middleware is correctly applied. +If you see port 443 in redirect URIs, verify Traefik middleware is correctly applied. 
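For reference, a Traefik middleware that injects such a header looks roughly like the sketch below. It uses Traefik's `Middleware` CRD with `headers.customRequestHeaders`; the resource name and URL are placeholders, and in a Team Operator deployment the operator applies this middleware for you.

```yaml
# Sketch of a Traefik Middleware that sets the X-Rstudio-Request header.
# The name and host below are placeholders; the operator normally manages this.
apiVersion: traefik.io/v1alpha1
kind: Middleware
metadata:
  name: workbench-request-header
spec:
  headers:
    customRequestHeaders:
      X-Rstudio-Request: "https://workbench.example.com"
```

Inspecting the middleware resources in your cluster and confirming one like this is attached to the Workbench ingress route is a quick way to verify the header is actually applied.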
#### User Provisioning diff --git a/docs/guides/connect-configuration.md b/docs/guides/connect-configuration.md index 4cf585e..fdf64ac 100644 --- a/docs/guides/connect-configuration.md +++ b/docs/guides/connect-configuration.md @@ -1,6 +1,6 @@ # Connect Configuration Guide -This comprehensive guide covers all configuration options for Posit Connect when deployed via Team Operator. +This guide covers configuration options for Posit Connect when deployed via Team Operator. ## Table of Contents @@ -26,7 +26,7 @@ This comprehensive guide covers all configuration options for Posit Connect when ## Overview -Posit Connect is a publishing and sharing platform that allows data scientists to share their work with stakeholders. When deployed via Team Operator, Connect runs with off-host execution enabled by default, meaning content executes in isolated Kubernetes Jobs rather than on the Connect server itself. +Posit Connect is a publishing and sharing platform for data science work. When deployed via Team Operator, Connect runs with off-host execution enabled by default. Content executes in isolated Kubernetes Jobs rather than on the Connect server. ### Architecture in Team Operator @@ -51,10 +51,10 @@ Site CR Configuration for Connect flows through two paths: -1. **Site-level configuration** (`spec.connect` in Site CR) - Recommended for most deployments -2. **Direct Connect CR configuration** - For advanced use cases +1. **Site-level configuration** (`spec.connect` in Site CR) for most deployments +2. **Direct Connect CR configuration** for advanced use cases -When using a Site resource, the Site controller generates and manages the Connect CR. Changes to `site.spec.connect` automatically propagate to the Connect deployment. +When using a Site resource, the Site controller generates and manages the Connect CR. Changes to `site.spec.connect` propagate to the Connect deployment. 
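A minimal sketch of the Site-level path: the `apiVersion` group and metadata below are placeholders (check your installed CRDs for the real group), while `spec.connect` mirrors fields used later in this guide.

```yaml
# Sketch: configuring Connect through the Site resource (path 1).
# apiVersion and names are placeholders; only spec.connect mirrors this guide.
apiVersion: example.posit.co/v1beta1  # placeholder API group
kind: Site
metadata:
  name: example-site
  namespace: posit-team
spec:
  connect:
    enabled: true
```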
--- @@ -66,7 +66,7 @@ Connect can be suspended or permanently torn down using the `enabled` and `teard #### Suspending Connect (non-destructive) -Setting `enabled: false` suspends Connect: the Deployment, Service, and Ingress are removed, but the PVC, database, and secrets are preserved. Re-enabling restores full service with all existing data intact. +Setting `enabled: false` suspends Connect. The Deployment, Service, and Ingress are removed, but the PVC, database, and secrets remain. Re-enabling restores full service with all existing data. ```yaml spec: @@ -76,11 +76,11 @@ spec: **When to use `enabled: false`:** -- Customer does not have a Connect license yet — deploy the site without Connect and enable it once a license is purchased -- Temporarily pause Connect during a maintenance window or cost-saving period -- Stop Connect while retaining all content and user data for a possible return +- Customer lacks a Connect license yet. Deploy the site without Connect and enable it once licensed. +- Pause Connect during maintenance windows or cost-saving periods +- Stop Connect while retaining all content and user data -**Re-enabling Connect** after a suspend is as simple as removing the field or setting it back to `true`: +**Re-enabling Connect** after suspension requires removing the field or setting it back to `true`: ```yaml spec: @@ -90,7 +90,7 @@ spec: #### Tearing down Connect (destructive) -To permanently destroy all Connect resources — including the database, secrets, and PVC — set both `enabled: false` and `teardown: true`: +To permanently destroy all Connect resources, including the database, secrets, and PVC, set both `enabled: false` and `teardown: true`: ```yaml spec: @@ -99,13 +99,13 @@ spec: teardown: true # DESTRUCTIVE: deletes database, secrets, and PVC ``` -**This is irreversible.** Re-enabling Connect after a teardown starts completely fresh with a new empty database and no prior content or configuration. 
+**This is irreversible.** Re-enabling Connect after teardown starts fresh with a new empty database and no prior content or configuration. **When to use `teardown: true`:** -- Permanently decommissioning Connect with no intent to restore data -- Reclaiming cluster storage after migrating to a different Connect instance -- Explicitly wiping Connect to start fresh +- Permanently decommission Connect with no intent to restore data +- Reclaim cluster storage after migrating to a different Connect instance +- Wipe Connect to start fresh > **Note:** `teardown: true` has no effect while `enabled` is `true` or unset. You must set `enabled: false` first. @@ -126,7 +126,7 @@ spec: sessionImage: "ghcr.io/rstudio/rstudio-connect-content-init:ubuntu2204-2024.06.0" ``` -**Important:** The `sessionImage` is used as an init container in content execution jobs. It prepares the runtime environment before content runs. +The `sessionImage` is used as an init container in content execution jobs. It prepares the runtime environment before content runs. ### Resource Scaling @@ -317,7 +317,7 @@ spec: # samlEmailAttribute: "http://schemas.xmlsoap.org/ws/2005/05/identity/claims/emailaddress" ``` -**Note:** `samlIdPAttributeProfile` and individual SAML attribute mappings are mutually exclusive. The controller will reject configurations that specify both. +**Note:** `samlIdPAttributeProfile` and individual SAML attribute mappings are mutually exclusive. The controller rejects configurations that specify both. ### Password Authentication @@ -411,7 +411,7 @@ spec: ## Off-Host Execution / Kubernetes Launcher -Off-host execution is **enabled by default** when Connect is deployed via Team Operator. Content runs in isolated Kubernetes Jobs rather than on the Connect server. +Off-host execution is enabled by default when Connect is deployed via Team Operator. Content runs in isolated Kubernetes Jobs rather than on the Connect server. 
### How It Works @@ -557,7 +557,7 @@ images: version: "1.3.340" ``` -**Custom Runtime Images:** To add custom runtime images, you need to modify the Connect CR directly (advanced use case) or use the Connect admin interface after deployment. +**Custom Runtime Images:** To add custom runtime images, modify the Connect CR directly (advanced use case) or use the Connect admin interface after deployment. ### Additional Volumes for Sessions @@ -706,7 +706,7 @@ Authorization: ## Chronicle Integration -Chronicle provides telemetry and metrics collection for Connect. When configured, a Chronicle agent sidecar is automatically injected into the Connect deployment. +Chronicle provides telemetry and metrics collection for Connect. When configured, a Chronicle agent sidecar is injected into the Connect deployment. ### Enabling Chronicle @@ -780,9 +780,9 @@ For AWS Secrets Manager, these keys are automatically mapped to the `-conn ### Email Configuration Notes -- SMTP is considered **experimental** in Team Operator -- For production email, consider using Connect's built-in email configuration after deployment -- The `mailTarget` setting is useful for testing to prevent accidental emails to real users +- SMTP is experimental in Team Operator +- For production email, use Connect's built-in email configuration after deployment +- The `mailTarget` setting prevents accidental emails to real users during testing --- @@ -902,7 +902,7 @@ spec: value: "secret://db-password-key" ``` -**Note:** The `secret://` prefix is special - it triggers the operator to create CSI secret mounts for AWS Secrets Manager values. +**Note:** The `secret://` prefix triggers the operator to create CSI secret mounts for AWS Secrets Manager values. --- @@ -1085,9 +1085,9 @@ spec: ``` 3. 
**Common issues:** - - License not found: Check license secret exists and key is correct + - License not found: Verify license secret exists and key is correct - Database connection failed: Verify database credentials and connectivity - - Volume mount failed: Ensure PVC exists and storage class is available + - Volume mount failed: Verify PVC exists and storage class is available ### Content Sessions Not Running @@ -1102,21 +1102,21 @@ spec: ``` 3. **Common issues:** - - Init container failed: Check session image is accessible + - Init container failed: Verify session image is accessible - Runtime not found: Verify runtime.yaml configuration - - Resource limits exceeded: Check scheduler limits + - Resource limits exceeded: Verify scheduler limits ### Authentication Failures 1. **For OIDC:** - Verify client ID and issuer URL - - Check client secret is in the correct secret backend - - Ensure redirect URIs are configured in your IdP + - Verify client secret is in the correct secret backend + - Verify redirect URIs are configured in your IdP - Enable debug logging: `spec.debug: true` 2. **For SAML:** - Verify metadata URL is accessible - - Check attribute mappings match your IdP + - Verify attribute mappings match your IdP - Review Connect logs for SAML assertion errors ### Database Connection Issues @@ -1151,8 +1151,8 @@ spec: ``` 3. 
**Common issues:** - - Missing `agentImage`: Chronicle sidecar won't be created - - Network policy blocking: Ensure Chronicle server is reachable + - Missing `agentImage`: Chronicle sidecar will not be created + - Network policy blocking: Verify Chronicle server is reachable ### GPU Sessions Not Scheduling diff --git a/docs/guides/packagemanager-configuration.md b/docs/guides/packagemanager-configuration.md index cd80f40..f35dd55 100644 --- a/docs/guides/packagemanager-configuration.md +++ b/docs/guides/packagemanager-configuration.md @@ -1,10 +1,10 @@ # Package Manager Configuration Guide -This guide provides comprehensive documentation for configuring Posit Package Manager within the Team Operator framework. +This guide documents how to configure Posit Package Manager within the Team Operator. ## Overview -Posit Package Manager (PPM) is a repository management server that provides R and Python packages from CRAN, Bioconductor, and PyPI, as well as internal packages built from Git repositories. In Team Operator, Package Manager is deployed as a child resource of a Site. +Posit Package Manager (PPM) provides R and Python packages from CRAN, Bioconductor, and PyPI, plus internal packages built from Git repositories. In Team Operator, Package Manager is deployed as a child resource of a Site. ### Architecture @@ -21,7 +21,7 @@ Site CR └── PodDisruptionBudget ``` -When you configure Package Manager in a Site spec, the Site controller creates a `PackageManager` Custom Resource. The PackageManager controller then reconciles all the Kubernetes resources needed to run the service. +When you configure Package Manager in a Site spec, the Site controller creates a `PackageManager` Custom Resource. The PackageManager controller reconciles the Kubernetes resources needed to run the service. ## Basic Configuration @@ -160,7 +160,7 @@ The license is expected in the vault under the key `pkg-license`. 
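With the Kubernetes secret backend, that key can be provided roughly as below. The secret name `site-secrets` follows an example used elsewhere in these docs; verify it matches the name your Site is configured to read.

```yaml
# Sketch: supplying the Package Manager license via a Kubernetes Secret.
# The secret name mirrors an example elsewhere in these docs; confirm it
# matches your Site's secret configuration before relying on it.
apiVersion: v1
kind: Secret
metadata:
  name: site-secrets
  namespace: posit-team
stringData:
  pkg-license: "<your Package Manager license key>"
```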
## Database Configuration -Package Manager uses PostgreSQL for storing metadata and usage metrics. The operator automatically provisions two database schemas: +Package Manager uses PostgreSQL for metadata and usage metrics. The operator provisions two database schemas: | Schema | Purpose | |--------|---------| @@ -186,7 +186,7 @@ spec: ### Database URLs -The operator constructs database URLs automatically using the format: +The operator constructs database URLs using this format: ``` postgres://username:password@host/database?search_path=pm&sslmode=require @@ -212,7 +212,7 @@ Package Manager supports multiple storage backends for package data. ### S3 Storage (AWS Recommended) -For production deployments on AWS, S3 storage is recommended: +For production deployments on AWS, use S3 storage: ```yaml spec: @@ -258,7 +258,7 @@ The Package Manager service account requires the following S3 permissions: #### IAM Role Association -The operator automatically creates a ServiceAccount with the appropriate IAM role annotation: +The operator creates a ServiceAccount with the IAM role annotation: ```yaml annotations: @@ -320,7 +320,7 @@ spec: ## Git Builder Configuration -Package Manager can build packages from Git repositories. For private repositories, SSH key authentication is required. +Package Manager can build packages from Git repositories. Private repositories require SSH key authentication. ### SSH Key Configuration @@ -345,7 +345,7 @@ spec: ### AWS Secrets Manager SSH Keys -When using AWS Secrets Manager, SSH keys are stored in a dedicated vault: +With AWS Secrets Manager, SSH keys are stored in a dedicated vault: **Vault naming convention:** ``` @@ -440,11 +440,11 @@ RVersion = /opt/R/default ## Secret Management -Package Manager secrets are managed differently based on the Site's secret type. +Secret management depends on the Site's secret type. 
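The backend is selected on the Site spec. The sketch below reuses the `type: "aws"` and `vaultName` values shown in the troubleshooting examples in these docs; field names for other backends should be checked against the CRD reference.

```yaml
# Sketch: choosing the secrets backend for a Site.
# The "aws" values mirror examples elsewhere in these docs.
spec:
  secret:
    type: "aws"
    vaultName: "your-vault-name"
```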
### AWS Secrets Manager -When `secret.type: aws`, the following secrets are retrieved from AWS Secrets Manager: +When `secret.type: aws`, these secrets are retrieved from AWS Secrets Manager: | Secret Key | Purpose | |------------|---------| @@ -489,7 +489,7 @@ resources: ### Pod Disruption Budget -A PodDisruptionBudget is automatically created to ensure availability during cluster maintenance. For single-replica deployments, `minAvailable: 0`. For multi-replica deployments, the operator calculates an appropriate `minAvailable` value. +A PodDisruptionBudget is created to maintain availability during cluster maintenance. For single-replica deployments, `minAvailable: 0`. For multi-replica deployments, the operator calculates an appropriate `minAvailable` value. ### Affinity @@ -756,7 +756,7 @@ Log = verbose ### Sleep Mode for Debugging -For debugging crash loops, enable sleep mode: +To debug crash loops, enable sleep mode: ```yaml # Directly on PackageManager CR (not recommended for production) diff --git a/docs/guides/product-team-site-management.md b/docs/guides/product-team-site-management.md index 1f1d833..c810754 100644 --- a/docs/guides/product-team-site-management.md +++ b/docs/guides/product-team-site-management.md @@ -4,7 +4,7 @@ This guide covers the management of Site resources in Team Operator for platform ## Overview -The `Site` Custom Resource Definition (CRD) is the **single source of truth** for a Posit Team deployment. A Site represents a complete deployment environment that includes: +The `Site` Custom Resource Definition (CRD) is the single source of truth for a Posit Team deployment. 
A Site represents a complete deployment environment that includes: - **Flightdeck** - Landing page dashboard - **Connect** - Publishing and sharing platform @@ -13,7 +13,7 @@ The `Site` Custom Resource Definition (CRD) is the **single source of truth** fo - **Chronicle** - Telemetry and monitoring - **Keycloak** - Authentication and identity management (optional) -When you create or update a Site, the Site controller automatically reconciles all child product Custom Resources (Connect, Workbench, Package Manager, Chronicle, Flightdeck) to match your desired configuration. +When you create or update a Site, the Site controller reconciles all child product Custom Resources (Connect, Workbench, Package Manager, Chronicle, Flightdeck) to match your desired configuration. ## Site Lifecycle @@ -75,7 +75,7 @@ The Site controller cleans up: 4. Flightdeck CR and all its resources 5. Network policies -**Important:** Child resources have owner references to the Site, so Kubernetes garbage collection handles most cleanup automatically. +Child resources have owner references to the Site, so Kubernetes garbage collection handles most cleanup automatically. If `dropDatabaseOnTearDown: true` is set, product databases will be dropped during cleanup. @@ -701,7 +701,7 @@ kubectl logs -n posit-team -l app.kubernetes.io/name=team-operator --tail=100 #### Products Not Deploying -1. Check Site controller logs for errors: +1. Verify Site controller logs for errors: ```bash kubectl logs -n posit-team deploy/team-operator | grep -i error ``` @@ -711,7 +711,7 @@ kubectl logs -n posit-team -l app.kubernetes.io/name=team-operator --tail=100 kubectl get connect,workbench,packagemanager,chronicle -n posit-team ``` -3. Check individual product controller logs if CRs exist but pods are not running. +3. Verify individual product controller logs if CRs exist but pods are not running. 
#### Database Connection Failures @@ -723,9 +723,9 @@ kubectl logs -n posit-team -l app.kubernetes.io/name=team-operator --tail=100 # For AWS Secrets Manager, check operator logs for fetch errors ``` -2. Ensure database host is reachable from the cluster. +2. Verify database host is reachable from the cluster. -3. Check SSL mode configuration matches your database server. +3. Verify SSL mode configuration matches your database server. #### Volume Provisioning Issues @@ -752,9 +752,9 @@ kubectl logs -n posit-team -l app.kubernetes.io/name=team-operator --tail=100 #### Authentication Failures -1. For OIDC, verify client ID and issuer URL are correct. +1. For OIDC, verify client ID and issuer URL. -2. Check that redirect URIs are configured in your IdP. +2. Verify redirect URIs are configured in your IdP. 3. Review product logs for detailed auth error messages: ```bash @@ -770,9 +770,9 @@ If you notice constant reconciliation: kubectl get site -o yaml | diff - site.yaml ``` -2. Look for validation errors in controller logs. +2. Review controller logs for validation errors. -3. Ensure no external processes are modifying resources. +3. Verify no external processes are modifying resources. ## Related Documentation diff --git a/docs/guides/troubleshooting.md b/docs/guides/troubleshooting.md index 90d79e4..d10169b 100644 --- a/docs/guides/troubleshooting.md +++ b/docs/guides/troubleshooting.md @@ -1,6 +1,6 @@ # Team Operator Troubleshooting Guide -This comprehensive guide covers common issues and their solutions when running Posit Team products via the Team Operator. +This guide covers common issues and solutions when running Posit Team products via the Team Operator. 
## Table of Contents @@ -25,7 +25,7 @@ This comprehensive guide covers common issues and their solutions when running P ### Checking Operator Logs -The operator logs are your first stop for diagnosing issues: +Start diagnosing issues by checking the operator logs: ```bash # View operator logs @@ -225,13 +225,13 @@ kubectl get nodes -o custom-columns=NAME:.metadata.name,TAINTS:.spec.taints **Cause:** -Kubernetes nodes can have [taints](https://kubernetes.io/docs/concepts/scheduling-eviction/taint-and-toleration/) that prevent pods from scheduling unless the pod has a matching toleration. Common scenarios include: +Kubernetes nodes can have [taints](https://kubernetes.io/docs/concepts/scheduling-eviction/taint-and-toleration/) that prevent pods from scheduling unless the pod has a matching toleration. Common scenarios: -- Dedicated node pools for specific workloads (e.g., GPU nodes, session nodes) +- Dedicated node pools for specific workloads (GPU nodes, session nodes) - Nodes reserved for critical system components - Cloud provider managed node pools with default taints -If the operator pod doesn't have tolerations matching the node taints, it will remain in `Pending` state. +Without matching tolerations, the operator pod remains in `Pending` state. 
**Solution:** @@ -274,9 +274,9 @@ helm upgrade team-operator ./dist/chart \ | Cloud provider taints (GKE) | `key: "cloud.google.com/gke-nodepool", operator: "Exists"` | | Control plane nodes | `key: "node-role.kubernetes.io/control-plane", operator: "Exists"` | -**Using nodeSelector as an alternative:** +**Alternative: Using nodeSelector** -If you want the operator to run on specific nodes instead of tolerating taints, use `nodeSelector`: +To run the operator on specific nodes instead of tolerating taints, use `nodeSelector`: ```yaml controllerManager: @@ -341,9 +341,9 @@ kubectl logs -n posit-team-system deployment/team-operator-controller-manager | ``` **Solutions:** -- Verify product is enabled in Site spec (products are created by default) +- Verify the product is enabled in the Site spec (products are created by default) - Check for validation errors in operator logs -- Ensure all required fields are populated +- Verify all required fields are populated ### Status Conditions Not Updating @@ -409,19 +409,19 @@ kubectl get secret -n posit-team -o yaml **Solutions:** -1. **Verify secret exists with correct keys:** +1. Verify the secret exists with correct keys: ```bash kubectl get secret -n posit-team -o jsonpath='{.data}' | jq ``` -2. **Test database connectivity from a pod:** +2. Test database connectivity from a pod: ```bash kubectl run -it --rm psql-test --image=postgres:15 --restart=Never -- \ psql "postgresql://:@/?sslmode=require" ``` -3. **Check SSL mode configuration:** - - Ensure `sslmode` matches your database requirements (require, verify-full, etc.) +3. Check SSL mode configuration: + - Verify `sslmode` matches your database requirements (require, verify-full, etc.) ### Schema Creation Errors @@ -451,28 +451,28 @@ kubectl logs -n posit-team-system deployment/team-operator-controller-manager | **Solutions:** -1. 
**For AWS Secrets Manager:** - ```yaml - spec: - secret: - type: "aws" - vaultName: "your-vault-name" - mainDatabaseCredentialSecret: - type: "aws" - vaultName: "rds!db-identifier" - ``` +For AWS Secrets Manager: +```yaml +spec: + secret: + type: "aws" + vaultName: "your-vault-name" + mainDatabaseCredentialSecret: + type: "aws" + vaultName: "rds!db-identifier" +``` -2. **For Kubernetes Secrets:** - ```yaml - apiVersion: v1 - kind: Secret - metadata: - name: site-secrets - stringData: - pub-db-password: "" - dev-db-password: "" - pkg-db-password: "" - ``` +For Kubernetes Secrets: +```yaml +apiVersion: v1 +kind: Secret +metadata: + name: site-secrets +stringData: + pub-db-password: "" + dev-db-password: "" + pkg-db-password: "" +``` --- @@ -561,7 +561,7 @@ kubectl logs -n posit-team deploy/-workbench -c workbench | grep -i l | Databricks config error | Move Databricks config from `Config` to `SecretConfig` | **Databricks Configuration Error:** -If you see `the Databricks configuration should be in SecretConfig, not Config`, update your configuration: +If you see `the Databricks configuration should be in SecretConfig, not Config`, update your configuration. ```yaml # Wrong @@ -691,23 +691,23 @@ kubectl logs -n deploy/ **Solutions:** -1. **Check certificate secret:** - ```bash - kubectl get secret -n posit-team | grep tls - ``` +Check the certificate secret: +```bash +kubectl get secret -n posit-team | grep tls +``` -2. **Verify cert-manager (if used):** - ```bash - kubectl get certificate -n posit-team - kubectl describe certificate -n posit-team - ``` +Verify cert-manager (if used): +```bash +kubectl get certificate -n posit-team +kubectl describe certificate -n posit-team +``` -3. 
**Configure TLS in Ingress:** - ```yaml - spec: - ingressAnnotations: - cert-manager.io/cluster-issuer: "letsencrypt-prod" - ``` +Configure TLS in Ingress: +```yaml +spec: + ingressAnnotations: + cert-manager.io/cluster-issuer: "letsencrypt-prod" +``` ### Service Discovery Issues @@ -801,23 +801,23 @@ kubectl get pod -n posit-team -o jsonpath='{.spec.securityContext}' **Solutions:** -1. **Set FSGroup in security context:** - ```yaml - spec: - securityContext: - fsGroup: 999 - ``` +Set FSGroup in security context: +```yaml +spec: + securityContext: + fsGroup: 999 +``` -2. **Use init container to fix permissions:** - ```yaml - initContainers: - - name: fix-permissions - image: busybox - command: ["sh", "-c", "chown -R 999:999 /data"] - volumeMounts: - - name: data - mountPath: /data - ``` +Use an init container to fix permissions: +```yaml +initContainers: + - name: fix-permissions + image: busybox + command: ["sh", "-c", "chown -R 999:999 /data"] + volumeMounts: + - name: data + mountPath: /data +``` --- @@ -840,25 +840,25 @@ kubectl get configmap -connect -n posit-team -o yaml | grep -i callba **Solutions:** -1. **Verify redirect URIs in IdP:** - - Connect: `https:///__login__/callback` - - Workbench: `https:///oidc/callback` - -2. **Check client ID and issuer:** - ```yaml - spec: - connect: - auth: - type: "oidc" - clientId: "your-client-id" # Must match IdP - issuer: "https://your-idp.com" # Must be exact - ``` +Verify redirect URIs in your IdP: +- Connect: `https:///__login__/callback` +- Workbench: `https:///oidc/callback` -3. 
**Enable debug logging:** - ```yaml - spec: - debug: true - ``` +Check client ID and issuer: +```yaml +spec: + connect: + auth: + type: "oidc" + clientId: "your-client-id" # Must match IdP + issuer: "https://your-idp.com" # Must be exact +``` + +Enable debug logging: +```yaml +spec: + debug: true +``` ### SAML Metadata Issues @@ -875,18 +875,18 @@ kubectl run -it --rm curl-test --image=curlimages/curl --restart=Never -- \ **Solutions:** -1. **Ensure metadata URL is correct:** - ```yaml - spec: - connect: - auth: - type: "saml" - samlMetadataUrl: "https://idp.example.com/saml/metadata" - ``` +Verify the metadata URL is correct: +```yaml +spec: + connect: + auth: + type: "saml" + samlMetadataUrl: "https://idp.example.com/saml/metadata" +``` -2. **Check network access from cluster:** - - Verify DNS resolution works - - Check firewall rules allow outbound HTTPS +Check network access from the cluster: +- Verify DNS resolution works +- Check firewall rules allow outbound HTTPS **Configuration Conflict Error:** ``` @@ -917,33 +917,33 @@ kubectl logs -n posit-team deploy/-connect -c connect | grep -i "clai **Solutions:** -1. **Verify claims configuration:** - ```yaml - spec: - connect: - auth: - usernameClaim: "preferred_username" - emailClaim: "email" - groupsClaim: "groups" - ``` +Verify claims configuration: +```yaml +spec: + connect: + auth: + usernameClaim: "preferred_username" + emailClaim: "email" + groupsClaim: "groups" +``` -2. **Check scopes include groups:** - ```yaml - scopes: - - "openid" - - "profile" - - "email" - - "groups" - ``` +Check scopes include groups: +```yaml +scopes: + - "openid" + - "profile" + - "email" + - "groups" +``` -3. **Disable groups claim if IdP doesn't support it:** - ```yaml - disableGroupsClaim: true - ``` +Disable groups claim if your IdP doesn't support it: +```yaml +disableGroupsClaim: true +``` -4. 
**Debug JWT tokens:** - - Use [jwt.io](https://jwt.io) to inspect tokens - - Verify expected claims are present +Debug JWT tokens: +- Use [jwt.io](https://jwt.io) to inspect tokens +- Verify expected claims are present --- @@ -970,24 +970,24 @@ kubectl logs -n posit-team deploy/-connect -c connect | grep -i "clai ## Getting Help -If you continue to experience issues: +If issues persist: -1. **Collect diagnostic information:** - ```bash - kubectl get all -n posit-team -o yaml > posit-team-resources.yaml - kubectl logs -n posit-team-system deployment/team-operator-controller-manager > operator.log - kubectl get events -n posit-team --sort-by='.lastTimestamp' > events.txt - ``` +Collect diagnostic information: +```bash +kubectl get all -n posit-team -o yaml > posit-team-resources.yaml +kubectl logs -n posit-team-system deployment/team-operator-controller-manager > operator.log +kubectl get events -n posit-team --sort-by='.lastTimestamp' > events.txt +``` -2. **Check Posit documentation:** - - [Connect Admin Guide](https://docs.posit.co/connect/admin/) - - [Workbench Admin Guide](https://docs.posit.co/ide/server-pro/admin/) - - [Package Manager Admin Guide](https://docs.posit.co/rspm/admin/) +Check Posit documentation: +- [Connect Admin Guide](https://docs.posit.co/connect/admin/) +- [Workbench Admin Guide](https://docs.posit.co/ide/server-pro/admin/) +- [Package Manager Admin Guide](https://docs.posit.co/rspm/admin/) -3. 
**Contact Posit Support:** - - Include diagnostic files - - Describe the issue and steps to reproduce - - Include operator and product versions +Contact Posit Support: +- Include diagnostic files +- Describe the issue and steps to reproduce +- Include operator and product versions --- diff --git a/docs/guides/upgrading.md b/docs/guides/upgrading.md index 1ccb73e..729a907 100644 --- a/docs/guides/upgrading.md +++ b/docs/guides/upgrading.md @@ -1,6 +1,6 @@ # Upgrading Team Operator -This guide provides comprehensive instructions for upgrading the Team Operator, including pre-upgrade preparation, upgrade procedures, version-specific migrations, and troubleshooting. +This guide covers upgrading the Team Operator: pre-upgrade preparation, upgrade procedures, version-specific migrations, and troubleshooting. ## Before Upgrading @@ -38,7 +38,7 @@ kubectl get secrets -n posit-team -o yaml | gpg -c > secrets-backup.yaml.gpg #### 3. Backup Databases -If using external databases for products (Connect, Workbench, Package Manager), ensure you have database backups before upgrading. The operator manages `PostgresDatabase` resources that may be affected by schema changes. +If using external databases for products (Connect, Workbench, Package Manager), back up databases before upgrading. The operator manages `PostgresDatabase` resources that schema changes may affect. ```bash # List managed databases @@ -66,7 +66,7 @@ kubectl get crds | grep posit.team ### Review Changelog -Always review the [CHANGELOG.md](../../CHANGELOG.md) for breaking changes between your current version and the target version. Pay special attention to: +Review the [CHANGELOG.md](../../CHANGELOG.md) for breaking changes between your current version and the target version. 
Look for: - Breaking changes that require configuration updates - Deprecated fields that need migration @@ -74,19 +74,19 @@ Always review the [CHANGELOG.md](../../CHANGELOG.md) for breaking changes betwee ### Test in Non-Production -**Critical**: Always test upgrades in a non-production environment first: +**Critical**: Test upgrades in a non-production environment first: 1. Create a staging cluster or namespace that mirrors production 2. Apply the same Site configuration 3. Perform the upgrade -4. Verify all products function correctly +4. Verify all products function 5. Test any automated integrations ## Upgrade Methods ### Helm Upgrade Procedure -The recommended method for upgrading is via Helm: +The recommended upgrade method is Helm: #### Standard Upgrade @@ -116,7 +116,7 @@ helm upgrade team-operator ./dist/chart \ #### Upgrade with CRD Updates -CRDs are automatically updated during Helm upgrade when `crd.enable: true` (default). However, if you've disabled CRD management: +CRDs are updated during Helm upgrade when `crd.enable: true` (default). If you've disabled CRD management: ```bash # Manually apply CRD updates first @@ -143,13 +143,13 @@ kubectl apply -k config/overlays/production ### CRD Upgrade Considerations -CRDs require special attention during upgrades: +CRDs require attention during upgrades: -1. **CRDs Persist Across Helm Uninstall**: By default (`crd.keep: true`), CRDs remain in the cluster even after `helm uninstall`. This prevents accidental data loss but means CRDs must be managed carefully. +1. **CRDs Persist Across Helm Uninstall**: By default (`crd.keep: true`), CRDs remain in the cluster after `helm uninstall`. This prevents accidental data loss but requires careful CRD management. -2. **CRD Version Compatibility**: The operator manages CRDs at API version `core.posit.team/v1beta1` (and `keycloak.k8s.keycloak.org/v2alpha1` for Keycloak). Ensure your CRs are compatible with the CRD schema in the new version. +2. 
**CRD Version Compatibility**: The operator manages CRDs at API version `core.posit.team/v1beta1` (and `keycloak.k8s.keycloak.org/v2alpha1` for Keycloak). Your CRs must be compatible with the CRD schema in the new version. -3. **Schema Validation**: After CRD updates, existing CRs are validated against the new schema. Invalid CRs may prevent proper reconciliation. +3. **Schema Validation**: After CRD updates, existing CRs are validated against the new schema. Invalid CRs may prevent reconciliation. ```bash # Verify CRDs are updated @@ -178,7 +178,7 @@ No configuration changes required for users. - Added `tolerations` and `nodeSelector` support for controller manager **Migration:** -If you were using workarounds for pod scheduling, update your values: +If you used workarounds for pod scheduling, update your values: ```yaml controllerManager: @@ -243,7 +243,7 @@ spec: ### Key Migration -The operator automatically migrates legacy UUID-format and binary-format encryption keys to the new hex256 format. This migration happens transparently during reconciliation. Monitor logs for migration messages: +The operator migrates legacy UUID-format and binary-format encryption keys to the new hex256 format. This happens during reconciliation. Monitor logs for migration messages: ```bash kubectl logs -n posit-team-system deployment/team-operator-controller-manager | grep -i "migrating" @@ -334,7 +334,7 @@ helm rollback team-operator 2 -n posit-team-system ### CRD Considerations During Rollback -**Important**: CRDs are not automatically rolled back with Helm rollback due to the `keep` annotation. If the new CRDs added fields, older operator versions may still work but won't recognize new fields. +**Important**: CRDs are not rolled back with Helm rollback due to the `keep` annotation. If the new CRDs added fields, older operator versions may still work but won't recognize new fields. 
If CRD rollback is necessary: @@ -365,8 +365,8 @@ Consider these data implications during rollback: 1. **Use Maintenance Windows**: Schedule upgrades during low-traffic periods. -2. **Rolling Update Strategy**: The operator uses a single replica by default. For zero-downtime during operator restarts: - - Products continue running even if the operator is briefly unavailable +2. **Rolling Update Strategy**: The operator uses a single replica by default. During operator restarts: + - Products continue running if the operator is briefly unavailable - No reconciliation occurs during operator restart (typically < 30 seconds) 3. **Staged Rollout**: @@ -379,12 +379,12 @@ Consider these data implications during rollback: helm upgrade team-operator ./dist/chart -n posit-team-system ``` -4. **Health Check Considerations**: +4. **Health Check**: - Liveness probe: `/healthz` (port 8081) - Readiness probe: `/readyz` (port 8081) - - These ensure the operator is ready before receiving reconciliation requests + - These probes ensure the operator is ready before receiving reconciliation requests -5. **Leader Election**: If running multiple operator replicas (not typical), leader election ensures only one active reconciler: +5. 
**Leader Election**: If running multiple operator replicas (uncommon), leader election ensures one active reconciler: ```yaml controllerManager: container: @@ -455,7 +455,7 @@ kubectl logs -n posit-team-system -l control-plane=controller-manager --previous #### Reconciliation Loops -**Symptom**: Operator continuously reconciles resources without reaching stable state +**Symptom**: Operator continuously reconciles resources without reaching a stable state ```bash # Watch operator logs for repeated reconciliation diff --git a/docs/guides/workbench-configuration.md b/docs/guides/workbench-configuration.md index 9915e0e..c0e27ed 100644 --- a/docs/guides/workbench-configuration.md +++ b/docs/guides/workbench-configuration.md @@ -1,12 +1,12 @@ # Workbench Configuration Guide -This guide covers comprehensive configuration of Posit Workbench in Team Operator, including all available options, authentication, off-host execution, IDE settings, data integrations, and advanced features. +This guide covers configuration of Posit Workbench in Team Operator, including options for authentication, off-host execution, IDE settings, data integrations, and advanced features. ## Overview -Posit Workbench provides an interactive development environment for data science teams. In Team Operator, Workbench runs on Kubernetes with off-host execution enabled by default, meaning user sessions run as separate Kubernetes Jobs rather than on the Workbench server pod itself. +Posit Workbench provides an interactive development environment for data science teams. In Team Operator, Workbench runs on Kubernetes with off-host execution enabled by default. User sessions run as separate Kubernetes Jobs rather than on the Workbench server pod itself. 
-When configured via a Site resource, Workbench: +When configured via a Site resource, Workbench does the following: - Uses the Kubernetes Job Launcher for session management - Supports multiple IDEs (RStudio, VS Code, Positron, Jupyter) - Integrates with Site-level authentication @@ -223,7 +223,7 @@ The HTML content is mounted at `/etc/rstudio/login.html` and must be less than 6 ## Off-Host Execution / Kubernetes Launcher -Off-host execution runs user sessions as Kubernetes Jobs, providing isolation, resource management, and scalability. **This is enabled by default** in Team Operator. +Off-host execution runs user sessions as Kubernetes Jobs, providing isolation, resource management, and scalability. This is enabled by default in Team Operator. ### How It Works @@ -320,7 +320,7 @@ spec: ### Session Configuration Details -Sessions are configured via launcher templates. The operator manages: +Sessions are configured via launcher templates. The operator manages these files: - `job.tpl` - Kubernetes Job template - `service.tpl` - Service template for session connectivity @@ -379,7 +379,7 @@ spec: ### Positron IDE -Positron is Posit's next-generation IDE. Enable and configure: +Positron is Posit's next-generation IDE. Enable and configure it: ```yaml spec: @@ -603,7 +603,7 @@ spec: ## Non-Root Execution Mode -Enable "maximally rootless" execution for enhanced security: +Enable "maximally rootless" execution for better security: ```yaml spec: @@ -630,7 +630,7 @@ When enabled: ## Experimental Features -The `experimentalFeatures` section contains advanced options. These are subject to change: +The `experimentalFeatures` section contains advanced options subject to change: ```yaml spec: @@ -867,54 +867,54 @@ spec: #### Authentication Failures -1. 
**Check OIDC configuration:** - - Verify issuer URL is accessible from the cluster - - Confirm client ID matches IdP configuration - - Check that redirect URIs are configured in IdP +Check OIDC configuration: +- Verify the issuer URL is accessible from the cluster +- Confirm the client ID matches IdP configuration +- Verify redirect URIs are configured in the IdP -2. **View authentication logs:** - ```bash - kubectl logs -n posit-team deploy/-workbench | grep -i auth - ``` +View authentication logs: +```bash +kubectl logs -n posit-team deploy/-workbench | grep -i auth +``` -3. **Verify secrets exist:** - ```bash - kubectl get secret -workbench-config -n posit-team - ``` +Verify secrets exist: +```bash +kubectl get secret -workbench-config -n posit-team +``` #### Session Resource Issues -1. **Check resource profile configuration:** - ```bash - kubectl get configmap -workbench -n posit-team -o yaml | grep -A 50 "launcher.kubernetes.resources.conf" - ``` +Check resource profile configuration: +```bash +kubectl get configmap -workbench -n posit-team -o yaml | grep -A 50 "launcher.kubernetes.resources.conf" +``` -2. **Verify nodes have capacity:** - ```bash - kubectl describe nodes | grep -A 10 "Allocated resources" - ``` +Verify nodes have capacity: +```bash +kubectl describe nodes | grep -A 10 "Allocated resources" +``` -3. **Check session pod events:** - ```bash - kubectl describe pod -n posit-team - ``` +Check session pod events: +```bash +kubectl describe pod -n posit-team +``` #### Volume Mount Issues -1. **Verify PVC exists and is bound:** - ```bash - kubectl get pvc -n posit-team | grep workbench - ``` +Verify the PVC exists and is bound: +```bash +kubectl get pvc -n posit-team | grep workbench +``` -2. **Check volume permissions in session:** - ```bash - kubectl exec -it -n posit-team -- ls -la /home - ``` +Check volume permissions in the session: +```bash +kubectl exec -it -n posit-team -- ls -la /home +``` -3. 
**Verify storage class supports RWX:** - ```bash - kubectl get storageclass -o yaml - ``` +Verify the storage class supports RWX: +```bash +kubectl get storageclass -o yaml +``` ### Useful Commands diff --git a/docs/testing.md b/docs/testing.md index 2b981c6..992e979 100644 --- a/docs/testing.md +++ b/docs/testing.md @@ -1,6 +1,6 @@ # Testing Guide -This document describes the testing infrastructure for the Team Operator. +This document covers the testing infrastructure for the Team Operator. ## Testing Tiers @@ -10,7 +10,7 @@ The Team Operator uses a two-tier local integration testing strategy: **What it is:** Envtest uses a lightweight, embedded Kubernetes API server (etcd + kube-apiserver) to test CRD schema and API storage without a full cluster or running controller. -**When to use:** For testing CRD schema validation and API storage (no controller reconciler is started in the test environment). +**When to use:** For testing CRD schema validation and API storage. No controller reconciler runs in the test environment. **Execution time:** Seconds @@ -18,11 +18,11 @@ The Team Operator uses a two-tier local integration testing strategy: - CRD schema validation - API object creation and storage - Resource serialization/deserialization -- Basic CRUD operations via the API +- Basic CRUD operations through the API ### Tier 2: Kind Cluster (Full Stack Tests) -**What it is:** Kind (Kubernetes IN Docker) creates a real Kubernetes cluster using Docker containers. +**What it is:** Kind (Kubernetes IN Docker) creates a real Kubernetes cluster with Docker containers. **When to use:** For end-to-end testing, Helm chart deployment, and integration with other Kubernetes components. @@ -160,7 +160,7 @@ The kind test script: #### Development Loop -For iterative development, keep the kind cluster running between test runs instead of recreating it each time. +For iterative development, keep the kind cluster running between test runs. Don't recreate it each time. 
**Initial setup** (run once): @@ -189,7 +189,7 @@ make kind-teardown # removes Helm release and namespaces # (also deletes the kind cluster) ``` -This workflow is significantly faster than `make test-kind` for iterative development because it skips cluster creation and deletion on every run. +This workflow is much faster than `make test-kind` for iterative development. It skips cluster creation and deletion on every run. ## CI Integration @@ -270,8 +270,8 @@ kubectl get pods -A ## Best Practices -1. **Use envtest for unit-level controller tests** - It's fast and doesn't require Docker +1. **Use envtest for unit-level controller tests** - Fast and no Docker needed 2. **Use kind for integration tests** - When you need a real cluster 3. **Always clean up test resources** - Prevents test pollution 4. **Use Eventually() for async operations** - Controllers are eventually consistent -5. **Keep test data minimal** - Only specify fields needed for the test +5. **Keep test data minimal** - Specify only the fields the test needs From b0350a3b8a649abc9b478c00af8b5d58c3a15f4d Mon Sep 17 00:00:00 2001 From: ian-flores Date: Mon, 2 Mar 2026 12:19:22 -0800 Subject: [PATCH 2/4] Address review findings (job 670) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Changes: - Restore bold emphasis on safety-critical terms: **enabled by default** (off-host execution), **experimental** (SMTP), **single source of truth** (Site CRD), **squash merging** / **squash and merge** (CONTRIBUTING.md) - Fix capability→assertion meaning shift in architecture.md: "It mirrors" → "It can mirror" for Package Manager - Restore numbered lists in sequential troubleshooting/debugging steps across troubleshooting.md, authentication-setup.md, and workbench-configuration.md (token claim debugging, group membership, credential issues, TLS/certificates, volume permissions, OIDC callbacks, SAML metadata, token/claim problems, getting help, auth failures, session resource 
issues, volume mount issues) --- CONTRIBUTING.md | 4 +- docs/architecture.md | 2 +- docs/guides/authentication-setup.md | 42 ++-- docs/guides/connect-configuration.md | 4 +- docs/guides/product-team-site-management.md | 2 +- docs/guides/troubleshooting.md | 238 ++++++++++---------- docs/guides/workbench-configuration.md | 72 +++--- 7 files changed, 182 insertions(+), 182 deletions(-) diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md index f9c43df..d31cee1 100644 --- a/CONTRIBUTING.md +++ b/CONTRIBUTING.md @@ -155,7 +155,7 @@ just mtest ### PR Title Conventions (Required) -We use [Conventional Commits](https://www.conventionalcommits.org/) and squash merging. Your PR title becomes the commit message and must follow this format: +We use [Conventional Commits](https://www.conventionalcommits.org/) and **squash merging**. Your PR title becomes the commit message and must follow this format: ``` (): @@ -343,7 +343,7 @@ The following checks must pass: ### Merging -This repository uses squash and merge. Your PR title becomes the final commit message on `main`. The PR title format is enforced because semantic-release analyzes commit messages to determine version bumps. +This repository uses **squash and merge**. Your PR title becomes the final commit message on `main`. The PR title format is enforced because semantic-release analyzes commit messages to determine version bumps. ### Review Expectations diff --git a/docs/architecture.md b/docs/architecture.md index 4a6648f..0cabeee 100644 --- a/docs/architecture.md +++ b/docs/architecture.md @@ -367,7 +367,7 @@ See the [Workbench Configuration Guide](guides/workbench-configuration.md) for d ## Package Manager Architecture -Posit Package Manager provides a local repository for R and Python packages. It mirrors public repositories and hosts private packages. +Posit Package Manager provides a local repository for R and Python packages. It can mirror public repositories and host private packages. 
```mermaid flowchart TB diff --git a/docs/guides/authentication-setup.md b/docs/guides/authentication-setup.md index c453cca..9f8f326 100644 --- a/docs/guides/authentication-setup.md +++ b/docs/guides/authentication-setup.md @@ -658,36 +658,36 @@ samlEmailAttribute: "..." To debug OIDC token claims: -Enable Debug Logging: -```yaml -spec: - connect: - debug: true -``` +1. **Enable Debug Logging:** + ```yaml + spec: + connect: + debug: true + ``` -Check Pod Logs: -```bash -kubectl logs -n posit-team deploy/-connect -f -``` +2. **Check Pod Logs:** + ```bash + kubectl logs -n posit-team deploy/-connect -f + ``` -Decode JWT Tokens: -Use [jwt.io](https://jwt.io) to inspect tokens and verify claims. +3. **Decode JWT Tokens:** + Use [jwt.io](https://jwt.io) to inspect tokens and verify claims. ### Group Membership Issues If users aren't getting the correct roles: -Verify group claim is present: -- Check the `groupsClaim` field matches your IdP -- Some IdPs use nested claims (e.g., `realm_access.roles`) +1. **Verify group claim is present:** + - Check the `groupsClaim` field matches your IdP + - Some IdPs use nested claims (e.g., `realm_access.roles`) -Check group name matching: -- Group names in role mappings must match exactly -- Group names are case-sensitive +2. **Check group name matching:** + - Group names in role mappings must match exactly + - Group names are case-sensitive -Verify IdP configuration: -- Verify groups are included in the token -- Check token size limits (large group lists may be truncated) +3. 
**Verify IdP configuration:** + - Verify groups are included in the token + - Check token size limits (large group lists may be truncated) ### Workbench-Specific Issues diff --git a/docs/guides/connect-configuration.md b/docs/guides/connect-configuration.md index fdf64ac..5ab6ace 100644 --- a/docs/guides/connect-configuration.md +++ b/docs/guides/connect-configuration.md @@ -411,7 +411,7 @@ spec: ## Off-Host Execution / Kubernetes Launcher -Off-host execution is enabled by default when Connect is deployed via Team Operator. Content runs in isolated Kubernetes Jobs rather than on the Connect server. +Off-host execution is **enabled by default** when Connect is deployed via Team Operator. Content runs in isolated Kubernetes Jobs rather than on the Connect server. ### How It Works @@ -780,7 +780,7 @@ For AWS Secrets Manager, these keys are automatically mapped to the `-conn ### Email Configuration Notes -- SMTP is experimental in Team Operator +- SMTP is **experimental** in Team Operator - For production email, use Connect's built-in email configuration after deployment - The `mailTarget` setting prevents accidental emails to real users during testing diff --git a/docs/guides/product-team-site-management.md b/docs/guides/product-team-site-management.md index c810754..9e05e3b 100644 --- a/docs/guides/product-team-site-management.md +++ b/docs/guides/product-team-site-management.md @@ -4,7 +4,7 @@ This guide covers the management of Site resources in Team Operator for platform ## Overview -The `Site` Custom Resource Definition (CRD) is the single source of truth for a Posit Team deployment. A Site represents a complete deployment environment that includes: +The `Site` Custom Resource Definition (CRD) is the **single source of truth** for a Posit Team deployment. 
A Site represents a complete deployment environment that includes: - **Flightdeck** - Landing page dashboard - **Connect** - Publishing and sharing platform diff --git a/docs/guides/troubleshooting.md b/docs/guides/troubleshooting.md index d10169b..0396c3c 100644 --- a/docs/guides/troubleshooting.md +++ b/docs/guides/troubleshooting.md @@ -451,28 +451,28 @@ kubectl logs -n posit-team-system deployment/team-operator-controller-manager | **Solutions:** -For AWS Secrets Manager: -```yaml -spec: - secret: - type: "aws" - vaultName: "your-vault-name" - mainDatabaseCredentialSecret: - type: "aws" - vaultName: "rds!db-identifier" -``` +1. **For AWS Secrets Manager:** + ```yaml + spec: + secret: + type: "aws" + vaultName: "your-vault-name" + mainDatabaseCredentialSecret: + type: "aws" + vaultName: "rds!db-identifier" + ``` -For Kubernetes Secrets: -```yaml -apiVersion: v1 -kind: Secret -metadata: - name: site-secrets -stringData: - pub-db-password: "" - dev-db-password: "" - pkg-db-password: "" -``` +2. **For Kubernetes Secrets:** + ```yaml + apiVersion: v1 + kind: Secret + metadata: + name: site-secrets + stringData: + pub-db-password: "" + dev-db-password: "" + pkg-db-password: "" + ``` --- @@ -691,23 +691,23 @@ kubectl logs -n deploy/ **Solutions:** -Check the certificate secret: -```bash -kubectl get secret -n posit-team | grep tls -``` +1. **Check certificate secret:** + ```bash + kubectl get secret -n posit-team | grep tls + ``` -Verify cert-manager (if used): -```bash -kubectl get certificate -n posit-team -kubectl describe certificate -n posit-team -``` +2. **Verify cert-manager (if used):** + ```bash + kubectl get certificate -n posit-team + kubectl describe certificate -n posit-team + ``` -Configure TLS in Ingress: -```yaml -spec: - ingressAnnotations: - cert-manager.io/cluster-issuer: "letsencrypt-prod" -``` +3. 
**Configure TLS in Ingress:** + ```yaml + spec: + ingressAnnotations: + cert-manager.io/cluster-issuer: "letsencrypt-prod" + ``` ### Service Discovery Issues @@ -801,23 +801,23 @@ kubectl get pod -n posit-team -o jsonpath='{.spec.securityContext}' **Solutions:** -Set FSGroup in security context: -```yaml -spec: - securityContext: - fsGroup: 999 -``` +1. **Set FSGroup in security context:** + ```yaml + spec: + securityContext: + fsGroup: 999 + ``` -Use an init container to fix permissions: -```yaml -initContainers: - - name: fix-permissions - image: busybox - command: ["sh", "-c", "chown -R 999:999 /data"] - volumeMounts: - - name: data - mountPath: /data -``` +2. **Use init container to fix permissions:** + ```yaml + initContainers: + - name: fix-permissions + image: busybox + command: ["sh", "-c", "chown -R 999:999 /data"] + volumeMounts: + - name: data + mountPath: /data + ``` --- @@ -840,25 +840,25 @@ kubectl get configmap -connect -n posit-team -o yaml | grep -i callba **Solutions:** -Verify redirect URIs in your IdP: -- Connect: `https:///__login__/callback` -- Workbench: `https:///oidc/callback` - -Check client ID and issuer: -```yaml -spec: - connect: - auth: - type: "oidc" - clientId: "your-client-id" # Must match IdP - issuer: "https://your-idp.com" # Must be exact -``` +1. **Verify redirect URIs in your IdP:** + - Connect: `https:///__login__/callback` + - Workbench: `https:///oidc/callback` + +2. **Check client ID and issuer:** + ```yaml + spec: + connect: + auth: + type: "oidc" + clientId: "your-client-id" # Must match IdP + issuer: "https://your-idp.com" # Must be exact + ``` -Enable debug logging: -```yaml -spec: - debug: true -``` +3. 
**Enable debug logging:** + ```yaml + spec: + debug: true + ``` ### SAML Metadata Issues @@ -875,18 +875,18 @@ kubectl run -it --rm curl-test --image=curlimages/curl --restart=Never -- \ **Solutions:** -Verify the metadata URL is correct: -```yaml -spec: - connect: - auth: - type: "saml" - samlMetadataUrl: "https://idp.example.com/saml/metadata" -``` +1. **Verify the metadata URL is correct:** + ```yaml + spec: + connect: + auth: + type: "saml" + samlMetadataUrl: "https://idp.example.com/saml/metadata" + ``` -Check network access from the cluster: -- Verify DNS resolution works -- Check firewall rules allow outbound HTTPS +2. **Check network access from the cluster:** + - Verify DNS resolution works + - Check firewall rules allow outbound HTTPS **Configuration Conflict Error:** ``` @@ -917,33 +917,33 @@ kubectl logs -n posit-team deploy/-connect -c connect | grep -i "clai **Solutions:** -Verify claims configuration: -```yaml -spec: - connect: - auth: - usernameClaim: "preferred_username" - emailClaim: "email" - groupsClaim: "groups" -``` +1. **Verify claims configuration:** + ```yaml + spec: + connect: + auth: + usernameClaim: "preferred_username" + emailClaim: "email" + groupsClaim: "groups" + ``` -Check scopes include groups: -```yaml -scopes: - - "openid" - - "profile" - - "email" - - "groups" -``` +2. **Check scopes include groups:** + ```yaml + scopes: + - "openid" + - "profile" + - "email" + - "groups" + ``` -Disable groups claim if your IdP doesn't support it: -```yaml -disableGroupsClaim: true -``` +3. **Disable groups claim if your IdP doesn't support it:** + ```yaml + disableGroupsClaim: true + ``` -Debug JWT tokens: -- Use [jwt.io](https://jwt.io) to inspect tokens -- Verify expected claims are present +4. 
**Debug JWT tokens:** + - Use [jwt.io](https://jwt.io) to inspect tokens + - Verify expected claims are present --- @@ -972,22 +972,22 @@ Debug JWT tokens: If issues persist: -Collect diagnostic information: -```bash -kubectl get all -n posit-team -o yaml > posit-team-resources.yaml -kubectl logs -n posit-team-system deployment/team-operator-controller-manager > operator.log -kubectl get events -n posit-team --sort-by='.lastTimestamp' > events.txt -``` +1. **Collect diagnostic information:** + ```bash + kubectl get all -n posit-team -o yaml > posit-team-resources.yaml + kubectl logs -n posit-team-system deployment/team-operator-controller-manager > operator.log + kubectl get events -n posit-team --sort-by='.lastTimestamp' > events.txt + ``` -Check Posit documentation: -- [Connect Admin Guide](https://docs.posit.co/connect/admin/) -- [Workbench Admin Guide](https://docs.posit.co/ide/server-pro/admin/) -- [Package Manager Admin Guide](https://docs.posit.co/rspm/admin/) +2. **Check Posit documentation:** + - [Connect Admin Guide](https://docs.posit.co/connect/admin/) + - [Workbench Admin Guide](https://docs.posit.co/ide/server-pro/admin/) + - [Package Manager Admin Guide](https://docs.posit.co/rspm/admin/) -Contact Posit Support: -- Include diagnostic files -- Describe the issue and steps to reproduce -- Include operator and product versions +3. **Contact Posit Support:** + - Include diagnostic files + - Describe the issue and steps to reproduce + - Include operator and product versions --- diff --git a/docs/guides/workbench-configuration.md b/docs/guides/workbench-configuration.md index c0e27ed..bef8585 100644 --- a/docs/guides/workbench-configuration.md +++ b/docs/guides/workbench-configuration.md @@ -867,54 +867,54 @@ spec: #### Authentication Failures -Check OIDC configuration: -- Verify the issuer URL is accessible from the cluster -- Confirm the client ID matches IdP configuration -- Verify redirect URIs are configured in the IdP +1. 
**Check OIDC configuration:** + - Verify the issuer URL is accessible from the cluster + - Confirm the client ID matches IdP configuration + - Verify redirect URIs are configured in the IdP -View authentication logs: -```bash -kubectl logs -n posit-team deploy/-workbench | grep -i auth -``` +2. **View authentication logs:** + ```bash + kubectl logs -n posit-team deploy/-workbench | grep -i auth + ``` -Verify secrets exist: -```bash -kubectl get secret -workbench-config -n posit-team -``` +3. **Verify secrets exist:** + ```bash + kubectl get secret -workbench-config -n posit-team + ``` #### Session Resource Issues -Check resource profile configuration: -```bash -kubectl get configmap -workbench -n posit-team -o yaml | grep -A 50 "launcher.kubernetes.resources.conf" -``` +1. **Check resource profile configuration:** + ```bash + kubectl get configmap -workbench -n posit-team -o yaml | grep -A 50 "launcher.kubernetes.resources.conf" + ``` -Verify nodes have capacity: -```bash -kubectl describe nodes | grep -A 10 "Allocated resources" -``` +2. **Verify nodes have capacity:** + ```bash + kubectl describe nodes | grep -A 10 "Allocated resources" + ``` -Check session pod events: -```bash -kubectl describe pod -n posit-team -``` +3. **Check session pod events:** + ```bash + kubectl describe pod -n posit-team + ``` #### Volume Mount Issues -Verify the PVC exists and is bound: -```bash -kubectl get pvc -n posit-team | grep workbench -``` +1. **Verify the PVC exists and is bound:** + ```bash + kubectl get pvc -n posit-team | grep workbench + ``` -Check volume permissions in the session: -```bash -kubectl exec -it -n posit-team -- ls -la /home -``` +2. **Check volume permissions in the session:** + ```bash + kubectl exec -it -n posit-team -- ls -la /home + ``` -Verify the storage class supports RWX: -```bash -kubectl get storageclass -o yaml -``` +3. 
**Verify the storage class supports RWX:** + ```bash + kubectl get storageclass -o yaml + ``` ### Useful Commands From 7adb22eb358545c29320982bf3587224dea07baf Mon Sep 17 00:00:00 2001 From: ian-flores Date: Mon, 2 Mar 2026 12:25:40 -0800 Subject: [PATCH 3/4] docs: fix meaning changes caught in review - Restore passive voice for Chronicle sidecar injection (operator injects it, not itself) - Restore "Can display" for Optional Academy (it's conditional) - Restore "Important:" prefix for sessionImage note --- docs/architecture.md | 4 ++-- docs/guides/connect-configuration.md | 2 +- 2 files changed, 3 insertions(+), 3 deletions(-) diff --git a/docs/architecture.md b/docs/architecture.md index 0cabeee..358b3d9 100644 --- a/docs/architecture.md +++ b/docs/architecture.md @@ -613,7 +613,7 @@ Flightdeck is simple by design: - **No database**: Serves static content only - **No authentication**: Relies on product-level authentication - **Configurable layout**: Shows only enabled products -- **Optional Academy**: Displays a fourth card for Posit Academy +- **Optional Academy**: Can display a fourth card for Posit Academy ### Configuration Options @@ -749,7 +749,7 @@ flowchart TB ### Sidecar Injection -The Chronicle sidecar injects automatically into product pods when: +The Chronicle sidecar is automatically injected into product pods when: - Chronicle is enabled in the Site spec (`spec.chronicle.enabled: true`) - The product has Chronicle integration enabled diff --git a/docs/guides/connect-configuration.md b/docs/guides/connect-configuration.md index 5ab6ace..5aa29ef 100644 --- a/docs/guides/connect-configuration.md +++ b/docs/guides/connect-configuration.md @@ -126,7 +126,7 @@ spec: sessionImage: "ghcr.io/rstudio/rstudio-connect-content-init:ubuntu2204-2024.06.0" ``` -The `sessionImage` is used as an init container in content execution jobs. It prepares the runtime environment before content runs. 
+**Important:** The `sessionImage` is used as an init container in content execution jobs. It prepares the runtime environment before content runs. ### Resource Scaling From d0c09e7082f0b3dcfb70b7d9f14bde6ad0b45aaa Mon Sep 17 00:00:00 2001 From: ian-flores Date: Mon, 2 Mar 2026 12:27:23 -0800 Subject: [PATCH 4/4] docs: standardize list punctuation in CONTRIBUTING.md --- CONTRIBUTING.md | 24 ++++++++++++------------ 1 file changed, 12 insertions(+), 12 deletions(-) diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md index d31cee1..ab930eb 100644 --- a/CONTRIBUTING.md +++ b/CONTRIBUTING.md @@ -323,23 +323,23 @@ go tool cover -func coverage.out Include the following in your PR description: -1. **Summary** - What does this PR do? -2. **Motivation** - Why is this change needed? -3. **Testing** - How was this tested? -4. **Breaking changes** - Does this introduce any breaking changes? -5. **Related issues** - Link to any related GitHub issues +1. **Summary:** What does this PR do? +2. **Motivation:** Why is this change needed? +3. **Testing:** How was this tested? +4. **Breaking changes:** Does this introduce any breaking changes? +5. 
**Related issues:** Link to any related GitHub issues ### CI Checks The following checks must pass: -- **PR Title Check** - Title must follow conventional commit format (see above) -- **Build** - The operator must compile successfully -- **Unit tests** - All tests must pass -- **Kustomize** - Kustomization must build without errors -- **Helm lint** - Chart must pass linting -- **Helm template** - Templates must render correctly -- **No diff** - Generated files must be committed +- **PR Title Check:** Title must follow conventional commit format (see above) +- **Build:** The operator must compile successfully +- **Unit tests:** All tests must pass +- **Kustomize:** Kustomization must build without errors +- **Helm lint:** Chart must pass linting +- **Helm template:** Templates must render correctly +- **No diff:** Generated files must be committed ### Merging
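The PR Title Check listed above can be approximated locally before pushing. This is a minimal sketch, not the actual CI job: it assumes a plain regex over the conventional-commit shape `<type>(<scope>)!: <description>`, and the accepted type list is an assumption — the workflow configuration in this repository is authoritative.

```shell
#!/bin/sh
# Hypothetical local pre-check mirroring the PR Title Check CI job.
# The accepted type list below is an assumption; the CI workflow is authoritative.
title="${1:-feat(workbench): add session scaling}"

# <type>, optional (<scope>), optional ! for breaking changes, then ": <description>"
pattern='^(feat|fix|docs|style|refactor|perf|test|build|ci|chore|revert)(\([a-z0-9-]+\))?!?: .+'

if printf '%s' "$title" | grep -qE "$pattern"; then
  echo "PR title OK: $title"
else
  echo "PR title does not follow conventional-commit format: $title" >&2
  exit 1
fi
```

Because this repository squash-merges, the title checked here becomes the commit message on `main` that semantic-release analyzes, so a title like `fix(connect): handle empty samlMetadataUrl` passes while `Fix stuff` fails.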