116 changes: 90 additions & 26 deletions deploy/helm/spark-k8s-operator/templates/roles.yaml
Original file line number Diff line number Diff line change
@@ -6,51 +6,87 @@ metadata:
labels:
{{- include "operator.labels" . | nindent 4 }}
rules:
# For automatic cluster domain detection: the operator lists and watches nodes to
# determine the Kubernetes cluster domain (e.g. cluster.local), and reads kubelet
# information via the nodes/proxy subresource.
- apiGroups:
- ""
resources:
- nodes
verbs:
- list
- watch
# For automatic cluster domain detection
- apiGroups:
- ""
resources:
- nodes/proxy
verbs:
- get
# The pod-driver controller (Controller::new(Pod)) watches Spark driver pods
# (labelled spark-role=driver) to track SparkApplication completion. It also deletes
# driver pods once the application reaches a terminal phase (Succeeded or Failed).
- apiGroups:
- ""
resources:
- persistentvolumeclaims
- pods
verbs:
- delete
- get
- list
- watch
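The deletion decision described in the comment above can be sketched as follows (illustrative Python only; the operator itself is written in Rust, and the function name is hypothetical — the label `spark-role=driver` and the terminal phases are taken from the comment):

```python
# Illustrative model of the pod-driver controller's cleanup decision.
# Driver pods carry the label spark-role=driver and are deleted once
# they reach a terminal phase (Succeeded or Failed).
TERMINAL_PHASES = {"Succeeded", "Failed"}

def should_delete_driver_pod(pod: dict) -> bool:
    labels = pod.get("metadata", {}).get("labels", {})
    phase = pod.get("status", {}).get("phase")
    return labels.get("spark-role") == "driver" and phase in TERMINAL_PHASES
```

This is why the rule needs delete in addition to the get/list/watch verbs that the watch machinery requires.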
# ConfigMaps hold pod templates and Spark configuration. All three controllers apply
# them via Server-Side Apply (create + patch). The history and connect controllers
# track them for orphan cleanup (list + delete). All controllers watch ConfigMaps via
# .owns(ConfigMap) so that changes trigger re-reconciliation.
# get is required for the ReconciliationPaused strategy in cluster_resources.add().
- apiGroups:
- ""
resources:
- configmaps
verbs:
- create
- delete
- deletecollection
- get
- list
- patch
- update
- watch
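The comment above names four recurring behaviors, each implying a fixed verb set, and the same pattern repeats for most rules in this file. A small sketch (hypothetical helper, not operator code) makes the mapping explicit:

```python
# Hypothetical mapping of controller behaviors to RBAC verbs, as
# described in the comments of this file. Not operator code.
BEHAVIOR_VERBS = {
    "server_side_apply": {"create", "patch"},    # SSA needs both verbs
    "orphan_cleanup": {"list", "delete"},        # find and remove orphans
    "owns_watch": {"get", "list", "watch"},      # .owns() watch machinery
    "reconciliation_paused": {"get"},            # cluster_resources.add()
}

def required_verbs(*behaviors: str) -> set:
    verbs = set()
    for behavior in behaviors:
        verbs |= BEHAVIOR_VERBS[behavior]
    return verbs

configmap_verbs = required_verbs(
    "server_side_apply", "orphan_cleanup", "owns_watch", "reconciliation_paused"
)
```

Note that the rule above additionally grants update and deletecollection, which do not follow from these four behaviors alone.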
# Services expose the Spark History Server and Spark Connect Server for metrics and
# inter-component communication. They are applied via Server-Side Apply and tracked for
# orphan cleanup by the history and connect controllers, which also watch Services via
# .owns(Service) to trigger re-reconciliation on change.
# get is required for the ReconciliationPaused strategy in cluster_resources.add().
- apiGroups:
- ""
resources:
- pods
- configmaps
- secrets
- services
- endpoints
- serviceaccounts
verbs:
- create
- delete
- deletecollection
- get
- list
- patch
- update
- watch
# ServiceAccounts are created per SparkApplication (directly via client.apply_patch,
# referencing spark-k8s-clusterrole) and per SparkHistoryServer/SparkConnectServer
# (via cluster_resources.add). The history and connect controllers track them for
# orphan cleanup (list + delete). No controller watches ServiceAccounts via .owns().
# get is required for the ReconciliationPaused strategy in cluster_resources.add().
- apiGroups:
- ""
resources:
- serviceaccounts
verbs:
- create
- delete
- get
- list
- patch
# RoleBindings are created per SparkApplication (directly via client.apply_patch,
# binding to spark-k8s-clusterrole) and per SparkHistoryServer/SparkConnectServer
# (via cluster_resources.add, binding to their respective ClusterRoles). The history
# and connect controllers track them for orphan cleanup (list + delete).
# No controller watches RoleBindings via .owns().
# get is required for the ReconciliationPaused strategy in cluster_resources.add().
- apiGroups:
- rbac.authorization.k8s.io
resources:
@@ -61,32 +97,36 @@ rules:
- get
- list
- patch
- update
- watch
# StatefulSets run the Spark History Server and Spark Connect Server. Applied via
# Server-Side Apply (create + patch), tracked for orphan cleanup (list + delete),
# and watched by the history and connect controllers via .owns(StatefulSet).
# get is required for the ReconciliationPaused strategy in cluster_resources.add().
- apiGroups:
- apps
resources:
- statefulsets
- deployments
verbs:
- create
- delete
- get
- list
- patch
- update
- watch
# A Kubernetes Job is created per SparkApplication via Server-Side Apply to run
# spark-submit. The app controller applies Jobs directly (not via cluster_resources),
# so only create + patch (SSA) are needed. Jobs are not watched and not tracked for
# orphan cleanup by any controller.
- apiGroups:
- batch
resources:
- jobs
verbs:
- create
- delete
- get
- list
- patch
- update
- watch
# PodDisruptionBudgets limit voluntary disruptions to Spark History Server pods.
# Applied via Server-Side Apply and tracked for orphan cleanup by the history
# controller. No controller watches PDBs via .owns().
# get is required for the ReconciliationPaused strategy in cluster_resources.add().
- apiGroups:
- policy
resources:
@@ -97,8 +137,6 @@ rules:
- get
- list
- patch
- update
- watch
- apiGroups:
- apiextensions.k8s.io
resources:
@@ -114,13 +152,20 @@ rules:
- list
- watch
{{- end }}
# The operator emits Kubernetes events for controller reconciliation outcomes
# (create for new events, patch to aggregate/update existing events).
- apiGroups:
- events.k8s.io
resources:
- events
verbs:
- create
- patch
# The operator reconciles SparkApplication, SparkHistoryServer, SparkConnectServer,
# and SparkApplicationTemplate objects as the primary resources for their respective
# controllers. get + list + watch are required for Controller::new() and .owns().
# The main resource objects are never patched directly; only the /status subresources
# are patched (see the separate rule below).
- apiGroups:
- spark.stackable.tech
resources:
@@ -131,15 +176,22 @@ rules:
verbs:
- get
- list
- patch
- watch
# The app controller patches SparkApplication status after creating the Job (to prevent
# duplicate job creation on restart). The pod-driver controller also patches it when the
# driver pod transitions to a terminal phase. The connect controller patches
# SparkConnectServer status each reconciliation with readiness conditions.
# The history controller does not update SparkHistoryServer status.
- apiGroups:
- spark.stackable.tech
resources:
- sparkapplications/status
- sparkconnectservers/status
verbs:
- patch
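RBAC treats a subresource as a distinct resource because it lives at its own API path, which is why status patching needs this separate rule. A sketch of the path construction (the v1alpha1 version, namespace, and object name below are assumptions for illustration):

```python
# The /status subresource has its own URL path, so RBAC distinguishes
# "sparkapplications" from "sparkapplications/status". The API version
# and names below are illustrative assumptions.
def status_patch_path(group: str, version: str, namespace: str,
                      plural: str, name: str) -> str:
    return (f"/apis/{group}/{version}/namespaces/{namespace}"
            f"/{plural}/{name}/status")

path = status_patch_path("spark.stackable.tech", "v1alpha1",
                         "default", "sparkapplications", "pi")
```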
# S3Connection and S3Bucket objects provide S3 configuration for Spark (event log
# storage, data access). The operator reads them during reconciliation and watches them
# so that S3 configuration changes trigger re-reconciliation.
- apiGroups:
- s3.stackable.tech
resources:
@@ -149,6 +201,12 @@ rules:
- get
- list
- watch
# The operator creates per-application/per-server RoleBindings that reference the
# product ClusterRoles, granting workload pods the permissions they need at runtime.
# The bind verb is required to create RoleBindings that reference a ClusterRole.
# - {{ include "operator.name" . }}-clusterrole: bound per SparkApplication
# - spark-history-clusterrole: bound per SparkHistoryServer
# - spark-connect-clusterrole: bound per SparkConnectServer
- apiGroups:
- rbac.authorization.k8s.io
resources:
@@ -157,17 +215,23 @@ rules:
- bind
resourceNames:
- {{ include "operator.name" . }}-clusterrole
- spark-history-clusterrole
- spark-connect-clusterrole
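Kubernetes prevents privilege escalation here: a client may create a RoleBinding referencing a ClusterRole only if it already holds every permission that role grants, or if it has the bind verb on that specific role name, as granted in the rule above. A simplified model of that check (illustrative only, not the actual RBAC admission code):

```python
# Simplified model of the RBAC escalation check for creating a
# RoleBinding that references a ClusterRole. Kubernetes allows it if
# the requester (a) already holds every permission granted by the role,
# or (b) has the "bind" verb scoped to that ClusterRole's name.
def may_bind(requester_perms: set, role_perms: set,
             bindable_role_names: set, role_name: str) -> bool:
    holds_all = role_perms <= requester_perms
    has_bind = role_name in bindable_role_names
    return holds_all or has_bind
```

This is why the operator lists the three product ClusterRoles under resourceNames: it does not itself hold the runtime permissions those roles grant, so it needs bind on each of them.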
# Listeners expose the Spark History Server and Spark Connect Server to the network.
# Applied via Server-Side Apply (create + patch) and tracked for orphan cleanup
# (list + delete) by the history and connect controllers.
# get is required for the ReconciliationPaused strategy in cluster_resources.add().
# No controller watches Listeners via .owns(), so the watch verb is not required.
- apiGroups:
- listeners.stackable.tech
resources:
- listeners
verbs:
- create
- delete
- get
- list
- watch
- patch
- create
- delete
{{ if .Capabilities.APIVersions.Has "security.openshift.io/v1" }}
- apiGroups:
- security.openshift.io
@@ -6,6 +6,13 @@ metadata:
labels:
{{- include "operator.labels" . | nindent 4 }}
rules:
# These permissions are for the Spark driver pod at runtime, not the operator itself.
# The driver uses Kubernetes-native scheduling to create and manage executor pods,
# and needs access to configmaps (executor config), services (driver-executor
# communication), secrets (credentials), persistentvolumeclaims (PVC-based dynamic
# allocation scratch space), and pods (executor lifecycle management).
# serviceaccounts is included from the upstream template, but Spark does not create
# service accounts at runtime; it could be removed in a future cleanup.
- apiGroups:
- ""
resources:
@@ -24,6 +31,7 @@ rules:
- patch
- update
- watch
# Spark may emit events for executor lifecycle transitions.
- apiGroups:
- events.k8s.io
resources:
@@ -6,6 +6,13 @@ metadata:
labels:
{{- include "operator.labels" . | nindent 4 }}
rules:
# These permissions are for the Spark Connect Server pod at runtime, not the operator.
# The Spark Connect Server acts as a long-running Spark driver that creates and manages
# executor pods via Kubernetes-native scheduling. It requires access to pods (executor
# lifecycle), configmaps (executor config), services (driver-executor communication),
# secrets (credentials), and persistentvolumeclaims (PVC-based dynamic allocation).
# serviceaccounts is included from the upstream template, but Spark Connect does not
# create service accounts at runtime; it could be removed in a future cleanup.
- apiGroups:
- ""
resources:
@@ -24,6 +31,7 @@ rules:
- patch
- update
- watch
# Spark Connect Server may emit Kubernetes events for executor lifecycle transitions.
- apiGroups:
- events.k8s.io
resources:
@@ -6,6 +6,12 @@ metadata:
labels:
{{- include "operator.labels" . | nindent 4 }}
rules:
# These permissions are for the Spark History Server pod at runtime, not the operator.
# The History Server is a read-only web UI that reads completed Spark event logs from
# a shared log directory (typically S3 or HDFS). It does not create pods, services,
# or other Kubernetes resources at runtime; this ClusterRole is significantly
# over-permissioned and should be tightened in a future audit once the minimal
# runtime requirements are confirmed against a live deployment.
- apiGroups:
- ""
resources:
@@ -24,6 +30,7 @@ rules:
- patch
- update
- watch
# Spark History Server may emit Kubernetes events.
- apiGroups:
- events.k8s.io
resources: