refactor(k8s): replace per-user dynamic PVCs with shared PVC + subPath#105
refactor(k8s): replace per-user dynamic PVCs with shared PVC + subPath#105fred-scitix merged 2 commits intomainfrom
Conversation
jacoblee-io
left a comment
There was a problem hiding this comment.
Good refactor — dynamic PVC provisioning per user was unnecessarily complex. Shared PVC + subPath is the right call. A few issues below, one of which could cause silent bugs.
src/gateway/agentbox/k8s-spawner.ts
Outdated
| throw err; | ||
| } | ||
| private ensureUserDir(userId: string, workspaceId: string): void { | ||
| const mountPath = this.config.persistence?.mountPath || ".siclaw/shared-data"; |
There was a problem hiding this comment.
mountPath default is a relative path — fragile.
The Helm template always passes an absolute path (/app/.siclaw/shared-data), but the code defaults here and in gateway-main.ts are relative (.siclaw/shared-data). If someone runs without Helm or the env var isn't set, path.resolve() resolves against CWD, which depends on container WORKDIR and can silently break.
Suggest making the default absolute:
const mountPath = this.config.persistence?.mountPath || "/app/.siclaw/shared-data";And same in gateway-main.ts:31.
src/gateway/agentbox/k8s-spawner.ts
Outdated
| await this.ensureUserPvc(userId); | ||
| const subDir = `users/${safeUserId}/${safeWorkspaceId}`; | ||
| console.log(`[k8s-spawner] Persistence enabled: shared PVC "${this.config.persistence.claimName}", subPath "${subDir}"`); | ||
| this.ensureUserDir(userId, workspaceId); |
There was a problem hiding this comment.
Nit: double sanitization.
safeUserId/safeWorkspaceId are computed at lines 183-184, but ensureUserDir() takes the raw userId/workspaceId and sanitizes internally again. Both produce the same result, so it's correct, but the pattern is confusing — a reader might wonder if they diverge.
Consider either:
- Passing
safeUserId/safeWorkspaceIdand removingsanitizePathSegmentfromensureUserDir, or - Having
ensureUserDirreturn the safe values so the caller doesn't need to compute them separately.
| - name: SICLAW_PERSISTENCE_CLAIM_NAME | ||
| value: {{ .Values.agentbox.persistence.claimName | quote }} | ||
| - name: SICLAW_PERSISTENCE_MOUNT_PATH | ||
| value: "/app/.siclaw/shared-data" |
There was a problem hiding this comment.
Hardcoded mount path not configurable via values.yaml.
The SICLAW_PERSISTENCE_MOUNT_PATH env var is hardcoded to /app/.siclaw/shared-data, and the volumeMount.mountPath on line 83 is also hardcoded to the same value. If someone changes one but not the other, they'll silently diverge.
Consider either:
- Adding
mountPathtovalues.yamland templating both, or - Adding a comment noting these two must stay in sync.
5000d0c to
e406a5c
Compare
jacoblee-io
left a comment
There was a problem hiding this comment.
Previous comments all addressed — nice. Two small nits remaining, neither is a blocker.
src/gateway/agentbox/k8s-spawner.ts
Outdated
| size: string; // e.g. "1Gi" | ||
| /** Name of the pre-existing shared PVC (e.g. "siclaw-data") */ | ||
| claimName: string; | ||
| /** Local mount path of the shared PVC on the gateway (default: ".siclaw/shared-data") */ |
There was a problem hiding this comment.
Nit: stale JSDoc default.
The comment says (default: ".siclaw/shared-data") but the actual default in ensureUserDir (line 352) is now "/app/.siclaw/user-data". Should be:
/** Local mount path of the shared PVC on the gateway (default: "/app/.siclaw/user-data") */| - name: skills-pv | ||
| mountPath: /app/.siclaw/user-data | ||
| subPath: user/${USER_ID}/agent-data | ||
| subPath: users/${USER_ID}/${WORKSPACE_ID} |
There was a problem hiding this comment.
Nit: users/ vs user/ prefix inconsistency on the same PVC.
User data subPath is now users/{USER_ID}/{WORKSPACE_ID} (plural), but three lines below (line 114), credentials still use user/{USER_ID}/... (singular). Skills also use user/ (lines 107-108). Having both users/ and user/ as top-level directories on the same PVC is a readability trap — someone will inevitably confuse them.
Consider aligning all to one convention (either users/ everywhere or user/ everywhere) if there's no hard reason to keep them separate.
jacoblee-io
left a comment
There was a problem hiding this comment.
Re: stale JSDoc — Fixed, now reads "/app/.siclaw/user-data".
Re: users/ vs user/ prefix — Intentional. user/ is the skills PVC directory layout (skills, credentials, kube configs), while users/ is the shared data PVC layout (memory, sessions). These are different volumes in the actual K8s spawner; the template co-locates them on one PVC for reference only. Renaming user/ would break existing skills sync paths in local-spawner.ts and resource-handlers.ts.
Instead of dynamically creating a PVC per user (requiring StorageClass
with dynamic provisioning), use a single pre-existing shared PVC.
Gateway creates per-user subdirectories (users/{userId}/{workspaceId}/);
AgentBox pods mount via subPath for isolation.
- Replace persistence config (storageClass/accessMode/size) with claimName
- Remove ensureUserPvc() / userPvcName() — no more K8s PVC creation
- Add ensureUserDir() with sanitizePathSegment() for safe directory names
- Mount shared PVC on gateway for directory management
- Template mountPath via values.yaml (env + volumeMount stay in sync)
- Remove PVC "create" verb from RBAC (only get/list needed)
- Update all K8s manifests and Helm templates for consistency
e406a5c to
7d9a32a
Compare
jacoblee-io
left a comment
There was a problem hiding this comment.
All previous comments addressed — JSDoc default fixed, double sanitization cleaned up, mountPath templated via values.yaml. No remaining issues. LGTM, ready to merge.
Summary
siclaw-data)users/{userId}/{workspaceId}/) on the shared volumesubPathfor isolation — no dynamic provisioning or StorageClass requiredDetails
Before: Each user got a dedicated PVC (
siclaw-user-{userId}) created on demand via K8s API, requiring a StorageClass that supports dynamic provisioning.After: Operator pre-creates one shared
ReadWriteManyPVC. Gateway mounts it and creates subdirectories at spawn time; AgentBox pods mount the user-specific subdirectory viasubPath.Changes
ensureUserPvc()/userPvcName(), addensureUserDir()/sanitizePathSegment(), use shared PVCclaimName+subPathstorageClass/accessMode/sizeenv vars withclaimName/mountPathenabled+claimNameSICLAW_PERSISTENCE_MOUNT_PATHcreateverb (onlyget/listneeded)ReadWriteMany, 10Gi, updated commentsTest plan
persistence.enabled=true, verify gateway createsusers/{userId}/{workspaceId}/on the shared PVCpersistence.enabled=false(default), verify emptyDir fallback still works