
fix(k8s): remove stale NFS writes and decouple PVC from skills#106

Merged
fred-scitix merged 2 commits into main from fix/remove-agent-data-from-skills-dir
Mar 13, 2026

Conversation

@jacoblee-io (Collaborator) commented Mar 13, 2026

Problem

  1. buildCredentialPayload() in rpc-methods.ts creates an agent-data/ directory under skillsDir on NFS; the gateway should never write to NFS.
  2. agentbox-template.yaml mounted the NFS PVC for skills, credentials, and kube configs, but these are all synced via RPC, not via a shared filesystem.
  3. k8s/gateway-deployment.yaml was missing the NFS PVC mount needed by ensureUserDir().

Solution

rpc-methods.ts: Remove the fs.mkdirSync() call from buildCredentialPayload().

agentbox-template.yaml: Rewrite to match what k8s-spawner.ts actually generates:

  • Skills, credentials, config → emptyDir (synced from gateway via RPC)
  • User data → NFS PVC (siclaw-data) with subPath users/{userId}/{workspaceId}
  • Client cert → Secret volume
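The three volume types above could look roughly like this in the pod template. This is a minimal sketch only: the volume names, the Secret name, and the exact subPath placeholders are assumptions for illustration, not copied from agentbox-template.yaml.

```yaml
# Illustrative volume layout — identifiers here are assumed, not authoritative.
volumes:
  # Skills, credentials, and config are synced from the gateway via RPC,
  # so ephemeral emptyDir volumes are sufficient.
  - name: skills
    emptyDir: {}
  - name: credentials
    emptyDir: {}
  - name: config
    emptyDir: {}
  # Per-user persistent data lives on the shared NFS PVC.
  - name: user-data
    persistentVolumeClaim:
      claimName: siclaw-data
  # mTLS client certificate mounted from a Secret (name assumed).
  - name: client-cert
    secret:
      secretName: agentbox-client-cert
containers:
  - name: agentbox
    volumeMounts:
      - name: user-data
        mountPath: /app/.siclaw/user-data
        # subPath is templated per user/workspace by k8s-spawner.ts;
        # the placeholder syntax below is illustrative.
        subPath: users/<userId>/<workspaceId>
```

Because skills, credentials, and config are emptyDir, nothing the AgentBox writes there outlives the pod; only the user-data subPath persists.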

gateway-deployment.yaml: Add NFS PVC mount at /app/.siclaw/user-data + persistence env vars, so gateway can ensureUserDir() before spawning AgentBox pods.
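The gateway-side change could be sketched as the fragment below. The env var names are assumptions inferred from the commit message's "persistence env vars (CLAIM_NAME, MOUNT_PATH)", not verified against gateway-deployment.yaml.

```yaml
# Illustrative fragment — env var names are assumed from the PR description.
containers:
  - name: gateway
    env:
      - name: PERSISTENCE_CLAIM_NAME   # assumed name
        value: siclaw-data
      - name: PERSISTENCE_MOUNT_PATH   # assumed name
        value: /app/.siclaw/user-data
    volumeMounts:
      - name: user-data
        mountPath: /app/.siclaw/user-data
volumes:
  - name: user-data
    persistentVolumeClaim:
      claimName: siclaw-data
```

The gateway only creates per-user directories under this mount (ensureUserDir()); it never writes user data itself.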

NFS PVC usage after this PR

| Component | Mount path | Purpose | Access |
| --- | --- | --- | --- |
| Gateway | /app/.siclaw/user-data | ensureUserDir() — create per-user subdirectories | write (dirs only) |
| AgentBox | /app/.siclaw/user-data (subPath) | memory, investigations, sessions | read-write |

Skills are not on NFS — they live in emptyDir and are synced via RPC.

Test plan

  • Gateway starts without errors
  • Credential sync works (AgentBox receives credentials correctly)
  • No new agent-data/ directories appear under skillsDir
  • AgentBox pods start with correct volume mounts

…edentialPayload

buildCredentialPayload() was creating an `agent-data/` directory under
skillsDir on the NFS, contradicting its own JSDoc ("Returns data only —
does NOT write to disk"). Gateway should never write to the shared NFS;
only AgentBox pods should write to their own user-data mount.
@jacoblee-io force-pushed the fix/remove-agent-data-from-skills-dir branch from eafac9f to ec13edb on March 13, 2026 at 14:41
Skills are synced via RPC (buildSkillBundle), not shared filesystem.
The NFS PVC should only be used for user-data persistence.

agentbox-template.yaml:
- Remove all skills/credentials/kube NFS mounts (were on siclaw-skills PVC)
- Add emptyDir volumes for skills, credentials, config (synced via RPC)
- Add client-cert secret volume (mTLS)
- Align with what k8s-spawner.ts actually generates

gateway-deployment.yaml:
- Add NFS PVC (siclaw-data) mount at /app/.siclaw/user-data
- Add persistence env vars (CLAIM_NAME, MOUNT_PATH)
- Gateway uses this mount only for ensureUserDir() before spawning pods
@jacoblee-io force-pushed the fix/remove-agent-data-from-skills-dir branch from ec13edb to c95c67b on March 13, 2026 at 14:43
@jacoblee-io changed the title from "fix(gateway): remove stale agent-data dir creation from NFS" to "fix(k8s): remove stale NFS writes and decouple PVC from skills" on Mar 13, 2026
@jacoblee-io (Collaborator, Author) left a comment:

LGTM. The removed mkdirSync was writing to skillsDir (gateway's skills emptyDir/NFS), but the AgentBox pod's user-data subPath (user/{userId}/agent-data) is on a completely separate volume. So this directory creation never served a purpose for the AgentBox — it was a no-op side effect that contradicted the JSDoc. Clean removal, no risk.

@fred-scitix fred-scitix self-requested a review March 13, 2026 14:46
@fred-scitix fred-scitix merged commit 12f77c8 into main Mar 13, 2026
3 checks passed
