Summary
I’m reporting an outage-class failure mode with SMB external storage on YunoHost Nextcloud: when an SMB mount is misconfigured (bad or missing credentials, a “pending” auth state, etc.) and a user browses it in the Nextcloud Files web UI, Nextcloud can spawn large numbers of smbclient processes, each with a ~20s timeout. This saturates the Nextcloud PHP‑FPM pool and causes global 502 Bad Gateway errors on all Nextcloud routes, not just the SMB mount.
This is easy to trigger accidentally and hard to diagnose because it looks like “Nextcloud is down” rather than “one external mount is broken.”
Full write-up / context: https://wonko.net/fixing-the-smb-password-error-in-nextcloud-yunohost/
───
Environment
• YunoHost: 12.x
• nextcloud_ynh: 32.0.4~ynh1
• OS: Debian 12
• PHP-FPM: 8.3 (Nextcloud pool socket: /var/run/php/php8.3-fpm-nextcloud.sock)
• DB: MariaDB
• Nextcloud install: subdirectory (/nextcloud)
• External storage: SMB/CIFS (files_external)
───
Observed behavior / evidence
When the SMB mount is in a broken/pending auth state and is accessed from the Files UI:
- PHP‑FPM workers spawn many smbclient children (each with a 20s timeout) until the pool saturates. Example from systemctl status php8.3-fpm showing smbclient processes under the service:
/usr/bin/smbclient -t 20 --authentication-file=/proc/self/fd/3 //192.168.x.x/sambashare
(repeated many times)
- nginx begins returning 502 for unrelated Nextcloud routes, because it cannot connect to the Nextcloud FPM socket. Example error pattern:
connect() to unix:/var/run/php/php8.3-fpm-nextcloud.sock failed (11: Resource temporarily unavailable) while connecting to upstream
- Immediate recovery (restores service quickly):
• pkill -f 'smbclient.*//192.168.x.x/sambashare'
This frees the FPM pool and the whole instance comes back.
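To confirm this failure mode before killing anything, it can help to count how many smbclient processes are pointed at the suspect share. The following is a minimal sketch, not part of the original debugging session: count_smbclient is a hypothetical helper name, and it only assumes the command-line shape shown in the systemctl output above.

```shell
# Count smbclient command lines that target a given share.
# Reads a process listing (one command line per row) on stdin.
count_smbclient() {
    share="$1"
    # -F matches the strings literally. grep -c exits 1 when the count is 0,
    # so "|| true" keeps the function usable under `set -e`.
    grep -F -- 'smbclient' | grep -c -F -- "$share" || true
}

# Example (run on the affected host):
#   ps -eo args | count_smbclient '//192.168.x.x/sambashare'
```

A count close to the size of the Nextcloud PHP‑FPM pool would be consistent with the saturation described above.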
───
Additional pitfall: CLI footgun can store SMB password in plaintext mount options
On this installation, it is easy to mistakenly “set the SMB password” via:
• php occ files_external:option <mount_id> password '...'
This stores the password as a mount option (visible in plaintext as options.password in the output of php occ files_external:list --output=json) and does not reliably set the SMB credentials. The correct command is:
• php occ files_external:config <mount_id> password '...'
This is relevant because it increases the chance admins end up with a broken mount and trigger the outage mode above.
(I’m planning to report the CLI guardrail/doc aspect upstream to Nextcloud as well; including here because it’s directly related to how operators trip into the broken mount state.)
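A quick way to check whether an instance has already been bitten by this is to look for a password key under options in the JSON listing. This is a rough sketch under assumptions: it requires jq, find_plaintext_password_mounts is a hypothetical helper name, and the mount_id and options field names are assumed from the JSON output on this installation.

```shell
# Print the IDs of mounts whose *options* (not configuration) contain a
# "password" key. Reads `files_external:list --output=json` on stdin.
find_plaintext_password_mounts() {
    # "objects" skips entries whose options are not a JSON object
    # (an empty option set may be serialized as []).
    jq -r '.[] | select(.options | objects | has("password")) | .mount_id'
}

# Example:
#   php occ files_external:list --output=json | find_plaintext_password_mounts
```

Any ID printed here points at a mount where the password was probably set with files_external:option rather than files_external:config.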
───
Proposed mitigation for nextcloud_ynh (no secrets)
Even if the underlying Nextcloud/files_external behavior is upstream, I think nextcloud_ynh can prevent full-instance outages with a packaging-level mitigation that does not store secrets:
• Enumerate SMB mounts after upgrade (and/or via a periodic health check)
• Run php occ files_external:verify <mount_id>
• If verify fails, automatically disable the mount (or emit a very loud warning plus the exact disable command), so one broken mount can’t DoS the entire instance
I’m happy to test a PR if maintainers agree on approach/location (upgrade script vs hook vs cron).
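The steps above could look roughly like the following. This is a sketch for discussion, not a tested packaging change. Assumptions: jq is available; OCC is a placeholder for however the package invokes occ; the storage, mount_id and status field names are assumed from the JSON output of files_external:list and files_external:verify. The disable step is deliberately left as a warning, since the exact disable mechanism is still open.

```shell
# Assumed placeholder: adjust to the packaged occ invocation (user, PHP version, path).
OCC="${OCC:-php occ}"

# Print the IDs of SMB mounts from `files_external:list --output=json` (stdin).
smb_mount_ids() {
    jq -r '.[] | select(.storage | tostring | test("SMB"; "i")) | .mount_id'
}

# Return 0 only if the verify JSON on stdin reports status "ok".
verify_is_ok() {
    jq -e '.status == "ok"' >/dev/null 2>&1
}

# Verify every SMB mount; warn on failures and return non-zero if any failed.
check_smb_mounts() {
    failed=0
    for id in $($OCC files_external:list --output=json | smb_mount_ids); do
        if ! $OCC files_external:verify --output=json "$id" | verify_is_ok; then
            echo "WARNING: SMB mount $id failed verification; consider disabling it" >&2
            failed=$((failed + 1))
        fi
    done
    [ "$failed" -eq 0 ]
}
```

Verifying a mount whose server is unreachable may itself block for about the smbclient timeout, so a real implementation would probably want to bound each check and run it outside the web request path.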
───
AI disclosure (for transparency)
This report is based on my own debugging on a live YunoHost/Nextcloud instance (commands/log patterns included). I used AI tools (Claude Sonnet 4.6 and OpenClaw / OpenAI Codex gpt‑5.2) to help structure and edit the write-up, but all technical claims are backed by the included command outputs and the linked full write-up.