
v2.12.0: Multi-instance Portainer, self-update, and bug fixes#75

Merged
Will-Luck merged 67 commits into main from dev on Mar 13, 2026

Conversation

@Will-Luck
Owner

Summary

  • Multi-instance Portainer support: connect multiple Portainer instances with per-endpoint toggles; containers appear as dashboard host groups; BoltDB migration for instance storage
  • Portainer self-update: Via portainer-updater helper container
  • NPM resolver hardening: Auto-detects local IPs to prevent cross-host port shadowing, skips wildcard domains
  • 10+ bug fixes: History page scan summaries, failed approval recording, images page alignment, filter bar borders, container detail for remote containers, scoped key lookups, and more

What changed

  • 40 commits since v2.11.1
  • New BoltDB bucket: portainer_instances with CRUD + migration from legacy single-instance settings
  • Multi-Portainer scanning with per-endpoint filtering and local socket auto-blocking
  • Portainer connector UI (settings page) with endpoint toggles
  • Smart local socket detection (IsLocalSocket) to prevent scanning the host Docker twice
  • NPM resolver improvements (local IP detection, wildcard skip)
  • Portainer self-update feature via helper container
  • Images page: column alignment, red unused badge
  • History page: scan summary row fix, failed approval recording
  • Filter bar bottom border

Test plan

  • Bug hunt on the diff
  • Build and deploy to test environment
  • Verify multi-Portainer connector UI
  • Verify Portainer self-update flow
  • Verify NPM URL resolution with multiple hosts
  • Verify history and images page fixes
  • Run full test suite

🤖 Generated with Claude Code

web-flow added 30 commits March 10, 2026 23:17
When accessing the dashboard via NPM domain (sen.lucknet.uk), the HTTP
Host header was used for NPM ForwardHost matching. Since NPM stores IPs
not domains, no proxy hosts matched and every port fell back to
sen.lucknet.uk:<port>.

For local containers, use Lookup() (matches against the resolver's
configured sentinelHost IP) instead of LookupForHost() with the request
domain. Cluster containers still use LookupForHost() with the remote
host's IP.
NPM proxy hosts with wildcard domains like *.s3.garage.example.com
produced broken URLs. The resolver now picks the first non-wildcard
domain from the list, falling back to skipping the entry entirely
if all domains are wildcards.
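The wildcard-skip rule can be sketched as follows. This is a minimal illustration, not the resolver's actual code; the function name `pickDomain` is hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// pickDomain mirrors the behaviour described above: return the first
// non-wildcard domain from an NPM proxy host's domain list, or "" to
// signal the entry should be skipped entirely.
func pickDomain(domains []string) string {
	for _, d := range domains {
		if !strings.HasPrefix(d, "*.") {
			return d
		}
	}
	return "" // all domains are wildcards: skip this proxy host
}

func main() {
	fmt.Println(pickDomain([]string{"*.s3.garage.example.com", "s3.garage.example.com"}))
	fmt.Println(pickDomain([]string{"*.example.com"}) == "")
}
```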
The filter bar on history, logs, and images pages was missing the bottom
border that the dashboard had. Added explicit border-bottom to .filter-bar.
… queue entries

Portainer settings now take effect immediately without a container restart,
using the same factory pattern as the NPM connector. The connection test
always recreates the provider from current DB settings so token changes are
picked up. Portainer endpoints that point at the same Docker socket Sentinel
monitors no longer produce duplicate queue entries (container IDs are
compared against the local scan). Updated help text and integration
descriptions for accuracy.
The dashboard stat card used a local-only pending count that excluded
Portainer queue items, while the nav badge used the full queue length.
Both now use the queue length directly so they always match. Removed
the checkmark icon from the zero-state as it added no value.
The detail page handler only knew about local and cluster containers.
Portainer containers (host=portainer:N) fell through to the cluster
lookup which returned "not found". Added a Portainer branch that
extracts the endpoint ID, fetches containers from the Portainer API,
and builds the detail view with policy, version, and queue info.
Portainer and cluster container detail pages looked up history and
snapshots using the bare container name, but records are stored under
scoped keys (e.g. "portainer:3::name"). Now uses hostFilter::name
when a host filter is present.
Covers data model, local socket detection, connector UI,
dashboard integration, engine changes, and migration path.
13 tasks across 7 chunks: store CRUD, migration, engine multi-instance
scanning, local socket detection, web API, dashboard host groups, and
frontend connector cards.
Adds MigratePortainerSettings() which converts the flat portainer_url/
portainer_token/portainer_enabled settings keys into a PortainerInstance
record (id "p1", name "Portainer") and clears the old keys. Safe to call
multiple times: skips if any instances already exist. Also adds
DeleteSetting() to bolt.go.
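The migration's shape (one-shot conversion, idempotent via an instances-exist check) can be sketched with in-memory maps standing in for the BoltDB buckets. Types and field names here follow the description above but are not the real store types.

```go
package main

import "fmt"

// PortainerInstance is a simplified stand-in for the stored record.
type PortainerInstance struct {
	ID, Name, URL, Token string
	Enabled              bool
}

// Store uses maps in place of the real BoltDB buckets.
type Store struct {
	settings  map[string]string
	instances map[string]PortainerInstance
}

// MigratePortainerSettings converts the legacy flat keys into a single
// instance record (id "p1", name "Portainer") and clears the old keys.
// Safe to call repeatedly: it skips if any instances already exist.
func (s *Store) MigratePortainerSettings() {
	if len(s.instances) > 0 {
		return // already migrated (or instances created manually)
	}
	url := s.settings["portainer_url"]
	if url == "" {
		return // nothing to migrate
	}
	s.instances["p1"] = PortainerInstance{
		ID:      "p1",
		Name:    "Portainer",
		URL:     url,
		Token:   s.settings["portainer_token"],
		Enabled: s.settings["portainer_enabled"] == "true",
	}
	for _, k := range []string{"portainer_url", "portainer_token", "portainer_enabled"} {
		delete(s.settings, k)
	}
}

func main() {
	s := &Store{
		settings:  map[string]string{"portainer_url": "https://p.example:9443", "portainer_token": "tkn", "portainer_enabled": "true"},
		instances: map[string]PortainerInstance{},
	}
	s.MigratePortainerSettings()
	s.MigratePortainerSettings() // second call is a no-op
	fmt.Println(len(s.instances), s.instances["p1"].Name)
}
```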
Replace single-instance Portainer settings (flat portainer_enabled/url/token
keys) with a full instance CRUD API backed by PortainerInstanceStore. The
PortainerProvider interface now takes instanceID parameters on all methods.

New routes: GET/POST /api/portainer/instances, PUT/DELETE instances/{id},
POST instances/{id}/test, GET instances/{id}/endpoints,
PUT instances/{id}/endpoints/{epid}.

Existing container detail handler updated to parse the new
"portainer:instanceID:epID" host filter format while remaining backwards
compatible with the legacy "portainer:epID" format.

Note: cmd/sentinel/ adapters will not compile until Task 7 updates them.
Three bugs found during live testing on the test cluster:

1. Portainer instances added via the API had no live scanner (only
   boot-time instances worked). Added ConnectInstance/DisconnectInstance
   to PortainerProvider, called from create/update/delete handlers.

2. Portainer self-signed certs caused TLS verification failures. Added
   InsecureSkipVerify to the Portainer HTTP client (standard for homelab
   and private network setups).

3. csrfToken is a function reference (window.csrfToken = getCSRFToken)
   but connectors.html passed it as a value. Changed all 11 occurrences
   to csrfToken() calls.
IsLocalSocket() was defined but never called. Now applied in two places:

1. Scanner.Endpoints() filters out local socket endpoints so the engine
   never scans them (defence in depth).

2. Test Connection handler auto-marks new local socket endpoints as
   blocked with reason "local Docker socket (duplicates direct
   monitoring)" so users see why the endpoint is disabled.

Updated scanner tests to use TCP URLs for mock endpoints (empty URL +
EndpointDocker type now correctly triggers IsLocalSocket).
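The detection and filtering logic reads roughly like this sketch. The `endpointDocker` constant value and the `Endpoint` struct are assumptions for illustration; only the empty-URL-plus-Docker-type and unix:// rules come from the description above.

```go
package main

import (
	"fmt"
	"strings"
)

const endpointDocker = 1 // assumed value for Portainer's direct-Docker endpoint type

// Endpoint is a simplified stand-in for a Portainer endpoint record.
type Endpoint struct {
	Type int
	URL  string
}

// isLocalSocket: a Docker-type endpoint with an empty URL is Portainer's
// bundled local socket; an explicit unix:// URL is also a local socket.
func isLocalSocket(epType int, url string) bool {
	if url == "" && epType == endpointDocker {
		return true
	}
	return strings.HasPrefix(url, "unix://")
}

// filterEndpoints drops local-socket endpoints so the engine never scans
// the same Docker host it already monitors directly.
func filterEndpoints(eps []Endpoint) []Endpoint {
	var out []Endpoint
	for _, ep := range eps {
		if !isLocalSocket(ep.Type, ep.URL) {
			out = append(out, ep)
		}
	}
	return out
}

func main() {
	eps := []Endpoint{
		{Type: endpointDocker, URL: ""},                             // bundled local socket: dropped
		{Type: endpointDocker, URL: "unix:///var/run/docker.sock"},  // dropped
		{Type: endpointDocker, URL: "tcp://10.0.0.5:2375"},          // kept
	}
	fmt.Println(len(filterEndpoints(eps)))
}
```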
Only auto-block unix:// endpoints when the Portainer instance runs on the
same host as Sentinel. Previously all unix:// endpoints were blocked
regardless of host, which incorrectly disabled remote Portainer instances.

- Add isLocalPortainerInstance() to compare Portainer URL against local IPs
- Remove over-aggressive IsLocalSocket filter from Scanner.Endpoints()
- Wire engine into multiPortainerAdapter so runtime-added instances are
  scanned without restart
- Reconnect engine after endpoint config changes (test, update)
- Add unit tests for local detection helpers
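The host comparison behind `isLocalPortainerInstance()` might look like this sketch, assuming the helper parses the instance URL and checks its hostname against the detected local address set (`localAddrs` here is a stand-in for that set).

```go
package main

import (
	"fmt"
	"net/url"
)

// isLocalPortainerInstance reports whether the Portainer URL points at
// this host, so unix:// endpoints are only auto-blocked when the
// instance and Sentinel actually share a Docker host.
func isLocalPortainerInstance(rawURL string, localAddrs map[string]bool) bool {
	u, err := url.Parse(rawURL)
	if err != nil {
		return false
	}
	host := u.Hostname()
	if host == "localhost" || host == "127.0.0.1" {
		return true
	}
	return localAddrs[host]
}

func main() {
	local := map[string]bool{"192.168.1.10": true}
	fmt.Println(isLocalPortainerInstance("https://192.168.1.10:9443", local)) // same host
	fmt.Println(isLocalPortainerInstance("https://192.168.1.99:9443", local)) // remote instance
}
```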
web-flow added 29 commits March 12, 2026 01:29
DetectLocalAddrs was including container-internal addresses (172.17.x.x,
localhost, hostname) which never match NPM ForwardHost values. This caused
Lookup() to silently filter out all proxies when SENTINEL_HOST was not set,
making port chips fall back to raw IP:port links.

Now only includes routable addresses: explicit SENTINEL_HOST values and
Docker host IP via host.docker.internal. Returns an empty set when neither
is available, which disables filtering (safe fallback matching all proxies).
…t shadowing

When hostAddr from the HTTP request is a valid IP (direct IP access or
SENTINEL_HOST), use LookupForHost to match only NPM proxies forwarding
to that specific host. Falls back to Lookup() when accessed via domain.

Fixes regression from 1c18dec where empty localAddrs disabled all
filtering, allowing port 8080 on host A to shadow port 8080 on host B.
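The decision between the two lookups can be sketched like this. `Proxy`, `Resolver`, and `Resolve` are illustrative names; only the rule (strict per-host matching when `hostAddr` is a literal IP, loose matching for domain access) comes from the commits above.

```go
package main

import (
	"fmt"
	"net"
)

// Proxy is a simplified NPM proxy host record.
type Proxy struct {
	Domain      string
	ForwardHost string
	Port        int
}

type Resolver struct{ proxies []Proxy }

// LookupForHost matches only proxies forwarding to the given host IP.
func (r *Resolver) LookupForHost(host string, port int) (Proxy, bool) {
	for _, p := range r.proxies {
		if p.ForwardHost == host && p.Port == port {
			return p, true
		}
	}
	return Proxy{}, false
}

// Lookup matches without a host constraint (used for domain access).
func (r *Resolver) Lookup(port int) (Proxy, bool) {
	for _, p := range r.proxies {
		if p.Port == port {
			return p, true
		}
	}
	return Proxy{}, false
}

// Resolve picks the strict per-host lookup only when hostAddr is a
// literal IP, preventing port 8080 on one host from shadowing 8080 on
// another, while domain access still falls back to Lookup().
func (r *Resolver) Resolve(hostAddr string, port int) (Proxy, bool) {
	if net.ParseIP(hostAddr) != nil {
		return r.LookupForHost(hostAddr, port)
	}
	return r.Lookup(port)
}

func main() {
	r := &Resolver{proxies: []Proxy{
		{Domain: "a.example.com", ForwardHost: "10.0.0.1", Port: 8080},
		{Domain: "b.example.com", ForwardHost: "10.0.0.2", Port: 8080},
	}}
	p, _ := r.Resolve("10.0.0.2", 8080) // IP access: strict match
	fmt.Println(p.Domain)
}
```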
…ction

Two bugs fixed:
- API control/queue handlers routed Portainer hostIDs to cluster branch
  (hostID != "" was too broad). Added portainer: prefix exclusion to all
  7 guards in api_control.go and reordered api_queue.go routing.
- Portainer containers passed empty digest to CheckVersionedWithDigest,
  causing digestsMatch("", remoteDigest) to always report false updates.
  Now fetches real repo digests via Portainer image inspect API.
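The corrected routing guard reduces to an ordering question: check the `portainer:` prefix before the generic non-empty test. A minimal sketch (the `routeTarget` name is hypothetical):

```go
package main

import (
	"fmt"
	"strings"
)

// routeTarget shows the fixed guard order: a non-empty hostID is only a
// cluster host when it lacks the "portainer:" prefix. The old code
// tested hostID != "" first, sending Portainer IDs down the cluster path.
func routeTarget(hostID string) string {
	switch {
	case strings.HasPrefix(hostID, "portainer:"):
		return "portainer"
	case hostID != "":
		return "cluster"
	default:
		return "local"
	}
}

func main() {
	fmt.Println(routeTarget("portainer:p1:3")) // portainer
	fmt.Println(routeTarget("node-b"))         // cluster
	fmt.Println(routeTarget(""))               // local
}
```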
PullImage now properly drains the Docker streaming response to ensure
the image pull completes before container creation. Previously the
response body was closed without reading, causing create to fail with
"no such image" because the pull hadn't finished.

Also adds success history recording for remote updates (Portainer,
cluster agent, swarm). Previously only failures were recorded; the
success path only existed in the local UpdateContainer function.

Debug logging retained in Portainer scan path for ongoing diagnostics
(semverScope, digest, isLocal, up-to-date status).
Inside Docker, the gRPC server only sees container bridge IPs in its
network interfaces. Agents connecting via the host's external IP fail
TLS verification because that IP isn't in the server certificate SANs.

New SENTINEL_CLUSTER_ADVERTISE env var (or cluster_advertise DB setting)
accepts comma-separated IPs/hostnames to include as additional SANs in
the ephemeral server certificate. Also exposed in the cluster settings
API for UI configuration.
Detect when the same Docker host is reachable via multiple sources
(local socket, cluster agent, Portainer connector) and auto-block
the lower-priority source. Priority: local > cluster > Portainer.

- Proto: add engine_id field to StateReport (field 6)
- Agent: collect Engine ID on startup, include in state reports
- Hub: store local Engine ID as DB setting on boot
- Registry: persist agent Engine IDs in host state
- Portainer: probe endpoint Engine ID via Docker info API
- Web: findEngineOverlap checks local/cluster before Portainer
- Auto-block overlapping endpoints on Test Connection
- Connectors UI: show overlap reason + Force Enable button
- SSE source_overlap event for real-time dashboard notifications
- ForceAllow user override clears auto-block per endpoint
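The priority check at the heart of the overlap detection can be sketched as follows. Signature and block-reason strings are illustrative; the real `findEngineOverlap` lives in the web layer and reads Engine IDs from the DB and host registry.

```go
package main

import "fmt"

// findEngineOverlap applies the priority local > cluster > Portainer:
// given a Portainer endpoint's Engine ID, return the reason to
// auto-block it, or "" when no higher-priority source owns that engine.
func findEngineOverlap(engineID, localEngineID string, clusterEngines map[string]string) string {
	if engineID == "" {
		return "" // engine unknown: nothing to compare against
	}
	if engineID == localEngineID {
		return "duplicates direct monitoring (local Docker socket)"
	}
	for host, id := range clusterEngines {
		if id == engineID {
			return "duplicates cluster agent " + host
		}
	}
	return ""
}

func main() {
	cluster := map[string]string{"node-b": "engine-bbb"}
	fmt.Println(findEngineOverlap("engine-aaa", "engine-aaa", cluster) != "") // blocked by local
	fmt.Println(findEngineOverlap("engine-ccc", "engine-aaa", cluster) == "") // no overlap
}
```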
Two CSS issues when cluster hosts are present:
- tbody tr:last-child removed border-bottom from last row of each
  host-group, leaving no separator between groups
- section-divider border-top was on inner div (inset by td padding)
  instead of on the td itself
Swarm service rows (svc-header, svc-task-row) had an extra
<td class="col-actions"> that regular container rows and the
thead did not have. With table-layout:fixed, this created a
phantom column that stole ~300px of width, pushing the entire
table layout left and preventing dividers from spanning the
full UI width.

Closes #62
When approving updates from the Pending Updates page and navigating
to the Dashboard, containers could show "Updating" indefinitely.
The update completed and cleared the maintenance flag, but the SSE
event was published before the Dashboard's EventSource connection
was established -- a missed-event race between server-side page
render and client-side SSE subscribe.

Added a catch-up fetch in the SSE connected handler: on initial
connect, the Dashboard scans for any rows with .badge-updating and
re-fetches their current state from /api/containers/{name}/row,
picking up the cleared maintenance flag.
Agent side: detect TLS certificate errors (x509 unknown authority)
in the reconnection loop and log a clear ERROR message once with
the fix steps (stop agent, delete cluster data dir, re-enroll with
fresh token). Subsequent reconnect attempts log only the WARN
without repeating the guidance.

Cluster page: show troubleshooting section for ALL disconnected
hosts, not just those with a known disconnect category. When the
category is empty (e.g. host loaded from store after restart with
no prior disconnect event), show a generic section covering the
three common causes: agent not running, network issue, and CA
certificate mismatch after volume recreation.
LoadSetting returns ("", nil) for missing keys, which the old code
interpreted as show_stopped=false. Add a v != "" guard so a fresh
database preserves the default-true behaviour.

Closes #63
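The guard is small enough to show in full. A sketch, with the loader closure standing in for the real `LoadSetting`:

```go
package main

import "fmt"

// showStopped applies the v != "" guard described above: LoadSetting
// returns ("", nil) for a missing key, so an empty value must keep the
// default of true rather than being read as false.
func showStopped(load func(string) (string, error)) bool {
	v, err := load("show_stopped")
	if err != nil || v == "" {
		return true // fresh database: preserve default-true behaviour
	}
	return v != "false"
}

func main() {
	missing := func(string) (string, error) { return "", nil }
	off := func(string) (string, error) { return "false", nil }
	fmt.Println(showStopped(missing), showStopped(off))
}
```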
JS-created task rows had 7 cells (extra actions column) vs the 6-column
table, pushing status/ports right. Also missing col-status/col-policy
classes so centering rules didn't apply. Match the HTML template's
6-cell structure and add the correct column classes.

Closes #64
The previous commit (fa32a2a) fixed the source JS but the bundled
app.js was stale. Ran make frontend to produce the correct bundle
with col-* classes on shutdown task rows.
The "Service scaled to 0" placeholder had colspan="6" but starts at
column 2 (after the checkbox cell), making the browser allocate 7
columns in a 6-column table. With table-layout:fixed this caused
the table to shrink when any swarm service was stopped.

Changed to colspan="5" so the total (1 + 5) matches the 6-column
colgroup. The JS path in swarm.js already had the correct value.

When a service is scaled to 0, Docker removes all tasks within seconds.
Added an in-memory task cache to swarmAdapter that preserves last-seen
running tasks per service and serves them as "shutdown" when Docker
returns none. This ensures task rows with node names and SHUTDOWN badges
survive full page refreshes instead of showing a generic placeholder.
…check

- Scan() accessed u.portainerInstances without portainerMu lock at two
  sites (prune loop and len guard) while HTTP handlers mutate the slice
  concurrently. Snapshot under RLock, matching scanPortainerInstances.

- SavePortainerInstance and convertStoreInstance dropped EngineID and
  ForceAllow when converting between web and store types. ForceAllow
  loss caused manually unblocked endpoints to get re-blocked on
  reconnect.

- Replace zero-value struct comparison with map ok idiom for endpoint
  existence check.
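The snapshot-under-RLock fix follows a standard pattern: copy the slice while holding the read lock, then iterate the copy lock-free. A sketch with simplified types (the real slice holds instance structs, not strings):

```go
package main

import (
	"fmt"
	"sync"
)

// updater mirrors the field names from the description above.
type updater struct {
	portainerMu        sync.RWMutex
	portainerInstances []string
}

// snapshotInstances copies the slice under RLock so Scan() can iterate
// without holding the lock while HTTP handlers mutate the slice
// concurrently under the write lock.
func (u *updater) snapshotInstances() []string {
	u.portainerMu.RLock()
	defer u.portainerMu.RUnlock()
	return append([]string(nil), u.portainerInstances...)
}

func main() {
	u := &updater{portainerInstances: []string{"p1", "p2"}}
	snap := u.snapshotInstances()

	u.portainerMu.Lock()
	u.portainerInstances = nil // handler mutates concurrently
	u.portainerMu.Unlock()

	fmt.Println(len(snap)) // the snapshot is unaffected by the mutation
}
```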
@Will-Luck Will-Luck merged commit 4182fb5 into main Mar 13, 2026
2 checks passed