Conversation
When accessing the dashboard via the NPM domain (sen.lucknet.uk), the HTTP Host header was used for NPM ForwardHost matching. Since NPM stores IPs, not domains, no proxy hosts matched and every port fell back to sen.lucknet.uk:<port>. For local containers, use Lookup() (which matches against the resolver's configured sentinelHost IP) instead of LookupForHost() with the request domain. Cluster containers still use LookupForHost() with the remote host's IP.
NPM proxy hosts with wildcard domains like *.s3.garage.example.com produced broken URLs. The resolver now picks the first non-wildcard domain from the list, falling back to skipping the entry entirely if all domains are wildcards.
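The selection rule above can be sketched as a small pure function. This is a minimal sketch, not the actual resolver code; the name firstNonWildcard is hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// firstNonWildcard returns the first domain in the list that is not a
// wildcard pattern, and false when every entry is a wildcard — in which
// case the caller skips the proxy host entirely rather than building a
// broken URL from a pattern like *.s3.garage.example.com.
func firstNonWildcard(domains []string) (string, bool) {
	for _, d := range domains {
		if !strings.Contains(d, "*") {
			return d, true
		}
	}
	return "", false
}

func main() {
	d, ok := firstNonWildcard([]string{"*.s3.garage.example.com", "s3.garage.example.com"})
	fmt.Println(d, ok) // s3.garage.example.com true
}
```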
The filter bar on history, logs, and images pages was missing the bottom border that the dashboard had. Added explicit border-bottom to .filter-bar.
… queue entries
Portainer settings now take effect immediately without a container restart, using the same factory pattern as the NPM connector. The connection test always recreates the provider from current DB settings so token changes are picked up. Portainer endpoints that point at the same Docker socket Sentinel monitors no longer produce duplicate queue entries (container IDs are compared against the local scan). Updated help text and integration descriptions for accuracy.
The dashboard stat card used a local-only pending count that excluded Portainer queue items, while the nav badge used the full queue length. Both now use the queue length directly so they always match. Removed the checkmark icon from the zero-state as it added no value.
The detail page handler only knew about local and cluster containers. Portainer containers (host=portainer:N) fell through to the cluster lookup which returned "not found". Added a Portainer branch that extracts the endpoint ID, fetches containers from the Portainer API, and builds the detail view with policy, version, and queue info.
Portainer and cluster container detail pages looked up history and snapshots using the bare container name, but records are stored under scoped keys (e.g. "portainer:3::name"). Now uses hostFilter::name when a host filter is present.
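The scoped-key rule can be expressed as a tiny helper. A minimal sketch; the function name historyKey is hypothetical, but the key shape ("hostFilter::name", e.g. "portainer:3::name") is taken from the fix above.

```go
package main

import "fmt"

// historyKey builds the key under which history and snapshot records are
// stored: scoped as "hostFilter::name" when a host filter is present,
// the bare container name otherwise.
func historyKey(hostFilter, name string) string {
	if hostFilter == "" {
		return name
	}
	return hostFilter + "::" + name
}

func main() {
	fmt.Println(historyKey("portainer:3", "nginx")) // portainer:3::nginx
	fmt.Println(historyKey("", "nginx"))            // nginx
}
```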
Covers data model, local socket detection, connector UI, dashboard integration, engine changes, and migration path.
13 tasks across 7 chunks: store CRUD, migration, engine multi-instance scanning, local socket detection, web API, dashboard host groups, and frontend connector cards.
Adds MigratePortainerSettings() which converts the flat portainer_url/ portainer_token/portainer_enabled settings keys into a PortainerInstance record (id "p1", name "Portainer") and clears the old keys. Safe to call multiple times: skips if any instances already exist. Also adds DeleteSetting() to bolt.go.
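The migration logic can be sketched over in-memory state (the real code runs against the bolt store). The struct fields and helper shape here are assumptions; the key names, the "p1"/"Portainer" instance identity, and the skip-if-instances-exist idempotence come from the description above.

```go
package main

import "fmt"

type PortainerInstance struct {
	ID, Name, URL, Token string
	Enabled              bool
}

// migratePortainerSettings converts the flat portainer_url /
// portainer_token / portainer_enabled keys into a single instance
// (id "p1", name "Portainer") and clears the old keys. It is safe to
// call repeatedly: it is a no-op when instances already exist.
func migratePortainerSettings(settings map[string]string, instances []PortainerInstance) []PortainerInstance {
	if len(instances) > 0 {
		return instances // already migrated
	}
	url := settings["portainer_url"]
	if url == "" {
		return instances // nothing to migrate
	}
	inst := PortainerInstance{
		ID:      "p1",
		Name:    "Portainer",
		URL:     url,
		Token:   settings["portainer_token"],
		Enabled: settings["portainer_enabled"] == "true",
	}
	delete(settings, "portainer_url")
	delete(settings, "portainer_token")
	delete(settings, "portainer_enabled")
	return append(instances, inst)
}

func main() {
	s := map[string]string{"portainer_url": "https://pt.example:9443", "portainer_enabled": "true"}
	insts := migratePortainerSettings(s, nil)
	fmt.Println(insts[0].ID, insts[0].Enabled) // p1 true
}
```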
Replace single-instance Portainer settings (flat portainer_enabled/url/token
keys) with a full instance CRUD API backed by PortainerInstanceStore. The
PortainerProvider interface now takes instanceID parameters on all methods.
New routes: GET/POST /api/portainer/instances, PUT/DELETE instances/{id},
POST instances/{id}/test, GET instances/{id}/endpoints,
PUT instances/{id}/endpoints/{epid}.
Existing container detail handler updated to parse the new
"portainer:instanceID:epID" host filter format while remaining backwards
compatible with the legacy "portainer:epID" format.
Note: cmd/sentinel/ adapters will not compile until Task 7 updates them.
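The backwards-compatible host-filter parsing can be sketched as follows. The function name is hypothetical; the two formats and the migrated default instance id "p1" come from the changes described above.

```go
package main

import (
	"fmt"
	"strings"
)

// parsePortainerFilter parses the new "portainer:instanceID:epID" host
// filter while remaining compatible with the legacy "portainer:epID"
// format, which maps to the migrated instance id "p1".
func parsePortainerFilter(f string) (instanceID, epID string, ok bool) {
	if !strings.HasPrefix(f, "portainer:") {
		return "", "", false
	}
	parts := strings.Split(f, ":")
	switch len(parts) {
	case 3: // new format: portainer:instanceID:epID
		return parts[1], parts[2], true
	case 2: // legacy format: portainer:epID
		return "p1", parts[1], true
	}
	return "", "", false
}

func main() {
	i, e, _ := parsePortainerFilter("portainer:p2:5")
	fmt.Println(i, e) // p2 5
	i, e, _ = parsePortainerFilter("portainer:5")
	fmt.Println(i, e) // p1 5
}
```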
Three bugs found during live testing on the test cluster:
1. Portainer instances added via the API had no live scanner (only boot-time instances worked). Added ConnectInstance/DisconnectInstance to PortainerProvider, called from the create/update/delete handlers.
2. Portainer self-signed certs caused TLS verification failures. Added InsecureSkipVerify to the Portainer HTTP client (standard for homelab and private-network setups).
3. csrfToken is a function reference (window.csrfToken = getCSRFToken) but connectors.html passed it as a value. Changed all 11 occurrences to csrfToken() calls.
IsLocalSocket() was defined but never called. Now applied in two places:
1. Scanner.Endpoints() filters out local socket endpoints so the engine never scans them (defence in depth).
2. The Test Connection handler auto-marks new local socket endpoints as blocked with reason "local Docker socket (duplicates direct monitoring)" so users see why the endpoint is disabled.
Updated scanner tests to use TCP URLs for mock endpoints (empty URL + EndpointDocker type now correctly triggers IsLocalSocket).
Only auto-block unix:// endpoints when the Portainer instance runs on the same host as Sentinel. Previously all unix:// endpoints were blocked regardless of host, which incorrectly disabled remote Portainer instances.
- Add isLocalPortainerInstance() to compare the Portainer URL against local IPs
- Remove the over-aggressive IsLocalSocket filter from Scanner.Endpoints()
- Wire the engine into multiPortainerAdapter so runtime-added instances are scanned without a restart
- Reconnect the engine after endpoint config changes (test, update)
- Add unit tests for the local detection helpers
DetectLocalAddrs was including container-internal addresses (172.17.x.x, localhost, hostname) which never match NPM ForwardHost values. This caused Lookup() to silently filter out all proxies when SENTINEL_HOST was not set, making port chips fall back to raw IP:port links. Now only includes routable addresses: explicit SENTINEL_HOST values and Docker host IP via host.docker.internal. Returns an empty set when neither is available, which disables filtering (safe fallback matching all proxies).
…t shadowing
When hostAddr from the HTTP request is a valid IP (direct IP access or SENTINEL_HOST), use LookupForHost to match only NPM proxies forwarding to that specific host. Fall back to Lookup() when accessed via domain. Fixes a regression from 1c18dec where empty localAddrs disabled all filtering, allowing port 8080 on host A to shadow port 8080 on host B.
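The IP-vs-domain decision above reduces to a net.ParseIP check. A minimal sketch under that assumption; lookupMode is a hypothetical name standing in for the branch in the real handler, and Lookup/LookupForHost are the resolver methods named above.

```go
package main

import (
	"fmt"
	"net"
)

// lookupMode decides which resolver call serves a request: when the
// request's host address is a literal IP (direct IP access or an
// explicit SENTINEL_HOST), match only proxies forwarding to that host
// via LookupForHost; when accessed via a domain name, fall back to
// Lookup(), since NPM stores IPs rather than domains.
func lookupMode(hostAddr string) string {
	if net.ParseIP(hostAddr) != nil {
		return "LookupForHost"
	}
	return "Lookup"
}

func main() {
	fmt.Println(lookupMode("192.168.1.20"))   // LookupForHost
	fmt.Println(lookupMode("sen.lucknet.uk")) // Lookup
}
```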
…ction
Two bugs fixed:
- API control/queue handlers routed Portainer hostIDs to cluster branch
(hostID != "" was too broad). Added portainer: prefix exclusion to all
7 guards in api_control.go and reordered api_queue.go routing.
- Portainer containers passed empty digest to CheckVersionedWithDigest,
causing digestsMatch("", remoteDigest) to always report false updates.
Now fetches real repo digests via Portainer image inspect API.
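The corrected routing guard from the first bug can be sketched as a predicate. The helper name is hypothetical; the `hostID != ""` over-broad check and the portainer: prefix exclusion are from the fix above.

```go
package main

import (
	"fmt"
	"strings"
)

// isClusterHost reproduces the corrected guard: a non-empty hostID
// denotes a cluster host only when it is not a Portainer-scoped ID.
// The original check (hostID != "") was too broad and routed Portainer
// containers into the cluster branch.
func isClusterHost(hostID string) bool {
	return hostID != "" && !strings.HasPrefix(hostID, "portainer:")
}

func main() {
	fmt.Println(isClusterHost("node-2"))         // true
	fmt.Println(isClusterHost("portainer:p1:3")) // false
	fmt.Println(isClusterHost(""))               // false
}
```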
PullImage now properly drains the Docker streaming response to ensure the image pull completes before container creation. Previously the response body was closed without reading, causing create to fail with "no such image" because the pull hadn't finished. Also adds success history recording for remote updates (Portainer, cluster agent, swarm). Previously only failures were recorded; the success path only existed in the local UpdateContainer function. Debug logging retained in Portainer scan path for ongoing diagnostics (semverScope, digest, isLocal, up-to-date status).
Inside Docker, the gRPC server only sees container bridge IPs in its network interfaces. Agents connecting via the host's external IP fail TLS verification because that IP isn't in the server certificate SANs. New SENTINEL_CLUSTER_ADVERTISE env var (or cluster_advertise DB setting) accepts comma-separated IPs/hostnames to include as additional SANs in the ephemeral server certificate. Also exposed in the cluster settings API for UI configuration.
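Splitting the comma-separated advertise value into the two SAN lists (IP SANs vs DNS SANs, as crypto/x509 keeps them separate) can be sketched like this. The helper name is an assumption; the env var and its format are from the change above.

```go
package main

import (
	"fmt"
	"net"
	"strings"
)

// parseAdvertise splits a comma-separated SENTINEL_CLUSTER_ADVERTISE
// value into the IP and DNS SAN lists for the ephemeral server
// certificate (x509 Certificate.IPAddresses vs Certificate.DNSNames).
func parseAdvertise(v string) (ips []net.IP, dns []string) {
	for _, part := range strings.Split(v, ",") {
		part = strings.TrimSpace(part)
		if part == "" {
			continue
		}
		if ip := net.ParseIP(part); ip != nil {
			ips = append(ips, ip)
		} else {
			dns = append(dns, part)
		}
	}
	return ips, dns
}

func main() {
	ips, dns := parseAdvertise("203.0.113.7, hub.example.com")
	fmt.Println(len(ips), dns) // 1 [hub.example.com]
}
```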
Detect when the same Docker host is reachable via multiple sources (local socket, cluster agent, Portainer connector) and auto-block the lower-priority source. Priority: local > cluster > Portainer.
- Proto: add engine_id field to StateReport (field 6)
- Agent: collect Engine ID on startup, include it in state reports
- Hub: store the local Engine ID as a DB setting on boot
- Registry: persist agent Engine IDs in host state
- Portainer: probe the endpoint Engine ID via the Docker info API
- Web: findEngineOverlap checks local/cluster before Portainer
- Auto-block overlapping endpoints on Test Connection
- Connectors UI: show the overlap reason + a Force Enable button
- SSE source_overlap event for real-time dashboard notifications
- ForceAllow user override clears the auto-block per endpoint
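The priority rule can be sketched as a comparison helper. Function names are hypothetical; the ordering local > cluster > Portainer is from the design above.

```go
package main

import "fmt"

// sourcePriority orders the three ways the same Docker engine can be
// reached; a higher value wins and lower-priority duplicates are
// auto-blocked.
func sourcePriority(source string) int {
	switch source {
	case "local":
		return 3
	case "cluster":
		return 2
	case "portainer":
		return 1
	}
	return 0
}

// shouldBlock reports whether the candidate source should be
// auto-blocked because an existing source already covers the same
// Engine ID at equal or higher priority.
func shouldBlock(existing, candidate string) bool {
	return sourcePriority(existing) >= sourcePriority(candidate)
}

func main() {
	fmt.Println(shouldBlock("local", "portainer")) // true
	fmt.Println(shouldBlock("portainer", "local")) // false
}
```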
Two CSS issues when cluster hosts are present:
- tbody tr:last-child removed the border-bottom from the last row of each host-group, leaving no separator between groups
- The section-divider border-top was on the inner div (inset by td padding) instead of on the td itself
Swarm service rows (svc-header, svc-task-row) had an extra <td class="col-actions"> that regular container rows and the thead did not have. With table-layout:fixed, this created a phantom column that stole ~300px of width, pushing the entire table layout left and preventing dividers from spanning the full UI width. Closes #62
When approving updates from the Pending Updates page and navigating
to the Dashboard, containers could show "Updating" indefinitely.
The update completed and cleared the maintenance flag, but the SSE
event was published before the Dashboard's EventSource connection
was established -- a missed-event race between server-side page
render and client-side SSE subscribe.
Added a catch-up fetch in the SSE connected handler: on initial
connect, the Dashboard scans for any rows with .badge-updating and
re-fetches their current state from /api/containers/{name}/row,
picking up the cleared maintenance flag.
Agent side: detect TLS certificate errors (x509 unknown authority) in the reconnection loop and log a clear ERROR message once with the fix steps (stop the agent, delete the cluster data dir, re-enroll with a fresh token). Subsequent reconnect attempts log only the WARN without repeating the guidance.

Cluster page: show the troubleshooting section for ALL disconnected hosts, not just those with a known disconnect category. When the category is empty (e.g. a host loaded from the store after restart with no prior disconnect event), show a generic section covering the three common causes: agent not running, network issue, and CA certificate mismatch after volume recreation.
LoadSetting returns ("", nil) for missing keys, which the old code
interpreted as show_stopped=false. Add a v != "" guard so a fresh
database preserves the default-true behaviour.
Closes #63
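The guard reduces to: only a non-empty stored value may override the default. A minimal sketch; showStopped is a hypothetical wrapper around the LoadSetting behaviour described above.

```go
package main

import "fmt"

// showStopped applies the fix: LoadSetting yields ("", nil) for missing
// keys, so only a non-empty stored value may override the default of
// true. The old code treated "" as "false".
func showStopped(stored string) bool {
	if stored != "" {
		return stored == "true"
	}
	return true // fresh database: preserve the default-true behaviour
}

func main() {
	fmt.Println(showStopped(""))      // true
	fmt.Println(showStopped("false")) // false
	fmt.Println(showStopped("true"))  // true
}
```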
JS-created task rows had 7 cells (extra actions column) vs the 6-column table, pushing status/ports right. Also missing col-status/col-policy classes so centering rules didn't apply. Match the HTML template's 6-cell structure and add the correct column classes. Closes #64
The previous commit (fa32a2a) fixed the source JS but the bundled app.js was stale. Ran make frontend to produce the correct bundle with col-* classes on shutdown task rows.
The "Service scaled to 0" placeholder had colspan="6" but starts at column 2 (after the checkbox cell), making the browser allocate 7 columns in a 6-column table. With table-layout:fixed this caused the table to shrink when any swarm service was stopped. Changed to colspan="5" so the total (1 + 5) matches the 6-column colgroup. The JS path in swarm.js already had the correct value.
When a service is scaled to 0, Docker removes all tasks within seconds. Added an in-memory task cache to swarmAdapter that preserves the last-seen running tasks per service and serves them as "shutdown" when Docker returns none. This ensures task rows with node names and SHUTDOWN badges survive full page refreshes instead of showing a generic placeholder.
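The cache behaviour can be sketched as below. The type and method names are assumptions (the real cache lives inside swarmAdapter); the serve-cached-as-shutdown semantics are from the change above.

```go
package main

import "fmt"

type Task struct {
	Node, State string
}

// taskCache preserves the last-seen running tasks per service so that
// when Docker reports none (service scaled to 0, tasks reaped within
// seconds), rows with node names can still render with SHUTDOWN badges.
type taskCache struct {
	last map[string][]Task
}

func newTaskCache() *taskCache { return &taskCache{last: map[string][]Task{}} }

// tasksFor returns live tasks when present (caching them), or the
// cached tasks re-labelled as "shutdown" when Docker reports none.
func (c *taskCache) tasksFor(service string, live []Task) []Task {
	if len(live) > 0 {
		c.last[service] = live
		return live
	}
	cached := c.last[service]
	out := make([]Task, len(cached))
	for i, t := range cached {
		t.State = "shutdown" // t is a copy; the cache keeps the original state
		out[i] = t
	}
	return out
}

func main() {
	c := newTaskCache()
	c.tasksFor("web", []Task{{Node: "node-1", State: "running"}})
	gone := c.tasksFor("web", nil) // service scaled to 0
	fmt.Println(gone[0].Node, gone[0].State) // node-1 shutdown
}
```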
…check
- Scan() accessed u.portainerInstances without the portainerMu lock at two sites (the prune loop and the len guard) while HTTP handlers mutate the slice concurrently. Now snapshots under RLock, matching scanPortainerInstances.
- SavePortainerInstance and convertStoreInstance dropped EngineID and ForceAllow when converting between web and store types. The ForceAllow loss caused manually unblocked endpoints to be re-blocked on reconnect.
- Replaced the zero-value struct comparison with the map "ok" idiom for the endpoint existence check.
Summary
What changed
- portainer_instances store with CRUD + migration from legacy single-instance settings
- Local socket detection (IsLocalSocket) to prevent scanning the host Docker twice

Test plan
🤖 Generated with Claude Code