Skip to content

Private Container Registry Causes Edge Issues #7466

@kevin-vitro

Description

@kevin-vitro

Expected Behavior

I would expect the Images that have previously been downloaded to the machine to run as they have been whether the container registry is accessible or not.

Current Behavior

Currently, edgeAgent requires that the Container Registry is accessible before using the images that are downloaded locally.

Steps to Reproduce

Provide a detailed set of steps to reproduce the bug.

  1. Deploy containers that come from a custom Container Registry
  2. Update the password to the CR user
  3. Restart the machine so the containers need to be restarted
  4. EdgeAgent will show an unauthorized error and the containers will not start.

Context (Environment)

Output of iotedge check

Click here

Configuration checks
--------------------
√ config.yaml is well-formed - OK
√ config.yaml has well-formed connection string - OK
√ container engine is installed and functional - OK
√ config.yaml has correct hostname - OK
√ config.yaml has correct URIs for daemon mgmt endpoint - OK
√ latest security daemon - OK
√ host time is close to real time - OK
√ container time is close to host time - OK
‼ DNS server - Warning
    Container engine is not configured with DNS server setting, which may impact connectivity to IoT Hub.
    Please see https://aka.ms/iotedge-prod-checklist-dns for best practices.
    You can ignore this warning if you are setting DNS server per module in the Edge deployment.
‼ production readiness: certificates - Warning
    The Edge device is using self-signed automatically-generated development certificates.
    They will expire in 89 days (at 2025-11-09 12:33:00 UTC) causing module-to-module and downstream device communication to fail on an active deployment.
    After the certs have expired, restarting the IoT Edge daemon will trigger it to generate new development certs.
    Please consider using production certificates instead. See https://aka.ms/iotedge-prod-checklist-certs for best practices.
‼ production readiness: logs policy - Warning
    Container engine is not configured to rotate module logs which may cause it run out of disk space.
    Please see https://aka.ms/iotedge-prod-checklist-logs for best practices.
    You can ignore this warning if you are setting log policy per module in the Edge deployment.
‼ production readiness: Edge Agent's storage directory is persisted on the host filesystem - Warning
    The edgeAgent module is not configured to persist its /tmp/edgeAgent directory on the host filesystem.
    Data might be lost if the module is deleted or updated.
    Please see https://aka.ms/iotedge-storage-host for best practices.
‼ production readiness: Edge Hub's storage directory is persisted on the host filesystem - Warning
    The edgeHub module is not configured to persist its /tmp/edgeHub directory on the host filesystem.
    Data might be lost if the module is deleted or updated.
    Please see https://aka.ms/iotedge-storage-host for best practices.

Connectivity checks
-------------------
√ host can connect to and perform TLS handshake with IoT Hub AMQP port - OK
√ host can connect to and perform TLS handshake with IoT Hub HTTPS / WebSockets port - OK
√ host can connect to and perform TLS handshake with IoT Hub MQTT port - OK
√ container on the default network can connect to IoT Hub AMQP port - OK
√ container on the default network can connect to IoT Hub HTTPS / WebSockets port - OK
√ container on the default network can connect to IoT Hub MQTT port - OK
√ container on the IoT Edge module network can connect to IoT Hub AMQP port - OK
√ container on the IoT Edge module network can connect to IoT Hub HTTPS / WebSockets port - OK
√ container on the IoT Edge module network can connect to IoT Hub MQTT port - OK

17 check(s) succeeded.
5 check(s) raised warnings. Re-run with --verbose for more details.

Device Information

  • Host OS [e.g. Ubuntu 22.04, Windows Server IoT 2019]: Ubuntu 20.04.4 LTS
  • Architecture [e.g. amd64, arm32, arm64]: x86_64
  • Container OS [e.g. Linux containers, Windows containers]: Linux containers

Runtime Versions

  • aziot-edged [run iotedge version]: iotedge 1.1.15
  • Edge Agent [image tag (e.g. 1.0.0)]: 1.5
  • Edge Hub [image tag (e.g. 1.0.0)]: 1.5
  • Docker/Moby [run docker version]: 24.0.0

Note: when using Windows containers on Windows, run docker -H npipe:////./pipe/iotedge_moby_engine version instead

Logscat

aziot-edged logs

-- No entries --

edge-agent logs

<6> 2025-08-11 11:15:39.882 -04:00 [INF] - Executing command: "Command Group: (\n  [Command Group: (\n  [Prepare module {container_name}]\n  [Create module {container_name}]\n)]\n  [Start module {container_name}]\n)"
<6> 2025-08-11 11:15:39.882 -04:00 [INF] - Executing command: "Command Group: (\n  [Prepare module {container_name}]\n  [Create module {container_name}]\n)"
<3> 2025-08-11 11:15:58.212 -04:00 [ERR] - Executing command for operation ["create"] failed.
Microsoft.Azure.Devices.Edge.Agent.Edgelet.EdgeletCommunicationException- Message:Error calling prepare update for module {container_name}: Could not prepare update for module "{container_name}"
        caused by: Could not pull image {custom_cr}/{container_name}:0.1.32
        caused by: Head "https://{custom_cr}/v2/{container_name}/manifests/0.1.32": unauthorized: Invalid clientid or client secret., StatusCode:500, at:   at Microsoft.Azure.Devices.Edge.Agent.Edgelet.Version_2020_07_07.ModuleManagementHttpClient.HandleException(Exception exception, String operation) in /mnt/vss/_work/1/s/edge-agent/src/Microsoft.Azure.Devices.Edge.Agent.Edgelet/version_2020_07_07/ModuleManagementHttpClient.cs:line 232
   at Microsoft.Azure.Devices.Edge.Agent.Edgelet.Versioning.ModuleManagementHttpClientVersioned.Execute[T](Func`1 func, String operation) in /mnt/vss/_work/1/s/edge-agent/src/Microsoft.Azure.Devices.Edge.Agent.Edgelet/versioning/ModuleManagementHttpClientVersioned.cs:line 168
   at Microsoft.Azure.Devices.Edge.Agent.Edgelet.Version_2020_07_07.ModuleManagementHttpClient.PrepareUpdateAsync(ModuleSpec moduleSpec) in /mnt/vss/_work/1/s/edge-agent/src/Microsoft.Azure.Devices.Edge.Agent.Edgelet/version_2020_07_07/ModuleManagementHttpClient.cs:line 214
   at Microsoft.Azure.Devices.Edge.Agent.Edgelet.ModuleManagementHttpClient.<>c__DisplayClass26_0.<<Throttle>b__0>d.MoveNext() in /mnt/vss/_work/1/s/edge-agent/src/Microsoft.Azure.Devices.Edge.Agent.Edgelet/ModuleManagementHttpClient.cs:line 150
--- End of stack trace from previous location ---
   at Microsoft.Azure.Devices.Edge.Agent.Edgelet.ModuleManagementHttpClient.Throttle[T](Func`1 identityOperation) in /mnt/vss/_work/1/s/edge-agent/src/Microsoft.Azure.Devices.Edge.Agent.Edgelet/ModuleManagementHttpClient.cs:line 165
   at Microsoft.Azure.Devices.Edge.Agent.Core.Commands.GroupCommand.ExecuteAsync(CancellationToken token) in /mnt/vss/_work/1/s/edge-agent/src/Microsoft.Azure.Devices.Edge.Agent.Core/commands/GroupCommand.cs:line 35
   at Microsoft.Azure.Devices.Edge.Agent.Core.LoggingCommandFactory.LoggingCommand.ExecuteAsync(CancellationToken token) in /mnt/vss/_work/1/s/edge-agent/src/Microsoft.Azure.Devices.Edge.Agent.Core/LoggingCommandFactory.cs:line 61
<3> 2025-08-11 11:15:58.213 -04:00 [ERR] - Executing command for operation ["Command Group: (\n  [Command Group: (\n  [Prepare module {container_name}]\n  [Create module {container_name}]\n)]\n  [Start module {container_name}]\n)"] failed.
Microsoft.Azure.Devices.Edge.Agent.Edgelet.EdgeletCommunicationException- Message:Error calling prepare update for module {container_name}: Could not prepare update for module "{container_name}"
        caused by: Could not pull image {custom_cr}/{container_name}:0.1.32
        caused by: Head "https://{custom_cr}/v2/{container_name}/manifests/0.1.32": unauthorized: Invalid clientid or client secret., StatusCode:500, at:   at Microsoft.Azure.Devices.Edge.Agent.Edgelet.Version_2020_07_07.ModuleManagementHttpClient.HandleException(Exception exception, String operation) in /mnt/vss/_work/1/s/edge-agent/src/Microsoft.Azure.Devices.Edge.Agent.Edgelet/version_2020_07_07/ModuleManagementHttpClient.cs:line 232
   at Microsoft.Azure.Devices.Edge.Agent.Edgelet.Versioning.ModuleManagementHttpClientVersioned.Execute[T](Func`1 func, String operation) in /mnt/vss/_work/1/s/edge-agent/src/Microsoft.Azure.Devices.Edge.Agent.Edgelet/versioning/ModuleManagementHttpClientVersioned.cs:line 168
   at Microsoft.Azure.Devices.Edge.Agent.Edgelet.Version_2020_07_07.ModuleManagementHttpClient.PrepareUpdateAsync(ModuleSpec moduleSpec) in /mnt/vss/_work/1/s/edge-agent/src/Microsoft.Azure.Devices.Edge.Agent.Edgelet/version_2020_07_07/ModuleManagementHttpClient.cs:line 214
   at Microsoft.Azure.Devices.Edge.Agent.Edgelet.ModuleManagementHttpClient.<>c__DisplayClass26_0.<<Throttle>b__0>d.MoveNext() in /mnt/vss/_work/1/s/edge-agent/src/Microsoft.Azure.Devices.Edge.Agent.Edgelet/ModuleManagementHttpClient.cs:line 150
--- End of stack trace from previous location ---
   at Microsoft.Azure.Devices.Edge.Agent.Edgelet.ModuleManagementHttpClient.Throttle[T](Func`1 identityOperation) in /mnt/vss/_work/1/s/edge-agent/src/Microsoft.Azure.Devices.Edge.Agent.Edgelet/ModuleManagementHttpClient.cs:line 165
   at Microsoft.Azure.Devices.Edge.Agent.Core.Commands.GroupCommand.ExecuteAsync(CancellationToken token) in /mnt/vss/_work/1/s/edge-agent/src/Microsoft.Azure.Devices.Edge.Agent.Core/commands/GroupCommand.cs:line 35
   at Microsoft.Azure.Devices.Edge.Agent.Core.LoggingCommandFactory.LoggingCommand.ExecuteAsync(CancellationToken token) in /mnt/vss/_work/1/s/edge-agent/src/Microsoft.Azure.Devices.Edge.Agent.Core/LoggingCommandFactory.cs:line 61
   at Microsoft.Azure.Devices.Edge.Agent.Core.Commands.GroupCommand.ExecuteAsync(CancellationToken token) in /mnt/vss/_work/1/s/edge-agent/src/Microsoft.Azure.Devices.Edge.Agent.Core/commands/GroupCommand.cs:line 35
   at Microsoft.Azure.Devices.Edge.Agent.Core.LoggingCommandFactory.LoggingCommand.ExecuteAsync(CancellationToken token) in /mnt/vss/_work/1/s/edge-agent/src/Microsoft.Azure.Devices.Edge.Agent.Core/LoggingCommandFactory.cs:line 61
<3> 2025-08-11 11:15:58.214 -04:00 [ERR] - Step failed in deployment 33, continuing execution. Failure when running command Command Group: (
  [Command Group: (
  [Prepare module {container_name}]
  [Create module {container_name}]
)]
  [Start module {container_name}]
). Will retry in -04s.
edge-hub logs These logs are unavailable because when the issue was resolved, the edgeHub was re-deployed with a new container

Additional Information

Please let me know if there is any other information that would be helpful. edgeHub and aziot-edged both gave pretty unhelpful logs for this because I fixed the issue before capturing the logs but I will try to catch them next time before updating the container registry credentials.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions