Issue: NIM Container Incompatibility on DGX Spark (ARM64 Architecture)
Environment
- Hardware: NVIDIA DGX Spark
- CPU Architecture: ARM64 (aarch64)
- CPU Model: Cortex-X925 + Cortex-A725
- CPU Cores: 20
- OS: DGX OS 7.3.1 (Ubuntu-based)
- Kernel: 6.14.0-1015-nvidia (aarch64)
- Docker: Docker Engine with NVIDIA Container Runtime
- Project: workbench-example-agentic-rag
- Location: China
Problem Description
The local NIM container (local-nim service in compose.yaml) fails to start on DGX Spark due to architecture incompatibility.
Issue 1: NGC Official Repository Access Denied (China Region)
When attempting to use the official NGC image:
image: nvcr.io/nim/meta/llama-3.1-8b-instruct:latest
Error:
Error response from daemon: pull access denied for nvcr.io/nim/meta/llama-3.1-8b-instruct,
repository does not exist or may require 'docker login': denied:
CND required
As of February 15, 2025, all API/CLI requests from Chinese IP addresses to download
NIM images and models will be rejected. For Chinese users, please use the following
link to download NIM images and models from our China NIM partners:
https://catalog.ngc.nvidia.com/china-nim-distributors
(Registry error message, translated from Chinese.)
Root Cause: Geographic IP-based access restriction implemented by NVIDIA NGC for Chinese users.
Issue 2: Architecture Incompatibility with China Mirror
Following NVIDIA's guidance, we obtained credentials for the China partner registry (io.chancloud.com) and attempted to use:
image: io.chancloud.com/cnd-enterprise/llama-3.1-8b-instruct-s3:1.8
Error:
The requested image's platform (linux/amd64) does not match the detected host platform (linux/arm64/v8)
...
exec /opt/nvidia/nvidia_entrypoint.sh: exec format error
Root Cause: The China mirror NIM image is compiled for AMD64 (x86_64) architecture, which is incompatible with DGX Spark's ARM64 (aarch64) architecture.
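This mismatch can be caught before any pull attempt. A minimal preflight sketch (the architecture-to-platform mapping below is standard Docker convention, not project code):

```shell
# Map the host architecture reported by uname -m to Docker's platform string.
# DGX Spark reports aarch64, i.e. linux/arm64; the mirror image is linux/amd64.
host_arch=$(uname -m)
case "$host_arch" in
  x86_64)         platform="linux/amd64" ;;
  aarch64|arm64)  platform="linux/arm64" ;;
  *)              platform="unknown" ;;
esac
echo "host platform: $platform"
# An image built only for a different platform fails at exec time
# with 'exec format error', as seen in the logs below.
```

Comparing this value against the image's reported `Architecture` avoids a failed `docker compose up`.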
Technical Verification
1. System Architecture Confirmation
$ uname -m
aarch64
$ lscpu | grep Architecture
Architecture: aarch64
2. Image Architecture Inspection
$ docker image inspect io.chancloud.com/cnd-enterprise/llama-3.1-8b-instruct-s3:1.8 | grep Architecture
"Architecture": "amd64",3. Manifest Verification
$ docker manifest inspect io.chancloud.com/cnd-enterprise/llama-3.1-8b-instruct-s3:1.8
Returns a single-architecture manifest (v2), not a multi-architecture manifest list. This confirms the image only supports AMD64.
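For reference, a manifest list can be distinguished from a single-arch manifest by its mediaType. A small sketch (the value below is hard-coded to the plain v2 manifest type the mirror image returns; in practice it would come from the `docker manifest inspect` output):

```shell
# Multi-arch images publish a manifest *list* (or OCI image index);
# single-arch images publish a plain v2 manifest, as seen here.
media_type="application/vnd.docker.distribution.manifest.v2+json"
case "$media_type" in
  *manifest.list.v2+json*|*oci.image.index*) kind="multi-arch" ;;
  *)                                         kind="single-arch" ;;
esac
echo "$kind"
```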
4. Container Startup Failure
$ docker compose up -d
local-nim The requested image's platform (linux/amd64) does not match the detected host platform (linux/arm64/v8)
$ docker compose logs local-nim
exec /opt/nvidia/nvidia_entrypoint.sh: exec format error
Even attempting to run basic shell commands fails:
$ docker run --rm --entrypoint /bin/sh io.chancloud.com/cnd-enterprise/llama-3.1-8b-instruct-s3:1.8 -c "uname -m"
exec /bin/sh: exec format error
Impact
For users in China with DGX Spark (or any ARM64-based system):
- Cannot access official NGC NIM images due to geographic restrictions
- Cannot use China partner NIM images due to architecture incompatibility
- Cannot run local NIM inference as described in the "Advanced Mode" documentation
Current Workaround
The project functions correctly using NVIDIA cloud API endpoints (Easy Mode):
- Uses NVIDIA_API_KEY for cloud-based inference via build.nvidia.com
- Embedding service works via NVIDIA API endpoints
- No local GPU inference required
However, this requires:
- Stable internet connectivity
- API quota management
- Higher latency compared to local inference
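For completeness, a connectivity check for the cloud path. This is a sketch: the endpoint and model name follow NVIDIA's hosted OpenAI-compatible API and may differ from what the project configures, and the curl call is left commented out because it requires a valid NVIDIA_API_KEY.

```shell
# Hosted NIM endpoint (assumption: OpenAI-compatible chat completions API).
ENDPOINT="https://integrate.api.nvidia.com/v1/chat/completions"
MODEL="meta/llama-3.1-8b-instruct"
# curl -s "$ENDPOINT" \
#   -H "Authorization: Bearer $NVIDIA_API_KEY" \
#   -H "Content-Type: application/json" \
#   -d "{\"model\": \"$MODEL\", \"messages\": [{\"role\": \"user\", \"content\": \"ping\"}]}"
echo "would query $MODEL at $ENDPOINT"
```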
Suggested Solutions
Option 1: Multi-Architecture NIM Images (Recommended)
Build and distribute ARM64 versions of NIM containers via:
- Official NGC repository: nvcr.io/nim/meta/llama-3.1-8b-instruct:latest
- China partner registry: io.chancloud.com/cnd-enterprise/llama-3.1-8b-instruct-s3
Implement multi-architecture support using Docker manifest lists:
docker manifest create nvcr.io/nim/meta/llama-3.1-8b-instruct:latest \
--amend nvcr.io/nim/meta/llama-3.1-8b-instruct:latest-amd64 \
--amend nvcr.io/nim/meta/llama-3.1-8b-instruct:latest-arm64
Option 2: Documentation Update
Update the project documentation to clarify:
- DGX Spark uses ARM64 architecture
- Current NIM images only support AMD64
- For DGX Spark users:
- Use cloud API mode (Easy Mode) - currently supported ✅
- Use Ollama with ARM64-compatible models (alternative local inference)
- Wait for ARM64-compatible NIM releases
Option 3: Ollama Integration Guide
Provide official guidance for using Ollama as a local inference alternative:
# Install Ollama (native ARM64 support)
curl -fsSL https://ollama.com/install.sh | sh
# Download Llama 3.1 8B
ollama pull llama3.1:8b
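After pulling, the local endpoint can be smoke-tested. A sketch (assumes `ollama serve` is running; the curl line is commented out so nothing here needs a live server):

```shell
# Ollama exposes an OpenAI-compatible API on port 11434.
OLLAMA_URL="http://localhost:11434/v1/chat/completions"
PAYLOAD='{"model": "llama3.1:8b", "messages": [{"role": "user", "content": "Hello"}]}'
# curl -s "$OLLAMA_URL" -H "Content-Type: application/json" -d "$PAYLOAD"
echo "would POST to $OLLAMA_URL"
```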
# Configure app to use Ollama endpoint
# Endpoint: http://localhost:11434/v1
# Model: llama3.1:8b
References
- DGX Spark Official Documentation: https://docs.nvidia.com/dgx/dgx-spark/
- DGX Spark Architecture: ARM Cortex-X925 + Cortex-A725 (aarch64)
- Arm Learning Paths - DGX Spark: https://learn.arm.com/learning-paths/laptops-and-desktops/dgx_spark_llamacpp/
- NGC China Distributors: https://catalog.ngc.nvidia.com/china-nim-distributors
Additional Context
DGX Spark represents NVIDIA's expansion into ARM-based AI computing with the GB10 Grace-Blackwell Superchip. As this platform gains adoption, ARM64 support for NIM containers will become increasingly important for the NVIDIA AI ecosystem, particularly for:
- Edge AI deployments
- Power-efficient data centers
- Regions with limited cloud connectivity
- Users subject to geographic API restrictions
Environment Details
System: Linux spark-7663 6.14.0-1015-nvidia #15-Ubuntu SMP PREEMPT_DYNAMIC
Architecture: aarch64
CPU: ARM Cortex-X925 + Cortex-A725 (20 cores)
Docker Version: (output of docker --version)
NVIDIA Container Runtime: enabled
Project Version: main branch (latest)
Would appreciate guidance on:
- Timeline for ARM64 NIM image availability
- Official recommendations for DGX Spark users in China
- Whether this is a known limitation in the project documentation