This document provides instructions on how to run the Immich Machine Learning (ML) service natively on a Mac (leveraging
Apple's Core ML for performance) and how to integrate such an instance into a load-balanced setup using the
immich_ml_balancer Docker container.
Run the ML service natively so that ONNX Runtime uses Apple's Core ML execution provider; no Docker needed. This is ideal for leveraging the Neural Engine on Apple Silicon Macs for faster ML inference.
- **System Preparation (once)**

  ```bash
  xcode-select --install           # compilers & headers
  brew install git pyenv pipx uv   # uv = fast dep solver
  pyenv install 3.12.2 && pyenv global 3.12.2
  ```
- **Grab Immich's ML Code**

  ```bash
  git clone --depth 1 https://github.com/immich-app/immich.git
  cd immich/machine-learning
  ```
- **Force Core ML to be Tried First**

  Edit `immich_ml/models/constants.py` and move `"CoreMLExecutionProvider"` to the top of `SUPPORTED_PROVIDERS`.
- **Clean Virtual Environment with Only Runtime Dependencies**

  ```bash
  uv venv .venv --python $(pyenv which python)
  UV_HTTP_TIMEOUT=120 uv sync --extra cpu --no-dev
  source .venv/bin/activate
  ```
- **Verify the Provider List**

  ```bash
  python - <<'PY'
  import onnxruntime as ort
  print(ort.get_available_providers())
  # ['CoreMLExecutionProvider', 'CPUExecutionProvider']
  PY
  ```
- **Launch the Service**

  ```bash
  export MACHINE_LEARNING_CACHE_FOLDER=$HOME/.immich-model-cache
  mkdir -p "$MACHINE_LEARNING_CACHE_FOLDER"
  python -m immich_ml   # listens on :3003
  ```
- **Smoke Test**

  ```bash
  curl -s http://192.168.0.<mac-ip>:3003/ping   # → pong
  ```
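If you later balance across several machines, the same smoke test can be looped over every backend. A minimal sketch, assuming `curl` is available (the hostnames below are examples, not required values):

```shell
# Ping a list of (example) immich_ml backends and report which respond.
check_backends() {
  for host in "$@"; do
    if curl -fs --max-time 2 "http://$host/ping" >/dev/null 2>&1; then
      echo "$host: ok"
    else
      echo "$host: unreachable"
    fi
  done
}

check_backends desktop.local:3003 192.168.0.123:3003
```

Any backend that reports `unreachable` is simply skipped by the balancer until it responds again.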
That’s it: the Neural Engine on your Apple Silicon Mac (M1, M2, M3, ...) now handles all Immich ML tasks at full speed.
This section explains how to integrate your natively running Mac ML instance (or any other immich_ml instance) into a
load-balanced setup using the immich_ml_balancer Docker container.
The immich_ml_balancer is a lightweight Nginx-based Docker image designed to distribute ML inference requests across
multiple immich_ml service instances. It supports dynamic backend configuration and gracefully handles instances that
might be sporadically available.
To build the Docker image for the balancer, navigate to the immich_ml_balancer directory (where this README is
located) and run the following command:
```bash
docker build --platform linux/amd64 -t apetersson/immich_ml_balancer:latest .
```

Replace `apetersson` with your Docker Hub username if you plan to push it.
After building, you can push the image to your Docker Hub repository to make it publicly available:
- Log in to Docker Hub (if not already logged in):

  ```bash
  docker login
  ```

- Push the image:

  ```bash
  docker push apetersson/immich_ml_balancer:latest
  ```
To use the balancer, you'll update your main Immich docker-compose.yml file. This example assumes you have a local
immich-machine-learning instance (e.g., on your NAS) and your Mac (desktop.local) as backends.
Then, add the immich-ml-balancer service and configure its backends using the IMML_BACKENDS environment variable.
Also, ensure your local immich-machine-learning instance is defined as a service that the balancer can reach.
```yaml
services:
  # ... your existing immich-server, redis, database, backup services ...

  immich-ml-balancer:
    container_name: immich_ml_balancer
    image: apetersson/immich_ml_balancer:latest # your custom balancer image
    environment:
      # Comma-separated list of immich_ml instance hostnames/IPs.
      # Format: "hostname_or_ip[:port]"
      # If port is omitted, it defaults to 3003.
      IMML_BACKENDS: "immich-machine-learning,192.168.0.123:3003,desktop.local:3003" # example with local and remote backends
    # No ports mapping needed here: the Immich server reaches the balancer on port 80 over the internal network.
    # Expose Avahi so .local names resolve on your LAN:
    volumes:
      - /var/run/dbus:/var/run/dbus # DBus system bus
      - /var/run/avahi-daemon/socket:/var/run/avahi-daemon/socket
    depends_on:
      - immich-machine-learning # depend on at least one ML instance for startup order
    restart: always

  # Your existing immich-machine-learning container
  immich-machine-learning:
    container_name: immich_machine_learning
    image: ghcr.io/immich-app/immich-machine-learning:release-openvino
    volumes:
      - ${MODEL_CACHE_LOCATION}:/cache
    devices:
      - /dev/dri:/dev/dri
    device_cgroup_rules:
      - 'c 189:* rmw'
    env_file:
      - .env
    restart: always
    # Optionally limit RAM to keep the machine happy
    deploy:
      resources:
        limits:
          memory: 5G
    # No ports mapping needed here: the balancer reaches it over the internal network.
```

Then, ensure your `immich-server` service points to the balancer. You can set this in the YAML or via an `.env` file:
```yaml
services:
  immich-server:
    # ... existing configuration ...
    environment:
      # Point the Immich server to the balancer
      IMMICH_MACHINE_LEARNING_URL: http://immich-ml-balancer:80
    # ... rest of immich-server config ...
```

The `IMML_BACKENDS` variable is a comma-separated string of your immich_ml backend instances. Each entry can be:
- `hostname_or_ip`: if no port is specified, the balancer assumes the `immich_ml` instance is listening on port `3003`.
- `hostname_or_ip:port`: to specify a custom port for a backend.
Examples:

```yaml
IMML_BACKENDS: "my-local-ml,192.168.1.100:8000,another-server.local"
```

- `my-local-ml` will be accessed on port `3003`.
- `192.168.1.100` will be accessed on port `8000`.
- `another-server.local` will be accessed on port `3003`.
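For intuition, the port-defaulting rule can be sketched in plain shell. This mimics what the balancer's entrypoint presumably does when it expands `IMML_BACKENDS` into Nginx `server` lines; the actual entrypoint logic is an assumption, not taken from the image:

```shell
# Expand an IMML_BACKENDS-style list into one "server" line per backend,
# appending the default port 3003 when no port is given.
IMML_BACKENDS="my-local-ml,192.168.1.100:8000,another-server.local"

echo "$IMML_BACKENDS" | tr ',' '\n' | while read -r backend; do
  case "$backend" in
    *:*) echo "server $backend;" ;;       # explicit port
    *)   echo "server $backend:3003;" ;;  # default port
  esac
done
# → server my-local-ml:3003;
# → server 192.168.1.100:8000;
# → server another-server.local:3003;
```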
The Nginx configuration includes proxy_next_upstream directives. This means if a backend instance listed in
IMML_BACKENDS is temporarily unavailable (e.g., offline, unresponsive), the balancer will automatically try the next
available instance in the list. When an offline instance comes back, Nginx will eventually detect it and resume
sending requests to it.
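The generated Nginx config is roughly of this shape; this is a sketch only, assuming standard `upstream`/`proxy_next_upstream` usage (backend names, timeouts, and the retried status codes are illustrative, not taken from the image):

```nginx
upstream imml_backends {
    # One entry per IMML_BACKENDS item; a failed server is marked down
    # for fail_timeout seconds before Nginx probes it again.
    server immich-machine-learning:3003 max_fails=1 fail_timeout=10s;
    server desktop.local:3003           max_fails=1 fail_timeout=10s;
}

server {
    listen 80;
    location / {
        proxy_pass http://imml_backends;
        # If a backend errors out or times out, retry the request
        # on the next backend in the upstream list.
        proxy_next_upstream error timeout http_502 http_503;
    }
}
```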
Once the hosts are receiving jobs, tweak the "Smart Search concurrency" setting; personally, I get the best results with 8 parallel jobs.
To watch the Neural Engine load on the Mac, run:

```bash
sudo powermetrics --samplers ane_power,cpu_power,gpu_power -i 1000
```

This should output something like:

```
CPU Power: 10075 mW
GPU Power: 34 mW
ANE Power: 990 mW
Combined Power (CPU + GPU + ANE): 11098 mW
```
Note that thermal throttling may set in rather quickly on some devices.