[Issue]: Atom fails to run #192

@amnamasood-amd

Problem Description

Hi, I am trying to run the DeepSeek-R1 model using the command in the README. However, I get the following error:
Traceback (most recent call last):
  File "/usr/lib/python3.12/multiprocessing/process.py", line 314, in _bootstrap
    self.run()
  File "/usr/lib/python3.12/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/opt/venv/lib/python3.12/site-packages/atom/model_engine/async_proc.py", line 59, in __init__
    runner_class = resolve_obj_by_qualname(runner_qualname) # type: ignore
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/venv/lib/python3.12/site-packages/atom/utils/__init__.py", line 552, in resolve_obj_by_qualname
    module = importlib.import_module(module_name)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.12/importlib/__init__.py", line 90, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<frozen importlib._bootstrap>", line 1387, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1360, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1331, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 935, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 995, in exec_module
  File "<frozen importlib._bootstrap>", line 488, in _call_with_frames_removed
  File "/opt/venv/lib/python3.12/site-packages/atom/model_engine/model_runner.py", line 24, in <module>
    from atom.model_loader.loader import load_model
  File "/opt/venv/lib/python3.12/site-packages/atom/model_loader/loader.py", line 28, in <module>
    from atom.models.deepseek_mtp import (
  File "/opt/venv/lib/python3.12/site-packages/atom/models/deepseek_mtp.py", line 18, in <module>
    from .deepseek_v2 import DeepseekV2DecoderLayer
  File "/opt/venv/lib/python3.12/site-packages/atom/models/deepseek_v2.py", line 48, in <module>
    from aiter.ops.triton.fused_mxfp4_quant import (
ImportError: cannot import name 'fused_reduce_rms_mxfp4_quant' from 'aiter.ops.triton.fused_mxfp4_quant' (/opt/venv/lib/python3.12/site-packages/aiter/ops/triton/fused_mxfp4_quant.py)
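
The ImportError above says that the installed `aiter` package does not export `fused_reduce_rms_mxfp4_quant`, which `atom/models/deepseek_v2.py` tries to import. A minimal sketch for confirming this from inside the container, using only the standard library. The helper name `check_symbol` is hypothetical; the module path and symbol name are copied from the traceback.

```python
import importlib


def check_symbol(module_name: str, symbol: str) -> tuple[bool, bool]:
    """Return (module_importable, symbol_present) without raising ImportError."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # The module is missing, or one of its own imports failed.
        return False, False
    return True, hasattr(module, symbol)


if __name__ == "__main__":
    # Module path and symbol name taken from the ImportError in the traceback.
    found_module, found_symbol = check_symbol(
        "aiter.ops.triton.fused_mxfp4_quant",
        "fused_reduce_rms_mxfp4_quant",
    )
    print(f"module importable: {found_module}, symbol present: {found_symbol}")
```

If the module imports but the symbol is absent, the installed aiter build and the ATOM checkout likely disagree about which kernels exist. That is an inference from the error message, not something confirmed in this report.
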
Traceback (most recent call last):
  File "/usr/lib/python3.12/weakref.py", line 666, in _exitfunc
    f()
  File "/usr/lib/python3.12/weakref.py", line 590, in __call__
    return info.func(*info.args, **(info.kwargs or {}))
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/venv/lib/python3.12/site-packages/atom/model_engine/async_proc.py", line 69, in exit
    for el in self.runners:
              ^^^^^^^^^^^^
AttributeError: 'AsyncIOProc' object has no attribute 'runners'
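
The second traceback looks like a follow-on failure rather than a separate bug: the cleanup callback iterates `self.runners` on an object whose constructor apparently raised before that attribute was assigned. A self-contained sketch of the pattern and one defensive way to write the cleanup. The class `Proc` and its contents are hypothetical stand-ins, not ATOM's actual `AsyncIOProc` code.

```python
class Proc:
    """Hypothetical stand-in for a process wrapper that owns runner objects."""

    def __init__(self, fail_early: bool = False):
        if fail_early:
            # Simulates an import error raised before self.runners exists.
            raise ImportError("simulated failure before runners is set")
        self.runners = ["runner-0", "runner-1"]

    def exit(self) -> list:
        """Clean up runners, tolerating a partially constructed object."""
        stopped = []
        # getattr with a default avoids AttributeError when __init__ did not
        # get far enough to assign self.runners.
        for el in getattr(self, "runners", ()):
            stopped.append(el)
        return stopped


if __name__ == "__main__":
    print(Proc().exit())
    # Proc.__new__ creates an instance without running __init__, which mimics
    # an object whose constructor failed early.
    print(Proc.__new__(Proc).exit())
```

With a guard like this, only the ImportError would surface, which is the error that actually needs fixing.
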

Operating System

Ubuntu 24.04

CPU

AMD EPYC 9575F 64-Core Processor

GPU

AMD Instinct MI355X

ROCm Version

ROCm 7.0.2

ROCm Component

No response

Steps to Reproduce

No response

(Optional for Linux users) Output of /opt/rocm/bin/rocminfo --support

docker pull rocm/pytorch:rocm7.0.2_ubuntu24.04_py3.12_pytorch_release_2.8.0
docker run -it --network=host \
  --device=/dev/kfd \
  --device=/dev/dri \
  --group-add video \
  --cap-add=SYS_PTRACE \
  --security-opt seccomp=unconfined \
  -v $HOME:/home/$USER \
  -v /mnt:/mnt \
  -v /data:/data \
  --shm-size=16G \
  --ulimit memlock=-1 \
  --ulimit stack=67108864 \
  rocm/pytorch:rocm7.0.2_ubuntu24.04_py3.12_pytorch_release_2.8.0
pip install amd-aiter
git clone https://github.com/ROCm/ATOM.git; cd ./ATOM; pip install .
python -m atom.entrypoints.openai_server --kv_cache_dtype fp8 -tp 8 --model deepseek-ai/DeepSeek-R1
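
Because the steps above install `amd-aiter` from PyPI and ATOM from a fresh git checkout, recording the exact installed versions may help whoever triages this. A small sketch using `importlib.metadata` from the standard library. The helper name `installed_versions` is hypothetical, `amd-aiter` comes from the `pip install` line above, and `atom` is only a guess at the distribution name that `pip install .` registers for ATOM.

```python
from importlib import metadata


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    # "atom" is an assumed distribution name; adjust it if pip lists ATOM
    # under a different name.
    for name, version in installed_versions(["amd-aiter", "torch", "atom"]).items():
        print(f"{name}: {version or 'not installed'}")
```
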

Additional Information

No response
