Hi Team,
I'm successfully using OpenMoonRay 1.7 on Gentoo on an AMD EPYC 9654 workstation (Ebuild here, patches to OMR here). I'm working on building out a render farm, and I hope to use the well-priced NVIDIA GH200 platform on Vultr as an on-demand Arras compute node.
I've built an OMR Docker image with OptiX and CUDA for ARM64 Neoverse V2, the CPU cores used in the NVIDIA GH200 (-march=armv9-a -mcpu=neoverse-v2 -mtune=neoverse-v2). The patches I've made against OMR's source are here and the ebuild, slightly modified from the previous example, is here.
I'm launching my Docker container like so:
docker run -it -v /root:/root --runtime=nvidia --gpus=all -e NVIDIA_DRIVER_CAPABILITIES=graphics,compute,utility openmoonray-arm64
My bash environment looks like this:
NVIDIA_VISIBLE_DEVICES=all
REZ_MOONRAY_ROOT=/opt/openmoonray
PWD=/root/example_scenes/pbrt_scenes/country_kitchen
NVIDIA_DRIVER_CAPABILITIES=graphics,compute,utility
HOME=/root
LS_COLORS=<trimmed>
RDL2_DSO_PATH=/opt/openmoonray/rdl2dso
MOONRAY_ROOT=/opt/openmoonray
MOONRAY_CLASS_PATH=/opt/openmoonray/shader_json
TERM=xterm
SHLVL=1
ARRAS_SESSION_PATH=/opt/openmoonray/sessions
PATH=/opt/openmoonray/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
OLDPWD=/root/example_scenes
_=/usr/sbin/env
The processor looks like this in /proc/cpuinfo (last of 72 entries shown):
processor : 71
BogoMIPS : 2000.00
Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm jscvt fcma lrcpc dcpop sha3 sm3 sm4 asimddp sha512 sve asimdfhm dit uscat ilrcpc flagm ssbs sb paca pacg dcpodp sve2 sveaes svepmull svebitperm svesha3 svesm4 flagm2 frint svei8mm svebf16 i8mm bf16 dgh bti
CPU implementer : 0x41
CPU architecture: 8
CPU variant : 0x0
CPU part : 0xd4f
CPU revision : 0
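One way to confirm that the vector instruction sets a build targets are actually reported by the kernel is to tokenize the Features line. The following is a minimal C++ sketch, not part of OpenMoonRay; the helper names `exHasCpuFeature` and `exReadFeaturesLine` are hypothetical. On AArch64 Linux, `asimd` is the flag name for NEON (Advanced SIMD).

```cpp
#include <cassert>   // for the usage example in the note below
#include <cstddef>
#include <fstream>
#include <sstream>
#include <string>

// Hypothetical helper: true if `flag` appears as a whole token in a
// /proc/cpuinfo line of the form "Features : fp asimd ...".
inline bool exHasCpuFeature(const std::string& featuresLine,
                            const std::string& flag) {
    // Skip the "Features :" label so it is never matched as a flag.
    const std::size_t colon = featuresLine.find(':');
    std::istringstream tokens(colon == std::string::npos
                                  ? featuresLine
                                  : featuresLine.substr(colon + 1));
    std::string token;
    while (tokens >> token) {
        if (token == flag) return true;  // exact match, so "sve" != "sve2"
    }
    return false;
}

// Hypothetical helper: returns the first line of `path` that starts with
// "Features", or an empty string if the file is missing or has no such line.
inline std::string exReadFeaturesLine(
        const std::string& path = "/proc/cpuinfo") {
    std::ifstream in(path);
    std::string line;
    while (std::getline(in, line)) {
        if (line.rfind("Features", 0) == 0) return line;
    }
    return std::string();
}
```

Inside the container, something like `assert(exHasCpuFeature(exReadFeaturesLine(), "asimd"));` would then fail early if the expected flag is not exposed.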
I'm running the test render on the country kitchen scene with:
moonray -debug -exec_mode xpu -in scene.rdla -in scene.rdlb -out arm64.exr
The output is in the attached file kitchen.log.
I'd love to contribute a coherent patch once I get this working. The majority of the changes I've made so far are about changing __APPLE__ to __ARM_NEON__ in the appropriate places, and separating the concerns of ARM on Darwin from those of ARM on Linux. It's been a whirlwind trying to get this far, and compiling on qemu has made this process slower than usual :)
Where is a good place to start with debugging this? Since scalar mode works, I imagine I made some mistakes in my patching related to the vector and XPU modes. I also assume, after reading the code, that Apple hasn't been tested with OptiX at all and we're in uncharted waters.
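To illustrate the kind of separation I mean, here is a minimal C++ sketch that keys architecture, SIMD support, and operating system on independent macros instead of letting __APPLE__ stand in for "ARM". The `EX_*` macros and `ex*` functions are hypothetical names for illustration, not identifiers from the OMR source. ACLE spells the NEON macro `__ARM_NEON`; the sketch also tests the older `__ARM_NEON__` spelling, on the assumption that not every compiler defines both.

```cpp
#include <cassert>
#include <string>

// Architecture: independent of the operating system.
#if defined(__aarch64__) || defined(_M_ARM64)
#define EX_ARCH_ARM64 1
#else
#define EX_ARCH_ARM64 0
#endif

// SIMD capability: accept both the ACLE and the legacy spelling.
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#define EX_HAS_NEON 1
#else
#define EX_HAS_NEON 0
#endif

// Operating system: independent of the architecture.
#if defined(__APPLE__)
#define EX_OS_DARWIN 1
#else
#define EX_OS_DARWIN 0
#endif

#if defined(__linux__)
#define EX_OS_LINUX 1
#else
#define EX_OS_LINUX 0
#endif

// Hypothetical dispatch tag for choosing a vector code path.
enum class ExSimdPath { Neon, Generic };

// Selects the SIMD path from the capability macro only, never from the OS.
inline ExSimdPath exSimdPath() {
    return EX_HAS_NEON ? ExSimdPath::Neon : ExSimdPath::Generic;
}

// Returns a label such as "arm64-linux" for the compile-time target.
inline std::string exPlatformName() {
    // A translation unit cannot target both operating systems at once.
    assert(!(EX_OS_DARWIN && EX_OS_LINUX));
    const std::string arch = EX_ARCH_ARM64 ? "arm64" : "other-arch";
    const std::string os =
        EX_OS_DARWIN ? "darwin" : (EX_OS_LINUX ? "linux" : "other-os");
    return arch + "-" + os;
}
```

With this layout, code that is really about NEON tests `EX_HAS_NEON`, while code that is really about Darwin (frameworks, Metal, and so on) tests `EX_OS_DARWIN`, so ARM on Linux gets the former without the latter.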
Looking forward to working with everyone! Thanks!