Nvidia GH200 / ARM64: SIGSEGV in XPU and Vector, but not Scalar modes

Hi Team,

I'm successfully using OpenMoonRay 1.7 in Gentoo on an AMD EPYC 9654 workstation (Ebuild [here](https://gitlab.com/ruxbat/ruxbat/-/blob/x86-64/media-gfx/openmoonray/openmoonray-9999.ebuild?ref_type=heads), patches to OMR [here](https://gitlab.com/ruxbat/ruxbat/-/blob/x86-64/media-gfx/openmoonray/files/02-1.7.0.0.patch?ref_type=heads)). I'm working on building out a render farm, and I hope to use the well-priced [NVIDIA GH200 platform on VULTR](https://www.vultr.com/products/cloud-gpu/nvidia-gh200/) as an on-demand Arras compute node.

I've built an OMR docker image with Optix and CUDA for ARM64 Neoverse-V2 (`-march=armv9-a -mcpu=neoverse-v2 -mtune=neoverse-v2`), the CPU cores used in the NVIDIA GH200. The patches I've made against OMR's source are [here](https://gitlab.com/ruxbat/ruxbat/-/blob/arm64/media-gfx/openmoonray/files/02-1.7.0.0-arm64.patch?ref_type=heads) and the ebuild, slightly modified from the previous example, is [here](https://gitlab.com/ruxbat/ruxbat/-/blob/arm64/media-gfx/openmoonray/files/02-1.7.0.0-arm64.patch?ref_type=heads).

I'm launching my docker container like so:
`docker run -it -v /root:/root --runtime=nvidia --gpus=all -e NVIDIA_DRIVER_CAPABILITIES=graphics,compute,utility openmoonray-arm64`

My bash environment looks like this:
```
NVIDIA_VISIBLE_DEVICES=all
REZ_MOONRAY_ROOT=/opt/openmoonray
PWD=/root/example_scenes/pbrt_scenes/country_kitchen
NVIDIA_DRIVER_CAPABILITIES=graphics,compute,utility
HOME=/root
LS_COLORS=<trimmed>
RDL2_DSO_PATH=/opt/openmoonray/rdl2dso
MOONRAY_ROOT=/opt/openmoonray
MOONRAY_CLASS_PATH=/opt/openmoonray/shader_json
TERM=xterm
SHLVL=1
ARRAS_SESSION_PATH=/opt/openmoonray/sessions
PATH=/opt/openmoonray/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
OLDPWD=/root/example_scenes
_=/usr/sbin/env
```

The processor looks like this in `/proc/cpuinfo` (last core shown):
```
processor       : 71
BogoMIPS        : 2000.00
Features        : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm jscvt fcma lrcpc dcpop sha3 sm3 sm4 asimddp sha512 sve asimdfhm dit uscat ilrcpc flagm ssbs sb paca pacg dcpodp sve2 sveaes svepmull svebitperm svesha3 svesm4 flagm2 frint svei8mm svebf16 i8mm bf16 dgh bti
CPU implementer : 0x41
CPU architecture: 8
CPU variant     : 0x0
CPU part        : 0xd4f
CPU revision    : 0
```
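For reference, the `CPU implementer` and `CPU part` fields above are what identify the core: `0x41` is Arm's implementer code and `0xd4f` is the part number for Neoverse V2. A minimal sketch of that lookup follows; the function name `sketchCpuName` is hypothetical and only this one entry is covered.

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper: maps the implementer/part pair reported in
// /proc/cpuinfo to a human-readable core name. Only the single pair
// relevant to this report is listed; anything else is "unknown".
inline std::string sketchCpuName(std::uint32_t implementer, std::uint32_t part) {
    if (implementer == 0x41 && part == 0xd4f) {
        return "Arm Neoverse V2";
    }
    return "unknown";
}
```

Calling `sketchCpuName(0x41, 0xd4f)` with the values from the listing returns the Neoverse V2 name, which matches the `-mcpu=neoverse-v2` build flags.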

I'm running the test render on the country kitchen scene with
`moonray -debug -exec_mode xpu -in scene.rdla -in scene.rdlb -out arm64.exr`

The output is in the attached file:
[kitchen.log](https://github.com/user-attachments/files/18384614/kitchen.log)

I'd love to contribute a coherent patch once I get this working. Most of the changes I've made so far involve changing `__APPLE__` to `__ARM_NEON__` in the appropriate places and separating the ARM-on-Darwin code paths from the ARM-on-Linux ones. It's been a whirlwind getting this far, and compiling under qemu has made the process slower than usual :)
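The separation described above can be sketched roughly as follows: one set of macros answers "is this an ARM target?" and a second, independent set answers "which OS is this?", so that code can test each concern on its own instead of using `__APPLE__` as a stand-in for ARM. This is only an illustration, not code from OMR or from the patch; every `SKETCH_*` name and the `sketchPlatformName` function are hypothetical. The compiler-defined macros it checks (`__ARM_NEON`, the older `__ARM_NEON__` spelling, `__aarch64__`, `__APPLE__`, `__linux__`) are the usual GCC/Clang predefines.

```cpp
#include <string>

// Architecture concern: is the compiler targeting ARM with NEON?
// __ARM_NEON is the ACLE spelling; __ARM_NEON__ is the older one.
#if defined(__ARM_NEON) || defined(__ARM_NEON__) || defined(__aarch64__)
#  define SKETCH_ARCH_ARM 1
#else
#  define SKETCH_ARCH_ARM 0
#endif

// OS concern: kept independent of the architecture check above.
#if defined(__APPLE__)
#  define SKETCH_OS_DARWIN 1
#else
#  define SKETCH_OS_DARWIN 0
#endif

#if defined(__linux__)
#  define SKETCH_OS_LINUX 1
#else
#  define SKETCH_OS_LINUX 0
#endif

// A build should never claim to be both Darwin and Linux.
static_assert(!(SKETCH_OS_DARWIN && SKETCH_OS_LINUX),
              "Darwin and Linux are mutually exclusive");

// Hypothetical helper that reports the detected combination,
// e.g. "arm-linux" on a GH200 or "arm-darwin" on Apple silicon.
inline std::string sketchPlatformName() {
    const std::string arch = SKETCH_ARCH_ARM ? "arm" : "other";
    const std::string os = SKETCH_OS_DARWIN ? "darwin"
                         : SKETCH_OS_LINUX  ? "linux"
                                            : "other";
    return arch + "-" + os;
}
```

With this split, code that only needs NEON would test `SKETCH_ARCH_ARM`, while code that depends on Darwin-specific behavior would test `SKETCH_OS_DARWIN`, and ARM on Linux no longer falls into either an x86-only or an Apple-only branch by accident.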

Where is a good place to start with debugging this? Since scalar works, I imagine I made some mistakes in my patching as it relates to vector and XPU. I also assume after reading the code that Apple hasn't been tested with Optix at all and we're in uncharted waters.

Looking forward to working with everyone! Thanks!
