Open
Description
We have a research team asking for vLLM support with Omniprobe, which is more complicated than it sounds because this support has a slew of Triton implications.
Based on the vLLM installation docs, installation on ROCm requires Triton flash attention, which is tied to a non-default Triton branch (triton-mlir). Since supporting vLLM means changing our Triton version, and thus our LLVM version, I'm thinking this may be a non-trivial upgrade.
Open to thoughts if anyone has any creative solutions on how to circumvent this...
CC: @rwvo @keithloweryamd