Description
After decompose_reduce_ops (pass 35), the attention kernel contains extract operations (distinct from wave.extract_slice, which is already supported). These are serialized as wave.extract in MLIR. The FX importer has no handler for this op and raises ValueError("Unsupported op in MLIR-to-FX conversion: wave.extract").
Extract ops appear after reduction decomposition because the decomposed reductions produce tuples of values, and individual results are extracted with wave.extract. GEMM's reductions are simpler and don't produce these ops.
This blocks passes 35-47 for attention.
Probable Fix:
Add a _handle_extract_op function to fx_emitter.py that creates the corresponding Extract FX node, and register it in the _convert_ops match block. The handler needs to read the index and source operand from the MLIR op and reconstruct the FX node with the correct result type.