Hello, I'm really interested in the "custom-kernel-fusion-rewriter" feature of XLA. I've read a lot of the relevant source code, but I still have two questions:
- Does the custom rewriter perform fusions based on fixed rules or patterns, or does it rewrite using a more dynamic or heuristic mechanism?
- If there are fixed patterns, at which level are they defined? Are they defined in XLA itself, in frameworks such as PyTorch and JAX that use XLA, or can application-level users define them?
I only found some code that looks like patterns in xla/service/gpu/kernels/cutlass_gemm_fusion_test.cc and xla/service/gpu/transforms/custom_kernel_fusion_rewriter_test.cc, but I assume these test files aren't actually part of XLA at runtime, right?
Thank you very much for reading this!!