[AIROCMLIR-447] Lower migraphx.conv into Linalg#2241

Open
Mr-Anyone wants to merge 16 commits into develop from pr-template-migraphx-to-linalg-conv

Conversation

Member

@Mr-Anyone Mr-Anyone commented Feb 17, 2026

Motivation

Enable lowering migraphx.conv into Linalg.

Technical Details

Forward convolution is lowered in three steps:

  1. Apply padding to the input when the op has non-zero padding.
  2. Expand the channel dimension into (group, channel_per_group), introducing a group dimension G. Input becomes NGC* (e.g. NGCL, NGCHW, NGCDHW) and filter becomes GFC* (e.g. GFCL, GFCHW, GFCDHW), matching the group attr.
  3. Emit the grouped linalg convolution (1D/2D/3D), then collapse the result back to the original NFHW/NFDHW shape for the type converter.
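The step-2 group expansion is just a reshape of the channel and filter dimensions. A minimal NumPy sketch, using hypothetical sizes borrowed from the 1D grouped example discussed later in this thread (N=10, C=8, L=123, F=12, group=4):

```python
import numpy as np

group = 4
x = np.arange(10 * 8 * 123, dtype=np.float32).reshape(10, 8, 123)  # NCL input
w = np.arange(12 * 2 * 7, dtype=np.float32).reshape(12, 2, 7)      # FCL filter

# NCL -> NGCL: split C into (group, channels_per_group)
x_g = x.reshape(10, group, 8 // group, 123)
# FCL -> GFCL: split F into (group, filters_per_group)
w_g = w.reshape(group, 12 // group, 2, 7)

print(x_g.shape, w_g.shape)
```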

The linalg.generic for Conv1D essentially implements this Python loop:

import numpy as np

def my_conv_1d(x, filters, strides=1, dilation=1):
    # input laid out as (batch, group, channels_per_group, length)
    batch = x.shape[0]
    group = x.shape[1]
    channel = x.shape[2]
    input_length = x.shape[3]
    # filter laid out as (group, filters_per_group, channels_per_group, length)
    filter_count = filters.shape[1]
    kernel_length = filters.shape[3]

    output_length = (input_length - kernel_length
                     - (kernel_length - 1) * (dilation - 1)) // strides + 1
    result = np.zeros((batch, group, filter_count, output_length))
    for n in range(batch):
        for g in range(group):
            for f in range(filter_count):
                for lo in range(output_length):
                    # reduction: accumulate over channels and kernel taps
                    for c in range(channel):
                        for k in range(kernel_length):
                            input_access = lo * strides + k * dilation
                            result[n, g, f, lo] += x[n, g, c, input_access] * filters[g, f, c, k]
    return result
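The output-size computation used above can be factored into a helper and sanity-checked against the grouped 1D test case that appears later in this thread (input length 123 padded by 3 on each side, kernel 7, dilation 4, stride 2, expected output length 53). This helper is illustrative only, not part of the PR:

```python
def conv_output_size(input_len, kernel, stride=1, dilation=1, pad=0):
    # Dilation inflates the kernel's footprint to
    # kernel + (kernel - 1) * (dilation - 1) input elements.
    effective_kernel = kernel + (kernel - 1) * (dilation - 1)
    return (input_len + 2 * pad - effective_kernel) // stride + 1

# 7 + 6*3 = 25 effective taps; (123 + 6 - 25)//2 + 1 = 53
print(conv_output_size(123, 7, stride=2, dilation=4, pad=3))  # 53
```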

The linalg.generic for Conv3D essentially implements this Python loop:

import numpy as np

def my_conv_3d(x, filters, strides=(1, 1, 1), dilation=(1, 1, 1)):
    # input laid out as (batch, group, channels_per_group, H, W, D)
    batch = x.shape[0]
    group = x.shape[1]
    channel = x.shape[2]
    input_height = x.shape[3]
    input_width = x.shape[4]
    input_depth = x.shape[5]

    # filter laid out as (group, filters_per_group, channels_per_group, H, W, D)
    filter_count = filters.shape[1]
    kernel_height = filters.shape[3]
    kernel_width = filters.shape[4]
    kernel_depth = filters.shape[5]

    output_height = (input_height - kernel_height - (kernel_height - 1) * (dilation[0] - 1)) // strides[0] + 1
    output_width = (input_width - kernel_width - (kernel_width - 1) * (dilation[1] - 1)) // strides[1] + 1
    output_depth = (input_depth - kernel_depth - (kernel_depth - 1) * (dilation[2] - 1)) // strides[2] + 1

    result = np.zeros((batch, group, filter_count, output_height, output_width, output_depth))
    for n in range(batch):
        for g in range(group):  # parallel
            for f in range(filter_count):
                for ho in range(output_height):
                    for wo in range(output_width):
                        for do in range(output_depth):
                            # reduction: accumulate over channels and kernel taps
                            for c in range(channel):
                                for k in range(kernel_height):
                                    for j in range(kernel_width):
                                        for z in range(kernel_depth):
                                            input_height_access = ho * strides[0] + k * dilation[0]
                                            input_width_access = wo * strides[1] + j * dilation[1]
                                            input_depth_access = do * strides[2] + z * dilation[2]
                                            result[n, g, f, ho, wo, do] += (
                                                x[n, g, c, input_height_access, input_width_access, input_depth_access]
                                                * filters[g, f, c, k, j, z]
                                            )
    return result
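The 2D case is not spelled out above; it follows the same pattern. A sketch interpolating between the 1D and 3D loops, using the same assumed (batch, group, channels_per_group, H, W) and (group, filters_per_group, channels_per_group, H, W) layouts:

```python
import numpy as np

def my_conv_2d(x, filters, strides=(1, 1), dilation=(1, 1)):
    # input: (batch, group, channels_per_group, H, W)
    batch, group, channel, in_h, in_w = x.shape
    # filter: (group, filters_per_group, channels_per_group, H, W)
    _, filter_count, _, k_h, k_w = filters.shape

    out_h = (in_h - k_h - (k_h - 1) * (dilation[0] - 1)) // strides[0] + 1
    out_w = (in_w - k_w - (k_w - 1) * (dilation[1] - 1)) // strides[1] + 1

    result = np.zeros((batch, group, filter_count, out_h, out_w))
    for n in range(batch):
        for g in range(group):
            for f in range(filter_count):
                for ho in range(out_h):
                    for wo in range(out_w):
                        # reduction: accumulate over channels and kernel taps
                        for c in range(channel):
                            for k in range(k_h):
                                for j in range(k_w):
                                    h_in = ho * strides[0] + k * dilation[0]
                                    w_in = wo * strides[1] + j * dilation[1]
                                    result[n, g, f, ho, wo] += (
                                        x[n, g, c, h_in, w_in] * filters[g, f, c, k, j]
                                    )
    return result

# e.g. an all-ones 3x3 input with a 2x2 all-ones filter gives 2x2 outputs of 4
out = my_conv_2d(np.ones((1, 1, 1, 3, 3)), np.ones((1, 1, 1, 2, 2)))
print(out.shape)  # (1, 1, 1, 2, 2)
```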

Test Plan

Added both e2e and lit tests. The e2e test values are verified against PyTorch.

Test Result

Both the e2e and lit tests pass.

Submission Checklist

@Mr-Anyone Mr-Anyone requested a review from causten as a code owner February 17, 2026 21:03
@Mr-Anyone Mr-Anyone force-pushed the pr-template-migraphx-to-linalg-conv branch from 9704ab3 to 14d20ad Compare February 17, 2026 21:13
@Mr-Anyone Mr-Anyone marked this pull request as draft February 18, 2026 02:24
@Mr-Anyone Mr-Anyone force-pushed the pr-template-migraphx-to-linalg-6 branch from 3d4478e to 3d9c23f Compare February 18, 2026 15:07
@Mr-Anyone Mr-Anyone force-pushed the pr-template-migraphx-to-linalg-conv branch 6 times, most recently from fe2bbc8 to 0f84c0a Compare February 18, 2026 22:07
@Mr-Anyone Mr-Anyone marked this pull request as ready for review February 18, 2026 22:08
@Mr-Anyone Mr-Anyone changed the title from "[AIROCMLIR-447] WIP (needs testcase) Lower migraphx.conv into Linalg" to "[AIROCMLIR-447] Lower migraphx.conv into Linalg" Feb 18, 2026
@@ -0,0 +1,35 @@
// RUN: rocmlir-gen -fut conv -arch %arch --clone-harness %s | rocmlir-driver --host-pipeline=migraphx-linalg,highlevel | rocmlir-gen -ph -print-results -rand_type float -rand_min 0 -rand_max 0 -fut conv_wrapper --verifier clone - | xmir-runner --shared-libs=%linalg_test_lib_dir/libmlir_rocm_runtime%shlibext,%conv_validation_wrapper_library_dir/libconv-validation-wrappers%shlibext,%linalg_test_lib_dir/libmlir_runner_utils%shlibext,%linalg_test_lib_dir/libmlir_float16_utils%shlibext,%linalg_test_lib_dir/libmlir_c_runner_utils%shlibext,%linalg_test_lib_dir/libmlir_async_runtime%shlibext --entry-point-result=void | FileCheck %s
Contributor
If you actually want to set random values use a meaningful range. E.g., something like -rand_min -1 -rand_max 1

@Mr-Anyone Mr-Anyone force-pushed the pr-template-migraphx-to-linalg-conv branch 2 times, most recently from db757be to 18b0a33 Compare February 19, 2026 16:19
@pabloantoniom (Contributor) left a comment
IR looks pretty good!

}
};

// Step 2: expand group dimension (NCHW -> NGCHW, FCHW -> GFCHW).
Contributor
Why is this needed? The description is missing some motivation to explain why we need this.

@@ -0,0 +1,44 @@
// RUN: rocmlir-opt -split-input-file --migraphx-to-linalg -verify-diagnostics %s | FileCheck %s
Contributor
I think in this PR you support all the convolution flavors like dilation, padding, stride, and group. Is that correct? If so, I think we need to:

  1. Make it clear in the description
  2. Add proper testing for all that functionality

In particular, I think it would be beneficial to split this test into 4: 1D, 2D, 3D, and unsupported cases (e.g., 4D, or other cases we don't support). And in the 1D/2D/3D tests, we should add tests for dilation, padding (and non-padding), stride, and groups.

patterns.add<AsUnderlyingShapeConverter, AsLogicalShapeOpConverter>(
typeConverter, patterns.getContext());

// mhal.launch can be generated through rocmlir-gen, so we need a way to
Contributor

I don't understand, can you give an example to show why we need this? Please add it to the description to motivate why we need it

Member Author

During e2e testing, rocmlir-gen generates a wrapper that contains mhal.launch.

Example (from test case below):

root@14298acabab2 ~/rocMLIR/build-release (pr-template-migraphx-to-linalg-conv)$ cat main.mlir
func.func @conv_1d_group(%in: !migraphx.shaped<10x8x123xf32, 984x123x1>, %fil: !migraphx.shaped<12x2x7xf32, 14x7x1>) -> !migraphx.shaped<10x12x53xf32, 636x53x1> {
  %out = migraphx.convolution %in, %fil {dilation = [4], group = 4 : i64, padding = [3,3], padding_mode = 0 : i64, stride = [2]} :
    <10x8x123xf32, 984x123x1>, <12x2x7xf32, 14x7x1> -> <10x12x53xf32,  636x53x1>
  func.return %out : !migraphx.shaped<10x12x53xf32,  636x53x1>
}
root@14298acabab2 ~/rocMLIR/build-release (pr-template-migraphx-to-linalg-conv)$ ./bin/rocmlir-gen main.mlir --clone-harness --arch gfx950 -fut conv_1d_group
module {
  func.func @conv_1d_group(%arg0: !migraphx.shaped<10x8x123xf32, 984x123x1> {mhal.read_access}, %arg1: !migraphx.shaped<12x2x7xf32, 14x7x1> {mhal.read_access}) -> (!migraphx.shaped<10x12x53xf32, 636x53x1> {mhal.write_access}) {
    %0 = migraphx.convolution %arg0, %arg1 {dilation = [4], group = 4 : i64, padding = [3, 3], padding_mode = 0 : i64, stride = [2]} : <10x8x123xf32, 984x123x1>, <12x2x7xf32, 14x7x1> -> <10x12x53xf32, 636x53x1>
    return %0 : !migraphx.shaped<10x12x53xf32, 636x53x1>
  }
  func.func @conv_1d_group_wrapper(%arg0: !migraphx.shaped<10x8x123xf32, 984x123x1>, %arg1: !migraphx.shaped<12x2x7xf32, 14x7x1>) -> !migraphx.shaped<10x12x53xf32, 636x53x1> {
    %token, %results = mhal.launch @conv_1d_group (%arg0, %arg1) : (!migraphx.shaped<10x8x123xf32, 984x123x1>, !migraphx.shaped<12x2x7xf32, 14x7x1>) -> !migraphx.shaped<10x12x53xf32, 636x53x1>
    mhal.await %token : !mhal.token
    return %results : !migraphx.shaped<10x12x53xf32, 636x53x1>
  }
  module @__xmodule_ attributes {mhal.arch = "gfx950", mhal.module} {
    func.func @conv_1d_group(%arg0: !migraphx.shaped<10x8x123xf32, 984x123x1> {mhal.read_access}, %arg1: !migraphx.shaped<12x2x7xf32, 14x7x1> {mhal.read_access}) -> (!migraphx.shaped<10x12x53xf32, 636x53x1> {mhal.write_access}) attributes {kernel, original_func = @conv_1d_group} {
      %0 = migraphx.convolution %arg0, %arg1 {dilation = [4], group = 4 : i64, padding = [3, 3], padding_mode = 0 : i64, stride = [2]} : <10x8x123xf32, 984x123x1>, <12x2x7xf32, 14x7x1> -> <10x12x53xf32, 636x53x1>
      return %0 : !migraphx.shaped<10x12x53xf32, 636x53x1>
    }
  }
}

Without this line, passing the output through ./bin/rocmlir-driver fails because the types have not been legalized yet.

Contributor

Can you post the rocmlir-driver command as well? I cannot reproduce your issue

Member Author

Applying this diff:

diff --git a/mlir/lib/Conversion/MIGraphXToLinalg/MIGraphXToLinalg.cpp b/mlir/lib/Conversion/MIGraphXToLinalg/MIGraphXToLinalg.cpp
index 742d2b29056d..61f746ae9414 100644
--- a/mlir/lib/Conversion/MIGraphXToLinalg/MIGraphXToLinalg.cpp
+++ b/mlir/lib/Conversion/MIGraphXToLinalg/MIGraphXToLinalg.cpp
@@ -799,7 +799,7 @@ void mlir::migraphx::populateMIGraphXFuncBoundaryToLinalgConversionPatterns(

   // mhal.launch can be generated through rocmlir-gen, so we need a way to
   // legalize it
-  populateMIGraphXToLinalgMHALLauncherConversion(patterns, typeConverter);
+  // populateMIGraphXToLinalgMHALLauncherConversion(patterns, typeConverter);
   populateAnyFunctionOpInterfaceTypeConversionPattern(patterns, typeConverter);
   populateReturnOpTypeConversionPattern(patterns, typeConverter);
   populateCallOpTypeConversionPattern(patterns, typeConverter);
diff --git a/mlir/lib/Conversion/MIGraphXToLinalg/MIGraphXToLinalgPass.cpp b/mlir/lib/Conversion/MIGraphXToLinalg/MIGraphXToLinalgPass.cpp
index 75f5588d2024..9fd00ff82ed6 100644
--- a/mlir/lib/Conversion/MIGraphXToLinalg/MIGraphXToLinalgPass.cpp
+++ b/mlir/lib/Conversion/MIGraphXToLinalg/MIGraphXToLinalgPass.cpp
@@ -52,11 +52,11 @@ void mlir::migraphx::populateMIGraphXToLinalgBoundaryDialectConversion(
   target.addDynamicallyLegalOp<func::FuncOp>([&](func::FuncOp op) {
     return typeConverter.isSignatureLegal(op.getFunctionType());
   });
-  target.addDynamicallyLegalOp<mhal::LaunchOp>(
-      [=](mhal::LaunchOp op) -> std::optional<bool> {
-        return typeConverter.isLegal(op.getResultTypes()) &&
-               typeConverter.isLegal(op.getOperandTypes());
-      });
+  // target.addDynamicallyLegalOp<mhal::LaunchOp>(
+  //     [=](mhal::LaunchOp op) -> std::optional<bool> {
+  //       return typeConverter.isLegal(op.getResultTypes()) &&
+  //              typeConverter.isLegal(op.getOperandTypes());
+  //     });
   target.addDynamicallyLegalOp<func::ReturnOp>(
       [&](func::ReturnOp op) { return typeConverter.isLegal(op); });
   target.addDynamicallyLegalOp<func::CallOp>(

Running:

./bin/rocmlir-gen -fut conv_3d -arch gfx950 --clone-harness <filename> | ./bin/rocmlir-driver --host-pipeline=migraphx-linalg,highlevel --kernel-pipeline=migraphx-linalg,highlevel

On:

func.func @conv_3d(%arg0: !migraphx.shaped<2x4x2x2x2xf32, 32x8x4x2x1>, %arg1: !migraphx.shaped<2x3x5x5x5xf32, 375x125x25x5x1>, %arg2: !migraphx.shaped<4x3x2x2x2xf32, 24x8x4x2x1>) -> !migraphx.shaped<2x4x2x2x2xf32, 32x8x4x2x1>  {
  %0 = migraphx.convolution %arg1, %arg2 {dilation = [2, 2, 2], group = 1 : i64, padding = [0, 0, 0, 0, 0, 0], padding_mode = 0 : i64, stride = [2, 2, 2]} : <2x3x5x5x5xf32, 375x125x25x5x1>, <4x3x2x2x2xf32, 24x8x4x2x1> -> <2x4x2x2x2xf32, 32x8x4x2x1>
  %1 = migraphx.add %0, %arg0 : <2x4x2x2x2xf32, 32x8x4x2x1>, <2x4x2x2x2xf32, 32x8x4x2x1> -> <2x4x2x2x2xf32, 32x8x4x2x1>
  return %1 : !migraphx.shaped<2x4x2x2x2xf32, 32x8x4x2x1>
}

I get this on my end:

loc("<stdin>":7:138): error: failed to legalize unresolved materialization from ('tensor<96xf32>') to ('!migraphx.shaped<4x3x2x2x2xf32, 24x8x4x2x1>') that remained live after conversion
Lowering failed.

// converter.
Location loc = op.getLoc();
int64_t group = op.getGroupAttr().getInt();
int64_t dim = cast<RankedTensorType>(input.getType()).getRank() -
Contributor

Don't we have a migraphx::ConvolutionOp method to get the rank?

@@ -0,0 +1,44 @@
// RUN: rocmlir-opt -split-input-file --migraphx-to-linalg -verify-diagnostics %s | FileCheck %s
Contributor

Also, does this PR support input and output with different types? For example f16 inputs with f32 outputs. If so, we need to test for that. If not, probably add it as a negative test (to the 4th test case if you follow my suggestion above)

SmallVector<int64_t, 4> newShape;
int64_t n = resultType.getDimSize(0);
int64_t newF = resultType.getDimSize(1) / group;
assert(resultType.getDimSize(1) % group == 0 &&
Contributor

Let's generate a proper failure error (LogicalResult) instead of using asserts

@Mr-Anyone Mr-Anyone (Member Author) commented Feb 23, 2026

My original intention was to move this to the verifier, and make this an invariant of the operation.

Copilot AI (Contributor) left a comment

Pull request overview

This PR implements lowering of the migraphx.convolution operation to the Linalg dialect, enabling an alternative compilation path for MIGraphX convolutions. The implementation supports 1D, 2D, and 3D convolutions with configurable padding, stride, dilation, and grouping parameters.

Changes:

  • Added ConvConverter to lower migraphx.convolution to Linalg operations (generic ops for 1D/3D, named op for 2D)
  • Introduced new "migraphx-linalg" pipeline option in the rocmlir-driver
  • Added comprehensive lit tests and end-to-end tests with PyTorch validation

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 10 comments.

Summary per file:
mlir/tools/rocmlir-driver/rocmlir-driver.cpp Added "migraphx-linalg" pipeline option and updated help text
mlir/test/Conversion/MIGraphXToLinalg/migraphx-to-linalg-not-implemented.mlir Removed convolution test since it's now implemented
mlir/test/Conversion/MIGraphXToLinalg/migraphx-to-linalg-conv.mlir Added lit tests for 1D, 2D, and 3D convolution lowering
mlir/test/Conversion/MIGraphXToLinalg/e2e/migraphx-to-linalg-conv-cpu.e2e.mlir Added end-to-end test with PyTorch-validated golden values
mlir/lib/Conversion/MIGraphXToTosa/MIGraphXToTosa.cpp Added populateMIGraphXToLinalgMHALLauncherConversion function
mlir/lib/Conversion/MIGraphXToLinalg/MIGraphXToLinalgPass.cpp Added mhal.launch legalization support
mlir/lib/Conversion/MIGraphXToLinalg/MIGraphXToLinalg.cpp Implemented ConvConverter with padding, group expansion, and linalg lowering
mlir/include/mlir/Conversion/MIGraphXToTosa/MIGraphXToTosa.h Added function declaration for MHALLauncher conversion


Comment on lines 70 to 71
void populateMIGraphXToLinalgMHALLauncherConversion(
RewritePatternSet &target, TypeConverter &typeConverter);
Copilot AI Feb 20, 2026

The function populateMIGraphXToLinalgMHALLauncherConversion is declared in the wrong header file. It is declared in MIGraphXToTosa.h but should be declared in MIGraphXToLinalg.h since it's specifically used for the Linalg conversion. The function is implemented in MIGraphXToTosa.cpp and used by MIGraphXToLinalgPass.cpp, which includes both headers. While this works functionally, it creates confusing organization where Linalg-specific functionality is declared in a Tosa header file.


@Mr-Anyone Mr-Anyone (Member Author) commented Feb 23, 2026

I have now moved it to MIGraphXToLinalg.h.

That was somewhat intentional: the MHALLauncher class is a dependency from libMLIRMIGraphXToTosa.a, and I originally wanted all declarations from MIGraphXToTosa.cpp to be in MIGraphXToTosa.h.

@Mr-Anyone Mr-Anyone force-pushed the pr-template-migraphx-to-linalg-6 branch from 3d9c23f to 2246225 Compare February 20, 2026 18:44
@Mr-Anyone Mr-Anyone force-pushed the pr-template-migraphx-to-linalg-conv branch from 18b0a33 to cf79235 Compare February 20, 2026 18:45
@Mr-Anyone Mr-Anyone force-pushed the pr-template-migraphx-to-linalg-6 branch from 2246225 to 56be60a Compare February 20, 2026 19:01
@Mr-Anyone Mr-Anyone force-pushed the pr-template-migraphx-to-linalg-conv branch from cf79235 to 648cc4f Compare February 20, 2026 19:01
@Mr-Anyone Mr-Anyone force-pushed the pr-template-migraphx-to-linalg-6 branch 2 times, most recently from 3de417c to 7ef3371 Compare February 20, 2026 19:59
Copilot AI (Contributor) left a comment

Pull request overview

Copilot reviewed 11 out of 11 changed files in this pull request and generated 7 comments.



@Mr-Anyone Mr-Anyone force-pushed the pr-template-migraphx-to-linalg-conv branch from f1a3c51 to 6f67460 Compare February 23, 2026 14:56
@Mr-Anyone
Member Author

Some changes have been made since the last revision:

  1. This PR now emits attributes for rock lowering (padding, strides, op name, etc.)
  2. conv2d is now converted into linalg.generic
  3. 1D/2D/3D convolution with non-trivial parameters (non-zero padding, stride > 1, group > 1, dilation > 1) gives the same results as the TOSA pipeline.


/// Emit Conv1D expects input shape to be (batch, group, channel, height),
/// filter to be (group, filter, channel, height)
static Value emitGroupedConv1D(ConversionPatternRewriter &rewriter,
Contributor

Now that we emit generic for all 3 cases, can't we easily refactor emitGroupedConv1D/2D/3D into a common emitGroupedConv for 1D/2D/3D?

Member Author

It turns out that extending to ND is not that hard.


// Convert optional attributes
if (auto attr = (*op).template getAttrOfType<StringAttr>("perf_config"))
newOp->setAttr("perf_config", attr);
newOp->setAttr("conv_op", convOpName);
Contributor

I'd rather have this attribute defined in RockAttrDefs and referenced here by its name instead of hardcoding a string like this, especially because when you work on the linalg-to-rock side of things you will need to reference the attribute again, and hardcoding strings all over the place should be avoided.

Copilot AI (Contributor) left a comment

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.
