@sdatkinson
Owner

Optimize grouped convolutions for the depthwise case, where in_channels == out_channels == groups.
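For readers unfamiliar with the depthwise case: when in_channels == out_channels == groups, every group has exactly one input and one output channel, so a 1x1 convolution reduces to one scalar weight per channel. The following is a minimal sketch of that idea in plain C++, not the code in this PR. The function name `depthwise_conv1x1` and the channel-major buffer layout are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, not the repository's API.
// weights: one scalar per channel (length = channels).
// input:   channel-major buffer, input[c * num_frames + t].
// Returns an output buffer with the same layout as the input.
std::vector<float> depthwise_conv1x1(const std::vector<float>& weights,
                                     const std::vector<float>& input,
                                     std::size_t num_frames)
{
  const std::size_t channels = weights.size();
  assert(input.size() == channels * num_frames);

  std::vector<float> output(input.size());
  for (std::size_t c = 0; c < channels; ++c)
  {
    // Each channel is scaled by its own weight; no cross-channel mixing.
    for (std::size_t t = 0; t < num_frames; ++t)
      output[c * num_frames + t] = weights[c] * input[c * num_frames + t];
  }
  return output;
}
```

Compared with a dense channels x channels matrix product, this stores `channels` weights instead of `channels * channels` and does one multiply per sample.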

Related to #215

Squashed commit of the following:

commit 79e9f31415cde3ec1430229121751429eb7eff25
Merge: 4d1fd5d 12f93a2
Author: Steven Atkinson <steven@atkinson.mn>
Date:   Thu Jan 29 00:22:38 2026 -0800

    Merge branch 'main' into 215-group-2

commit 4d1fd5d
Author: Steven Atkinson <steven@atkinson.mn>
Date:   Thu Jan 29 00:17:36 2026 -0800

    Enhance Conv1x1 and Conv1D classes to support depthwise convolutions. Introduced logic to differentiate between depthwise and non-depthwise configurations, optimizing weight storage and processing methods accordingly. Updated weight setting and processing functions to handle depthwise operations efficiently, ensuring correct handling of input channels and weights.

commit 2ad9dec
Author: Steven Atkinson <steven@atkinson.mn>
Date:   Wed Jan 28 23:56:35 2026 -0800

    Improve grouped convolutions for Conv1D by...ignoring them for now.

commit e3be255
Author: Steven Atkinson <steven@atkinson.mn>
Date:   Wed Jan 28 23:46:36 2026 -0800

    Revert "Implement std::vector grouped_weights"

    This reverts commit e78e191.

commit e78e191
Author: Steven Atkinson <steven@atkinson.mn>
Date:   Wed Jan 28 23:41:45 2026 -0800

    Implement std::vector grouped_weights

commit 546f820
Author: Steven Atkinson <steven@atkinson.mn>
Date:   Wed Jan 28 23:31:28 2026 -0800

    Improve speed of small grouped convolutions with single GEMM

commit c20fb86
Author: Steven Atkinson <steven@atkinson.mn>
Date:   Wed Jan 28 23:23:28 2026 -0800

    Zero out conv weight matrices after resize
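The commit messages above mention a "single GEMM" for small grouped convolutions and zeroing weight matrices after resize, but do not spell out the approach. One common way to get a single matrix product is to pack the per-group weights into a block-diagonal dense matrix, in which case every entry outside the blocks must be exactly zero. The sketch below illustrates that reading only; `build_block_diagonal`, `apply_dense`, and the weight layout are hypothetical and are not taken from the repository.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical sketch. group_weights is flattened as
// [group][out_within_group][in_within_group].
// Returns a row-major (out_channels x in_channels) dense matrix.
std::vector<float> build_block_diagonal(const std::vector<float>& group_weights,
                                        std::size_t in_channels,
                                        std::size_t out_channels,
                                        std::size_t groups)
{
  assert(groups > 0 && in_channels % groups == 0 && out_channels % groups == 0);
  const std::size_t in_per = in_channels / groups;
  const std::size_t out_per = out_channels / groups;
  assert(group_weights.size() == groups * out_per * in_per);

  // Explicitly zero-filled: off-block entries must not hold stale values.
  std::vector<float> dense(out_channels * in_channels, 0.0f);
  for (std::size_t g = 0; g < groups; ++g)
    for (std::size_t o = 0; o < out_per; ++o)
      for (std::size_t i = 0; i < in_per; ++i)
        dense[(g * out_per + o) * in_channels + (g * in_per + i)] =
          group_weights[(g * out_per + o) * in_per + i];
  return dense;
}

// One dense matrix-vector product for a single frame: y = W x.
std::vector<float> apply_dense(const std::vector<float>& dense,
                               const std::vector<float>& x,
                               std::size_t out_channels)
{
  const std::size_t in_channels = x.size();
  assert(dense.size() == out_channels * in_channels);

  std::vector<float> y(out_channels, 0.0f);
  for (std::size_t o = 0; o < out_channels; ++o)
    for (std::size_t i = 0; i < in_channels; ++i)
      y[o] += dense[o * in_channels + i] * x[i];
  return y;
}
```

The dense product does extra multiplications by zero, so this trade only makes sense when the matrices are small enough that one product beats several tiny per-group products. In the fully depthwise case the dense matrix is diagonal and a per-channel multiply is cheaper still.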
@sdatkinson sdatkinson merged commit a5d75d7 into main Jan 29, 2026
2 checks passed
@sdatkinson sdatkinson deleted the 215-depthwise branch January 29, 2026 08:26