
Conversation

JanCSEM commented Dec 2, 2025

This PR fixes multiple bugs:

  1. Scaling factors were previously converted to a floating-point scalar regardless of their shape, which caused the exportBrevitas function to error out.

This adds a basic check on the shape of the scaling-factor tensor and only converts it to a scalar if the tensor has exactly one element (a sketch of the check is shown below).
Added a single-Conv-layer model and a simple CNN with channel-wise weight quantization to the tests for validation.
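
A minimal sketch of the shape check, assuming PyTorch tensors (the helper name `maybeToScalar` is illustrative, not the actual function in the patch):

```python
import torch

def maybeToScalar(scale: torch.Tensor):
    # Only a single-element scale can safely be collapsed to a Python scalar;
    # channel-wise scales (one element per channel) must stay tensors.
    if scale.numel() == 1:
        return scale.item()
    return scale
```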

  2. In QuantDivider, each node argument is assumed to be a Tensor. For residuals and multi-branch concatenations, the argument might be a Tuple or a List.

Added support for 1-level nested node arguments (see the sketch below).
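
A sketch of how one level of nesting can be unpacked, assuming torch.fx graph nodes (the helper and the `processNode` callback are hypothetical, not the exact diff):

```python
import torch.fx as fx

def mapOneLevel(arg, processNode):
    # Tuples and lists (e.g. the inputs of a concat or a residual add)
    # are unpacked exactly one level deep; deeper nesting is unsupported.
    if isinstance(arg, (tuple, list)):
        return type(arg)(processNode(a) if isinstance(a, fx.Node) else a
                         for a in arg)
    return processNode(arg) if isinstance(arg, fx.Node) else arg
```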

  3. UnrolledMHA previously returned a single tensor (the attention outputs), which is inconsistent with MHA as implemented in PyTorch and Brevitas, where a tuple (attention outputs, attention weights) is returned.

Fixed the implementation to always return a tuple, with the attention weights or None as the second element, depending on the provided option. A sketch of the return convention follows.
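
A minimal sketch of the return convention, assuming an interface like torch.nn.MultiheadAttention (the attention math here is an unscaled placeholder, not the actual UnrolledMHA implementation):

```python
import torch
import torch.nn as nn

class UnrolledMHA(nn.Module):
    def forward(self, query, key, value, need_weights: bool = True):
        # Placeholder attention; only the return convention matters here.
        attnWeights = torch.softmax(query @ key.transpose(-2, -1), dim=-1)
        attnOutput = attnWeights @ value
        # Always return a 2-tuple, as PyTorch and Brevitas do, with None
        # in place of the weights when they are not requested.
        return attnOutput, (attnWeights if need_weights else None)
```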

JanCSEM changed the title from "Fix: Add support for channel wise scales" to "Fix: Multiple bug fixes" on Dec 2, 2025
JanCSEM (Author) commented Dec 2, 2025

@Victor-Jung tagging you for visibility

Victor-Jung (Member) left a comment

Hi Jan, these changes look good and are very useful. I ran the tests locally and got several errors (see below). Are the tests passing for you locally at the current HEAD?

```
FAILED Tests/TestConvChannelWise.py::deepQuantTestConv - RuntimeError: Cannot insert a Tensor that requires grad as a constant. Consider making it a parameter or input, or detaching the gradient
FAILED Tests/TestMHSA.py::deepQuantTestMHSA - AttributeError: 'IntQuantTensor' object has no attribute 'sum'
FAILED Tests/TestSimpleCNNChannelWise.py::deepQuantTestSimpleCNN - RuntimeError: Cannot insert a Tensor that requires grad as a constant. Consider making it a parameter or input, or detaching the gradient
```

Comment on lines +79 to +81
```python
if arg is None or not isinstance(arg, fx.Node):
    newLinArgs.append(arg)
    continue
```
Victor-Jung (Member) commented:

Maybe we could display a warning when an arg is None or not a Node. I'm just afraid this change will backfire at some point.
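
For instance, something along these lines (a sketch amending the quoted lines, not a tested suggestion; `newLinArgs` and `arg` come from the surrounding loop):

```python
import warnings

if arg is None or not isinstance(arg, fx.Node):
    # Surface skipped arguments so silent mishandling is easier to spot.
    warnings.warn(f"QuantDivider: passing through non-Node arg {arg!r}")
    newLinArgs.append(arg)
    continue
```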

```python
# Licensed under the Apache License, Version 2.0, see LICENSE for details.
# SPDX-License-Identifier: Apache-2.0
#
# Federico Brancasi <fbrancasi@ethz.ch>
```
Victor-Jung (Member) commented:

You deserve the authorship of this test 😁

```python
# SPDX-License-Identifier: Apache-2.0
#
# Victor Jung <jungvi@iis.ee.ethz.ch>
# Federico Brancasi <fbrancasi@ethz.ch>
```
Victor-Jung (Member) commented:

You deserve the authorship of this test 😁
