4 changes: 2 additions & 2 deletions algorithms/pact/pact_ops.py
@@ -112,8 +112,8 @@ def forward(ctx, x, mul, add, div, signed, n_levels_out, cmsis_requant):
              # division. Division is with flooring.
          else:
              y = x * mul + add
-             # Avoid round to even behaviour, friggin pytorch
-             y = torch.floor((y / div) + 0.5)
+             # LMACAN: Dory doesn't like the `+ 0.5` fix

Reviewer:
Please motivate this more; are you sure you are not making a mistake at some other point?
This is critical code for many applications.

Author:
It is hard for me to motivate this more with my limited knowledge of quantlib. Do you have an alternative way to handle this?

Author:
This creates an extra addition layer in the produced ONNX graph, and DORY expects a precise pattern of layers in order to recognize requantization.
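As an illustration of that point (this is not quantlib's actual export path; the module names and the divisor 16.0 are made up), exporting both variants of the requantization tail shows the extra Add node that lands between Div and Floor:

import torch

class RequantWithRounding(torch.nn.Module):
    def forward(self, y):
        # floor(y / div + 0.5): exports as Div -> Add -> Floor
        return torch.floor(y / 16.0 + 0.5)

class RequantFloorOnly(torch.nn.Module):
    def forward(self, y):
        # floor(y / div): exports as Div -> Floor, with no extra Add in between
        return torch.floor(y / 16.0)

for module in (RequantWithRounding(), RequantFloorOnly()):
    torch.onnx.export(module, (torch.randn(1, 8),), f"{type(module).__name__}.onnx")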

Member:
This will definitely break compatibility with the RQS strategy in Deeploy, where we do rounding by default. I suggest that you make arithmetic rounding in the RequantShift layer configurable and disable it in flows targeting DORY as the backend.
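A minimal sketch of that suggestion (the class name and the rounding flag are hypothetical, not the actual RequantShift implementation): the rounding behaviour becomes a constructor switch, and a DORY-targeting flow constructs the layer with rounding disabled.

import torch

class RequantShiftSketch(torch.nn.Module):
    """Hypothetical requantization layer with configurable arithmetic rounding."""

    def __init__(self, mul, add, div, rounding: bool = True):
        super().__init__()
        self.mul, self.add, self.div = mul, add, div
        self.rounding = rounding  # disable for DORY-targeting flows

    def forward(self, x):
        y = x * self.mul + self.add
        if self.rounding:
            return torch.floor(y / self.div + 0.5)  # round half up (current default behaviour)
        return torch.floor(y / self.div)            # plain flooring division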

Member:
I am unfortunately not very familiar with DORY but for Deeploy we (or at least I) export fused RQS nodes directly.

Reviewer:
We should probably talk with @da-gazzi as well - I know that Georg implemented rounding by adding "half the shift" to the bias; it seems to me like adding 0.5 here does pretty much the same. We should disentangle this a bit before merging, but if there are multiple places where rounding biases are added, we should fold that into one spot.

Member:
I concur with @Scheremo: the only "good" solution is to fuse the rounding with the bias value and not expose this +0.5 here. I do not know how this is handled in Deeploy, but since this is in any case an addition of 0.5 happening after requantization, it cannot really represent an integer op in a bit-true fashion.
Fusing this inside QuantLib avoids any confusion.
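A quick sanity check of the equivalence these comments rely on (assuming the value fed to the division is an integer and the divisor is a positive integer, which is the case after integerization): adding half the divisor to the bias before a flooring division gives the same result as adding 0.5 after it.

import math
import random

random.seed(0)
for _ in range(100_000):
    div = random.choice([8, 16, 255, 257])      # arbitrary positive divisors
    y = random.randint(-(1 << 20), 1 << 20)     # integer accumulator, i.e. x * mul + add
    assert math.floor(y / div + 0.5) == (y + div // 2) // div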

Reviewer:
This also seems related to this issue.

Contributor:
> I concur with @Scheremo: the only "good" solution is to fuse the rounding with the bias value and not expose this +0.5 here. I do not know how this is handled in Deeploy, but since this is in any case an addition of 0.5 happening after requantization, it cannot really represent an integer op in a bit-true fashion. Fusing this inside QuantLib avoids any confusion.

Agree - the idea of the "RequantShift" layer is that it represents the integer operations performed on the device 1:1. The activation rounding is handled by statically adding half an eps to the bias value; adding 0.5 here would achieve the same thing, but it breaks the exported net if you don't use custom nodes. Is there something keeping us from just using the "integer rounding" approach in all cases? It is already configurable, i.e. you can turn it on/off as desired with the rounding flag of the PACTActivation classes.
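For concreteness, a sketch of the "integer rounding" idea described above (illustrative names, not the actual RequantShift/PACT code): folding div // 2 into the additive constant keeps every step an integer op and leaves the exported Mul -> Add -> floor-Div pattern untouched, while still rounding to nearest.

import torch

def requant_shift_int(x, mul, add, div, signed, n_levels, rounding=True):
    # Bit-true integer requantization: the rounding offset is fused into the bias.
    add_eff = add + (div // 2 if rounding else 0)
    y = torch.div(x * mul + add_eff, div, rounding_mode='floor')
    if signed:
        return torch.clamp(y, -(n_levels // 2), (n_levels - 1) // 2)
    return torch.clamp(y, 0, n_levels - 1)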

+             y = torch.floor(y / div)

          if not signed:
              # if unsigned: clip y to interval (0, n_levels-1)
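As background on the comment being removed above ("Avoid round to even behaviour"): torch.round implements round-half-to-even, which is presumably why the original code used floor(x + 0.5) to get round-half-up instead. A two-line illustration:

import torch

x = torch.tensor([0.5, 1.5, 2.5, 3.5])
print(torch.round(x))        # tensor([0., 2., 2., 4.])  round half to even
print(torch.floor(x + 0.5))  # tensor([1., 2., 3., 4.])  round half up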
1 change: 0 additions & 1 deletion backends/cutie/__init__.py
@@ -19,5 +19,4 @@
  # limitations under the License.
  #

- from . import grrules
  from .cutie_export import convert_net, export_net
23 changes: 0 additions & 23 deletions backends/cutie/grrules/__init__.py

This file was deleted.

28 changes: 0 additions & 28 deletions backends/cutie/grrules/ana/__init__.py

This file was deleted.

1,110 changes: 0 additions & 1,110 deletions backends/cutie/grrules/ana/dporules.py

This file was deleted.

127 changes: 0 additions & 127 deletions backends/cutie/grrules/ana/folding.py

This file was deleted.

50 changes: 0 additions & 50 deletions backends/cutie/grrules/ana/lutactivation.py

This file was deleted.

21 changes: 17 additions & 4 deletions backends/dory/dory_passes.py
@@ -116,7 +116,7 @@ def __init__(self):
  class DORYAdder(nn.Module):
      class DORYAdderFun(torch.autograd.Function):
          @staticmethod
-         def forward(ctx, x1, rq1, x2, rq2, rq_out):
+         def forward(ctx, x1, rq1, x2, rq2, rq_out, out_n_levels):
              if rq1:
                  x1 = rq1(x1)
              if rq2:
@@ -127,10 +127,20 @@ def forward(ctx, x1, rq1, x2, rq2, rq_out):
              if rq_out:
                  x_sum = rq_out(x_sum)

+             out_signed = rq_out.signed if rq_out else False
+             if out_signed:
+                 out_min = -(out_n_levels // 2)
+                 out_max = (out_n_levels - 1) // 2
+             else:
+                 out_min = 0
+                 out_max = out_n_levels - 1
+
+             x_sum = torch.clamp(x_sum, out_min, out_max)
+
              return x_sum

          @staticmethod
-         def symbolic(g, x1, rq1, x2, rq2, rq_out):
+         def symbolic(g, x1, rq1, x2, rq2, rq_out, out_n_levels):

              params = {}
              out_signed_inferred = False
@@ -146,7 +156,10 @@ def symbolic(g, x1, rq1, x2, rq2, rq_out):
              mul = 1
              add = 0
              shift = 0
-             n_l = 256
+             if name == "out":
+                 n_l = out_n_levels
+             else:
+                 n_l = 256
              requant = 0
              if name == "out":
                  if module:
@@ -175,7 +188,7 @@ def __init__(self, in1_requant : Optional[nn.Module], in2_requant : Optional[nn.


      def forward(self, x1, x2):
-         return self.DORYAdderFun.apply(x1, self.in1_requant, x2, self.in2_requant, self.out_requant)
+         return self.DORYAdderFun.apply(x1, self.in1_requant, x2, self.in2_requant, self.out_requant, self.out_n_levels)


  class DORYReplaceAddersPass(OpTreeReplacementPass):
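For reference, the bounds the new clamp in DORYAdderFun.forward produces for a few output formats (a quick check mirroring the expressions in the diff above):

def clamp_bounds(n_levels: int, signed: bool):
    # same expressions as in DORYAdderFun.forward above
    if signed:
        return -(n_levels // 2), (n_levels - 1) // 2
    return 0, n_levels - 1

assert clamp_bounds(256, signed=True) == (-128, 127)   # signed 8-bit output
assert clamp_bounds(256, signed=False) == (0, 255)     # unsigned 8-bit output
assert clamp_bounds(3, signed=True) == (-1, 1)         # e.g. a ternary output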
23 changes: 0 additions & 23 deletions backends/twn_accelerator/grrules/__init__.py

This file was deleted.
