
Remove non-async cluster methods & go full-async#366

Open
gmarcosb wants to merge 9 commits intoproject-chip:mainfrom
gmarcosb:new-tests-2

Conversation

@gmarcosb
Contributor

@gmarcosb gmarcosb commented Jan 27, 2026

Methods were already being wrapped in async helpers to translate the method to async, so this removes a lot of code maintenance

Rust & LLVM should be able to optimize away any async methods which don't actually suspend, but sadly they currently do not; filed rust-lang/rust#152141

Allowing all methods to be async simplifies long-running tasks such as writing to disk, scanning networks, etc.

In C++ this is currently done via a split between request message received + response message sent; the response message (containing e.g. wifi networks) is sent at a later time via an ad hoc callback

Rust having everything be async will provide a consistent & clean interface for all clusters vs hacky callbacks in C++
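
To make the contrast concrete, here is a minimal sketch of a long-running command written as a single async method instead of a request/callback split. The `NetworkCommissioning` trait, `FakeRadio` and `handle_scan_request` names are hypothetical and are not the rs-matter API; the busy-polling `block_on` is only a stand-in for a real executor.

```rust
use std::future::Future;
use std::pin::pin;
use std::task::{Context, Poll, RawWaker, RawWakerVTable, Waker};

/// Hypothetical cluster trait (an assumption for illustration only).
trait NetworkCommissioning {
    /// The response is simply the return value, however long the scan takes.
    async fn scan_networks(&self) -> Vec<&'static str>;
}

struct FakeRadio;

impl NetworkCommissioning for FakeRadio {
    async fn scan_networks(&self) -> Vec<&'static str> {
        // A real implementation would `.await` the radio driver here.
        vec!["home-wifi", "guest-wifi"]
    }
}

/// Waker that does nothing; enough for a busy-polling demo executor.
fn noop_waker() -> Waker {
    fn clone(_: *const ()) -> RawWaker {
        RawWaker::new(std::ptr::null(), &VTABLE)
    }
    fn noop(_: *const ()) {}
    static VTABLE: RawWakerVTable = RawWakerVTable::new(clone, noop, noop, noop);
    // SAFETY: every vtable function ignores the (null) data pointer.
    unsafe { Waker::from_raw(RawWaker::new(std::ptr::null(), &VTABLE)) }
}

/// Polls the future in a loop until it completes (demo only).
fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = noop_waker();
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}

/// The caller sees one call that yields the response; no callback to register.
fn handle_scan_request(radio: &impl NetworkCommissioning) -> Vec<&'static str> {
    block_on(radio.scan_networks())
}
```

Calling `handle_scan_request(&FakeRadio)` yields the two network names directly, which is the "consistent interface" argument in code form.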

@gemini-code-assist
Contributor

Summary of Changes

This pull request transitions the codebase to a fully asynchronous model by removing non-async cluster methods and adapting existing handlers. This change streamlines the interface for long-running tasks and simplifies code maintenance, while leveraging Rust's optimization capabilities to minimize performance impact.

Highlights

  • Async Conversion: Removes non-async cluster methods to fully embrace asynchronous operations.
  • Code Optimization: Simplifies code by removing the need for async helpers, reducing maintenance overhead.
  • Consistent Interface: Provides a consistent and clean interface for all clusters, eliminating ad-hoc callbacks.


pub const fn tlv_element(&self) -> &rs_matter_crate::tlv::TLVElement<'a> {
&self.0
}
pub fn amount(&self) -> Result<rs_matter_crate::im::Money, rs_matter_crate::error::Error> {
Contributor Author

@gmarcosb gmarcosb Jan 27, 2026

@ivmarkov this was weird; unrelated to anything I changed here, almost as though this test wasn't being run in CI? (the change is correct, the type is im::Money not money; just don't understand how this test wasn't failing)

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request successfully refactors the codebase to a fully asynchronous model by removing non-async cluster methods and the Async wrapper. The changes are consistently applied across various files, including handler definitions, adaptor implementations, and example usages. This refactoring simplifies the API, reduces code maintenance, and aligns with the stated goal of providing a consistent and clean interface for all clusters. The transition to async fn and impl Future return types for delegate methods is well-executed, ensuring that the system now operates uniformly in an asynchronous manner. No issues were found that require specific review comments.

@gmarcosb
Contributor Author

PR #366: Size comparison from 9917d2c to 3bc8ec9

Increases above 0.2%:

| platform | target | config | section | 9917d2c | 3bc8ec9 | change | % change |
|---|---|---|---|---|---|---|---|
| (core) | riscv32imac-unknown-none-elf | infodefmt-optz-ltofat | FLASH | 381628 | 403402 | 21774 | 5.7 |
| | thumbv6m-none-eabi | infodefmt-optz-ltofat | FLASH | 321772 | 347052 | 25280 | 7.9 |
| | thumbv7em-none-eabi | infodefmt-optz-ltofat | FLASH | 295332 | 311416 | 16084 | 5.4 |
| | x86_64-unknown-linux-gnu | infologs-optz-ltofat | FLASH | 817087 | 880055 | 62968 | 7.7 |
| dimmable-light | x86_64-unknown-linux-gnu | infologs-optz-ltofat | FLASH | 1873696 | 1950232 | 76536 | 4.1 |
| onoff-light | x86_64-unknown-linux-gnu | infologs-optz-ltofat | FLASH | 1801488 | 1879592 | 78104 | 4.3 |
| onoff-light-bt | x86_64-unknown-linux-gnu | infologs-optz-ltofat | FLASH | 3112760 | 3218656 | 105896 | 3.4 |

Full report (8 builds for (core), dimmable-light, onoff-light, onoff-light-bt, speaker)

| platform | target | config | section | 9917d2c | 3bc8ec9 | change | % change |
|---|---|---|---|---|---|---|---|
| (core) | riscv32imac-unknown-none-elf | infodefmt-optz-ltofat | FLASH | 381628 | 403402 | 21774 | 5.7 |
| | | | RAM | 64904 | 64904 | 0 | 0.0 |
| | thumbv6m-none-eabi | infodefmt-optz-ltofat | FLASH | 321772 | 347052 | 25280 | 7.9 |
| | | | RAM | 61376 | 61376 | 0 | 0.0 |
| | thumbv7em-none-eabi | infodefmt-optz-ltofat | FLASH | 295332 | 311416 | 16084 | 5.4 |
| | | | RAM | 60860 | 60864 | 4 | 0.0 |
| | x86_64-unknown-linux-gnu | infologs-optz-ltofat | FLASH | 817087 | 880055 | 62968 | 7.7 |
| | | | RAM | 64488 | 64488 | 0 | 0.0 |
| dimmable-light | x86_64-unknown-linux-gnu | infologs-optz-ltofat | FLASH | 1873696 | 1950232 | 76536 | 4.1 |
| | | | RAM | 46904 | 46904 | 0 | 0.0 |
| onoff-light | x86_64-unknown-linux-gnu | infologs-optz-ltofat | FLASH | 1801488 | 1879592 | 78104 | 4.3 |
| | | | RAM | 46896 | 46896 | 0 | 0.0 |
| onoff-light-bt | x86_64-unknown-linux-gnu | infologs-optz-ltofat | FLASH | 3112760 | 3218656 | 105896 | 3.4 |
| | | | RAM | 9304 | 9304 | 0 | 0.0 |
| speaker | x86_64-unknown-linux-gnu | infologs-optz-ltofat | FLASH | 800824 | 800312 | -512 | -0.1 |
| | | | RAM | 2832 | 2832 | 0 | 0.0 |

@gmarcosb
Contributor Author

Well damn, there's a size increase 🤔😭

@gmarcosb
Contributor Author

PR #366: Size comparison from 9917d2c to ffa5a5a

Increases above 0.2%:

platform target config section 9917d2c ffa5a5a change % change
(core) riscv32imac-unknown-none-elf infodefmt-optz-ltofat FLASH 381628 403402 21774 5.7
thumbv6m-none-eabi infodefmt-optz-ltofat FLASH 321772 347052 25280 7.9
thumbv7em-none-eabi infodefmt-optz-ltofat FLASH 295332 311416 16084 5.4
x86_64-unknown-linux-gnu infologs-optz-ltofat FLASH 817087 880055 62968 7.7
dimmable-light x86_64-unknown-linux-gnu infologs-optz-ltofat FLASH 1873696 1952120 78424 4.2
onoff-light x86_64-unknown-linux-gnu infologs-optz-ltofat FLASH 1801488 1880120 78632 4.4
onoff-light-bt x86_64-unknown-linux-gnu infologs-optz-ltofat FLASH 3112760 3219184 106424 3.4
Full report (8 builds for (core), dimmable-light, onoff-light, onoff-light-bt, speaker)
platform target config section 9917d2c ffa5a5a change % change
(core) riscv32imac-unknown-none-elf infodefmt-optz-ltofat FLASH 381628 403402 21774 5.7
RAM 64904 64904 0 0.0
thumbv6m-none-eabi infodefmt-optz-ltofat FLASH 321772 347052 25280 7.9
RAM 61376 61376 0 0.0
thumbv7em-none-eabi infodefmt-optz-ltofat FLASH 295332 311416 16084 5.4
RAM 60860 60864 4 0.0
x86_64-unknown-linux-gnu infologs-optz-ltofat FLASH 817087 880055 62968 7.7
RAM 64488 64488 0 0.0
dimmable-light x86_64-unknown-linux-gnu infologs-optz-ltofat FLASH 1873696 1952120 78424 4.2
RAM 46904 46904 0 0.0
onoff-light x86_64-unknown-linux-gnu infologs-optz-ltofat FLASH 1801488 1880120 78632 4.4
RAM 46896 46896 0 0.0
onoff-light-bt x86_64-unknown-linux-gnu infologs-optz-ltofat FLASH 3112760 3219184 106424 3.4
RAM 9304 9304 0 0.0
speaker x86_64-unknown-linux-gnu infologs-optz-ltofat FLASH 800824 800312 -512 -0.1
RAM 2832 2832 0 0.0

@github-actions

PR #366: Size comparison from 9917d2c to 3f627ce

Increases above 0.2%:

platform target config section 9917d2c 3f627ce change % change
(core) riscv32imac-unknown-none-elf infodefmt-optz-ltofat FLASH 381628 403402 21774 5.7
thumbv6m-none-eabi infodefmt-optz-ltofat FLASH 321772 347052 25280 7.9
thumbv7em-none-eabi infodefmt-optz-ltofat FLASH 295332 311416 16084 5.4
x86_64-unknown-linux-gnu infologs-optz-ltofat FLASH 817087 880055 62968 7.7
dimmable-light x86_64-unknown-linux-gnu infologs-optz-ltofat FLASH 1873696 1945304 71608 3.8
onoff-light x86_64-unknown-linux-gnu infologs-optz-ltofat FLASH 1801488 1877832 76344 4.2
onoff-light-bt x86_64-unknown-linux-gnu infologs-optz-ltofat FLASH 3112760 3191648 78888 2.5
Full report (8 builds for (core), dimmable-light, onoff-light, onoff-light-bt, speaker)
platform target config section 9917d2c 3f627ce change % change
(core) riscv32imac-unknown-none-elf infodefmt-optz-ltofat FLASH 381628 403402 21774 5.7
RAM 64904 64904 0 0.0
thumbv6m-none-eabi infodefmt-optz-ltofat FLASH 321772 347052 25280 7.9
RAM 61376 61376 0 0.0
thumbv7em-none-eabi infodefmt-optz-ltofat FLASH 295332 311416 16084 5.4
RAM 60860 60864 4 0.0
x86_64-unknown-linux-gnu infologs-optz-ltofat FLASH 817087 880055 62968 7.7
RAM 64488 64488 0 0.0
dimmable-light x86_64-unknown-linux-gnu infologs-optz-ltofat FLASH 1873696 1945304 71608 3.8
RAM 46904 46904 0 0.0
onoff-light x86_64-unknown-linux-gnu infologs-optz-ltofat FLASH 1801488 1877832 76344 4.2
RAM 46896 46896 0 0.0
onoff-light-bt x86_64-unknown-linux-gnu infologs-optz-ltofat FLASH 3112760 3191648 78888 2.5
RAM 9304 9304 0 0.0
speaker x86_64-unknown-linux-gnu infologs-optz-ltofat FLASH 800824 800312 -512 -0.1
RAM 2832 2832 0 0.0

@ivmarkov
Contributor

> Well damn, there's a size increase 🤔😭

I'm not sure what we should do.

On one hand, 5% to 7% code size increase is not exactly small.

On the other hand, it is not like 20% or 30% and the change simplifies stuff a lot and I really like it.

What do you think?

@ivmarkov
Contributor

On the good side, we see 0% RAM increase (right?).
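
For reference, the "% change" column in the size reports can be recomputed from the two byte counts. The sketch below assumes the report rounds to one decimal place; the `pct_change` helper is a hypothetical name, not part of the CI tooling.

```rust
/// Percentage change from `before` to `after`, rounded to one decimal place.
/// (Rounding to one decimal is an assumption based on how the report prints.)
fn pct_change(before: u64, after: u64) -> f64 {
    let delta = after as f64 - before as f64;
    (delta / before as f64 * 1000.0).round() / 10.0
}

/// FLASH and RAM changes for the (core) thumbv7em-none-eabi build, 9917d2c -> 3bc8ec9.
fn thumbv7em_core_changes() -> (f64, f64) {
    let flash = pct_change(295_332, 311_416); // +16084 bytes of FLASH
    let ram = pct_change(60_860, 60_864); // +4 bytes of RAM
    (flash, ram)
}
```

With the numbers from the first report, the FLASH figures land in the 5% to 8% range for (core), while the RAM delta of 4 bytes rounds to 0.0%.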

@gmarcosb
Contributor Author

@ivmarkov I'm inclined to agree with you, but if you have any ideas about how we can further optimize (you're the pro, I'm still a noob), I think it'd be worthwhile - is there anything I missed? Any nifty tricks that may trigger the compiler to optimize?

In C++, for example, when using templates, the compiler sometimes thinks "I won't inline this, that way it can be efficiently reused!" - but if you force an inline (with __always_inline + forced-template), then it inlines the code & then is smart about optimizing lots of things away (& the end result is far more optimal than a re-use)

Is there anything like that in rust?

My gut feel is that the HandlerAdaptor (what was before called HandlerAsyncAdaptor) methods like read, with the giant match statement, might be what's tripping the compiler in not being able to optimize away the async - wonder if there's anything we can do there to trick it into optimizing
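
A rough sketch of the shape being described, under loud assumptions: `OnOffHandler`, `Adaptor`, `Light` and the attribute IDs below are placeholders for illustration and do not reproduce the generated rs-matter code. Each arm of the `match` awaits a handler method, so every arm contributes an await point to the state machine generated for `read`, even when the handler method never suspends.

```rust
use std::future::Future;
use std::pin::pin;
use std::task::{Context, Poll, RawWaker, RawWakerVTable, Waker};

/// Hypothetical all-async handler trait.
trait OnOffHandler {
    async fn on_off(&self) -> bool;
    async fn start_up_on_off(&self) -> u8;
}

/// Hypothetical adaptor that dispatches attribute reads by ID.
struct Adaptor<T>(T);

impl<T: OnOffHandler> Adaptor<T> {
    async fn read(&self, attr_id: u32) -> Result<u32, u32> {
        match attr_id {
            // Placeholder IDs; one await point per arm.
            0x0000 => Ok(u32::from(self.0.on_off().await)),
            0x4003 => Ok(u32::from(self.0.start_up_on_off().await)),
            unknown => Err(unknown),
        }
    }
}

struct Light;

impl OnOffHandler for Light {
    // Neither method suspends, yet both still return futures.
    async fn on_off(&self) -> bool {
        true
    }
    async fn start_up_on_off(&self) -> u8 {
        2
    }
}

/// Waker that does nothing; enough for a busy-polling demo executor.
fn noop_waker() -> Waker {
    fn clone(_: *const ()) -> RawWaker {
        RawWaker::new(std::ptr::null(), &VTABLE)
    }
    fn noop(_: *const ()) {}
    static VTABLE: RawWakerVTable = RawWakerVTable::new(clone, noop, noop, noop);
    // SAFETY: every vtable function ignores the (null) data pointer.
    unsafe { Waker::from_raw(RawWaker::new(std::ptr::null(), &VTABLE)) }
}

fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = noop_waker();
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}

fn read_attr(attr_id: u32) -> Result<u32, u32> {
    let adaptor = Adaptor(Light);
    let out = block_on(adaptor.read(attr_id));
    out
}
```

Whether the compiler collapses those await points when the handler is trivially ready is exactly the open question in this thread; the sketch only shows where they come from.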

@ivmarkov
Contributor

ivmarkov commented Jan 28, 2026

> My gut feel is that the HandlerAdaptor (what was before called HandlerAsyncAdaptor) methods like read, with the giant match statement, might be what's tripping the compiler in not being able to optimize away the async - wonder if there's anything we can do there to trick it into optimizing

Initially I also assumed - for HandlerAdaptor & similar - that the compiler would optimize out the .await sites.

My thinking was as follows:

  • If an async method does not have even one .await point inside it, then effectively it is not async and the compiler should:
      (a) generate normal code for it
      (b) optimize all .await-calls on that method to be direct calls

BUT: now that I'm thinking about this... this optimization cannot work!
As we have discussed, an async fn foo() {} is just syntactic sugar for fn foo() -> impl Future {}, and that's true even for async methods that don't have a single .await call inside them!

So, the compiler cannot really optimize and turn async fn foo() {} into fn foo() {} because callsites like this:

let fut /*: Future*/ = foo();

... would fail. The compiler must return a "future" over the in-reality-non-awaiting "foo". Even if that future resolves immediately when polled for the first time!

So perhaps this ^^^ is the reason for the code bloat we are seeing?
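
A small sketch of that point: calling an async fn with no `.await` inside still only builds a future, and the body runs when the future is first polled. The `foo` signature and the `call_then_poll_once` helper are made up for the demonstration.

```rust
use std::cell::Cell;
use std::future::Future;
use std::pin::pin;
use std::task::{Context, Poll, RawWaker, RawWakerVTable, Waker};

/// An async fn without a single `.await` inside it.
async fn foo(ran: &Cell<bool>) -> u32 {
    ran.set(true);
    42
}

/// Waker that does nothing; we only poll by hand here.
fn noop_waker() -> Waker {
    fn clone(_: *const ()) -> RawWaker {
        RawWaker::new(std::ptr::null(), &VTABLE)
    }
    fn noop(_: *const ()) {}
    static VTABLE: RawWakerVTable = RawWakerVTable::new(clone, noop, noop, noop);
    // SAFETY: every vtable function ignores the (null) data pointer.
    unsafe { Waker::from_raw(RawWaker::new(std::ptr::null(), &VTABLE)) }
}

/// Returns (did the body run before the first poll?, result of the first poll).
fn call_then_poll_once() -> (bool, Poll<u32>) {
    let ran = Cell::new(false);
    let fut = foo(&ran); // only constructs the future
    let ran_before_poll = ran.get();

    let waker = noop_waker();
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    let first_poll = fut.as_mut().poll(&mut cx); // the body runs here
    (ran_before_poll, first_poll)
}
```

So the call site and the poll are separate steps that the compiler has to see through before `foo().await` can become a plain call.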

Now - and as you say - maybe, if aggressive inlining is turned on (opt-level = 3, BUT we don't want this - we want "optimize for size"), the compiler could inline the call to async foo into the caller and thus get rid of the future? That would have to happen prior to LLVM, because I don't think LLVM "sees" the async stuff at all.

So in theory putting #[inline(always)] on those simple methods that don't .await in the handlers' impls would do?

But then we hit the next bump - we'd have to google it, but there was a bug where #[inline(always)] did not work for methods written using the async syntax. The explanation was that these two forms are equivalent:

async fn foo() -> u32 { 42 }

<=>

fn foo() -> impl Future<Output = u32> { 
    async {
        42 
    }
}

I.e. #[inline(always)] would get rid of the call to foo, but would preserve/move into the caller the

async {
    42
}

which is your Future :( (UPDATE: ... written with the async syntax which is just a way to create anonymous futures = state machines = values implementing the Future trait)

I.e. we don't gain much... because that future needs code, and even if small (because the future resolves right away on the first poll) it would never be just "42".
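
A compilable sketch of the two spellings, with the attribute attached to the outer function in both cases. The names `sugar`, `desugared` and `poll_once` are invented for the example; it only demonstrates that both forms hand back a future that resolves on its first poll, and says nothing about what the optimizer does with them.

```rust
use std::future::Future;
use std::pin::pin;
use std::task::{Context, Poll, RawWaker, RawWakerVTable, Waker};

/// The async-syntax form; the attribute applies to the function that creates the future.
#[inline(always)]
async fn sugar() -> u32 {
    42
}

/// The hand-desugared form; inlining this call still leaves the `async` block behind.
#[inline(always)]
fn desugared() -> impl Future<Output = u32> {
    async { 42 }
}

/// Waker that does nothing; we only poll by hand here.
fn noop_waker() -> Waker {
    fn clone(_: *const ()) -> RawWaker {
        RawWaker::new(std::ptr::null(), &VTABLE)
    }
    fn noop(_: *const ()) {}
    static VTABLE: RawWakerVTable = RawWakerVTable::new(clone, noop, noop, noop);
    // SAFETY: every vtable function ignores the (null) data pointer.
    unsafe { Waker::from_raw(RawWaker::new(std::ptr::null(), &VTABLE)) }
}

/// Polls a future exactly once and reports what it returned.
fn poll_once<F: Future>(fut: F) -> Poll<F::Output> {
    let waker = noop_waker();
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    fut.as_mut().poll(&mut cx)
}
```

Both `poll_once(sugar())` and `poll_once(desugared())` come back ready with 42, but in each case there is still a future value and a `poll` call between the caller and the constant.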

@gmarcosb
Contributor Author

gmarcosb commented Jan 28, 2026

From what I have read, it can optimize - at the LLVM level

Rust cannot optimize obviously because the ABI requires that the method expose a Future

But when LLVM sees that the Future is statically allocated against a no-op future (which is hopefully what rust is doing), i.e. that the impl is simply a completed future with a value, it should be able to optimize that away

So, at link time, as long as rust is outputting the correct types which are basically no-ops, LLVM is supposed to be able to see "oh, this returned future is just a value so let me do away with all the logic"

That's the theory, anyway

Let me try #[inline(always)] & opt-level = 3 - can't hurt!

@ivmarkov
Contributor

> From what I have read, it can optimize - at the LLVM level
>
> Rust cannot optimize obviously because the ABI requires that the method expose a Future
>
> But when LLVM sees that the Future is statically allocated against a no-op future (which is hopefully what rust is doing), i.e. that the impl is simply a completed future with a value, it should be able to optimize that away
>
> So, at link time, as long as rust is outputting the correct types which are basically no-ops, LLVM is supposed to be able to see "oh, this returned future is just a value so let me do away with all the logic"
>
> That's the theory, anyway

Hmmm, but a future that resolves immediately is not a no-op?

It is basically something like this (there is a ready-made helper for that in core::future::ready, https://doc.rust-lang.org/std/future/fn.ready.html):

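use core::future::Future;
use core::pin::Pin;
use core::task::{Context, Poll};

struct MyFuture;

impl Future for MyFuture {
    type Output = u32;

    // `poll` also receives the task context; it is unused because we never return `Pending`.
    fn poll(self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<Self::Output> {
        Poll::Ready(42)
    }
}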

... and then calling .await on that future is essentially something like (skipping much more extra code)

/// A lot of extra code above this match
match fut.as_mut().poll(cx) {
    Poll::Ready(value) => value,
    Poll::Pending => ...
}

So the compiler (or LLVM) will have to inline the call to fut.poll() and then optimize the whole match statement. Whether it does that, I have no idea.
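
To see how much sits between the caller and the constant, here is a runnable sketch of an always-ready future together with a hand-written driver built around that same `match`. The `drive` function is a hypothetical stand-in for what an executor plus `.await` do together; real `.await` code yields back to the executor on `Pending` rather than looping.

```rust
use std::future::Future;
use std::pin::{pin, Pin};
use std::task::{Context, Poll, RawWaker, RawWakerVTable, Waker};

/// A future that is ready on the very first poll.
struct MyFuture;

impl Future for MyFuture {
    type Output = u32;

    fn poll(self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<Self::Output> {
        Poll::Ready(42)
    }
}

/// Waker that does nothing; the driver below simply polls again.
fn noop_waker() -> Waker {
    fn clone(_: *const ()) -> RawWaker {
        RawWaker::new(std::ptr::null(), &VTABLE)
    }
    fn noop(_: *const ()) {}
    static VTABLE: RawWakerVTable = RawWakerVTable::new(clone, noop, noop, noop);
    // SAFETY: every vtable function ignores the (null) data pointer.
    unsafe { Waker::from_raw(RawWaker::new(std::ptr::null(), &VTABLE)) }
}

/// Drives a future to completion by polling it in a loop (demo only).
fn drive<F: Future>(fut: F) -> F::Output {
    let waker = noop_waker();
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    loop {
        match fut.as_mut().poll(&mut cx) {
            Poll::Ready(value) => return value,
            // Not ready yet: give other threads a chance, then poll again.
            Poll::Pending => std::thread::yield_now(),
        }
    }
}
```

For `drive(MyFuture)` to compile down to the literal 42, the optimizer has to inline `poll`, prove the `Pending` arm dead, and drop the pinning and waker setup. `drive(std::future::ready(42u32))` goes through the same path with the standard library's ready future.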

@ivmarkov
Contributor

> So, at link time, as long as rust is outputting the correct types which are basically no-ops, LLVM is supposed to be able to see "oh, this returned future is just a value so let me do away with all the logic"

If such optimizations do happen ^^^, they are definitely not at link time though?

@gmarcosb
Contributor Author

gmarcosb commented Jan 28, 2026

Sorry, both at link time + compile time; the idea (speaking from C++ experience & translating to rust) is that if at link-time you have something like this (bastardized C++/rust code)

struct DoneFuture<T> { fn is_done { true } -> bool }
fn some_fun(...) -> DoneFuture<Result>

...
let result = some_fun(...).await()

Then while linking the library containing some_fun (or maybe this happens while compiling? not sure, but we have a dep on a different crate containing some_fun), the compiler sees "ok so you're doing this await which checks is_done, but statically is_done is always true so I'm just going to do away with all that"

At least, that's the theory, if everything is implemented correctly (but I'm honestly not sure if it's link-time or compile-time or both)

@ivmarkov
Contributor

It is compile time, and the code that the compiler has to "see through" to figure out it is all just the magic constant "42" seems a bit larger to me than in your example. If you look at my MyFuture example you'll see. There are no .awaits that the compiler sees once the future generator has done its thing.

But regardless - apparently this opt does not trigger. :(

@ivmarkov
Contributor

BTW opt-level 3 would be a disaster I bet. It aggressively inlines, but this means code size blow up too.

opt-level 3 is "optimize for speed", not for size.

@gmarcosb
Contributor Author

PR #366: Size comparison from 9917d2c to ab79ead

Increases above 0.2%:

| platform | target | config | section | 9917d2c | ab79ead | change | % change |
|---|---|---|---|---|---|---|---|
| (core) | riscv32imac-unknown-none-elf | infodefmt-optz-ltofat | FLASH | 381628 | 609208 | 227580 | 59.6 |
| | | | RAM | 64904 | 65168 | 264 | 0.4 |
| | thumbv6m-none-eabi | infodefmt-optz-ltofat | FLASH | 321772 | 526568 | 204796 | 63.6 |
| | thumbv7em-none-eabi | infodefmt-optz-ltofat | FLASH | 295332 | 516980 | 221648 | 75.1 |
| | x86_64-unknown-linux-gnu | infologs-optz-ltofat | FLASH | 817087 | 1203783 | 386696 | 47.3 |
| dimmable-light | x86_64-unknown-linux-gnu | infologs-optz-ltofat | FLASH | 1873696 | 1945320 | 71624 | 3.8 |
| onoff-light | x86_64-unknown-linux-gnu | infologs-optz-ltofat | FLASH | 1801488 | 1877816 | 76328 | 4.2 |
| onoff-light-bt | x86_64-unknown-linux-gnu | infologs-optz-ltofat | FLASH | 3112760 | 3191640 | 78880 | 2.5 |

Full report (8 builds for (core), dimmable-light, onoff-light, onoff-light-bt, speaker)

| platform | target | config | section | 9917d2c | ab79ead | change | % change |
|---|---|---|---|---|---|---|---|
| (core) | riscv32imac-unknown-none-elf | infodefmt-optz-ltofat | FLASH | 381628 | 609208 | 227580 | 59.6 |
| | | | RAM | 64904 | 65168 | 264 | 0.4 |
| | thumbv6m-none-eabi | infodefmt-optz-ltofat | FLASH | 321772 | 526568 | 204796 | 63.6 |
| | | | RAM | 61376 | 61392 | 16 | 0.0 |
| | thumbv7em-none-eabi | infodefmt-optz-ltofat | FLASH | 295332 | 516980 | 221648 | 75.1 |
| | | | RAM | 60860 | 60864 | 4 | 0.0 |
| | x86_64-unknown-linux-gnu | infologs-optz-ltofat | FLASH | 817087 | 1203783 | 386696 | 47.3 |
| | | | RAM | 64488 | 64480 | -8 | -0.0 |
| dimmable-light | x86_64-unknown-linux-gnu | infologs-optz-ltofat | FLASH | 1873696 | 1945320 | 71624 | 3.8 |
| | | | RAM | 46904 | 46904 | 0 | 0.0 |
| onoff-light | x86_64-unknown-linux-gnu | infologs-optz-ltofat | FLASH | 1801488 | 1877816 | 76328 | 4.2 |
| | | | RAM | 46896 | 46896 | 0 | 0.0 |
| onoff-light-bt | x86_64-unknown-linux-gnu | infologs-optz-ltofat | FLASH | 3112760 | 3191640 | 78880 | 2.5 |
| | | | RAM | 9304 | 9304 | 0 | 0.0 |
| speaker | x86_64-unknown-linux-gnu | infologs-optz-ltofat | FLASH | 800824 | 800312 | -512 | -0.1 |
| | | | RAM | 2832 | 2832 | 0 | 0.0 |

@ivmarkov
Contributor

I might be missing something, but "link time optimizations" are basically "remove dead code". But this is code which never gets called basically. I'm not sure the linker does code optimizations - just dead code removal. And then also a bit - which linker?

@gmarcosb
Contributor Author

50% size increase with opt-level = 3 so that's a big no on that one 🤣 (noting here for posterity in case we ever want to try again in future)

@github-actions

PR #366: Size comparison from 9917d2c to 93e70a9

Increases above 0.2%:

platform target config section 9917d2c 93e70a9 change % change
(core) riscv32imac-unknown-none-elf infodefmt-optz-ltofat FLASH 381628 403402 21774 5.7
thumbv6m-none-eabi infodefmt-optz-ltofat FLASH 321772 347060 25288 7.9
thumbv7em-none-eabi infodefmt-optz-ltofat FLASH 295332 311416 16084 5.4
x86_64-unknown-linux-gnu infologs-optz-ltofat FLASH 817087 880055 62968 7.7
dimmable-light x86_64-unknown-linux-gnu infologs-optz-ltofat FLASH 1873696 1945320 71624 3.8
onoff-light x86_64-unknown-linux-gnu infologs-optz-ltofat FLASH 1801488 1877816 76328 4.2
onoff-light-bt x86_64-unknown-linux-gnu infologs-optz-ltofat FLASH 3112760 3191640 78880 2.5
Full report (8 builds for (core), dimmable-light, onoff-light, onoff-light-bt, speaker)
platform target config section 9917d2c 93e70a9 change % change
(core) riscv32imac-unknown-none-elf infodefmt-optz-ltofat FLASH 381628 403402 21774 5.7
RAM 64904 64904 0 0.0
thumbv6m-none-eabi infodefmt-optz-ltofat FLASH 321772 347060 25288 7.9
RAM 61376 61376 0 0.0
thumbv7em-none-eabi infodefmt-optz-ltofat FLASH 295332 311416 16084 5.4
RAM 60860 60860 0 0.0
x86_64-unknown-linux-gnu infologs-optz-ltofat FLASH 817087 880055 62968 7.7
RAM 64488 64488 0 0.0
dimmable-light x86_64-unknown-linux-gnu infologs-optz-ltofat FLASH 1873696 1945320 71624 3.8
RAM 46904 46904 0 0.0
onoff-light x86_64-unknown-linux-gnu infologs-optz-ltofat FLASH 1801488 1877816 76328 4.2
RAM 46896 46896 0 0.0
onoff-light-bt x86_64-unknown-linux-gnu infologs-optz-ltofat FLASH 3112760 3191640 78880 2.5
RAM 9304 9304 0 0.0
speaker x86_64-unknown-linux-gnu infologs-optz-ltofat FLASH 800824 800312 -512 -0.1
RAM 2832 2832 0 0.0

@gmarcosb
Contributor Author

#[inline(always)] seems to have made no difference; size deltas remain the same

@gmarcosb gmarcosb force-pushed the new-tests-2 branch 2 times, most recently from 53f9e62 to 22f25ab Compare January 28, 2026 20:33
@gmarcosb
Contributor Author

PR #366: Size comparison from 9917d2c to 22f25ab

Increases above 0.2%:

platform target config section 9917d2c 22f25ab change % change
(core) riscv32imac-unknown-none-elf infodefmt-optz-ltofat FLASH 381628 403402 21774 5.7
thumbv6m-none-eabi infodefmt-optz-ltofat FLASH 321772 347060 25288 7.9
thumbv7em-none-eabi infodefmt-optz-ltofat FLASH 295332 311416 16084 5.4
x86_64-unknown-linux-gnu infologs-optz-ltofat FLASH 817087 880055 62968 7.7
dimmable-light x86_64-unknown-linux-gnu infologs-optz-ltofat FLASH 1873696 1945320 71624 3.8
onoff-light x86_64-unknown-linux-gnu infologs-optz-ltofat FLASH 1801488 1877816 76328 4.2
onoff-light-bt x86_64-unknown-linux-gnu infologs-optz-ltofat FLASH 3112760 3191640 78880 2.5
Full report (8 builds for (core), dimmable-light, onoff-light, onoff-light-bt, speaker)
platform target config section 9917d2c 22f25ab change % change
(core) riscv32imac-unknown-none-elf infodefmt-optz-ltofat FLASH 381628 403402 21774 5.7
RAM 64904 64904 0 0.0
thumbv6m-none-eabi infodefmt-optz-ltofat FLASH 321772 347060 25288 7.9
RAM 61376 61376 0 0.0
thumbv7em-none-eabi infodefmt-optz-ltofat FLASH 295332 311416 16084 5.4
RAM 60860 60860 0 0.0
x86_64-unknown-linux-gnu infologs-optz-ltofat FLASH 817087 880055 62968 7.7
RAM 64488 64488 0 0.0
dimmable-light x86_64-unknown-linux-gnu infologs-optz-ltofat FLASH 1873696 1945320 71624 3.8
RAM 46904 46904 0 0.0
onoff-light x86_64-unknown-linux-gnu infologs-optz-ltofat FLASH 1801488 1877816 76328 4.2
RAM 46896 46896 0 0.0
onoff-light-bt x86_64-unknown-linux-gnu infologs-optz-ltofat FLASH 3112760 3191640 78880 2.5
RAM 9304 9304 0 0.0
speaker x86_64-unknown-linux-gnu infologs-optz-ltofat FLASH 800824 800312 -512 -0.1
RAM 2832 2832 0 0.0

@gmarcosb
Contributor Author

gmarcosb commented Jan 28, 2026

Played around a bit in rust playground... didn't find anything interesting, though, the ASM view for release in the playground does seem to indicate that the async is adding a bit...

Not sure if playground runs through LLVM
https://play.rust-lang.org/?version=stable&mode=release&edition=2024&gist=a85aa2a4b26df392e8d9942de01647ad

@gmarcosb
Contributor Author

Played around with core::future::ready, returning it from the ClusterHandler impl - made the size worse 🤣🤦‍♂️

@gmarcosb gmarcosb force-pushed the new-tests-2 branch 2 times, most recently from 50050d1 to 1f796c8 Compare January 29, 2026 23:28
… AsyncClusterHandler

This allows for a cluster to easily combine both sync methods + async methods, while reducing the size of the binary
@gmarcosb
Contributor Author

PR #366: Size comparison from 9917d2c to 1f796c8

Increases above 0.2%:

platform target config section 9917d2c 1f796c8 change % change
(core) riscv32imac-unknown-none-elf infodefmt-optz-ltofat FLASH 381628 396334 14706 3.9
thumbv6m-none-eabi infodefmt-optz-ltofat FLASH 321772 339720 17948 5.6
thumbv7em-none-eabi infodefmt-optz-ltofat FLASH 295332 306276 10944 3.7
x86_64-unknown-linux-gnu infologs-optz-ltofat FLASH 817087 861431 44344 5.4
dimmable-light x86_64-unknown-linux-gnu infologs-optz-ltofat FLASH 1873696 1921216 47520 2.5
onoff-light x86_64-unknown-linux-gnu infologs-optz-ltofat FLASH 1801488 1856336 54848 3.0
onoff-light-bt x86_64-unknown-linux-gnu infologs-optz-ltofat FLASH 3112760 3167864 55104 1.8
Full report (8 builds for (core), dimmable-light, onoff-light, onoff-light-bt, speaker)
platform target config section 9917d2c 1f796c8 change % change
(core) riscv32imac-unknown-none-elf infodefmt-optz-ltofat FLASH 381628 396334 14706 3.9
RAM 64904 64904 0 0.0
thumbv6m-none-eabi infodefmt-optz-ltofat FLASH 321772 339720 17948 5.6
RAM 61376 61368 -8 -0.0
thumbv7em-none-eabi infodefmt-optz-ltofat FLASH 295332 306276 10944 3.7
RAM 60860 60856 -4 -0.0
x86_64-unknown-linux-gnu infologs-optz-ltofat FLASH 817087 861431 44344 5.4
RAM 64488 64488 0 0.0
dimmable-light x86_64-unknown-linux-gnu infologs-optz-ltofat FLASH 1873696 1921216 47520 2.5
RAM 46904 46904 0 0.0
onoff-light x86_64-unknown-linux-gnu infologs-optz-ltofat FLASH 1801488 1856336 54848 3.0
RAM 46896 46896 0 0.0
onoff-light-bt x86_64-unknown-linux-gnu infologs-optz-ltofat FLASH 3112760 3167864 55104 1.8
RAM 9304 9304 0 0.0
speaker x86_64-unknown-linux-gnu infologs-optz-ltofat FLASH 800824 800312 -512 -0.1
RAM 2832 2832 0 0.0

@gmarcosb
Contributor Author

Welp, the latest round of using core::future::ready is definitely better... but still an increase
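
One way this kind of mix can be expressed, as a hedged sketch rather than the PR's actual traits: the trait method returns `impl Future`, a handler that never suspends returns `core::future::ready(..)`, and a handler that needs to wait is written as an `async fn`. The `LevelRead`, `SyncCluster` and `AsyncCluster` names are hypothetical.

```rust
use std::future::{ready, Future};
use std::pin::pin;
use std::task::{Context, Poll, RawWaker, RawWakerVTable, Waker};

/// Hypothetical cluster trait: callers always `.await`, implementors choose how.
trait LevelRead {
    fn read_level(&self) -> impl Future<Output = u8>;
}

struct SyncCluster {
    level: u8,
}

impl LevelRead for SyncCluster {
    // No state machine written by hand: the value is wrapped in an already-completed future.
    fn read_level(&self) -> impl Future<Output = u8> {
        ready(self.level)
    }
}

struct AsyncCluster {
    level: u8,
}

impl LevelRead for AsyncCluster {
    // An `async fn` in the impl satisfies the `-> impl Future` signature in the trait.
    async fn read_level(&self) -> u8 {
        // A real handler would `.await` storage or a driver here.
        self.level
    }
}

/// Waker that does nothing; enough for a busy-polling demo executor.
fn noop_waker() -> Waker {
    fn clone(_: *const ()) -> RawWaker {
        RawWaker::new(std::ptr::null(), &VTABLE)
    }
    fn noop(_: *const ()) {}
    static VTABLE: RawWakerVTable = RawWakerVTable::new(clone, noop, noop, noop);
    // SAFETY: every vtable function ignores the (null) data pointer.
    unsafe { Waker::from_raw(RawWaker::new(std::ptr::null(), &VTABLE)) }
}

fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = noop_waker();
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}

/// Generic caller: identical code path for both kinds of handler.
fn read_via<H: LevelRead>(handler: &H) -> u8 {
    block_on(handler.read_level())
}
```

Per the size reports in this thread, this style reduced the FLASH increase but did not eliminate it, so the sketch illustrates the API shape only.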

@github-actions

PR #366: Size comparison from 9917d2c to 528fe92

Increases above 0.2%:

platform target config section 9917d2c 528fe92 change % change
(core) riscv32imac-unknown-none-elf infodefmt-optz-ltofat FLASH 381628 396334 14706 3.9
thumbv6m-none-eabi infodefmt-optz-ltofat FLASH 321772 339720 17948 5.6
thumbv7em-none-eabi infodefmt-optz-ltofat FLASH 295332 306276 10944 3.7
x86_64-unknown-linux-gnu infologs-optz-ltofat FLASH 817087 861431 44344 5.4
dimmable-light x86_64-unknown-linux-gnu infologs-optz-ltofat FLASH 1873696 1921216 47520 2.5
onoff-light x86_64-unknown-linux-gnu infologs-optz-ltofat FLASH 1801488 1856336 54848 3.0
onoff-light-bt x86_64-unknown-linux-gnu infologs-optz-ltofat FLASH 3112760 3167864 55104 1.8
Full report (8 builds for (core), dimmable-light, onoff-light, onoff-light-bt, speaker)
platform target config section 9917d2c 528fe92 change % change
(core) riscv32imac-unknown-none-elf infodefmt-optz-ltofat FLASH 381628 396334 14706 3.9
RAM 64904 64904 0 0.0
thumbv6m-none-eabi infodefmt-optz-ltofat FLASH 321772 339720 17948 5.6
RAM 61376 61368 -8 -0.0
thumbv7em-none-eabi infodefmt-optz-ltofat FLASH 295332 306276 10944 3.7
RAM 60860 60856 -4 -0.0
x86_64-unknown-linux-gnu infologs-optz-ltofat FLASH 817087 861431 44344 5.4
RAM 64488 64488 0 0.0
dimmable-light x86_64-unknown-linux-gnu infologs-optz-ltofat FLASH 1873696 1921216 47520 2.5
RAM 46904 46904 0 0.0
onoff-light x86_64-unknown-linux-gnu infologs-optz-ltofat FLASH 1801488 1856336 54848 3.0
RAM 46896 46896 0 0.0
onoff-light-bt x86_64-unknown-linux-gnu infologs-optz-ltofat FLASH 3112760 3167864 55104 1.8
RAM 9304 9304 0 0.0
speaker x86_64-unknown-linux-gnu infologs-optz-ltofat FLASH 800824 800312 -512 -0.1
RAM 2832 2832 0 0.0

@github-actions

github-actions bot commented Feb 4, 2026

Note from @gmarcosb: This is the size report with my attempted optimizations to force the compiler to see the future as ready & not generate unnecessary .await machinery; it failed

PR #366: Size comparison from 9917d2c to 11ab6a1

Increases above 0.2%:

platform target config section 9917d2c 11ab6a1 change % change
(core) riscv32imac-unknown-none-elf infodefmt-optz-ltofat FLASH 381628 399402 17774 4.7
thumbv6m-none-eabi infodefmt-optz-ltofat FLASH 321772 342028 20256 6.3
thumbv7em-none-eabi infodefmt-optz-ltofat FLASH 295332 306592 11260 3.8
x86_64-unknown-linux-gnu infologs-optz-ltofat FLASH 817087 876999 59912 7.3
dimmable-light x86_64-unknown-linux-gnu infologs-optz-ltofat FLASH 1873696 1940792 67096 3.6
onoff-light x86_64-unknown-linux-gnu infologs-optz-ltofat FLASH 1801488 1871864 70376 3.9
onoff-light-bt x86_64-unknown-linux-gnu infologs-optz-ltofat FLASH 3112760 3210760 98000 3.1
Full report (8 builds for (core), dimmable-light, onoff-light, onoff-light-bt, speaker)
platform target config section 9917d2c 11ab6a1 change % change
(core) riscv32imac-unknown-none-elf infodefmt-optz-ltofat FLASH 381628 399402 17774 4.7
RAM 64904 64936 32 0.0
thumbv6m-none-eabi infodefmt-optz-ltofat FLASH 321772 342028 20256 6.3
RAM 61376 61408 32 0.1
thumbv7em-none-eabi infodefmt-optz-ltofat FLASH 295332 306592 11260 3.8
RAM 60860 60896 36 0.1
x86_64-unknown-linux-gnu infologs-optz-ltofat FLASH 817087 876999 59912 7.3
RAM 64488 64488 0 0.0
dimmable-light x86_64-unknown-linux-gnu infologs-optz-ltofat FLASH 1873696 1940792 67096 3.6
RAM 46904 46904 0 0.0
onoff-light x86_64-unknown-linux-gnu infologs-optz-ltofat FLASH 1801488 1871864 70376 3.9
RAM 46896 46896 0 0.0
onoff-light-bt x86_64-unknown-linux-gnu infologs-optz-ltofat FLASH 3112760 3210760 98000 3.1
RAM 9304 9304 0 0.0
speaker x86_64-unknown-linux-gnu infologs-optz-ltofat FLASH 800824 800392 -432 -0.1
RAM 2832 2832 0 0.0

…s things worse rather than better

This reverts commit 11ab6a1.
@gmarcosb
Contributor Author

gmarcosb commented Feb 4, 2026

Reverted failed optimization attempt (size report showed an increase); leaving it in the commit history just in case it's useful in the future

I've filed with rust to see if this is an optimization they could work on:

https://internals.rust-lang.org/t/async-await-optimizations-could-make-language-even-more-powerful/23973

rust-lang/rust#152141

As it stands, I think this is as optimal as we're going to get

The question now is: are we willing to take a 5% size increase hit to make our API flexible for all the endpoints which may be async?

I think it's worthwhile because:

  1. The true percentage is probably far smaller: these binaries aren't done yet, and in a real-world impl with networking & such, the % is likely to be smaller
  2. If rust folks ever take the time to optimize, we get immediate savings while keeping a clean API

@gmarcosb gmarcosb marked this pull request as ready for review February 4, 2026 21:59
@github-actions

github-actions bot commented Feb 4, 2026

PR #366: Size comparison from 9917d2c to c505ffd

Increases above 0.2%:

| platform | target | config | section | 9917d2c | c505ffd | change | % change |
|---|---|---|---|---|---|---|---|
| (core) | riscv32imac-unknown-none-elf | infodefmt-optz-ltofat | FLASH | 381628 | 396334 | 14706 | 3.9 |
| | thumbv6m-none-eabi | infodefmt-optz-ltofat | FLASH | 321772 | 339712 | 17940 | 5.6 |
| | thumbv7em-none-eabi | infodefmt-optz-ltofat | FLASH | 295332 | 306272 | 10940 | 3.7 |
| | x86_64-unknown-linux-gnu | infologs-optz-ltofat | FLASH | 817087 | 861431 | 44344 | 5.4 |
| dimmable-light | x86_64-unknown-linux-gnu | infologs-optz-ltofat | FLASH | 1873696 | 1921200 | 47504 | 2.5 |
| onoff-light | x86_64-unknown-linux-gnu | infologs-optz-ltofat | FLASH | 1801488 | 1856320 | 54832 | 3.0 |
| onoff-light-bt | x86_64-unknown-linux-gnu | infologs-optz-ltofat | FLASH | 3112760 | 3167984 | 55224 | 1.8 |

Full report (8 builds for (core), dimmable-light, onoff-light, onoff-light-bt, speaker)

| platform | target | config | section | 9917d2c | c505ffd | change | % change |
|---|---|---|---|---|---|---|---|
| (core) | riscv32imac-unknown-none-elf | infodefmt-optz-ltofat | FLASH | 381628 | 396334 | 14706 | 3.9 |
| | | | RAM | 64904 | 64904 | 0 | 0.0 |
| | thumbv6m-none-eabi | infodefmt-optz-ltofat | FLASH | 321772 | 339712 | 17940 | 5.6 |
| | | | RAM | 61376 | 61376 | 0 | 0.0 |
| | thumbv7em-none-eabi | infodefmt-optz-ltofat | FLASH | 295332 | 306272 | 10940 | 3.7 |
| | | | RAM | 60860 | 60864 | 4 | 0.0 |
| | x86_64-unknown-linux-gnu | infologs-optz-ltofat | FLASH | 817087 | 861431 | 44344 | 5.4 |
| | | | RAM | 64488 | 64488 | 0 | 0.0 |
| dimmable-light | x86_64-unknown-linux-gnu | infologs-optz-ltofat | FLASH | 1873696 | 1921200 | 47504 | 2.5 |
| | | | RAM | 46904 | 46904 | 0 | 0.0 |
| onoff-light | x86_64-unknown-linux-gnu | infologs-optz-ltofat | FLASH | 1801488 | 1856320 | 54832 | 3.0 |
| | | | RAM | 46896 | 46896 | 0 | 0.0 |
| onoff-light-bt | x86_64-unknown-linux-gnu | infologs-optz-ltofat | FLASH | 3112760 | 3167984 | 55224 | 1.8 |
| | | | RAM | 9304 | 9304 | 0 | 0.0 |
| speaker | x86_64-unknown-linux-gnu | infologs-optz-ltofat | FLASH | 800824 | 800392 | -432 | -0.1 |
| | | | RAM | 2832 | 2832 | 0 | 0.0 |
