wip: groupcast support for node addressing #383
shreyash-b wants to merge 15 commits into project-chip:main
Conversation
Summary of Changes (Gemini Code Assist): This pull request introduces group addressing support to the rs-matter stack, aligning with the Matter Specification 1.4.2. It includes the implementation of the Groups cluster, enhancements to group key management, and IPv6 multicast registration. These changes enable devices to be controlled and managed in groups, improving the overall functionality and scalability of the rs-matter ecosystem.
Activity
Code Review
This pull request introduces groupcast support, which is a significant feature. The changes are extensive and touch many parts of the codebase, from the data model and clusters to the transport layer. The implementation correctly handles groupcast-specific logic, such as suppressing responses and handling ephemeral group sessions. The addition of the Groups and GroupKeyManagement cluster implementations is thorough. I've found a few issues that should be addressed.
(force-pushed from 69b8241 to 61f40c4)
rs-matter/src/group_keys.rs (outdated)
```rust
pub struct GrpKeySetEntry {
    pub group_key_set_id: u16,
    pub group_key_security_policy: u8,
    pub epoch_key0: OctetsOwned<16>,
```
Rather than hard-coding 16, ideally we should use the AEAD_CANON_KEY_LEN const instead.
Also, unless there is a good reason not to (please explain) rather than OctetsOwned we should use CanonAeadKey instead.
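A minimal sketch of what the suggested change could look like, using stand-in definitions (`AEAD_CANON_KEY_LEN` and `CanonAeadKey` are assumed here to be a 16-byte AES key length and alias; the actual rs-matter definitions may differ):

```rust
// Hypothetical stand-ins for rs-matter's crypto types; names follow the
// review suggestion, the value is assumed (Matter's AES-CCM keys are 16 bytes).
pub const AEAD_CANON_KEY_LEN: usize = 16;
pub type CanonAeadKey = [u8; AEAD_CANON_KEY_LEN];

pub struct GrpKeySetEntry {
    pub group_key_set_id: u16,
    pub group_key_security_policy: u8,
    // Named const / alias instead of a hard-coded 16, per the review comment
    pub epoch_key0: CanonAeadKey,
}

impl GrpKeySetEntry {
    pub fn key_len(&self) -> usize {
        self.epoch_key0.len()
    }
}
```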
```rust
    group_name: heapless::String<16>,
}
```

```rust
let mut groups: heapless::Vec<GroupInfo, 24> = heapless::Vec::new();
```
This is a big on-stack allocation (~1K). We should try to avoid it. We have a similar - solved - case in `Cluster::encode_generated_command_ids`, which might be even a bit more complex than what is required here (as it did require returning sorted data)? In any case, the absolute minimum is to annotate the large allocation with a TODO comment as I do elsewhere (i.e. `// TODO: LARGE BUFFER`) so that we don't forget it. But then again, I think it is not overly difficult to get by without this on-stack intermediate type?
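For illustration, a sketch of the "encode as you iterate" idea, using hypothetical stand-in types (`GroupInfo`, `Encoder` and `encode_groups` are not rs-matter APIs; the real code would write into a TLV writer):

```rust
// Stand-in types for the sketch; not rs-matter's actual API.
struct GroupInfo {
    group_id: u16,
    name: &'static str,
}

struct Encoder {
    out: Vec<u8>, // stand-in for a TLV writer
}

impl Encoder {
    fn encode(&mut self, g: &GroupInfo) {
        self.out.extend_from_slice(&g.group_id.to_le_bytes());
        self.out.extend_from_slice(g.name.as_bytes());
    }
}

// Instead of materializing all groups in an on-stack heapless::Vec and then
// encoding the whole Vec, walk the source iterator and encode each entry as
// it is produced, so no intermediate buffer is needed.
fn encode_groups<'a, I: Iterator<Item = &'a GroupInfo>>(groups: I, enc: &mut Encoder) -> usize {
    let mut count = 0;
    for g in groups {
        enc.encode(g);
        count += 1;
    }
    count
}
```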
```rust
// Validate EpochKey0 length must be 16
if epoch_key_0_val.0.len() != 16 {
```
Another magic constant here.
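A hedged sketch of the suggested validation against a named constant rather than a magic 16 (`AEAD_CANON_KEY_LEN` and `KeyError` are illustrative names, not the crate's actual types):

```rust
// Assumed key length for the sketch; the real crate would expose this const.
const AEAD_CANON_KEY_LEN: usize = 16;

#[derive(Debug, PartialEq)]
enum KeyError {
    ConstraintError,
}

// Reject epoch keys whose length differs from the canonical AEAD key length.
fn check_epoch_key(key: &[u8]) -> Result<(), KeyError> {
    if key.len() != AEAD_CANON_KEY_LEN {
        return Err(KeyError::ConstraintError);
    }
    Ok(())
}
```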
```rust
pub has_epoch_key2: bool,
pub epoch_key2: OctetsOwned<16>,
pub epoch_start_time2: u64,
}
```
Not strictly necessary to do immediately (we can just leave a TODO to implement this later / if ever) - but can't we simplify by modeling the keys with a `Vec<GroupEpochKeyEntry, MAX_KEYS = 3>`, where

```rust
struct GroupEpochKeyEntry {
    key: CanonAeadKey,
    start_time: u64,
}
```

?
rs-matter/src/lib.rs (outdated)
```rust
    mut network: N,
) -> ! {
    const MAX_ADDRS: usize = MAX_FABRICS * MAX_GROUPS_PER_FABRIC;
    let mut joined: heapless::Vec<Ipv6Addr, MAX_ADDRS> = heapless::Vec::new();
```
Another large array... here the problem is a bit worse as the method is async, so joined will end up being part of the future, blowing up a bit our .bss RAM consumption.
Would be interesting to see by how much (but for that I need the CI on this PR to pass).
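To illustrate why this matters specifically for an async fn: any local that is live across an `.await` is stored inside the future itself, so its size is paid wherever the future is placed (e.g. a statically allocated task), not just transiently on the stack. A self-contained sketch with an assumed, smaller sizing:

```rust
// Hypothetical sizing for illustration only (the real MAX_FABRICS *
// MAX_GROUPS_PER_FABRIC product in rs-matter may differ).
const MAX_ADDRS: usize = 5 * 8;

async fn run_with_big_local() {
    // Stand-in for `heapless::Vec<Ipv6Addr, MAX_ADDRS>`: an Ipv6Addr is 16 bytes.
    let joined = [0u8; MAX_ADDRS * 16];
    // An await point: everything live across it must be stored in the future.
    core::future::ready(()).await;
    // Use the buffer after the await so it is live across the suspension.
    core::hint::black_box(&joined);
}

// The future's size includes the whole buffer, inflating whatever memory
// (task arena, static, etc.) ends up holding the future.
fn future_size() -> usize {
    core::mem::size_of_val(&run_with_big_local())
}
```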
```rust
/// A trait to listen for IPv6 multicast on supported network types
```
Shall we constrain this to just IPv6?

While the Matter Core spec talks (or used to talk) about "IPv6 only", in reality the C++ SDK supports matter-over-IPv4 communication as well, including for multicast mDNS.

So I wonder, why not `NetworkMulticast` instead, and then just join / leave taking regular `IpAddr`s?

Of course that would only work if the matter spec (or the C++ SDK) defines an IPv4 multicast addr...
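A rough sketch of what such a protocol-agnostic trait could look like (the trait shape, method names, and the toy implementation are assumptions for discussion, not rs-matter's final API):

```rust
use std::net::IpAddr;

// Sketch of the reviewer's suggestion: a `NetworkMulticast` trait taking
// `IpAddr`, so an IPv4-capable stack could join IPv4 groups too.
trait NetworkMulticast {
    type Error;
    fn join(&mut self, addr: IpAddr) -> Result<(), Self::Error>;
    fn leave(&mut self, addr: IpAddr) -> Result<(), Self::Error>;
}

// Toy implementation that just records memberships.
struct RecordingNet {
    joined: Vec<IpAddr>,
}

impl NetworkMulticast for RecordingNet {
    type Error = ();

    fn join(&mut self, addr: IpAddr) -> Result<(), ()> {
        // Only accept actual multicast addresses (ff00::/8 or 224.0.0.0/4).
        if addr.is_multicast() {
            self.joined.push(addr);
            Ok(())
        } else {
            Err(())
        }
    }

    fn leave(&mut self, addr: IpAddr) -> Result<(), ()> {
        self.joined.retain(|a| *a != addr);
        Ok(())
    }
}
```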
```diff
  pub fn get_dst_unicast_nodeid(&self) -> Option<u64> {
-     if self.flags.contains(MsgFlags::DSIZ_UNICAST_NODEID) {
+     if self.flags.bits() & 0x03 == MsgFlags::DSIZ_UNICAST_NODEID.bits() {
```
```diff
  pub fn get_dst_groupcast_nodeid(&self) -> Option<u16> {
-     if self.flags.contains(MsgFlags::DSIZ_GROUPCAST_NODEID) {
+     if self.flags.bits() & 0x03 == MsgFlags::DSIZ_GROUPCAST_NODEID.bits() {
```
```rust
pub fn is_group_session(&self) -> bool {
    self.sec_flags & 0x01 != 0
```
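The change from a `contains`-style test to an exact match matters because DSIZ is a 2-bit enumeration in the message flags, not independent flag bits: a bitflags `contains` check is also true when both bits are set. A minimal illustration with plain constants (field values assumed per the Matter message-flags layout: low 2 bits are DSIZ, 1 = 64-bit node ID, 2 = 16-bit group ID):

```rust
// Assumed layout of the DSIZ field for this illustration.
const DSIZ_MASK: u8 = 0b0000_0011;
const DSIZ_UNICAST_NODEID: u8 = 0b01;
const DSIZ_GROUPCAST_NODEID: u8 = 0b10;

// `contains`-style check: wrong for a multi-bit enumeration, because it also
// matches the (reserved) value 0b11 where both bits happen to be set.
fn is_unicast_contains_style(flags: u8) -> bool {
    flags & DSIZ_UNICAST_NODEID != 0
}

// Exact match on the masked field: only the unicast-node-ID encoding passes.
fn is_unicast_exact(flags: u8) -> bool {
    flags & DSIZ_MASK == DSIZ_UNICAST_NODEID
}
```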
```rust
} else {
    Err(ErrorCode::InvalidAAD)?;
}
// AAD: the unencrypted header of this packet (variable length)
```
Please do NOT delete "// TODO: Get rid of the temporary buffers". It is still valid.
It is looking quite nice already.
Yes, I've returned comments on the spec, though I think once CI is passing, I'll do another pass.
We don't really have a mutable notion of an "endpoint".

One option is to do exactly what you did - keep the groups stuff inside the fabric. Another option is to just have a top-level `Groups` type.

Need to check whether groups support was mandatory in the spec? If it wasn't, that's another argument for going with a top-level `Groups` type.

Also, it is not strictly true that a PSM only stores the fabric manager - the networks of the Wifi/Thread management clusters also need to be persisted, yet they are not stored as part of the fabric.

Perhaps you can think about the pros and cons and then we decide?
This won't run for the BTP protocol (BLE) and is the reason the CI fails.
Quoting from Matter Core Spec 1.4.2, section 2.11.1.2:
Interpreting it literally, the groups support does seem optional. Not sure this will affect usage in real life in real ecosystems. About having a global Groups type, I'll try creating and passing it down to FabricMgr from Matter::init(), feels like that should do it.
True, but having all the fabric-related info loaded from/to FabricMgr seems logical, and it's one less point of failure.
Sounds good, will try that. Also, I'm assuming this PR will be squash-merged, so I'm spamming commits; please LMK if you prefer the changes squashed into earlier commits.
I'm most afraid about the price we'll pay in terms of extra RAM necessary (and to a lesser extent extra flash).
That's not the idea. If you do that, we can just as well live with the current arrangement, which is equivalent. Rather, if we go with a global `Groups` type...
Now - whether the above nirvana is possible - I don't know, but that's what I had in mind. The problematic points would likely be the places where we encrypt/decrypt packets, i.e. at least the sessions should be somehow aware of them damn group keys, right? This is pretty much equivalent to how the...
See above - that's why I wanted the...
Also true. But then you will always have each fabric's footprint extended with the groups Vec, regardless of whether the user wants groups support or not. But I guess we can't have both simplicity and perfect memory footprint. As I said - if you explore the "Groups as a separate type, and then somehow encryption/decryption of group packets solved (?!)" idea just a little bit more, we'll soon-ish know whether it is worth it or not at all.
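As a thought experiment, here is a sketch of what a fabric-decoupled `Groups` type could look like, with membership keyed by fabric index so the session layer could consult it when decrypting a groupcast packet. All names and capacities below are hypothetical, not rs-matter's API:

```rust
// Assumed capacity for the sketch.
const MAX_GROUPS: usize = 8;

#[derive(Clone, Copy, PartialEq)]
struct GroupEntry {
    fabric_idx: u8,
    group_id: u16,
}

// Group state lives outside the Fabric struct; applications that don't use
// groups simply never instantiate it, so they don't pay its footprint.
struct Groups {
    entries: [Option<GroupEntry>; MAX_GROUPS], // fixed-capacity, no_std-friendly
}

impl Groups {
    fn new() -> Self {
        Self { entries: [None; MAX_GROUPS] }
    }

    fn add(&mut self, fabric_idx: u8, group_id: u16) -> Result<(), ()> {
        let slot = self.entries.iter_mut().find(|e| e.is_none()).ok_or(())?;
        *slot = Some(GroupEntry { fabric_idx, group_id });
        Ok(())
    }

    // The session layer would consult this for incoming groupcast traffic.
    fn contains(&self, fabric_idx: u8, group_id: u16) -> bool {
        self.entries
            .iter()
            .flatten()
            .any(|e| e.fabric_idx == fabric_idx && e.group_id == group_id)
    }
}
```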
No no, we'll squash-merge at the end, no worries about that.
(force-pushed from 866d609 to a056169)
PR #383: Size comparison from 63631cd to a056169 Increases above 0.2%:
Full report (8 builds for (core), dimmable-light, onoff-light, onoff-light-bt, speaker)
CI is passing, you can check now (the numbers do seem not-so-good 😬).
This would be better than the current arrangement because currently everything consumes memory whether the application uses it or not.
I agree that the application shouldn't pay for a feature it does not use.

Yes, I do have a feeling this should be possible - at the cost of increasing the complexity a bit. The major problem is that the sizing should be tied to the number of endpoints implementing the groups cluster, which is kinda tricky. And also the message decryption, as you mentioned.
Indeed, 10% in RAM usage just for groups is NOK.
Just keep in mind that optimizing temporary buffers which live on stack would not help the reported numbers (but would help ram usage anyway as we won't have to account for (a) larger stack size).
Interesting. For the examples, it is basically almost a no-go - 20% RAM increase just for groups?
If not possible, then we'll have to keep the groups stuff in the fabrics, and optimize the existing layout.
…and enable the test
- Change trait name from NetworkIPv6Multicast to `NetworkMulticast` - Update UDP implementation to support IPv4 as well as IPv6 for multicast
…lticast - Matter::run() takes the multicast address as an Option<> argument - refactoring of the watch_group_membership function
This reverts commit cbd7d8f.
…n, so reduce the wasted space, and also update the session manager to build the operational key every time instead of precomputing it
(force-pushed from bd38c07 to 1326e03)
PR #383: Size comparison from 96b3d51 to 1326e03 Increases above 0.2%:
Full report (8 builds for (core), dimmable-light, onoff-light, onoff-light-bt, speaker)
@shreyash-b Thanks for trying out an alternative approach. If you look here, it will no longer be possible to just have an additional field, so it needs to be passed to the cluster explicitly. I think we really have to choose between two extremes:
Hmm, makes sense. I was just trying to explore what can be done to minimize the impact on users who are not using Groups; it wasn't done yet.
Maybe we can achieve (1) by passing it from Matter::run(), and the Groups/GroupKeyManagement cluster can access it as part of its context argument. Will try that next.
This is currently a draft PR for supporting group addressing functionality as per Matter Spec 1.4.2.
Raising this to get review comments on the implementation so far (it's almost complete), and to ask a few queries.
Group node addressing works using chip-tool.
The TestGroupKeyManagementCluster, TestGroupsCluster and TestGroupMessaging YAML tests from ConnectedHomeIP are passing as well.

However, I have a few doubts for finalising the implementation:
Where should the group_id -> endpoint mapping be stored?
Matter Core Spec 2.11.1.2 requires supporting a minimum of 4 group table entries per endpoint. Where should these be maintained?
Currently these are being aggregated in the Fabric struct - across all the endpoints. It requires pre-allocating the space for the GroupTableEntry, which needs to be maintained individually (but still related to the number of registered endpoints). Ideally these should be stored in the Endpoints struct - per-endpoint, but this makes storing them using the PSM complicated, since it only stores the fabric manager.
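For discussion, a rough sketch of one possible shape for the group-to-endpoint mapping (field names loosely follow the spec's GroupTable; nothing here is rs-matter's actual layout, and `Vec` is used purely for illustration):

```rust
// Hypothetical entry: one group ID mapped to the endpoints that joined it.
struct GroupTableEntry {
    group_id: u16,
    // The spec requires at least 4 group table entries per endpoint; this
    // aggregated shape instead records, per group, which endpoints are members.
    endpoints: Vec<u16>,
}

// Look up the member endpoints for a given group ID.
fn endpoints_for_group(table: &[GroupTableEntry], group_id: u16) -> Option<&[u16]> {
    table
        .iter()
        .find(|e| e.group_id == group_id)
        .map(|e| e.endpoints.as_slice())
}
```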
How should registering for multicast IPv6 be handled?
I've defined a trait for IPv6 multicast registration, which is enforced for the recv network interface used in Matter::run(). This may or may not be ideal (since other traits are more transport-specific than IP).