
feat(framework:cc): Update C++ SDK for Flower 1.27.0 inflatable objects protocol #6913

Open
uncleDecart wants to merge 2 commits into flwrlabs:main from uncleDecart:cpp-update

Conversation

uncleDecart commented Mar 31, 2026

Description

The C++ SDK was stuck on the pre-1.13 protocol and could not communicate with modern Flower SuperLinks. This updates it to support the 1.27.0 "inflatable objects" protocol where message payloads are stored in an ObjectStore and transferred via PullObject/PushObject RPCs.

Key changes:

  • Regenerate proto stubs from current .proto definitions (adds message, recorddict, heartbeat, run, fab protos; removes stale recordset/task)
  • Implement inflate/deflate in serde.cc for bottom-up object reconstruction and top-down serialization with SHA-256 object IDs
  • Rewrite communicator.cc to use PullMessages/PushMessages/PullObject/PushObject/ConfirmMessageReceived Fleet API RPCs
  • Rewrite grpc_rere.cc with ECDSA node authentication and new RPC stubs
  • Update typing.h with Message/RecordDict/Array types matching Python SDK
  • Add Docker quickstart (Dockerfile, docker-compose.yml, pyproject.toml) for easy testing with SuperLink + 2 C++ clients + Python ServerApp
  • Update README with Docker instructions
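
As a rough illustration of the inflate/deflate change listed above, the sketch below flattens a nested object into an ID-keyed store and rebuilds it from that store. Everything in it is hypothetical: `Obj`, `FlatObj`, `ObjectStore`, `deflate`, `inflate` and `object_id` are illustrative names rather than the SDK's API, the flattened layout is an assumption, and `object_id` uses FNV-1a as a stand-in for the SHA-256 digest the PR describes, only to keep the example free of dependencies.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical in-memory object: a payload plus nested children.
struct Obj {
  std::string payload;
  std::vector<Obj> children;
};

// Hypothetical flattened form: the payload plus the IDs of the children.
struct FlatObj {
  std::string payload;
  std::vector<std::string> child_ids;
};

using ObjectStore = std::map<std::string, FlatObj>;

// Stand-in for SHA-256 (assumption: the real SDK hashes with SHA-256).
// FNV-1a over the payload and the child IDs, rendered as 16 hex digits.
std::string object_id(const FlatObj& flat) {
  std::uint64_t h = 14695981039346656037ULL;
  auto mix = [&h](const std::string& s) {
    for (unsigned char c : s) {
      h ^= c;
      h *= 1099511628211ULL;
    }
    h ^= 0xff;  // field separator so ("ab", "c") differs from ("a", "bc")
    h *= 1099511628211ULL;
  };
  mix(flat.payload);
  for (const auto& id : flat.child_ids) mix(id);
  const char* hex = "0123456789abcdef";
  std::string out(16, '0');
  for (int i = 15; i >= 0; --i) {
    out[i] = hex[h & 0xf];
    h >>= 4;
  }
  return out;
}

// Walks from the root down. A parent's ID depends on its children's IDs,
// so each ID is finalized as the recursion returns.
std::string deflate(const Obj& obj, ObjectStore& store) {
  FlatObj flat{obj.payload, {}};
  for (const auto& child : obj.children)
    flat.child_ids.push_back(deflate(child, store));
  std::string id = object_id(flat);
  store[id] = flat;
  return id;
}

// Rebuilds the children before the parent is complete.
// Returns false if any referenced object is missing from the store.
bool inflate(const std::string& id, const ObjectStore& store, Obj& out) {
  auto it = store.find(id);
  if (it == store.end()) return false;
  out.payload = it->second.payload;
  out.children.clear();
  for (const auto& child_id : it->second.child_ids) {
    Obj child;
    if (!inflate(child_id, store, child)) return false;
    out.children.push_back(std::move(child));
  }
  return true;
}
```

Calling `deflate(root, store)` returns the root ID and fills `store` with one entry per distinct object; `inflate(root_id, store, out)` reverses it and reports a missing object by returning `false`, which is the case a caller would need to treat as an incomplete transfer.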

The quickstart example appears to work when run with:

cd examples/quickstart-cpp
docker compose up --build

Notes

  1. I have yet to try anything beyond a toy example (e.g. ggml or something from MLIR); I made the changes to the best of my understanding of the framework.
  2. I know there is an ongoing migration to hub/apps (refactor(examples): Migrate examples to hub/apps & update links #6866), so I'm not sure whether I need to create an app on Flower Hub. Either way, the changes to the C++ libraries are still useful in the framework.

Related issues/PRs

Checklist

  • Implement proposed change
  • Write tests
  • Update documentation
  • Address LLM-reviewer comments, if applicable (e.g., GitHub Copilot)
  • Make CI checks pass
  • Ping maintainers on Slack (channel #contributions)

chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 8ccb6830a4

Comment thread framework/cc/flwr/src/communicator.cc
Comment thread framework/cc/flwr/src/serde.cc Outdated

Copilot AI left a comment

Pull request overview

This PR updates the Flower C++ SDK and quickstart example to interoperate with modern Flower SuperLinks (Flower 1.27.0), including the “inflatable objects” protocol and the Fleet API RPC set.

Changes:

  • Regenerates/updates C++ protobuf and gRPC stubs for newer protocol definitions (message/recorddict/heartbeat/run/fab, updated node/transport).
  • Reworks the C++ client runtime to use Fleet API node lifecycle, message pull/push, object pull/push, and message receipt confirmation.
  • Updates the C++ quickstart example with Docker Compose-based end-to-end setup and a minimal Python app scaffold.

Reviewed changes

Copilot reviewed 40 out of 50 changed files in this pull request and generated 11 comments.

Summary per file:

| File | Description |
| --- | --- |
| framework/cc/flwr/src/start.cc | Client startup flow updated (register/activate/heartbeat thread/message loop/cleanup). |
| framework/cc/flwr/src/message_handler.cc | Switch from TaskIns/TaskRes handling to Message/RecordDict-based handling. |
| framework/cc/flwr/src/grpc_rere.cc | Implements Fleet RPCs and node auth metadata signing. |
| framework/cc/flwr/src/communicator.cc | Implements PullMessages/PushMessages and inflatable object pull/push/inflate/deflate flow. |
| framework/cc/flwr/include/typing.h | Updates SDK typing to RecordDict/Message/Metadata and new record value variants. |
| framework/cc/flwr/include/start.h | Removes legacy transport gRPC include and aligns includes with new flow. |
| framework/cc/flwr/include/serde.h | Adds RecordDict/Message serde APIs + inflatable object helpers. |
| framework/cc/flwr/include/message_handler.h | Updates handler API to handle_message(Message). |
| framework/cc/flwr/include/grpc_rere.h | Updates communicator interface to Fleet RPCs + node auth support. |
| framework/cc/flwr/include/flwr/proto/transport.pb.h | Regenerated transport proto (adds Scalar uint64 oneof arm). |
| framework/cc/flwr/include/flwr/proto/transport.pb.cc | Regenerated transport proto implementation for Scalar changes. |
| framework/cc/flwr/include/flwr/proto/run.grpc.pb.h | Adds run gRPC stub header (regenerated). |
| framework/cc/flwr/include/flwr/proto/run.grpc.pb.cc | Adds run gRPC stub implementation (regenerated). |
| framework/cc/flwr/include/flwr/proto/recorddict.grpc.pb.h | Adds recorddict gRPC stub header (regenerated). |
| framework/cc/flwr/include/flwr/proto/recorddict.grpc.pb.cc | Adds recorddict gRPC stub implementation (regenerated). |
| framework/cc/flwr/include/flwr/proto/node.pb.h | Regenerated node proto (uint64 node_id, adds NodeInfo). |
| framework/cc/flwr/include/flwr/proto/node.pb.cc | Regenerated node proto implementation (adds NodeInfo). |
| framework/cc/flwr/include/flwr/proto/message.grpc.pb.h | Adds message gRPC stub header (regenerated). |
| framework/cc/flwr/include/flwr/proto/message.grpc.pb.cc | Adds message gRPC stub implementation (regenerated). |
| framework/cc/flwr/include/flwr/proto/heartbeat.pb.h | Adds heartbeat proto header (regenerated). |
| framework/cc/flwr/include/flwr/proto/heartbeat.pb.cc | Adds heartbeat proto implementation (regenerated). |
| framework/cc/flwr/include/flwr/proto/heartbeat.grpc.pb.h | Adds heartbeat gRPC stub header (regenerated). |
| framework/cc/flwr/include/flwr/proto/heartbeat.grpc.pb.cc | Fixes heartbeat gRPC stub source/header includes (regenerated). |
| framework/cc/flwr/include/flwr/proto/fab.pb.cc | Adds fab proto implementation (regenerated). |
| framework/cc/flwr/include/flwr/proto/fab.grpc.pb.h | Adds fab gRPC stub header (regenerated). |
| framework/cc/flwr/include/flwr/proto/fab.grpc.pb.cc | Adds fab gRPC stub implementation (regenerated). |
| framework/cc/flwr/include/communicator.h | Updates Communicator interface to Fleet API and Message-based exchange. |
| framework/cc/flwr/CMakeLists.txt | Adds new protos to generation and links ssl/crypto for node auth. |
| examples/quickstart-cpp/src/main.cc | Updates CLI help text and binary name. |
| examples/quickstart-cpp/README.md | Updates instructions (Docker Compose + local run) and binary naming. |
| examples/quickstart-cpp/pyproject.toml | Adds Python project metadata and flwr run app configuration. |
| examples/quickstart-cpp/Dockerfile | Adds Docker image to build C++ client + Python deps for quickstart. |
| examples/quickstart-cpp/docker-compose.yml | Adds Compose setup for SuperLink + two C++ clients + runner. |
| examples/quickstart-cpp/CMakeLists.txt | Updates build to link ssl/crypto and renames executable. |
| examples/quickstart-cpp/client.py | Adds placeholder Python ClientApp to satisfy app config. |

Comment thread examples/quickstart-cpp/client.py Outdated
Comment thread framework/cc/flwr/src/communicator.cc Outdated
Comment thread framework/cc/flwr/src/communicator.cc
Comment thread framework/cc/flwr/src/message_handler.cc Outdated
Comment thread framework/cc/flwr/src/message_handler.cc Outdated
Comment thread framework/cc/flwr/src/grpc_rere.cc
Comment thread framework/cc/flwr/src/grpc_rere.cc
Comment thread framework/cc/flwr/src/grpc_rere.cc Outdated
Comment thread framework/cc/flwr/CMakeLists.txt
Comment thread examples/quickstart-cpp/CMakeLists.txt
github-actions bot added the Contributor label ("Used to determine what PRs (mainly) come from external contributors.") on Mar 31, 2026
- Don't confirm message receipt when objects fail to pull or inflate,
  preventing permanent message loss on incomplete object transfers
- Fix stype parsing in serde.cc that extracted the next JSON key name
  instead of the actual stype value
- Replace uninstantiable NumPyClient placeholder with error-raising
  client_fn in quickstart-cpp example
- Add return value checks for all OpenSSL crypto APIs and validate EC
  private key scalar is in range [1, order-1]
- Return error reply when train/evaluate messages have no content
  instead of silently sending empty responses
- Use sleep_duration from handle_message for reconnect backoff
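
The first fix in the list above (confirming receipt only after objects are pulled and inflated) can be sketched roughly as follows. This is an assumption-laden outline, not the code in communicator.cc: `ReceiveHooks`, `receive_message` and the three callbacks are hypothetical stand-ins for the PullObject RPC, the inflate step and the ConfirmMessageReceived RPC, and the real signatures may differ.

```cpp
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical hooks standing in for PullObject, inflate and
// ConfirmMessageReceived; the SDK's actual interfaces may differ.
struct ReceiveHooks {
  std::function<bool(const std::string& object_id, std::string& bytes)>
      pull_object;
  std::function<bool(const std::vector<std::string>& objects)> inflate;
  std::function<void(const std::string& message_id)> confirm_received;
};

// Returns true only if every object was pulled and the message inflated.
// Receipt is confirmed strictly after both steps succeed, so a failed
// transfer leaves the message unconfirmed instead of losing it.
bool receive_message(const std::string& message_id,
                     const std::vector<std::string>& object_ids,
                     const ReceiveHooks& hooks) {
  std::vector<std::string> objects;
  for (const auto& id : object_ids) {
    std::string bytes;
    if (!hooks.pull_object(id, bytes)) return false;  // no confirmation
    objects.push_back(std::move(bytes));
  }
  if (!hooks.inflate(objects)) return false;  // no confirmation
  hooks.confirm_received(message_id);
  return true;
}
```

Keeping the confirmation as the last statement makes the ordering easy to audit: every early return happens before it, so no failure path can acknowledge a message whose payload never arrived.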

chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 886ac71609

reply.metadata.reply_to_message_id = message.metadata.message_id;
reply.metadata.group_id = message.metadata.group_id;
reply.metadata.message_type = msg_type;
reply.metadata.ttl = 3600.0;

P1: Compute reply TTL from incoming message expiry

Setting every reply to a fixed ttl=3600 can make otherwise valid replies get dropped by SuperLink whenever the incoming message has a shorter remaining lifetime (or is already close to expiry). The reply validator in framework/py/flwr/server/superlink/linkstate/in_memory_linkstate.py rejects replies whose TTL exceeds the maximum allowed window, so this can cause PushMessages to store no result and leave rounds waiting indefinitely. The TTL should be derived from the incoming message (created_at + ttl - now) rather than hardcoded.
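
A minimal sketch of the suggested derivation, assuming timestamps and TTLs are expressed in seconds as doubles. The helper name `reply_ttl` is hypothetical, and clamping at zero for an already expired message is an added assumption rather than something the review specifies.

```cpp
#include <algorithm>

// Remaining lifetime of the incoming message, in seconds:
// created_at + ttl - now, as suggested in the review comment.
// Clamped at zero (assumption) so an expired message never
// yields a negative TTL.
double reply_ttl(double created_at, double ttl, double now) {
  return std::max(0.0, created_at + ttl - now);
}
```

For example, a message created at t = 1000 s with a 3600 s TTL, answered at t = 1600 s, would get a reply TTL of 3000 s instead of the hardcoded 3600 s. Whether the C++ metadata struct exposes a `created_at` field to feed into this is not shown in the snippet under review.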

Comment on lines +46 to +47
if (msg_type == "reconnect") {
keep_going = false;

P1: Return a valid disconnect reply for reconnect messages

The reconnect branch only flips keep_going and never populates reply content/error or sleep duration, so the client immediately exits without sending the expected disconnect acknowledgement payload (and without honoring server-requested reconnect delay). This produces an invalid/empty reconnect reply path and breaks reconnect semantics compared to the protocol’s control-message flow, which can leave the server side without a usable response for that instruction.
