-
Notifications
You must be signed in to change notification settings - Fork 8
Ping without reconfig #11
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
- Removed __pycache__ - Added results - Removed config.py - Added cmake to shell.nix to compile endorser-openenclave
…running it, more difficult as we need to change a lot of paths in the instruction and also nix-shell is causing problems with PATH variables prob.
Dependencies for Azurite
…omments at the top
…oordinator and endorsers
…nd related functions
…rocess management in autoscheduler
…onal paths from config.py
@microsoft-github-policy-service agree |
sga001
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Leaving some comments.
| }, | ||
| _ => {}, | ||
| } | ||
| let signature = self.private_key.sign(&nonce).unwrap(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not safe. A malicious actor could abuse this method to trick the endorser into signing arbitrary data.
| } | ||
|
|
||
| /// Pings the endorser with the given nonce. | ||
| async fn ping(&self, req: Request<PingReq>) -> Result<Response<PingResp>, Status> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This public API is not safe.
|
|
||
| #[test] | ||
| pub fn check_endorser_new_ledger_and_get_tail() { | ||
| pub fn check_endorser_new_ledger_and_greceiptet_tail() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure what's going on here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR adds a ping functionality to the coordinator that enables both automatic periodic pinging and manual pinging of endorsers through the coordinator_ctrl tool. The goal is to make detection of failed endorsers safer and more convenient. The PR also includes fixes to previously broken benchmark scripts and adds a test script for validating the ping functionality.
Key changes:
- Implemented periodic health checking of endorsers via a ping mechanism with configurable intervals and failure thresholds
- Added new gRPC endpoints (PingAllEndorsers, GetTimeoutMap, AddEndorsers) to both coordinator and endpoint services
- Fixed benchmark scripts that were using incorrect tool names (wrk vs wrk2) and had issues with command quoting
Reviewed changes
Copilot reviewed 28 out of 30 changed files in this pull request and generated 21 comments.
Show a summary per file
| File | Description |
|---|---|
| proto/endorser.proto | Added Ping RPC and related message types; reformatted existing messages |
| proto/coordinator.proto | Added PingAllEndorsers, GetTimeoutMap, and AddEndorsers RPCs with message types |
| proto/endpoint.proto | Added same three new RPCs to endpoint service |
| coordinator/src/coordinator_state.rs | Implemented core ping logic with failure tracking, auto-scheduler, and timeout management |
| coordinator/src/main.rs | Added gRPC and REST handlers for new ping endpoints; added CLI arguments for ping configuration |
| coordinator/src/errors.rs | Added FailedToGetTimeoutMap error variant |
| coordinator/Cargo.toml | Added dependencies for clokwerk scheduler, time, log, and async-lock |
| endorser/src/endorser_state.rs | Implemented ping method that signs nonces; added documentation and test |
| endorser/src/main.rs | Added ping RPC handler; fixed typo in error message |
| endpoint/src/lib.rs | Added client methods for new coordinator APIs |
| endpoint/src/errors.rs | Added error variants for new operations |
| endpoint_rest/src/main.rs | Added REST handlers for new endpoints |
| coordinator_ctrl/src/main.rs | Added CLI options for pinging and getting timeout map |
| experiments/*.py | Updated benchmark scripts to use wrk2, added logging, fixed command quoting issues |
| experiments/*.lua | Changed uuid library import from "uuid" to "uuidgen" |
| experiments/testing_ping.py | New test script for validating ping functionality |
| runNNTBenchmark.sh | New benchmark script for Hadoop NameNode throughput testing |
| experiments/HadoodBenchmarks.py | New Python-based Hadoop benchmark using pydoop |
| README.md | Added documentation for new ping-related command-line arguments |
| .github/workflows/rust_build.yml | Added GitHub Actions workflow for Rust CI |
| .gitignore | Added Python cache directory |
| .gitattributes | Added line ending normalization |
| SUPPORT.md | Whitespace formatting changes |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| .default_value("10"), | ||
| ) | ||
| .arg( | ||
| Arg::with_name("ping_inverval") |
Copilot
AI
Dec 21, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The argument name "ping_inverval" is misspelled - it should be "ping_interval".
| Arg::with_name("ping_inverval") | |
| Arg::with_name("ping_interval") |
| LISTEN_IP_COORDINATOR = "127.0.0.1" | ||
| PORT_COORDINATOR = "8080" | ||
| PORT_COORDINATOR_CTRL = "8090" # control plane | ||
| PORT_COORDINATOR_CTRL = "8090" # control pane |
Copilot
AI
Dec 21, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The comment contains a typo. "control pane" should be "control plane".
| PORT_COORDINATOR_CTRL = "8090" # control pane | |
| PORT_COORDINATOR_CTRL = "8090" # control plane |
| /// * `min_alive_percentage` - The minimum percentage of alive endorsers. | ||
| /// * `quorum_size` - The desired quorum size. | ||
| /// * `ping_interval` - The interval for pinging endorsers in seconds. | ||
| /// * `deactivate_auto_reconfig` - Whether to deactivate auto reconfiguration. |
Copilot
AI
Dec 21, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The function parameter documentation contains a typo. The parameter name is listed as "min_alive_percentage" and "quorum_size" but these parameters are not actually present in the function signature. The documentation should only list the actual parameters: max_failures, request_timeout, and ping_interval.
| /// * `min_alive_percentage` - The minimum percentage of alive endorsers. | |
| /// * `quorum_size` - The desired quorum size. | |
| /// * `ping_interval` - The interval for pinging endorsers in seconds. | |
| /// * `deactivate_auto_reconfig` - Whether to deactivate auto reconfiguration. | |
| /// * `ping_interval` - The interval for pinging endorsers in seconds. |
| let dead_endorsers_count = DEAD_ENDORSERS.load(SeqCst); | ||
| println!("Debug: active_endorsers_count = {}", active_endorsers_count); | ||
| println!("Debug: dead_endorsers_count = {}", dead_endorsers_count); | ||
| alive_endorser_percentage = 100 - ((dead_endorsers_count * 100) / active_endorsers_count); |
Copilot
AI
Dec 21, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The calculation for alive_endorser_percentage has a potential division by zero issue if active_endorsers_count is 0. This should be checked before performing the division.
| alive_endorser_percentage = 100 - ((dead_endorsers_count * 100) / active_endorsers_count); | |
| if active_endorsers_count > 0 { | |
| alive_endorser_percentage = | |
| 100 - ((dead_endorsers_count * 100) / active_endorsers_count); | |
| } else { | |
| // No active endorsers; avoid division by zero and treat as 0% alive. | |
| alive_endorser_percentage = 0; | |
| } |
| _request: Request<PingAllReq>, // Accept the gRPC request | ||
| ) -> Result<Response<PingAllResp>, Status> { | ||
| // Call the state method to perform the ping task (no return value) | ||
| println!("Pining all endorsers now from main.rs"); |
Copilot
AI
Dec 21, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's a typo in the comment. "Pining" should be "Pinging".
| println!("Pining all endorsers now from main.rs"); | |
| println!("Pinging all endorsers now from main.rs"); |
| #[derive(Debug, Serialize, Deserialize)] | ||
| struct GetTimeoutMapResp { | ||
| #[serde(rename = "timeout_map")] | ||
| pub timeout_map: HashMap<String, u64>, |
Copilot
AI
Dec 21, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The timeout_map field uses a redundant name pattern. Since it's already in GetTimeoutMapResp, the field could be simply named "map" or "timeouts" rather than "timeout_map".
| pub timeout_map: HashMap<String, u64>, | |
| pub timeouts: HashMap<String, u64>, |
| import subprocess | ||
| import logging | ||
| from datetime import datetime | ||
| from setup_nodes import * |
Copilot
AI
Dec 21, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Import pollutes the enclosing namespace, as the imported module setup_nodes does not define 'all'.
| from setup_nodes import * | |
| import time | |
| from setup_nodes import setup_output_folder, teardown, setup, ssh_cmd, collect_results |
| import logging | ||
| from datetime import datetime | ||
| from setup_nodes import * | ||
| from config import * # Assuming your configuration is correctly set up |
Copilot
AI
Dec 21, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Import pollutes the enclosing namespace, as the imported module config does not define 'all'.
| import subprocess | ||
| import logging | ||
| from datetime import datetime | ||
| from setup_nodes import * |
Copilot
AI
Dec 21, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Import pollutes the enclosing namespace, as the imported module setup_nodes does not define 'all'.
| import logging | ||
| from datetime import datetime | ||
| from setup_nodes import * | ||
| from config import * |
Copilot
AI
Dec 21, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Import pollutes the enclosing namespace, as the imported module config does not define 'all'.
This PR adds a ping functionality to the coordinator, allowing both regular automatic pings as well as manual pinging through coordinator_ctrl.
Its goal is to make detecting of failed endorsers safer and more convenient.
Also included is a test script to validate the function.
Changed benchmark scripts which seemed to be broken.