Skip to content

Formalize ITestSuite extension contract with registry pattern and fix run_test.sh/extract_data/generate_index bugs#26

Open
Copilot wants to merge 24 commits intomainfrom
copilot/formalize-extension-contract-ittest-suite
Open

Formalize ITestSuite extension contract with registry pattern and fix run_test.sh/extract_data/generate_index bugs#26
Copilot wants to merge 24 commits intomainfrom
copilot/formalize-extension-contract-ittest-suite

Conversation

Copy link
Contributor

Copilot AI commented Mar 14, 2026

Receiver IP resolution logic was duplicated across 5 test suites, client/server defaulting across 4, and adding a new test tool required editing the load_test_tool() if/elif cascade in runner.py. This PR centralizes shared logic, introduces a registry pattern, and documents the extension contract.

Base class helpers (src/testsuites/test.py)

  • PROXY_PROTOCOLS frozenset — single source of truth for proxy-style protocols (KCP, QUIC)
  • resolve_receiver(network, proto_info) → (ip, port) — protocol-aware receiver resolution replacing duplicated logic in IperfTest, RTTTest, ScpTest, SSHPingTest, RegularTest
  • _default_client_server(network) — replaces identical 3-line block in 4 suites

Registry-based dispatch

  • @register_test_suite decorator + from_tool_dict() classmethod on each suite
  • load_test_suite_from_registry() — exact match first, then contains match, returns None for fallback
  • load_test_tool() in runner.py reduced from 30-line if/elif cascade to registry delegation

Adding a new tool (e.g. quic_perf) becomes a single-file change:

@register_test_suite('quic_perf', test_type=TestType.throughput)
class QuicPerfTest(ITestSuite):
    @classmethod
    def from_tool_dict(cls, tool, test_name, root_path):
        config = TestConfig(name=tool['name'], ..., test_type=TestType.throughput)
        return cls(config)

    def _run_test(self, network, proto_info):
        receiver_ip, receiver_port = self.resolve_receiver(network, proto_info)
        ...

All 7 concrete suites refactored

Each now uses resolve_receiver() / _default_client_server(), has @register_test_suite, and owns its YAML→config mapping via from_tool_dict().

SSHPingTest timeout fix

The old load_test_tool() had no case for 'sshping', so it fell through to RegularTest which ran the binary in background with & and a kill-after-timeout. The registry now correctly routes to SSHPingTest, which previously ran sshping synchronously via cmd() (blocking indefinitely if SSH handshake was slow). SSHPingTest now runs sshping in background with & and uses _wait_and_kill() to enforce a configurable timeout (interval_num × interval + 1 seconds), with proper cleanup on all hosts in the multi-host loop.

Tests

39 new unit tests covering registry resolution, receiver IP routing for TCP/KCP/QUIC/BTP, from_tool_dict() for all suite classes, and load_test_tool() integration.

Documentation

docs/test_suite_refactoring.md updated with implementation status (§9).

Bug fixes in run_test.sh

Three pre-existing bugs were discovered and fixed when adding the Performance matrix CI step:

  • Flag files never deleted: delete_flag_files() prepended $current_base_path/ to variables that already contained the full absolute path from init_all_variables(), creating a doubled path that never matched. Both test.failed and test.success could accumulate across sequential CI test runs.
  • Test failures never detected: Mismatched quoting on the flag-file check line embedded ] || [ ! -f ... into the -f filename argument, making it a single [ -f ] test on a nonsensical path. Since the mangled path never exists, the script always fell through to the "success" branch regardless of actual test outcome.
  • Post-processing output in wrong directory: When test_name is not provided via CLI (the default in CI), extract_data.py and generate_index.py received test_results/ instead of per-test subdirectories like test_results/test1000/. Fixed by discovering test subdirectories automatically when test_name is empty.

Bug fixes in extract_data.py

  • Multi-protocol support: With multiple target protocols (e.g. [btp, brtp_proxy, kcp, tcp-bbr]), the old code collected all perf data into a single flat list and wrote one table where only the first match per (loss_rate, latency) was emitted — mixing protocol data or silently dropping entries. Now groups data by protocol and generates a separate CSV section + markdown table per protocol, each under a ### protocol_name heading.
  • Column misalignment: When a (loss_rate, latency) cell had no matching data, no cell was emitted at all, shifting subsequent columns left. Fixed by writing empty placeholder cells for missing data points.

Bug fixes in generate_index.py

  • Broken topology anchor links: Topology section headings used id='0. topology' (display name) but the throughput table linked to href='#topology-0'. Fixed by using the directory name topology-N as the id attribute while keeping the human-readable display name as heading text.
  • Multi-table rendering: Updated render_throughput_table() with a new split_md_sections() helper to parse the multi-section markdown format (### headings followed by tables), rendering each protocol's data as a separate HTML table with its own <h3> heading.

💬 Send tasks to Copilot coding agent from Slack and Teams to turn conversations into code. Copilot posts an update in your thread when it's finished.

Copilot AI self-assigned this Mar 14, 2026
Co-authored-by: penglei0 <101799503+penglei0@users.noreply.github.com>
Copilot AI changed the title [WIP] Refactor to formalize extension contract for ITestSuite docs: ITestSuite extension contract analysis and refactoring plan Mar 14, 2026
Copilot AI requested a review from penglei0 March 14, 2026 08:49
Copilot AI and others added 4 commits March 14, 2026 09:13
…ern to ITestSuite

- Add resolve_receiver() and _default_client_server() helpers to ITestSuite base class
- Add @register_test_suite decorator and load_test_suite_from_registry()
- Add from_tool_dict() to all test suites (IperfTest, IperfBatsTest, RTTTest, PingTest, ScpTest, SSHPingTest, RegularTest)
- Refactor all test suites to use resolve_receiver() and _default_client_server()
- Replace load_test_tool() if/elif cascade with registry-based lookup
- Add PROXY_PROTOCOLS constant for proxy protocol detection
- Improve docstrings across all test suites

Co-authored-by: penglei0 <101799503+penglei0@users.noreply.github.com>
…and integration

Co-authored-by: penglei0 <101799503+penglei0@users.noreply.github.com>
Co-authored-by: penglei0 <101799503+penglei0@users.noreply.github.com>
Co-authored-by: penglei0 <101799503+penglei0@users.noreply.github.com>
Copilot AI changed the title docs: ITestSuite extension contract analysis and refactoring plan Formalize ITestSuite extension contract with registry pattern Mar 14, 2026
@penglei0 penglei0 marked this pull request as ready for review March 14, 2026 09:23
Copilot AI review requested due to automatic review settings March 14, 2026 09:23
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR formalizes the ITestSuite extension contract by centralizing shared receiver-resolution/defaulting logic and replacing runner.load_test_tool()’s if/elif dispatch with a registry-based lookup, with accompanying documentation and unit tests.

Changes:

  • Added shared helpers to ITestSuite (PROXY_PROTOCOLS, resolve_receiver(), _default_client_server()) and introduced a test-suite registry + loader.
  • Refactored concrete test suites to register themselves and to construct configs via from_tool_dict().
  • Updated runner.load_test_tool() to delegate to the registry and added unit tests + documentation describing the contract.

Reviewed changes

Copilot reviewed 11 out of 12 changed files in this pull request and generated 8 comments.

Show a summary per file
File Description
src/testsuites/test.py Adds shared helpers and implements registry + registry loader.
src/core/runner.py Switches tool loading to registry-based dispatch and adds module imports intended to trigger registration.
src/testsuites/test_iperf.py Registers iperf suite, adds from_tool_dict(), and uses shared receiver/defaulting helpers.
src/testsuites/test_iperf_bats.py Registers bats_iperf suite and adds from_tool_dict() + defaulting helper.
src/testsuites/test_rtt.py Registers rtt suite, adds from_tool_dict(), and uses shared helpers.
src/testsuites/test_scp.py Registers scp suite, adds from_tool_dict(), and uses shared receiver resolution.
src/testsuites/test_sshping.py Registers sshping suite and adds from_tool_dict() + receiver resolution.
src/testsuites/test_ping.py Registers ping suite and adds from_tool_dict().
src/testsuites/test_regular.py Adds from_tool_dict() and uses shared receiver resolution in single-pair mode.
src/testsuites/tests/test_suite_unittest.py Adds unit/integration tests for registry behavior, helpers, and from_tool_dict() mappings.
src/testsuites/tests/init.py Package marker for testsuites tests.
docs/test_suite_refactoring.md Adds/updates documentation describing the extension contract and refactor status.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

You can also share your feedback on Copilot code review. Take the survey.

penglei0 and others added 7 commits March 14, 2026 17:37
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Copilot AI and others added 2 commits March 14, 2026 09:44
load_test_suite_from_registry() now emits a logging.warning() when a
matched registry entry class does not provide from_tool_dict(), instead
of silently returning None. This surfaces registration mistakes early.

Also fixes pylint import-order and unused-variable warnings in runner.py.

Co-authored-by: penglei0 <101799503+penglei0@users.noreply.github.com>
…guard

The _TESTSUITES_DISCOVERED module-level flag is a valid pattern for
lazy initialization; suppress the W0603 warning inline to keep CI green.

Co-authored-by: penglei0 <101799503+penglei0@users.noreply.github.com>
@penglei0
Copy link
Owner

Co-authored-by: penglei0 <101799503+penglei0@users.noreply.github.com>
Copy link
Contributor Author

Copilot AI commented Mar 14, 2026

@copilot fix ci failure https://github.com/penglei0/oasis/actions/runs/23085967864/job/67062761446?pr=26#step:6:32

Fixed in 00fbd44. The CI failure was the new warning-log unit test on Python 3.8, so I made the assertion use assertLogs(level='WARNING'), which keeps the behavior under test the same and makes the capture work reliably in Actions. Full unittest discovery passes locally, and CodeQL found no alerts.

@penglei0
Copy link
Owner

@copilot investigate potential bugs in run_test.sh/extract_data.py;

Copilot AI and others added 2 commits March 14, 2026 13:18
…and post-processing directory

Co-authored-by: penglei0 <101799503+penglei0@users.noreply.github.com>
Co-authored-by: penglei0 <101799503+penglei0@users.noreply.github.com>
Copy link
Contributor Author

Copilot AI commented Mar 14, 2026

@copilot investigate potential bugs in run_test.sh/extract_data.py;

  • reproduce the bug with "bats-protocol-rtt-loss-test.yaml", it will test the performance matrix;
  • when test is d...

Fixed in c2b53a9. Found and fixed three bugs in run_test.sh:

Bug 1 — Flag files never deleted: delete_flag_files() prepended $current_base_path/ to variables that already contained the full absolute path (from init_all_variables()), creating a doubled path like /workspace//workspace/./test_results/test.failed that never matches. This is why both test.failed and test.success accumulate across runs.

Bug 2 — Test failures never detected: The quoting on line 223 ("${current_base_path}/"$test_failed_flag_file" ] || [ ...) causes the shell to embed ] || [ ! -f ... into the filename argument of a single [ -f ] test. Since the mangled path never exists, -f always returns false and the script always falls through to the "success" branch — regardless of actual test outcome.

Bug 3 — CSV/MD/index.html in wrong directory: When test_name is empty (CI calls run_test.sh bats-protocol-rtt-loss-test.yaml without :test_name), test_result_dir becomes test_results/ instead of test_results/test1000/. extract_data.py and generate_index.py then write their output to the root test_results/ directory. Fixed by discovering test subdirectories when test_name is empty.

Copilot AI changed the title Formalize ITestSuite extension contract with registry pattern Formalize ITestSuite extension contract with registry pattern and fix run_test.sh bugs Mar 14, 2026
@penglei0
Copy link
Owner

@copilot investigate potential bugs in run_test.sh/extract_data.py;

  • reproduce the bug with "bats-protocol-rtt-loss-test.yaml", it will test the performance matrix;
  • when test is d...

Fixed in c2b53a9. Found and fixed three bugs in run_test.sh:

Bug 1 — Flag files never deleted: delete_flag_files() prepended $current_base_path/ to variables that already contained the full absolute path (from init_all_variables()), creating a doubled path like /workspace//workspace/./test_results/test.failed that never matches. This is why both test.failed and test.success accumulate across runs.

Bug 2 — Test failures never detected: The quoting on line 223 ("${current_base_path}/"$test_failed_flag_file" ] || [ ...) causes the shell to embed ] || [ ! -f ... into the filename argument of a single [ -f ] test. Since the mangled path never exists, -f always returns false and the script always falls through to the "success" branch — regardless of actual test outcome.

Bug 3 — CSV/MD/index.html in wrong directory: When test_name is empty (CI calls run_test.sh bats-protocol-rtt-loss-test.yaml without :test_name), test_result_dir becomes test_results/ instead of test_results/test1000/. extract_data.py and generate_index.py then write their output to the root test_results/ directory. Fixed by discovering test subdirectories when test_name is empty.

@copilot last commit fail the CI, see details in https://github.com/penglei0/oasis/actions/runs/23088751295/job/67069842797?pr=26#step:9:94

Copilot AI and others added 2 commits March 14, 2026 13:35
…gularTest with timeout)

Co-authored-by: penglei0 <101799503+penglei0@users.noreply.github.com>
…s[1]

Co-authored-by: penglei0 <101799503+penglei0@users.noreply.github.com>
Copy link
Contributor Author

Copilot AI commented Mar 14, 2026

@copilot investigate potential bugs in run_test.sh/extract_data.py;

  • reproduce the bug with "bats-protocol-rtt-loss-test.yaml", it will test the performance matrix;
    ...

Fixed in 189871f and 18fc0af.

Root cause: the old load_test_tool() had no case for 'sshping', so it fell through to RegularTest. RegularTest runs the command in background with & and kills it after interval_num × interval + 1 seconds (11s for the sshping config). The registry pattern correctly routes to SSHPingTest, but SSHPingTest._run_test() ran sshping synchronously via cmd() (no &), which blocked until the test runner killed it as "stuck" after 40s.

Fix: SSHPingTest now runs sshping in the background with & and uses _wait_and_kill() to enforce a timeout based on the configured interval, matching RegularTest's behavior. The multi-host loop also now properly cleans up all hosts (not just hosts[1]).

@penglei0
Copy link
Owner

@copilot Do the fix for the following issues:

  • Fix the bug in generate_index.py which gen the wrong url when click the number in the throughput table, for example, the incorrect url "test1000/index.html#topology-0";
  • extract_data.py works only when process the single protocol performance matrix test like bats-protocol-rtt-loss-test.yaml; but when process the tests with multiple target protocols, it generate the incorrect markdown table; with multiple target protocols, extract_data.py should generate a table for each protocol.

…protocol support

Co-authored-by: penglei0 <101799503+penglei0@users.noreply.github.com>
Copy link
Contributor Author

Copilot AI commented Mar 14, 2026

@copilot Do the fix for the following issues:

  • Fix the bug in generate_index.py which gen the wrong url when click the number in the throughput table, for example, the incorrect url...

Fixed in 884340b. Three issues addressed:

  1. generate_index.py broken URLs: Topology heading id was set to display name (e.g., 0. topology) but throughput table links used href='#topology-0'. Fixed by using t.name (topology-0) as the anchor ID.

  2. extract_data.py multi-protocol: Now groups perf data by protocol and generates a separate CSV section + markdown table per protocol (each under a ### protocol_name heading). Also fixed a pre-existing column misalignment bug where missing cells were silently skipped instead of emitting empty placeholders.

  3. generate_index.py multi-table rendering: render_throughput_table() now handles the new multi-section markdown format via split_md_sections(), rendering each protocol's table with its own <h3> heading.

Copilot AI changed the title Formalize ITestSuite extension contract with registry pattern and fix run_test.sh bugs Formalize ITestSuite extension contract with registry pattern and fix run_test.sh/extract_data/generate_index bugs Mar 14, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants