@jmthomas jmthomas commented Jan 18, 2026

Summary

This PR adds comprehensive throughput testing infrastructure and fixes a critical Python telemetry performance bottleneck, improving Python throughput by 10x.

Key Changes

  • Fix Python telemetry throughput bottleneck - JSONPath caching in JsonAccessor improves throughput from ~320 Hz to 3,545 Hz
  • Add throughput testing server - Standalone TCP/IP server for measuring COSMOS command/telemetry throughput
  • Add fire-and-forget command mode - Skip ACK waiting when timeout <= 0 for high-throughput scenarios
  • Add packet caching in TargetModel - Thread-safe caching with 10-second timeout reduces Redis lookups
  • Add UPDATE_INTERVAL tests - Verify queued writes functionality in both Ruby and Python
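
For reviewers, the fire-and-forget item above can be pictured with a minimal sketch. This is not the actual CommandTopic code: `send_command`, `outbox`, and `acks` are hypothetical stand-ins built on in-process queues, and the only behavior taken from this PR is that ACK waiting is skipped when timeout <= 0.

```python
import queue


def send_command(command, outbox, acks, timeout=5.0):
    """Queue a command and optionally wait for its acknowledgement.

    Hypothetical helper: outbox and acks are plain queue.Queue objects
    standing in for the real command and ACK streams.
    """
    outbox.put(command)
    if timeout <= 0:
        # Fire-and-forget: return immediately without waiting for an ACK.
        return None
    try:
        return acks.get(timeout=timeout)
    except queue.Empty:
        raise TimeoutError(f"no ACK for {command!r} within {timeout}s") from None
```

In this sketch a caller passing timeout <= 0 never learns whether the command was accepted, which is the price paid for not blocking on every command.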

Performance Results

| Metric | Before | After | Improvement |
| --- | --- | --- | --- |
| Python telemetry | ~320 Hz | 3,545 Hz | 10x |
| Ruby telemetry | ~2,700 Hz | ~2,700 Hz | baseline |

Python now outperforms Ruby at high telemetry rates while maintaining zero packet loss.

Root Cause Analysis

The Python performance issue was caused by jsonpath_ng.parse() recompiling JSONPath expressions on every call (~2.7ms per call). When identifying packets in unique_id_mode, this caused ~5.7ms overhead per packet. Adding lru_cache to cache parsed expressions reduced this to ~0.12µs (23,000x speedup per call).
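
The caching pattern can be sketched as follows. This is an illustration rather than the JsonAccessor change itself: `slow_parse` is a hypothetical stand-in for an expensive parser such as `jsonpath_ng.parse()`, it only understands simple dotted paths, and the `maxsize` value is an assumption.

```python
from functools import lru_cache


def slow_parse(expression: str) -> tuple:
    """Stand-in for an expensive parser such as jsonpath_ng.parse()."""
    # "$.packet.id" -> ("packet", "id"); only simple dotted paths are handled.
    return tuple(expression.lstrip("$.").split("."))


@lru_cache(maxsize=1024)
def cached_parse(expression: str) -> tuple:
    """Parse each distinct expression once and reuse the result afterwards."""
    return slow_parse(expression)


def read_item(expression: str, data: dict):
    """Walk the parsed path through nested dictionaries."""
    value = data
    for key in cached_parse(expression):
        value = value[key]
    return value
```

Because packet definitions reuse a small, fixed set of expressions, the cache is expected to hit on almost every call after warm-up. `cached_parse.cache_info()` reports hits and misses if that assumption needs checking.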

Files Changed

Performance Fixes:

  • openc3/python/openc3/accessors/json_accessor.py - Add JSONPath caching with lru_cache
  • openc3/python/openc3/microservices/interface_microservice.py - Fix missing self.queued initialization
  • openc3/python/pyproject.toml - Add orjson as optional dependency

Throughput Testing Infrastructure:

  • examples/throughput_server/ - New standalone throughput testing server
  • openc3-cosmos-demo/targets/INST/procedures/throughput_test.rb - Ruby throughput test
  • openc3-cosmos-demo/targets/INST2/procedures/throughput_test.py - Python throughput test
  • openc3-cosmos-demo/targets/*/screens/throughput.txt - Throughput monitoring screens

Command/Telemetry Optimizations:

  • openc3/lib/openc3/topics/command_topic.rb - Fire-and-forget mode
  • openc3/python/openc3/topics/command_topic.py - Fire-and-forget mode
  • openc3/lib/openc3/models/target_model.rb - Packet caching
  • openc3/python/openc3/models/target_model.py - Packet caching

Test Coverage:

  • openc3/spec/microservices/interface_microservice_spec.rb - UPDATE_INTERVAL test
  • openc3/python/test/microservices/test_interface_microservice.py - UPDATE_INTERVAL tests
  • openc3/spec/models/target_model_spec.rb - Packet caching tests
  • openc3/python/test/models/test_target_model.py - Packet caching tests

Test plan

  • All Python protocol tests pass (211 tests)
  • Ruby interface_microservice tests pass (15 tests)
  • Python interface_microservice tests pass (10 tests)
  • Throughput tests verified with throughput_server
  • Manual testing with DEMO plugin

🤖 Generated with Claude Code

jmthomas and others added 6 commits January 16, 2026 12:00
Throughput Server (examples/throughput_server/):
- Standalone TCP/IP server for measuring COSMOS command/telemetry throughput
- Dual-port operation for INST (7778) and INST2 (7780) targets
- CCSDS packet encoding/decoding with configurable streaming rates
- Time-compensated streaming to maintain accurate rates up to 100kHz
- Pre-allocated buffers for minimal allocation in hot paths
- Raw TCP rate test scripts (Ruby/Python) achieving ~300-500k cmd/s
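
One common way to keep a streaming rate accurate is to schedule each send against an absolute deadline instead of sleeping a fixed interval, so time spent sending does not accumulate as drift. The sketch below assumes that is what "time-compensated" means here. `stream` and its `clock` and `sleep` parameters are hypothetical and are not taken from the throughput server.

```python
import time


def stream(send, rate_hz, count, clock=time.perf_counter, sleep=time.sleep):
    """Call send(i) count times, pacing against absolute deadlines."""
    interval = 1.0 / rate_hz
    start = clock()
    for i in range(count):
        send(i)
        # The deadline is derived from the start time, not from the previous
        # iteration, so the cost of send() is absorbed rather than added.
        deadline = start + (i + 1) * interval
        remaining = deadline - clock()
        if remaining > 0:
            sleep(remaining)
```

If a send overruns its slot, `remaining` goes negative and the loop skips the sleep, which lets later iterations catch up to the schedule.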

DEMO Plugin Changes:
- Add THROUGHPUT_STATUS telemetry packet with rate/count metrics
- Add throughput commands: START_STREAM, STOP_STREAM, GET_STATS, RESET_STATS
- Add throughput_test procedures for INST (Ruby) and INST2 (Python)
- Add throughput screen for real-time monitoring
- Add plugin variables to toggle between simulator and throughput server
- Configure LengthProtocol for CCSDS packet framing when using throughput server
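
For context on the framing: a CCSDS space packet has a 6-byte primary header whose last two bytes hold the packet data length minus one, so the total packet size is that field plus 7. The sketch below shows length-based framing of a byte stream under that rule. `split_ccsds` is a hypothetical helper, not the LengthProtocol implementation.

```python
import struct

CCSDS_HEADER_LEN = 6  # primary header size in bytes


def split_ccsds(buffer: bytes):
    """Split a byte stream into complete CCSDS packets plus leftover bytes."""
    packets = []
    offset = 0
    while len(buffer) - offset >= CCSDS_HEADER_LEN:
        # Bytes 4-5 of the header: big-endian (data field length - 1).
        (length_field,) = struct.unpack_from(">H", buffer, offset + 4)
        total = CCSDS_HEADER_LEN + length_field + 1
        if len(buffer) - offset < total:
            break  # incomplete packet, wait for more data
        packets.append(buffer[offset:offset + total])
        offset += total
    return packets, buffer[offset:]
```

The leftover bytes are returned so a caller can prepend them to the next read from the socket.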

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add fire-and-forget mode to CommandTopic.send_command when timeout <= 0
  to skip ACK waiting for high-throughput command scenarios
- Add thread-safe packet caching in TargetModel with 10-second timeout
  to reduce Redis lookups for repeated packet access
- Cache is automatically invalidated when set_packet is called
- Add unit tests for packet caching in both Ruby and Python
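
A minimal sketch of the caching idea, assuming a lock-protected dictionary with per-entry timestamps. `PacketCache`, `fetch`, and `invalidate` are hypothetical names and not the TargetModel API. `fetch` stands in for the Redis lookup, and `invalidate` for what set_packet would trigger.

```python
import threading
import time


class PacketCache:
    """Cache packet lookups for a limited time (hypothetical sketch)."""

    def __init__(self, fetch, ttl=10.0, clock=time.monotonic):
        self._fetch = fetch      # slow lookup, e.g. a Redis read
        self._ttl = ttl          # seconds an entry stays valid
        self._clock = clock
        self._lock = threading.Lock()
        self._entries = {}       # (target, name) -> (stored_at, packet)

    def packet(self, target, name):
        key = (target, name)
        with self._lock:
            now = self._clock()
            entry = self._entries.get(key)
            if entry is not None and now - entry[0] < self._ttl:
                return entry[1]
            # Miss or expired: fetch while holding the lock so a concurrent
            # invalidate cannot be overwritten by a stale value.
            value = self._fetch(target, name)
            self._entries[key] = (now, value)
            return value

    def invalidate(self, target, name):
        """Drop a cached entry, e.g. after the packet definition changes."""
        with self._lock:
            self._entries.pop((target, name), None)
```

Holding the lock during the fetch keeps the sketch simple at the cost of serializing misses. An implementation that cares about miss latency would need a finer-grained scheme.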

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Root cause: jsonpath_ng.parse() was recompiling JSONPath expressions on
every call, taking ~2.7ms per call. When identifying packets in
unique_id_mode (required for INST2 due to mixed CCSDS/JSON packet types),
this caused ~5.7ms overhead per packet, limiting throughput to ~320 Hz.

Changes:
- Add lru_cache to JSONPath parsing in JsonAccessor (103x speedup)
- Add orjson as optional dependency for faster JSON parsing
- Fix missing self.queued initialization in Python interface_microservice
- Update throughput test scripts for both Ruby and Python

Results:
- Python telemetry: 320 Hz → 3,545 Hz (10x improvement)
- Python now outperforms Ruby at high rates (3,545 Hz vs 2,627 Hz)
- Zero packet loss maintained at all tested rates

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Revert the bytearray optimization in Python protocols to maintain
backward compatibility for custom protocol implementations. The change
from bytes to bytearray could break user code that:
- Type checks self.data expecting bytes
- Relies on immutability of self.data
- Uses bytes-specific operations
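
The kinds of breakage listed above can be demonstrated with plain Python. `compatibility_report` is a hypothetical helper written for this illustration and is not part of any protocol class.

```python
def compatibility_report(data):
    """Show how bytes-oriented user code sees a given buffer object."""
    report = {"passes_bytes_check": isinstance(data, bytes)}
    try:
        hash(data)  # bytes is hashable, bytearray is not
        report["hashable"] = True
    except TypeError:
        report["hashable"] = False
    # Mutability: bytearray supports item assignment, bytes does not.
    report["mutable"] = hasattr(data, "__setitem__")
    return report
```

Calling it with `b"\x01\x02"` and with `bytearray(b"\x01\x02")` gives opposite answers for all three keys, which is why the revert keeps `self.data` as bytes.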

The JSONPath caching fix (the real 10x performance improvement) remains
intact.

Also adds tests for UPDATE_INTERVAL option in both Ruby and Python
interface_microservice to verify the queued writes functionality.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
codecov bot commented Jan 18, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 78.23%. Comparing base (6ba6927) to head (ece9f74).

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2742      +/-   ##
==========================================
- Coverage   79.19%   78.23%   -0.96%     
==========================================
  Files         670      473     -197     
  Lines       54253    34508   -19745     
  Branches      734      734              
==========================================
- Hits        42967    26999   -15968     
+ Misses      11206     7429    -3777     
  Partials       80       80              
| Flag | Coverage Δ |
| --- | --- |
| python | 96.03% <ø> (+15.01%) ⬆️ |
| ruby-api | 83.65% <ø> (ø) |

  @interface.options.each do |option_name, option_values|
-   if option_name.upcase == 'OPTIMIZE_THROUGHPUT'
+   # OPTIMIZE_THROUGHPUT was changed to UPDATE_INTERVAL to better represent the setting
+   if option_name.upcase == 'UPDATE_INTERVAL' or option_name.upcase == 'OPTIMIZE_THROUGHPUT'
Member Author

This was just a bug: the Ruby code was missing the UPDATE_INTERVAL keyword that we already changed in Python and in the docs.

  while True: # Loop until we get some data
      try:
-         data = self.read_socket.recv(4096, socket.MSG_DONTWAIT)
+         data = self.read_socket.recv(65535, socket.MSG_DONTWAIT)
Member Author

Not sure how much of an optimization this is, but it matches Ruby.

jmthomas commented Jan 18, 2026

Results measured on my MacBook Pro M3 Max (36 GB RAM) with Docker limited to 10 CPUs and 16 GB of memory.

Ruby results:

2026/01/18 02:35:28.545 (throughput_test.rb:183): Command Throughput:
2026/01/18 02:35:28.545 (throughput_test.rb:184):   100 cmd burst:  442.5 cmd/s
2026/01/18 02:35:28.546 (throughput_test.rb:185):   500 cmd burst:  437.4 cmd/s
2026/01/18 02:35:28.546 (throughput_test.rb:186):   1000 cmd burst: 421.2 cmd/s
2026/01/18 02:35:28.547 (throughput_test.rb:188): 
2026/01/18 02:35:28.547 (throughput_test.rb:188): Telemetry Throughput:
2026/01/18 02:35:28.547 (throughput_test.rb:189):   100 Hz target:   99.0 Hz (0.0% loss)
2026/01/18 02:35:28.547 (throughput_test.rb:190):   1000 Hz target:  977.8 Hz (0.0% loss)
2026/01/18 02:35:28.547 (throughput_test.rb:191):   2000 Hz target:  1979.6 Hz (0.0% loss)
2026/01/18 02:35:28.547 (throughput_test.rb:192):   3000 Hz target:  2762.2 Hz (0% loss)
2026/01/18 02:35:28.548 (throughput_test.rb:193):   4000 Hz target:  2626.8 Hz (0% loss)

Python results:

2026-01-18T02:44:32.696048Z (throughput_test.py:191): Command Throughput:
2026-01-18T02:44:32.696222Z (throughput_test.py:192):   100 cmd burst:  586.0 cmd/s
2026-01-18T02:44:32.696430Z (throughput_test.py:193):   500 cmd burst:  623.3 cmd/s
2026-01-18T02:44:32.696663Z (throughput_test.py:194):   1000 cmd burst: 541.6 cmd/s
2026-01-18T02:44:32.696972Z (throughput_test.py:196): 
2026-01-18T02:44:32.696972Z (throughput_test.py:196): Telemetry Throughput:
2026-01-18T02:44:32.697271Z (throughput_test.py:197):   100 Hz target:   98.8 Hz (0% loss)
2026-01-18T02:44:32.697677Z (throughput_test.py:200):   1000 Hz target:  984.4 Hz (0% loss)
2026-01-18T02:44:32.697970Z (throughput_test.py:203):   2000 Hz target:  1978.2 Hz (0% loss)
2026-01-18T02:44:32.698207Z (throughput_test.py:206):   3000 Hz target:  2955.0 Hz (0% loss)
2026-01-18T02:44:32.699203Z (throughput_test.py:209):   4000 Hz target:  3460.0 Hz (0% loss)
