Skip to content

Conversation

@BDHU
Copy link
Owner

@BDHU BDHU commented Jan 13, 2026

Key Changes

New Features:

  • Add unpack() method to bundle operations for extracting multiple fields
  • Enable unpack() to work with @ta.jit decorator
  • Introduce node executors for enhanced graph execution with better control flow
  • Add switch node and operator for multi-way branching in graph execution
  • Add traceable container protocol for nested graph values

Architecture & Refactoring:

  • Restructure core architecture into modular subpackages: core/execution, core/graph, core/tracing
  • Consolidate and reorganize distributed execution framework
  • Simplify service tracing by removing hook plugin architecture
  • Improve type annotations and type safety across the entire codebase

Testing & Examples:

  • Consolidate distributed examples and reorganize test structure
  • Add new comprehensive test suites: test_core.py, test_bundle_unpack.py
  • Improve benchmark tests for core and distributed execution
  • Update demo and example files for new API

Files Changed

  • Core restructuring: 94 files changed, +8355 insertions, -7012 deletions
  • New subpackages: core/execution/, core/graph/, core/tracing/
  • Simplified distributed backend with cleaner gRPC integration
  • Comprehensive test reorganization

Breaking Changes

This refactoring includes structural changes to the core API. Existing code using the distributed backend may need updates due to the simplified execution framework.

BDHU and others added 12 commits January 8, 2026 00:49
- Introduced new graph data structures for traced computation, including GraphNode, CallNode, CondNode, WhileNode, and ScanNode.
- Added OutputKind, InputRef, NodeRef, and OutputRef classes for better output management.
- Implemented TracedValue to represent values during tracing, enforcing constraints on operations during tracing.
- Created control flow operators (cond, while_loop, scan) for JAX-style tracing in computation graphs.
- Removed the old node wrapper and sequential pipeline implementation, replacing them with a more streamlined tracing approach.
- Established a caching mechanism for compiled graphs to optimize performance.
- Added tracing context management to facilitate graph capture during function execution.
- Introduced decorators for JIT compilation and node marking, enhancing the usability of the tracing framework.
- Removed the `ReplicaSpec` class and associated runtime configuration logic from `replica.py`.
- Introduced a new `replication` module with `pool.py`, `replicate.py`, and `__init__.py` to handle replica management and distribution.
- Implemented `Replica`, `LocalReplica`, and `RemoteReplica` classes for local and remote execution.
- Added `ReplicaPool` for managing multiple replicas and load balancing.
- Created a `replicate` decorator for marking functions for distributed execution.
- Removed the `runtime.py` file and its backend management functionality.
- Deleted the `scheduler.py` and `spawner.py` files, simplifying the architecture.
- Added a new entry point in `serve.py` for serving distributed functions.
- Added CallExecutor, CondExecutor, WhileExecutor, ScanExecutor, and SwitchExecutor to handle various execution strategies in the computation graph.
- Refactored GraphNode to Node, integrating the NodeExecutor protocol for better extensibility.
- Implemented service execution capabilities with ServiceCallExecutor and ServiceExecutionProvider for distributed service calls.
- Enhanced tracing with ServiceTracingHook to record service method calls during graph execution.
- Introduced Replica infrastructure for stateful, replicated actors, including ReplicaConfig and ReplicaHandle.
- Added service pooling for load balancing in distributed environments.
- Updated the mesh context to support service execution and tracing hooks.
- Improved error handling and validation across various execution paths.
Major architectural reorganization to improve code maintainability and separation of concerns:

**Core restructuring:**
- Split monolithic tracing.py (720 LOC) into focused modules: context, tracer, utils, validation
- Organize execution-related modules under thinkagain/core/execution/
- Group graph operations under thinkagain/core/graph/
- Introduce literal_refs and literals modules for better graph literal handling

**Distributed execution improvements:**
- Refactor mesh and replication pool for cleaner architecture
- Update execution hooks and service runtime for consistency
- Remove deprecated replicate.py in favor of integrated replica system
- Streamline gRPC server/client interactions

**API and documentation updates:**
- Update README to reflect @replica decorator (replacing @replicate)
- Consolidate distributed examples into single comprehensive example
- Remove outdated REMOTE_SETUP.md and legacy example files
- Clean up test suite by removing deprecated test_agents.py

**Code cleanup:**
- Improve executor and executors modules with better abstractions
- Update profiling integration with new tracing infrastructure
- Enhance graph operations and control flow primitives
- Remove obsolete benchmark_overhead.py

This refactor maintains backward compatibility while establishing a cleaner foundation for future development.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
This commit introduces a new @trace decorator and traceable container system
to support custom types in traced graphs, along with comprehensive test
improvements and infrastructure refinements.

- Add core/traceable.py with @trace decorator and registration API
- Implement decompose/compose protocol for custom traceable containers
- Add auto-detection support for dataclasses with zero boilerplate
- Enhance literal_refs and executor to handle traceable containers via map_traceable_refs
- Add validation to reject TracedValues in dict keys and sets
- Improve test infrastructure with _clear_compiled_cache fixture and _graph_for helper
- Refactor test assertions to use cleaner helper patterns
- Rename test_core_tracing.py to test_core.py for broader scope
- Export trace decorator from top-level thinkagain module
- Add Union types to control flow operators (cond, while_loop, scan, switch) to properly represent both sync/async callables and TracedValue returns
- Add type: ignore comments in examples and tests for known traced value type issues
- Improve type annotations in distributed components (mesh, pool, execution_hook)
- Replace mutable default arguments with field(default_factory=list) in dataclasses
- Update pool registry documentation to clarify key tuple structures
- Updated tests in `test_distributed.py` to utilize the new `@bind_service` decorator for service handles in `@node` functions.
- Introduced `bind_service` decorator in `services.py` to attach service handle metadata for tracing and caching.
- Enhanced `replica.py` to enforce that replicas must be called from within `@node` functions, improving error handling during tracing.
- Added `Bundle` and `make_inputs` utilities in `inputs.py` for structured input handling in pipelines.
- Modified `node` decorator to support service bindings and ensure only async functions are accepted.
- Updated service execution protocols to streamline service call handling.
- Improved documentation and error messages related to service binding and execution patterns.
- Move apply_replica() and replica state handling to new replica_state.py module
- Remove TracingHook protocol and ServiceTracingHook plugin-based approach
- Simplify BoundReplicaMethod to delegate directly to service provider
- Remove global replica hook registry (register_replica_hook, unregister_replica_hook)
- Add get_service_bindings() and register_service_bindings() helpers for direct binding management
- Streamline ServiceCallExecutor removal and integrate directly with execution context
- Update graph tracing to reject replica calls without proper @node/@bind_service decoration
- Update examples documentation to reference replica_state.py

This refactoring removes unnecessary indirection from the tracing hook architecture while
maintaining the same service execution functionality through a more direct approach.
…fields

Adds BundleOps.unpack() static method that allows unpacking multiple fields
from a bundle or traced dataclass, returning them as a tuple. Includes
validation for missing keys with helpful error messages.
- Merge stateful_replica_patterns into distributed_example
- Move test_bundle_unpack.py to tests/ directory
- Update distributed module structure and dependencies
- Remove redundant literals.py module
- Add TYPE_CHECKING blocks for better static type analysis in bundle_example.py
- Remove unused imports (e.g., Context import in agent.py)
- Add type: ignore comments for unavoidable type mismatches in tests and distributed code
- Refactor type annotations in agents, core tracing, and distributed modules
- Improve type validation and context handling in tracer and related modules
@BDHU BDHU self-assigned this Jan 13, 2026
@BDHU BDHU added the enhancement New feature or request label Jan 13, 2026
@codspeed-hq
Copy link
Contributor

codspeed-hq bot commented Jan 13, 2026

CodSpeed Performance Report

Merging this PR will create unknown performance changes

Comparing graph (deef1a1) with main (e925acf)

Summary

🆕 29 new benchmarks
⏩ 16 skipped benchmarks1

Performance Changes

Benchmark BASE HEAD Efficiency
🆕 test_bench_concurrent_replica_calls N/A 13 µs N/A
🆕 test_bench_basic_replica_execution N/A 14.8 µs N/A
🆕 test_bench_captured_value_in_cond N/A 13.9 µs N/A
🆕 test_bench_replica_handle_switching N/A 13.1 µs N/A
🆕 test_bench_apply_replica N/A 13 µs N/A
🆕 test_bench_profiling_summary_generation N/A 72.6 µs N/A
🆕 test_bench_complex_pipeline N/A 14.7 µs N/A
🆕 test_bench_cached_graph_execution N/A 14.9 µs N/A
🆕 test_bench_profiling_overhead N/A 13 µs N/A
🆕 test_bench_replica_with_different_multipliers N/A 15.4 µs N/A
🆕 test_bench_multiple_replica_handles N/A 14 µs N/A
🆕 test_bench_frozen_dataclass N/A 14.8 µs N/A
🆕 test_bench_cond_execution N/A 16.1 µs N/A
🆕 test_bench_mesh_context_overhead N/A 13.8 µs N/A
🆕 test_bench_captured_value_in_while N/A 14 µs N/A
🆕 test_bench_graph_compilation N/A 15.4 µs N/A
🆕 test_bench_replica_multiple_calls N/A 13 µs N/A
🆕 test_bench_complex_distributed_pipeline N/A 13.1 µs N/A
🆕 test_bench_multiple_pipeline_executions N/A 13.8 µs N/A
🆕 test_bench_mixed_replicas_and_nodes N/A 13 µs N/A
... ... ... ... ...

ℹ️ Only the first 20 benchmarks are displayed. Go to the app to view all benchmarks.

Footnotes

  1. 16 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports.

@gemini-code-assist
Copy link
Contributor

Summary of Changes

Hello @BDHU, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request represents a significant architectural overhaul, transitioning the framework to a JAX-inspired graph compilation model. The primary goal is to provide a more robust, type-safe, and performant foundation for building distributed AI pipelines. By introducing explicit graph tracing and control flow primitives, the system gains greater optimization potential and clearer execution semantics, enabling more efficient and scalable pipeline development.

Highlights

  • JAX-Style Graph Execution Engine: The core graph execution engine has undergone a major refactoring, adopting a JAX-style API for building static computation graphs from async Python code. This includes new decorators like @jit for compilation and primitives like cond, while_loop, scan, and switch for in-graph control flow.
  • Modular Architecture: The codebase has been restructured into modular subpackages: core/execution, core/graph, and core/tracing, enhancing maintainability and separation of concerns.
  • Enhanced Type Safety and Data Handling: Significant improvements have been made to type annotations across the codebase, increasing type safety. New Bundle operations (unpack, extend, replace, get) and make_inputs provide a cleaner, type-safe API for managing pipeline inputs.
  • Simplified Distributed Execution: The distributed execution framework has been consolidated and reorganized, featuring a cleaner gRPC integration and a new Mesh concept for explicit resource topology. The old hook plugin architecture for service tracing has been removed.
  • Comprehensive Testing and Examples: New comprehensive test suites (test_core.py, test_bundle_unpack.py) have been added, and existing benchmark tests for core and distributed execution have been improved. Demo and example files are updated to reflect the new API.

🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console.

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: e5d5bd6803

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +40 to +44
@property
def capture_values(self) -> dict[int, Any]:
return (
self.parent_values if self.parent_values is not None else self.node_values
)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Use immediate parent values for nested subgraphs

The capture source for subgraph inputs is chosen via capture_values, which returns parent_values whenever it is set. For a graph that is itself a subgraph (e.g., a while body), this means prepare_subgraph_args will append captures from the outer graph, not from the current subgraph’s node_values. If that subgraph executes another nested control-flow graph that captures a value computed inside the subgraph, Graph.append_captures will look up that node id in the wrong dict and raise a KeyError (or use a mismatched value if ids collide). This breaks nested control flow like while_loop containing a cond that closes over an intermediate node value.

Useful? React with 👍 / 👎.

Comment on lines +295 to +299
def _create_local() -> Replica:
instance = handle.replica_class(
*handle.init_args, **dict(handle.init_kwargs)
)
return LocalReplica(instance, None)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Honor replica setup for class handles

For class-based replica handles, _create_local instantiates the class and returns LocalReplica(instance, None), which skips the ReplicaConfig.setup hook entirely. As a result, @replica(setup=...) never runs for class handles, and any __call__ expecting the setup state (per the documented API) will fail at runtime because no state is injected. This also means the setup side effects (e.g., initializing shared resources) are silently skipped.

Useful? React with 👍 / 👎.

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This is an impressive and extensive refactoring that significantly modernizes the library's core API to a JAX-style, graph-based execution model. The new architecture with clear separation between core and distributed components, along with the introduction of an extensible hook system, is a massive improvement. The code is well-structured and the new features like Bundle, switch, and Mesh are powerful additions.

My review focuses on a few high-impact areas:

  • Security: The new gRPC backend introduces critical security vulnerabilities, including a remote code execution risk due to insecure deserialization with pickle and a lack of transport encryption. These should be addressed with high priority.
  • Testing: The tests for the agent functionality have been removed, leaving a part of the library without test coverage.
  • Documentation: A minor inconsistency was found in the README regarding an example filename.

Overall, this is a fantastic step forward for the library. Addressing the security and testing concerns will make it ready for wider adoption.

"""
try:
# Deserialize arguments
args, kwargs = pickle.loads(request.args)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

critical

Critical Security Vulnerability: Remote Code Execution via Insecure Deserialization

The server deserializes arguments using pickle.loads() directly from a network request. pickle is not secure and can be used to execute arbitrary code. An attacker who can send requests to this gRPC server can achieve remote code execution on the server machine.

It's highly recommended to switch to a safe serialization format like JSON. If using pickle is a requirement, you must ensure that the service is only exposed to a completely trusted network and add prominent security warnings to the documentation about this vulnerability.

# Lazy initialization of gRPC client
from ..grpc.client import GrpcClient

self._client = GrpcClient(self.endpoint)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

Security Vulnerability: Unencrypted Communication

The GrpcClient is initialized using grpc.aio.insecure_channel, which means all communication between distributed nodes is unencrypted. This is a security regression, as the previous implementation appeared to support SSH for transport security.

For any production or multi-tenant environment, it's critical to use secure channels. Please add support for grpc.aio.secure_channel with TLS and consider making it the default configuration.

BDHU and others added 2 commits January 12, 2026 18:39
- Remove @node decorator from unpack() and use asyncio.gather() to call get() for each key
- This allows tuple unpacking syntax to work during JIT tracing
- Update bundle examples to use @ta.jit with unpack()
- Add test case verifying unpack works with @ta.jit
- Update documentation to reflect that unpack now works everywhere
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
@BDHU BDHU merged commit 9abf039 into main Jan 13, 2026
6 checks passed
@BDHU BDHU deleted the graph branch January 13, 2026 00:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants