@zhusy54 zhusy54 commented Jan 29, 2026

Summary

  • CCE codegen: Rename code generator to CCECodegen, generate pto-isa C++ from PyPTO IR (kernels + control flow).
  • Orchestration codegen: New shared orchestration_codegen module for host-side task-graph C++; kernel vs orchestration split by Program and FunctionType.
  • Compile API: ir.compile() accepts codegen=CodegenBackend.PTO|CCE and uses CCECodegen.generate(program) (snake_case) for CCE backend.

Changes

Codegen

  • Rename code_generator → cce_codegen (header/source and bindings).
  • Add orchestration_codegen.h/cpp: GenerateOrchestration(program, func) used by both PTOCodegen and CCECodegen.
  • CCECodegen::Generate(Program) returns map<path, content>: kernels/<name>.cpp for kernel functions, orchestration/<name>.cpp for orchestration.
  • Python: expose CCECodegen.generate(program) (snake_case) and update codegen.pyi.
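As a rough illustration of the output layout described above, the following Python sketch builds a path-to-content map from a list of functions. `FunctionType`, `Function`, and `generate_file_map` are hypothetical stand-ins written for this example, not the PyPTO classes or the CCECodegen implementation; only the `kernels/<name>.cpp` and `orchestration/<name>.cpp` path scheme is taken from the description.

```python
from dataclasses import dataclass
from enum import Enum


class FunctionType(Enum):
    # Hypothetical stand-in for the IR function type; names follow the PR text.
    IN_CORE = "InCore"
    ORCHESTRATION = "Orchestration"


@dataclass
class Function:
    # Hypothetical stand-in: a name, a type, and already-generated C++ source.
    name: str
    func_type: FunctionType
    source: str


def generate_file_map(functions):
    """Map each function to a relative output path chosen by its type."""
    files = {}
    for func in functions:
        if func.func_type is FunctionType.ORCHESTRATION:
            subdir = "orchestration"
        else:
            subdir = "kernels"
        files[f"{subdir}/{func.name}.cpp"] = func.source
    return files


file_map = generate_file_map([
    Function("add_kernel", FunctionType.IN_CORE, "// kernel body\n"),
    Function("main_graph", FunctionType.ORCHESTRATION, "// task graph\n"),
])
```

Returning relative paths rather than writing files directly keeps the generator free of filesystem side effects, so the caller decides where the artifacts land.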

Compile

  • Add CodegenBackend enum (PTO, CCE) and codegen parameter to ir.compile().
  • Default remains PTO; codegen=CodegenBackend.CCE uses CCECodegen and writes C++ artifacts.
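The backend selection described here might be sketched as a simple enum-keyed dispatch. This is a minimal sketch under stated assumptions: `compile_program`, `pto_generate`, and `cce_generate` are hypothetical placeholders, the enum values and output filenames are invented for illustration, and the real `ir.compile()` signature may differ. Only the `CodegenBackend` member names and the PTO default come from the description.

```python
from enum import Enum


class CodegenBackend(Enum):
    # Member names mirror the PR description; the values are illustrative.
    PTO = "pto"
    CCE = "cce"


def pto_generate(program):
    # Hypothetical placeholder for the PTO backend.
    return {f"{program}.pto": "// pto output\n"}


def cce_generate(program):
    # Hypothetical placeholder for CCECodegen.generate(program).
    return {f"kernels/{program}.cpp": "// cce output\n"}


def compile_program(program, codegen=CodegenBackend.PTO):
    """Dispatch to a backend; PTO stays the default."""
    generators = {
        CodegenBackend.PTO: pto_generate,
        CodegenBackend.CCE: cce_generate,
    }
    if codegen not in generators:
        raise ValueError(f"Unsupported codegen backend: {codegen!r}")
    return generators[codegen](program)


default_files = compile_program("demo")
cce_files = compile_program("demo", codegen=CodegenBackend.CCE)
```

Keeping PTO as the default means existing callers that never pass `codegen` see no change in behavior.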

Other

  • Docs: update docs/dev/11-cce_codegen.md.
  • Example: add examples/ir_builder/orchestration_example.py.
  • Transforms: adjust for new API (e.g. add_alloc_pass, dependency_analyzer, insert_sync_pass).
  • Tests: extend codegen and compile tests for CCE backend and Program-based API.

Testing

  • Pre-commit (cpplint, ruff, pyright, etc.) passes.
  • Recommend running in a full build env: cmake --build build && pytest tests/ut/ -v.

Related

  • Builds on pto-isa code generator and control flow support.
  • Orchestration and kernel separation aligns with Program/FunctionType (e.g. Orchestration vs InCore).

@gemini-code-assist

Summary of Changes

Hello @zhusy54, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces a significant architectural enhancement to the PyPTO Intermediate Representation (IR) code generator. It transitions from a monolithic PTO assembly output to a more flexible multi-file generation system. This enables the clear separation of concerns between low-level computational kernels and high-level task orchestration. By compiling orchestration logic into C++ runtime code and kernels into distinct PTO assembly files, the framework gains the ability to define and manage complex execution flows, paving the way for more sophisticated hardware acceleration and task scheduling.

Highlights

  • Multi-file Code Generation: The PTOCodegen.generate() method now produces a dictionary mapping file paths to their content, allowing for the generation of multiple output files from a single program.
  • Orchestration Function Support: Introduced the concept of orchestration functions, which are now compiled into C++ runtime code (.cpp files) responsible for building task graphs and managing kernel execution.
  • Kernel Function Separation: Kernel functions are now generated into individual PTO assembly files (.pto) within a dedicated kernels/ subdirectory, promoting modularity.
  • New C++ Codegen Utilities: Added C++ methods to identify orchestration functions, infer core types (VECTOR/CUBE) from operations, and generate the C++ orchestration code.
  • Updated Python Integration: Python bindings and the compile.py utility have been updated to seamlessly handle the new multi-file output and save generated code to the appropriate locations.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces a significant and valuable feature by separating kernel and orchestration code generation. The implementation is extensive, touching the C++ core, Python bindings, high-level APIs, and tests. However, there are several critical issues in the current implementation, particularly regarding how intermediate tensor sizes are handled and how orchestration functions are identified and tested. These issues need to be addressed to ensure the feature is robust and correct. I've also included some suggestions for improving code quality and test coverage.

Comment on lines 990 to 970
oss << " std::cerr << \"Error: Expected at least " << expected_arg_count
<< " args, got \" << arg_count << std::endl;\n";
medium

Error reporting is done using std::cerr, which writes directly to the standard error stream. This makes it difficult for the calling Python code to catch and handle errors gracefully. It would be better to throw a C++ exception, such as pypto::ValueError, which can be caught by the nanobind layer and translated into a proper Python exception.

        throw ValueError("Error: Expected at least " + std::to_string(expected_arg_count) + " args, got " + std::to_string(arg_count));
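Seen from the Python side, the general point is that a raised exception can be caught and handled by the caller, while a message written to stderr cannot. A minimal sketch, assuming a hypothetical `check_arg_count` helper that is not part of this PR:

```python
def check_arg_count(arg_count, expected_arg_count):
    """Raise instead of printing so the caller can catch the failure."""
    if arg_count < expected_arg_count:
        raise ValueError(
            f"Expected at least {expected_arg_count} args, got {arg_count}"
        )


try:
    check_arg_count(arg_count=2, expected_arg_count=3)
    error_message = None
except ValueError as exc:
    # The caller chooses how to react: log, retry, or re-raise.
    error_message = str(exc)
```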

Comment on lines 73 to 102
# Save all generated files
for filepath, content in files.items():
    full_path = os.path.join(output_dir, filepath)

    # Create subdirectories if needed (e.g., kernels/)
    file_dir = os.path.dirname(full_path)
    if file_dir:
        os.makedirs(file_dir, exist_ok=True)

    # Write file
    with open(full_path, "w") as f:
        f.write(content)
medium

The logic for saving the generated files to an output directory is duplicated in examples/ir_builder/codegen_separate_files_demo.py. To improve code maintainability and adhere to the DRY (Don't Repeat Yourself) principle, this logic could be extracted into a shared utility function.
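One possible shape for such a shared helper is sketched below. The name `write_generated_files` and its return value are assumptions made for this example; the loop body follows the snippet under review.

```python
import os
import tempfile


def write_generated_files(files, output_dir):
    """Write a {relative_path: content} mapping under output_dir.

    Returns the list of full paths that were written.
    """
    written = []
    for filepath, content in files.items():
        full_path = os.path.join(output_dir, filepath)

        # Create subdirectories if needed (e.g., kernels/).
        file_dir = os.path.dirname(full_path)
        if file_dir:
            os.makedirs(file_dir, exist_ok=True)

        with open(full_path, "w", encoding="utf-8") as f:
            f.write(content)
        written.append(full_path)
    return written


# Usage: write two illustrative files into a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    paths = write_generated_files(
        {"kernels/add.cpp": "// kernel\n", "orchestration/main.cpp": "// graph\n"},
        tmp,
    )
    all_exist = all(os.path.isfile(p) for p in paths)
```

Both the compile utility and the demo script could then call this one function instead of repeating the loop.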

@zhusy54 zhusy54 force-pushed the orchestra-func branch 4 times, most recently from e9776b2 to bd52e03 Compare January 30, 2026 10:36
@zhusy54 zhusy54 changed the title [WIP] Add orchestration function [WIP] feat(ir): Add orchestration function type and separate codegen Jan 30, 2026
@zhusy54 zhusy54 force-pushed the orchestra-func branch 3 times, most recently from 8114c6e to f76dd22 Compare January 30, 2026 11:51
@zhusy54 zhusy54 changed the title [WIP] feat(ir): Add orchestration function type and separate codegen feat(codegen): add separate file output for kernels and orchestration Jan 30, 2026
@zhusy54 zhusy54 force-pushed the orchestra-func branch 4 times, most recently from f0ef85e to a23fcaf Compare January 31, 2026 08:08
@zhusy54 zhusy54 changed the title feat(codegen): add separate file output for kernels and orchestration feat(codegen): CCE codegen with orchestration and compile backend option Jan 31, 2026
@zhusy54 zhusy54 force-pushed the orchestra-func branch 2 times, most recently from 16a4a91 to 7189eab Compare January 31, 2026 14:55
- Add orchestration code generation for task graph building (C++ runtime API)
- Refactor CCECodegen.Generate() to accept Program and return multi-file dict
  - Kernel functions -> kernels/<func_name>.cpp
  - Orchestration -> orchestration/<func_name>.cpp
- Move PTOCodegen from ir module to codegen module for unified API
- Add CodegenBackend enum to ir.compile() for backend selection (PTO/CCE)
- Improve DependencyAnalyzer to merge consecutive simple statements
- Preserve func_type_ across optimization passes
@lyfne123 lyfne123 merged commit b96c1ab into hw-native-sys:main Feb 2, 2026
2 checks passed