@runwangdl
Collaborator

Summary

This PR adds complete GAP9 platform support to Deeploy, including platform integration, DMA support, tiling capabilities, CI/CD workflows, and comprehensive testing infrastructure. This represents 20 commits specifically focused on GAP9 development.

Added

GAP9 Platform Support

  • Initial GAP9 platform integration with full deployer, bindings, and platform configuration (Deeploy/Targets/GAP9/)
  • GAP9 DMA support with L3 DMA and Mchan DMA implementations
  • GAP9-specific memory allocation and free templates
  • GAP9 tiling support for L3 memory
  • GAP9 CI/CD workflows (.github/workflows/_runner-gap9.yml, .github/workflows/ci-platform-gap9.yml, .github/workflows/ci-platform-gap9-tiled.yml)
  • Link to PULP-NN, PULP kernels, and Math libraries for GAP9
  • GAP9 SDK configuration with cluster stack macros
  • GAP9 GVSoC simulation support

Changed

  • Minimally modified PULP kernel syntax to work around GAP9 compiler issues. The changes keep the kernels compatible with PULP while satisfying the GAP9 GCC toolchain-specific requirements:
    - Transpose operator: Fixed GCC segmentation fault caused by template syntax (commit 9ca4595)
    - LayerNorm operator: Resolved epsilon ABI compatibility issue (commit 6b5c2e5)

Known Limitations

  • L3-L2 Async DMA - Currently synchronous; async blocked by Siracusa inheritance
  • NE16 Accelerator - Not yet integrated
  • AutoTiler DW/PW - GAP9 SDK AutoTiler kernels not integrated
  • GAP9 Float Math - Limited coverage (affects RMSNorm, etc.)

Platform Capabilities

✅ Multi-core (1-8) | ✅ L1/L2/L3 memory | ✅ Multi-channel DMA
✅ GVSoC simulation | ✅ Tiling | ✅ PULP-NN integration

PR Merge Checklist

  1. The PR is rebased on the latest devel commit and pointing to devel.
  2. Your PR is reviewed and approved.
  3. All checks are passing.
  4. The CHANGELOG.md file has been updated.
  5. If the docker was modified, change back its link after review.

Xeratec and others added 24 commits November 13, 2025 22:33
- Enable pulling from private GitLab repo
- Improve caching for pip, apt and cargo
- Fix CMake version
- Remove problematic pip installation in favor of apt package
- Add ZSH and Oh My ZSH
- Add package dependencies for GAP9 SDK
- Remove unused files from the container
- Fix banshee package problems
Fix duplicate template generation due to PULP inheritance issue
@Xeratec Xeratec added the Feature Addition of new features label Dec 24, 2025
@Xeratec Xeratec added this to Deeploy Dec 24, 2025
@Xeratec Xeratec added this to the Release 0.3.0 milestone Dec 24, 2025
@Xeratec Xeratec moved this to In progress in Deeploy Dec 24, 2025
@runwangdl runwangdl marked this pull request as ready for review January 11, 2026 20:39
@coderabbitai
Contributor

coderabbitai bot commented Jan 11, 2026

📝 Walkthrough

Summary by CodeRabbit

  • New Features
    • Added experimental GAP9 target: runtime, memory management, DMA acceleration, and wide operator support for tiled and untiled deployments.
  • CI/CD
    • New CI workflows and reusable runners for GAP9 (tiled/untiled) and updated CI images; GAP9-aware ccache generation.
    • New Docker image variants with GAP9 toolchain and SDK support.
  • Documentation
    • Added GAP9 usage guide.
  • Tests
    • New GAP9 test runners and platform test harness with emulation support.

Walkthrough

Adds comprehensive GAP9 support: CI workflows, container/toolchain updates, GAP9 deployment backend (DMA, bindings, tiler, deployer), target runtime libraries, CMake/gvsoc integration, test runners and test harness for GAP9, and documentation.

Changes

Cohort / File(s) Summary
CI Workflows
.github/workflows/* (ci-platform-gap9.yml, ci-platform-gap9-tiled.yml, _runner-gap9.yml, _runner-gap9-tiled.yml, _select-env.yml, ci-deeploy.yml, infra-generate-ccache.yml)
New GAP9 CI pipelines and reusable runners; updated default deeploy image to GAP9 image; added GAP9-specific ccache generation and L3/L2 handling.
Container & Toolchain
Container/* (Dockerfile.Gap9, Dockerfile.deeploy, Dockerfile.toolchain, Makefile, amd64.list) and toolchain/*, Makefile
New Gap9 Dockerfile; toolchain/image tweaks (CMake arg, SSH/known_hosts, ccache, GAP SDK env vars); Make targets for gap9-toolchain/sdk; patches for banshee/gap9-sdk.
Deeploy GAP9 Platform & Tiling
Deeploy/Targets/GAP9/* (Platform.py, Deployer.py, Bindings.py, Tiler.py, Templates/*, DMA/*, __init__.py)
New GAP9 platform: variable/constant/struct buffer types, cluster engine, GAP9Mapping, extensive node bindings, tiling-ready bindings, MCHAN/L3 async DMA implementations, allocation/free templates, and GAP9-specific deployer (L3 allocation/loading).
Test Infrastructure & Runners
DeeployTest/* (testRunner_gap9.py, testRunner_tiled_gap9.py, Platforms/GAP9/*, testUtils/*, CMakeLists.txt)
Added GAP9 test runners (tiled/untiled), GAP9 test platform sources (deeploytest, cycle counter), SDK config, test harness integration, gvsoc_install_dir CLI flag, cleanup and debug prints.
Target Libraries (C) for GAP9
TargetLibraries/GAP9/* (CMakeLists.txt, inc/*, src/*)
New static deeploygap9 library and runtime: DMA implementations (mchan), memory layer (ram/fs/cl_ram), utility functions, headers (mchan, dory_dma/mem), cycle counter, and math macros.
Build System / CMake / gvsoc
CMakeLists.txt, cmake/* (common.cmake, gap9/gap9_gvsoc.cmake, simulation.cmake)
Bumped CMake minimum, GAP9 build branch and gvsoc emulation macro (L2 vs L3/readfs), helper to add files to flash, and moved/adjusted try-compile/deeploylib changes.
PULPOpen / Templates / Small Fixes
Deeploy/Targets/PULPOpen/*, TargetLibraries/PULPOpen/*, TargetLibraries/Generic/*
Reordered LayerNorm parameter (epsilon moved), updated templates (Transpose/FloatLayernorm), added math includes in several files, and propagated GAP9 engine core-count handling.
Docs
GAP9.md, README.md
New GAP9 usage doc and README reference.

Sequence Diagram(s)

sequenceDiagram
    participant CI as CI Workflow
    participant Container as Docker/Image Build
    participant TestRunner as TestRunner
    participant CMake as CMake
    participant GVSOC as GVSOC
    participant Network as GAP9 Network Executable

    CI->>Container: build GAP9 image (toolchain + SDK)
    CI->>TestRunner: start test run (tiled / untiled)
    TestRunner->>CMake: configure & build (platform=GAP9)
    CMake->>Network: link deeploygap9, generate binary(s)
    TestRunner->>GVSOC: launch emulation (L2 or L3/readfs)
    GVSOC->>Network: load & execute
    Network->>GVSOC: perform cluster tasks and DMA transfers
    GVSOC->>TestRunner: return cycles & results
    TestRunner->>CI: report pass/fail
sequenceDiagram
    participant App as Host App
    participant Cluster as GAP9 Cluster
    participant DMA as MCHAN/L3 DMA
    participant L1 as L1 TCDM
    participant L2 as L2 RAM
    participant L3 as L3 RAM

    App->>Cluster: init cluster, allocate buffers
    Cluster->>L3: load constants (cl_ram_read)
    Cluster->>DMA: async DMA L3→L2 / L2→L1
    DMA->>L2: transfer data
    Cluster->>L1: compute kernel using L1 buffers
    Cluster->>DMA: write-back L1→L2 / L2→L3
    App->>Cluster: retrieve outputs, compare

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

Suggested reviewers

  • Victor-Jung
  • Xeratec
🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 warning)
Check name | Status | Explanation | Resolution
Docstring Coverage | ⚠️ Warning | Docstring coverage is 1.32% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name | Status | Explanation
Title check | ✅ Passed | The title '[DRAFT] Deeploy-GAP9 Platform' clearly and concisely summarizes the main change: adding GAP9 platform support to Deeploy.
Description check | ✅ Passed | The description is comprehensive and directly related to the changeset, detailing GAP9 platform support additions, modifications, known limitations, and platform capabilities.

📜 Recent review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 050f0ae and ff05304.

📒 Files selected for processing (1)
  • .github/workflows/_select-env.yml
🔇 Additional comments (1)
.github/workflows/_select-env.yml (1)

48-48: LGTM! EOF newline added.

This change adds a newline at the end of the file, which is a best practice and aligns with POSIX standards. No functional changes.


Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 7

Note

Due to the large number of review comments, Critical and Major severity comments were prioritized as inline comments.

🤖 Fix all issues with AI agents
In @.github/workflows/_runner-gap9.yml:
- Around line 34-40: The CI step "Build Deeploy" currently masks failures by
appending "|| true" to commands (notably the "source
/app/install/gap9-sdk/configs/gap9_evk_audio.sh || true" and "pip install -e .
|| true"); remove the "|| true" from these commands so errors cause the job to
fail, and for the optional sourcing use a guarded conditional (e.g., test for
the file before sourcing) rather than swallowing errors—locate the commands
inside the "Build Deeploy" run block in the workflow and update the "source ...
|| true" and "pip install -e . || true" lines accordingly.
- Around line 47-63: The current loop "echo $testNames | while IFS= read -r
testName; do ... python testRunner_gap9.py -t Tests/$testName ... done" can hide
failures because of the pipe and missing errexit; enable strict failure handling
(set -euo pipefail or at least set -e and set -o pipefail) and restructure the
loop so it doesn't run in a subshell (e.g., read from a
here-string/process-substitution or iterate over testNames), capture each python
invocation's exit status (or maintain a failure flag) and after the loop call
exit with non-zero if any test failed so the job fails when any test fails.

In @.github/workflows/_select-env.yml:
- Line 29: The workflow hardcodes IMAGE="ghcr.io/runwangdl/deeploy:gap9", making
the inputs.docker_image_deeploy input unused and tying CI to a personal repo;
change the assignment for IMAGE in .github/workflows/_select-env.yml to use the
workflow input or the org image instead (e.g., set IMAGE to the inputs variable
docker_image_deeploy or to "ghcr.io/pulp-platform/deeploy:gap9") so contributors
can override it and avoid depending on a personal account.

In @.github/workflows/ci-deeploy.yml:
- Around line 17-20: The workflow input docker_image_deeploy currently defaults
to a personal registry image ("ghcr.io/runwangdl/deeploy:gap9"), which must not
be used for CI; update the default value of the docker_image_deeploy input to
point to an official, organization-managed image (e.g.,
"ghcr.io/pulp-platform/deeploy:gap9" if published, or revert to
"ghcr.io/pulp-platform/deeploy:devel" until a GAP9 image is available) and
ensure any documentation or README referencing docker_image_deeploy is
consistent with the org registry change.

In @Container/Dockerfile.toolchain:
- Line 33: The Dockerfile.toolchain contains a typo in the APT package list:
replace the incorrect package name "ibglib2.0-dev" with the correct
"libglib2.0-dev" in the package installation line (the line that currently lists
"ibglib2.0-dev" among the packages) so the package installation succeeds.

In @DeeployTest/CMakeLists.txt:
- Around line 53-80: The message uses ${CMAKE_MATCH_COUNT} which is wrong for
counting files; replace that usage by computing the HEXLIST length with
list(LENGTH HEXLIST HEXCOUNT) and then log HEXCOUNT in the "Found … hex file(s)"
message; keep the existing GLOB_RECURSE pattern ("${GENERATED_SOURCE}/hex/*")
as-is and ensure you update the message call that currently references
${CMAKE_MATCH_COUNT} to reference ${HEXCOUNT} instead so the reported count
matches the HEXLIST contents.

In @TargetLibraries/GAP9/src/Util.c:
- Around line 18-22: The preprocessor conditional uses a bitwise OR and Apollo
macros that don't exist here; change the operator from '|' to '||',
remove/replace Apollo-specific macros (AM_PART_APOLLO4B, DAM_PART_APOLLO3) and
any call to am_util_stdio_vprintf (which isn't declared) with the correct GAP9
platform macros and the proper GAP9 logging/printf API (or fall back to
vprintf). Locate the conditional around am_util_stdio_vprintf/vprintf in Util.c,
replace the macro checks with the GAP9-specific macro(s) you find via the
suggested rg search, use '||' for logical OR, and ensure the selected function
(vprintf or the GAP9 declared equivalent) is actually declared/available in
included headers.
🟡 Minor comments (15)
Container/Dockerfile.Gap9-3-5 (1)

3-5: Remove duplicate environment variable.

GAP_RISCV_GCC_TOOLCHAIN is set twice with identical values on lines 3 and 5.

Proposed fix
 FROM ghcr.io/pulp-platform/deeploy:latest

 ENV GAP_RISCV_GCC_TOOLCHAIN=/app/install/gcc/gap9
 ENV GAP_SDK_HOME=/app/install/gap9-sdk
-ENV GAP_RISCV_GCC_TOOLCHAIN=/app/install/gcc/gap9
DeeployTest/Platforms/GAP9/src/deeploytest.c-98-98 (1)

98-98: Typo: "Intializing" should be "Initializing".

Fix
-  printf("Intializing\r\n");
+  printf("Initializing\r\n");
DeeployTest/testUtils/platformMapping.py-210-225 (1)

210-225: Missing inputOffsets parameter in GAP9Deployer instantiation.

All other platform deployers in this function pass the inputOffsets parameter, but the GAP9Deployer call on Line 218 omits it. This means user-specified input offsets will be ignored for GAP9 deployments.

🔧 Suggested fix
     deployer = GAP9Deployer(graph,
                             platform,
                             inputTypes,
                             loweringOptimizer,
                             scheduler,
                             name = name,
                             default_channels_first = default_channels_first,
-                            deeployStateDir = deeployStateDir)
+                            deeployStateDir = deeployStateDir,
+                            inputOffsets = inputOffsets)
DeeployTest/Platforms/GAP9/sdk.config-11-19 (1)

11-19: Double-check storage/readfs configuration coherence (FLASH vs MRAM).

You enable CONFIG_DRIVER_TYPE_FLASH, CONFIG_DRIVER_MRAM, and pick CONFIG_READFS_FLASH_TYPE_OSPI=y while the MRAM readfs type is commented. If the intent is “L3/readfs via OSPI”, consider adding a short comment explaining why MRAM is enabled (driver dependency vs actual storage).

DeeployTest/Platforms/GAP9/sdk.config-5-7 (1)

5-7: Ensure board selection is mutually exclusive (or clearly intentional).

Both CONFIG_BOARD_GAP9MOD_V1_0_B=y and CONFIG_BOARD_GAP9EVK_V1_3=y are enabled; many SDKs assume exactly one board is selected, which can lead to conflicting BSP config.

DeeployTest/Platforms/GAP9/CMakeLists.txt-20-23 (1)

20-23: target_compile_options(${ProjectId} INTERFACE network) looks accidental.

network here is a target name, not a compiler flag. Even if it’s harmless (INTERFACE on an executable), it’s confusing and easy to cargo-cult elsewhere.

Proposed fix
 target_link_libraries(${ProjectId} PRIVATE network deeploylib)
-target_compile_options(${ProjectId} INTERFACE network)
 add_gvsoc_emulation(${ProjectId} "gap9.evk")
GAP9.md-6-6 (1)

6-6: Fix grammatical error.

The phrase "does yet not include" should be "does not yet include".

📝 Proposed fix
-To use Deeploy with GAP9, a custom Docker container is required because the official Deeploy Docker image does yet not include the necessary SDKs and dependencies for GAP9 development, because they are not publicly available.
+To use Deeploy with GAP9, a custom Docker container is required because the official Deeploy Docker image does not yet include the necessary SDKs and dependencies for GAP9 development, because they are not publicly available.
GAP9.md-22-22 (1)

22-22: Fix typo in variable name.

The variable name should be DEEPLOY_IMAGE not DEEPOY_IMAGE (missing an 'L').

📝 Proposed fix
-make deeploy DEEPOY_IMAGE=deeploy:gap9
+make deeploy DEEPLOY_IMAGE=deeploy:gap9
DeeployTest/testUtils/testRunner.py-315-315 (1)

315-315: Fix unnecessary f-string prefix.

The assertion message is an f-string without any placeholders. Remove the f prefix for consistency and clarity.

🐛 Proposed fix
-assert self._args.gvsoc_install_dir is not None, f"Environment variable GVSOC_INSTALL_DIR is not set"
+assert self._args.gvsoc_install_dir is not None, "Environment variable GVSOC_INSTALL_DIR is not set"

Based on static analysis hint.

.github/workflows/_runner-gap9-tiled.yml-62-65 (1)

62-65: Error suppression could mask setup failures.

Both the SDK config sourcing and pip installation use || true to suppress errors. This means failures in environment setup or dependency installation will be silently ignored, potentially leading to test failures that are harder to diagnose.

Consider removing || true for the pip install step to ensure dependencies are properly installed:

💡 Proposed fix
-source /app/install/gap9-sdk/configs/gap9_evk_audio.sh || true
-pip install -e . || true
+source /app/install/gap9-sdk/configs/gap9_evk_audio.sh
+pip install -e .

If these failures are expected in some scenarios, please document why silent failure is acceptable.

TargetLibraries/GAP9/CMakeLists.txt-25-27 (1)

25-27: Duplicate NUM_CORES compile definition.

NUM_CORES is added twice:

  1. Line 26: target_compile_options(deeploygap9 PUBLIC -DNUM_CORES=${NUM_CORES})
  2. Line 44: add_compile_definitions(NUM_CORES=${NUM_CORES})

The add_compile_definitions call applies globally to all targets in the directory, which may unintentionally affect other targets. If the intent is to propagate NUM_CORES to pulp-nn-mixed, consider using target_compile_definitions instead.

Suggested fix
-add_compile_definitions(NUM_CORES=${NUM_CORES})
+# NUM_CORES is already propagated via deeploygap9 PUBLIC compile options

Or if pulp-nn-mixed needs it explicitly:

-add_compile_definitions(NUM_CORES=${NUM_CORES})
+target_compile_definitions(pulp-nn-mixed PUBLIC NUM_CORES=${NUM_CORES})

Also applies to: 44-44

TargetLibraries/GAP9/inc/mchan.h-98-105 (1)

98-105: Comment typo: duplicate "v7" reference.

The comment on lines 98-99 mentions "v7" twice. Based on the code logic, the #else branch handles non-v7 (presumably v6), so the comment should read something like:

"MCHAN version 7 takes 2D count and stride in 2 steps; v6 takes it in 1 step with the stride shifted to the upper 16 bits."

Deeploy/Targets/GAP9/Deployer.py-45-68 (1)

45-68: Fix mutable default argument for inputOffsets.

Using {} as a default argument is a Python anti-pattern—the same dictionary instance is shared across all calls, which can lead to subtle bugs.

Proposed fix
     def __init__(self,
                  graph: gs.Graph,
                  deploymentPlatform: DeploymentPlatform,
                  inputTypes: Dict[str, Type[Pointer]],
                  loweringOptimizer: TopologyOptimizer,
                  scheduler: Callable = lambda x: x,
                  name: str = 'DeeployNetwork',
                  default_channels_first = False,
                  deeployStateDir: str = "DeeployStateDir",
-                 inputOffsets = {}):
+                 inputOffsets: Dict[str, int] = None):
+        if inputOffsets is None:
+            inputOffsets = {}
         super().__init__(graph,
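
To illustrate why the shared default matters, here is a minimal standalone sketch. It is not taken from Deployer.py; the function names `shared_default` and `fresh_default` are hypothetical and only mirror the `inputOffsets` pattern discussed above. Note that `Optional[Dict[str, int]]` is the stricter annotation for a parameter that defaults to `None`.

```python
from typing import Dict, Optional


def shared_default(key: str, offsets: Dict[str, int] = {}) -> Dict[str, int]:
    # Anti-pattern: the {} is created once, when the function is defined,
    # and reused by every call that omits the argument.
    offsets[key] = len(offsets)
    return offsets


def fresh_default(key: str, offsets: Optional[Dict[str, int]] = None) -> Dict[str, int]:
    # Recommended: build a new dict on each call that omits the argument.
    if offsets is None:
        offsets = {}
    offsets[key] = len(offsets)
    return offsets


first = shared_default("a")
second = shared_default("b")   # same object as `first`; "a" leaks into this call

third = fresh_default("a")
fourth = fresh_default("b")    # independent of `third`
```

With the `None` sentinel, each deployer instance would start from its own empty mapping unless the caller passes one explicitly.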
TargetLibraries/GAP9/src/dory_mem.c-153-176 (1)

153-176: File handle is never closed after reading.

Same issue as load_file_to_ram - the file descriptor is not closed after use.

🔧 Suggested fix
     offset += load_size;
   }

+  pi_fs_close(fd);
   return offset;
 }
TargetLibraries/GAP9/src/dory_mem.c-126-151 (1)

126-151: File handle is never closed after reading.

The function opens a file with pi_fs_open but never closes it with pi_fs_close, which could lead to resource leaks if called multiple times.

🔧 Suggested fix
   } while (offset < size);

+  pi_fs_close(fd);
   return offset;
 }
🧹 Nitpick comments (41)
TargetLibraries/GAP9/src/Util.c (1)

7-7: Potentially unused include.

DeeployGAP9Math.h doesn't appear to be used in this file. Consider removing it unless it's required for type definitions or forward declarations used elsewhere.

toolchain/banshee.patch (1)

90-98: LGTM: Using libc::c_char improves FFI type correctness.

Changing NONAME from &'static i8 to &'static libc::c_char is the proper way to represent C character pointers in Rust FFI code. While c_char is typically i8, it can be u8 on some platforms, so this change improves portability and type safety.

Deeploy/Targets/GAP9/Templates/__init__.py (1)

5-5: Consider explicit imports for better clarity.

While from . import * is functional for package re-exports, explicit imports improve code clarity and enable static analysis. For example:

from .AllocateTemplate import *
from .FreeTemplate import *

This makes it clear which submodules are being exported and helps static analyzers verify the imports.

Deeploy/Targets/GAP9/__init__.py (1)

5-5: Consider explicit imports for better clarity.

While from . import * is functional for package re-exports, explicit imports improve code clarity and enable static analysis. For example:

from . import Bindings
from . import Deployer
from . import Platform
from . import DMA
from . import Templates
from . import Tiler

This makes it clear which submodules are being exported and helps static analyzers verify the imports.

Container/Makefile (1)

37-41: Add documentation for SSH agent requirement.

The --ssh default flag is necessary—GAP9 SDK builds require SSH access to clone private dependencies from the PULP toolchain repositories. To help users, document this requirement (e.g., in a README or Makefile comment) noting that the build requires an SSH agent with proper credentials configured.

DeeployTest/Platforms/GAP9/inc/CycleCounter.h (1)

7-8: Consider a more specific include guard name.

The include guard CYCLECOUNTER is generic and may conflict with other headers. A more descriptive guard like DEEPLOY_GAP9_CYCLECOUNTER_H or GAP9_CYCLE_COUNTER_H_ would reduce collision risk.

Suggested improvement
-#ifndef CYCLECOUNTER
-#define CYCLECOUNTER
+#ifndef DEEPLOY_GAP9_CYCLECOUNTER_H
+#define DEEPLOY_GAP9_CYCLECOUNTER_H

And at line 22:

-#endif
+#endif // DEEPLOY_GAP9_CYCLECOUNTER_H
Container/Dockerfile.Gap9 (1)

30-32: Trailing whitespace in cleanup command.

Line 32 has trailing whitespace after the path which could cause issues in some contexts.

Proposed fix
 RUN --mount=type=cache,target=/ccache \
     ccache -z && make gap9-toolchain && \
-    rm -rf /app/toolchain/gap9-toolchain 
+    rm -rf /app/toolchain/gap9-toolchain
DeeployTest/Platforms/GAP9/src/CycleCounter.c (1)

10-19: Minor: Function parameter style inconsistent with header.

The header declares functions with (void) but the implementation uses empty (). While technically compatible in C, consistent style is preferred.

Suggested fix for consistency
-void ResetTimer() {
+void ResetTimer(void) {
   pi_perf_conf(1 << PI_PERF_CYCLES);
   pi_perf_reset();
 }

-void StartTimer() { pi_perf_start(); }
+void StartTimer(void) { pi_perf_start(); }

-void StopTimer() { pi_perf_stop(); }
+void StopTimer(void) { pi_perf_stop(); }

-unsigned int getCycles() { return pi_perf_read(PI_PERF_CYCLES); }
+unsigned int getCycles(void) { return pi_perf_read(PI_PERF_CYCLES); }
.github/workflows/infra-generate-ccache.yml (1)

51-66: Silenced configuration sourcing may hide failures.

Line 61 uses || true which suppresses any errors from sourcing the GAP9 config. If the configuration fails, the subsequent tests may run in an incorrect environment without any indication.

Consider logging a warning or removing || true to fail fast on configuration issues.

Suggested improvement
-          source /app/install/gap9-sdk/configs/gap9_evk_audio.sh || true
+          if ! source /app/install/gap9-sdk/configs/gap9_evk_audio.sh; then
+            echo "Warning: Failed to source GAP9 config, continuing anyway"
+          fi
cmake/gap9/gap9_gvsoc.cmake (3)

47-52: Redundant condition check.

The if(GAPY_RUNNER_ARGS) check on line 48 is redundant since the enclosing block (line 24) already confirms GAPY_RUNNER_ARGS is set.

Suggested simplification
         # Add readfs files if provided
-        if(GAPY_RUNNER_ARGS)
-            list(LENGTH GAPY_RUNNER_ARGS num_readfs_files)
-            message(STATUS "[Deeploy GAP9] Adding ${num_readfs_files} readfs file(s)")
-            list(APPEND GAPY_CMD ${GAPY_RUNNER_ARGS})
-        endif()
+        list(LENGTH GAPY_RUNNER_ARGS num_readfs_files)
+        message(STATUS "[Deeploy GAP9] Adding ${num_readfs_files} readfs file(s)")
+        list(APPEND GAPY_CMD ${GAPY_RUNNER_ARGS})

83-85: POST_BUILD has no effect on custom targets.

POST_BUILD is only meaningful for add_custom_command attached to library/executable targets, not for add_custom_target. It can be safely removed from both blocks.

Additionally, VERBATIM is used in L3 mode (line 85) but not in L2 mode (line 121), which could cause inconsistent shell quoting behavior.

Suggested fix

For L3 mode (lines 83-85):

             COMMENT "Simulating ${name} with gapy for GAP9 (L3 mode)"
-            POST_BUILD
             USES_TERMINAL
             VERBATIM

For L2 mode (lines 119-121):

             COMMENT "Simulating ${name} with gvsoc for GAP9 (L2 mode)"
-            POST_BUILD
             USES_TERMINAL
+            VERBATIM
         )

Also applies to: 119-121


73-73: Inconsistent error handling for copy commands.

L3 mode uses bash -c with 2>/dev/null || true (line 73), while L2 mode uses direct CMake command with || true (line 109). The latter may not work correctly as CMake's copy_if_different doesn't recognize shell operators.

Consider using consistent error suppression:

Suggested fix for L2 mode
-            COMMAND ${CMAKE_COMMAND} -E copy_if_different ${CMAKE_BINARY_DIR}/*.bin ${GVSOC_WORKDIR}/ || true
+            COMMAND bash -c "${CMAKE_COMMAND} -E copy_if_different ${CMAKE_BINARY_DIR}/*.bin ${GVSOC_WORKDIR}/ 2>/dev/null || true"

Also applies to: 109-109

TargetLibraries/GAP9/inc/dory_mem.h (3)

7-8: Consider using a more specific include guard name.

The guard __MEM_H__ is very generic and could conflict with other headers in the project or dependencies. A more specific name like __DEEPLOY_GAP9_DORY_MEM_H__ or __GAP9_DORY_MEM_H__ would reduce collision risk.

Suggested fix
-#ifndef __MEM_H__
-#define __MEM_H__
+#ifndef __DEEPLOY_GAP9_DORY_MEM_H__
+#define __DEEPLOY_GAP9_DORY_MEM_H__

And at the end:

-#endif // __MEM_H__
+#endif // __DEEPLOY_GAP9_DORY_MEM_H__

10-12: Missing include for struct pi_device declaration.

The header declares extern struct pi_device ram; but doesn't include the header that defines struct pi_device. Users of this header will get compilation errors unless they include pmsis.h before this header. Consider adding the necessary include or a forward declaration.

Suggested fix
 #include <stddef.h>
+#include "pmsis.h"
 
 extern struct pi_device ram;

Or if you want to minimize includes, add a forward declaration (though this may not work depending on how pi_device is defined):

 #include <stddef.h>
+
+struct pi_device;
 
 extern struct pi_device ram;

19-24: Inconsistent const qualifiers between header and implementation.

The implementation in dory_mem.c uses const size_t size for ram_read and ram_write, but the header declares them without const. While this doesn't affect ABI compatibility, it's good practice to keep declarations consistent for clarity.

Suggested fix
-void ram_read(void *dest, void *src, size_t size);
-void ram_write(void *dest, void *src, size_t size);
+void ram_read(void *dest, void *src, const size_t size);
+void ram_write(void *dest, void *src, const size_t size);
DeeployTest/Platforms/GAP9/src/deeploytest.c (3)

61-63: Inefficient use of pi_cl_team_fork when only core 0 performs work.

CL_CompareFloat forks NUM_CORES threads but CompareFloatOnCluster only executes on core 0 (line 32). This wastes resources spawning idle cores. Consider either using a single-core dispatch or distributing the comparison work across all cores for parallelism.

Option 1: Single core execution
 void CL_CompareFloat(void *arg) {
-  pi_cl_team_fork(NUM_CORES, CompareFloatOnCluster, arg);
+  pi_cl_team_fork(1, CompareFloatOnCluster, arg);
 }
Option 2: Parallelize comparison (if performance is critical)

Distribute num_elements across cores using pi_core_id() to partition work.
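
As a rough sketch of Option 2, the partitioning arithmetic could look like the following. It is written in Python purely to show the index math; the function name `partition` is hypothetical, and on the cluster each core would compute only its own range from `pi_core_id()` rather than building the full list.

```python
from typing import List, Tuple


def partition(num_elements: int, num_cores: int) -> List[Tuple[int, int]]:
    """Return one half-open [start, end) index range per core."""
    chunk = (num_elements + num_cores - 1) // num_cores  # ceiling division
    ranges = []
    for core_id in range(num_cores):
        # Clamp both ends so trailing cores get an empty range
        # when there are fewer chunks than cores.
        start = min(core_id * chunk, num_elements)
        end = min(start + chunk, num_elements)
        ranges.append((start, end))
    return ranges


ranges = partition(10, 8)
```

The ranges are contiguous and non-overlapping, so each element is compared exactly once; cores whose range is empty simply do no work.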


110-110: Extract magic number 0x10000000 to a named constant.

The address 0x10000000 is used multiple times to distinguish between L3 and L2 memory regions. Consider defining a descriptive constant to improve readability and maintainability.

Suggested fix
+#define L2_MEM_BASE_ADDR 0x10000000
+
 // Then replace usages:
-    if ((uint32_t)DeeployNetwork_inputs[buf] >= 0x10000000) {
+    if ((uint32_t)DeeployNetwork_inputs[buf] >= L2_MEM_BASE_ADDR) {

Also applies to: 138-138, 176-176


42-55: Float comparison tolerance is hardcoded.

The tolerance 1e-4 for float comparison is hardcoded. For different models or precision requirements, this may need adjustment. Consider making it configurable or at least defining it as a named constant.

Suggested fix
+#define FLOAT_COMPARE_TOLERANCE 1e-4
+
 // In CompareFloatOnCluster:
-      if ((diff < -1e-4) || (diff > 1e-4) || isnan(diff)) {
+      if ((diff < -FLOAT_COMPARE_TOLERANCE) || (diff > FLOAT_COMPARE_TOLERANCE) || isnan(diff)) {
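
One detail worth keeping when the tolerance is factored out is the explicit NaN check. The sketch below mirrors the C condition in Python to show why; the function name `mismatch` is hypothetical and the constant simply reuses the value from the comment above.

```python
import math

FLOAT_COMPARE_TOLERANCE = 1e-4  # same value as the hardcoded tolerance discussed above


def mismatch(expected: float, actual: float, tol: float = FLOAT_COMPARE_TOLERANCE) -> bool:
    diff = expected - actual
    # NaN compares False against every bound, so without the isnan()
    # term a NaN output would be silently accepted as a match.
    return diff < -tol or diff > tol or math.isnan(diff)


nan_diff = 1.0 - float("nan")
range_checks_alone = nan_diff < -FLOAT_COMPARE_TOLERANCE or nan_diff > FLOAT_COMPARE_TOLERANCE
```

Here `range_checks_alone` evaluates to false for a NaN difference, which is exactly the case the third term catches.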
toolchain/gap9-sdk.patch (1)

34-65: Large blocks of commented-out bindings in soc.py.

The patch comments out 32 self.bind() calls for SFU stream bindings. If these are permanently disabled for GAP9 support, consider removing them entirely or adding a comment explaining why they're disabled. Commented-out code can become stale and confusing.

If these bindings are expected to be re-enabled later, add a # TODO: Re-enable when SFU support is added comment. Otherwise, consider removing the commented lines.

Also applies to: 73-104

Deeploy/Targets/GAP9/DMA/MchanDma.py (1)

29-39: Annotate mutable class attributes with typing.ClassVar.

Per static analysis, _transferTemplates and _waitingStrategy are mutable class attributes that should be annotated with ClassVar to clarify they are shared across all instances.

Suggested fix
+from typing import ClassVar, Dict, Tuple
-from typing import Dict, Tuple
 
 # ... 

 class GAP9MchanDma(AsyncDma):
 
-    _transferTemplates = {
+    _transferTemplates: ClassVar[Dict[int, NodeTemplate]] = {
             NodeTemplate(
                 "{ mchan_transfer_t __mchan_tmp = { .cmd = ${cmd}, .size = ${size}, .loc = ${loc}, .ext = ${ext} }; mchan_transfer_push_1d(__mchan_tmp); }"
             ),
             NodeTemplate(
                 "{ mchan_transfer_t __mchan_tmp = { .cmd = ${cmd}, .size = ${size}, .loc = ${loc}, .ext = ${ext}, .ext_size_1d = ${size_1d}, .ext_stride_1d = ${stride_2d} }; mchan_transfer_push_2d(__mchan_tmp); }"
             ),
     }
-    _waitingStrategy = DirectionWaitingStrategy(MchanTransferFuture, "transfer")
+    _waitingStrategy: ClassVar[DirectionWaitingStrategy] = DirectionWaitingStrategy(MchanTransferFuture, "transfer")
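
For readers unfamiliar with `ClassVar`, a minimal standalone sketch of the annotation follows. The class `DmaSketch` and its string templates are hypothetical stand-ins for `GAP9MchanDma` and `NodeTemplate`, not the actual Deeploy types.

```python
from typing import ClassVar, Dict


class DmaSketch:
    # One dict shared by every instance; ClassVar tells type checkers and
    # linters that sharing is intentional and that this is not a
    # per-instance field.
    templates: ClassVar[Dict[int, str]] = {1: "push_1d", 2: "push_2d"}

    def __init__(self, name: str) -> None:
        self.name = name  # ordinary per-instance attribute


a = DmaSketch("a")
b = DmaSketch("b")
```

The annotation does not change runtime behaviour: `a.templates` and `b.templates` already refer to the same object. It only documents that fact and silences the static-analysis warning.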
DeeployTest/testUtils/codeGenerate.py (1)

269-274: Good practice: Cleanup prevents test contamination.

Removing stale hex files between runs is important for GAP9 L2/L3 testing. The implementation is correct.

Optional: Move import to module level

For consistency with other imports, consider moving the shutil import to the module-level imports (line 5) instead of importing it locally:

 import os
+import shutil
 from typing import List, Tuple

Then remove the local import on line 270. This is a minor style preference.

TargetLibraries/GAP9/inc/DeeployGAP9Math.h (1)

16-18: Single-core macros are functional but could benefit from documentation.

The macros correctly restrict execution to core 8 (fabric controller) or core 0 (first cluster core). Consider adding comments explaining when to use each macro and the implications of other cores skipping the guarded code.

Suggested documentation
+// Single-core execution macros for GAP9
+// Core 8: Fabric Controller, Core 0: First cluster core
+// Use BEGIN_SINGLE_CORE/END_SINGLE_CORE for code blocks
+// Use SINGLE_CORE for single statements
+// Other cores will skip the guarded code
 #define BEGIN_SINGLE_CORE if (pi_core_id() == 8 || pi_core_id() == 0) {
 #define END_SINGLE_CORE }
 #define SINGLE_CORE if (pi_core_id() == 8 || pi_core_id() == 0)
Deeploy/Targets/PULPOpen/Templates/TransposeTemplate.py (1)

96-103: Good workaround; tighten types (const src, and use data_out_type for dst).

The src/dst pointer approach is much cleaner and should avoid the previous GCC crash. Minor hardening: make src const, and type dst off data_out_type (in case future transpose variants allow differing in/out types). Based on learnings, this also keeps explicit, templated bitwidth types.

Proposed tweak
-// RW: GCC Segmentation fault
-
-${data_in_type.referencedType.typeName} (*src)${shapeStr} = (${data_in_type.referencedType.typeName} (*)${shapeStr})<%text>${data_in}</%text>;
-
-${data_in_type.referencedType.typeName} (*dst)${outShapeStr} = (${data_in_type.referencedType.typeName} (*)${outShapeStr})<%text>${data_out}</%text>;
+// RW: GCC segmentation fault (workaround: simplify indexing via typed pointers)
+
+const ${data_in_type.referencedType.typeName} (*src)${shapeStr} =
+  (const ${data_in_type.referencedType.typeName} (*)${shapeStr})<%text>${data_in}</%text>;
+
+${data_out_type.referencedType.typeName} (*dst)${outShapeStr} =
+  (${data_out_type.referencedType.typeName} (*)${outShapeStr})<%text>${data_out}</%text>;
DeeployTest/Platforms/GAP9/CMakeLists.txt (1)

24-30: Consider avoiding a blanket -Wno-error for the network target.

If the goal is just to get past known SDK/header warnings, keeping the suppression to the specific warning classes is safer than disabling -Werror behavior wholesale.

CMakeLists.txt (2)

38-38: Remove or properly enable the commented debug line.

The commented set( CMAKE_MESSAGE_LOG_LEVEL "DEBUG" ) should either be removed or enabled via a configuration option.

♻️ Proposed fix
-  # set( CMAKE_MESSAGE_LOG_LEVEL "DEBUG" )

Or enable it conditionally:

-  # set( CMAKE_MESSAGE_LOG_LEVEL "DEBUG" )
+  if(CMAKE_VERBOSE_MAKEFILE)
+    set(CMAKE_MESSAGE_LOG_LEVEL "DEBUG")
+  endif()

56-59: Document why common.cmake is excluded for GAP9.

The exclusion of common.cmake for GAP9 suggests a fundamentally different build pattern. Please add a comment explaining why GAP9 requires this special treatment.

 # Import useful functions / macros
 include(${CMAKE_CURRENT_LIST_DIR}/cmake/Util.cmake)
-# Only if not GAP9
+# GAP9 uses GAP SDK's own build system setup, so common.cmake is not needed
 if(NOT platform STREQUAL GAP9)
   include(${CMAKE_CURRENT_LIST_DIR}/cmake/common.cmake)
 endif()
DeeployTest/testRunner_gap9.py (1)

26-28: Avoid accessing private parser attributes.

Directly accessing parser._actions and modifying action defaults is fragile and couples the code to ArgumentParser's internal implementation.

♻️ Proposed refactor using public API

Consider setting the default via environment variable or using set_defaults():

-    # Set default GVSOC install dir
-    for action in parser._actions:
-        if action.dest == 'gvsoc_install_dir':
-            action.default = "${GAP_SDK_HOME}/install/workstation"
-    args = parser.parse_args()
+    import os
+    
+    # Set default GVSOC install dir from GAP SDK if available
+    gap_sdk_home = os.environ.get('GAP_SDK_HOME')
+    if gap_sdk_home:
+        default_gvsoc = os.path.join(gap_sdk_home, 'install', 'workstation')
+        parser.set_defaults(gvsoc_install_dir=default_gvsoc)
+    
+    args = parser.parse_args()

This approach:

  1. Uses the public set_defaults() API
  2. Properly constructs the path using os.path.join
  3. Handles the case when GAP_SDK_HOME is not set
  4. Resolves the environment variable immediately rather than relying on shell expansion
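
The precedence this refactor relies on can be checked in isolation. The sketch below assumes a plain `argparse.ArgumentParser` with a single hypothetical `--gvsoc_install_dir` option (the real `TestRunnerArgumentParser` defines more) and passes the environment in as a dict so the behavior is reproducible: `set_defaults()` replaces the default, while an explicit command-line value still wins.

```python
import argparse
import os


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical minimal parser, for illustration only.
    parser = argparse.ArgumentParser()
    parser.add_argument("--gvsoc_install_dir", default=None)
    return parser


def apply_sdk_default(parser: argparse.ArgumentParser, env: dict) -> None:
    # Override the default only when GAP_SDK_HOME is present in the environment.
    sdk_home = env.get("GAP_SDK_HOME")
    if sdk_home:
        parser.set_defaults(
            gvsoc_install_dir=os.path.join(sdk_home, "install", "workstation"))


parser = build_parser()
apply_sdk_default(parser, {"GAP_SDK_HOME": "/opt/gap_sdk"})
from_env = parser.parse_args([]).gvsoc_install_dir
from_cli = parser.parse_args(["--gvsoc_install_dir", "/custom"]).gvsoc_install_dir

# With GAP_SDK_HOME unset, the original default is left untouched.
unset_parser = build_parser()
apply_sdk_default(unset_parser, {})
from_unset = unset_parser.parse_args([]).gvsoc_install_dir
```

In the test runner itself, `os.environ` would be passed in place of the literal dicts.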
Deeploy/Targets/GAP9/Templates/FreeTemplate.py (2)

7-8: Identical L2 free templates suggest potential consolidation.

gap9L2LocalTemplate and gap9L2GlobalTemplate have identical implementations. If they're truly meant to behave the same way, consider using a single template definition to reduce duplication.

If these are intentionally separate because they may diverge in the future or have different semantic meanings despite identical current implementations, please add a comment explaining the distinction.

♻️ Proposed consolidation
-gap9L2LocalTemplate = NodeTemplate("pi_l2_free(${name}, sizeof(${type.referencedType.typeName}) * ${size});")
-gap9L2GlobalTemplate = NodeTemplate("pi_l2_free(${name}, sizeof(${type.referencedType.typeName}) * ${size});")
+gap9L2FreeTemplate = NodeTemplate("pi_l2_free(${name}, sizeof(${type.referencedType.typeName}) * ${size});")
+gap9L2LocalTemplate = gap9L2FreeTemplate
+gap9L2GlobalTemplate = gap9L2FreeTemplate

12-22: LGTM with minor formatting suggestion.

The generic free template provides comprehensive coverage of memory levels with appropriate fallback handling. The logic correctly routes to the right free API based on the memory level.

Minor formatting suggestion for the compiler block comment:

♻️ Proposed formatting improvement
-//COMPILER BLOCK - MEMORYLEVEL ${_memoryLevel} NOT FOUND \n
+// COMPILER BLOCK - MEMORY LEVEL ${_memoryLevel} NOT FOUND\n
.github/workflows/ci-platform-gap9-tiled.yml (2)

155-182: Consider simplifying redundant matrix entries.

The num-cores: [8] matrix (line 175) has a single value and could be simplified by using a direct value in the with block, similar to how gap9-kernels-tiled-singlebuffer-L2 handles it (line 93). This reduces indirection.

If you intend to expand to multiple core counts in the future, keeping it as a matrix is fine—just noting the inconsistency with the kernels jobs.


184-207: Inconsistent double-buffer specification.

The double-buffer flag is defined in the matrix (line 199) with a single value [true], then passed via ${{ matrix.double-buffer }} (line 207). Compare this to gap9-kernels-tiled-doublebuffer-L2 (line 153) which simply uses double-buffer: true directly.

Consider aligning the approach for consistency—either use a direct value or keep it in the matrix if you plan to test both modes in the same job.

Deeploy/Targets/GAP9/Templates/AllocateTemplate.py (2)

57-68: Silent fallback may hide configuration errors.

The else branch (lines 64-67) silently falls back to L2 allocation when an unknown _memoryLevel is encountered, only leaving a comment in the generated code. This could mask configuration bugs.

Consider emitting a warning or raising an error during code generation instead of silently proceeding:

Alternative: Fail at generation time

Instead of generating fallback code, you could validate _memoryLevel in the deployer before template generation and raise an explicit error for unsupported levels.
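
One way such a generation-time check could look is sketched below. The function name `check_memory_level` and the level names in `SUPPORTED_LEVELS` are assumptions made for illustration; in Deeploy the valid names would come from the configured memory hierarchy.

```python
from typing import Iterable

# Assumed level names, for illustration only.
SUPPORTED_LEVELS = ("L1", "L2", "L3")


def check_memory_level(buffer_name: str,
                       level: str,
                       supported: Iterable[str] = SUPPORTED_LEVELS) -> str:
    """Return the level unchanged, or raise if no allocation template covers it."""
    supported = tuple(supported)
    if level not in supported:
        raise ValueError(
            f"Buffer '{buffer_name}': unsupported memory level '{level}' "
            f"(expected one of {', '.join(supported)})")
    return level


ok = check_memory_level("input_0", "L2")

try:
    check_memory_level("weights_3", "L4")
    error_message = None
except ValueError as exc:
    error_message = str(exc)
```

Raising before the template is rendered turns a misconfigured buffer into a failed code-generation run instead of a silent L2 allocation in the generated C.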


10-10: Consider removing commented-out code.

Lines 10, 23, and 30 contain commented-out alternative implementations. If these are obsolete, consider removing them to reduce noise. If they're kept for reference, a brief comment explaining why would help.

TargetLibraries/GAP9/CMakeLists.txt (1)

29-36: Aggressive warning suppressions may hide issues.

Several of these warnings (-Wno-implicit-function-declaration, -Wno-incompatible-pointer-types) can mask real bugs in your own code. Consider whether all these are necessary for the GAP9-specific sources, or if some could be scoped more narrowly to third-party code only.

Deeploy/Targets/GAP9/DMA/L3Dma.py (1)

27-38: Class attribute annotation.

The static analyzer suggests annotating _transferTemplates with typing.ClassVar. While not strictly required, it clarifies intent and prevents accidental instance-level mutation.

Optional fix
+from typing import ClassVar, Dict, Tuple
+
 class GAP9L3Dma(AsyncDma):

-    _transferTemplates = {
+    _transferTemplates: ClassVar[Dict[int, NodeTemplate]] = {
             NodeTemplate(
                 "pi_cl_ram_copy_2d(get_ram_ptr(), ${ext}, ${loc}, ${transfer_size}, ${stride}, ${length}, ${ext2loc}, &${future});"
             )
     }
TargetLibraries/GAP9/inc/mchan.h (1)

86-138: Static functions in header should be static inline.

Functions defined in a header file with just static linkage will be duplicated in each translation unit that includes this header. While this works, using static inline is the conventional approach and hints to the compiler that inlining is preferred.

Proposed changes
-static int mchan_transfer_get_id() { return MCHAN_READ_CMD(); }
+static inline int mchan_transfer_get_id() { return MCHAN_READ_CMD(); }

-static void mchan_transfer_push_1d(mchan_transfer_t trans) {
+static inline void mchan_transfer_push_1d(mchan_transfer_t trans) {
   // ...
 }

-static void mchan_transfer_push_2d(mchan_transfer_t trans) {
+static inline void mchan_transfer_push_2d(mchan_transfer_t trans) {
   // ...
 }

Apply similarly to mchan_transfer_push, mchan_transfer_free, mchan_transfer_busy, and mchan_transfer_wait.

TargetLibraries/GAP9/src/dory_mem.c (1)

129-130: Error messages missing newline character.

The printf statements on lines 129 and 156 are missing the newline character \n at the end, inconsistent with other error messages in this file.

✏️ Suggested fix
-    printf("ERROR: Cannot open file %s! Exiting...", filename);
+    printf("ERROR: Cannot open file %s! Exiting...\n", filename);

Also applies to: 155-157

TargetLibraries/GAP9/src/dory_dma.c (1)

110-143: Consider simplifying single-core 3D transfer logic.

The log2(1) on line 113 always evaluates to 0, making number_of_2d_copies_per_core equal to copy->number_of_2d_copies. This effectively means no parallelization occurs. If this is intentional (single-core execution), the code could be simplified. If multi-core execution is desired, this appears to be incomplete.
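
The arithmetic behind this observation can be worked through in a few lines. This is only a sketch: it assumes the per-core count is obtained by shifting the total by log2 of the core count, which the comment implies but the C source is not shown here, and `copies_per_core` is a hypothetical name.

```python
import math


def copies_per_core(number_of_2d_copies: int, cores: int) -> int:
    # Assumed formula: divide the 2D copies among cores with a right shift.
    return number_of_2d_copies >> int(math.log2(cores))


# log2(1) == 0, so a shift by zero leaves every copy on the calling core.
single_core = copies_per_core(16, 1)

# With 8 cores the same 16 copies would be split into 2 per core.
eight_cores = copies_per_core(16, 8)
```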

Deeploy/Targets/GAP9/Platform.py (3)

263-272: Avoid mutable default arguments and function calls in argument defaults.

The engines parameter uses a mutable list and a function call as default, which can cause unexpected behavior if the list is modified. This is flagged by static analysis (B006, B008).

♻️ Suggested fix
 class GAP9Platform(DeploymentPlatform):

     def __init__(self,
-                 engines = [GAP9ClusterEngine("GAP9Cluster")],
+                 engines = None,
                  variableBuffer = GAP9VariableBuffer,
                  constantBuffer = GAP9ConstantBuffer,
                  structBuffer = GAP9StructBuffer,
                  transientBuffer = GAP9TransientBuffer) -> None:
+        if engines is None:
+            engines = [GAP9ClusterEngine("GAP9Cluster")]
         super().__init__(engines, variableBuffer, constantBuffer, structBuffer, transientBuffer)
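
The pitfall behind B006 can be reproduced with a small standalone sketch. `Engine`, `SharedDefault`, and `FreshDefault` are hypothetical stand-ins for `GAP9ClusterEngine` and the platform classes; only the default-argument behavior is being shown.

```python
class Engine:
    # Hypothetical stand-in for GAP9ClusterEngine.
    def __init__(self, name: str) -> None:
        self.name = name


class SharedDefault:
    # The default list is built once, when the def statement is executed.
    def __init__(self, engines=[Engine("cluster")]) -> None:
        self.engines = engines


class FreshDefault:
    # None sentinel: a new list is built on every call that omits the argument.
    def __init__(self, engines=None) -> None:
        if engines is None:
            engines = [Engine("cluster")]
        self.engines = engines


p1, p2 = SharedDefault(), SharedDefault()
p1.engines.append(Engine("extra"))
leaked = len(p2.engines)  # p2 sees the engine appended through p1

q1, q2 = FreshDefault(), FreshDefault()
q1.engines.append(Engine("extra"))
isolated = len(q2.engines)  # q2 keeps its own single-engine list
```

With the shared default, two platform objects created without an explicit `engines` argument end up holding the same list, so registering an engine on one silently affects the other.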

274-292: Same mutable default argument issue and missing ClassVar annotation.

The MemoryGAP9Platform class has the same mutable default argument issue with engines. Additionally, untiledOps is a mutable class attribute that should be annotated with typing.ClassVar (RUF012).

♻️ Suggested fix
+from typing import ClassVar, List
+
 class MemoryGAP9Platform(MemoryPlatform):

-    untiledOps = ["add"]
+    untiledOps: ClassVar[List[str]] = ["add"]

     def __init__(self,
                  memoryHierarchy: MemoryHierarchy,
                  defaultTargetMemoryLevel: MemoryLevel,
-                 engines = [GAP9ClusterEngine("GAP9Cluster")],
+                 engines = None,
                  variableBuffer = GAP9VariableBuffer,
                  constantBuffer = GAP9ConstantBuffer,
                  structBuffer = GAP9StructBuffer,
                  transientBuffer = GAP9TransientBuffer) -> None:
+        if engines is None:
+            engines = [GAP9ClusterEngine("GAP9Cluster")]
         super().__init__(memoryHierarchy, defaultTargetMemoryLevel, engines, variableBuffer, constantBuffer,
                          structBuffer, transientBuffer)

295-307: Same ClassVar annotation issue for untiledOps.

The MemoryGAP9PlatformWrapper class also has a mutable class attribute untiledOps that should be annotated with ClassVar.

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ecae48a and 050f0ae.

📒 Files selected for processing (60)
  • .github/workflows/_runner-gap9-tiled.yml
  • .github/workflows/_runner-gap9.yml
  • .github/workflows/_select-env.yml
  • .github/workflows/ci-deeploy.yml
  • .github/workflows/ci-platform-gap9-tiled.yml
  • .github/workflows/ci-platform-gap9.yml
  • .github/workflows/infra-generate-ccache.yml
  • CMakeLists.txt
  • Container/Dockerfile.Gap9
  • Container/Dockerfile.deeploy
  • Container/Dockerfile.toolchain
  • Container/Makefile
  • Container/amd64.list
  • Deeploy/Targets/GAP9/Bindings.py
  • Deeploy/Targets/GAP9/DMA/L3Dma.py
  • Deeploy/Targets/GAP9/DMA/MchanDma.py
  • Deeploy/Targets/GAP9/Deployer.py
  • Deeploy/Targets/GAP9/Platform.py
  • Deeploy/Targets/GAP9/Templates/AllocateTemplate.py
  • Deeploy/Targets/GAP9/Templates/FreeTemplate.py
  • Deeploy/Targets/GAP9/Templates/__init__.py
  • Deeploy/Targets/GAP9/Tiler.py
  • Deeploy/Targets/GAP9/__init__.py
  • Deeploy/Targets/PULPOpen/Deployer.py
  • Deeploy/Targets/PULPOpen/Templates/FloatLayernormTemplate.py
  • Deeploy/Targets/PULPOpen/Templates/TransposeTemplate.py
  • DeeployTest/CMakeLists.txt
  • DeeployTest/Platforms/GAP9/CMakeLists.txt
  • DeeployTest/Platforms/GAP9/inc/CycleCounter.h
  • DeeployTest/Platforms/GAP9/sdk.config
  • DeeployTest/Platforms/GAP9/src/CycleCounter.c
  • DeeployTest/Platforms/GAP9/src/deeploytest.c
  • DeeployTest/testRunner_gap9.py
  • DeeployTest/testRunner_tiled_gap9.py
  • DeeployTest/testUtils/codeGenerate.py
  • DeeployTest/testUtils/platformMapping.py
  • DeeployTest/testUtils/testRunner.py
  • GAP9.md
  • Makefile
  • README.md
  • TargetLibraries/GAP9/CMakeLists.txt
  • TargetLibraries/GAP9/inc/DeeployGAP9Math.h
  • TargetLibraries/GAP9/inc/DeeployMchan.h
  • TargetLibraries/GAP9/inc/dory_dma.h
  • TargetLibraries/GAP9/inc/dory_mem.h
  • TargetLibraries/GAP9/inc/mchan.h
  • TargetLibraries/GAP9/src/Util.c
  • TargetLibraries/GAP9/src/dory_dma.c
  • TargetLibraries/GAP9/src/dory_mem.c
  • TargetLibraries/Generic/src/BatchNorm_fp32.c
  • TargetLibraries/PULPOpen/inc/kernel/Layernorm.h
  • TargetLibraries/PULPOpen/src/DWConvolution_fp32.c
  • TargetLibraries/PULPOpen/src/GELU.c
  • TargetLibraries/PULPOpen/src/Layernorm.c
  • TargetLibraries/PULPOpen/src/Softmax.c
  • cmake/common.cmake
  • cmake/gap9/gap9_gvsoc.cmake
  • cmake/simulation.cmake
  • toolchain/banshee.patch
  • toolchain/gap9-sdk.patch
💤 Files with no reviewable changes (1)
  • cmake/common.cmake
🧰 Additional context used
🧠 Learnings (7)
📚 Learning: 2025-12-02T13:54:22.700Z
Learnt from: Xeratec
Repo: pulp-platform/Deeploy PR: 69
File: Deeploy/Targets/PULPOpen/Templates/FloatLayernormTemplate.py:36-38
Timestamp: 2025-12-02T13:54:22.700Z
Learning: In Deeploy templates (Python files in Deeploy/Targets/PULPOpen/Templates/), always use explicit bitwidth types (e.g., `float${...type.referencedType.typeWidth}_t*`) instead of hardcoded types (e.g., `float*`) to ensure type consistency with templated kernel calls.

Applied to files:

  • Deeploy/Targets/PULPOpen/Templates/TransposeTemplate.py
  • Deeploy/Targets/PULPOpen/Templates/FloatLayernormTemplate.py
📚 Learning: 2025-09-09T15:43:20.195Z
Learnt from: Xeratec
Repo: pulp-platform/Deeploy PR: 105
File: Deeploy/Targets/PULPOpen/TileConstraints/GEMMTileConstraint.py:120-124
Timestamp: 2025-09-09T15:43:20.195Z
Learning: In GEMMTileConstraint.serializeTilingSolution, transpose flags (transA, transB) must be read from operatorRepresentation and used to adjust NSize calculation and matrix offset/shape calculations, following the pattern in FloatGEMMTileConstraint.

Applied to files:

  • Deeploy/Targets/PULPOpen/Templates/TransposeTemplate.py
📚 Learning: 2025-09-24T12:17:21.624Z
Learnt from: diaconuccalin
Repo: pulp-platform/Deeploy PR: 117
File: Deeploy/Targets/PULPOpen/Templates/FloatConvTemplate.py:46-0
Timestamp: 2025-09-24T12:17:21.624Z
Learning: In Deeploy's PULP templates, transient buffer size calculation can return element counts as strings from computeTransientBuffersSize(), and then manually set the buffer type in hoistTransientBuffers() using ctxt.lookup(buffer_name)._type.referencedType = input_type. The allocation system automatically multiplies the element count by the element size when the buffer type is properly set, achieving correct byte allocation.

Applied to files:

  • Deeploy/Targets/PULPOpen/Templates/TransposeTemplate.py
📚 Learning: 2025-09-24T12:49:17.889Z
Learnt from: diaconuccalin
Repo: pulp-platform/Deeploy PR: 117
File: Deeploy/Targets/PULPOpen/Templates/FloatConvTemplate.py:100-0
Timestamp: 2025-09-24T12:49:17.889Z
Learning: In Deeploy's PULP FloatConvTemplate.py, the parameter order for PULP_Conv2d_Im2Col_fp*_HWC calls uses X,Y ordering (dim_im_in_x, dim_im_in_y, dim_kernel_x, dim_kernel_y, stride_x, stride_y) which is correct for the implementation, despite appearing different from some other function signatures.

Applied to files:

  • Deeploy/Targets/PULPOpen/Templates/TransposeTemplate.py
  • Deeploy/Targets/PULPOpen/Templates/FloatLayernormTemplate.py
  • TargetLibraries/PULPOpen/src/DWConvolution_fp32.c
  • TargetLibraries/PULPOpen/inc/kernel/Layernorm.h
📚 Learning: 2025-09-24T11:43:47.236Z
Learnt from: diaconuccalin
Repo: pulp-platform/Deeploy PR: 117
File: .github/workflows/ci-platform-siracusa.yml:57-60
Timestamp: 2025-09-24T11:43:47.236Z
Learning: In the Deeploy test system, test names in CI workflows correspond to directory names under DeeployTest/Tests/, not Python function names. The TestRunner class executes tests by passing directory paths via the `-t` argument, where each directory contains test configurations and definitions.

Applied to files:

  • DeeployTest/testRunner_tiled_gap9.py
  • .github/workflows/_runner-gap9-tiled.yml
  • DeeployTest/testRunner_gap9.py
📚 Learning: 2025-09-09T15:58:06.454Z
Learnt from: Xeratec
Repo: pulp-platform/Deeploy PR: 105
File: Deeploy/Targets/PULPOpen/DMA/MchanDma.py:61-64
Timestamp: 2025-09-09T15:58:06.454Z
Learning: The _legalizeTransfers function in TilingCodeGeneration.py handles conversion from elements to bytes for DMA operations when isFinalMemoryLevel is true, eliminating the need for individual DMA implementations like MchanDma to perform this conversion manually.

Applied to files:

  • Deeploy/Targets/GAP9/DMA/L3Dma.py
  • TargetLibraries/GAP9/src/dory_dma.c
  • Deeploy/Targets/GAP9/DMA/MchanDma.py
📚 Learning: 2025-09-09T15:58:06.454Z
Learnt from: Xeratec
Repo: pulp-platform/Deeploy PR: 105
File: Deeploy/Targets/PULPOpen/DMA/MchanDma.py:61-64
Timestamp: 2025-09-09T15:58:06.454Z
Learning: The _legalizeTransfers function in TilingCodeGeneration.py handles conversion from elements to bytes for DMA operations when isFinalMemoryLevel is true, eliminating the need for individual DMA implementations like MchanDma to perform this conversion.

Applied to files:

  • Deeploy/Targets/GAP9/DMA/L3Dma.py
  • TargetLibraries/GAP9/src/dory_dma.c
  • Deeploy/Targets/GAP9/DMA/MchanDma.py
🧬 Code graph analysis (13)
DeeployTest/Platforms/GAP9/inc/CycleCounter.h (1)
DeeployTest/Platforms/GAP9/src/CycleCounter.c (4)
  • ResetTimer (10-13)
  • StartTimer (15-15)
  • StopTimer (17-17)
  • getCycles (19-19)
TargetLibraries/GAP9/inc/dory_dma.h (1)
TargetLibraries/GAP9/src/dory_dma.c (9)
  • dory_dma_memcpy_hwc_to_chw (38-72)
  • dory_dma_memcpy_1d_async (74-86)
  • dory_dma_memcpy_2d_async (88-108)
  • dory_dma_memcpy_3d_async (110-143)
  • dory_dma_memcpy_async (145-162)
  • dory_dma_memcpy_mindims_async (200-208)
  • dory_dma_free (210-210)
  • dory_dma_barrier (212-212)
  • dory_dma_allocate (214-214)
DeeployTest/testUtils/platformMapping.py (2)
Deeploy/Targets/GAP9/Deployer.py (1)
  • GAP9Deployer (32-102)
Deeploy/Targets/GAP9/Platform.py (3)
  • GAP9Platform (263-271)
  • MemoryGAP9Platform (274-292)
  • MemoryGAP9PlatformWrapper (295-307)
DeeployTest/Platforms/GAP9/src/deeploytest.c (2)
DeeployTest/Platforms/GAP9/src/CycleCounter.c (4)
  • ResetTimer (10-13)
  • StartTimer (15-15)
  • getCycles (19-19)
  • StopTimer (17-17)
TargetLibraries/GAP9/src/dory_mem.c (3)
  • mem_init (64-78)
  • open_fs (52-62)
  • ram_read (92-94)
TargetLibraries/GAP9/src/dory_dma.c (1)
TargetLibraries/GAP9/inc/mchan.h (5)
  • mchan_transfer_push_2d (94-106)
  • mchan_transfer_push_1d (88-92)
  • mchan_transfer_free (124-124)
  • mchan_transfer_wait (130-138)
  • mchan_transfer_get_id (86-86)
Deeploy/Targets/GAP9/Deployer.py (3)
Deeploy/CommonExtensions/NetworkDeployers/SignPropDeployer.py (1)
  • SignPropDeployer (14-57)
Deeploy/DeeployTypes.py (5)
  • ConstantBuffer (393-430)
  • DeploymentPlatform (2377-2420)
  • TopologyOptimizer (2175-2204)
  • VariableBuffer (232-360)
  • outputs (2522-2539)
Deeploy/Targets/PULPOpen/Deployer.py (1)
  • generateBufferAllocationCode (109-138)
Deeploy/Targets/PULPOpen/Deployer.py (2)
Deeploy/Targets/GAP9/Platform.py (1)
  • GAP9ClusterEngine (251-260)
Deeploy/Targets/PULPOpen/Platform.py (1)
  • PULPClusterEngine (246-255)
Deeploy/Targets/GAP9/Templates/AllocateTemplate.py (1)
Deeploy/DeeployTypes.py (1)
  • NodeTemplate (87-229)
Deeploy/Targets/GAP9/Tiler.py (5)
Deeploy/Targets/PULPOpen/TileConstraints/ConvTileConstraint.py (1)
  • Conv2DTileConstraint (233-598)
Deeploy/Targets/PULPOpen/TileConstraints/DWConvTileConstraint.py (1)
  • DWConv2DTileConstraint (238-255)
Deeploy/Targets/PULPOpen/TileConstraints/SGDTileConstraint.py (1)
  • SGDTileConstraint (8-12)
Deeploy/Targets/PULPOpen/TileConstraints/SoftmaxCrossEntropyTileConstraint.py (2)
  • SoftmaxCrossEntropyTileConstraint (19-109)
  • SoftmaxCrossEntropyGradTileConstraint (112-115)
Deeploy/TilingExtension/TilerExtension.py (1)
  • TilingReadyNodeBindings (1027-1035)
Deeploy/Targets/GAP9/Bindings.py (2)
Deeploy/CommonExtensions/DataTypes.py (3)
  • float32_t (74-78)
  • int8_t (12-15)
  • int32_t (24-27)
Deeploy/Targets/GAP9/DMA/MchanDma.py (1)
  • GAP9MchanDma (27-91)
TargetLibraries/GAP9/inc/dory_mem.h (1)
TargetLibraries/GAP9/src/dory_mem.c (13)
  • open_fs (52-62)
  • mem_init (64-78)
  • get_ram_ptr (80-80)
  • ram_malloc (82-86)
  • ram_free (88-90)
  • ram_read (92-94)
  • ram_write (96-98)
  • cl_ram_malloc (100-106)
  • cl_ram_free (108-112)
  • cl_ram_read (114-118)
  • cl_ram_write (120-124)
  • load_file_to_ram (126-151)
  • load_file_to_local (153-176)
DeeployTest/testRunner_gap9.py (1)
DeeployTest/testUtils/testRunner.py (6)
  • TestRunner (285-454)
  • TestRunnerArgumentParser (110-282)
  • parse_args (95-107)
  • parse_args (235-237)
  • cmake_args (277-282)
  • run (333-344)
Deeploy/Targets/GAP9/Templates/FreeTemplate.py (1)
Deeploy/DeeployTypes.py (1)
  • NodeTemplate (87-229)
🪛 Ruff (0.14.10)
DeeployTest/testUtils/testRunner.py

315-315: f-string without any placeholders

Remove extraneous f prefix

(F541)

Deeploy/Targets/GAP9/DMA/L3Dma.py

29-34: Mutable class attributes should be annotated with typing.ClassVar

(RUF012)

Deeploy/Targets/GAP9/Templates/__init__.py

5-5: from . import * used; unable to detect undefined names

(F403)

Deeploy/Targets/GAP9/__init__.py

5-5: from . import * used; unable to detect undefined names

(F403)

Deeploy/Targets/GAP9/Deployer.py

54-54: Do not use mutable data structures for argument defaults

Replace with None; initialize within function

(B006)

Deeploy/Targets/GAP9/Platform.py

266-266: Do not use mutable data structures for argument defaults

Replace with None; initialize within function

(B006)


266-266: Do not perform function call GAP9ClusterEngine in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable

(B008)


276-276: Mutable class attributes should be annotated with typing.ClassVar

(RUF012)


281-281: Do not use mutable data structures for argument defaults

Replace with None; initialize within function

(B006)


281-281: Do not perform function call GAP9ClusterEngine in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable

(B008)


297-297: Mutable class attributes should be annotated with typing.ClassVar

(RUF012)

Deeploy/Targets/GAP9/DMA/MchanDma.py

29-38: Mutable class attributes should be annotated with typing.ClassVar

(RUF012)

Deeploy/Targets/GAP9/Bindings.py

162-162: zip() without an explicit strict= parameter

Add explicit value for parameter strict=

(B905)


174-174: zip() without an explicit strict= parameter

Add explicit value for parameter strict=

(B905)


183-183: zip() without an explicit strict= parameter

Add explicit value for parameter strict=

(B905)


214-214: zip() without an explicit strict= parameter

Add explicit value for parameter strict=

(B905)


223-223: zip() without an explicit strict= parameter

Add explicit value for parameter strict=

(B905)


248-254: Consider iterable unpacking instead of concatenation

Replace with iterable unpacking

(RUF005)


393-399: Consider iterable unpacking instead of concatenation

Replace with iterable unpacking

(RUF005)

Comment on lines +34 to +40
      - name: Build Deeploy
        shell: bash
        run: |
          source /app/install/gap9-sdk/.gap9-venv/bin/activate
          source /app/install/gap9-sdk/configs/gap9_evk_audio.sh || true
          pip install -e . || true
          deactivate

⚠️ Potential issue | 🔴 Critical

Don’t mask environment/build failures with || true.

pip install -e . || true (and source ... || true) can turn real breakages into false-green CI. Prefer failing fast unless you have an explicit fallback path.

Proposed fix
       - name: Build Deeploy
         shell: bash
         run: |
           source /app/install/gap9-sdk/.gap9-venv/bin/activate
-          source /app/install/gap9-sdk/configs/gap9_evk_audio.sh || true
-          pip install -e . || true
+          source /app/install/gap9-sdk/configs/gap9_evk_audio.sh
+          pip install -e .
           deactivate
🤖 Prompt for AI Agents
In @.github/workflows/_runner-gap9.yml around lines 34 - 40, The CI step "Build
Deeploy" currently masks failures by appending "|| true" to commands (notably
the "source /app/install/gap9-sdk/configs/gap9_evk_audio.sh || true" and "pip
install -e . || true"); remove the "|| true" from these commands so errors cause
the job to fail, and for the optional sourcing use a guarded conditional (e.g.,
test for the file before sourcing) rather than swallowing errors—locate the
commands inside the "Build Deeploy" run block in the workflow and update the
"source ... || true" and "pip install -e . || true" lines accordingly.

Comment on lines 47 to 63
        run: |
          testNames="${{ inputs.test-names }}"
          source /app/install/gap9-sdk/.gap9-venv/bin/activate
          source /app/install/gap9-sdk/configs/gap9_evk_audio.sh || true
          export GVSOC_INSTALL_DIR=/app/install/gap9-sdk/install/workstation
          export GAP_RISCV_GCC_TOOLCHAIN=/app/install/gcc/gap9
          cd DeeployTest
          mkdir -p /app/.ccache
          export CCACHE_DIR=/app/.ccache
          echo "$testNames" | while IFS= read -r testName; do
            if [[ -n "$testName" ]]; then
              echo "Running test: $testName"
              python testRunner_gap9.py -t Tests/$testName
            fi
          done
          deactivate
        shell: bash

⚠️ Potential issue | 🔴 Critical

Fix test failure propagation (current loop can pass even if some tests fail).

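Unless errexit and pipefail are in effect, a failing python testRunner_gap9.py ... inside the piped loop may not fail the job (e.g., if a later test passes). GitHub Actions passes -eo pipefail when shell: bash is set explicitly, but the sourced SDK scripts run in the same shell and can change these options, so setting them explicitly and avoiding the pipe is more robust.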

Proposed fix
       - name: Run Test
         run: |
-          testNames="${{ inputs.test-names }}"
+          set -euo pipefail
+          testNames="${{ inputs.test-names }}"
           source /app/install/gap9-sdk/.gap9-venv/bin/activate
-          source /app/install/gap9-sdk/configs/gap9_evk_audio.sh || true
+          source /app/install/gap9-sdk/configs/gap9_evk_audio.sh
           export GVSOC_INSTALL_DIR=/app/install/gap9-sdk/install/workstation
           export GAP_RISCV_GCC_TOOLCHAIN=/app/install/gcc/gap9
           cd DeeployTest
           mkdir -p /app/.ccache
           export CCACHE_DIR=/app/.ccache
-          echo "$testNames" | while IFS= read -r testName; do
-            if [[ -n "$testName" ]]; then
-              echo "Running test: $testName"
-              python testRunner_gap9.py -t Tests/$testName
-            fi
-          done
+          mapfile -t tests <<< "$testNames"
+          for testName in "${tests[@]}"; do
+            [[ -z "$testName" ]] && continue
+            echo "Running test: $testName"
+            python testRunner_gap9.py -t "Tests/$testName"
+          done
           deactivate
         shell: bash
🤖 Prompt for AI Agents
In @.github/workflows/_runner-gap9.yml around lines 47 - 63, The current loop
"echo $testNames | while IFS= read -r testName; do ... python testRunner_gap9.py
-t Tests/$testName ... done" can hide failures because of the pipe and missing
errexit; enable strict failure handling (set -euo pipefail or at least set -e
and set -o pipefail) and restructure the loop so it doesn't run in a subshell
(e.g., read from a here-string/process-substitution or iterate over testNames),
capture each python invocation's exit status (or maintain a failure flag) and
after the loop call exit with non-zero if any test failed so the job fails when
any test fails.

Comment on lines 17 to +20
       docker_image_deeploy:
         description: "Deeploy Image to use"
         required: false
-        default: "ghcr.io/pulp-platform/deeploy:devel"
+        default: "ghcr.io/runwangdl/deeploy:gap9"

⚠️ Potential issue | 🟠 Major

Personal Docker registry should not be used in production CI.

The default image was changed from ghcr.io/pulp-platform/deeploy:devel to ghcr.io/runwangdl/deeploy:gap9, which references a personal registry. This is problematic for:

  1. Reliability: Personal accounts can delete or modify images
  2. Security: Personal images are less auditable than organization-managed images
  3. Maintainability: The image lifecycle isn't tied to the project

Given the PR's DRAFT status, this is likely a development convenience. Before merging, ensure the image is published to the official ghcr.io/pulp-platform registry.

Expected change before merge
-        default: "ghcr.io/runwangdl/deeploy:gap9"
+        default: "ghcr.io/pulp-platform/deeploy:gap9"

Or revert to devel if GAP9-specific image isn't ready:

-        default: "ghcr.io/runwangdl/deeploy:gap9"
+        default: "ghcr.io/pulp-platform/deeploy:devel"
📝 Committable suggestion

Suggested change
   docker_image_deeploy:
     description: "Deeploy Image to use"
     required: false
-    default: "ghcr.io/runwangdl/deeploy:gap9"
+    default: "ghcr.io/pulp-platform/deeploy:gap9"
🤖 Prompt for AI Agents
In @.github/workflows/ci-deeploy.yml around lines 17 - 20, The workflow input
docker_image_deeploy currently defaults to a personal registry image
("ghcr.io/runwangdl/deeploy:gap9"), which must not be used for CI; update the
default value of the docker_image_deeploy input to point to an official,
organization-managed image (e.g., "ghcr.io/pulp-platform/deeploy:gap9" if
published, or revert to "ghcr.io/pulp-platform/deeploy:devel" until a GAP9 image
is available) and ensure any documentation or README referencing
docker_image_deeploy is consistent with the org registry change.
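
One way to keep a personal registry from slipping back into a workflow default is a small check run before merge. The sketch below is only an illustration of that idea: `check_image_defaults` and `ALLOWED_PREFIX` are hypothetical names, and it assumes defaults are written on one line as `default: "ghcr.io/..."`, as in the snippet under review.

```shell
#!/usr/bin/env bash
# Sketch: flag workflow defaults that point outside the organization registry.
# ALLOWED_PREFIX and the function name are assumptions for illustration.
ALLOWED_PREFIX="ghcr.io/pulp-platform/"

check_image_defaults() {
  local file="$1" status=0 line image
  # Process substitution keeps the loop in the current shell,
  # so "status" survives after the loop ends.
  while IFS= read -r line; do
    image="${line#*\"}"      # drop everything up to the first quote
    image="${image%%\"*}"    # drop the closing quote and anything after it
    if [[ "$image" != "$ALLOWED_PREFIX"* ]]; then
      echo "non-org image: $image" >&2
      status=1
    fi
  done < <(grep -E '^[[:space:]]*default:[[:space:]]*"ghcr\.io/' "$file")
  return "$status"
}
```

Calling `check_image_defaults .github/workflows/ci-deeploy.yml` would return non-zero while the default still points at `ghcr.io/runwangdl/...`, and zero once it references the organization registry.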

ccache \
ninja-build \
pkg-config \
ibglib2.0-dev \

⚠️ Potential issue | 🟠 Major

Typo in package name.

ibglib2.0-dev should be libglib2.0-dev. No package exists under the misspelled name, so the APT install step will fail to locate it and the image build will stop there.

Proposed fix
-    ibglib2.0-dev \
+    libglib2.0-dev \
📝 Committable suggestion

Suggested change
-    ibglib2.0-dev \
+    libglib2.0-dev \
🤖 Prompt for AI Agents
In @Container/Dockerfile.toolchain at line 33, The Dockerfile.toolchain contains
a typo in the APT package list: replace the incorrect package name
"ibglib2.0-dev" with the correct "libglib2.0-dev" in the package installation
line (the line that currently lists "ibglib2.0-dev" among the packages) so the
package installation succeeds.

Comment on lines +53 to 80
elseif(platform STREQUAL GAP9)

  # Search for hex files generated by Python code generator
  # These files indicate L3 mode (external memory with readfs)
  file(GLOB_RECURSE HEXLIST
    "${GENERATED_SOURCE}/hex/*"
  )

  if (NOT HEXLIST)
    # L2 mode: No flash/readfs files
    # Data lives in L2 memory only
    target_compile_options(network PUBLIC -DNOFLASH)
    message(STATUS "[Deeploy GAP9] L2 mode: No hex files found, -DNOFLASH set")
    message(STATUS "[Deeploy GAP9] If you expect L3 mode, ensure Python codegen created hex files in ${GENERATED_SOURCE}/hex/")
  else()
    # L3 mode: Use flash with readfs files
    # Data will be loaded from external flash via readfs
    message(STATUS "[Deeploy GAP9] L3 mode: Found ${CMAKE_MATCH_COUNT} hex file(s)")
    message(STATUS "==== HEXLIST ====")
    foreach(f ${HEXLIST})
      message(STATUS " ${f}")
    endforeach()
    gvsoc_flags_add_files_to_flash(GAPY_RUNNER_ARGS HEXLIST)
    message(STATUS "GAPY_RUNNER_ARGS: ${GAPY_RUNNER_ARGS}")
  endif()

  add_subdirectory(Platforms/GAP9)
endif()

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, locate and examine the full CMakeLists.txt structure to understand branch logic
find . -type f -name "CMakeLists.txt" | head -20

Repository: pulp-platform/Deeploy

Length of output: 186


🏁 Script executed:

# Examine the full relevant section of DeeployTest/CMakeLists.txt to understand the branching
cat -n DeeployTest/CMakeLists.txt | head -100

Repository: pulp-platform/Deeploy

Length of output: 186


🏁 Script executed:

# Search for DEEPLOY_ARCH and platform configuration
rg -n "DEEPLOY_ARCH|platform" DeeployTest/CMakeLists.txt | head -30

Repository: pulp-platform/Deeploy

Length of output: 186


🏁 Script executed:

# Search for GAP9 mentions across the codebase
rg -n "GAP9" --type cmake --type py --type yml | head -40

Repository: pulp-platform/Deeploy

Length of output: 186


🏁 Script executed:

# Check if CMAKE_MATCH_COUNT is used elsewhere and understand its intended use
rg -n "CMAKE_MATCH_COUNT" . | head -20

Repository: pulp-platform/Deeploy

Length of output: 186


🌐 Web query:

CMAKE_MATCH_COUNT variable CMake file GLOB_RECURSE list length

💡 Result:

CMAKE_MATCH_COUNT is NOT the length of a list produced by file(GLOB_RECURSE). It contains the number of capture groups from the most recent regular-expression match (i.e. how many CMAKE_MATCH_<n> variables were filled). Use list(LENGTH <list> <out-var>) to get a list's length (file(GLOB_RECURSE ...) stores its results in a CMake list variable). [1][2][3]

Sources:

  • CMAKE_MATCH_COUNT description. [1]
  • file(GLOB_RECURSE) docs (produces a list variable). [2]
  • list(LENGTH ...) to get list length. [3]

🌐 Web query:

CMake GLOB_RECURSE pattern hex/* vs hex/** recursive files

💡 Result:

Short answer: With file(GLOB_RECURSE) you do not need "**"; a pattern like "hex/*" will already match files in hex and all its subdirectories. CMake treats the argument as a globbing expression where the part before the final slash is the recursion root and the part after is the filename pattern, so GLOB_RECURSE will search that root recursively. Using "hex/**" is unnecessary and offers no special recursion behavior beyond what GLOB_RECURSE already performs; prefer explicit patterns (or RELATIVE/CONFIGURE_DEPENDS) to avoid accidental matches. [1][2]

Sources:

  • CMake file(GLOB_RECURSE) docs. [1]
  • Explanation and examples on behavior (Stack Overflow). [2]

Fix incorrect hex file count reporting in GAP9 L3 mode.

${CMAKE_MATCH_COUNT} is not the list length. It holds the number of regex capture groups from the most recent regular-expression match, which is unrelated to file(GLOB_RECURSE), so the "Found … hex file(s)" message reports an incorrect count. Use list(LENGTH HEXLIST HEXCOUNT) instead:

-    message(STATUS "[Deeploy GAP9] L3 mode: Found ${CMAKE_MATCH_COUNT} hex file(s)")
+    list(LENGTH HEXLIST HEXCOUNT)
+    message(STATUS "[Deeploy GAP9] L3 mode: Found ${HEXCOUNT} hex file(s)")

The current glob pattern ${GENERATED_SOURCE}/hex/* is correct for GLOB_RECURSE and already matches files in nested subdirectories; no change to the pattern is necessary.

Regarding the reachability concern: verify that the GAP9 branch can execute when intended by checking the surrounding if/elseif chain and confirming that DEEPLOY_ARCH=PULP and platform=GAP9 are mutually exclusive in your build configuration.

🤖 Prompt for AI Agents
In @DeeployTest/CMakeLists.txt around lines 53 - 80, The message uses
${CMAKE_MATCH_COUNT} which is wrong for counting files; replace that usage by
computing the HEXLIST length with list(LENGTH HEXLIST HEXCOUNT) and then log
HEXCOUNT in the "Found … hex file(s)" message; keep the existing GLOB_RECURSE
pattern ("${GENERATED_SOURCE}/hex/*") as-is and ensure you update the message
call that currently references ${CMAKE_MATCH_COUNT} to reference ${HEXCOUNT}
instead so the reported count matches the HEXLIST contents.

Comment on lines +18 to +22
#if defined(AM_PART_APOLLO4B) | defined(DAM_PART_APOLLO3)
ret = am_util_stdio_vprintf(fmt, args);
#else
ret = vprintf(fmt, args);
#endif

⚠️ Potential issue | 🔴 Critical

Critical: Wrong platform macros and incorrect preprocessor operator.

This code has multiple critical issues:

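  1. Non-idiomatic operator: Line 18 uses bitwise OR | instead of logical OR || in the preprocessor directive. Because each defined(...) evaluates to 0 or 1, the result happens to be the same here, but || is the conventional operator for combining conditions and makes the intent explicit.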

  2. Wrong platform macros: AM_PART_APOLLO4B and DAM_PART_APOLLO3 are Apollo microcontroller macros, not GAP9 platform macros. This appears to be copy-pasted from an Apollo platform implementation without proper adaptation for GAP9.

  3. Typo in macro name: DAM_PART_APOLLO3 should likely be AM_PART_APOLLO3 (note the stray leading 'D').

  4. Missing declaration: am_util_stdio_vprintf is not declared in any of the included headers. If these macros were defined, compilation would fail.

Replace with correct GAP9 platform detection macros and fix the operator.

Run the following to check what GAP9-specific platform macros should be used:

#!/bin/bash
# Search for GAP9 platform macro definitions
rg -n "define.*GAP9" -g "*.h" -g "*.c"
🤖 Prompt for AI Agents
In @TargetLibraries/GAP9/src/Util.c around lines 18 - 22, The preprocessor
conditional uses a bitwise OR and Apollo macros that don't exist here; change
the operator from '|' to '||', remove/replace Apollo-specific macros
(AM_PART_APOLLO4B, DAM_PART_APOLLO3) and any call to am_util_stdio_vprintf
(which isn't declared) with the correct GAP9 platform macros and the proper GAP9
logging/printf API (or fall back to vprintf). Locate the conditional around
am_util_stdio_vprintf/vprintf in Util.c, replace the macro checks with the
GAP9-specific macro(s) you find via the suggested rg search, use '||' for
logical OR, and ensure the selected function (vprintf or the GAP9 declared
equivalent) is actually declared/available in included headers.

@runwangdl runwangdl force-pushed the gap9-operators-github branch from c54febe to ff05304 Compare January 11, 2026 20:56

Labels

Feature Addition of new features

Projects

Status: In progress

2 participants