Conversation

@nicobuzeta (Collaborator) commented Dec 12, 2025

Summary by CodeRabbit

  • New Features

    • Added partition-based query functionality for advanced event stream processing.
    • Introduced Python package with improved module bindings and type support.
  • Documentation

    • Added Python package installation and usage guide.
    • Updated installation instructions with simplified build process.
    • Updated feature documentation.
  • Chores

    • Simplified dependency management system.
    • Updated build scripts and toolchain configurations.


coderabbitai bot commented Dec 12, 2025

Note

Reviews paused

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Walkthrough

This pull request migrates the project from Conan to vcpkg for dependency management while simultaneously introducing a partitioned query evaluation system, converting Python bindings from pybind11 to nanobind, and updating all build infrastructure and documentation to support the new configuration.

Changes

**Dependency Manager Migration** (`vcpkg.json`, `vcpkg-triplets/*`, `.gitignore`)
Adds a vcpkg manifest with project dependencies (antlr4, catch2, cppzmq, cereal, gmp, re2, etc.) and triplet configuration files for different compiler/architecture combinations (clang/gcc with libstdc++/libc++); adds vcpkg directories to `.gitignore`.
**Conan Removal** (`conanfile.py`, `conan_profiles/*`, `requirements.txt`)
Removes the Conan recipe file, all Conan profile configurations (x86_64-linux-clang-libstdc, x86_64-linux-gcc-libstdc, aarch64-*, x86_64-linux-clang-libc, aarch64-macos-clang-libc), and Conan from the Python requirements.
**CMake Build System Updates** (`CMakeLists.txt`)
Adds vcpkg toolchain setup before `cmake_minimum_required`; replaces direct `find_package(gmp)` with `pkg_check_modules`; migrates the Python module from the pybind11-based `pycore` to the nanobind-based `pycer` with stub generation and install rules; updates core target linkage to use `PkgConfig::gmp`/`gmpxx` and the new uwebsockets/usockets targets via vcpkg; introduces `VCPKG_TARGET_TRIPLET` handling for platform-specific includes.
**GitHub Workflows** (`.github/workflows/benchmark.yaml`, `checks.yaml`, `test-checks.yaml`, `valgrind-checks.yaml`, `.github/workflows/upload-conan.yaml`)
Removes all Conan-specific setup steps (Get Conan, Setup conan, add artifactory, update remotes) across workflows and replaces them with a consolidated `install_dependencies.sh` invocation; changes the matrix key from `conan_profile` to `compiler_profile` with new values (clang-libstdcxx, gcc-libstdcxx); removes `upload-conan.yaml` entirely; updates script arguments from `-p conan_profiles/${{...}}` to `-c ${{...}}`.
**Build Scripts** (`scripts/build.sh`, `build_grammar.sh`, `build_and_test.sh`, `build_and_valgrind.sh`, `common.sh`, `all.sh`, `install_dependencies.sh`, `create_package.sh`)
Removes conanfile.py comments throughout; replaces Conan profile handling with `COMPILER_PROFILE` validation in `common.sh` (clang-libstdcxx, clang-libcxx, gcc-libstdcxx) and vcpkg triplet configuration; updates `install_dependencies.sh` to clone/bootstrap vcpkg and generate ANTLR scripts via Python; removes `create_package.sh` entirely; updates path references from `build/${BUILD_TYPE}/executable` to `build/executable`.
**Query Test Scripts** (`scripts/queries/base_queries/build_and_test_*.sh`, `benchmark_queries/benchmark_runner.sh`)
Replaces all BUILD_TYPE-based executable paths (`build/${BUILD_TYPE}/offline`) with static paths (`build/offline`); removes conanfile.py comments.
**ANTLR Generation** (`scripts/generate_antlr.py`, `src/core_server/internal/parsing/*/autogenerate_script.sh`)
Introduces a new Python script to generate per-grammar ANTLR autogeneration scripts; updates `autogenerate_script.sh` headers to reference `scripts/generate_antlr.py` instead of `conanfile.py`.
**Python Bindings Migration** (`src/pybind/python_binding.cpp`)
Comprehensive migration from pybind11 to nanobind: replaces `PYBIND11_MODULE` with `NB_MODULE`, `py::class_` with `nb::class_`, `py::init` with `nb::init`, `def_readonly` with `def_ro`, `py::overload_cast` with `nb::overload_cast`, and `py::arg` with `nb::arg`; updates exception handling and iterator patterns; maintains all public class/function bindings (see the sketch below).
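
To make the mapping concrete, here is a minimal before/after sketch using a hypothetical `Point` class (not a type from this PR); the pybind11 form appears in the comments.

```cpp
// Hypothetical example type; the PR's real bindings cover CORE's own classes.
#include <nanobind/nanobind.h>

namespace nb = nanobind;

struct Point {
  int x;
  explicit Point(int x) : x(x) {}
};

// pybind11 (before):
//   PYBIND11_MODULE(example, m) {
//     py::class_<Point>(m, "Point")
//         .def(py::init<int>(), py::arg("x"))
//         .def_readonly("x", &Point::x);
//   }
//
// nanobind (after):
NB_MODULE(example, m) {
  nb::class_<Point>(m, "Point")
      .def(nb::init<int>(), nb::arg("x"))  // nb::init / nb::arg replace py::init / py::arg
      .def_ro("x", &Point::x);             // def_ro replaces def_readonly
}
```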
**Query Processing & Evaluators** (`src/core_server/internal/interface/modules/query/dynamic_evaluator.hpp`, `evaluators/dynamic_evaluator.hpp` (new wrappers), `query_types/partition_by_query.hpp`)
Introduces `EvaluatorIndexWrapper` and `EvaluatorStorageWrapper` for LRU tracking; replaces vector storage with an unordered_map plus doubly-linked list in `DynamicEvaluator` (see the LRU sketch below); adds `process_event`, `get_or_create_evaluator`, and `clean_or_expand_evaluator_storage` methods; creates a new `PartitionByQuery` class with partition-key-based evaluator routing, a `TupleValuesKey` struct for partition identification, and event-to-partition mapping with dynamic evaluator lifecycle management.
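
A rough sketch of the unordered_map-plus-list LRU pattern this cohort describes, under the assumptions that the evaluator type is default-constructible and eviction is purely size-based (the PR's time-window-based cleanup is omitted); all names are illustrative, not the actual `DynamicEvaluator` API.

```cpp
#include <cstddef>
#include <cstdint>
#include <list>
#include <unordered_map>
#include <utility>

template <typename Evaluator>
class LruEvaluatorStore {
  std::list<uint64_t> order_;  // front = most recently used evaluator index
  std::unordered_map<uint64_t,
                     std::pair<Evaluator, std::list<uint64_t>::iterator>>
      storage_;
  std::size_t capacity_;  // assumed >= 1

 public:
  explicit LruEvaluatorStore(std::size_t capacity) : capacity_(capacity) {}

  // Returns the evaluator for idx, creating it if needed and refreshing its
  // position in the recency list.
  Evaluator& get_or_create(uint64_t idx) {
    if (auto it = storage_.find(idx); it != storage_.end()) {
      // Cache hit: splice the node to the front; list iterators stay valid.
      order_.splice(order_.begin(), order_, it->second.second);
      return it->second.first;
    }
    order_.push_front(idx);
    auto& slot =
        storage_.emplace(idx, std::pair{Evaluator{}, order_.begin()}).first->second;
    if (storage_.size() > capacity_) {  // evict the least recently used entry
      uint64_t victim = order_.back();
      order_.pop_back();
      storage_.erase(victim);
    }
    return slot.first;
  }
};
```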
**Evaluator Helpers** (`src/core_server/internal/evaluation/evaluator.hpp`)
Adds public `is_empty()` and `is_time_window_empty(uint64_t)` methods; makes out-of-order timestamp checking unconditional with an `[[unlikely]]` annotation and an early return in non-debug builds.
**Dockerfile & Container** (`Dockerfile`, `.devcontainer/Dockerfile`)
Removes Python venv and Conan setup; replaces them with an `install_dependencies.sh` invocation; updates `PATH` to reference the new Python venv location at `/CORE/python_streamer/.venv/bin`.
**Documentation** (`README.md`, `docs/developers/README.md`, `docs/users/INSTALLATION.md`, `PYTHON_PACKAGE.md`)
Updates the README to reference vcpkg instead of Conan; changes developer docs from `conan_profiles` to `vcpkg-triplets`; completely rewrites the installation guide to use g++ 14, Kitware CMake, and `install_dependencies.sh`; adds a new `PYTHON_PACKAGE.md` with wheel build and PyPI upload instructions.
**Python Packaging** (`pyproject.toml`)
Introduces a new `pyproject.toml` with the scikit-build-core build system, project metadata (name: cer-core, requires-python ≥3.8), and tooling config for mypy type checking and explicit vcpkg CMake arguments with overlay triplets.

Sequence Diagram(s)

sequenceDiagram
    participant App as Application
    participant Evt as Event Router
    participant PQ as PartitionByQuery
    participant PE as Partition Evaluator<br/>(via DynamicEvaluator)
    participant LRU as LRU Manager
    
    App->>Evt: send_event(Event)
    Evt->>PQ: process_event(EventWrapper)
    activate PQ
    PQ->>PQ: find_tuple_indexes(event)<br/>extract partition attributes
    alt Partition attributes found
        PQ->>PQ: find_or_create_evaluator_index<br/>build partition key<br/>hash & lookup/create
        PQ->>LRU: get_or_create_evaluator(idx)
        activate LRU
        alt Evaluator exists
            LRU->>LRU: move to front of LRU list
            LRU-->>PE: existing evaluator
        else New partition
            LRU->>PE: create new evaluator instance
            LRU->>LRU: save_evaluator()<br/>insert at front
            LRU->>LRU: clean_or_expand_storage()<br/>if capacity exceeded
            LRU-->>PE: new evaluator
        end
        deactivate LRU
        PQ->>PE: route event to<br/>partition evaluator
        activate PE
        PE->>PE: process_event(event)
        PE-->>PQ: optional results
        deactivate PE
        PQ-->>App: forward results
    else Missing partition attrs
        PQ-->>App: empty result
    end
    deactivate PQ

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Areas requiring extra attention:

  • CMakeLists.txt: Dense logic involving vcpkg toolchain setup, PkgConfig integration, dynamic triplet-based include paths, and new Python module installation workflows. Verify all find_package and target_link_libraries calls are correct.
  • scripts/common.sh: Critical build script with new input validation, COMPILER_PROFILE handling, vcpkg triplet construction, and CMake flag assembly. Ensure all compiler profiles and sanitizer options are properly validated.
  • PartitionByQuery & DynamicEvaluator: New complex logic for per-partition evaluator management, LRU eviction, and event routing. Review partition key hashing semantics, evaluator lifecycle, and time-window-based cleanup.
  • src/pybind/python_binding.cpp: Large-scale binding migration from pybind11 to nanobind. Verify all class hierarchies, constructors, read-only properties, overloads, and exception registrations translate correctly.
  • GitHub Workflows: Verify all matrix permutations and script invocation updates; ensure compiler_profile values are consistent across all workflows.

Possibly related PRs

  • Allow other types in partition by #159: Directly related via PartitionByQuery enhancements (TupleValuesKey, ValueVectorHash, partition-based evaluator mapping and Value-based key cloning) that extend similar code-level modifications.

Poem

🐰 From Conan's grasp, we hop away,
To vcpkg's brighter, faster way,
With partitions split and queries routed right,
The evaluators dance in LRU delight!
And nanobind binds our Python dreams—
A faster future, or so it seems!

Pre-merge checks and finishing touches

❌ Failed checks (1 warning, 1 inconclusive)

  • Docstring Coverage ⚠️ Warning: Docstring coverage is 23.08%, below the required threshold of 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
  • Title check ❓ Inconclusive: The title 'Feat/vckpg gemini' is vague and does not clearly describe the main changes in the pull request, which replace Conan with vcpkg as the dependency manager across the project. Revise the title to be more descriptive and specific, such as 'Replace Conan with vcpkg for dependency management' or 'Migrate from Conan to vcpkg package manager'.

✅ Passed checks (1 passed)

  • Description Check ✅ Passed: Check skipped because CodeRabbit's high-level summary is enabled.

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.


Comment @coderabbitai help to get the list of available commands and usage tips.

@nicobuzeta (Collaborator, Author) commented:

@coderabbitai pause

coderabbitai bot left a comment

Actionable comments posted: 12

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (8)
scripts/common.sh (1)

4-77: Send validation errors to stderr + prefer [[ ... ]] in bash
The validation is good, but errors should be >&2 and [[ ... ]] is safer/consistent with bash tooling.

-            echo -e "${RED}Error: Invalid build type '${BUILD_TYPE}'. Valid options are: Debug, Release, RelWithDebInfo, MinSizeRel${NORMAL_OUTPUT}"
+            echo -e "${RED}Error: Invalid build type '${BUILD_TYPE}'. Valid options are: Debug, Release, RelWithDebInfo, MinSizeRel${NORMAL_OUTPUT}" >&2
@@
-            echo -e "${RED}Error: Invalid compiler profile '${COMPILER_PROFILE}'. Valid options are: clang-libstdcxx, clang-libcxx, gcc-libstdcxx${NORMAL_OUTPUT}"
+            echo -e "${RED}Error: Invalid compiler profile '${COMPILER_PROFILE}'. Valid options are: clang-libstdcxx, clang-libcxx, gcc-libstdcxx${NORMAL_OUTPUT}" >&2
@@
-            echo -e "${RED}Error: Invalid sanitizer '${SANITIZER}'. Valid options are: none, address, thread${NORMAL_OUTPUT}"
+            echo -e "${RED}Error: Invalid sanitizer '${SANITIZER}'. Valid options are: none, address, thread${NORMAL_OUTPUT}" >&2
@@
-            echo -e "${RED}Error: Invalid logging level '${LOGGING}'. Valid options are: trace, debug, info, warn, error, critical, off${NORMAL_OUTPUT}"
+            echo -e "${RED}Error: Invalid logging level '${LOGGING}'. Valid options are: trace, debug, info, warn, error, critical, off${NORMAL_OUTPUT}" >&2
@@
-        if [ "$J" != "all-1" ] && ! [[ "$J" =~ ^[0-9]+$ ]]; then
-          echo -e "${RED}Error: Invalid value for -j flag '${J}'. Must be a positive number or 'all-1'${NORMAL_OUTPUT}"
+        if [[ "$J" != "all-1" && ! "$J" =~ ^[0-9]+$ ]]; then
+          echo -e "${RED}Error: Invalid value for -j flag '${J}'. Must be a positive number or 'all-1'${NORMAL_OUTPUT}" >&2
           exit 1
         fi
src/pybind/python_binding.cpp (1)

46-70: Move NB_MODULE to global scope outside namespace CORE

NB_MODULE(pycer, m) generates the Python module entry point and must be declared at global (translation-unit) scope. Placing it inside namespace CORE breaks the module initialization symbol, preventing Python import. Close the namespace before NB_MODULE:

-NB_MODULE(pycer, m) {
+}  // namespace CORE
+
+NB_MODULE(pycer, m) {
   // clang-format off
.github/workflows/test-checks.yaml (3)

41-44: Avoid interactive update-alternatives --config gcc in CI (can hang jobs).
Use --set (or skip alternatives entirely and invoke gcc-14/g++-14 explicitly).

- sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-14 60 --slave /usr/bin/g++ g++ /usr/bin/g++-14 && sudo update-alternatives --config gcc
+ sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-14 60 \
+   --slave /usr/bin/g++ g++ /usr/bin/g++-14
+ sudo update-alternatives --set gcc /usr/bin/gcc-14

Also applies to: 90-94, 143-147, 202-206, 262-266


34-39: Stop downloading libtinfo5 over plain HTTP + consider avoiding the pinned .deb.
This is both a supply-chain risk and brittle against ubuntu-latest changes. Prefer an apt package, or at least use HTTPS + integrity verification.

- wget http://security.ubuntu.com/ubuntu/pool/universe/n/ncurses/libtinfo5_6.3-2ubuntu0.1_amd64.deb
+ wget https://security.ubuntu.com/ubuntu/pool/universe/n/ncurses/libtinfo5_6.3-2ubuntu0.1_amd64.deb

Also applies to: 84-89, 137-142, 196-201, 256-261


185-190: ccache key references matrix.os (undefined) → key collisions / poor cache behavior.
Consider keying on runner.os + matrix.compiler_profile.

- key: ${{ github.job }}-${{ matrix.os }}
+ key: ${{ github.job }}-${{ runner.os }}-${{ matrix.compiler_profile }}

Also applies to: 245-250

docs/users/INSTALLATION.md (1)

56-66: Docs: running ./scripts/*.sh may fail if not executable—consider documenting chmod +x.
(Workflows already do this, but users following docs may hit “permission denied”.)

- ./scripts/install_dependencies.sh
+ chmod +x ./scripts/install_dependencies.sh && ./scripts/install_dependencies.sh
...
- ./scripts/build.sh
+ chmod +x ./scripts/build.sh && ./scripts/build.sh
src/core_server/internal/evaluation/evaluator.hpp (1)

71-118: is_empty() invariant can be violated by reset() (assert may fire / release may misreport).

is_empty() asserts historic_ordered_keys.size() == historic_union_list_map.size() (Line 106-107), but reset() currently clears historic_ordered_keys without clearing historic_union_list_map, so the containers are not “in sync” immediately after reset.

Concrete fix: clear the map in reset() (keeps invariant true and makes is_empty() safe for other callers).

 void reset() {
   LOG_TRACE_L3(logger, "Resetting historic states in Evaluator");
   cea.state_manager.unpin_states(historic_ordered_keys);
   historic_ordered_keys.clear();
   for (auto& [state, ul] : historic_union_list_map) {
     tecs->unpin(ul);
   }
+  historic_union_list_map.clear();
 }
src/core_server/internal/interface/modules/query/query_types/partition_by_query.hpp (1)

203-228: Unbounded partition map growth + deep-cloned keys can become a memory hot spot.

Each new partition stores cloned Types::Value objects in the unordered_map key (Line 206-216, 223-227). Combined with the fact that partition_by_attrs_to_evaluator_idx never shrinks (even if DynamicEvaluator evicts evaluators), this can grow without bound.

If you expect high-cardinality partitions, consider:

  • storing a compact canonical partition key (e.g., stable hash + collision-safe representation; sketched below),
  • and/or coordinating eviction so partitions can be retired when their evaluator is evicted.
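
A minimal sketch of such a compact key, assuming a hypothetical helper that serializes the partition attributes into a canonical byte string:

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>

// Compact replacement for cloned Value keys: a precomputed stable hash plus a
// canonical, collision-safe serialized form of the partition attributes. How
// the canonical string is produced (e.g., a serialize_partition_values helper)
// is left open here.
struct CompactPartitionKey {
  uint64_t hash;
  std::string canonical;

  bool operator==(const CompactPartitionKey& other) const {
    return hash == other.hash && canonical == other.canonical;
  }
};

struct CompactPartitionKeyHash {
  std::size_t operator()(const CompactPartitionKey& key) const {
    return static_cast<std::size_t>(key.hash);
  }
};

// Partition key -> evaluator index, without retaining Value clones.
using PartitionIndexMap =
    std::unordered_map<CompactPartitionKey, uint64_t, CompactPartitionKeyHash>;
```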
♻️ Duplicate comments (2)
scripts/build_and_valgrind.sh (1)

18-18: Duplicate concern: Build system flat directory verification needed.

These path changes have the same root cause and verification requirement as the changes in scripts/build_and_test.sh. Please see that review comment for details on verifying flat build/ output directory configuration.

Also applies to: 30-30

scripts/queries/base_queries/build_and_test_bluesky_ordered.sh (1)

14-14: Verify hardcoded build path aligns with CMake output structure.

Same concern as in build_and_test_smart_homes_queries.sh: the hardcoded path build/offline must match the actual CMake output location. Please verify using the script provided in the other file's review.

🧹 Nitpick comments (11)
vcpkg-triplets/toolchains/gcc-libstdcxx.cmake (1)

3-4: Verify that C++23 standard is intentional and compatible with build environment.

The toolchain sets CMAKE_CXX_STANDARD to 23, which is a very recent standard. This requires:

  • GCC 14+ for full C++23 support (GCC 13 has only partial support)
  • All CI/CD environments must provide a compatible compiler version
  • Project dependencies and existing code must be compatible with C++23

Confirm whether C++23 was an intentional choice for this PR, or if this should remain at C++20/C++17 for wider compatibility.

Can you confirm the minimum GCC version requirement? If C++20 is acceptable, I can help update this to CMAKE_CXX_STANDARD 20.

src/pybind/python_binding.cpp (1)

120-134: Remove rvalue-ref from Python-facing constructors
nb::init<uint64_t, std::vector<std::shared_ptr<Types::Event>>&&>() and nb::init<uint64_t, std::string, std::vector<Types::EventInfo>&&>() use rvalue references, which nanobind doesn't properly support in Python bindings. Use by-value parameters instead (matching the pattern in Types::EventInfo and Types::EventInfoParsed elsewhere in this file). When Python calls these constructors, nanobind creates temporary C++ objects and move-semantics are handled automatically.

         nb::class_<Types::Stream>(m, "PyStream")
-            .def(nb::init<uint64_t, std::vector<std::shared_ptr<Types::Event>>&&>())
+            .def(nb::init<uint64_t, std::vector<std::shared_ptr<Types::Event>>>())
             .def_ro("id", &Types::Stream::id)
             .def_ro("events", &Types::Stream::events);

         nb::class_<Types::StreamInfo>(m, "PyStreamInfo")
-            .def(nb::init<uint64_t, std::string, std::vector<Types::EventInfo>&&>())
+            .def(nb::init<uint64_t, std::string, std::vector<Types::EventInfo>>())
             .def_ro("events_info", &Types::StreamInfo::events_info)
             .def_ro("name", &Types::StreamInfo::name)
             .def_ro("id", &Types::StreamInfo::id);
vcpkg.json (1)

1-18: Consider version constraints for reproducibility.

The vcpkg manifest is well-structured and includes comprehensive dependencies. However, no version constraints are specified, which means vcpkg will use baseline versions from its registry. While this provides flexibility, it may lead to version drift and potential compatibility issues over time.

Consider adding version constraints for critical dependencies to ensure reproducible builds:

 {
     "name": "cer-core",
     "version": "0.0.0",
     "dependencies": [
-        "antlr4",
+        {
+            "name": "antlr4",
+            "version>=": "4.12.0"
+        },
         "catch2",
         ...
     ]
 }

Alternatively, you can rely on vcpkg's baseline versioning in vcpkg-configuration.json if that provides sufficient control for your use case.

.github/workflows/benchmark.yaml (1)

59-60: Make scripts/install_dependencies.sh deterministic + cacheable in CI
Right now CI performance/reliability depends entirely on that script (vcpkg clone/baseline, registries, retries, timeouts). Consider ensuring it (1) pins a vcpkg baseline/commit and (2) uses a stable download/build cache directory you can persist via actions/cache.

Dockerfile (1)

54-56: Consider reordering to improve Docker cache reuse for vcpkg installs
Since COPY . . happens before this, any source change will bust the dependency layer and rebuild all vcpkg deps. If feasible, copy only vcpkg.json/triplets first, run scripts/install_dependencies.sh, then copy the rest.

.devcontainer/Dockerfile (1)

22-28: Pin the vcpkg commit (and confirm deps actually get installed in the devcontainer)
git clone without a pinned commit/baseline makes devcontainer builds non-reproducible. Also verify there’s a later vcpkg install (or scripts/install_dependencies.sh) step; otherwise this only bootstraps vcpkg but doesn’t fetch ports.

scripts/generate_antlr.py (1)

62-71: Consider creating missing directories instead of exiting.
If someone runs this before those parsing dirs exist (or paths change), generation hard-fails.

     for dir_name, grammar_name in directories.items():
         full_path = path.join(parsing_dir, dir_name, script_name)
         print(f"Generating {full_path}...")
         try:
+            os.makedirs(path.join(parsing_dir, dir_name), exist_ok=True)
             with open(full_path, "w", encoding="utf-8") as f:
                 f.write(create_script(grammar_name, antlr4_version))
             os.chmod(full_path, 0o755) # Make executable
         except FileNotFoundError:
CMakeLists.txt (1)

169-175: Prefer vcpkg-provided install dir variables (or target include dirs) over hardcoding ${CMAKE_BINARY_DIR}/vcpkg_installed/....
Hardcoding is brittle if VCPKG_INSTALLED_DIR is customized (or if the toolchain changes defaults).

-target_include_directories(core SYSTEM PUBLIC "${CMAKE_BINARY_DIR}/vcpkg_installed/${VCPKG_TARGET_TRIPLET}/include/uwebsockets")
+if(DEFINED VCPKG_INSTALLED_DIR)
+  target_include_directories(core SYSTEM PUBLIC "${VCPKG_INSTALLED_DIR}/${VCPKG_TARGET_TRIPLET}/include/uwebsockets")
+else()
+  target_include_directories(core SYSTEM PUBLIC "${CMAKE_BINARY_DIR}/vcpkg_installed/${VCPKG_TARGET_TRIPLET}/include/uwebsockets")
+endif()
src/core_server/internal/interface/modules/query/evaluators/dynamic_evaluator.hpp (2)

32-58: Make EvaluatorStorageWrapper constructors explicit + simplify init (Sonar).

Both constructors (Line 44-48) are single-arg and should be explicit to prevent accidental implicit conversions. Also : evaluator() is redundant.

 struct EvaluatorStorageWrapper {
   std::optional<EvaluatorIndexWrapper> evaluator;
   std::optional<std::list<EvaluatorIndexWrapper>::iterator> list_position;

-  EvaluatorStorageWrapper(EvaluatorIndexWrapper&& evaluator)
+  explicit EvaluatorStorageWrapper(EvaluatorIndexWrapper&& evaluator)
       : evaluator(std::move(evaluator)), list_position(std::nullopt) {}

-  EvaluatorStorageWrapper(std::list<EvaluatorIndexWrapper>::iterator list_it)
-      : evaluator(), list_position(list_it) {}
+  explicit EvaluatorStorageWrapper(std::list<EvaluatorIndexWrapper>::iterator list_it)
+      : list_position(list_it) {}
 };

124-152: Avoid std::move on local return (copy elision).

Line 151: return std::move(enumerator); inhibits copy elision; just return enumerator; is preferable.
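
For reference, a tiny self-contained illustration of the difference (not code from this PR):

```cpp
#include <utility>
#include <vector>

std::vector<int> make_elided() {
  std::vector<int> v{1, 2, 3};
  return v;             // returning by name is NRVO-eligible: no copy, no move
}

std::vector<int> make_pessimized() {
  std::vector<int> v{1, 2, 3};
  return std::move(v);  // casts to rvalue, forcing a move and blocking NRVO
}
```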

src/core_server/internal/interface/modules/query/query_types/partition_by_query.hpp (1)

108-142: Good staged query setup; consider clearing per-query caches on (re)create.

The 5-stage pipeline reads well. If create_query() can be called more than once per instance, you likely want to clear:

  • partition_by_attrs_to_evaluator_idx
  • event_id_to_tuple_idx
  • evaluator (old evaluators)

Otherwise, stale schema indices / partitions could leak across queries.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 479d0fe and 16d0b15.

📒 Files selected for processing (52)
  • .devcontainer/Dockerfile (1 hunks)
  • .github/workflows/benchmark.yaml (1 hunks)
  • .github/workflows/checks.yaml (2 hunks)
  • .github/workflows/test-checks.yaml (6 hunks)
  • .github/workflows/upload-conan.yaml (0 hunks)
  • .github/workflows/valgrind-checks.yaml (2 hunks)
  • .gitignore (1 hunks)
  • CMakeLists.txt (5 hunks)
  • Dockerfile (1 hunks)
  • PYTHON_PACKAGE.md (1 hunks)
  • README.md (1 hunks)
  • conan_profiles/aarch64-linux-gcc-libstdc (0 hunks)
  • conan_profiles/aarch64-macos-clang-libc (0 hunks)
  • conan_profiles/x86_64-linux-clang-libc (0 hunks)
  • conan_profiles/x86_64-linux-clang-libstdc (0 hunks)
  • conan_profiles/x86_64-linux-gcc-libstdc (0 hunks)
  • conanfile.py (0 hunks)
  • docs/developers/README.md (1 hunks)
  • docs/users/INSTALLATION.md (3 hunks)
  • pyproject.toml (1 hunks)
  • requirements.txt (0 hunks)
  • scripts/all.sh (1 hunks)
  • scripts/build.sh (0 hunks)
  • scripts/build_and_test.sh (1 hunks)
  • scripts/build_and_valgrind.sh (2 hunks)
  • scripts/build_grammar.sh (0 hunks)
  • scripts/common.sh (4 hunks)
  • scripts/create_package.sh (0 hunks)
  • scripts/generate_antlr.py (1 hunks)
  • scripts/install_dependencies.sh (1 hunks)
  • scripts/queries/base_queries/build_and_test_bluesky_ordered.sh (1 hunks)
  • scripts/queries/base_queries/build_and_test_bluesky_unordered.sh (1 hunks)
  • scripts/queries/base_queries/build_and_test_smart_homes_queries.sh (1 hunks)
  • scripts/queries/base_queries/build_and_test_stock_queries.sh (1 hunks)
  • scripts/queries/base_queries/build_and_test_stock_queries_unordered.sh (1 hunks)
  • scripts/queries/base_queries/build_and_test_taxi_queries.sh (1 hunks)
  • scripts/queries/benchmark_queries/benchmark_runner.sh (2 hunks)
  • scripts/run_and_compare.sh (0 hunks)
  • src/core_server/internal/evaluation/evaluator.hpp (3 hunks)
  • src/core_server/internal/interface/modules/query/evaluators/dynamic_evaluator.hpp (4 hunks)
  • src/core_server/internal/interface/modules/query/query_types/partition_by_query.hpp (6 hunks)
  • src/core_server/internal/parsing/ceql_query/autogenerate_script.sh (1 hunks)
  • src/core_server/internal/parsing/option_declaration/autogenerate_script.sh (1 hunks)
  • src/core_server/internal/parsing/stream_declaration/autogenerate_script.sh (1 hunks)
  • src/pybind/python_binding.cpp (4 hunks)
  • vcpkg-triplets/toolchains/clang-libcxx.cmake (1 hunks)
  • vcpkg-triplets/toolchains/clang-libstdcxx.cmake (1 hunks)
  • vcpkg-triplets/toolchains/gcc-libstdcxx.cmake (1 hunks)
  • vcpkg-triplets/x64-linux-clang-libcxx.cmake (1 hunks)
  • vcpkg-triplets/x64-linux-clang-libstdcxx.cmake (1 hunks)
  • vcpkg-triplets/x64-linux-gcc-libstdcxx.cmake (1 hunks)
  • vcpkg.json (1 hunks)
💤 Files with no reviewable changes (12)
  • scripts/run_and_compare.sh
  • scripts/build_grammar.sh
  • scripts/create_package.sh
  • conan_profiles/aarch64-linux-gcc-libstdc
  • conan_profiles/x86_64-linux-clang-libc
  • scripts/build.sh
  • .github/workflows/upload-conan.yaml
  • conan_profiles/x86_64-linux-clang-libstdc
  • conan_profiles/x86_64-linux-gcc-libstdc
  • requirements.txt
  • conanfile.py
  • conan_profiles/aarch64-macos-clang-libc
🧰 Additional context used
🧬 Code graph analysis (3)
src/core_server/internal/interface/modules/query/evaluators/dynamic_evaluator.hpp (1)
src/core_server/internal/evaluation/evaluator.hpp (7)
  • assert (105-109)
  • event (120-228)
  • event (121-121)
  • event (260-319)
  • event (260-264)
  • current_time (111-118)
  • current_time (111-111)
src/core_server/internal/interface/modules/query/query_types/partition_by_query.hpp (2)
src/shared/datatypes/value.hpp (1)
  • val (145-145)
src/core_server/internal/interface/modules/query/evaluators/dynamic_evaluator.hpp (5)
  • event (124-152)
  • event (125-125)
  • evaluator_idx (157-180)
  • evaluator_idx (158-158)
  • evaluator (50-57)
src/core_server/internal/evaluation/evaluator.hpp (2)
src/core_server/internal/interface/modules/query/evaluators/dynamic_evaluator.hpp (2)
  • event (124-152)
  • event (125-125)
src/core_server/internal/ceql/query/within.hpp (1)
  • time_window (73-88)
🪛 Cppcheck (2.18.0)
src/pybind/python_binding.cpp

[error] 67-67: syntax error

(syntaxError)

🪛 GitHub Check: SonarCloud Code Analysis
scripts/install_dependencies.sh

[failure] 13-13: Use '[[' instead of '[' for conditional tests. The '[[' construct is safer and more feature-rich.

See more on https://sonarcloud.io/project/issues?id=CORE-cer_CORE&issues=AZsT1bGvTl4z2CG7OjhN&open=AZsT1bGvTl4z2CG7OjhN&pullRequest=166

src/core_server/internal/interface/modules/query/evaluators/dynamic_evaluator.hpp

[warning] 48-48: Remove this use of the constructor's initializer list for data member "evaluator". It is redundant with default initialization behavior.

See more on https://sonarcloud.io/project/issues?id=CORE-cer_CORE&issues=AZsT1bDATl4z2CG7Ojg_&open=AZsT1bDATl4z2CG7Ojg_&pullRequest=166


[failure] 47-47: Add the "explicit" keyword to this constructor.

See more on https://sonarcloud.io/project/issues?id=CORE-cer_CORE&issues=AZsT1bDATl4z2CG7OjhB&open=AZsT1bDATl4z2CG7OjhB&pullRequest=166


[failure] 44-44: Add the "explicit" keyword to this constructor.

See more on https://sonarcloud.io/project/issues?id=CORE-cer_CORE&issues=AZsT1bDATl4z2CG7OjhA&open=AZsT1bDATl4z2CG7OjhA&pullRequest=166


[warning] 151-151: moving a local object in a return statement prevents copy elision

See more on https://sonarcloud.io/project/issues?id=CORE-cer_CORE&issues=AZsT1bDATl4z2CG7Ojg-&open=AZsT1bDATl4z2CG7Ojg-&pullRequest=166

src/core_server/internal/evaluation/evaluator.hpp

[warning] 114-114: Use the init-statement to declare "expiration_time" inside the if statement.

See more on https://sonarcloud.io/project/issues?id=CORE-cer_CORE&issues=AZsT1bCcTl4z2CG7Ojg9&open=AZsT1bCcTl4z2CG7Ojg9&pullRequest=166

scripts/common.sh

[warning] 41-41: Redirect this error message to stderr (>&2).

See more on https://sonarcloud.io/project/issues?id=CORE-cer_CORE&issues=AZsT1bGaTl4z2CG7OjhE&open=AZsT1bGaTl4z2CG7OjhE&pullRequest=166


[warning] 28-28: Redirect this error message to stderr (>&2).

See more on https://sonarcloud.io/project/issues?id=CORE-cer_CORE&issues=AZsT1bGaTl4z2CG7OjhD&open=AZsT1bGaTl4z2CG7OjhD&pullRequest=166


[warning] 15-15: Redirect this error message to stderr (>&2).

See more on https://sonarcloud.io/project/issues?id=CORE-cer_CORE&issues=AZsT1bGaTl4z2CG7OjhC&open=AZsT1bGaTl4z2CG7OjhC&pullRequest=166


[warning] 58-58: Redirect this error message to stderr (>&2).

See more on https://sonarcloud.io/project/issues?id=CORE-cer_CORE&issues=AZsT1bGaTl4z2CG7OjhF&open=AZsT1bGaTl4z2CG7OjhF&pullRequest=166


[warning] 68-68: Redirect this error message to stderr (>&2).

See more on https://sonarcloud.io/project/issues?id=CORE-cer_CORE&issues=AZsT1bGaTl4z2CG7OjhG&open=AZsT1bGaTl4z2CG7OjhG&pullRequest=166


[failure] 67-67: Use '[[' instead of '[' for conditional tests. The '[[' construct is safer and more feature-rich.

See more on https://sonarcloud.io/project/issues?id=CORE-cer_CORE&issues=AZsT1bGaTl4z2CG7OjhH&open=AZsT1bGaTl4z2CG7OjhH&pullRequest=166


[failure] 108-108: Use '[[' instead of '[' for conditional tests. The '[[' construct is safer and more feature-rich.

See more on https://sonarcloud.io/project/issues?id=CORE-cer_CORE&issues=AZsT1bGaTl4z2CG7OjhK&open=AZsT1bGaTl4z2CG7OjhK&pullRequest=166


[failure] 83-83: Use '[[' instead of '[' for conditional tests. The '[[' construct is safer and more feature-rich.

See more on https://sonarcloud.io/project/issues?id=CORE-cer_CORE&issues=AZsT1bGaTl4z2CG7OjhI&open=AZsT1bGaTl4z2CG7OjhI&pullRequest=166


[failure] 124-124: Use '[[' instead of '[' for conditional tests. The '[[' construct is safer and more feature-rich.

See more on https://sonarcloud.io/project/issues?id=CORE-cer_CORE&issues=AZsT1bGaTl4z2CG7OjhM&open=AZsT1bGaTl4z2CG7OjhM&pullRequest=166


[failure] 116-116: Use '[[' instead of '[' for conditional tests. The '[[' construct is safer and more feature-rich.

See more on https://sonarcloud.io/project/issues?id=CORE-cer_CORE&issues=AZsT1bGaTl4z2CG7OjhL&open=AZsT1bGaTl4z2CG7OjhL&pullRequest=166


[failure] 106-106: Use '[[' instead of '[' for conditional tests. The '[[' construct is safer and more feature-rich.

See more on https://sonarcloud.io/project/issues?id=CORE-cer_CORE&issues=AZsT1bGaTl4z2CG7OjhJ&open=AZsT1bGaTl4z2CG7OjhJ&pullRequest=166

🪛 Ruff (0.14.8)
scripts/generate_antlr.py

68-68: os.chmod setting a permissive mask 0o755 on file or directory

(S103)

🪛 Shellcheck (0.11.0)
scripts/all.sh

[error] 56-56: Double quote array expansions to avoid re-splitting elements.

(SC2068)


[error] 64-64: Double quote array expansions to avoid re-splitting elements.

(SC2068)


[error] 72-72: Double quote array expansions to avoid re-splitting elements.

(SC2068)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (14)
  • GitHub Check: run-address-sanitizer-tests-release (clang-libstdcxx)
  • GitHub Check: run-basic-tests-debug (gcc-libstdcxx)
  • GitHub Check: run-tests-release (gcc-libstdcxx)
  • GitHub Check: run-basic-tests-debug (clang-libstdcxx)
  • GitHub Check: run-thread-sanitizer-tests-release (clang-libstdcxx)
  • GitHub Check: run-tests-release (clang-libstdcxx)
  • GitHub Check: run-sanitizer-tests-debug (clang-libstdcxx)
  • GitHub Check: check-code-format
  • GitHub Check: check-clang-tidy
  • GitHub Check: check-ceql-query-grammar-up-to-date
  • GitHub Check: run-valgrind-debug (gcc-libstdcxx)
  • GitHub Check: run-docker-build-check
  • GitHub Check: run-valgrind-debug (clang-libstdcxx)
  • GitHub Check: run-benchmarks-release
🔇 Additional comments (26)
.gitignore (1)

2-3: LGTM!

The vcpkg directory entries are correctly added and aligned with the Conan-to-vcpkg migration.

docs/developers/README.md (1)

29-29: LGTM!

Documentation correctly updated to reference vcpkg triplets instead of Conan profiles.

README.md (1)

19-19: LGTM!

Feature list correctly updated to reflect vcpkg as the dependency manager.

scripts/all.sh (1)

56-56: All base_queries scripts exist at the specified paths and are accessible. The script references in scripts/all.sh (lines 56, 64, 72) are correct:

  • ./scripts/queries/base_queries/build_and_test_stock_queries.sh
  • ./scripts/queries/base_queries/build_and_test_smart_homes_queries.sh
  • ./scripts/queries/base_queries/build_and_test_taxi_queries.sh
scripts/build_and_test.sh (1)

16-16: The test executable paths are correctly configured for the current build system. With Ninja (a single-config generator) and no explicit CMAKE_RUNTIME_OUTPUT_DIRECTORY setting in CMakeLists.txt, CMake outputs executables directly to the specified build directory. The flat build/tests and build/third_party_tests paths are correct and will resolve properly.

scripts/queries/base_queries/build_and_test_stock_queries_unordered.sh (1)

14-14: The executable path build/offline is correct and uses CMake's default behavior.

This file was newly created as part of the conan removal commit. The flat build/offline path is the standard CMake behavior when CMAKE_RUNTIME_OUTPUT_DIRECTORY is not explicitly set—executables are placed directly in the CMAKE_BINARY_DIR (which is build/ in this configuration). All other query scripts in the same directory consistently use the same build/offline path. No per-BUILD_TYPE subdirectories (like build/Debug/ or build/Release/) are configured, and this is appropriate for the current build setup.

src/core_server/internal/parsing/ceql_query/autogenerate_script.sh (1)

3-6: Header update for generated script is clear
This makes it explicit where to edit/regenerate.

scripts/queries/base_queries/build_and_test_bluesky_unordered.sh (1)

13-15: No action needed: build/offline is the correct executable path

The path executable="build/offline" is the standard output location used consistently across the entire test suite (all query scripts and benchmarks). The script properly calls the build() function on line 11 before using this executable, which generates the binary at this location.

scripts/queries/base_queries/build_and_test_stock_queries.sh (1)

13-15: Path is correct; executable will be at build/offline after building the project

The CMake configuration explicitly uses cmake -B build to set the build directory, and the offline executable defined in CMakeLists.txt will be placed at build/offline. This is the intended post-build artifact location, not a source repository artifact, so the reference is accurate.

vcpkg-triplets/x64-linux-clang-libcxx.cmake (1)

1-5: LGTM!

The vcpkg triplet configuration is well-structured and follows vcpkg conventions. The combination of static library linkage with dynamic CRT linkage is appropriate, and the chainloaded toolchain path correctly uses CMAKE_CURRENT_LIST_DIR for relative resolution.

vcpkg-triplets/toolchains/clang-libstdcxx.cmake (1)

1-5: LGTM!

The toolchain configuration correctly sets up Clang with libstdc++ and enforces C++23 standard. The use of CMAKE_CXX_FLAGS_INIT for the stdlib flag is appropriate.

src/core_server/internal/parsing/stream_declaration/autogenerate_script.sh (1)

4-5: LGTM!

The updated comment correctly reflects the new ANTLR autogeneration workflow via scripts/generate_antlr.py, aligning with the migration from Conan to vcpkg-based tooling.

vcpkg-triplets/toolchains/clang-libcxx.cmake (1)

1-6: LGTM!

The toolchain configuration correctly sets up Clang with libc++ and includes the necessary linker flags for libc++abi. The C++23 standard requirement is consistent with the project's modern C++ approach.

src/core_server/internal/parsing/option_declaration/autogenerate_script.sh (1)

4-5: LGTM!

The updated comment correctly reflects the new ANTLR autogeneration workflow, consistent with the changes in other autogeneration scripts across the codebase.

scripts/queries/base_queries/build_and_test_smart_homes_queries.sh (1)

14-14: The hardcoded path is correct and does not require changes.

CMakeLists.txt does not configure RUNTIME_OUTPUT_DIRECTORY, so the offline executable is placed directly in build/ (CMAKE_BINARY_DIR), not in configuration-specific subdirectories like build/Debug/ or build/Release/. The hardcoded path build/offline aligns with CMake's default behavior and will work as intended. CMAKE_BUILD_TYPE is passed to CMake for compilation flags and debug symbols, not to control binary output location.

Likely an incorrect or invalid review comment.

vcpkg-triplets/x64-linux-clang-libstdcxx.cmake (1)

1-5: Triplet looks fine; verify toolchain path + CI mapping
Please confirm vcpkg-triplets/toolchains/clang-libstdcxx.cmake exists and your build scripts map -c clang-libstdcxx to this triplet consistently.

vcpkg-triplets/x64-linux-gcc-libstdcxx.cmake (1)

1-5: Triplet looks fine; verify toolchain file + GCC selection in CI
Confirm vcpkg-triplets/toolchains/gcc-libstdcxx.cmake exists and that your scripts/CI actually select GCC (not Clang) when -c gcc-libstdcxx is used.

.github/workflows/checks.yaml (1)

47-48: The toolchain configuration is correctly handled by CMake flags, not environment variables. The build() function in common.sh explicitly passes -DCMAKE_TOOLCHAIN_FILE, -DVCPKG_TARGET_TRIPLET, and -DVCPKG_OVERLAY_TRIPLETS to cmake, and CMakeLists.txt sets CMAKE_TOOLCHAIN_FILE directly if not already defined. The install_dependencies.sh script only needs to clone and bootstrap vcpkg, which it does correctly—no additional toolchain environment exports are required.

.github/workflows/test-checks.yaml (1)

27-27: Consider updating actions/checkout from @v3 to @v4.
External action versions are a supply-chain surface; worth keeping current.

- uses: actions/checkout@v3
+ uses: actions/checkout@v4

Also applies to: 77-77, 130-130, 183-183, 243-243

scripts/generate_antlr.py (1)

68-68: Ruff S103 on chmod 0o755: OK for an intentionally executable generated script.
If you want to silence S103, add a targeted ignore comment; otherwise I’d keep as-is.

CMakeLists.txt (2)

1-6: Toolchain setup before cmake_minimum_required() is fragile—verify this works with your CMake/policies.
If you see policy warnings, consider moving this into presets/toolchain invocation or top-level include mechanisms.


24-30: Verify vcpkg provides pkg-config metadata for gmp/gmpxx in your environment.
If not, pkg_check_modules(... REQUIRED ...) will fail even if vcpkg installed the libs. Consider preferring a CMake config package if available.

src/core_server/internal/evaluation/evaluator.hpp (1)

136-152: Out-of-order tuples now get dropped in non-debug builds — verify this is the intended production behavior.

Line 137-150 turns “out-of-order” into a hard drop (return {}) with LOG_CRITICAL. If the upstream ordering guarantee is ever violated (even transiently), this can silently lose matches.

If that’s intended: consider reducing log severity (or rate-limiting) to avoid CRITICAL spam under sustained disorder.
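
For clarity, a hedged sketch of the drop-on-disorder pattern being flagged; `process` and `last_seen` are invented names for illustration only:

```cpp
#include <cstdint>
#include <optional>

// Returns a result only for in-order tuples; out-of-order input is dropped.
std::optional<int> process(uint64_t tuple_time, uint64_t& last_seen) {
  if (tuple_time < last_seen) [[unlikely]] {
    // Production builds would LOG_CRITICAL here; the tuple (and any match it
    // would have produced) is silently lost.
    return {};
  }
  last_seen = tuple_time;
  return 0;  // placeholder for the real evaluation result
}
```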

src/core_server/internal/interface/modules/query/query_types/partition_by_query.hpp (3)

1-10: File-level docs are clear and match the intent.


144-164: Caching flow looks sound; watch pointer lifetime assumptions.

The “cache hit after warmup” path (Line 146-153) and caching nullopt for incompatible event types (Line 188-196) looks reasonable.

Just ensure tuple_indexes pointers never escape process_event() (they’d be invalidated on future event_id_to_tuple_idx rehashes).

Also applies to: 169-197


48-78: Partition key hashing is actually consistent with TupleValuesKey::operator== for current Value types, but the design is fragile.

The hash (Line 69-74) uses get_type() and to_string(), while operator== (Line 56-62) compares Values using their operator!=. Since each Value subclass (StringValue, IntValue, DoubleValue, BoolValue, DateValue) uses direct value comparison in operator== (e.g., this->val == other_ptr->val) and std::to_string() is deterministic for those types, equal keys will hash consistently—the current code is not broken.

However, this pattern is architecturally fragile. If a future Value subclass uses looser equality (e.g., epsilon-based double comparison) or non-canonical to_string() output, the hash would silently become inconsistent, breaking unordered_map semantics. A more robust approach would be to add a virtual hash() method to the Value class or use stable binary serialization, ensuring the hash implementation is decoupled from string formatting.
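
A sketch of the suggested virtual `hash()` approach, assuming a `Value` hierarchy roughly like the one described; the method names here are hypothetical:

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>

struct Value {
  virtual ~Value() = default;
  virtual bool equals(const Value& other) const = 0;
  // Each subclass hashes its canonical representation directly, so hashing no
  // longer depends on to_string() formatting staying in sync with equality.
  virtual std::size_t hash() const = 0;
};

struct IntValue final : Value {
  int64_t val;
  explicit IntValue(int64_t v) : val(v) {}

  bool equals(const Value& other) const override {
    const auto* o = dynamic_cast<const IntValue*>(&other);
    return o != nullptr && o->val == val;
  }

  std::size_t hash() const override { return std::hash<int64_t>{}(val); }
};
```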

Comment on lines 216 to 224
- name: base build
- run: chmod +x ./scripts/build.sh && ./scripts/build.sh -b Release -s address -p conan_profiles/${{ matrix.conan_profile }}
+ run: chmod +x ./scripts/build.sh && ./scripts/build.sh -b Release -s address -c ${{ matrix.compiler_profile }}

- name: build with grammar
- run: chmod +x ./scripts/build_grammar.sh && ./scripts/build_grammar.sh -b Release -s address -p conan_profiles/${{ matrix.conan_profile }}
+ run: chmod +x ./scripts/build_grammar.sh && ./scripts/build_grammar.sh -b Release -s address -c ${{ matrix.compiler_profile }}

- name: run tests with thread sanitizer
- run: chmod +x ./scripts/build_and_test.sh && ./scripts/build_and_test.sh -b Release -s thread -p conan_profiles/${{ matrix.conan_profile }}
+ run: chmod +x ./scripts/build_and_test.sh && ./scripts/build_and_test.sh -b Release -s thread -c ${{ matrix.compiler_profile }}


⚠️ Potential issue | 🔴 Critical

Thread-sanitizer Release job is building with Address sanitizer flags (-s address).
This looks like a copy/paste error and can invalidate the TSAN run.

- run: chmod +x ./scripts/build.sh && ./scripts/build.sh -b Release -s address -c ${{ matrix.compiler_profile }}
+ run: chmod +x ./scripts/build.sh && ./scripts/build.sh -b Release -s thread -c ${{ matrix.compiler_profile }}

- run: chmod +x ./scripts/build_grammar.sh && ./scripts/build_grammar.sh -b Release -s address -c ${{ matrix.compiler_profile }}
+ run: chmod +x ./scripts/build_grammar.sh && ./scripts/build_grammar.sh -b Release -s thread -c ${{ matrix.compiler_profile }}
🤖 Prompt for AI Agents
.github/workflows/test-checks.yaml around lines 216 to 224: the "run tests with
thread sanitizer" step is invoking ./scripts/build_and_test.sh with the wrong
sanitizer flag (-s address) due to copy/paste; update that step to use -s thread
instead of -s address so the build_and_test job actually compiles/tests with
ThreadSanitizer, keeping the chmod +x and the -b Release and -c ${{
matrix.compiler_profile }} options unchanged.

Comment on lines +16 to 17
compiler_profile: [clang-libstdcxx, gcc-libstdcxx]
runs-on: ubuntu-latest

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "== List scripts directory =="
find scripts -type f -name "*.sh" 2>/dev/null | head -20

echo "== locate -c flag parsing in scripts =="
rg -n --hidden -S -C3 --type sh '\bgetopts\b|-c|compiler_profile|triplet' scripts/

echo "== check mapping for clang-libstdcxx/gcc-libstdcxx =="
rg -n --hidden -S -C3 --type sh 'clang-libstdcxx|gcc-libstdcxx|x64-linux-clang-libstdcxx|x64-linux-gcc-libstdcxx' scripts/

Repository: CORE-cer/CORE

Length of output: 170


🏁 Script executed:

#!/bin/bash
set -euo pipefail

REPO="CORE-cer/CORE"
BRANCH="main"  # or default branch

echo "== Attempting to fetch workflow file from GitHub raw content =="
curl -s "https://raw.githubusercontent.com/${REPO}/${BRANCH}/.github/workflows/valgrind-checks.yaml" | head -100

echo -e "\n== Attempting to fetch build.sh =="
curl -s "https://raw.githubusercontent.com/${REPO}/${BRANCH}/scripts/build.sh" | grep -A5 -B5 "getopts\|-c\|compiler_profile" || echo "Pattern not found or file not accessible"

echo -e "\n== Attempting to fetch build_grammar.sh =="
curl -s "https://raw.githubusercontent.com/${REPO}/${BRANCH}/scripts/build_grammar.sh" | grep -A5 -B5 "getopts\|-c\|compiler_profile" || echo "Pattern not found or file not accessible"

echo -e "\n== Attempting to fetch build_and_valgrind.sh =="
curl -s "https://raw.githubusercontent.com/${REPO}/${BRANCH}/scripts/build_and_valgrind.sh" | grep -A5 -B5 "getopts\|-c\|compiler_profile" || echo "Pattern not found or file not accessible"

Repository: CORE-cer/CORE

Length of output: 3965


🏁 Script executed:

#!/bin/bash
set -euo pipefail

REPO="CORE-cer/CORE"
BRANCH="main"

echo "== Full build.sh =="
curl -s "https://raw.githubusercontent.com/${REPO}/${BRANCH}/scripts/build.sh" | head -100

echo -e "\n\n== Full build_grammar.sh =="
curl -s "https://raw.githubusercontent.com/${REPO}/${BRANCH}/scripts/build_grammar.sh" | head -100

echo -e "\n\n== Full build_and_valgrind.sh =="
curl -s "https://raw.githubusercontent.com/${REPO}/${BRANCH}/scripts/build_and_valgrind.sh" | head -100

Repository: CORE-cer/CORE

Length of output: 1963


🏁 Script executed:

#!/bin/bash
set -euo pipefail

REPO="CORE-cer/CORE"
BRANCH="main"

echo "== scripts/common.sh =="
curl -s "https://raw.githubusercontent.com/${REPO}/${BRANCH}/scripts/common.sh"

Repository: CORE-cer/CORE

Length of output: 1492


🏁 Script executed:

#!/bin/bash
set -euo pipefail

REPO="CORE-cer/CORE"
BRANCH="main"

echo "== Check if install_dependencies.sh exists and its content =="
curl -s "https://raw.githubusercontent.com/${REPO}/${BRANCH}/scripts/install_dependencies.sh" | head -80

Repository: CORE-cer/CORE

Length of output: 302


Add -c flag support to scripts/common.sh or revert to -p flag with profile paths

The proposed changes in lines 16–17 (renaming matrix.conan_profile to matrix.compiler_profile and changing values to clang-libstdcxx/gcc-libstdcxx) combined with passing -c ${{ matrix.compiler_profile }} in lines 63–70 will fail silently: scripts/common.sh only recognizes -p for profiles; the -c flag is not parsed by _setArgs(). The build will always use the default profile (x86_64-linux-clang-libstdc), making the matrix ineffective.

Either:

  1. Update scripts/common.sh to parse -c and map clang-libstdcxx/gcc-libstdcxx to valid Conan profile paths, or
  2. Keep the -p flag and pass full profile paths: -p conan_profiles/x86_64-linux-${{ matrix.compiler_profile }} (adjusting path format as needed).
🤖 Prompt for AI Agents
In .github/workflows/valgrind-checks.yaml around lines 16–17, the workflow now
emits matrix.compiler_profile and the job was changed to pass -c
${matrix.compiler_profile}, but scripts/common.sh only recognizes -p so the
chosen profile is ignored; fix by either (A) updating scripts/common.sh's
_setArgs() to accept a -c flag that maps values like
clang-libstdcxx/gcc-libstdcxx to the correct conan profile path (e.g. translate
to conan_profiles/x86_64-linux-clang-libstdcxx or the actual profile filenames
and set the internal profile variable used later), or (B) revert the workflow to
use -p and pass the full profile path per matrix entry (e.g. -p
conan_profiles/x86_64-linux-${{ matrix.compiler_profile }}) so no shell changes
are required.

Comment on lines +127 to +155
# Only build Python module if nanobind is available (pip builds)
if(nanobind_FOUND AND Python_FOUND)
message(STATUS "Building python target")
nanobind_add_module(pycer MODULE src/pybind/python_binding.cpp)

# Add compiler and linker flags to pycer (removed -Werror=old-style-cast due to Python/nanobind headers)
target_compile_options(pycer PRIVATE -Wno-unused-variable -Wno-unused-command-line-argument)

target_link_libraries(pycer PRIVATE cereal::cereal quill::quill cppzmq TracyClient Threads::Threads PkgConfig::gmp PkgConfig::gmpxx)
# target_link_libraries(pycer PRIVATE ${project_LIB})
install(TARGETS pycer DESTINATION .)

# Add stub generation using nanobind's built-in tool
nanobind_add_stub(
pycer_stub
MODULE pycer
OUTPUT pycer.pyi
MARKER_FILE py.typed
PYTHON_PATH $<TARGET_FILE_DIR:pycer>
DEPENDS pycer
)

# Add compiler and linker flags to pycore
target_compile_options(_pycore PRIVATE -Werror=old-style-cast -Wno-unused-variable -Wno-unused-command-line-argument)
# Install the module, stub, and marker file
install(TARGETS pycer LIBRARY DESTINATION . COMPONENT python)
install(FILES ${CMAKE_CURRENT_BINARY_DIR}/pycer.pyi
${CMAKE_CURRENT_BINARY_DIR}/py.typed
DESTINATION .
COMPONENT python)
else()

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, locate and read the CMakeLists.txt file
fd -n CMakeLists.txt

Repository: CORE-cer/CORE

Length of output: 287


🏁 Script executed:

# Read the specific lines mentioned in the review comment (127-155)
cat -n CMakeLists.txt | sed -n '127,155p'

Repository: CORE-cer/CORE

Length of output: 1358


🏁 Script executed:

# Check if there are any other install(TARGETS pycer) rules in the entire CMakeLists.txt
rg "install\(TARGETS pycer" CMakeLists.txt

Repository: CORE-cer/CORE

Length of output: 155


🏁 Script executed:

# Get more context around the install commands to understand the full scope
cat -n CMakeLists.txt | sed -n '130,160p'

Repository: CORE-cer/CORE

Length of output: 1338


🏁 Script executed:

# Verify CMake documentation on install behavior with duplicate TARGETS
web_search
CMake install TARGETS duplicate rules behavior error

Repository: CORE-cer/CORE

Length of output: 151


Remove duplicate install(TARGETS pycer ...) rule.

Line 137 (install(TARGETS pycer DESTINATION .)) duplicates the installation defined at line 150 with more specific parameters. Keep only the rule at line 150 which properly specifies the LIBRARY component for the Python module.

- install(TARGETS pycer DESTINATION .)
@@
 install(TARGETS pycer LIBRARY DESTINATION . COMPONENT python)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

Current:
# Only build Python module if nanobind is available (pip builds)
if(nanobind_FOUND AND Python_FOUND)
message(STATUS "Building python target")
nanobind_add_module(pycer MODULE src/pybind/python_binding.cpp)
# Add compiler and linker flags to pycer (removed -Werror=old-style-cast due to Python/nanobind headers)
target_compile_options(pycer PRIVATE -Wno-unused-variable -Wno-unused-command-line-argument)
target_link_libraries(pycer PRIVATE cereal::cereal quill::quill cppzmq TracyClient Threads::Threads PkgConfig::gmp PkgConfig::gmpxx)
# target_link_libraries(pycer PRIVATE ${project_LIB})
install(TARGETS pycer DESTINATION .)
# Add stub generation using nanobind's built-in tool
nanobind_add_stub(
pycer_stub
MODULE pycer
OUTPUT pycer.pyi
MARKER_FILE py.typed
PYTHON_PATH $<TARGET_FILE_DIR:pycer>
DEPENDS pycer
)
# Add compiler and linker flags to pycore
target_compile_options(_pycore PRIVATE -Werror=old-style-cast -Wno-unused-variable -Wno-unused-command-line-argument)
# Install the module, stub, and marker file
install(TARGETS pycer LIBRARY DESTINATION . COMPONENT python)
install(FILES ${CMAKE_CURRENT_BINARY_DIR}/pycer.pyi
${CMAKE_CURRENT_BINARY_DIR}/py.typed
DESTINATION .
COMPONENT python)
else()

Replacement:

# Only build Python module if nanobind is available (pip builds)
if(nanobind_FOUND AND Python_FOUND)
message(STATUS "Building python target")
nanobind_add_module(pycer MODULE src/pybind/python_binding.cpp)
# Add compiler and linker flags to pycer (removed -Werror=old-style-cast due to Python/nanobind headers)
target_compile_options(pycer PRIVATE -Wno-unused-variable -Wno-unused-command-line-argument)
target_link_libraries(pycer PRIVATE cereal::cereal quill::quill cppzmq TracyClient Threads::Threads PkgConfig::gmp PkgConfig::gmpxx)
# target_link_libraries(pycer PRIVATE ${project_LIB})
# Add stub generation using nanobind's built-in tool
nanobind_add_stub(
pycer_stub
MODULE pycer
OUTPUT pycer.pyi
MARKER_FILE py.typed
PYTHON_PATH $<TARGET_FILE_DIR:pycer>
DEPENDS pycer
)
# Install the module, stub, and marker file
install(TARGETS pycer LIBRARY DESTINATION . COMPONENT python)
install(FILES ${CMAKE_CURRENT_BINARY_DIR}/pycer.pyi
${CMAKE_CURRENT_BINARY_DIR}/py.typed
DESTINATION .
COMPONENT python)
else()
🤖 Prompt for AI Agents
In CMakeLists.txt around lines 127 to 155, there is a duplicate install rule for
the pycer target: remove the earlier install(TARGETS pycer DESTINATION .) at
line 137 and keep the more specific install(TARGETS pycer LIBRARY DESTINATION .
COMPONENT python) at line 150; ensure only the component-scoped install remains
and leave the subsequent install(FILES ${CMAKE_CURRENT_BINARY_DIR}/pycer.pyi
${CMAKE_CURRENT_BINARY_DIR}/py.typed DESTINATION . COMPONENT python) unchanged.

Comment on lines +22 to 26
2. Install build tools:

```bash
- sudo apt install python3-pip -y
- pip install conan
+ sudo apt install -y git ninja-build build-essential pkg-config
```

⚠️ Potential issue | 🟡 Minor

Kitware apt repo instructions should ensure gnupg is installed (for gpg --dearmor).
On minimal images, gpg may be missing.

- sudo apt install -y git ninja-build build-essential pkg-config
+ sudo apt install -y git ninja-build build-essential pkg-config gnupg

Also applies to: 28-43

🤖 Prompt for AI Agents
In docs/users/INSTALLATION.md around lines 22 to 26 (and also apply the same
change to lines 28-43), the Kitware apt repo instructions assume gpg is present
for gpg --dearmor; on minimal images gpg/gnupg may be missing. Update the
installation steps to ensure gnupg is installed before any gpg --dearmor
commands—either add gnupg to the existing sudo apt install -y git ninja-build
build-essential pkg-config line or insert a preceding apt install step for gnupg
(and mention gnupg2 if appropriate), and update the Kitware repo section text to
reference this prerequisite.

Comment on lines +1 to +3
[build-system]
requires = ["scikit-build-core>=0.4.3", "nanobind>=2.0.0"]
build-backend = "scikit_build_core.build"

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

# First, check the full content of pyproject.toml to verify all mentioned line ranges
git ls-files pyproject.toml

Repository: CORE-cer/CORE

Length of output: 71


🏁 Script executed:

# Read the pyproject.toml file with line numbers
cat -n pyproject.toml | head -50

Repository: CORE-cer/CORE

Length of output: 1875


🌐 Web query:

scikit-build-core VCPKG_ROOT cmake toolchain environment variables pyproject.toml best practices

💡 Result:

Short best-practices summary (scikit-build-core + vcpkg + CMake)

  1. How to supply vcpkg to CMake
  • Use vcpkg’s CMake toolchain: <vcpkg-root>/scripts/buildsystems/vcpkg.cmake and set CMAKE_TOOLCHAIN_FILE so it’s evaluated before the first project() call. Use VCPKG_CHAINLOAD_TOOLCHAIN_FILE if you need to combine toolchains. [1][2]
  2. Prefer declarative (pyproject.toml) or CMakePresets, not hard-coded absolute paths
  • scikit-build-core prefers static settings in pyproject.toml (tool.scikit-build) when values are static; environment variables or --config-settings are alternatives for dynamic values. scikit-build-core exposes cmake.args / cmake.define and also respects SKBUILD_* env vars. [3][4]
  • For machine-specific vcpkg paths prefer CMakePresets.json + CMakeUserPresets.json (CMakeUserPresets is gitignored) so you can reference $env{VCPKG_ROOT} rather than committing absolute paths. Microsoft docs demonstrate using CMAKE_TOOLCHAIN_FILE = $env{VCPKG_ROOT}/scripts/buildsystems/vcpkg.cmake in presets. [1][2]
  3. Recommended configuration patterns
  • Use vcpkg manifest mode (vcpkg.json) where possible (vcpkg installs deps automatically); set the toolchain file once at configure time. [1]
  • Keep toolchain selection and triplets in environment (VCPKG_ROOT, VCPKG_DEFAULT_TRIPLET or custom triplet) or in user presets so builds are reproducible per-machine. [1][2]
  • Ensure any CMake variables that influence vcpkg (e.g., VCPKG_TARGET_TRIPLET, VCPKG_OVERLAY_TRIPLETS) are set before project() (i.e., via CMAKE_TOOLCHAIN_FILE or cache variables at configure). [1]
  4. scikit-build-core-specific tips
  • Put persistent/static CMake config in pyproject.toml under [tool.scikit-build] (cmake.args, cmake.define, build-dir, etc.). Runtime overrides: use pip/setuptools --config-settings or environment SKBUILD_CMAKE_ARGS / SKBUILD_BUILD_* as needed. [3][4]

  • To pass a toolchain file from pyproject.toml, set cmake.args or cmake.define (prefer define for flags you want overrideable); be cautious: args in toml are hard to override from CLI. [4]

  • Example minimal pyproject.toml (adapt path or prefer presets/user-presets):

    [tool.scikit-build]
    cmake.args = ["-DCMAKE_TOOLCHAIN_FILE=$env{VCPKG_ROOT}/scripts/buildsystems/vcpkg.cmake"]
    cmake.define = { VCPKG_TARGET_TRIPLET = "x64-windows" }

    (Or set SKBUILD_CMAKE_ARGS / CMAKE_TOOLCHAIN_FILE in the environment on CI/machines instead of hardcoding.) [3][1]

  5. CI and reproducibility
  • On CI, install or bootstrap vcpkg and set VCPKG_ROOT in the CI environment; use manifest mode in the project repo so CI runs vcpkg install automatically via the toolchain. Avoid committing machine-specific preset files. [1][2]
  6. Common pitfalls to avoid
  • Don’t set CMAKE_TOOLCHAIN_FILE after project() — the toolchain is processed early. [1][5]
  • Avoid embedding absolute local paths in pyproject.toml; use env vars or CMakeUserPresets for machine-specific values. [1][3]
  • Remember scikit-build-core will also respect CMAKE_ARGS / SKBUILD_CONFIGURE_OPTIONS; be explicit which mechanism you use to avoid surprises. [3][6]

References
[1] vcpkg CMake integration and examples (Microsoft docs).
[2] vcpkg build-system integration / presets guidance (Microsoft docs).
[3] scikit-build-core — Configuration (readthedocs).
[4] scikit-build-core — Getting started / configuration examples (readthedocs).
[5] CMAKE_TOOLCHAIN_FILE variable behavior (CMake docs).
[6] scikit-build (scikit-build repo/docs) — environment/config behavior and notes about CMAKE_ARGS.


Remove hardcoded paths, fix placeholder metadata, and align Python version with wheel ABI

Lines 38, 40 hardcode absolute local paths (/home/nicobuzeta/...) that will break on any other machine or CI environment. Lines 11–12 contain placeholder author/email, and line 27 has a placeholder homepage URL. Additionally, wheel.py-api = "cp312" (line 35) targets the Python 3.12+ stable ABI, while requires-python = ">=3.8" advertises support for 3.8–3.11 as well; users on those versions would install a wheel they cannot import. Resolve the mismatch by narrowing the supported versions or adjusting the wheel strategy.

Fix metadata placeholders:

  • Replace author name/email with actual maintainer info (lines 11–12)
  • Update homepage URL to the actual repository (line 27)

For VCPKG toolchain paths, use CMakeUserPresets.json (gitignored) instead of hardcoding in pyproject.toml:

# CMakeUserPresets.json (add to .gitignore)
{
  "version": 4,
  "configurePresets": [
    {
      "name": "default",
      "cacheVariables": {
        "CMAKE_TOOLCHAIN_FILE": "$env{VCPKG_ROOT}/scripts/buildsystems/vcpkg.cmake"
      }
    }
  ]
}

Then remove the absolute paths from cmake.args in pyproject.toml (lines 38, 40), or set VCPKG_ROOT environment variable on CI/build machines. For overlay triplets, use a relative path if the vcpkg-triplets directory is in the repository.

Clarify supported Python versions: Either add classifiers and pin Python wheels per version, or narrow requires-python to ">=3.12" if stable ABI is the intended strategy.
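To make the fix concrete, here is a minimal, hypothetical sketch of the relevant pyproject.toml fragments with the machine-specific parts removed (the names and values are illustrative, not the PR's actual content); the toolchain file is supplied through the environment at build time, which works because scikit-build-core also respects CMAKE_ARGS, as noted in the query result above:

    [project]
    requires-python = ">=3.12"  # matches the cp312 stable-ABI wheel below

    [tool.scikit-build]
    wheel.py-api = "cp312"
    # Relative, in-repo overlay path instead of an absolute /home/<user>/... path.
    cmake.define = { VCPKG_OVERLAY_TRIPLETS = "vcpkg-triplets" }

A build machine would then export something like CMAKE_ARGS="-DCMAKE_TOOLCHAIN_FILE=$VCPKG_ROOT/scripts/buildsystems/vcpkg.cmake" before running pip install ., with VCPKG_ROOT pointing at its local vcpkg checkout.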

🤖 Prompt for AI Agents
In pyproject.toml (address lines ~1–40, specifically replace placeholder
metadata at lines 11–12 and the homepage URL at line 27), remove the hardcoded
absolute VCPKG/CMake paths at lines 38 and 40 and stop pinning a mismatched
wheel ABI at line 35; replace the author/email placeholders with the real
maintainer name and contact, update the homepage to the repository URL, move
toolchain configuration into a gitignored CMakeUserPresets.json or rely on
VCPKG_ROOT env/relative triplet paths instead of absolute paths in cmake.args,
and resolve the Python ABI mismatch by either narrowing requires-python to
">=3.12" if you intend to use cp312 stable ABI or remove/change wheel.py-api so
it matches the published supported Python versions (alternatively add
per-version wheel build strategy and classifiers).

Comment on lines 80 to 131
function build() {
    conan build . --profile:host ${CONAN_PROFILE} --profile:build ${CONAN_PROFILE}\
        -s build_type=${BUILD_TYPE}\
        --build missing -o sanitizer=${SANITIZER} -o logging=${LOGGING} -o j=${J} -o profiling=${PROFILING}
    # Determine vcpkg triplet from compiler profile
    # Allow override via environment variable (for Python builds and CI)
    if [ -z "${VCPKG_TARGET_TRIPLET}" ]; then
        # Hardcode x64-linux for now (extensible to other architectures later)
        TRIPLET="x64-linux-${COMPILER_PROFILE}"
        export VCPKG_TARGET_TRIPLET="${TRIPLET}"
    else
        echo "Using externally specified triplet: ${VCPKG_TARGET_TRIPLET}"
        TRIPLET="${VCPKG_TARGET_TRIPLET}"
    fi

    echo -e "${GREEN}Building with compiler profile: ${COMPILER_PROFILE} (triplet: ${TRIPLET})${NORMAL_OUTPUT}"

    # Define overlay triplets path
    OVERLAY_TRIPLETS="$(pwd)/vcpkg-triplets"

    # Define CMake flags
    CMAKE_FLAGS="-DCMAKE_BUILD_TYPE=${BUILD_TYPE}"
    CMAKE_FLAGS="${CMAKE_FLAGS} -DCMAKE_TOOLCHAIN_FILE=vcpkg/scripts/buildsystems/vcpkg.cmake"
    CMAKE_FLAGS="${CMAKE_FLAGS} -DVCPKG_TARGET_TRIPLET=${TRIPLET}"
    CMAKE_FLAGS="${CMAKE_FLAGS} -DVCPKG_OVERLAY_TRIPLETS=${OVERLAY_TRIPLETS}"

    # C23 workaround for GMP/M4 autotools packages
    export CFLAGS="-std=gnu17"

    if [ "${SANITIZER}" == "address" ]; then
        CMAKE_FLAGS="${CMAKE_FLAGS} -DADDRESS_SANITIZER=ON"
    elif [ "${SANITIZER}" == "thread" ]; then
        CMAKE_FLAGS="${CMAKE_FLAGS} -DTHREAD_SANITIZER=ON"
    fi

    # Logging
    CMAKE_FLAGS="${CMAKE_FLAGS} -DLOGGING=${LOGGING}"

    # Profiling
    if [ "${PROFILING}" == "on" ]; then
        CMAKE_FLAGS="${CMAKE_FLAGS} -DPROFILING=ON"
    fi

    # Configure
    cmake -B build -S . -G Ninja ${CMAKE_FLAGS}

    # Build
    if [ "${J}" == "all-1" ]; then
        # Get number of cores - 1
        CORES=$(nproc)
        CORES=$((CORES-1))
        cmake --build build -j${CORES}
    else
        cmake --build build -j${J}
    fi

⚠️ Potential issue | 🟠 Major

Fix -j all-1 edge case + don’t clobber user CFLAGS
nproc can be 1 → CORES=0; also export CFLAGS="-std=gnu17" overwrites user flags.

-  export CFLAGS="-std=gnu17"
+  export CFLAGS="${CFLAGS:+$CFLAGS }-std=gnu17"
@@
-  if [ "${SANITIZER}" == "address" ]; then
+  if [[ "${SANITIZER}" == "address" ]]; then
       CMAKE_FLAGS="${CMAKE_FLAGS} -DADDRESS_SANITIZER=ON"
-  elif [ "${SANITIZER}" == "thread" ]; then
+  elif [[ "${SANITIZER}" == "thread" ]]; then
       CMAKE_FLAGS="${CMAKE_FLAGS} -DTHREAD_SANITIZER=ON"
   fi
@@
-  if [ "${PROFILING}" == "on" ]; then
+  if [[ "${PROFILING}" == "on" ]]; then
       CMAKE_FLAGS="${CMAKE_FLAGS} -DPROFILING=ON"
   fi
@@
-  if [ "${J}" == "all-1" ]; then
+  if [[ "${J}" == "all-1" ]]; then
       # Get number of cores - 1
       CORES=$(nproc)
       CORES=$((CORES-1))
+      if (( CORES < 1 )); then CORES=1; fi
       cmake --build build -j${CORES}
   else
       cmake --build build -j${J}
   fi
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
function build() {
    conan build . --profile:host ${CONAN_PROFILE} --profile:build ${CONAN_PROFILE}\
        -s build_type=${BUILD_TYPE}\
        --build missing -o sanitizer=${SANITIZER} -o logging=${LOGGING} -o j=${J} -o profiling=${PROFILING}
    # Determine vcpkg triplet from compiler profile
    # Allow override via environment variable (for Python builds and CI)
    if [ -z "${VCPKG_TARGET_TRIPLET}" ]; then
        # Hardcode x64-linux for now (extensible to other architectures later)
        TRIPLET="x64-linux-${COMPILER_PROFILE}"
        export VCPKG_TARGET_TRIPLET="${TRIPLET}"
    else
        echo "Using externally specified triplet: ${VCPKG_TARGET_TRIPLET}"
        TRIPLET="${VCPKG_TARGET_TRIPLET}"
    fi
    echo -e "${GREEN}Building with compiler profile: ${COMPILER_PROFILE} (triplet: ${TRIPLET})${NORMAL_OUTPUT}"
    # Define overlay triplets path
    OVERLAY_TRIPLETS="$(pwd)/vcpkg-triplets"
    # Define CMake flags
    CMAKE_FLAGS="-DCMAKE_BUILD_TYPE=${BUILD_TYPE}"
    CMAKE_FLAGS="${CMAKE_FLAGS} -DCMAKE_TOOLCHAIN_FILE=vcpkg/scripts/buildsystems/vcpkg.cmake"
    CMAKE_FLAGS="${CMAKE_FLAGS} -DVCPKG_TARGET_TRIPLET=${TRIPLET}"
    CMAKE_FLAGS="${CMAKE_FLAGS} -DVCPKG_OVERLAY_TRIPLETS=${OVERLAY_TRIPLETS}"
    # C23 workaround for GMP/M4 autotools packages
    export CFLAGS="-std=gnu17"
    if [ "${SANITIZER}" == "address" ]; then
        CMAKE_FLAGS="${CMAKE_FLAGS} -DADDRESS_SANITIZER=ON"
    elif [ "${SANITIZER}" == "thread" ]; then
        CMAKE_FLAGS="${CMAKE_FLAGS} -DTHREAD_SANITIZER=ON"
    fi
    # Logging
    CMAKE_FLAGS="${CMAKE_FLAGS} -DLOGGING=${LOGGING}"
    # Profiling
    if [ "${PROFILING}" == "on" ]; then
        CMAKE_FLAGS="${CMAKE_FLAGS} -DPROFILING=ON"
    fi
    # Configure
    cmake -B build -S . -G Ninja ${CMAKE_FLAGS}
    # Build
    if [ "${J}" == "all-1" ]; then
        # Get number of cores - 1
        CORES=$(nproc)
        CORES=$((CORES-1))
        cmake --build build -j${CORES}
    else
        cmake --build build -j${J}
    fi

function build() {
    # Determine vcpkg triplet from compiler profile
    # Allow override via environment variable (for Python builds and CI)
    if [ -z "${VCPKG_TARGET_TRIPLET}" ]; then
        # Hardcode x64-linux for now (extensible to other architectures later)
        TRIPLET="x64-linux-${COMPILER_PROFILE}"
        export VCPKG_TARGET_TRIPLET="${TRIPLET}"
    else
        echo "Using externally specified triplet: ${VCPKG_TARGET_TRIPLET}"
        TRIPLET="${VCPKG_TARGET_TRIPLET}"
    fi
    echo -e "${GREEN}Building with compiler profile: ${COMPILER_PROFILE} (triplet: ${TRIPLET})${NORMAL_OUTPUT}"
    # Define overlay triplets path
    OVERLAY_TRIPLETS="$(pwd)/vcpkg-triplets"
    # Define CMake flags
    CMAKE_FLAGS="-DCMAKE_BUILD_TYPE=${BUILD_TYPE}"
    CMAKE_FLAGS="${CMAKE_FLAGS} -DCMAKE_TOOLCHAIN_FILE=vcpkg/scripts/buildsystems/vcpkg.cmake"
    CMAKE_FLAGS="${CMAKE_FLAGS} -DVCPKG_TARGET_TRIPLET=${TRIPLET}"
    CMAKE_FLAGS="${CMAKE_FLAGS} -DVCPKG_OVERLAY_TRIPLETS=${OVERLAY_TRIPLETS}"
    # C23 workaround for GMP/M4 autotools packages
    export CFLAGS="${CFLAGS:+$CFLAGS }-std=gnu17"
    if [[ "${SANITIZER}" == "address" ]]; then
        CMAKE_FLAGS="${CMAKE_FLAGS} -DADDRESS_SANITIZER=ON"
    elif [[ "${SANITIZER}" == "thread" ]]; then
        CMAKE_FLAGS="${CMAKE_FLAGS} -DTHREAD_SANITIZER=ON"
    fi
    # Logging
    CMAKE_FLAGS="${CMAKE_FLAGS} -DLOGGING=${LOGGING}"
    # Profiling
    if [[ "${PROFILING}" == "on" ]]; then
        CMAKE_FLAGS="${CMAKE_FLAGS} -DPROFILING=ON"
    fi
    # Configure
    cmake -B build -S . -G Ninja ${CMAKE_FLAGS}
    # Build
    if [[ "${J}" == "all-1" ]]; then
        # Get number of cores - 1
        CORES=$(nproc)
        CORES=$((CORES-1))
        if (( CORES < 1 )); then CORES=1; fi
        cmake --build build -j${CORES}
    else
        cmake --build build -j${J}
    fi
🧰 Tools
🪛 GitHub Check: SonarCloud Code Analysis

[failure] 108-108: Use '[[' instead of '[' for conditional tests. The '[[' construct is safer and more feature-rich.

See more on https://sonarcloud.io/project/issues?id=CORE-cer_CORE&issues=AZsT1bGaTl4z2CG7OjhK&open=AZsT1bGaTl4z2CG7OjhK&pullRequest=166


[failure] 83-83: Use '[[' instead of '[' for conditional tests. The '[[' construct is safer and more feature-rich.

See more on https://sonarcloud.io/project/issues?id=CORE-cer_CORE&issues=AZsT1bGaTl4z2CG7OjhI&open=AZsT1bGaTl4z2CG7OjhI&pullRequest=166


[failure] 124-124: Use '[[' instead of '[' for conditional tests. The '[[' construct is safer and more feature-rich.

See more on https://sonarcloud.io/project/issues?id=CORE-cer_CORE&issues=AZsT1bGaTl4z2CG7OjhM&open=AZsT1bGaTl4z2CG7OjhM&pullRequest=166


[failure] 116-116: Use '[[' instead of '[' for conditional tests. The '[[' construct is safer and more feature-rich.

See more on https://sonarcloud.io/project/issues?id=CORE-cer_CORE&issues=AZsT1bGaTl4z2CG7OjhL&open=AZsT1bGaTl4z2CG7OjhL&pullRequest=166


[failure] 106-106: Use '[[' instead of '[' for conditional tests. The '[[' construct is safer and more feature-rich.

See more on https://sonarcloud.io/project/issues?id=CORE-cer_CORE&issues=AZsT1bGaTl4z2CG7OjhJ&open=AZsT1bGaTl4z2CG7OjhJ&pullRequest=166

🤖 Prompt for AI Agents
In scripts/common.sh around lines 80 to 131, the build function can set CORES to
0 when nproc returns 1 and it clobbers any user-provided CFLAGS by exporting a
fixed value; change the CORES calculation to ensure at least 1 (e.g., compute
CORES=$(nproc) then if CORES>1 set CORES=CORES-1 else CORES=1) so -j never gets
0, and change the CFLAGS export to append the workaround flag instead of
overwriting (e.g., export CFLAGS="${CFLAGS} -std=gnu17" with care to handle
empty CFLAGS).

Comment on lines +5 to +48
def create_script(grammar_name, antlr4_version):
    script = f"""#!/bin/bash

    #################################################################
    # Auto-generated by scripts/generate_antlr.py                   #
    # DO NOT MODIFY DIRECTLY, modify it inside the python script.   #
    #################################################################

    RED='\\033[0;31m'
    GREEN='\\033[0;32m'
    NORMAL_OUTPUT='\\033[0m'

    # Work at the scripts directory
    cd "$(dirname "$0")"

    export ANTLR_HOME="~/.local/bin"
    export ANTLR_JAR="$ANTLR_HOME/antlr-VERSION-complete.jar"
    export CLASSPATH=".:$ANTLR_JAR:$CLASSPATH"
    alias antlr4="java -jar $ANTLR_JAR"
    alias grun="java org.antlr.v4.gui.TestRig"

    on_error() {{
        echo -e "${{RED}}An error occurred, script has aborted.$NORMAL_OUTPUT"
    }}

    set -e
    trap on_error ERR

    echo -e "${{GREEN}}Installing antlr-{antlr4_version}" if it is not installed already.$NORMAL_OUTPUT
    mkdir -p ~/.local/bin
    cd ~/.local/bin && [ ! -f "antlr-{antlr4_version}-complete.jar" ] && wget https://www.antlr.org/download/antlr-{antlr4_version}-complete.jar
    cd -

    echo -e "${{GREEN}}Generating antlr files.$NORMAL_OUTPUT"

    if [ -d "autogenerated" ]; then
        rm -r "autogenerated"
    fi
    mkdir -p autogenerated
    java -jar ~/.local/bin/antlr-VERSION-complete.jar -o autogenerated -no-listener -visitor -Dlanguage=Cpp {grammar_name}Lexer.g4
    java -jar ~/.local/bin/antlr-VERSION-complete.jar -o autogenerated -no-listener -visitor -Dlanguage=Cpp {grammar_name}Parser.g4
    echo -e "${{GREEN}}Fishing the auto-generation of {grammar_name} successfully.$NORMAL_OUTPUT"
    """
    return script.replace("VERSION", antlr4_version)

⚠️ Potential issue | 🔴 Critical

Generated bash script is currently broken (malformed echo, bad ~ expansion, alias usage).
As-is, autogenerate_script.sh will likely fail.

 def create_script(grammar_name, antlr4_version):
     script = f"""#!/bin/bash
@@
-    export ANTLR_HOME="~/.local/bin"
-    export ANTLR_JAR="$ANTLR_HOME/antlr-VERSION-complete.jar"
+    export ANTLR_HOME="$HOME/.local/bin"
+    export ANTLR_JAR="$ANTLR_HOME/antlr-VERSION-complete.jar"
     export CLASSPATH=".:$ANTLR_JAR:$CLASSPATH"
-    alias antlr4="java -jar $ANTLR_JAR"
-    alias grun="java org.antlr.v4.gui.TestRig"
+    # Avoid aliases (not expanded in non-interactive shells unless enabled)
+    antlr4() {{ java -jar "$ANTLR_JAR" "$@"; }}
+    grun() {{ java org.antlr.v4.gui.TestRig "$@"; }}
@@
-    echo -e "${{GREEN}}Installing antlr-{antlr4_version}" if it is not installed already.$NORMAL_OUTPUT
-    mkdir -p ~/.local/bin
-    cd ~/.local/bin && [ ! -f "antlr-{antlr4_version}-complete.jar" ] && wget https://www.antlr.org/download/antlr-{antlr4_version}-complete.jar
-    cd -
+    echo -e "${{GREEN}}Installing antlr-{antlr4_version} (if needed).${{NORMAL_OUTPUT}}"
+    mkdir -p "$ANTLR_HOME"
+    if [ ! -f "$ANTLR_HOME/antlr-{antlr4_version}-complete.jar" ]; then
+        wget -O "$ANTLR_HOME/antlr-{antlr4_version}-complete.jar" "https://www.antlr.org/download/antlr-{antlr4_version}-complete.jar"
+    fi
@@
-    java -jar ~/.local/bin/antlr-VERSION-complete.jar -o autogenerated -no-listener -visitor -Dlanguage=Cpp {grammar_name}Lexer.g4
-    java -jar ~/.local/bin/antlr-VERSION-complete.jar -o autogenerated -no-listener -visitor -Dlanguage=Cpp {grammar_name}Parser.g4
-    echo -e "${{GREEN}}Fishing the auto-generation of {grammar_name} successfully.$NORMAL_OUTPUT"
+    java -jar "$ANTLR_JAR" -o autogenerated -no-listener -visitor -Dlanguage=Cpp {grammar_name}Lexer.g4
+    java -jar "$ANTLR_JAR" -o autogenerated -no-listener -visitor -Dlanguage=Cpp {grammar_name}Parser.g4
+    echo -e "${{GREEN}}Finishing the auto-generation of {grammar_name} successfully.${{NORMAL_OUTPUT}}"
     """
     return script.replace("VERSION", antlr4_version)
🤖 Prompt for AI Agents
In scripts/generate_antlr.py around lines 5 to 48, the generated bash script has
multiple issues: the echo line contains an inline "if" fragment making it
syntactically invalid, ANTLR_HOME uses an unexpanded "~" which won't expand
reliably in scripts, aliases shouldn't be used in non-interactive scripts, and
ANTLR_JAR and invocations use the literal "VERSION" rather than the resolved
version. Fix by: 1) correct the echo to a single valid echo statement without
the stray "if", 2) set ANTLR_HOME to "$HOME/.local/bin" (or expand with
os.path.expanduser in Python) and build ANTLR_JAR using the supplied version
variable, 3) remove alias lines and invoke java -jar "$ANTLR_JAR" explicitly,
and 4) update the wget/check to use the versioned filename and call java -jar
with the same versioned jar when generating the lexer/parser; ensure all
variables are quoted and file checks use the correct names.

Comment on lines 1 to 21
#!/bin/bash

# Work at the root directory
# Should have conanfile.py present there.
cd "$(dirname "$0")"
cd ..

# Include common
source scripts/common.sh
_setArgs "$@"

conan source .

# Install vcpkg if not present
if [ ! -d "vcpkg" ]; then
echo "Cloning vcpkg..."
git clone https://github.com/microsoft/vcpkg.git
./vcpkg/bootstrap-vcpkg.sh
fi

# Generate ANTLR scripts
python3 scripts/generate_antlr.py


⚠️ Potential issue | 🟠 Major

❓ Verification inconclusive

🌐 Web query:

vcpkg bootstrap binary pinning reproducible builds best practices

💡 Result:

Recommended best practices (concise)

  1. Pin the vcpkg tool and registry
  • Check out a vcpkg release commit (or use the git submodule) and run bootstrap from that exact commit so the tool + builtin ports are fixed. [1][3]
  2. Use manifest builtin-baseline and lockfiles
  • Set "builtin-baseline" in vcpkg.json to the exact registry commit/commit-hash you used. This locks port versions/recipes. [1][3]
  3. Pin external tools used by vcpkg
  • Use --x-abi-tools-use-exact-versions to force exact tool versions recorded in vcpkgTools.xml (prevents ABI churn when CI runners or hosts update tools). Also pin CMake/compilers in your CI image. [4][2]
  4. Enable and publish binary caches from CI (per-port caches)
  • Produce and push vcpkg binary packages from your CI to a NuGet/GitHub Packages feed (or file cache) and configure consumers to read that feed. This avoids rebuilding and preserves per-port granularity. Prefer vcpkg’s NuGet-style cache over single-tree artifacts for reproducibility. [1][5]
  5. Ensure correct permissions and cache configuration
  • Configure cache destinations with appropriate read/write permissions and verify uploads/restores in CI logs. Use troubleshooting guidance when restores rebuild unexpectedly. [2]
  6. Make ABI hashes deterministic
  • Avoid pulling “latest” tool installs at build time. Use fixed base images or vendored tool versions so ABI metadata (abi_info) is stable across builds. If you must update vcpkg, update builtin-baseline and rebuild/publish binaries. [2][3]
  7. Provide reproducible build artifacts
  • For releases, use vcpkg export or publish the binary cache snapshots alongside your release so others can reproduce exact binaries without rebuilding from source. [1]
  8. CI housekeeping and verification
  • Rebuild and republish binaries on a controlled cadence when you intentionally update baselines/tools; add smoke tests to validate restored binaries match expected ABI. [2]

Quick troubleshooting tips

  • If packages rebuild unexpectedly: check ABI hash differences (tool versions, CMake, compiler), verify builtin-baseline usage, confirm cache read/write permissions, and consider --x-abi-tools-use-exact-versions. [2][3]

Sources
[1] vcpkg Binary Caching docs (Microsoft Learn). (learn.microsoft.com)
[2] vcpkg Binary caching troubleshooting (Microsoft Learn). (learn.microsoft.com)
[3] Issue: builtin-baseline and abi cache effects (microsoft/vcpkg GitHub). (github.com)
[4] vcpkg “What’s New” (June 2025) — guidance on caching providers and pinning tools. (devblogs.microsoft.com)

If you want, I can convert this into a short CI checklist or sample GitHub Actions workflow that implements these practices.
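As a concrete illustration of items 1–2 above, here is a hedged sketch of a pinned vcpkg.json; the baseline hash is a placeholder and the dependency list is abbreviated, since the project's actual manifest declares more ports:

{
  "name": "core",
  "builtin-baseline": "0123456789abcdef0123456789abcdef01234567",
  "dependencies": [
    "cereal",
    "cppzmq",
    "gmp"
  ]
}

Pairing this with checking out the same vcpkg commit before bootstrapping (for example, git -C vcpkg checkout <commit>) keeps the tool and the port recipes in lockstep.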



Script executed:

# Check if generate_antlr.py exists and what it does
find . -name "generate_antlr.py" -type f | head -5

Repository: CORE-cer/CORE


Repository: CORE-cer/CORE
Exit code: 0

stdout:

./scripts/generate_antlr.py

Script executed:

# Look for any CI configuration that may handle vcpkg pinning
find . -type f \( -name "*.yml" -o -name "*.yaml" -o -name ".devcontainer*" -o -name "Dockerfile*" \) | head -20

Repository: CORE-cer/CORE


Repository: CORE-cer/CORE
Exit code: 0

stdout:

./.devcontainer/Dockerfile
./.github/workflows/docker-checks.yaml
./.github/workflows/docker-push.yaml
./.github/workflows/benchmark.yaml
./.github/workflows/checks.yaml
./.github/workflows/test-checks.yaml
./.github/workflows/valgrind-checks.yaml
./docker-compose.yml
./Dockerfile

Script executed:

head -30 scripts/generate_antlr.py

Repository: CORE-cer/CORE


Repository: CORE-cer/CORE
Exit code: 0

stdout:

import os
from os import path
import sys

def create_script(grammar_name, antlr4_version):
    script = f"""#!/bin/bash

    #################################################################
    # Auto-generated by scripts/generate_antlr.py                   #
    # DO NOT MODIFY DIRECTLY, modify it inside the python script.   #
    #################################################################

    RED='\\033[0;31m'
    GREEN='\\033[0;32m'
    NORMAL_OUTPUT='\\033[0m'

    # Work at the scripts directory
    cd "$(dirname "$0")"

    export ANTLR_HOME="~/.local/bin"
    export ANTLR_JAR="$ANTLR_HOME/antlr-VERSION-complete.jar"
    export CLASSPATH=".:$ANTLR_JAR:$CLASSPATH"
    alias antlr4="java -jar $ANTLR_JAR"
    alias grun="java org.antlr.v4.gui.TestRig"

    on_error() {{
        echo -e "${{RED}}An error occurred, script has aborted.$NORMAL_OUTPUT"
    }}

    set -e

Script executed:

head -100 .github/workflows/checks.yaml

Repository: CORE-cer/CORE


Repository: CORE-cer/CORE
Exit code: 0

stdout:

name: Common Checks

on:
  pull_request:
    branches:
      - main

concurrency:
  group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}
  cancel-in-progress: true

jobs:
  check-ceql-query-grammar-up-to-date:
    runs-on: ubuntu-latest
    steps:
      - name: Check out repository code
        uses: actions/checkout@v3

      - name: ccache
        uses: hendrikmuhs/ccache-action@v1.2
        with:
          key: ${{ github.job }}-${{ matrix.os }}
          create-symlink: true

      - name: Install LLVM and Clang
        uses: KyleMayes/install-llvm-action@v2
        with:
          version: "19.1.7"

      - name: Install libtinfo5 for clang
        run: |
          sudo apt update
          wget http://security.ubuntu.com/ubuntu/pool/universe/n/ncurses/libtinfo5_6.3-2ubuntu0.1_amd64.deb
          sudo apt install ./libtinfo5_6.3-2ubuntu0.1_amd64.deb

      - name: Set up GCC
        run: |
          sudo apt install gcc-14 g++-14
          sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-14 60 --slave /usr/bin/g++ g++ /usr/bin/g++-14 && sudo update-alternatives --config gcc

      - name: Check versions
        run: |
          clang --version
          clang-format --version
          gcc -v

      - name: install dependencies
        run: chmod +x ./scripts/install_dependencies.sh && ./scripts/install_dependencies.sh

      - name: base build
        run: chmod +x ./scripts/build.sh && ./scripts/build.sh

      - name: check if grammar autogenerated is up to date
        run: |
          FOLDER_TO_CHECK="./src/core_server/internal/parsing/ceql_query/autogenerated/"
          INITIAL_CHECKSUM=$(find "$FOLDER_TO_CHECK" -type f -exec md5sum {} \; | sort -k 2 | md5sum)

          # generate grammar
          chmod +x ./scripts/build_grammar.sh
          ./scripts/build_grammar.sh

          # Wait just in case
          sleep 2

          FINAL_CHECKSUM=$(find "$FOLDER_TO_CHECK" -type f -exec md5sum {} \; | sort -k 2 | md5sum)

          if [ "$INITIAL_CHECKSUM" != "$FINAL_CHECKSUM" ]; then
          echo "Error: $FOLDER_TO_CHECK changed after compiling."
          exit 1
          else
          echo "$FOLDER_TO_CHECK has not changed."
          fi
  check-code-format:
    runs-on: ubuntu-latest
    steps:
      - name: Check out repository code
        uses: actions/checkout@v3

      - name: ccache
        uses: hendrikmuhs/ccache-action@v1.2
        with:
          key: ${{ github.job }}-${{ matrix.os }}
          create-symlink: true

      - name: Install LLVM and Clang
        uses: KyleMayes/install-llvm-action@v2
        with:
          version: "19.1.7"

      - name: Install libtinfo5 for clang
        run: |
          sudo apt update
          wget http://security.ubuntu.com/ubuntu/pool/universe/n/ncurses/libtinfo5_6.3-2ubuntu0.1_amd64.deb
          sudo apt install ./libtinfo5_6.3-2ubuntu0.1_amd64.deb

      - name: Set up GCC
        run: |
          sudo apt install gcc-14 g++-14
          sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-14 60 --slave /usr/bin/g++ g++ /usr/bin/g++-14 && sudo update-alternatives --config gcc


Script executed:

# Check if bootstrap creates vcpkg/vcpkg binary
find . -name "vcpkg" -type f 2>/dev/null | head -10

Repository: CORE-cer/CORE


Repository: CORE-cer/CORE
Exit code: 0


Make vcpkg bootstrap idempotent and fail fast

If vcpkg/ directory exists but wasn't successfully bootstrapped, the script silently skips bootstrap and fails later with cryptic errors. Add strict mode and check for the bootstrap binary:

 #!/bin/bash
+set -euo pipefail
 
 # Work at the root directory
-cd "$(dirname "$0")"
+cd "$(dirname "$0")" || exit 1
 cd ..
 
 # Include common
 source scripts/common.sh
 _setArgs "$@"
 
 
 # Install vcpkg if not present
-if [ ! -d "vcpkg" ]; then
+if [[ ! -d "vcpkg" ]]; then
     echo "Cloning vcpkg..."
     git clone https://github.com/microsoft/vcpkg.git
-    ./vcpkg/bootstrap-vcpkg.sh
 fi
+
+if [[ ! -x "vcpkg/vcpkg" ]]; then
+  ./vcpkg/bootstrap-vcpkg.sh
+fi

Also consider pinning vcpkg to a specific commit (e.g., via git submodule or environment variable) for reproducible builds rather than always cloning main.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
#!/bin/bash

# Work at the root directory
# Should have conanfile.py present there.
cd "$(dirname "$0")"
cd ..

# Include common
source scripts/common.sh
_setArgs "$@"

conan source .

# Install vcpkg if not present
if [ ! -d "vcpkg" ]; then
    echo "Cloning vcpkg..."
    git clone https://github.com/microsoft/vcpkg.git
    ./vcpkg/bootstrap-vcpkg.sh
fi

# Generate ANTLR scripts
python3 scripts/generate_antlr.py

#!/bin/bash
set -euo pipefail

# Work at the root directory
cd "$(dirname "$0")" || exit 1
cd ..

# Include common
source scripts/common.sh
_setArgs "$@"

# Install vcpkg if not present
if [[ ! -d "vcpkg" ]]; then
    echo "Cloning vcpkg..."
    git clone https://github.com/microsoft/vcpkg.git
fi

if [[ ! -x "vcpkg/vcpkg" ]]; then
    ./vcpkg/bootstrap-vcpkg.sh
fi

# Generate ANTLR scripts
python3 scripts/generate_antlr.py
🧰 Tools
🪛 GitHub Check: SonarCloud Code Analysis

[failure] 13-13: Use '[[' instead of '[' for conditional tests. The '[[' construct is safer and more feature-rich.

See more on https://sonarcloud.io/project/issues?id=CORE-cer_CORE&issues=AZsT1bGvTl4z2CG7OjhN&open=AZsT1bGvTl4z2CG7OjhN&pullRequest=166

🪛 Shellcheck (0.11.0)

[warning] 4-4: Use 'cd ... || exit' or 'cd ... || return' in case cd fails.

(SC2164)

🤖 Prompt for AI Agents
In scripts/install_dependencies.sh lines 1-21, the script neither runs in strict
mode nor verifies that vcpkg was successfully bootstrapped, so an existing
vcpkg/ dir that lacks the vcpkg binary will be skipped and cause opaque failures
later; enable strict/fail-fast (e.g., set -euo pipefail) at the top, change the
bootstrap condition to check for the vcpkg executable (e.g., test -x
vcpkg/vcpkg) and run bootstrap if the binary is missing (or re-run bootstrap if
necessary), and after cloning ensure bootstrap completed successfully (fail with
a clear message if not); additionally, make cloning/pinning reproducible by
adding an option to clone a specific commit/branch or recommend using a git
submodule or env var to pin the vcpkg revision.

Comment on lines +207 to 243
void clean_or_expand_evaluator_storage(uint64_t current_time) {
  ZoneScopedN("Interface::DynamicEvaluator::clean_or_expand_evaluator_storage");
  size_t initial_size = evaluators_storage.size();
  // First try to clean empty evaluators
  for (auto it = evaluators_storage.rbegin(); it != evaluators_storage.rend();) {
    assert(it->evaluator != nullptr && "Evaluator pointer should not be null");
    // It should be one since it is only copied in this same thread during usage, but after this runs
    assert(it->evaluator.use_count() == 1 && "Shared pointer use count should be 1.");

    if (it->evaluator->is_time_window_empty(current_time)) {
      // Remove node from map
      value_to_evaluator_list_pos.erase(it->evaluator_idx);
      // Remove from list - advance iterator before erasing
      auto to_erase = std::next(it).base();
      it = std::make_reverse_iterator(evaluators_storage.erase(to_erase));
    } else {
      ++it;
    }
  }

  if (initial_size != evaluators_storage.size()) {
    LOG_TRACE_L1(this->logger,
                 "Cleaned up {} evaluators that were out of time window.",
                 initial_size - evaluators_storage.size());
  } else {
    LOG_WARNING(this->logger,
                "Number of evaluators ({}) exceeded maximum allowed ({}). "
                "Doubling current size to {}",
                evaluators_storage.size(),
                max_evaluators,
                max_evaluators * 2);
    max_evaluators *= 2;
  }

  assert(evaluators_storage.size() <= max_evaluators
         && "Number of evaluators should be within the maximum allowed after cleanup");
}

⚠️ Potential issue | 🟠 Major

Cleanup time monotonicity assumption can break across partitions/time sources (assert/logic risk).

clean_or_expand_evaluator_storage(current_time) calls it->evaluator->is_time_window_empty(current_time) (Line 216), but current_time is derived from the current event, not necessarily a global monotonically increasing clock for all evaluators (e.g., Within::ATTRIBUTE, or if out-of-order events are tolerated/dropped).

Suggestion: track a monotonic “global_time” in DynamicEvaluator and use that for cleanup:

  • add uint64_t last_seen_time = 0;
  • in process_event: last_seen_time = std::max(last_seen_time, time);
  • pass last_seen_time into clean_or_expand_evaluator_storage(...) / is_time_window_empty(...)

Also: the use_count()==1 assert (Line 214) is very brittle; I’d drop it or relax it to >= 1.
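A minimal sketch of that suggestion, using hypothetical member and method names (last_seen_time, process_event) rather than the real DynamicEvaluator internals:

#include <algorithm>
#include <cstdint>

class DynamicEvaluatorSketch {
  // Monotonic high-water mark over all event timestamps seen so far.
  uint64_t last_seen_time = 0;

 public:
  void process_event(uint64_t event_time) {
    // Per-partition or out-of-order timestamps may go backwards;
    // the high-water mark never does.
    last_seen_time = std::max(last_seen_time, event_time);
    // ... evaluator dispatch elided ...
    clean_or_expand_evaluator_storage();
  }

 private:
  void clean_or_expand_evaluator_storage() {
    // Window-expiry checks would use the monotonic clock instead of the
    // (possibly non-monotonic) timestamp of the triggering event:
    // if (entry.evaluator->is_time_window_empty(last_seen_time)) { ... }
  }
};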

🤖 Prompt for AI Agents
In
src/core_server/internal/interface/modules/query/evaluators/dynamic_evaluator.hpp
around lines 207-243, the cleanup uses the per-event current_time which can be
non-monotonic across partitions and the assert use_count()==1 is brittle; add a
member uint64_t last_seen_time = 0 to DynamicEvaluator, update process_event to
set last_seen_time = std::max(last_seen_time, event_time), modify
clean_or_expand_evaluator_storage to accept/use last_seen_time (or read the
member) and call is_time_window_empty(last_seen_time) instead of the per-event
time, and relax/remove the assert so it checks use_count() >= 1 (or drop it
entirely); ensure related call sites/signatures are updated accordingly.

Comment on lines +84 to +169
+        nb::class_<Types::Value>(m, "PyValue"); // NOLINT

-        py::class_<Types::IntValue, Types::Value, std::shared_ptr<Types::IntValue>>(m, "PyIntValue")
-            .def(py::init<int64_t>());
+        nb::class_<Types::IntValue, Types::Value>(m, "PyIntValue")
+            .def(nb::init<int64_t>());

-        py::class_<Types::StringValue, Types::Value, std::shared_ptr<Types::StringValue>>(m, "PyStringValue")
-            .def(py::init<std::string>());
+        nb::class_<Types::StringValue, Types::Value>(m, "PyStringValue")
+            .def(nb::init<std::string>());

-        py::class_<Types::DoubleValue, Types::Value, std::shared_ptr<Types::DoubleValue>>(m, "PyDoubleValue")
-            .def(py::init<double>());
+        nb::class_<Types::DoubleValue, Types::Value>(m, "PyDoubleValue")
+            .def(nb::init<double>());

-        py::class_<Types::BoolValue, Types::Value, std::shared_ptr<Types::BoolValue>>(m, "PyBoolValue")
-            .def(py::init<bool>());
+        nb::class_<Types::BoolValue, Types::Value>(m, "PyBoolValue")
+            .def(nb::init<bool>());

-        py::class_<Types::DateValue, Types::Value, std::shared_ptr<Types::DateValue>>(m, "PyDateValue")
-            .def(py::init<std::time_t>());
+        nb::class_<Types::DateValue, Types::Value>(m, "PyDateValue")
+            .def(nb::init<std::time_t>());

-        py::class_<Types::AttributeInfo>(m, "PyAttributeInfo")
-            .def(py::init<std::string, Types::ValueTypes>())
-            .def_readonly("name", &Types::AttributeInfo::name)
-            .def_readonly("value_type", &Types::AttributeInfo::value_type);
+        nb::class_<Types::AttributeInfo>(m, "PyAttributeInfo")
+            .def(nb::init<std::string, Types::ValueTypes>())
+            .def_ro("name", &Types::AttributeInfo::name)
+            .def_ro("value_type", &Types::AttributeInfo::value_type);

-        py::class_<Types::Event, std::shared_ptr<Types::Event>>(m, "PyEvent")
-            .def(py::init<uint64_t, std::vector<std::shared_ptr<Types::Value>>>())
-            .def(py::init<uint64_t, std::vector<std::shared_ptr<Types::Value>>, Types::IntValue>());
+        nb::class_<Types::Event>(m, "PyEvent")
+            .def(nb::init<uint64_t, std::vector<std::shared_ptr<Types::Value>>>())
+            .def(nb::init<uint64_t, std::vector<std::shared_ptr<Types::Value>>, Types::IntValue>());

-        py::class_<Types::EventInfoParsed>(m, "PyEventInfoParsed")
-            .def(py::init<std::string, std::vector<Types::AttributeInfo>>())
-            .def_readonly("name", &Types::EventInfoParsed::name)
-            .def_readonly("attributes", &Types::EventInfoParsed::attributes_info);
-
-        py::class_<Types::EventInfo>(m, "PyEventInfo")
-            .def(py::init<uint64_t, std::string, std::vector<Types::AttributeInfo>>())
-            .def_readonly("id", &Types::EventInfo::id)
-            .def_readonly("name", &Types::EventInfo::name);
-
-        py::class_<Types::Stream>(m, "PyStream")
-            .def(py::init<uint64_t, std::vector<std::shared_ptr<Types::Event>>&&>())
-            .def_readonly("id", &Types::Stream::id)
-            .def_readonly("events", &Types::Stream::events);
-
-        py::class_<Types::StreamInfoParsed>(m, "PyStreamInfoParsed")
-            .def(py::init<std::string, std::vector<Types::EventInfoParsed>>())
-            .def_readonly("name", &Types::StreamInfoParsed::name)
-            .def_readonly("events_info", &Types::StreamInfoParsed::events_info);
-
-        py::class_<Types::StreamInfo>(m, "PyStreamInfo")
-            .def(py::init<uint64_t, std::string, std::vector<Types::EventInfo>&&>())
-            .def_readonly("events_info", &Types::StreamInfo::events_info)
-            .def_readonly("name", &Types::StreamInfo::name)
-            .def_readonly("id", &Types::StreamInfo::id);
+        nb::class_<Types::EventInfoParsed>(m, "PyEventInfoParsed")
+            .def(nb::init<std::string, std::vector<Types::AttributeInfo>>())
+            .def_ro("name", &Types::EventInfoParsed::name)
+            .def_ro("attributes", &Types::EventInfoParsed::attributes_info);
+
+        nb::class_<Types::EventInfo>(m, "PyEventInfo")
+            .def(nb::init<uint64_t, std::string, std::vector<Types::AttributeInfo>>())
+            .def_ro("id", &Types::EventInfo::id)
+            .def_ro("name", &Types::EventInfo::name);
+
+        nb::class_<Types::Stream>(m, "PyStream")
+            .def(nb::init<uint64_t, std::vector<std::shared_ptr<Types::Event>>&&>())
+            .def_ro("id", &Types::Stream::id)
+            .def_ro("events", &Types::Stream::events);
+
+        nb::class_<Types::StreamInfoParsed>(m, "PyStreamInfoParsed")
+            .def(nb::init<std::string, std::vector<Types::EventInfoParsed>>())
+            .def_ro("name", &Types::StreamInfoParsed::name)
+            .def_ro("events_info", &Types::StreamInfoParsed::events_info);
+
+        nb::class_<Types::StreamInfo>(m, "PyStreamInfo")
+            .def(nb::init<uint64_t, std::string, std::vector<Types::EventInfo>&&>())
+            .def_ro("events_info", &Types::StreamInfo::events_info)
+            .def_ro("name", &Types::StreamInfo::name)
+            .def_ro("id", &Types::StreamInfo::id);

-        py::class_<Streamer>(m, "PyStreamer")
-            .def(py::init<std::string, uint16_t>())
-            .def("send_stream", py::overload_cast<Types::Stream>(&Streamer::send_stream))
+        nb::class_<Streamer>(m, "PyStreamer")
+            .def(nb::init<std::string, uint16_t>())
+            .def("send_stream", nb::overload_cast<Types::Stream>(&Streamer::send_stream))
             .def("send_stream", [](Streamer& self, Types::StreamTypeId stream_id, std::shared_ptr<Types::Event> event) {
                 self.send_stream(stream_id, std::move(event));
-            }, py::arg("stream_id"), py::arg("event"),
+            }, nb::arg("stream_id"), nb::arg("event"),
                "Envía un evento único a un stream.");

-        py::class_<Client>(m, "PyClient")
-            .def(py::init<std::string, uint16_t>())
-            .def("declare_stream", py::overload_cast<std::string>(&Client::declare_stream))
-            .def("declare_option", py::overload_cast<std::string>(&Client::declare_option))
-            .def("add_query", py::overload_cast<std::string>(&Client::add_query));
+        nb::class_<Client>(m, "PyClient")
+            .def(nb::init<std::string, uint16_t>())
+            .def("declare_stream", nb::overload_cast<std::string>(&Client::declare_stream))
+            .def("declare_option", nb::overload_cast<std::string>(&Client::declare_option))
+            .def("add_query", nb::overload_cast<std::string>(&Client::add_query));

-        py::class_<Types::ComplexEvent>(m, "PyComplexEvent")
+        nb::class_<Types::ComplexEvent>(m, "PyComplexEvent")
             .def("to_string", &Types::ComplexEvent::to_string)
-            .def_readonly("start", &Types::ComplexEvent::start)
-            .def_readonly("end", &Types::ComplexEvent::end)
-            .def_readonly("start", &Types::ComplexEvent::start)
-            .def_readonly("events", &Types::ComplexEvent::events);
+            .def_ro("start", &Types::ComplexEvent::start)
+            .def_ro("end", &Types::ComplexEvent::end)
+            .def_ro("events", &Types::ComplexEvent::events);

-        py::class_<CallbackHandler, std::unique_ptr<CallbackHandler>>(m, "PyCallbckHandler")
+        nb::class_<CallbackHandler>(m, "PyCallbckHandler")
             .def_static("set_event_handler", [](std::function<void(const Types::Enumerator&)> handler) {
                 CallbackHandler::event_handler = handler;
             });

-        py::class_<CORE::Types::Enumerator>(m, "PyEnumerator")
-            .def_readonly("complex_events", &CORE::Types::Enumerator::complex_events)
-            .def("__iter__", [](CORE::Types::Enumerator &self) {
-                return py::make_iterator(self.complex_events.begin(), self.complex_events.end());
-            }, py::keep_alive<0, 1>());
-
-        py::register_exception<ClientException>(m, "PyClientException");
-        py::register_exception<AttributeNameAlreadyDeclaredException>(m, "PyAttributeNameAlreadyDeclared");
-        py::register_exception<AttributeNotDefinedException>(m, "PyAttributeNotDefinedException");
-        py::register_exception<EventNameAlreadyDeclaredException>(m, "PyEventNameAlreadyDeclared");
-        py::register_exception<EventNotDefinedException>(m, "PyEventNotDefinedException");
-        py::register_exception<EventNotInStreamException>(m, "EventNotInStreamException");
-        py::register_exception<ParsingSyntaxException>(m, "PyParsingSyntaxException");
-        py::register_exception<StreamNameAlreadyDeclaredException>(m, "PyStreamNameAlreadyDeclaredException");
-        py::register_exception<StreamNotFoundException>(m, "PyStreamNotFoundException");
-        py::register_exception<WarningException>(m, "PyWarningException");
+        nb::class_<CORE::Types::Enumerator>(m, "PyEnumerator")
+            .def_ro("complex_events", &CORE::Types::Enumerator::complex_events)
+            .def("__iter__", [](nb::handle self) {
+                auto& enumerator = nb::cast<CORE::Types::Enumerator&>(self);
+                return nb::make_iterator(nb::type<CORE::Types::Enumerator>(), "iterator",
+                                         enumerator.complex_events.begin(),
+                                         enumerator.complex_events.end());
+            }, nb::keep_alive<0, 1>());


⚠️ Potential issue | 🟠 Major

Align shared_ptr interop with nanobind's model + fix accidental Python API typo
You pass/return std::shared_ptr<Types::Value/Event> in multiple places. Unlike pybind11, nanobind's class_ takes no holder template argument; std::shared_ptr arguments and return values are handled by a type caster, so the binding file must include <nanobind/stl/shared_ptr.h> to avoid lifetime/cast surprises. Also "PyCallbckHandler" looks like an unintended rename.

+#include <nanobind/stl/shared_ptr.h>
@@
-        nb::class_<CallbackHandler>(m, "PyCallbckHandler")
+        nb::class_<CallbackHandler>(m, "PyCallbackHandler")
🤖 Prompt for AI Agents
In src/pybind/python_binding.cpp around lines 84-169, several bindings pass or
return std::shared_ptr<Types::Value> and std::shared_ptr<Types::Event> without
pulling in nanobind's shared_ptr type caster, and the CallbackHandler Python
name has a typo. Fix by including <nanobind/stl/shared_ptr.h> so shared_ptr
arguments and return values convert correctly (nanobind has no pybind11-style
holder template parameter), keeping the base-class relationships (e.g.,
nb::class_<Types::IntValue, Types::Value>) so casts between base and derived
types work, and renaming the binding "PyCallbckHandler" to "PyCallbackHandler".
Ensure overloads/ctors stay the same.
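A minimal sketch of that pattern, with a stand-in Value hierarchy instead of the real Types:: classes (names like holder_sketch and make_int are illustrative only):

#include <cstdint>
#include <memory>

#include <nanobind/nanobind.h>
#include <nanobind/stl/shared_ptr.h>  // enables std::shared_ptr<T> parameters/returns

namespace nb = nanobind;

// Stand-in types; the real bindings would use Types::Value and friends.
struct Value { virtual ~Value() = default; };
struct IntValue : Value {
  explicit IntValue(int64_t v) : value(v) {}
  int64_t value;
};

// Crosses the Python boundary as shared_ptr; converted by the caster above.
std::shared_ptr<Value> make_int(int64_t v) { return std::make_shared<IntValue>(v); }

NB_MODULE(holder_sketch, m) {
  nb::class_<Value>(m, "PyValue");
  nb::class_<IntValue, Value>(m, "PyIntValue")  // second arg is the base class, not a holder
      .def(nb::init<int64_t>());
  m.def("make_int", &make_int);
}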

@coderabbitai

coderabbitai bot commented Dec 12, 2025

✅ Actions performed

Reviews paused.

@sonarqubecloud

Quality Gate failed

Failed conditions
1 Security Hotspot

See analysis details on SonarQube Cloud

@github-actions

Stocks:

query time
other-q1_any.txt 1.33000000000000000000
other-q1_none.txt 2.76000000000000000000
other-q2_any.txt .77000000000000000000
other-q2_none.txt .99000000000000000000
other-q3_any.txt .82000000000000000000
other-q3_none.txt .79000000000000000000
other-q4_any.txt 1.52666666666666666666
other-q4_none.txt 27.44666666666666666666
other-q5_any.txt .79000000000000000000
other-q5_none.txt 3.49666666666666666666
other-q6_any.txt .80333333333333333333
other-q6_none.txt .81333333333333333333
other-q6_partition.txt .80333333333333333333
q1_any.txt .77000000000000000000
q1_none.txt 1.73666666666666666666
q2_any.txt 2.99333333333333333333
q2_none.txt 9.93000000000000000000
q3_any.txt 1.66666666666666666666
q3_none.txt 19.09666666666666666666
q4_any.txt .87000000000000000000
q4_none.txt .87666666666666666666
q5_any.txt 1.04000000000000000000
q5_none.txt 1.06333333333333333333
q6_any.txt .78000000000000000000
q6_none.txt .77000000000000000000
q7_any.txt .82000000000000000000
q7_none.txt .83000000000000000000
q8_any.txt .86666666666666666666
q8_none.txt .87000000000000000000
q9_any.txt .93333333333333333333
q9_none.txt .90000000000000000000
q10_any.txt 1.07666666666666666666
q10_none.txt 1.04666666666666666666
q11_any.txt .76666666666666666666
q11_none.txt .78333333333333333333
q12_any.txt .77666666666666666666
q12_none.txt .80333333333333333333

@github-actions

Unordered Stocks:

query time
other-q1_any.txt 1.43333333333333333333
other-q1_none.txt 2.88666666666666666666
other-q2_any.txt .96000000000000000000
other-q2_none.txt 1.16666666666666666666
other-q3_any.txt .99000000000000000000
other-q3_none.txt 1.00666666666666666666
other-q4_any.txt 1.70333333333333333333
other-q4_none.txt 28.99666666666666666666
other-q5_any.txt 1.00333333333333333333
other-q5_none.txt 3.41333333333333333333
other-q6_any.txt .99666666666666666666
other-q6_none.txt 1.02000000000000000000
other-q6_partition.txt .98333333333333333333
q1_any.txt .91333333333333333333
q1_none.txt 1.83000000000000000000
q2_any.txt 3.12333333333333333333
q2_none.txt 10.64000000000000000000
q3_any.txt 1.81333333333333333333
q3_none.txt 18.87333333333333333333
q4_any.txt 1.08333333333333333333
q4_none.txt 1.07333333333333333333
q5_any.txt 1.19333333333333333333
q5_none.txt 1.25666666666666666666
q6_any.txt .95000000000000000000
q6_none.txt .96666666666666666666
q7_any.txt 1.02333333333333333333
q7_none.txt 1.02666666666666666666
q8_any.txt 1.05000000000000000000
q8_none.txt 1.08000000000000000000
q9_any.txt 1.10666666666666666666
q9_none.txt 1.11000000000000000000
q10_any.txt 1.25666666666666666666
q10_none.txt 1.23000000000000000000
q11_any.txt .96000000000000000000
q11_none.txt .98000000000000000000
q12_any.txt .94666666666666666666
q12_none.txt .97333333333333333333

@github-actions

Smart Homes:

query time
q1_any.txt 4.10666666666666666666
q1_none.txt 3.83333333333333333333
q2_any.txt 3.69333333333333333333
q2_none.txt 43.74000000000000000000
q3_any.txt 7.47333333333333333333
q3_none.txt 127.70666666666666666666
q4_any.txt 13.26333333333333333333
q4_none.txt 109.76666666666666666666
q5_any.txt 13.23333333333333333333
q5_none.txt 43.85333333333333333333
q6_any.txt 2.91000000000000000000
q6_none.txt 2.84333333333333333333
q7_any.txt 3.30666666666666666666
q7_none.txt 3.29333333333333333333
q8_any.txt 3.72666666666666666666
q8_none.txt 3.68333333333333333333
q9_any.txt 4.17666666666666666666
q9_none.txt 4.31000000000000000000
q10_any.txt 5.73000000000000000000
q10_none.txt 5.70000000000000000000
q11_any.txt 2.90333333333333333333
q11_none.txt 2.96333333333333333333
q12_any.txt 2.91000000000000000000
q12_none.txt 2.95333333333333333333

@github-actions

Taxis:

query time
q1_any.txt 4.15000000000000000000
q1_none.txt 4.42666666666666666666
q2_any.txt 4.66666666666666666666
q2_none.txt 5.50666666666666666666
q3_any.txt 4.92000000000000000000
q3_none.txt 8.37666666666666666666
q4_any.txt 4.99000000000000000000
q4_none.txt 8.70333333333333333333
q5_any.txt 5.41000000000000000000
q5_none.txt 5.72666666666666666666
q6_any.txt 4.42666666666666666666
q6_none.txt 4.34666666666666666666
q7_any.txt 4.68666666666666666666
q7_none.txt 4.71333333333333333333
q8_any.txt 4.80333333333333333333
q8_none.txt 4.81000000000000000000
q9_any.txt 4.91666666666666666666
q9_none.txt 4.95333333333333333333
q10_any.txt 5.45333333333333333333
q10_none.txt 5.37333333333333333333
q11_any.txt 4.39666666666666666666
q11_none.txt 4.38666666666666666666
q12_any.txt 4.48000000000000000000
q12_none.txt 4.43000000000000000000

@github-actions

Ordered Bluesky:

query time
q1.txt .30000000000000000000

@github-actions

Unordered Bluesky:

query time
q1.txt 58.48000000000000000000
q2.txt 58.50000000000000000000
q3.txt 58.48000000000000000000
q4.txt 58.49000000000000000000
