Skip to content

Conversation

@KenVanHoeylandt
Copy link
Contributor

@KenVanHoeylandt KenVanHoeylandt commented Jan 24, 2026

Summary by CodeRabbit

  • New Features

    • Added concurrent Dispatcher and EventGroup APIs, timed mutex/recursive-mutex try-locks, and ESP32 error-to-project-error translation.
  • Bug Fixes

    • Fixed a potential mutex deadlock during dispatcher shutdown and improved dispatcher/consumer synchronization.
  • Refactor

    • Unified return values to error_t, introduced explicit success/error codes, standardized public names to camelCase, and simplified event-group error reporting.
  • Tests

    • Added Dispatcher tests and updated existing tests to the new error semantics.

✏️ Tip: You can customize this high-level summary in your review settings.

And cleanup+fixes for existing
@coderabbitai
Copy link

coderabbitai bot commented Jan 24, 2026

📝 Walkthrough

Walkthrough

This PR centralizes error semantics by adding an error_t typedef and new ERROR_* codes and converting many APIs from int/bool to error_t. It adds a queue-based Dispatcher and C-compatible EventGroup APIs with timed dispatch/consume and wait semantics, and implements timed try-locks for mutex and recursive mutex. Numerous public struct fields and parameters were renamed from snake_case to camelCase. ESP32 error translation helpers and driver adaptations were added. Miscellaneous fixes: dispatcher mutex shutdown handling, wait call argument reorderings, tests and timing adjustments.

🚥 Pre-merge checks | ✅ 1 | ❌ 2
❌ Failed checks (1 warning, 1 inconclusive)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 11.11% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
Title check ❓ Inconclusive The title 'TactilityKernel improvements' is vague and generic, using non-descriptive terms that don't convey meaningful information about the changeset. Consider a more specific title that highlights the primary change, such as 'Convert error handling to error_t codes and refactor API signatures' or 'Improve EventGroup and Dispatcher with simplified signatures and error handling'.
✅ Passed checks (1 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • 📝 Generate docstrings

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

♻️ Duplicate comments (1)
TactilityKernel/Source/concurrent/Dispatcher.cpp (1)

47-53: dispatcher_free can still race in‑flight operations.

Even with shutdown atomic and a mutex lock/unlock, a consumer can be executing a callback outside the mutex (or blocked in event_group_wait) when dispatcher_free deletes DispatcherData, which risks use‑after‑free. Add lifetime coordination (e.g., in‑flight refcount + wait/handshake) or explicitly require all users to stop before free and document/enforce it.

🧹 Nitpick comments (1)
TactilityKernel/Source/drivers/GpioController.cpp (1)

10-27: Consider guarding against missing driver/API now that the return type is int.
If these wrappers can be called before a driver is bound, device_get_driver() or driver->api being null will crash. Please confirm the call contract or add a defensive check and return a non‑zero error code.

💡 Defensive guard (if call contract allows)
 int gpio_controller_set_level(Device* device, gpio_pin_t pin, bool high) {
     const auto* driver = device_get_driver(device);
+    if (driver == nullptr || driver->api == nullptr) {
+        return ERROR_INVALID_STATE;
+    }
     return GPIO_DRIVER_API(driver)->set_level(device, pin, high);
 }
 
 int gpio_controller_get_level(Device* device, gpio_pin_t pin, bool* high) {
     const auto* driver = device_get_driver(device);
+    if (driver == nullptr || driver->api == nullptr) {
+        return ERROR_INVALID_STATE;
+    }
     return GPIO_DRIVER_API(driver)->get_level(device, pin, high);
 }
 
 int gpio_controller_set_options(Device* device, gpio_pin_t pin, gpio_flags_t options) {
     const auto* driver = device_get_driver(device);
+    if (driver == nullptr || driver->api == nullptr) {
+        return ERROR_INVALID_STATE;
+    }
     return GPIO_DRIVER_API(driver)->set_options(device, pin, options);
 }
 
 int gpio_controller_get_options(Device* device, gpio_pin_t pin, gpio_flags_t* options) {
     const auto* driver = device_get_driver(device);
+    if (driver == nullptr || driver->api == nullptr) {
+        return ERROR_INVALID_STATE;
+    }
     return GPIO_DRIVER_API(driver)->get_options(device, pin, options);
 }

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
TactilityKernel/Source/Driver.cpp (1)

68-82: Fix return type mismatch in driver_remove.

The function returns ERROR_* codes but is declared bool, which flips success/failure semantics. Align the signature with error_t (or return true/false explicitly).

🐛 Proposed fix
-static bool driver_remove(Driver* driver) {
+static error_t driver_remove(Driver* driver) {
     LOG_I(TAG, "remove %s", driver->name);

     ledger.lock();
     const auto iterator = std::ranges::find(ledger.drivers, driver);
     // check that there actually is a 3 in our vector
     if (iterator == ledger.drivers.end()) {
         ledger.unlock();
         return ERROR_NOT_FOUND;
     }
     ledger.drivers.erase(iterator);
     ledger.unlock();

     return ERROR_NONE;
 }
♻️ Duplicate comments (3)
TactilityKernel/Include/Tactility/concurrent/EventGroup.h (1)

31-39: Documentation says "return value is true" but these return error_t.

The @param[out] outFlags descriptions on lines 35, 45, and 64 state the value is set "when the return value is true", but these functions return error_t where 0 (ERROR_NONE) indicates success. This was already flagged in a prior review.

Also applies to: 41-49, 57-77

TactilityKernel/Source/concurrent/Dispatcher.cpp (2)

49-55: Dispatcher lifetime still unsafe during free.
Line 49-55: dispatcher_free deletes the object after a lock/unlock, but concurrent dispatch/consume calls can still be in-flight or re-enter after the unlock, leading to use-after-free. Please add explicit lifetime coordination (refcount/handshake) or require external synchronization before free.


86-91: Surface event_group_set failures to the caller.
Line 86-91: On event_group_set failure you only log and still return ERROR_NONE, so callers can’t react even though the consumer may never wake. Propagate the error and decide how to document/handle the already-queued item to avoid duplicate retries.

🐛 Suggested change (propagate error)
-    if (event_group_set(data->eventGroup, WAIT_FLAG, nullptr) != ERROR_NONE) {
+    const auto setResult = event_group_set(data->eventGroup, WAIT_FLAG, nullptr);
+    if (setResult != ERROR_NONE) {
 `#ifdef` ESP_PLATFORM
         LOG_E(TAG, "Failed to set flag");
 `#endif`
-    }
-    return ERROR_NONE;
+        return setResult;
+    }
+    return ERROR_NONE;
🧹 Nitpick comments (5)
TactilityKernel/Include/Tactility/concurrent/EventGroup.h (1)

15-17: Unused struct EventGroup.

The EventGroup struct is declared but never used by the API — all functions operate on EventGroupHandle_t directly. Consider either removing it or using it consistently (e.g., having the construct/destruct functions operate on struct EventGroup*).

TactilityKernel/Source/concurrent/EventGroup.cpp (1)

10-29: ISR path returns inFlags rather than actual bits in outFlags.

In the ISR path (line 17), outFlags is set to inFlags (the requested flags), while the non-ISR path (line 25) sets it to the actual resulting bits. This is due to FreeRTOS API limitations (xEventGroupSetBitsFromISR doesn't return the new bits synchronously), but the behavioral difference could surprise callers. Consider documenting this in the header.

TactilityKernel/Include/Tactility/concurrent/Dispatcher.h (1)

20-20: Use (void) instead of () for C compatibility.

In C, dispatcher_alloc() declares a function with an unspecified number of parameters, whereas dispatcher_alloc(void) explicitly declares no parameters. While this header is wrapped in extern "C", using (void) is more precise and avoids potential issues with strict C compilers.

Suggested fix
-DispatcherHandle_t dispatcher_alloc();
+DispatcherHandle_t dispatcher_alloc(void);
Tests/TactilityKernel/DriverIntegrationTest.cpp (1)

39-42: Potential inconsistency: config uses literal 0 while tests expect ERROR_NONE.

The IntegrationDriverConfig uses literal 0 for startResult/stopResult, but the test assertions on lines 50-63 compare against ERROR_NONE. This works only if ERROR_NONE is defined as 0. For consistency and clarity, consider using ERROR_NONE here as well.

♻️ Suggested improvement
     static const IntegrationDriverConfig config {
-        .startResult = 0,
-        .stopResult = 0
+        .startResult = ERROR_NONE,
+        .stopResult = ERROR_NONE
     };
Platforms/PlatformEsp32/Source/Esp32I2c.cpp (1)

76-76: Minor: Prefer conventional static const ordering.

The declaration uses const static which is valid but unconventional. The standard C/C++ idiom is static const.

♻️ Suggested fix
-const static I2cControllerApi esp32_i2c_api = {
+static const I2cControllerApi esp32_i2c_api = {

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

♻️ Duplicate comments (1)
TactilityKernel/Source/concurrent/Dispatcher.cpp (1)

49-55: Dispatcher lifetime is still unsafe under concurrent use.

dispatcher_free can delete the dispatcher while other threads are still dispatching or consuming, which risks use‑after‑free. Add a shutdown handshake (e.g., in‑flight refcount + wait, or require external quiescence and document it).

🧹 Nitpick comments (2)
Devices/lilygo-tdeck/Source/Init.cpp (1)

33-35: Consider using pdMS_TO_TICKS() for explicit millisecond conversion.

vTaskDelay(100) delays for 100 ticks, not necessarily 100 milliseconds. The actual delay depends on configTICK_RATE_HZ. While ESP-IDF typically defaults to 1000Hz (making this effectively 100ms), using pdMS_TO_TICKS(100) makes the intent explicit and ensures portability.

♻️ Suggested change
     // Avoids crash when no SD card is inserted. It's unknown why, but likely is related to power draw.
-    vTaskDelay(100);
+    vTaskDelay(pdMS_TO_TICKS(100));
TactilityKernel/Source/Driver.cpp (1)

95-105: Decide whether driver_destruct should propagate driver_remove failures.

driver_remove now returns error_t but its result is ignored, so driver_destruct returns ERROR_NONE even if the driver wasn’t registered. Consider either propagating the error or explicitly treating ERROR_NOT_FOUND as success to avoid ambiguous API semantics.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

♻️ Duplicate comments (3)
TactilityKernel/Include/Tactility/concurrent/Dispatcher.h (1)

37-49: LGTM!

The ERROR_RESOURCE documentation has been added at line 42, addressing the previous review feedback. The inline wrapper correctly delegates to the timed variant with portMAX_DELAY.

TactilityKernel/Source/concurrent/EventGroup.cpp (1)

28-28: Unconditional portYIELD_FROM_ISR(pdTRUE) may cause unnecessary context switches.

Unlike event_group_set (line 16) which uses the yield variable returned by the FreeRTOS API, here the yield is forced to pdTRUE. Since xEventGroupClearBitsFromISR doesn't return a yield hint, and clearing bits typically doesn't unblock tasks, consider removing the unnecessary yield.

TactilityKernel/Source/Driver.cpp (1)

85-91: Handle driver_add failures to avoid leaking internal data.

driver_construct always returns success after allocation, even if adding to the ledger fails. That can leak DriverInternalData and hide the error. Consider returning the error and cleaning up. (Assumes driver_add is updated to return error_t.)

🛠️ Suggested fix
 error_t driver_construct(Driver* driver) {
     driver->internal.data = new(std::nothrow) DriverInternalData;
     if (driver->internal.data == nullptr) {
         return ERROR_OUT_OF_MEMORY;
     }
-    driver_add(driver);
-    return ERROR_NONE;
+    const error_t err = driver_add(driver);
+    if (err != ERROR_NONE) {
+        delete driver_internal_data(driver);
+        driver->internal.data = nullptr;
+        return err;
+    }
+    return ERROR_NONE;
 }
🧹 Nitpick comments (4)
TactilityKernel/Include/Tactility/concurrent/Dispatcher.h (3)

8-8: Remove unused stdbool.h include.

No bool type is used in this header. Removing this also resolves the static analysis error.

🧹 Proposed fix
-#include <stdbool.h>
-
 `#include` <Tactility/Error.h>

20-20: Use (void) for functions taking no parameters.

In C, empty parentheses () declare an unspecified number of parameters, while (void) explicitly declares zero parameters. This improves type safety and is more idiomatic.

🧹 Proposed fix
-DispatcherHandle_t dispatcher_alloc();
+DispatcherHandle_t dispatcher_alloc(void);

51-76: Minor grammar fix in documentation.

"dispatched function" should be "dispatched functions" (plural) since consuming "1 or more" implies potential plurality.

📝 Proposed fix
 /**
- * Consume 1 or more dispatched function (if any) until the queue is empty.
+ * Consume 1 or more dispatched functions (if any) until the queue is empty.
  *
  * `@warning` The timeout is only the wait time before consuming the message! It is not a limit to the total execution time when calling this method.
  *

Apply the same fix at line 65.

TactilityKernel/Include/Tactility/concurrent/Mutex.h (1)

15-16: TODO comment for debugging functionality.

Would you like me to help implement debugging functionality for the Mutex struct (e.g., owner tracking, lock count, acquisition timestamps) or open an issue to track this task?

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (4)
TactilityKernel/Source/Driver.cpp (1)

95-111: Race condition addressed with locking and destroying flag.

The implementation now correctly:

  1. Acquires lock before checking use_count and destroying
  2. Sets destroying = true under the lock to prevent new binds
  3. Releases lock before potentially blocking operations

This addresses the previously identified race between driver_destruct and concurrent bind/unbind calls.

TactilityKernel/Source/concurrent/EventGroup.cpp (1)

23-33: Unconditional portYIELD_FROM_ISR(pdTRUE) may cause unnecessary context switches.

Unlike event_group_set which uses the yield hint from the FreeRTOS API, here portYIELD_FROM_ISR(pdTRUE) is always forced. Since xEventGroupClearBitsFromISR doesn't provide a yield hint and clearing bits typically doesn't unblock tasks, consider removing this yield entirely.

Suggested fix
     if (xPortInIsrContext() == pdTRUE) {
         if (xEventGroupClearBitsFromISR(eventGroup, flags) == pdFAIL) {
             return ERROR_RESOURCE;
         }
-        portYIELD_FROM_ISR(pdTRUE);
     } else {
TactilityFreeRtos/Include/Tactility/Dispatcher.h (1)

38-71: Make shutdown atomic to avoid cross‑thread data races.

shutdown is read/written across threads without synchronization (dispatch/consume vs destructor), which is UB under the C++ memory model. Convert it to std::atomic<bool> (or guard all accesses with the mutex).

🔧 Proposed fix
@@
-#include <functional>
-#include <memory>
-#include <queue>
+#include <functional>
+#include <memory>
+#include <queue>
+#include <atomic>
@@
-    bool shutdown = false;
+    std::atomic<bool> shutdown{false};
@@
-        shutdown = true;
+        shutdown.store(true, std::memory_order_release);
         if (mutex.lock()) {
             mutex.unlock();
         }
@@
-        if (shutdown) {
+        if (shutdown.load(std::memory_order_acquire)) {
             mutex.unlock();
             return false;
         }
@@
-        if (shutdown) {
+        if (shutdown.load(std::memory_order_acquire)) {
             return 0;
         }
@@
-        } while (processing && !shutdown);
+        } while (processing && !shutdown.load(std::memory_order_acquire));

Also applies to: 101-129

TactilityKernel/Source/concurrent/Dispatcher.cpp (1)

49-55: Prevent dispatcher_free from racing active dispatch/consume paths (UAF risk).

dispatcher_free deletes the object while other tasks may still be inside dispatcher_dispatch_timed / dispatcher_consume_timed or blocked on the event group. The atomic shutdown flag alone doesn’t prevent use‑after‑free. Please add lifetime coordination (e.g., in‑flight refcount + wait, or a documented “no concurrent users” contract + shutdown/wakeup).

#!/bin/bash
# Verify call sites of dispatcher_free to see if it can run concurrently.
rg -nP '\bdispatcher_free\s*\(' -C 3
🧹 Nitpick comments (2)
TactilityKernel/Source/Driver.cpp (1)

168-195: Unbinding logic correctly updated with destruction guard.

The function properly mirrors the bind logic with the destroying flag check and error propagation.

Minor nit: Line 187 logs "unbound %s to %s" but grammatically it should be "unbound %s from %s" (compare with line 159 which uses "bound ... to").

📝 Suggested grammar fix
-    LOG_I(TAG, "unbound %s to %s", driver->name, device->name);
+    LOG_I(TAG, "unbound %s from %s", driver->name, device->name);
TactilityKernel/Source/concurrent/EventGroup.cpp (1)

51-53: Minor inconsistency in ISR context check pattern.

Line 51 uses if (xPortInIsrContext()) while other functions use if (xPortInIsrContext() == pdTRUE). Consider using the explicit comparison for consistency.

Suggested fix
-    if (xPortInIsrContext()) {
+    if (xPortInIsrContext() == pdTRUE) {
         return ERROR_ISR_STATUS;
     }

@KenVanHoeylandt KenVanHoeylandt merged commit dfe2c86 into main Jan 25, 2026
69 of 94 checks passed
@KenVanHoeylandt KenVanHoeylandt deleted the develop branch January 25, 2026 22:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants