architecture: establish a hardware-first real-time interaction loop #2

@zevorn

Description

Background

RT-Claw already has the basic pieces in place: AI chat, tool use, scheduler, heartbeat, and early swarm support.

What is still missing is a clear system-level direction for real-time behavior. Right now the execution model is still too AI-centered:

  • time-sensitive actions can still be shaped by AI latency
  • user input, hardware events, background work, and AI reasoning are not clearly separated
  • the project does not yet present a strong identity around RT responsiveness or smooth hardware interaction

This makes RT-Claw feel more like an agent running on an RTOS than a real-time AI runtime for the physical world.

This issue proposes a clear architectural direction:

RT-Claw should not just run an LLM on embedded hardware. It should run AI on top of a runtime with real-time reflexes.

Goals

  • establish a hardware-first real-time interaction loop
  • keep the fast path independent from LLM latency
  • give users immediate feedback before full completion or explanation
  • turn GPIO, PWM, ADC, LCD, timers, and swarm into part of the interaction model
  • make RT-Claw clearly different from a generic embedded agent

Core Principles

  1. LLM must never sit in the fast path
  2. ack first, complete second, explain last
  3. events must be classified and prioritized, not pushed through one plain FIFO
  4. local actions and local rules come first
  5. AI handles reasoning, planning, and summarization, not every control loop

Proposed Architecture

1. RT Event Fabric

Evolve the current gateway into a real event fabric instead of a message queue skeleton.

At minimum, events should be split into four classes:

  • P0 Reflex: interrupts, limit switches, threshold crossings, emergency stop, critical edge-triggered events
  • P1 Control: GPIO/PWM updates, display changes, scheduled device actions, node state changes
  • P2 Interaction: shell input, IM messages, WebSocket input, progress/status feedback
  • P3 AI/Background: LLM reasoning, heartbeat summaries, memory consolidation, archival work

Each event should carry metadata such as:

  • source
  • priority
  • deadline
  • correlation_id
  • requires_ai
  • state_snapshot_id
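
A minimal sketch of such an event record in C. All names here (`rtc_event_t`, `rtc_priority_t`, and so on) are illustrative, not the actual RT-Claw API:

```c
#include <stdint.h>
#include <stdbool.h>

/* Priority classes, ordered so that a lower value means more urgent. */
typedef enum {
    RTC_P0_REFLEX = 0,   /* interrupts, e-stop, threshold crossings   */
    RTC_P1_CONTROL,      /* GPIO/PWM, display, scheduled device work  */
    RTC_P2_INTERACTION,  /* shell, IM, WebSocket, progress feedback   */
    RTC_P3_AI_BACKGROUND /* LLM reasoning, summaries, memory work     */
} rtc_priority_t;

/* Event record carrying the routing metadata listed above. */
typedef struct {
    uint16_t       source;            /* producing subsystem id        */
    rtc_priority_t priority;          /* P0..P3 class                  */
    uint32_t       deadline_ms;      /* absolute deadline, 0 = none   */
    uint32_t       correlation_id;   /* links ack/result/explanation  */
    bool           requires_ai;      /* true -> eligible for AI plane */
    uint32_t       state_snapshot_id;/* world state at enqueue time   */
} rtc_event_t;

/* A reflex or control event that needs no AI stays on the fast path. */
static inline bool rtc_event_is_fast_path(const rtc_event_t *ev)
{
    return ev->priority <= RTC_P1_CONTROL && !ev->requires_ai;
}
```

The `correlation_id` is what lets a later AI explanation attach to an action that already completed on the fast path.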

2. Fast Path Runtime

Provide a local execution path for hardware interaction that does not depend on LLM calls.

Typical fast-path capabilities include:

  • GPIO input/output
  • PWM control
  • ADC sampling with local threshold/rule checks
  • partial LCD updates
  • simple rule evaluation
  • deadline-sensitive scheduled actions
  • swarm state updates

The point is not "AI can call hardware tools".
The point is "the runtime can perform the right hardware action immediately".
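
To make that concrete, here is one way a local threshold rule could look: an ADC sample crosses a limit and the runtime drives a GPIO immediately, with no LLM call anywhere on the path. The rule struct and the `gpio_write` stub are hypothetical, standing in for the real driver layer:

```c
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical local rule: when an ADC reading crosses a threshold,
 * drive a GPIO pin immediately -- no AI involvement on this path. */
typedef struct {
    uint16_t threshold;  /* raw ADC counts                   */
    bool     above;      /* trigger on >= threshold vs <     */
    int      gpio_pin;   /* pin to drive when the rule fires */
    bool     gpio_level; /* level to set on that pin         */
} local_rule_t;

/* Stand-in for the real GPIO driver; records the last write. */
static int  last_pin   = -1;
static bool last_level = false;
static void gpio_write(int pin, bool level) { last_pin = pin; last_level = level; }

/* Returns true when the rule fired and the action was taken. */
static bool rule_eval(const local_rule_t *r, uint16_t adc_sample)
{
    bool fired = r->above ? (adc_sample >= r->threshold)
                          : (adc_sample <  r->threshold);
    if (fired)
        gpio_write(r->gpio_pin, r->gpio_level);
    return fired;
}
```

AI can still be notified afterwards (for a summary or explanation), but the actuation itself never waits for it.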

3. Slow AI Plane

Keep ai_engine as the slow path for:

  • complex reasoning
  • natural language explanation
  • multi-step tool orchestration
  • periodic summaries
  • memory organization and writeback

The AI plane should only consume events that actually need AI.
It must not define the latency of the whole system.

4. Capability Registry

The current tool model should be extended beyond "tools exposed to the LLM".

RT-Claw should maintain a capability registry that both the runtime and the AI layer can use.

Each capability should eventually describe properties such as:

  • latency_class
  • safe_in_irq
  • safe_in_worker
  • requires_ai
  • display_affinity
  • deadline_hint

That gives the runtime enough information to decide whether something belongs on the fast path or the slow path.
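
A capability descriptor along these lines could look as follows. The field names mirror the list above; everything else (`capability_t`, the `latency_class_t` values, the routing predicate) is an assumed sketch, not a committed interface:

```c
#include <stdbool.h>

/* Coarse latency classes a capability can declare. */
typedef enum { LAT_IRQ, LAT_FAST, LAT_SLOW } latency_class_t;

/* Runtime-facing capability metadata, usable without the LLM. */
typedef struct {
    const char     *name;
    latency_class_t latency_class;
    bool            safe_in_irq;
    bool            safe_in_worker;
    bool            requires_ai;
    int             display_affinity;  /* preferred output surface, -1 = none */
    unsigned        deadline_hint_ms;  /* 0 = no timing expectation           */
} capability_t;

/* Routing decision: fast path iff no AI is required and the
 * capability is not declared slow-class. */
static bool belongs_on_fast_path(const capability_t *cap)
{
    return !cap->requires_ai && cap->latency_class != LAT_SLOW;
}
```

With metadata like this, the same registry entry can be consulted by the event fabric (routing), the scheduler (deadlines), and the AI plane (tool exposure).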

Real-Time Interaction Loop

The target interaction loop should look like this:

  1. event arrives
  2. classify the event
  3. send immediate ack if user-facing
  4. execute local action if possible
  5. push state update to shell / LCD / IM / WebSocket
  6. call AI only if planning or explanation is needed
  7. write memory / logs asynchronously in the background
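
The loop above can be sketched as a single dispatch pass. The event shape and stage flags here are illustrative only; the point is the ordering: ack before work, local action before AI, AI strictly optional:

```c
#include <stdbool.h>

/* Event shape assumed for this sketch (not the real API). */
typedef struct {
    bool user_facing;   /* step 3: ack needed?          */
    bool local_action;  /* step 4: runnable locally?    */
    bool requires_ai;   /* step 6: planning/explaining? */
} loop_event_t;

/* Bit flags recording which stages actually ran. */
enum { DID_ACK = 1, DID_LOCAL = 2, DID_PUSH = 4, DID_AI = 8, DID_LOG = 16 };

/* One pass of the target loop: ack -> act -> push state -> AI only if needed. */
static int interaction_loop(const loop_event_t *ev)
{
    int did = 0;
    if (ev->user_facing)  did |= DID_ACK;   /* immediate ack, before any work */
    if (ev->local_action) did |= DID_LOCAL; /* fast-path execution            */
    did |= DID_PUSH;                        /* state to shell/LCD/IM/WS       */
    if (ev->requires_ai)  did |= DID_AI;    /* slow plane, never blocks above */
    did |= DID_LOG;                         /* async memory/log writeback     */
    return did;
}
```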

The key idea is simple:

action first, explanation later

Example Flows

Sensor / GPIO event

  • interrupt or sampling event arrives
  • runtime classifies it as P0 or P1
  • local rule runs immediately
  • GPIO / LCD / state changes are applied
  • user sees instant feedback
  • AI is called only if a summary or explanation is actually needed

IM command

  • Feishu or future IM message arrives
  • system immediately replies with a short ack such as "received" or "executing"
  • local tools execute first
  • final result is returned
  • optional AI text is added only when needed

Scheduled task

  • scheduler fires a local action or an AI-triggering task
  • local actions must not wait for AI
  • AI-based tasks must run in a worker and must never block scheduling behavior
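
One way to enforce that rule in the scheduler: local tasks run inline on the tick, while AI-triggering tasks are only enqueued for a worker to drain later. The queue and task types below are a hypothetical sketch:

```c
#include <stdbool.h>

typedef struct { bool needs_ai; int id; } sched_task_t;

#define AI_QLEN 8
static sched_task_t ai_queue[AI_QLEN];  /* drained by a worker, not the tick */
static int ai_head = 0;

static int local_runs = 0;                              /* inline executions  */
static void run_local(const sched_task_t *t) { (void)t; local_runs++; }

/* Fires one task: AI work is deferred to the worker queue and never
 * executed inline, so the tick itself never waits on the model. */
static bool scheduler_fire(const sched_task_t *t)
{
    if (t->needs_ai) {
        if (ai_head >= AI_QLEN)
            return false;               /* queue full: report, don't block */
        ai_queue[ai_head++] = *t;
        return true;
    }
    run_local(t);                       /* fast path, runs now */
    return true;
}
```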

Why This Matters

This direction gives RT-Claw a much clearer identity:

  • not just an embedded chatbot
  • not just a tool-calling agent on an RTOS
  • but a runtime that gives AI real-time reflexes in the physical world

Put differently:

  • cloud AI provides intelligence
  • RT-Claw provides reflexes

Impact on Existing Modules

gateway

Evolve into an RT Event Fabric with priority lanes, deadline-aware routing, and deferred work support.

scheduler

Move beyond a coarse polling scheduler.
The current 1s tick is fine for early demos, but it is not enough for the real-time story.

ai_engine

Keep it as a serialized slow-path executor, but make it consume only AI-worthy work.
AI execution should not block event/control handling.

tools

Separate "LLM tool" from "runtime capability".
The runtime should be able to invoke capabilities directly without routing through the model.

heartbeat

Keep the current direction of aggregating events first and only calling AI when useful.

swarm

Treat swarm as a distributed event source and coordination layer, not just as a future messaging feature.

Suggested Milestones

Phase 1: Immediate feedback

  • unify ack/status reporting for shell, LCD, and IM
  • consistently report received / executing / done / failed
  • expose tool execution progress outside the local shell
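
A shared status model for this phase could be as small as one enum that every sink (shell, LCD, IM) renders in its own way. The names below simply mirror the received / executing / done / failed states; the types are illustrative:

```c
/* Unified task status shared by shell, LCD, and IM sinks (sketch). */
typedef enum { ST_RECEIVED, ST_EXECUTING, ST_DONE, ST_FAILED } task_status_t;

/* Canonical short label; each sink can format it further. */
static const char *status_str(task_status_t s)
{
    switch (s) {
    case ST_RECEIVED:  return "received";
    case ST_EXECUTING: return "executing";
    case ST_DONE:      return "done";
    default:           return "failed";
    }
}
```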

Phase 2: Event separation

  • introduce multiple queues or priority lanes
  • separate chat, control, hardware event, and background AI work
  • prevent AI work from blocking control/event handling

Phase 3: Hardware-first interaction

  • add a local rule engine for fast actions
  • support partial LCD refresh / dirty-region updates
  • improve scheduler granularity
  • integrate swarm events into the same event fabric

Acceptance Criteria

  • a time-sensitive local action can complete without waiting for AI
  • interactive requests always get an immediate ack
  • AI execution no longer blocks event/control handling
  • hardware capability metadata is available to runtime scheduling
  • shell / IM / LCD share one status reporting model
  • the architecture clearly separates reflex path from cognitive path
