refactor(hud): event-driven PostMessage architecture for stability (v0.2.31)#12

Merged
Aalwattar merged 9 commits into master from fix/hud-disappearance on Mar 29, 2026

@Aalwattar
Owner

This production-ready PR integrates the comprehensive HUD stability refactor with recent updates to the application's evaluation framework.

🚀 Key Changes

1. HUD Stability Refactor (PostMessage)
   The core of this PR is the transition from constant polling to a thread-safe Win32 PostMessage architecture.
   - Dynamic Repositioning: Resolves the "ghost window" bug by recalculating HUD coordinates on every WM_APP_SHOW event, so the HUD stays correctly anchored across sleep/wake cycles.
   - Thread-Safety: All UI state updates (text, status, provider) now occur on the HUD thread via async messaging, eliminating 64-bit Win32 race conditions.
   - Zero Idle CPU: The heartbeat timer is now active only while the HUD is visible.
2. Evaluation Framework Updates
   - Spec Improvements: Updates to conductor/archive/ reflect progress in the Arabic and Multilingual accuracy evaluation specs.
   - Data Persistence: Includes recent transcription outputs in eval_results/ for baseline comparisons.
3. Documentation & Infrastructure
   - Architecture: architecture.md has been updated to reflect the new HUD design.
   - Version Bump: Atomic synchronization of pyproject.toml, version_info.txt, uv.lock, and parrotink.iss to version 0.2.31.

✅ Verification

- Passed full Definition of Done Gate (Ruff, MyPy, Pytest).
- Verified local build and Inno Setup integration.
- Confirmed zero HUD idle CPU footprint.
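
For readers unfamiliar with the pattern, the PostMessage design described under Key Changes can be illustrated without any Win32 calls. The following is a minimal, platform-independent sketch that uses `queue.Queue` as a stand-in for the Win32 message queue. `HudModel`, `post`, `anchor_fn`, and the `WM_APP_HIDE` / `WM_APP_SET_TEXT` IDs are hypothetical names invented for this illustration (only `WM_APP_SHOW` is named in this PR), and the numeric offsets are assumptions.

```python
import queue
import threading

# WM_APP (0x8000) and WM_QUIT (0x0012) are real Win32 constants.
# The offsets below are assumptions made for this sketch only.
WM_APP = 0x8000
WM_QUIT = 0x0012
WM_APP_SHOW = WM_APP + 1
WM_APP_HIDE = WM_APP + 2      # hypothetical
WM_APP_SET_TEXT = WM_APP + 3  # hypothetical


class HudModel:
    """Owns all HUD state; only the HUD thread ever mutates it."""

    def __init__(self, anchor_fn):
        self._inbox = queue.Queue()  # stands in for the Win32 message queue
        self._anchor_fn = anchor_fn  # returns the current (x, y) anchor
        self.visible = False
        self.timer_active = False
        self.text = ""
        self.position = None
        self._thread = threading.Thread(target=self._pump, daemon=True)

    def start(self):
        self._thread.start()

    def post(self, msg, payload=None):
        """Thread-safe and non-blocking, in the spirit of PostMessage."""
        self._inbox.put((msg, payload))

    def stop(self):
        self.post(WM_QUIT)
        self._thread.join()

    def _pump(self):
        # get() blocks while the queue is empty, so nothing is polled.
        while True:
            msg, payload = self._inbox.get()
            if msg == WM_QUIT:
                break
            self._dispatch(msg, payload)

    def _dispatch(self, msg, payload):
        if msg == WM_APP_SHOW:
            # Recompute the anchor on every show instead of caching it.
            self.position = self._anchor_fn()
            self.visible = True
            self.timer_active = True   # heartbeat runs only while visible
        elif msg == WM_APP_HIDE:
            self.visible = False
            self.timer_active = False
        elif msg == WM_APP_SET_TEXT:
            self.text = payload
```

Any thread may call `hud.post(WM_APP_SET_TEXT, "listening")`, but the assignment itself happens on the HUD thread, which is the property the Thread-Safety item relies on.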

… upgrade HUD failure logs to WARNING

- ui.py: Remove redundant threading.Thread(target=self.indicator.start) call in
  TrayApp.run(). _ensure_indicator() in __init__ already started the indicator.
  The second thread could race with the first before _is_running was set, creating
  two concurrent Win32 message pumps on the same HWND and silently killing the HUD.
- hud_renderer.py: Remove orphaned ('VISIBILITY', ...) puts from show() and hide().
  ShowWindow() already handles visibility directly; the queue item was never consumed
  by _wnd_proc and served no purpose.
- hud_renderer.py: Upgrade 'HUD Run loop exited.' log from INFO to WARNING so HUD
  thread death is clearly visible at INFO log level.
- indicator_ui.py: Upgrade HudOverlay fallback log messages from INFO to WARNING so
  silent degradation to GdiFallbackWindow is immediately visible in the log.
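
The commit above fixes the double start by removing the redundant call. As a complementary illustration of why the race was possible, here is a minimal sketch of a start guard that stays safe even if two threads call `start()` at once. It is not the change made in this PR: the `Indicator` class body, the lock, and `pumps_started` are assumptions for the example (only the `_is_running` flag name comes from the commit message).

```python
import threading


class Indicator:
    """Hypothetical stand-in for the HUD indicator."""

    def __init__(self):
        self._lock = threading.Lock()
        self._is_running = False
        self.pumps_started = 0  # counts message pumps, for demonstration

    def start(self):
        """Start the message pump at most once, even under concurrent calls."""
        with self._lock:
            # Check and set under the same lock, so a second caller cannot
            # slip in between the test and the assignment.
            if self._is_running:
                return False
            self._is_running = True
        self._run_pump()
        return True

    def _run_pump(self):
        # Placeholder for the real Win32 message loop.
        self.pumps_started += 1
```

Without the lock, two callers can both observe `_is_running == False` before either sets it, which is the window the commit message describes.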

After sleep/lock/unlock, the layered window bitmap goes stale and the
Win32 timer (WM_TIMER) can die. ShowWindow makes the HWND visible but
with no fresh content, leaving it transparent.

Fix: In show(), queue a REDRAW event to force changed=True on the next
timer tick, and re-register the 50ms SetTimer to resurrect it if it
died during a long session lock. The previous VISIBILITY queue items
served this same purpose (forcing changed=True) but were removed in
the prior commit under the assumption they were dead code.

Also stores _refresh_rate_ms as an instance variable so show() can
access it for the SetTimer call.
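
The fix described above can be modeled in a few lines. This is a platform-independent sketch under stated assumptions: `HudState`, `on_timer`, `timer_interval_ms`, and `redraw_count` are hypothetical names, and a `None` interval stands in for a Win32 timer that has died. In real Win32 code, calling `SetTimer` again with the same window and timer ID replaces the existing timer, which is what makes re-registering it safe.

```python
import queue


class HudState:
    """Hypothetical model of the show()/timer interaction."""

    def __init__(self, refresh_rate_ms=50):
        # Stored on the instance so show() can re-arm the timer later.
        self._refresh_rate_ms = refresh_rate_ms
        self._events = queue.Queue()
        self.timer_interval_ms = None  # None models a dead timer
        self.redraw_count = 0

    def show(self):
        # Queue a REDRAW so the next tick sees changed=True.
        self._events.put(("REDRAW", None))
        # Re-arm the timer; harmless if it is still alive.
        self.timer_interval_ms = self._refresh_rate_ms

    def on_timer(self):
        """One timer tick: drain pending events and redraw if any arrived."""
        changed = False
        while True:
            try:
                self._events.get_nowait()
            except queue.Empty:
                break
            changed = True
        if changed:
            self.redraw_count += 1
        return changed
```

A tick with an empty queue does no drawing work, so the REDRAW item is what guarantees fresh content after the window becomes visible again.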

Added targeted diagnostics to HudOverlay to pinpoint why the window
remains invisible after long sleep/lock cycles:
- Log GetLastError() if UpdateLayeredWindow fails (Fix A)
- Log HudOverlay.show() calls at DEBUG level (Fix B)
- Log WM_TIMER periodic heartbeat every 60s at DEBUG level (Fix C)

- show(): log at INFO, including the immediate IsWindowVisible() result
  after ShowWindow, to confirm the effect of the Win32 call.
- hide(): log at INFO so hide/show pairs are traceable in the log.
- WM_TIMER heartbeat: upgrade from DEBUG to INFO so we can confirm
  the message loop is alive in normal (non-verbose) operation.
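
Because the timer fires every 50ms, the heartbeat log has to be throttled to one line per 60s. A minimal sketch of such a throttle follows. The `Heartbeat` class, the logger name, and the log text are all hypothetical; the injectable `clock` parameter exists only so the behavior can be checked without waiting.

```python
import logging
import time

logger = logging.getLogger("hud")  # logger name is an assumption


class Heartbeat:
    """Emit at most one liveness log line per interval."""

    def __init__(self, interval_s=60.0, clock=time.monotonic):
        self._interval_s = interval_s
        self._clock = clock
        self._last = None  # time of the last emitted heartbeat

    def tick(self):
        """Call on every timer tick; returns True when a line was logged."""
        now = self._clock()
        if self._last is not None and now - self._last < self._interval_s:
            return False
        self._last = now
        # Hypothetical message text, not the string used in the codebase.
        logger.info("HUD WM_TIMER heartbeat: message loop alive")
        return True
```

Using a monotonic clock keeps the interval correct even if the wall clock jumps, for example after a resume from sleep.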

Root cause confirmed by logs (2026-03-26 10:30):
  show() called at 10:30:26
  hide() called at 10:30:27 (linger A fires, same session_id)
  hide() called at 10:30:28 (linger B fires, same session_id)

Multiple callers of _start_linger_timer() (click-away, CONNECTING state,
update_status_icon) all captured the same _session_id=N, so every one
of their timers would fire and hide the HUD — even immediately after a
valid show() call for the new reconnect.

Fix: _start_linger_timer() now increments _session_id BEFORE capturing
it as session_at_start. Each linger timer therefore gets a unique,
monotonically increasing ID. Any earlier pending timer sees a stale ID
and skips the hide(). Only the most recently scheduled linger can fire.
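
The increment-before-capture rule can be sketched as follows. This is an illustration under assumptions, not the code from this PR: `LingerScheduler`, `on_expire`, and `hide_calls` are hypothetical names, and the method returns the expiry callback instead of starting a real timer so the ordering can be checked deterministically.

```python
class LingerScheduler:
    """Hypothetical model of the linger-timer bookkeeping."""

    def __init__(self):
        self._session_id = 0
        self.visible = True
        self.hide_calls = 0

    def start_linger_timer(self):
        """Return the callback a real timer would invoke after the delay."""
        # Increment BEFORE capturing, so every timer gets a unique ID.
        self._session_id += 1
        session_at_start = self._session_id

        def on_expire():
            # A newer linger has been scheduled since: this one is stale.
            if session_at_start != self._session_id:
                return False
            self.hide()
            return True

        return on_expire

    def hide(self):
        self.visible = False
        self.hide_calls += 1
```

In a threaded setting the returned callback could be handed to something like `threading.Timer(delay, on_expire).start()`. With capture-then-compare alone, all three callers in the log excerpt would have held the same ID and all would have hidden the HUD.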
@Aalwattar Aalwattar merged commit 986d99b into master Mar 29, 2026
1 check passed
@Aalwattar Aalwattar deleted the fix/hud-disappearance branch March 29, 2026 19:19