Skip to content

fix: prevent TTS temp file collision#5

Open
hakanensari wants to merge 7 commits intoPerIPan:mainfrom
hakanensari:fix/tts-temp-file-collision
Open

fix: prevent TTS temp file collision#5
hakanensari wants to merge 7 commits intoPerIPan:mainfrom
hakanensari:fix/tts-temp-file-collision

Conversation

@hakanensari
Copy link

Summary

  • Use longer mktemp template without .wav extension to prevent name collisions that silently break TTS
  • Update orphan cleanup glob to match new pattern

The previous tts_XXXXXX.wav template could leave a literal file behind on mktemp failure, blocking all future TTS until manually cleaned up.

When setup runs a second time (e.g. app relaunch or resetAndRerun),
`uv venv` fails because the venv directory already exists. This
prevents .setup-complete from being written, so the Python server
never starts. The --clear flag tells uv to replace an existing venv
instead of erroring out.
Add design document for three interaction modes: press-to-talk (existing),
hold-to-talk, and hands-free with silence detection and barge-in support.
Four-phase plan covering hold-to-talk, TTS latency optimization,
hands-free mode with silence detection and barge-in, and polish.
Phases 1 and 2 can run in parallel.
- Increase afplay volume to 4x (Kokoro output is quiet)
- Guard mktemp failure to prevent poison temp files
- Auto-submit fires on every transcription when enabled (no trigger word needed)
- Send plain Enter instead of Cmd+Enter (Cmd+Enter maximizes Ghostty)
- Auto-submit fires on every transcription when enabled (no trigger word needed)
- Send plain Enter with explicit zero modifier flags (fixes Ghostty maximizing on Ctrl+Enter)
- Increase delay to 1s to ensure Control key is released before Enter
- Boost afplay volume to 4x (Kokoro output is naturally quiet)
- Guard mktemp failure to prevent poison temp files
- Remove outdated "say submit or send" hint from menubar
- Use longer mktemp template without .wav extension to avoid name collisions
- Update orphan cleanup glob to match new pattern
Copilot AI review requested due to automatic review settings March 10, 2026 13:07
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR primarily targets reliability of the TTS hook by preventing temp-file name collisions and ensuring orphan cleanup matches the new temp-file pattern. It also includes changes related to auto-submit behavior/input submission and adds hands-free mode design/implementation planning docs.

Changes:

  • Update tts-hook.sh temp file creation to use a safer mktemp template and broaden orphan cleanup to match.
  • Adjust unified server auto-submit behavior and the way the “submit” keypress is synthesized.
  • Add hands-free mode design/implementation planning documentation and tweak setup/UI behavior.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
servers/unified_server.py Changes auto-submit flow and synthesizes a plain Enter keypress with a longer delay.
hooks/tts-hook.sh Fixes mktemp template/collision risk and updates orphan cleanup; adjusts playback invocation.
docs/plans/2026-03-10-hands-free-mode-implementation.md Adds detailed implementation plan for hands-free + hold-to-talk.
docs/plans/2026-03-10-hands-free-mode-design.md Adds design doc outlining modes, architecture, and state machine.
app/Sources/ClaudeWhisperer/SetupManager.swift Modifies venv creation to clear existing environment during setup.
app/Sources/ClaudeWhisperer/MenuBarView.swift Removes auto-submit “say submit/send…” helper text from the UI.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

You can also share your feedback on Copilot code review. Take the survey.

Comment on lines 268 to 272
should_submit = False
if os.path.exists(AUTO_SUBMIT_FLAG):
text, should_submit = check_submit_trigger(text)
text, _ = check_submit_trigger(text)
should_submit = True

Copy link

Copilot AI Mar 10, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should_submit is forced to True whenever the auto-submit flag exists, even when the transcription result is empty/whitespace. This can cause an Enter keypress to be sent with no text (e.g., silence or a failed transcription), potentially submitting an empty message. Consider gating auto-submit on text.strip() (while still calling check_submit_trigger only to strip trigger words when present).

Copilot uses AI. Check for mistakes.
Comment on lines 179 to +209
@@ -187,25 +187,26 @@ def press_cmd_enter():
_cg.CFRelease.argtypes = [ctypes.c_void_p]

kCGSessionEventTap = 1
kCGEventFlagMaskCommand = 0x00100000
kVK_Return = 0x24 # 36

key_down = _cg.CGEventCreateKeyboardEvent(None, kVK_Return, True)
key_up = _cg.CGEventCreateKeyboardEvent(None, kVK_Return, False)
_cg.CGEventSetFlags(key_down, kCGEventFlagMaskCommand)
_cg.CGEventSetFlags(key_up, kCGEventFlagMaskCommand)
# Explicitly clear all modifier flags so held keys (Ctrl, Cmd, etc.)
# don't bleed into the Enter event
_cg.CGEventSetFlags(key_down, 0)
_cg.CGEventSetFlags(key_up, 0)
_cg.CGEventPost(kCGSessionEventTap, key_down)
_cg.CGEventPost(kCGSessionEventTap, key_up)
_cg.CFRelease(key_down)
_cg.CFRelease(key_up)
except Exception:
logger.exception("press_cmd_enter failed")
logger.exception("press_enter failed")


async def _delayed_enter():
await asyncio.sleep(0.3)
await asyncio.sleep(1.0)
loop = asyncio.get_running_loop()
await loop.run_in_executor(None, press_cmd_enter)
await loop.run_in_executor(None, press_enter)
Copy link

Copilot AI Mar 10, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR is titled/described as a TTS temp-file collision fix, but this hunk also changes auto-submit semantics (always submit when the flag exists), changes the keypress behavior from Cmd+Enter to plain Enter, and increases the submit delay from 0.3s to 1.0s. Please either update the PR title/description to reflect these behavior changes or split them into a separate PR to keep the fix scoped.

Copilot uses AI. Check for mistakes.
Comment on lines 142 to 144
if [ -s "$TMPFILE" ]; then
afplay "$TMPFILE" 2>/dev/null
afplay -v 4 "$TMPFILE" 2>/dev/null
fi
Copy link

Copilot AI Mar 10, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

afplay -v 4 hard-codes a much louder playback volume than the default and may cause clipping/distortion (and if afplay enforces a 0.0–1.0 range, it can fail silently due to stderr being redirected). Consider leaving volume at default or making it configurable via an env var/config file with a validated range.

Copilot uses AI. Check for mistakes.
Comment on lines 35 to 39
// Step 1: Create venv with bundled uv
guard runCommand(
Paths.uvBinary.path,
args: ["venv", Paths.venv.path, "--python", "3.13"],
args: ["venv", Paths.venv.path, "--python", "3.13", "--clear"],
step: "Creating Python environment..."
Copy link

Copilot AI Mar 10, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding --clear changes setup behavior by forcibly wiping any existing venv at Paths.venv on every first-launch setup run. Since this is unrelated to the stated PR goal (TTS temp file collision), consider moving it to a separate PR or updating the PR description to include the setup/venv behavior change.

Copilot uses AI. Check for mistakes.
PerIPan added a commit that referenced this pull request Mar 10, 2026
Rename all refs, bundle ID, App Support path, source dir.
Apply PR #1 (uv venv --clear), PR #4 (plain Enter, TTS vol boost,
auto-submit always fires), PR #5 (mktemp hardening).
Fix .gitignore to track source files.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants