fix: auto-submit always fires, plain Enter, TTS volume #4

Open
hakanensari wants to merge 6 commits into PerIPan:main from hakanensari:fix/auto-submit-improvements

@hakanensari

Summary

  • Auto-submit always fires when enabled — no need to say "submit" or "send" at the end
  • Plain Enter instead of Cmd+Enter (fixes Ghostty maximizing the window). Explicitly clears the modifier flags on the CGEvent so a held Control key doesn't bleed into the Enter keypress
  • TTS volume boost to 4x — Kokoro output is naturally quiet
  • mktemp guard — prevents poison temp files when mktemp fails
  • Removed outdated hint "say submit or send at end of phrase" from menubar UI
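The plain-Enter change described above could look roughly like the following. This is a minimal sketch, not the code in servers/unified_server.py: it assumes the pyobjc Quartz bindings, and the function name `press_plain_enter`, the injectable `quartz` and `sleep` parameters, and the choice of event source are illustrative assumptions.

```python
import time

RETURN_KEYCODE = 36  # kVK_Return in the macOS virtual keycode table


def press_plain_enter(quartz=None, delay_s=1.0, sleep=time.sleep):
    """Post an Enter key-down/key-up pair with all modifier flags cleared.

    `quartz` is injectable so the logic can be exercised off macOS;
    by default the real pyobjc Quartz module is imported (macOS only).
    """
    if quartz is None:
        import Quartz as quartz  # assumption: pyobjc-framework-Quartz is installed

    # Wait so that a physically held Control key has time to be released.
    sleep(delay_s)

    source = quartz.CGEventSourceCreate(quartz.kCGEventSourceStateHIDSystemState)
    for key_down in (True, False):
        event = quartz.CGEventCreateKeyboardEvent(source, RETURN_KEYCODE, key_down)
        # A flags value of 0 means no Cmd/Ctrl/Shift/Option on the event.
        quartz.CGEventSetFlags(event, 0)
        quartz.CGEventPost(quartz.kCGHIDEventTap, event)
```

Setting the flags on both the key-down and the key-up event keeps the pair consistent, so the receiving app never sees a modified half of the keypress.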

Test plan

  • Enable auto-submit, record a phrase, verify it auto-submits without a trigger word
  • Verify Ghostty does not maximize on auto-submit
  • Verify TTS playback is audible at comfortable volume

When setup runs a second time (e.g. app relaunch or resetAndRerun),
`uv venv` fails because the venv directory already exists. This
prevents .setup-complete from being written, so the Python server
never starts. The --clear flag tells uv to replace an existing venv
instead of erroring out.

Add design document for three interaction modes: press-to-talk (existing),
hold-to-talk, and hands-free with silence detection and barge-in support.

Four-phase plan covering hold-to-talk, TTS latency optimization,
hands-free mode with silence detection and barge-in, and polish.
Phases 1 and 2 can run in parallel.

- Increase afplay volume to 4x (Kokoro output is quiet)
- Guard mktemp failure to prevent poison temp files
- Auto-submit fires on every transcription when enabled (no trigger word needed)
- Send plain Enter instead of Cmd+Enter (Cmd+Enter maximizes Ghostty)

- Auto-submit fires on every transcription when enabled (no trigger word needed)
- Send plain Enter with explicit zero modifier flags (fixes Ghostty maximizing on Ctrl+Enter)
- Increase delay to 1s to ensure Control key is released before Enter
- Boost afplay volume to 4x (Kokoro output is naturally quiet)
- Guard mktemp failure to prevent poison temp files
- Remove outdated "say submit or send" hint from menubar

Copilot AI review requested due to automatic review settings March 10, 2026 13:03

Copilot AI left a comment

Pull request overview

This PR updates Claude Whisperer’s automation and speech pipeline to make auto-submit unconditional when enabled, adjust the synthetic keypress to plain Enter (avoiding modifier bleed), and improve TTS playback loudness and robustness.

Changes:

  • Auto-submit now always triggers when the auto-submit flag is enabled (no trigger phrase required).
  • Synthetic submit keypress changed from Cmd+Enter to plain Enter via CGEvent, explicitly clearing modifier flags and increasing the delay before pressing Enter.
  • TTS hook adds a mktemp failure guard and attempts a volume boost during afplay playback; menubar hint text is removed; docs for hands-free mode planning/design are added.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| servers/unified_server.py | Switches to plain Enter key event + modifier clearing; makes auto-submit unconditional when enabled. |
| hooks/tts-hook.sh | Adds mktemp failure guard and changes playback invocation to include a volume flag. |
| app/Sources/ClaudeWhisperer/SetupManager.swift | Adjusts uv venv invocation to include --clear. |
| app/Sources/ClaudeWhisperer/MenuBarView.swift | Removes outdated “say submit/send” UI hint. |
| docs/plans/2026-03-10-hands-free-mode-implementation.md | Adds detailed hands-free/hold-to-talk implementation plan. |
| docs/plans/2026-03-10-hands-free-mode-design.md | Adds hands-free/hold-to-talk design doc. |

@@ -266,7 +267,8 @@ async def transcribe(

should_submit = False
if os.path.exists(AUTO_SUBMIT_FLAG):

Copilot AI Mar 10, 2026

With auto-submit enabled, should_submit is set to True unconditionally, even when the transcription result is empty/whitespace. This will still schedule a synthetic Enter keypress and can end up submitting an empty message. Consider gating auto-submit on text.strip() (or similar) so Enter is only sent when there’s something to submit.

Suggested change
- if os.path.exists(AUTO_SUBMIT_FLAG):
+ # Only auto-submit when there is non-whitespace transcription text.
+ if os.path.exists(AUTO_SUBMIT_FLAG) and text.strip():
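
To make the suggested gating concrete, here is a small standalone sketch of the decision as a pure function. The helper name `should_auto_submit` and the flag path are hypothetical; only the `AUTO_SUBMIT_FLAG` existence check and the `text.strip()` test come from the comment above.

```python
import os

# Hypothetical location; the real AUTO_SUBMIT_FLAG path is defined in the server.
AUTO_SUBMIT_FLAG = "/tmp/auto-submit-enabled"


def should_auto_submit(text, flag_path=AUTO_SUBMIT_FLAG):
    """Return True only if auto-submit is enabled AND there is real text.

    An empty or whitespace-only transcription never triggers Enter,
    so an empty message cannot be submitted by accident.
    """
    return os.path.exists(flag_path) and bool(text.strip())
```

Keeping the check in one helper means the Enter keypress is scheduled from a single, easily tested condition.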

Comment on lines 269 to +271
  if os.path.exists(AUTO_SUBMIT_FLAG):
-     text, should_submit = check_submit_trigger(text)
+     text, _ = check_submit_trigger(text)
+     should_submit = True

Copilot AI Mar 10, 2026

Now that auto-submit always fires, calling check_submit_trigger() solely to strip trailing trigger phrases will remove legitimate dictated content whenever it ends with any entry in SUBMIT_TRIGGERS (e.g., "enter", "go ahead", "send"). If the goal is just to drop the legacy "submit/send" habit, consider narrowing the strip list (or making stripping optional) so normal language endings aren’t silently deleted.
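
One way to narrow the strip list, sketched under assumptions: the names `LEGACY_TRIGGERS` and `strip_legacy_trigger` are hypothetical and this is not the existing `check_submit_trigger()` implementation. Only a trailing "submit" or "send" is removed, so endings such as "enter" or "go ahead" are left alone.

```python
import re

# Assumed narrowed list: only the two legacy trigger words.
LEGACY_TRIGGERS = ("submit", "send")

# Match one trigger word at the very end of the text, along with any
# separating whitespace/commas before it and punctuation after it.
_TRAILING_TRIGGER = re.compile(
    r"[\s,]*\b(?:%s)\b[\s.!?]*$" % "|".join(map(re.escape, LEGACY_TRIGGERS)),
    re.IGNORECASE,
)


def strip_legacy_trigger(text):
    """Drop a trailing legacy trigger word; leave all other endings intact."""
    return _TRAILING_TRIGGER.sub("", text).rstrip()
```

A phrase that legitimately ends in "send" (for example "do not send") would still be trimmed by this sketch, which is why making the stripping optional may be the safer choice.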

Comment on lines 36 to 39
  guard runCommand(
      Paths.uvBinary.path,
-     args: ["venv", Paths.venv.path, "--python", "3.13"],
+     args: ["venv", Paths.venv.path, "--python", "3.13", "--clear"],
      step: "Creating Python environment..."

Copilot AI Mar 10, 2026

uv venv is invoked with --clear. The repo’s setup.sh uses uv venv ... --python 3.13 without this flag, and not all uv versions support --clear. If the bundled uv doesn’t recognize it, first-launch setup will fail. Consider explicitly deleting Paths.venv when re-creating the environment (or probing supported flags / uv version) instead of relying on --clear.
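
The delete-then-recreate alternative can be sketched as follows. This is an illustration in Python rather than the Swift in SetupManager.swift; the function name `recreate_venv` and the injectable `run` parameter are assumptions made so the logic is testable without uv installed.

```python
import shutil
import subprocess
from pathlib import Path


def recreate_venv(uv_binary, venv_path, python="3.13", run=subprocess.run):
    """Remove any existing venv directory, then create a fresh one with uv.

    This avoids depending on `--clear`, so it behaves the same on uv
    versions that do not recognize that flag.
    """
    venv = Path(venv_path)
    if venv.exists():
        shutil.rmtree(venv)  # explicit delete instead of `uv venv --clear`

    cmd = [str(uv_binary), "venv", str(venv), "--python", python]
    run(cmd, check=True)  # raises CalledProcessError if uv exits non-zero
    return cmd
```

The trade-off is that the explicit delete happens even if the following `uv venv` call fails, so a failed setup leaves no environment behind rather than a stale one.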

  if [ -s "$TMPFILE" ]; then
-     afplay "$TMPFILE" 2>/dev/null
+     afplay -v 4 "$TMPFILE" 2>/dev/null

Copilot AI Mar 10, 2026

afplay -v expects a normalized volume scalar (typically 0.0–1.0). Passing -v 4 may be clamped or cause playback to fail, which would undermine the intended volume boost. Consider using a supported range for -v and, if you need a true 4× gain, amplify/normalize the WAV data before playback or make the gain configurable with a safe ceiling.

Suggested change
- afplay -v 4 "$TMPFILE" 2>/dev/null
+ # Use configurable volume, clamped to afplay's expected 0.0–1.0 range
+ RAW_VOLUME="${TTS_VOLUME:-1.0}"
+ VOLUME=$(awk -v v="$RAW_VOLUME" 'BEGIN { if (v == "" || v < 0) print 1.0; else if (v > 1.0) print 1.0; else print v }')
+ afplay -v "$VOLUME" "$TMPFILE" 2>/dev/null
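
The same "configurable gain with a safe ceiling" idea, sketched in Python for illustration. The helper names are hypothetical, and the default ceiling of 1.0 simply mirrors the range the review comment assumes for `afplay -v`. Unlike the awk one-liner, negative values are clamped to 0.0 here rather than reset to 1.0.

```python
import os


def playback_volume(raw, default=1.0, ceiling=1.0):
    """Parse a volume setting and clamp it to [0.0, ceiling].

    Missing, non-numeric, or NaN input falls back to `default`.
    """
    try:
        value = float(raw)
    except (TypeError, ValueError):
        return default
    if value != value:  # NaN is the only float not equal to itself
        return default
    return min(max(value, 0.0), ceiling)


def afplay_command(path, env=os.environ, ceiling=1.0):
    """Build the afplay argument list using the TTS_VOLUME setting."""
    volume = playback_volume(env.get("TTS_VOLUME"), ceiling=ceiling)
    return ["afplay", "-v", f"{volume:g}", path]
```

Raising `ceiling` would allow a boost above 1.0 if testing shows the installed afplay honors it, while still bounding what a misconfigured `TTS_VOLUME` can do.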

PerIPan added a commit that referenced this pull request Mar 10, 2026
Rename all refs, bundle ID, App Support path, source dir.
Apply PR #1 (uv venv --clear), PR #4 (plain Enter, TTS vol boost,
auto-submit always fires), PR #5 (mktemp hardening).
Fix .gitignore to track source files.