
feat: enable /voice mode with native audio binaries #92

Merged
claude-code-best merged 1 commit into claude-code-best:main from amDosion:feat/enable-voice-mode
Apr 3, 2026

Conversation

@amDosion (Contributor) commented Apr 3, 2026

Summary

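  • Copy the official cpal-based native audio-capture.node binaries (6 platforms, including Windows x64)
  • Replace the SoX subprocess stub in packages/audio-capture-napi/src/index.ts with a native .node loader
  • scripts/dev.ts + build.ts: add the "VOICE_MODE" compile-time switch to the default features
  • Append a Voice Mode section to DEV-LOG.md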

Why

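The /voice command is not shown, and voice reports "native audio module could not be loaded". Causes:

  1. feature('VOICE_MODE') is false at compile time → the command is never registered
  2. audio-capture-napi is a SoX stub → on Windows it hard-codes return false
  3. The native .node binaries are missing

All voice source code under src/ already matches the official version (0 lines of diff); only the vendor files and the switch need to be added.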
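
To illustrate cause 1, here is a minimal sketch of how a feature switch can gate command registration. It is a runtime stand-in only: in the project the `feature('VOICE_MODE')` check is resolved at compile time, and the names `createFeatureCheck`, `buildCommandList`, and `Command` are hypothetical, not taken from the codebase.

```typescript
// Hypothetical sketch: a runtime stand-in for a compile-time feature switch.
type FeatureName = "VOICE_MODE" | "OTHER_FEATURE";

interface Command {
  name: string;
}

// Builds a feature() lookup from the list of enabled features
// (the role played by the default feature list in build.ts / scripts/dev.ts).
function createFeatureCheck(
  enabled: readonly FeatureName[],
): (name: FeatureName) => boolean {
  const enabledSet = new Set<FeatureName>(enabled);
  return (name) => enabledSet.has(name);
}

// Registers /voice only when the switch is on; when it is off,
// the command never appears in the list at all.
function buildCommandList(feature: (name: FeatureName) => boolean): Command[] {
  const commands: Command[] = [{ name: "/login" }];
  if (feature("VOICE_MODE")) {
    commands.push({ name: "/voice" });
  }
  return commands;
}
```

With an empty feature list, `buildCommandList` returns only `/login`, which mirrors the "command not shown" symptom; adding `"VOICE_MODE"` to the list makes `/voice` appear.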

Test plan

  • bun run dev → /voice command is visible (requires signing in to claude.ai via /login)
  • isNativeAudioAvailable() returns true (Windows x64)
  • bun run build → build output contains the voice code

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features

    • Voice mode is now enabled by default for all builds, activating speech input capabilities
    • Added microphone permission status checking to improve platform compatibility
    • Enhanced audio playback controls for voice features and interactive commands
  • Bug Fixes

    • Restored voice input functionality for executing speech commands and voice-based code
  • Documentation

    • Updated voice mode documentation with recovery procedures and setup guidance

Restore voice input by:
- Copying official cpal-based audio-capture.node binaries (6 platforms)
- Replacing SoX subprocess stub with native .node loader
- Adding VOICE_MODE to default build features

All voice source files in src/ already match the official CLI.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

@coderabbitai bot commented Apr 3, 2026

Walkthrough

This PR enables VOICE_MODE functionality by replacing a non-functional child_process/SoX audio capture implementation with a native N-API module loader. Build configuration defaults are updated to include VOICE_MODE, and platform-specific native binaries with vendor-sourced module loading are introduced.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Build Configuration: `build.ts`, `scripts/dev.ts` | Added VOICE_MODE to default feature flags, ensuring voice functionality is compiled and available by default without explicit environment configuration. |
| Audio Capture Implementation: `packages/audio-capture-napi/src/index.ts` | Replaced child_process-based recording (SoX/arecord) with native module delegation. Removed platform probing and process lifecycle management; added dynamic module loading with fallback paths and new playback surface (startNativePlayback, writeNativePlaybackData, stopNativePlayback, isNativePlaying). Added microphoneAuthorizationStatus() export. |
| Native Module Wrapper: `vendor/audio-capture-src/index.ts` | New module providing centralized native .node addon loading with memoization and multiple fallback paths (env variable, workspace relative, dev/installed layouts). Exports recording/playback control functions and a microphone permission status query that safely delegate to the native addon, or no-op when it is unavailable. |
| Documentation: `DEV-LOG.md` | Documented voice mode restoration including configuration switch, native availability validation on Windows x64, and feature enablement confirmation. |
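
The "safely delegate to or no-op" behaviour described for the wrapper can be sketched as follows. This is an illustration under stated assumptions, not the PR's code: `createAudioApi` and the injected `load` function are hypothetical, and the `Uint8Array` chunk type is a guess (the real callback payload type is not shown on this page). Only the exported function names come from the summary above.

```typescript
// Assumed shape of the native addon; only the function names are from the PR summary.
interface NativeAudioModule {
  startNativeRecording(
    onData: (chunk: Uint8Array) => void,
    onEnd: () => void,
  ): boolean;
  stopNativeRecording(): void;
}

// Hypothetical factory: `load` returns the addon, or null when it cannot be loaded.
function createAudioApi(load: () => NativeAudioModule | null) {
  return {
    // True only when the addon actually loaded.
    isNativeAudioAvailable(): boolean {
      return load() !== null;
    },
    // Delegates to the addon; reports failure instead of throwing when it is missing.
    startNativeRecording(
      onData: (chunk: Uint8Array) => void,
      onEnd: () => void,
    ): boolean {
      const mod = load();
      if (mod === null) return false;
      return mod.startNativeRecording(onData, onEnd);
    },
    // No-op when the addon is missing.
    stopNativeRecording(): void {
      load()?.stopNativeRecording();
    },
  };
}
```

Injecting `load` keeps the wrapper testable without a real `.node` binary: a caller can pass `() => null` to exercise the unavailable path.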

Sequence Diagram(s)

sequenceDiagram
    participant App as Application
    participant Wrapper as audio-capture-src<br/>(Wrapper Module)
    participant Loader as Module Loader
    participant Native as audio-capture.node<br/>(Native Addon)
    
    App->>Wrapper: startNativeRecording(onData, onEnd)
    Wrapper->>Loader: loadModule()
    Loader->>Loader: Check env AUDIO_CAPTURE_NODE_PATH
    alt Path provided
        Loader->>Native: require(envPath)
    else Fallback to vendor paths
        Loader->>Native: require(platform/arch path)
    end
    Native-->>Loader: Module loaded & cached
    Loader-->>Wrapper: cachedModule
    Wrapper->>Native: startNativeRecording()
    Native-->>Wrapper: true (started)
    Wrapper-->>App: true (started)
    
    Native->>Native: Capture audio frames
    Native->>Wrapper: onData(audioBuffer)
    Wrapper->>App: onData(audioBuffer)
    
    App->>Wrapper: stopNativeRecording()
    Wrapper->>Native: stopNativeRecording()
    Native->>Wrapper: onEnd()
    Wrapper->>App: onEnd()
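
The loading half of the diagram (try the env-provided path, fall back to vendor paths, cache the result) could look roughly like the sketch below. `createLoader` and its parameters are hypothetical; the require function is injected so the sketch runs without a real addon. Two behaviours here are assumptions rather than facts about the PR: a failing env path falls through to the fallbacks, and a failed load is memoized as well as a successful one.

```typescript
// Hypothetical memoized loader; `requireFn` stands in for Node's require().
type RequireFn = (path: string) => unknown;

function createLoader<T>(
  requireFn: RequireFn,
  envPath: string | undefined, // e.g. the value of AUDIO_CAPTURE_NODE_PATH
  fallbacks: readonly string[],
): () => T | null {
  let cached: T | null = null;
  let attempted = false;

  return () => {
    // Memoize: every candidate is probed at most once per loader.
    if (attempted) return cached;
    attempted = true;

    // The env-provided path, when set, is tried before the vendor fallbacks.
    const candidates = envPath ? [envPath, ...fallbacks] : [...fallbacks];
    for (const candidate of candidates) {
      try {
        cached = requireFn(candidate) as T;
        return cached;
      } catch {
        // Candidate missing or not loadable on this platform: try the next one.
      }
    }
    return null;
  };
}
```

In a Node or Bun process the loader would be created with the real `require` and `process.env.AUDIO_CAPTURE_NODE_PATH`; both are left out of the block so that it stays free of runtime-specific globals.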

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

🐰 Hop, hop! The voice module's alive,
Native bindings help it thrive,
From child processes we've broken free,
Now audio capture works, you see!
Microphones and speakers sing,
What native code can bring! 🎙️

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 10.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title 'feat: enable /voice mode with native audio binaries' directly and accurately summarizes the main changes: enabling the /voice feature through native audio binaries, which is the core objective of the PR. |

@amDosion (Contributor, Author) commented Apr 3, 2026

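image
Hold the space bar to start recording. Note: "Chinese" is not yet supported as a voice dictation language; English will be used for now. You can change this via /config.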

@coderabbitai coderabbitai bot left a comment

🧹 Nitpick comments (1)
vendor/audio-capture-src/index.ts (1)

58-61: Missing process.cwd() fallback compared to package version.

The package version at packages/audio-capture-napi/src/index.ts includes a third fallback path using process.cwd():

`${process.cwd()}/vendor/audio-capture/${platformDir}/audio-capture.node`

This fallback is documented as necessary "when loaded from a workspace package." If this vendor loader is intended to work in dev/workspace scenarios, consider adding the same fallback for consistency.

💡 Proposed fix to add process.cwd() fallback
   const fallbacks = [
     `./vendor/audio-capture/${platformDir}/audio-capture.node`,
     `../audio-capture/${platformDir}/audio-capture.node`,
+    `${process.cwd()}/vendor/audio-capture/${platformDir}/audio-capture.node`,
   ]
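
For reference, the fallback list with the proposed third entry can be written as a pure function, which makes the resulting paths easy to check per platform. This is a sketch: `nativeAddonCandidates` is a hypothetical name, and deriving `platformDir` as `${arch}-${platform}` is inferred from the vendor directory names in this PR (for example `x64-win32`), not copied from the source.

```typescript
// Hypothetical helper; arch/platform/cwd are passed in rather than read from `process`.
function nativeAddonCandidates(
  arch: string,
  platform: string,
  cwd: string,
): string[] {
  // Assumption: directory names follow "<arch>-<platform>", e.g. "arm64-darwin".
  const platformDir = `${arch}-${platform}`;
  return [
    `./vendor/audio-capture/${platformDir}/audio-capture.node`,
    `../audio-capture/${platformDir}/audio-capture.node`,
    // Workspace-aware fallback suggested in the review comment above.
    `${cwd}/vendor/audio-capture/${platformDir}/audio-capture.node`,
  ];
}
```

A caller in Node or Bun would pass `process.arch`, `process.platform`, and `process.cwd()`.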
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@vendor/audio-capture-src/index.ts` around lines 58 - 61, The fallback list in
vendor/audio-capture-src/index.ts (the const fallbacks using platformDir) is
missing the workspace-aware fallback; add the same third entry used in
packages/audio-capture-napi/src/index.ts — a path built with process.cwd() like
`${process.cwd()}/vendor/audio-capture/${platformDir}/audio-capture.node` — so
the loader will resolve when run from a workspace/dev environment.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 6867bfe6-839f-4eea-bc5b-c120ca693948

📥 Commits

Reviewing files that changed from the base of the PR and between 29db9d9 and 7ae9432.

📒 Files selected for processing (11)
  • DEV-LOG.md
  • build.ts
  • packages/audio-capture-napi/src/index.ts
  • scripts/dev.ts
  • vendor/audio-capture-src/index.ts
  • vendor/audio-capture/arm64-darwin/audio-capture.node
  • vendor/audio-capture/arm64-linux/audio-capture.node
  • vendor/audio-capture/arm64-win32/audio-capture.node
  • vendor/audio-capture/x64-darwin/audio-capture.node
  • vendor/audio-capture/x64-linux/audio-capture.node
  • vendor/audio-capture/x64-win32/audio-capture.node

@claude-code-best (Owner) commented

@amDosion One question: many of us do not have access to the official Anthropic endpoint. Could we check the base URL and use it to decide whether to go through an adapter?

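@amDosion (Contributor, Author) commented Apr 3, 2026

> @amDosion One question: many of us do not have access to the official Anthropic endpoint. Could we check the base URL and use it to decide whether to go through an adapter?

You can use a compatible endpoint; many Chinese models support the official Anthropic endpoint format. The alternative is to modify the client, which would allow switching to any model. The client performs internal validation (this is visible in the code). Someone has already made this modification, but I have not tested it yet.
image
Is it something like this?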

@amDosion (Contributor, Author) commented Apr 3, 2026

image

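@claude-code-best (Owner) commented

I think modifying the client is the better option: abstract a compatibility layer that can be swapped out for other implementations. Someone also just opened an issue saying that Bailian does not support Anthropic caching, and there is nothing we can do about that.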
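
As a rough illustration of the idea discussed in this thread (pick an implementation behind a compatibility layer based on the configured base URL), consider the sketch below. Everything in it is hypothetical: `ModelAdapter`, `selectAdapter`, and the `supportsPromptCaching` flag are invented for illustration, and treating every non-official host as lacking prompt caching is a conservative assumption prompted by the Bailian report, not a documented fact.

```typescript
// Hypothetical compatibility-layer interface.
interface ModelAdapter {
  readonly name: "official" | "compatible";
  readonly supportsPromptCaching: boolean;
}

const OFFICIAL_HOST = "api.anthropic.com";

// Extracts the lower-cased host from an http(s) URL, or null if it does not parse.
function hostOf(baseUrl: string): string | null {
  const match = /^https?:\/\/([^/:?#]+)/i.exec(baseUrl.trim());
  return match?.[1]?.toLowerCase() ?? null;
}

// No base URL configured, or the official host: use the official adapter.
// Anything else goes through the compatible adapter with caching disabled.
function selectAdapter(baseUrl: string | undefined): ModelAdapter {
  if (baseUrl === undefined || hostOf(baseUrl) === OFFICIAL_HOST) {
    return { name: "official", supportsPromptCaching: true };
  }
  return { name: "compatible", supportsPromptCaching: false };
}
```

Comparing the parsed host exactly, rather than using a prefix match on the whole URL, avoids treating a look-alike host such as `api.anthropic.com.example.org` as official.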

@claude-code-best claude-code-best merged commit 1314650 into claude-code-best:main Apr 3, 2026
2 of 3 checks passed
2 participants