Add dictation command processing for voice input #1874

Open

liketheduck wants to merge 2 commits into futo-org:master from liketheduck:feature/dictation-commands

liketheduck commented Feb 11, 2026

Summary

Adds a post-processing step to the voice input pipeline that converts spoken command phrases into their corresponding characters and formatting. When enabled, phrases like "new line", "dollar sign", "caps on", and "open parenthesis" are replaced with the appropriate output before text is committed to the input field.

Covers 71 commands across 8 categories: formatting, capitalization, punctuation, symbols, math, currency, emoticons, and intellectual property marks. Each category can be toggled independently. The feature is enabled by default and lives on its own settings sub-page under Voice Input.

Co-authored-by: Claude Code (claude-opus-4-6)

Changes

DictationCommandProcessor.kt -- Core processor. Keeps no state between calls; mode state (caps on/off, no space on/off) is tracked only within a single invocation. Handles multi-word command matching (up to 4 words, longest match first), numeral/roman numeral conversion, and per-command spacing rules. Tolerates Whisper auto-punctuation on command words.
VoiceInputAction.kt -- Wired into the voice input pipeline. Sanitizer runs first to clean Whisper output, then the processor handles command substitution.
VoiceInputSettingKeys.kt -- 9 DataStore preference keys (master toggle + 8 categories).
VoiceInput.kt / SettingsNavigator.kt -- Settings UI. Single navigation item on the Voice Input page opens a dedicated Dictation Commands sub-page with all toggles.
strings-uix.xml -- 18 string resources for setting titles and subtitles.
build.gradle -- Added java/test as a JVM unit test source set.
DictationCommandProcessorTest.kt -- 112 unit tests covering all command categories, stateful interactions, edge cases, Whisper punctuation tolerance, and cursor-context spacing.
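
To illustrate the "up to 4 words, longest match first" matching described for DictationCommandProcessor.kt, here is a minimal sketch. It is not the PR's implementation: the command table, `normalize`, and `replaceCommands` are hypothetical names, and the real processor also handles modes, numerals, and spacing rules that are omitted here.

```kotlin
// Hypothetical command table for illustration only; the PR's table is not shown in this thread.
val COMMANDS: Map<String, String> = mapOf(
    "new line" to "\n",
    "dollar sign" to "\$",
    "open parenthesis" to "(",
    "period" to ".",
)

// Longest phrase, in words, that is tried as a command (per the PR description).
const val MAX_COMMAND_WORDS = 4

// Lowercase a token and drop trailing punctuation that Whisper may have appended.
fun normalize(token: String): String =
    token.lowercase().trimEnd('.', ',', '!', '?', ';', ':')

// Splits the text into tokens and returns a list of segments:
// matched commands are replaced by their output, other tokens pass through unchanged.
fun replaceCommands(text: String): List<String> {
    val tokens = text.split(Regex("\\s+")).filter { it.isNotEmpty() }
    val out = mutableListOf<String>()
    var i = 0
    while (i < tokens.size) {
        var matched = false
        // Try the widest window first so a longer command wins over a shorter one.
        for (len in minOf(MAX_COMMAND_WORDS, tokens.size - i) downTo 1) {
            val phrase = tokens.subList(i, i + len).joinToString(" ") { normalize(it) }
            val replacement = COMMANDS[phrase]
            if (replacement != null) {
                out += replacement
                i += len
                matched = true
                break
            }
        }
        if (!matched) {
            out += tokens[i]
            i += 1
        }
    }
    return out
}
```

Under these assumptions, `replaceCommands("Hello new line. world")` yields the segments `"Hello"`, `"\n"`, `"world"`: the trailing period on "line." is ignored for matching purposes.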

Design decisions

Runs after ModelOutputSanitizer so that formatting characters (\n, \t) are not destroyed by trim().
Strips trailing Whisper auto-punctuation from words during matching, but only strips it from the output buffer when the preceding token was a regular word (not a command). This preserves explicit command sequences like "comma new line" while cleaning up Whisper artifacts like "hello, new line, world".
Matches the existing codebase style: Kotlin object singleton, @JvmStatic entry point, DataStore settings with SettingsKey pattern.
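
The trailing-punctuation rule above can be sketched as follows. This is one reading of the description, not the PR's code: `DEMO_COMMANDS`, `AUTO_PUNCTUATION`, and `processWithPunctuationRule` are hypothetical, matching is limited to two words for brevity, and spacing is reduced to "one space between regular words".

```kotlin
// Hypothetical mini command table; the real processor covers many more commands.
val DEMO_COMMANDS: Map<String, String> = mapOf(
    "new line" to "\n",
    "comma" to ",",
    "period" to ".",
)

// Characters treated as Whisper auto-punctuation in this sketch (an assumption).
val AUTO_PUNCTUATION = charArrayOf('.', ',', '!', '?')

fun processWithPunctuationRule(text: String): String {
    val tokens = text.split(" ").filter { it.isNotEmpty() }
    val out = StringBuilder()
    var previousWasCommand = false
    var i = 0
    while (i < tokens.size) {
        var replacement: String? = null
        var used = 0
        // Match on normalized tokens: lowercase, trailing punctuation ignored.
        for (len in minOf(2, tokens.size - i) downTo 1) {
            val phrase = tokens.subList(i, i + len)
                .joinToString(" ") { it.lowercase().trimEnd(*AUTO_PUNCTUATION) }
            replacement = DEMO_COMMANDS[phrase]
            if (replacement != null) { used = len; break }
        }
        if (replacement != null) {
            // Only clean the buffer when the previous token was a regular word,
            // so punctuation emitted by an explicit command survives.
            if (!previousWasCommand) {
                while (out.isNotEmpty() && out.last() in AUTO_PUNCTUATION) {
                    out.setLength(out.length - 1)
                }
            }
            out.append(replacement)
            previousWasCommand = true
            i += used
        } else {
            if (out.isNotEmpty() && !out.last().isWhitespace()) out.append(' ')
            out.append(tokens[i])
            previousWasCommand = false
            i += 1
        }
    }
    return out.toString()
}
```

With this sketch, "hello, new line, world" becomes "hello\nworld" (the Whisper comma after "hello" is dropped), while "comma new line" becomes ",\n" (the comma came from a command, so it is kept). The ordering decision follows from the same concern: Kotlin's `String.trim()` removes leading and trailing whitespace, including `\n` and `\t`, so trimming after command substitution would discard a trailing "new line" result.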

Test plan

112 JVM unit tests pass via ./gradlew testUnstableDebugUnitTest
Install APK and verify voice commands produce expected output (new line, period, dollar sign, caps on/off, etc.)
Verify settings sub-page renders correctly and toggles persist
Verify disabling the master toggle passes transcription through unchanged
Verify feature does not affect voice input when "Use system voice input" is enabled

futo-cla-pr-labler bot commented Feb 11, 2026

Please sign our contributor license agreement at https://cla.futo.org

futo-cla-pr-labler bot added the CLA-not-signed label

liketheduck force-pushed the feature/dictation-commands branch from 1e707df to 37c69d0 (February 11, 2026 01:55)

liketheduck (author) commented Feb 11, 2026

> Please sign our contributor license agreement at https://cla.futo.org

It has been signed

futo-cla-pr-labler bot added CLA-signed and removed CLA-not-signed labels


Add dictation command processor for voice input

2a09837

Adds a post-processing step to the voice input pipeline that converts
spoken command phrases (e.g. "new line", "dollar sign", "caps on") into
their corresponding characters and formatting. Covers 69 commands across
10 categories, each independently toggleable in a new Dictation Commands
settings sub-page under Voice Input. Includes 108 JVM unit tests.

liketheduck force-pushed the feature/dictation-commands branch from 95b8221 to 2a09837 (February 11, 2026 02:07)

Zvonimir-FUTO requested a review from abb128

February 11, 2026 07:53

Zvonimir-FUTO added Enhancement Voice Input labels

Zvonimir-FUTO assigned abb128


Add exclamation point and exclamation as command aliases

b529f19

Whisper often transcribes spoken "exclamation point" as "Exclamation."
with capitalization and trailing period. The existing punctuation
tolerance handles both; this adds the missing command aliases.
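
As a rough illustration of why aliases plus punctuation tolerance cover the "Exclamation." case, consider the sketch below. The alias list and `lookupCommand` are hypothetical; in particular, "exclamation mark" as the pre-existing spoken form is an assumption, not something stated in this commit.

```kotlin
// Hypothetical alias list: several spoken forms map to the same output character.
// "exclamation mark" is assumed here; the commit only names the two added aliases.
val EXCLAMATION_ALIASES = listOf("exclamation mark", "exclamation point", "exclamation")

val ALIAS_OUTPUT: Map<String, String> = EXCLAMATION_ALIASES.associateWith { "!" }

// Normalizes a transcribed phrase (case and trailing punctuation) before lookup.
fun lookupCommand(spoken: String): String? {
    val key = spoken.trim().lowercase().trimEnd('.', ',', '!', '?')
    return ALIAS_OUTPUT[key]
}
```

Here `lookupCommand("Exclamation.")` and `lookupCommand("exclamation point")` both return "!", while an unrelated word such as "hello" returns null.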

Labels

CLA-signed Enhancement Voice Input