Add streaming voice command support #35

zehnm · 2025-10-31T16:36:21Z

Support sending voice commands with the RemoteVoice* messages. Audio data needs to be 16-bit PCM, mono, 8000 Hz.
The Android TV Remote service appears to support audio configuration in the RemoteVoiceBegin message, but no information could be found so far.
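For anyone preparing a test file, the format requirement above can be checked with the standard library before sending anything. This is only a sketch: the helper name is_voice_compatible and the constants are made up for illustration and are not part of the library or the demo app.

```python
import wave

# Format stated in the PR description: 16-bit PCM, mono, 8000 Hz.
SAMPLE_WIDTH_BYTES = 2  # 16-bit samples
CHANNELS = 1            # mono
SAMPLE_RATE_HZ = 8000


def is_voice_compatible(wav_source) -> bool:
    """Return True if a WAV file (path or binary file object) matches the format.

    Hypothetical helper for illustration, not a library function.
    """
    with wave.open(wav_source, "rb") as wav:
        return (
            wav.getsampwidth() == SAMPLE_WIDTH_BYTES
            and wav.getnchannels() == CHANNELS
            and wav.getframerate() == SAMPLE_RATE_HZ
            and wav.getcomptype() == "NONE"  # uncompressed PCM
        )

# Example call (the file name is a placeholder):
#   is_voice_compatible("command.wav")
```

A file that fails this check would need to be resampled or converted first; the sketch does not do that.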

Add voice features with PyAudio to the demo application to easily test voice commands:

- Record and stream voice from the default audio input
- Record and play back a local WAV file
- Send a pre-recorded WAV file as a voice command

No new dependencies are required for the library, only for the demo app.

The following sources provided helpful information for implementing the voice command feature:

The maximum audio chunk size is set to 20 KB, as reported in the above links. In tests with an Nvidia Shield TV, the minimum chunk size appeared to be around 8 KB.
The library handles chunking automatically: if the client sends a larger chunk, it is split into messages of at most 20 KB, and chunks smaller than 8 KB are zero-padded.
This logic could be enhanced in the future with an internal queue.
The demo app shows how to send a pre-recorded WAV file and how to stream from the default audio input (usually the microphone) using chunk handling.
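The split-and-pad rule described above can be sketched roughly as follows. This is not the code from the PR: the function name split_voice_chunk and the constants are hypothetical, and "20 KB" / "8 KB" are assumed to mean 20 * 1024 and 8 * 1024 bytes.

```python
# Assumption: KB is read as 1024 bytes; the PR text does not say which unit is meant.
MAX_CHUNK_SIZE = 20 * 1024
MIN_CHUNK_SIZE = 8 * 1024


def split_voice_chunk(data: bytes) -> list[bytes]:
    """Split audio data into chunks of at most MAX_CHUNK_SIZE bytes.

    A chunk shorter than MIN_CHUNK_SIZE is padded with zero bytes,
    which is silence for signed 16-bit PCM.
    """
    chunks = []
    for start in range(0, len(data), MAX_CHUNK_SIZE):
        chunk = data[start:start + MAX_CHUNK_SIZE]
        if len(chunk) < MIN_CHUNK_SIZE:
            chunk = chunk.ljust(MIN_CHUNK_SIZE, b"\x00")
        chunks.append(chunk)
    return chunks
```

With this rule only the last piece of an oversized chunk can end up padded, since every earlier piece is exactly the maximum size. An internal queue, as mentioned above, could instead carry a short remainder over to the next call rather than padding it.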

I'm not a Python expert and tried to follow the existing library code. Let me know if things need to be changed, for example:

- AndroidTVRemote constructor: the new enable_voice parameter defaults to False to keep the same behaviour as before. When set to True, it influences RemoteProtocol._active_features and changes the Android TV device behaviour when sending KEYCODE_SEARCH.
- The demo application could be reduced to just the streaming voice command, without the additional record / play / send commands.


tronikos

Thanks! This is a great addition. A couple of small comments.

tronikos · 2025-10-31T19:21:02Z

src/androidtvremote2/remote.py

        self._send_message(msg)

-    def _handle_message(self, raw_msg: bytes) -> None:  # noqa: PLR0912
+    async def start_voice(self, timeout: float = VOICE_SESSION_TIMEOUT) -> int:


I think there is a potential race condition if this is called concurrently by multiple tasks because of the self._on_voice_begin future. Add a lock to ensure that only one voice session can be initiated at a time?

Makes sense, will fix it.

I've added a lock that prevents clients from creating a new voice session as long as the previous session hasn't been established. asyncio.TimeoutError is raised if a session is still being established. I couldn't find a better exception, but I think it's OK and it simplifies client exception handling.

Tested with the demo app by creating two _stream_voice tasks when pressing the v key.
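For readers following the thread, the locking idea can be sketched like this. It is an illustration under assumptions, not the merged code: the class VoiceSessionStarter, the on_voice_begin callback, and the default timeout are all hypothetical, and the real implementation may wait on the lock instead of failing immediately.

```python
import asyncio
from typing import Optional


class VoiceSessionStarter:
    """Hypothetical stand-in for the part of the protocol that opens a voice session."""

    def __init__(self) -> None:
        self._lock = asyncio.Lock()
        self._on_voice_begin: Optional[asyncio.Future] = None

    async def start_voice(self, timeout: float = 2.0) -> int:
        # Refuse to start while another session is still being established,
        # so the shared future below is never overwritten by a second caller.
        if self._lock.locked():
            raise asyncio.TimeoutError("voice session is already being established")
        async with self._lock:
            self._on_voice_begin = asyncio.get_running_loop().create_future()
            try:
                # Wait until the device confirms the session or the timeout expires.
                return await asyncio.wait_for(self._on_voice_begin, timeout)
            finally:
                self._on_voice_begin = None

    def on_voice_begin(self, session_id: int) -> None:
        # Assumed to be called when the device's begin message arrives.
        if self._on_voice_begin is not None and not self._on_voice_begin.done():
            self._on_voice_begin.set_result(session_id)
```

A client then only has to catch asyncio.TimeoutError for both cases: the device did not answer in time, or another session was still being set up.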

tronikos · 2025-10-31T19:28:01Z

src/androidtvremote2/remote.py

+        # Check if future completed successfully
+        if future.done():
+            if future.exception():
+                raise ConnectionClosed(future.exception())


Something like raise ConnectionClosed("...") from future.exception() (replace "...") might be better

I always learn something new, didn't know that Python construct :-) I will change it.
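A small sketch of the suggested construct, for reference. The ConnectionClosed class here is a local stand-in for the library's exception, the helper name result_or_raise is made up, and the message text is a placeholder since the review comment leaves it open.

```python
import asyncio


class ConnectionClosed(Exception):
    """Local stand-in for the library's exception of the same name."""


def result_or_raise(future: asyncio.Future):
    """Return the result of a completed future, or raise ConnectionClosed.

    Assumes the future is already done and was not cancelled.
    """
    exc = future.exception()
    if exc is not None:
        # "from exc" stores the original error in __cause__, so the traceback
        # shows both exceptions instead of only a stringified copy.
        raise ConnectionClosed("placeholder message") from exc
    return future.result()
```

Compared with passing the exception as the constructor argument, chaining keeps the original exception object and its traceback available to callers and to logs.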

This code block was actually copied from PairingProtocol::_async_wait_for_future_or_con_lost and I forgot to ask about it in the PR. I only slightly modified it by adding a timeout and returning the future's response.
I think it would make sense to move _async_wait_for_future_or_con_lost and _raise_if_not_connected to the ProtobufProtocol base class, so it can be used by RemoteProtocol and PairingProtocol. What prevented me from doing so was the pairing-specific log message "Connection has been lost, cannot pair" in _raise_if_not_connected.

If you like, I can move the methods into ProtobufProtocol with an optional timeout parameter (timeout: float | None = None) and generic "Connection has been lost" log message.
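Roughly, the shared helper proposed here could look like the sketch below. It is a guess at the shape, not the library's code: ProtocolBase stands in for ProtobufProtocol, the method is written without the leading underscore, and ConnectionClosed is redefined locally so the example is self-contained.

```python
import asyncio
from typing import Any, Optional


class ConnectionClosed(Exception):
    """Local stand-in for the library's exception of the same name."""


class ProtocolBase:
    """Hypothetical base class holding the wait logic shared by both protocols."""

    def __init__(self, on_con_lost: asyncio.Future) -> None:
        self._on_con_lost = on_con_lost

    async def wait_for_future_or_con_lost(
        self, future: asyncio.Future, timeout: Optional[float] = None
    ) -> Any:
        # Return as soon as either the response or the connection-lost future completes.
        await asyncio.wait(
            {future, self._on_con_lost},
            timeout=timeout,
            return_when=asyncio.FIRST_COMPLETED,
        )
        if future.done():
            exc = future.exception()
            if exc is not None:
                raise ConnectionClosed("Connection has been lost") from exc
            return future.result()
        if self._on_con_lost.done():
            raise ConnectionClosed("Connection has been lost")
        # Neither future completed, so the optional timeout must have expired.
        raise asyncio.TimeoutError
```

With timeout=None the call waits indefinitely, which would match the existing pairing behaviour, while the voice code could pass its session timeout.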

PR feedback

tronikos requested changes Oct 31, 2025

fixup! Add streaming voice command support

13cdad7

PR feedback

zehnm requested a review from tronikos November 1, 2025 09:54

tronikos approved these changes Nov 1, 2025

tronikos merged commit 5cb83bd into tronikos:main Nov 1, 2025
3 checks passed
