@zehnm (Contributor) commented Oct 31, 2025

Support sending voice commands with the RemoteVoice* messages. Audio data needs to be 16-bit PCM, mono, 8000 Hz.
The Android TV Remote service appears to support audio configuration in the RemoteVoiceBegin message, but no information could be found so far.
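
For illustration, a minimal sketch of how a client might check that a WAV file matches the required format before sending it. It uses only the standard-library `wave` module; the function name `is_voice_compatible` and the constants are hypothetical and not part of the library.

```python
import io
import wave

# Format stated in the PR description: 16-bit PCM, mono, 8000 Hz.
SAMPLE_WIDTH_BYTES = 2
CHANNELS = 1
SAMPLE_RATE_HZ = 8000


def is_voice_compatible(wav_file) -> bool:
    """Return True if the WAV data is uncompressed 16-bit PCM, mono, 8000 Hz.

    `wav_file` may be a path or a binary file-like object.
    """
    with wave.open(wav_file, "rb") as wav:
        return (
            wav.getsampwidth() == SAMPLE_WIDTH_BYTES
            and wav.getnchannels() == CHANNELS
            and wav.getframerate() == SAMPLE_RATE_HZ
            and wav.getcomptype() == "NONE"
        )


def make_silent_wav(rate: int, channels: int) -> io.BytesIO:
    """Build a short in-memory WAV with silence, only for demonstration."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(SAMPLE_WIDTH_BYTES)
        wav.setframerate(rate)
        wav.writeframes(b"\x00" * (SAMPLE_WIDTH_BYTES * channels * 100))
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    print(is_voice_compatible(make_silent_wav(8000, 1)))
    print(is_voice_compatible(make_silent_wav(44100, 2)))
```

A file that fails this check would need to be resampled or downmixed first; that step is out of scope here.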

Add voice features with PyAudio to the demo application to easily test voice commands:

  • Record and stream voice from the default audio input
  • Record and play back a local WAV file
  • Send a pre-recorded WAV file as a voice command

No new dependencies are required for the library, only for the demo app.

The following sources provided helpful information for implementing the voice command feature:

The maximum audio chunk size has been set to 20 KB as reported in the above links. While testing with an Nvidia Shield TV, the minimum chunk size seems to be around 8 KB.
The library handles chunking automatically in a simple way: if the client sends a chunk larger than 20 KB, it is split into messages of at most 20 KB, and chunks smaller than 8 KB are zero-padded.
This logic could be enhanced in the future with an internal queue.
The demo app shows how to send a pre-recorded WAV file, and how to stream from the default audio input (usually the microphone) using chunk handling.
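
To make the chunk handling concrete, here is a rough sketch of the split-and-pad idea. It is not the library code: the function name `split_and_pad` is hypothetical, and it assumes that "KB" means 1024 bytes and that short chunks are padded up to exactly the 8 KB minimum.

```python
# Assumption: "20 KB" and "8 KB" are interpreted as multiples of 1024 bytes.
MAX_CHUNK_SIZE = 20 * 1024
MIN_CHUNK_SIZE = 8 * 1024


def split_and_pad(data: bytes) -> list:
    """Split audio data into chunks of at most MAX_CHUNK_SIZE bytes.

    Chunks shorter than MIN_CHUNK_SIZE are padded with zero bytes,
    which represent silence in signed 16-bit PCM.
    """
    chunks = []
    for start in range(0, len(data), MAX_CHUNK_SIZE):
        chunk = data[start : start + MAX_CHUNK_SIZE]
        if len(chunk) < MIN_CHUNK_SIZE:
            chunk = chunk.ljust(MIN_CHUNK_SIZE, b"\x00")
        chunks.append(chunk)
    return chunks


if __name__ == "__main__":
    sizes = [len(chunk) for chunk in split_and_pad(b"\x01" * 50000)]
    print(sizes)
```

With this approach only the last chunk of a buffer can ever be padded, since every earlier chunk is exactly the maximum size.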

I'm not a Python expert and tried to follow the existing library code. Let me know if things need to be changed, for example:

  • AndroidTVRemote constructor: the new enable_voice parameter defaults to False to keep the same behaviour as before.
    When set to True, it influences RemoteProtocol._active_features and changes how the Android TV device behaves when KEYCODE_SEARCH is sent.
  • The demo application could be reduced to include only the streaming voice command, without the additional record / play / send commands.

@tronikos (Owner) left a comment

Thanks! This is a great addition. A couple of small comments.

self._send_message(msg)

def _handle_message(self, raw_msg: bytes) -> None: # noqa: PLR0912
async def start_voice(self, timeout: float = VOICE_SESSION_TIMEOUT) -> int:
Owner

I think there is a potential race condition if this is called concurrently by multiple tasks because of the self._on_voice_begin future. Add a lock to ensure that only one voice session can be initiated at a time?

Contributor Author

Makes sense, will fix it.

Contributor Author

I've added a lock that prevents clients from creating a new voice session as long as the previous session hasn't been established. asyncio.TimeoutError is raised if a session is still being established. I couldn't find a better exception, but think it's ok and simplifies client exception handling.

Tested with the demo app by creating two _stream_voice tasks when pressing the v key.
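
The locking behaviour described above could look roughly like the following sketch. This is not the merged code: `VoiceSessionStarter`, `_request_voice_begin`, the default timeout and the simulated device reply are all hypothetical, and it assumes that a second caller fails immediately with asyncio.TimeoutError instead of waiting for the lock.

```python
import asyncio


class VoiceSessionStarter:
    """Hypothetical stand-in for the protocol class, for illustration only."""

    def __init__(self) -> None:
        self._lock = asyncio.Lock()
        self._on_voice_begin = None

    async def start_voice(self, timeout: float = 2.0) -> int:
        if self._lock.locked():
            # Another task is still establishing a voice session.
            raise asyncio.TimeoutError("voice session is still being established")
        async with self._lock:
            self._on_voice_begin = asyncio.get_running_loop().create_future()
            try:
                self._request_voice_begin()
                return await asyncio.wait_for(self._on_voice_begin, timeout)
            finally:
                self._on_voice_begin = None

    def _request_voice_begin(self) -> None:
        # Simulates the device answering with session id 1 after a short delay.
        future = self._on_voice_begin

        def reply() -> None:
            if not future.done():
                future.set_result(1)

        asyncio.get_running_loop().call_later(0.05, reply)


async def demo() -> list:
    starter = VoiceSessionStarter()
    # Two concurrent callers, similar to the two _stream_voice tasks in the test.
    return await asyncio.gather(
        starter.start_voice(), starter.start_voice(), return_exceptions=True
    )


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

The first caller gets the session id, while the second one receives the exception because the lock is still held.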

# Check if future completed successfully
if future.done():
    if future.exception():
        raise ConnectionClosed(future.exception())
Owner

Something like raise ConnectionClosed("...") from future.exception() (replace "...") might be better
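
As a small, self-contained illustration of this suggestion: `raise ... from ...` stores the original error in `__cause__`, so the traceback shows both exceptions. The `ConnectionClosed` class below is a local stand-in for the library's exception, and the message text and the `raise_if_failed` helper are placeholders.

```python
import asyncio


class ConnectionClosed(Exception):
    """Local stand-in for the library's ConnectionClosed exception."""


def raise_if_failed(future: asyncio.Future) -> None:
    """Raise ConnectionClosed, chained to the future's exception, if it failed."""
    if future.done() and future.exception():
        # Placeholder message; the original error is kept as __cause__.
        raise ConnectionClosed("Connection has been lost") from future.exception()


async def demo():
    future = asyncio.get_running_loop().create_future()
    future.set_exception(OSError("socket closed"))
    try:
        raise_if_failed(future)
    except ConnectionClosed as exc:
        return exc
    return None


if __name__ == "__main__":
    error = asyncio.run(demo())
    print(repr(error), "caused by", repr(error.__cause__))
```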

Contributor Author

I always learn something new, didn't know that Python construct :-) I will change it.

This code block was actually copied from PairingProtocol::_async_wait_for_future_or_con_lost and I forgot to ask about it in the PR. I only slightly modified it by adding a timeout and returning the future's response.
I think it would make sense to move _async_wait_for_future_or_con_lost and _raise_if_not_connected to the ProtobufProtocol base class, so it can be used by RemoteProtocol and PairingProtocol. What prevented me from doing so was the pairing-specific log message "Connection has been lost, cannot pair" in _raise_if_not_connected.

If you like, I can move the methods into ProtobufProtocol with an optional timeout parameter (timeout: float | None = None) and generic "Connection has been lost" log message.
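
A rough sketch of what such shared helpers might look like, to make the proposal easier to discuss. This is not the existing library code: the class name `ProtobufProtocolSketch`, the `_on_con_lost` future and the local `ConnectionClosed` class are assumptions made for illustration.

```python
import asyncio


class ConnectionClosed(Exception):
    """Local stand-in for the library's ConnectionClosed exception."""


class ProtobufProtocolSketch:
    """Hypothetical base class holding the helpers shared by both protocols."""

    def __init__(self) -> None:
        # Assumption: a future that completes when the connection is lost.
        self._on_con_lost = asyncio.get_running_loop().create_future()

    def _raise_if_not_connected(self) -> None:
        if self._on_con_lost.done():
            raise ConnectionClosed("Connection has been lost")

    async def _async_wait_for_future_or_con_lost(self, future, timeout=None):
        """Wait for the future, a lost connection, or the optional timeout."""
        self._raise_if_not_connected()
        done, _ = await asyncio.wait(
            (self._on_con_lost, future),
            timeout=timeout,
            return_when=asyncio.FIRST_COMPLETED,
        )
        if future in done:
            if future.exception():
                raise ConnectionClosed("Request failed") from future.exception()
            return future.result()
        if self._on_con_lost in done:
            raise ConnectionClosed("Connection has been lost")
        raise asyncio.TimeoutError("Timed out waiting for a response")


async def demo():
    protocol = ProtobufProtocolSketch()
    loop = asyncio.get_running_loop()

    answered = loop.create_future()
    answered.set_result(42)
    value = await protocol._async_wait_for_future_or_con_lost(answered)

    try:
        await protocol._async_wait_for_future_or_con_lost(
            loop.create_future(), timeout=0.01
        )
        timed_out = False
    except asyncio.TimeoutError:
        timed_out = True
    return value, timed_out


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

With `timeout=None` the helper waits indefinitely, which would keep the current pairing behaviour unchanged.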

@zehnm zehnm requested a review from tronikos November 1, 2025 09:54
@tronikos tronikos merged commit 5cb83bd into tronikos:main Nov 1, 2025
3 checks passed