-
Notifications
You must be signed in to change notification settings - Fork 13
Add streaming voice command support #35
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Support sending voice commands with the RemoteVoice* messages. Audio data needs to be 16-bit PCM, mono, 8000 Hz. The Android TV Remote service appears to support audio configuration in the RemoteVoiceBegin message, but no information could be found so far. Add voice features with PyAudio to the demo application to easily test the voice command feature: - Record and stream voice from the default audio input - Record a playback a local WAV file - Send a pre-recorded WAV file as a voice command No new dependencies are required for the library, only for the demo app.
tronikos
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! This is a great addition. A couple of small comments.
| self._send_message(msg) | ||
|
|
||
| def _handle_message(self, raw_msg: bytes) -> None: # noqa: PLR0912 | ||
| async def start_voice(self, timeout: float = VOICE_SESSION_TIMEOUT) -> int: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there is a potential race condition if this is called concurrently by multiple tasks because of the self._on_voice_begin future. Add a lock to ensure that only one voice session can be initiated at a time?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense, will fix it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've added a lock that prevents clients from creating a new voice session as long as the previous session hasn't been established. asyncio.TimeoutError is raised if a session is still being established. I couldn't find a better exception, but think it's ok and simplifies client exception handling.
Tested with the demo app by creating two _stream_voice tasks when pressing the v key.
src/androidtvremote2/remote.py
Outdated
| # Check if future completed successfully | ||
| if future.done(): | ||
| if future.exception(): | ||
| raise ConnectionClosed(future.exception()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Something like raise ConnectionClosed("...") from future.exception() (replace "...") might be better
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I always learn something new, didn't know that Python construct :-) I will change it.
This code block was actually copied from PairingProtocol::_async_wait_for_future_or_con_lost and I forgot to ask about in the PR. I only slightly modified it by adding a timeout and returning the future's response.
I think it would make sense to move _async_wait_for_future_or_con_lost and _raise_if_not_connected to the ProtobufProtocol base class, so it can be used by RemoteProtocol and PairingProtocol. What prevented me from doing so was the pairing-specific log message "Connection has been lost, cannot pair" in _raise_if_not_connected.
If you like, I can move the methods into ProtobufProtocol with an optional timeout parameter (timeout: float | None = None) and generic "Connection has been lost" log message.
PR feedback
Support sending voice commands with the RemoteVoice* messages. Audio data needs to be 16-bit PCM, mono, 8000 Hz.
The Android TV Remote service appears to support audio configuration in the RemoteVoiceBegin message, but no information could be found so far.
Add voice features with PyAudio to the demo application to easily test voice commands:
No new dependencies are required for the library, only for the demo app.
The following sources provided helpful information for implementing the voice command feature:
The maximum audio chunk size has been set to 20 KB as reported in the above links. While testing with an Nvidia Shield TV, the minimum chunk size seems to be around 8 KB.
A simple chunk handling is automatically done in the library. If the clients sends a bigger chunk, it's automatically split up to 20 KB messages. Chunks smaller than 8 KB are zero padded.
This logic could be enhanced in the future with an internal queue.
The demo app shows how to send a pre-recorded WAV file, and how to stream from the default audio input (usually the microphone) using chunk handling.
I'm not a Phython expert and tried to follow the existing library code. Let me know if things need to be changed, for example:
enable_voiceparameter set to False by default to keep the same behaviour as before.When set to True, it influences the
RemoteProtocol._active_featuresand changes the Android TV device behaviour when sendingKEYCODE_SEARCH.