Open
Labels
enhancement, good first issue
Description
Part of providing pronunciation feedback involves aligning a target phonemic sequence (the sounds made by the actor reference) with the user phonemic sequence (the sounds the user makes when trying to mimic the actor dialogue). An appropriate algorithm for this is Needleman-Wunsch. We currently have a Python implementation that runs after the user phonemic sequence has been transcribed. We want this to be as fast as possible so that feedback has low latency. Here are some potential solutions to try out:
- Implement it in C, or use an existing C implementation, and call that from Python
- Explore alternative algorithms with better than $O(nm)$ time and space complexities
- Explore a streaming version that can be calculated as the user phonemic sequence is being transcribed rather than having to wait until the end (this would also avoid having to do the full work of re-calculating it every time the transcription changes and we want to update the word colorings)
- Explore calculating it client-side (in JavaScript/WASM) to avoid network delays (the code would go in FeedbackGiver.js)
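To make the streaming idea concrete, here is a minimal sketch of how the Needleman-Wunsch score table could be grown one row per transcribed user phoneme, reusing rows when only the tail of the transcription changes. The class name `StreamingAligner`, its methods, and the scoring constants are all hypothetical and not taken from the existing implementation; traceback for the word colorings is left out.

```python
from typing import List, Sequence

# Assumed scoring scheme; the project's real values may differ.
MATCH = 1
MISMATCH = -1
GAP = -1


class StreamingAligner:
    """Needleman-Wunsch score table grown one row per user phoneme."""

    def __init__(self, target: Sequence[str]) -> None:
        self.target = list(target)
        self.user: List[str] = []
        # Row 0: empty user sequence aligned against each target prefix.
        self.rows: List[List[int]] = [
            [j * GAP for j in range(len(self.target) + 1)]
        ]

    def push(self, phoneme: str) -> None:
        """Append one user phoneme; costs O(len(target)) time."""
        prev = self.rows[-1]
        row = [prev[0] + GAP]
        for j, t in enumerate(self.target, start=1):
            sub = prev[j - 1] + (MATCH if t == phoneme else MISMATCH)
            row.append(max(sub, prev[j] + GAP, row[j - 1] + GAP))
        self.user.append(phoneme)
        self.rows.append(row)

    def update(self, transcript: Sequence[str]) -> None:
        """Sync with a revised transcript, reusing rows for the shared prefix."""
        keep = 0
        while (keep < len(self.user) and keep < len(transcript)
               and self.user[keep] == transcript[keep]):
            keep += 1
        # Drop rows that depended on phonemes that have since changed.
        del self.user[keep:]
        del self.rows[keep + 1:]
        for p in transcript[keep:]:
            self.push(p)

    def score(self) -> int:
        """Global alignment score of the full target vs. the user phonemes so far."""
        return self.rows[-1][-1]
```

A possible usage, with ARPAbet-style symbols chosen purely for illustration:

```python
aligner = StreamingAligner(["HH", "AH", "L", "OW"])
aligner.update(["HH", "AH"])              # partial transcription
aligner.update(["HH", "EH", "L", "OW"])   # only rows after "HH" are recomputed
print(aligner.score())
```

Note that a global score penalizes the part of the target the user has not yet spoken; for live feedback it may be more useful to take the maximum over the last row instead, which is a design choice to evaluate.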
Since the sequences will be fairly short, experimental evaluation results that take into account network delays etc. will be more relevant than the asymptotic time complexity.
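As a starting point for that kind of evaluation, the following sketch times a plain-Python baseline on random sequences of a given length. The function names, the toy phoneme inventory, and the scoring defaults are assumptions for illustration; it measures compute time only, so network round trips would still need to be measured end to end.

```python
import random
import statistics
import timeit
from typing import Sequence


def nw_score(a: Sequence[str], b: Sequence[str],
             match: int = 1, mismatch: int = -1, gap: int = -1) -> int:
    """Needleman-Wunsch global alignment score, keeping only two rows."""
    prev = [j * gap for j in range(len(b) + 1)]
    for i, x in enumerate(a, start=1):
        cur = [i * gap]
        for j, y in enumerate(b, start=1):
            cur.append(max(prev[j - 1] + (match if x == y else mismatch),
                           prev[j] + gap,
                           cur[j - 1] + gap))
        prev = cur
    return prev[-1]


def median_latency_ms(n: int, m: int, repeats: int = 20, seed: int = 0) -> float:
    """Median wall-clock time in milliseconds for one alignment of lengths n and m."""
    rng = random.Random(seed)
    # Toy phoneme inventory, purely illustrative.
    phonemes = ["AA", "AE", "B", "D", "IY", "K", "L", "S", "T"]
    a = [rng.choice(phonemes) for _ in range(n)]
    b = [rng.choice(phonemes) for _ in range(m)]
    times = timeit.repeat(lambda: nw_score(a, b), number=1, repeat=repeats)
    return statistics.median(times) * 1000.0


if __name__ == "__main__":
    for length in (10, 50, 200):
        print(length, f"{median_latency_ms(length, length):.3f} ms")
```

Running the same harness against a C-backed or client-side implementation, with the same inputs, would give a like-for-like comparison of the compute portion of the latency.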