Improved Current-Word Detection #5

@SanderGi

Description

As the user speaks a dialogue phrase, we highlight the word they are currently on, both to make the UI feel responsive and to detect when they have finished speaking. This has to be very low-latency and light on computational resources. Currently, we use the Web Speech API, but its speech recognition is only supported in Chrome and Safari.
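To decide at runtime whether a fallback is needed, we could feature-detect the recognizer first. The sketch below is only an illustration: `findSpeechRecognition` is a hypothetical helper name, and it assumes the recognizer is exposed either as `SpeechRecognition` or under the prefixed name `webkitSpeechRecognition`.

```typescript
// Constructor type kept deliberately loose so the sketch compiles without DOM typings.
type RecognitionCtor = new () => unknown;

// Hypothetical helper: returns the speech recognition constructor exposed on
// `scope` (normally the browser's `window`), or null if there is none.
function findSpeechRecognition(scope: Record<string, unknown>): RecognitionCtor | null {
  // Prefer the standard name, then fall back to the prefixed one.
  const ctor = scope["SpeechRecognition"] ?? scope["webkitSpeechRecognition"];
  return typeof ctor === "function" ? (ctor as RecognitionCtor) : null;
}
```

In the browser this would be called as `findSpeechRecognition(window as unknown as Record<string, unknown>)`; a `null` result is the signal to switch to whichever fallback this issue ends up choosing.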

We need broader browser support through a fallback, either to a small local model or to one running remotely (on our server or on a cloud service such as Azure/GCP). A small local model might be ideal for low latency, especially since we already know which words the user is trying to say: we don't need a full transcription model, just one that can detect when a given word has been spoken.
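Whichever recognizer we end up with, knowing the expected phrase lets us track progress with simple matching rather than trusting a full transcript. Here is a minimal sketch of that idea, not the current implementation: `normalize`, `wordsCompleted`, and the `lookahead` parameter are all hypothetical, and it assumes the recognizer hands us the words heard so far as plain strings.

```typescript
// Lowercase and strip common punctuation so "station?" matches "station".
function normalize(word: string): string {
  return word.toLowerCase().replace(/[.,!?;:"()]/g, "");
}

// Returns how many expected words have been spoken so far. The word to
// highlight is at that index; the phrase is done when it equals expected.length.
function wordsCompleted(expected: string[], heard: string[], lookahead = 2): number {
  const targets = expected.map(normalize);
  let next = 0; // index of the next expected word not yet matched
  for (const raw of heard) {
    const token = normalize(raw);
    if (token === "") continue;
    // Search a small window ahead so one misheard word does not stall progress.
    const limit = Math.min(targets.length, next + lookahead + 1);
    for (let i = next; i < limit; i++) {
      if (targets[i] === token) {
        next = i + 1;
        break;
      }
    }
  }
  return next;
}
```

Words that are heard but do not match anything in the window are ignored, so the highlight never moves backwards. A dedicated word-spotting model could replace the string comparison while keeping the same forward-only bookkeeping.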

This task will involve some research and evaluation of the best approach.
