ALSA 'underrun occurred' Causing Audio Overlap in Piper TTS on WSL2

Hey, I'm trying to use this repo to stream a Hebrew response from an LLM to a TTS engine in real time. Since the machine I'm working on is too slow to run most LLMs, I'm connected via SSH to my more powerful gaming PC, where everything runs under WSL2. I also use VoiceMeeter's VBAN functionality to stream audio from the gaming PC to my workstation.

As the LLM generates its output, I send chunks of 4-5 words at a time to RealtimeTTS. However, when the program starts to play audio there are multiple "underrun occurred" errors like this:
```
ALSA lib pcm.c:8568:(snd_pcm_recover) underrun occurred
```

An example:

```
input:hello there

[respond] starting...
[respond] speech: hello there

[respond] generating response...
['שלום!']
שלום!
מה
אוכל
לעשות
['מה', 'אוכל', 'לעשות', 'בשבילך?']
בשבילך?
⚡ synthesizing → 'שלום! מה אוכל לעשות בשבילך ?'

[respond] done!
SYNTHESIS FINISHED
ALSA lib pcm.c:8568:(snd_pcm_recover) underrun occurred
ALSA lib pcm.c:8568:(snd_pcm_recover) underrun occurred
input:
```

For each error like this, the *correct* audio is played once, so if two underruns occur, I hear the same audio twice. With stream.play(log_synthesized_text=True), synthesis appears to be happening as expected, yet the audio itself plays multiple times.
I'm not sure what's causing this. Could it be a side effect of the way audio is handled in WSL2?
Thanks!

My code:

```
from LLM_utils import respond
from phonikud_tts import Phonikud, phonemize
from RealtimeTTS import PiperVoice, PiperEngine, TextToAudioStream
import time

phonikud = Phonikud('TTS/phonikud-1.0.int8.onnx')
voice = PiperVoice(
    model_file="TTS/tts-model.onnx",
    config_file="TTS/tts-model.config.json"
)

engine = PiperEngine(
    piper_path="/home/roee/docs/victor/venv/bin/piper",
    voice=voice
)

stream = TextToAudioStream(engine)

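def generator(text):
    # Group the LLM's words into chunks of up to 5 words,
    # flushing early whenever a word ends a sentence.
    word_list = []
    answer = respond(text)
    for word in answer:
        word_list.append(word)
        if len(word_list) >= 5 or word.endswith((".", "?", "!")):
            yield " ".join(word_list) + " "
            print(word_list)
            word_list = []
    # Flush whatever is left once the LLM has finished.
    if word_list:
        yield " ".join(word_list) + " "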

try:
    while True:
        text = input("input:")

        stream.feed(generator(text))
        stream.play(log_synthesized_text=True)
except KeyboardInterrupt:
    print("fine then...")
```
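To check whether the chunker itself ever emits a chunk twice, the grouping logic can be run on its own against a fixed word list. This is only a minimal sketch: `chunk_words` is a hypothetical standalone copy of `generator()` that takes any iterable of words instead of calling `respond()`, and no TTS or audio is involved.

```python
from typing import Iterable, Iterator


def chunk_words(words: Iterable[str], max_words: int = 5) -> Iterator[str]:
    """Group words into chunks of up to max_words, flushing early at sentence ends."""
    word_list = []
    for word in words:
        word_list.append(word)
        if len(word_list) >= max_words or word.endswith((".", "?", "!")):
            yield " ".join(word_list) + " "
            word_list = []
    # Flush any trailing words that did not end a sentence.
    if word_list:
        yield " ".join(word_list) + " "


# Same words as in the log above; each chunk should appear exactly once.
sample = ["שלום!", "מה", "אוכל", "לעשות", "בשבילך?"]
chunks = list(chunk_words(sample))
for chunk in chunks:
    print(repr(chunk))
```

If each chunk is printed once here, the repeats are more likely introduced after the text reaches RealtimeTTS (synthesis or playback) than in the chunking step.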
*respond()*:

```
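import re

# Assumption: SystemMessage/HumanMessage are LangChain's message classes;
# MODEL and DEFINING_PROMPT are defined elsewhere in LLM_utils.
from langchain_core.messages import HumanMessage, SystemMessage


# Yields the LLM's response word by word (for the TTS)
def respond(question):
    buffer = ""
    print("\n[respond] starting...")
    print(f"[respond] speech: {question}")
    messages = [
        SystemMessage(DEFINING_PROMPT),
        HumanMessage(question),
    ]
    print("\n[respond] generating response...")

    for chunk in MODEL.stream(messages):
        # Skip empty chunks before converting to str;
        # str(None) would otherwise append the text "None".
        if not chunk.content:
            continue

        buffer += str(chunk.content)

        # A word is complete once it is followed by whitespace or punctuation.
        matches = list(re.finditer(r"\S+[ \n.,!?]", buffer))

        last_index = 0
        for match in matches:
            word = match.group().strip()
            yield word
            print(word)
            last_index = match.end()

        # Keep the unfinished tail for the next chunk.
        buffer = buffer[last_index:]

    # Emit whatever is left after the stream ends.
    if buffer.strip():
        yield buffer.strip()
        print(buffer.strip())

    print("\n[respond] done!")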
```
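The buffering in `respond()` can also be exercised without an LLM by feeding it a list of fake tokens. The sketch below is a stand-in under stated assumptions: `split_words` is a hypothetical helper that contains only the regex and buffer logic, and the token list is made up to show a word being split across two tokens.

```python
import re
from typing import Iterable, Iterator

# Same pattern as in respond(): a run of non-whitespace followed by a delimiter.
WORD_RE = re.compile(r"\S+[ \n.,!?]")


def split_words(tokens: Iterable[str]) -> Iterator[str]:
    """Reassemble streamed tokens into whole words, yielding each word once."""
    buffer = ""
    for token in tokens:
        buffer += token
        last_index = 0
        for match in WORD_RE.finditer(buffer):
            yield match.group().strip()
            last_index = match.end()
        # Keep the unfinished tail until more text arrives.
        buffer = buffer[last_index:]
    if buffer.strip():
        yield buffer.strip()


# Hypothetical token stream; "Hello" and "there!" each span two tokens.
tokens = ["Hel", "lo the", "re! How", " are you"]
words = list(split_words(tokens))
print(words)
```

Running this on a few hand-made token lists is a quick way to confirm that every word comes out exactly once before looking for the cause in the audio path.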
