Memory usage grow over time? #360

@mfxuus

This is likely an issue with my implementation, but is GPU memory usage expected to grow as the engine is used to generate audio? I ran into an OOM error a few times when the system was left running for a long time. I don't have exact steps to reproduce it, but wanted to check whether this is expected at all.

The high-level setup: one PC controls two robots, and we use two separate Coqui engines to generate two distinct voices. We also use RealtimeSTT for speech recognition, and recently had to switch from the small model to the tiny model to alleviate some of the OOM issues.

Here is the rough setup:

from RealtimeTTS import (
    TextToAudioStream,
    CoquiEngine
)


class AudioController:

    def __init__(self, on_audio_stream_stop=None):
        # multiple voice and stream?
        engine_A = CoquiEngine(
            voice="voices/openai-fm-echo-chill-surfer.wav"
        )
        engine_B = CoquiEngine(
            voice="voices/openai-fm-nova-audio.wav"
        )
        
        if not on_audio_stream_stop:
            on_audio_stream_stop = self.on_audio_stream_stop
        
        self.stream_A = TextToAudioStream(
            engine_A,
            on_audio_stream_stop=on_audio_stream_stop,
            frames_per_buffer=1024
        )
        self.stream_B = TextToAudioStream(
            engine_B,
            on_audio_stream_stop=on_audio_stream_stop,
            frames_per_buffer=1024
        )

    def speak_for_robot(self, robot, content):
        if robot == 'a':
            self.stream_A.feed(content)
            self.stream_A.play()
        elif robot == 'b':
            self.stream_B.feed(content)
            self.stream_B.play()

    def on_audio_stream_stop(self):
        print('Audio Done.')

This helper class is imported in the main code:

from audio_control import AudioController
.... 
self.audio_controller = AudioController()

and the method self.audio_controller.speak_for_robot(robot, text) is called when needed.
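Since there are no exact reproduction steps yet, one way to narrow this down is to record per-process GPU memory around each call. The following is a minimal sketch, not part of RealtimeTTS: the helper names `parse_compute_apps`, `gpu_memory_by_pid`, and `log_growth` are hypothetical, and it assumes `nvidia-smi` is on the PATH and that its CSV query mode reports memory in MiB.

```python
import subprocess

# nvidia-smi's CSV query mode; lists compute ("C") processes only.
QUERY_CMD = [
    "nvidia-smi",
    "--query-compute-apps=pid,used_memory",
    "--format=csv,noheader,nounits",
]


def parse_compute_apps(csv_text):
    """Parse 'pid, used_memory' CSV rows into a {pid: MiB} dict."""
    usage = {}
    for line in csv_text.strip().splitlines():
        parts = [p.strip() for p in line.split(",")]
        if len(parts) != 2 or not parts[0].isdigit() or not parts[1].isdigit():
            continue  # skip blank or malformed rows (e.g. "[N/A]")
        usage[int(parts[0])] = int(parts[1])
    return usage


def gpu_memory_by_pid():
    """Run nvidia-smi once and return {pid: MiB} (hypothetical helper)."""
    out = subprocess.run(QUERY_CMD, capture_output=True, text=True, check=True)
    return parse_compute_apps(out.stdout)


def log_growth(label, before, after):
    """Print the per-PID change in MiB between two snapshots."""
    for pid in sorted(after):
        delta = after[pid] - before.get(pid, 0)
        print(f"{label}: pid {pid} {after[pid]} MiB ({delta:+d} MiB)")
```

Taking a snapshot with `gpu_memory_by_pid()` before and after each `speak_for_robot` call, and passing both to `log_growth`, would show whether the growth tracks the number of calls, the length of the text, or neither.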

Below is the nvidia-smi output when the program has just started and, further down, after it has run for a while (and the Coqui engines have generated some audio output).

Sun Jan 25 17:00:50 2026
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.54.14              Driver Version: 550.54.14      CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce GTX 1080 Ti     Off |   00000000:01:00.0 Off |                  N/A |
| 25%   33C    P8             10W /  250W |    4300MiB /  11264MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    0   N/A  N/A      1288      G   /usr/lib/xorg/Xorg                              9MiB |
|    0   N/A  N/A      1375      G   /usr/bin/gnome-shell                            3MiB |
|    0   N/A  N/A    206856      C   ...rl/robot-display/env/bin/python        342MiB | <---- STT
|    0   N/A  N/A    206924      C   ...rl/robot-display/env/bin/python       1970MiB | <---- Voice 1
|    0   N/A  N/A    206971      C   ...rl/robot-display/env/bin/python       1970MiB | <---- Voice 2
+-----------------------------------------------------------------------------------------+

Sun Jan 25 20:05:31 2026
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.54.14              Driver Version: 550.54.14      CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce GTX 1080 Ti     Off |   00000000:01:00.0 Off |                  N/A |
| 25%   30C    P8              9W /  250W |    6850MiB /  11264MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    0   N/A  N/A      1288      G   /usr/lib/xorg/Xorg                              9MiB |
|    0   N/A  N/A      1375      G   /usr/bin/gnome-shell                            3MiB |
|    0   N/A  N/A    206856      C   ...rl/robot-display/env/bin/python        342MiB | <---- STT
|    0   N/A  N/A    206924      C   ...rl/robot-display/env/bin/python       2956MiB | <---- Voice 1
|    0   N/A  N/A    206971      C   ...rl/robot-display/env/bin/python       3534MiB | <---- Voice 2
+-----------------------------------------------------------------------------------------+
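To put numbers on the two snapshots above, here is a small worked calculation using only the timestamps and per-process figures from the nvidia-smi output. The projection at the end assumes growth stays linear, which the two data points cannot confirm.

```python
from datetime import datetime

# Timestamps of the two nvidia-smi snapshots.
start = datetime(2026, 1, 25, 17, 0, 50)
end = datetime(2026, 1, 25, 20, 5, 31)
hours = (end - start).total_seconds() / 3600

# (MiB at start, MiB later) for each TTS process; STT stayed at 342 MiB.
snapshots = {"Voice 1": (1970, 2956), "Voice 2": (1970, 3534)}

growth = {}
for name, (before, after) in snapshots.items():
    growth[name] = after - before
    print(f"{name}: +{growth[name]} MiB in {hours:.2f} h "
          f"({growth[name] / hours:.0f} MiB/h)")

total_growth = sum(growth.values())  # matches 6850 - 4300 from the summary rows
free_mib = 11264 - 6850              # headroom left on the card

# ASSUMPTION: linear growth; real usage may plateau or jump.
hours_left = free_mib / (total_growth / hours)
print(f"Total: +{total_growth} MiB, roughly {hours_left:.1f} h of headroom left")
```

The STT process is unchanged at 342 MiB, so all of the increase comes from the two TTS processes, and they grew by different amounts from the same starting point.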

Here are some package versions; let me know if there are others that would be relevant:

coqui-tts==0.27.2
coqui-tts-trainer==0.3.1
ctranslate2==4.6.1
nvidia-cublas-cu12==12.4.5.8
nvidia-cuda-cupti-cu12==12.4.127
nvidia-cuda-nvrtc-cu12==12.4.127
nvidia-cuda-runtime-cu12==12.4.127
nvidia-cudnn-cu12==9.1.0.70
nvidia-cufft-cu12==11.2.1.3
nvidia-cufile-cu12==1.13.1.3
nvidia-curand-cu12==10.3.5.147
nvidia-cusolver-cu12==11.6.1.9
nvidia-cusparse-cu12==12.3.1.170
nvidia-cusparselt-cu12==0.6.2
nvidia-nccl-cu12==2.21.5
nvidia-nvjitlink-cu12==12.4.127
nvidia-nvshmem-cu12==3.3.20
nvidia-nvtx-cu12==12.4.127
onnxruntime==1.23.2
realtimestt==0.3.104
realtimetts==0.5.0
torch==2.6.0+cu124
torchaudio==2.6.0+cu124
torchvision==0.21.0+cu124
transformers==4.52.1

I'll try to keep digging, but in the meantime, any pointers or advice to proactively tighten the implementation would be very much appreciated! Thanks in advance!
