Memory usage grows over time? #360
Description
This is likely an issue with my implementation, but is it expected that GPU memory usage grows as the engine is used to generate audio? I ran into OOM errors a few times when the system was left running for a long time. I don't have exact reproduction steps, but I wanted to check whether this is expected at all.
The high-level setup is that one PC controls 2 robots, and we use two separate Coqui engines to generate 2 distinct voices. We also use RealtimeSTT for speech recognition, but recently had to switch to the tiny model (from small) to alleviate some of the OOM issues.
Here is the rough setup:
```python
from RealtimeTTS import (
    TextToAudioStream,
    CoquiEngine
)

class AudioController:
    def __init__(self, on_audio_stream_stop=None):
        # multiple voice and stream?
        engine_A = CoquiEngine(
            voice="voices/openai-fm-echo-chill-surfer.wav"
        )
        engine_B = CoquiEngine(
            voice="voices/openai-fm-nova-audio.wav"
        )
        if not on_audio_stream_stop:
            on_audio_stream_stop = self.on_audio_stream_stop
        self.stream_A = TextToAudioStream(
            engine_A,
            on_audio_stream_stop=on_audio_stream_stop,
            frames_per_buffer=1024
        )
        self.stream_B = TextToAudioStream(
            engine_B,
            on_audio_stream_stop=on_audio_stream_stop,
            frames_per_buffer=1024
        )

    def speak_for_robot(self, robot, content):
        if robot == 'a':
            self.stream_A.feed(content)
            self.stream_A.play()
        elif robot == 'b':
            self.stream_B.feed(content)
            self.stream_B.play()

    def on_audio_stream_stop(self):
        print('Audio Done.')
```
This helper class is imported in the main code:

```python
from audio_control import AudioController
....
self.audio_controller = AudioController()
```

and the method `self.audio_controller.speak_for_robot(robot, text)` is called when needed.
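In the meantime I'm adding a small watchdog so I at least get a warning before the OOM hits. This is just a sketch of my own, not part of RealtimeTTS: the `nvidia_smi_used_mib` helper and the thresholds are my invention, and the sampler only reads per-process memory via `nvidia-smi`:

```python
import subprocess
from collections import deque

def nvidia_smi_used_mib(pid):
    """Query used GPU memory (MiB) for a PID via nvidia-smi.
    Returns None if nvidia-smi is unavailable or the PID has no GPU context."""
    try:
        out = subprocess.check_output(
            ["nvidia-smi", "--query-compute-apps=pid,used_memory",
             "--format=csv,noheader,nounits"], text=True)
    except (OSError, subprocess.CalledProcessError):
        return None
    for line in out.splitlines():
        try:
            p, mem = (int(field.strip()) for field in line.split(","))
        except ValueError:
            continue  # skip "[N/A]" or malformed rows
        if p == pid:
            return mem
    return None

class MemoryWatchdog:
    """Keeps a rolling window of memory samples and flags steady growth."""
    def __init__(self, sampler, window=10, growth_mib=500):
        self.sampler = sampler               # e.g. lambda: nvidia_smi_used_mib(pid)
        self.samples = deque(maxlen=window)  # rolling window of MiB readings
        self.growth_mib = growth_mib         # growth across the window to flag

    def check(self):
        """Record one sample; return True if the window shows steady growth."""
        value = self.sampler()
        if value is None:
            return False
        self.samples.append(value)
        full = len(self.samples) == self.samples.maxlen
        return full and self.samples[-1] - self.samples[0] >= self.growth_mib
```

The plan is to call `check()` from a periodic timer for each of the three PIDs and log a warning (or proactively restart the engines) when it fires.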
Below is the nvidia-smi output when the program has just started, and further down after it has been running for a while (and the Coqui engines have generated some audio).
```
Sun Jan 25 17:00:50 2026
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.54.14 Driver Version: 550.54.14 CUDA Version: 12.4 |
|-----------------------------------------+------------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+========================+======================|
| 0 NVIDIA GeForce GTX 1080 Ti Off | 00000000:01:00.0 Off | N/A |
| 25% 33C P8 10W / 250W | 4300MiB / 11264MiB | 0% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+
+-----------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=========================================================================================|
| 0 N/A N/A 1288 G /usr/lib/xorg/Xorg 9MiB |
| 0 N/A N/A 1375 G /usr/bin/gnome-shell 3MiB |
| 0 N/A N/A 206856 C ...rl/robot-display/env/bin/python 342MiB | <---- STT
| 0 N/A N/A 206924 C ...rl/robot-display/env/bin/python 1970MiB | <---- Voice 1
| 0 N/A N/A 206971 C ...rl/robot-display/env/bin/python 1970MiB | <---- Voice 2
+-----------------------------------------------------------------------------------------+
```
```
Sun Jan 25 20:05:31 2026
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.54.14 Driver Version: 550.54.14 CUDA Version: 12.4 |
|-----------------------------------------+------------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+========================+======================|
| 0 NVIDIA GeForce GTX 1080 Ti Off | 00000000:01:00.0 Off | N/A |
| 25% 30C P8 9W / 250W | 6850MiB / 11264MiB | 0% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+
+-----------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=========================================================================================|
| 0 N/A N/A 1288 G /usr/lib/xorg/Xorg 9MiB |
| 0 N/A N/A 1375 G /usr/bin/gnome-shell 3MiB |
| 0 N/A N/A 206856 C ...rl/robot-display/env/bin/python 342MiB | <---- STT
| 0 N/A N/A 206924 C ...rl/robot-display/env/bin/python 2956MiB | <---- Voice 1
| 0 N/A N/A 206971 C ...rl/robot-display/env/bin/python 3534MiB | <---- Voice 2
+-----------------------------------------------------------------------------------------+
```
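To make the growth explicit, here are the per-process deltas between the two snapshots (roughly three hours apart), computed from the tables above:

```python
# Per-process GPU memory from the two nvidia-smi snapshots above, in MiB
snapshots = {
    "STT (206856)":     (342, 342),
    "Voice 1 (206924)": (1970, 2956),
    "Voice 2 (206971)": (1970, 3534),
}

for name, (before, after) in snapshots.items():
    print(f"{name}: {before} -> {after} MiB ({after - before:+d} MiB)")
# STT (206856): 342 -> 342 MiB (+0 MiB)
# Voice 1 (206924): 1970 -> 2956 MiB (+986 MiB)
# Voice 2 (206971): 1970 -> 3534 MiB (+1564 MiB)
```

So both TTS worker processes grew by roughly 1 GiB or more in about three hours, while the STT process stayed flat.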
Here are some package versions; let me know if there are others that would be relevant:
```
coqui-tts==0.27.2
coqui-tts-trainer==0.3.1
ctranslate2==4.6.1
nvidia-cublas-cu12==12.4.5.8
nvidia-cuda-cupti-cu12==12.4.127
nvidia-cuda-nvrtc-cu12==12.4.127
nvidia-cuda-runtime-cu12==12.4.127
nvidia-cudnn-cu12==9.1.0.70
nvidia-cufft-cu12==11.2.1.3
nvidia-cufile-cu12==1.13.1.3
nvidia-curand-cu12==10.3.5.147
nvidia-cusolver-cu12==11.6.1.9
nvidia-cusparse-cu12==12.3.1.170
nvidia-cusparselt-cu12==0.6.2
nvidia-nccl-cu12==2.21.5
nvidia-nvjitlink-cu12==12.4.127
nvidia-nvshmem-cu12==3.3.20
nvidia-nvtx-cu12==12.4.127
onnxruntime==1.23.2
realtimestt==0.3.104
realtimetts==0.5.0
torch==2.6.0+cu124
torchaudio==2.6.0+cu124
torchvision==0.21.0+cu124
transformers==4.52.1
```
I'll try to keep digging, but in the meantime, any pointers or advice to proactively tighten the implementation would be very much appreciated! Thanks in advance!
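One thing I'm going to try: from what I've read, PyTorch's caching allocator keeps freed blocks reserved, so some of what nvidia-smi reports may be allocator cache rather than a true leak, and `torch.cuda.empty_cache()` hands those cached blocks back to the driver. Since each voice shows up as its own PID above, the engines apparently synthesize in worker processes, so calling it from my main-process callback may not reach them; this is an experiment, not a fix:

```python
def on_audio_stream_stop():
    print('Audio Done.')
    # Experiment: release cached-but-unused CUDA blocks back to the driver.
    # Guarded so the callback still works where torch/CUDA is absent.
    try:
        import torch
        if torch.cuda.is_available():
            torch.cuda.empty_cache()
    except ImportError:
        pass
```

If the numbers keep climbing even with this in place, that would point at a real reference leak rather than allocator caching. I'll report back either way.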