Not sure if this is possible, but the normal Qwen3-TTS models are incredibly slow for me: generating audio takes four times longer than real time.
However, the optimizations in https://github.com/andimarafioti/faster-qwen3-tts make it run 2x faster than real time on my hardware. I'm not sure how hard it would be to add, but could support for this set of optimizations be added to the program? Your UI is extremely clean.