feat: speech-to-text NVIDIA/non-NVIDIA split — CPU-capable alternative for laptops #11

@LTSCommerce

Problem

play-speech-to-text.yml currently installs faster-whisper with CUDA support, which is only useful on machines with a discrete NVIDIA GPU. On laptops with only integrated Intel/AMD graphics, CUDA is unavailable, so installing the CUDA backend wastes install time and may produce confusing errors or a silent fallback to CPU.

Current Behaviour

Single installation path: faster-whisper + CUDA regardless of hardware.

Expected Behaviour

Detect whether an NVIDIA GPU is present and choose the appropriate backend:

  • NVIDIA GPU detected → faster-whisper with CUDA (current behaviour, keep as-is)
  • No NVIDIA GPU → CPU/ROCm-compatible alternative
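
A minimal sketch of the detection step, assuming the `lspci | grep -i nvidia` check named in the acceptance criteria is good enough. The function name `choose_stt_backend` and the `cuda` / `cpu` labels are hypothetical and are not taken from the playbook; the real implementation would more likely be an Ansible task that registers a fact.

```shell
#!/bin/sh
# Hypothetical helper: reads `lspci` output on stdin and prints which
# speech-to-text backend to install. Names are illustrative only.
choose_stt_backend() {
    if grep -qi 'nvidia'; then
        # At least one PCI device mentions NVIDIA: keep the current CUDA path.
        echo "cuda"
    else
        # No NVIDIA device found: use the CPU-capable alternative.
        echo "cpu"
    fi
}

# Usage on a real machine (requires pciutils):
#   lspci | choose_stt_backend
```

Reading from stdin keeps the decision separate from the hardware probe, so it can be exercised with canned `lspci` output. A matching PCI device does not guarantee a working driver, so the CUDA path may still need its own check.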

Investigation Needed

Research and evaluate speech-to-text systems that work well on CPU or integrated graphics:

  • faster-whisper with CPU backend — same tool, just skip CUDA deps; check if performance is acceptable on modern laptop CPUs
  • whisper.cpp — pure C++ implementation, no Python deps, runs well on CPU, supports Metal/OpenCL
  • vosk — lightweight offline STT, very low resource usage, runs on CPU
  • RealtimeSTT with CPU — check if the current wsi-stream wrapper can work without CUDA
  • sherpa-onnx — ONNX-based, good CPU performance, supports Whisper models
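
One possible way to compare the candidates above on equal terms is the real-time factor (processing time divided by audio duration; below 1.0 means faster than real time). The sketch below is an assumption about how "acceptable" could be measured, not something the issue specifies, and the helper names `rtf` and `bench` are hypothetical.

```shell
#!/bin/sh
# rtf ELAPSED_SECONDS AUDIO_SECONDS
# Prints the real-time factor with two decimals; fails if AUDIO_SECONDS <= 0.
rtf() {
    awk -v elapsed="$1" -v audio="$2" 'BEGIN {
        if (audio <= 0) { exit 1 }
        printf "%.2f\n", elapsed / audio
    }'
}

# bench AUDIO_SECONDS COMMAND [ARGS...]
# Runs COMMAND, measures wall-clock time in whole seconds, prints its RTF.
# Whole-second resolution is coarse, so use a clip of 30 s or longer.
bench() {
    audio_seconds=$1
    shift
    start=$(date +%s)
    "$@" >/dev/null 2>&1
    end=$(date +%s)
    rtf "$((end - start))" "$audio_seconds"
}

# Usage (placeholder command, substitute each candidate's own invocation):
#   bench 30 your-stt-command sample.wav
```

Running the same clip through each candidate on a representative laptop CPU would give comparable numbers for the non-NVIDIA path.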

Acceptance Criteria

  • Playbook detects NVIDIA GPU (reuse logic from check_hardware / lspci | grep -i nvidia)
  • NVIDIA path: faster-whisper + CUDA (current)
  • Non-NVIDIA path: working alternative with acceptable latency on laptop CPU
  • Both paths use the same wsi-stream interface (or document differences)
  • play-speech-to-text.yml removed from auto_run_common in run.bash until this is resolved (currently auto-runs on all hardware)

Related

  • playbooks/imports/optional/common/play-speech-to-text.yml
  • files/home/.local/bin/wsi-stream
  • run.bash auto_run_common array
