
Developing Whisper Service

Bennett Wu edited this page Sep 24, 2025 · 7 revisions

Resources

Core Technologies

Tools

Other

Setup

  1. Ensure Python and pip are installed on your computer.
    • Download from: https://www.python.org/
      • Note: Whisper service was developed with Python 3.12 and up in mind.
    • Python comes with pip, which we will use for managing our dependencies.
    • The commands python --version and pip --version will return the version numbers of Python and pip respectively.
      • Make sure both commands run successfully. If they don't, make sure Python and pip are in your system path configuration.
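You can also confirm the interpreter version from within Python itself. A minimal sketch (the helper function below is illustrative, not part of the codebase), assuming the 3.12 minimum noted above:

```python
import sys

def meets_minimum(version_info=sys.version_info, minimum=(3, 12)):
    """Return True if the interpreter meets the minimum Python version."""
    return tuple(version_info[:2]) >= minimum

print("Python version OK" if meets_minimum() else "Python 3.12+ recommended")
```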
  2. Ensure Git is installed on your computer. See: https://git-scm.com/
  3. Clone the repository
    git clone https://github.com/scribear/ScribeAR-NodeServer
    
  4. Move into the whisper-service directory
    cd ./ScribeAR-NodeServer/whisper-service
    
  5. (Optional but recommended) Create a Python virtual environment
    • https://docs.python.org/3/library/venv.html
    • Create the environment in a folder called .venv
      python -m venv .venv
      
    • Activate the virtual environment
      • POSIX systems (e.g. Linux, MacOS)
        • bash/zsh (probably what you are using)
          source .venv/bin/activate
          
        • fish
          source .venv/bin/activate.fish
          
        • csh/tcsh
          source .venv/bin/activate.csh
          
        • pwsh
          .venv/bin/Activate.ps1
          
      • Windows
        • cmd.exe
          .venv\Scripts\activate.bat
        • PowerShell
          .venv\Scripts\Activate.ps1
          
  6. Install dependencies
    pip install -r requirements.txt
    
  7. Make a copy of template.env and name it .env
  8. Make a copy of device_config.template.json and name it device_config.json
  9. Edit .env and device_config.json to configure whisper service.
    • The templates already contain sensible defaults. A good place to start is to leave everything except API_KEY at its default.
    • See Configuring Whisper Service for details about each option.
  10. Start up your local instance. This will start the development server and automatically restart the app when you make changes.
    python index.py --dev
    

Unit Testing

Whisper service is unit tested using Pytest. These tests check that individual components work as expected in isolation.

  • To run tests without code coverage
    pytest
    
  • To run tests with code coverage. Coverage results are written to the htmlcov folder; open the .html files in your browser to see the coverage report.
    pytest --cov=. --cov-report=html
    
  • To create new tests, create a file ending with _test.py in the same folder as the function/object you want to test.
    • The name of each test file should correspond to the name of the file containing the code you are testing.
      • Try to avoid writing tests that involve multiple files. If you find yourself doing so, it might be a sign that the scope of your function is too big.
    • See https://docs.pytest.org/en/stable/ to learn how to use Pytest
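As an illustration of the convention, a hypothetical utils/chunking.py containing split_chunks() would get a sibling utils/chunking_test.py along these lines. All names here are made up for the example, and split_chunks() is defined inline so the snippet is self-contained (in a real test file you would import it from the module under test):

```python
def split_chunks(samples, size):
    """Split a list of audio samples into fixed-size chunks."""
    return [samples[i:i + size] for i in range(0, len(samples), size)]

# Pytest discovers functions named test_* automatically.
def test_split_chunks_even():
    assert split_chunks([1, 2, 3, 4], 2) == [[1, 2], [3, 4]]

def test_split_chunks_remainder():
    # The final chunk keeps whatever samples are left over.
    assert split_chunks([1, 2, 3], 2) == [[1, 2], [3]]
```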

Code Style

Whisper service uses Pylint to ensure a consistent code style. Pylint uses the PEP-8 Style Guide (https://peps.python.org/pep-0008/).

  • Run linter for a single file:
    pylint [path_to_file]
    
  • Run linter for all .py files
    pylint ./**/*.py *.py
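For a sense of what Pylint expects under PEP-8, here is a toy module (not from the codebase) showing the headline rules: snake_case names, docstrings, type hints, and four-space indentation.

```python
"""Toy module written in the PEP-8 style that Pylint enforces."""

def chunk_duration(num_samples: int, sample_rate: int) -> float:
    """Return the duration in seconds of an audio chunk."""
    return num_samples / sample_rate

print(chunk_duration(16000, 16000))  # → 1.0
```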
    

Containerization

  1. Ensure you have Docker installed. See: https://www.docker.com/
  2. Build container
    • For CPU only container
      docker build -t scribear-whisper-service -f ./Dockerfile_CPU .
      
    • For CUDA support
      docker build -t scribear-whisper-service-cuda -f ./Dockerfile_CUDA .
      
  3. Make a copy of template.env and name it .env
  4. Edit .env to configure container. See Configuring Docker Containers for details.
  5. Run container (listens on port 8000)
    • For CPU only container
      docker run --env-file .env -p 8000:80 scribear-whisper-service:latest
      
    • For CUDA support
      docker run --env-file .env -p 8000:80 --gpus all scribear-whisper-service-cuda:latest
      

Dependencies

Python dependencies are listed in requirements.txt. When modifying dependencies, update requirements.txt and re-run pip install -r requirements.txt so your environment stays in sync.

Implementing a New Transcription Backend

Whisper service is designed to be extendable to use multiple transcription backends. Here are the steps for implementing a new backend.

  1. Implement TranscriptionModelBase found in model_bases/transcription_model_base.py
    • This can be achieved by directly implementing TranscriptionModelBase or by implementing any of the child classes defined in model_bases/. These child classes implement commonly used methods and patterns to make development easier.
    • See Transcription Model Bases for documentation about implementing these.
  2. Add python dependencies for your model to requirements.txt.
  3. Associate a unique implementation_id with your implementation.
    • Add a new ModelImplementationId enum value for your implementation_id in custom_types/config_types.py
    • Update the import_model_implementation() function in model_implementations/import_model_implementations.py to add a mapping that returns your implementation
  4. Update documentation Model Implementations and Configuration to include your model.
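Putting the steps together, a skeletal backend might look like the sketch below. The base class here is a stand-in: the real interface lives in model_bases/transcription_model_base.py, and its method names and signatures may differ.

```python
from abc import ABC, abstractmethod

class TranscriptionModelBase(ABC):
    """Stand-in for the real base class in model_bases/transcription_model_base.py."""

    @abstractmethod
    def transcribe(self, audio_chunk: bytes) -> str:
        """Turn a chunk of raw audio into text."""

class EchoBackend(TranscriptionModelBase):
    """Toy backend; a real one would load a model and run inference."""

    def transcribe(self, audio_chunk: bytes) -> str:
        return f"received {len(audio_chunk)} bytes"

backend = EchoBackend()
print(backend.transcribe(b"\x00" * 320))  # → received 320 bytes
```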

File Structure

  • app_config/
    • load_config.py
      • This function loads the .env file and returns the AppConfig object.
    • model_factory.py
      • This function instantiates a model corresponding to the model key defined in DeviceConfig.
      • Returns the instantiated model instance.
  • custom_types/
    • Contains type hint definitions for objects used throughout the application
  • model_bases/
    • Contains the transcription model base implementations.
    • These do not provide transcriptions, but provide an interface and common methods for developing a transcription model.
    • See Transcription Model Bases to learn about these.
  • model_implementations/
    • Contains implementations of transcription models.
    • These do the hard work to provide transcriptions.
    • import_model_implementations.py
      • Defines the mappings from implementation id to model implementation.
      • Also defines a function to dynamically import model implementation by implementation id.
  • server/
    • Contains files related to defining the FastAPI webserver
    • helpers/
      • Contains helper functions for FastAPI server
    • create_server.py
      • Defines the FastAPI server.
      • Takes in the app configuration object and returns a FastAPI instance.
      • The /healthcheck and /whisper routes are defined here.
  • utils/
    • Contains utility functions.
  • .dockerignore
  • .gitignore
  • Dockerfile_CPU
  • Dockerfile_CUDA
  • index.py
    • This is the entrypoint into our application, the first thing that is run.
    • It loads the configuration, creates the FastAPI server, and starts the server.
  • requirements.txt
  • template.env
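The implementation-id mapping described for import_model_implementations.py might look roughly like this. The enum value, class name, and exact function signature below are assumptions for illustration; the real ones live in custom_types/config_types.py and model_implementations/.

```python
from enum import Enum

class ModelImplementationId(Enum):
    """Hypothetical id; real values live in custom_types/config_types.py."""
    FAKE = "fake"

class FakeModel:
    """Placeholder implementation used only for this sketch."""

    def transcribe(self, audio_chunk: bytes) -> str:
        return ""

# Mapping from implementation id to model class.
_IMPLEMENTATIONS = {ModelImplementationId.FAKE: FakeModel}

def import_model_implementation(implementation_id: ModelImplementationId):
    """Look up and return the model class registered for the given id."""
    return _IMPLEMENTATIONS[implementation_id]

model_cls = import_model_implementation(ModelImplementationId.FAKE)
print(model_cls.__name__)  # → FakeModel
```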

Github Actions

The Github Actions definitions for whisper service can be found in .github/workflows/whisper-service-ci.yml.

The following jobs are defined:

  • test-lint-whisper-service

    • Runs pytest and pylint --disable=import-error $(git ls-files '*.py') to ensure that the code passes unit tests and has no code style errors.
    • If either command fails, the job fails.
  • build-cpu-container-whisper-service

    • Builds the scribear-whisper-service-cpu Docker container and pushes to Dockerhub
    • Images are tagged with pull request id, branch, and Github tags.
    • Runs after test-lint-whisper-service finishes successfully
  • build-cuda-container-whisper-service

    • Builds the scribear-whisper-service-cuda Docker container and pushes to Dockerhub
    • Images are tagged with pull request id, branch, and Github tags.
    • Runs after test-lint-whisper-service finishes successfully

Additional Documentation

Additional documentation can be found in Documentation.
