Welcome to the Python Audio Processing Tutorials repository. It contains three projects on real-time audio processing, each built on a different set of technologies and frameworks.
This project demonstrates how to set up a real-time streaming Speech-to-Text (STT) API using gRPC, with audio captured from a microphone. It requires the PyAudio library and additional setup for gRPC communication.
- Requirements: PyAudio (see installation instructions).
- Setup: Download the necessary .proto files and generate gRPC client code. For details, check the project's README.
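Generating gRPC client code in Python usually goes through the `grpc_tools.protoc` module from the `grpcio-tools` package. The following is a minimal sketch, not the project's own script: the file name `stt.proto` and the `protos/` directory are hypothetical placeholders, and the real names come from the project's README.

```python
from pathlib import Path


def build_protoc_args(proto_file, proto_dir="protos", out_dir="."):
    """Build the argument list that grpc_tools.protoc expects.

    `proto_file` and `proto_dir` are placeholders for illustration only.
    """
    return [
        "grpc_tools.protoc",             # argv[0], the program name
        f"-I{proto_dir}",                # where to look for .proto files
        f"--python_out={out_dir}",       # message classes (*_pb2.py)
        f"--grpc_python_out={out_dir}",  # client/server stubs (*_pb2_grpc.py)
        str(Path(proto_dir) / proto_file),
    ]


def generate_client(proto_file, proto_dir="protos", out_dir="."):
    """Run the code generator; returns 0 on success."""
    # Imported lazily so this file loads even without grpcio-tools installed.
    from grpc_tools import protoc

    return protoc.main(build_protoc_args(proto_file, proto_dir, out_dir))
```

With `grpcio-tools` installed, calling `generate_client("stt.proto")` would write the generated modules into the current directory.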
This project showcases the use of the Triton Inference Server and Tritony Voice Activity Detection (VAD) to process and analyze audio streams effectively. It involves setting up a Docker container and running a Triton server.
- Requirements: pydub, tritonclient, tritony
- Setup: Build and run a Docker image for the Triton Inference Server. Testing is facilitated through predefined scripts and pytest. For more information, visit the project's README.
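Before running the tests, it can help to confirm that the Triton server inside the container is actually accepting requests. The sketch below polls Triton's HTTP readiness endpoint using only the standard library. It assumes the container's HTTP port is published on `localhost:8000` (Triton's default HTTP port), so adjust the host and port to match how the image is run. The function names are illustrative and are not part of the project.

```python
import urllib.request


def triton_ready_url(host="localhost", port=8000):
    """Return the URL of Triton's HTTP readiness endpoint."""
    return f"http://{host}:{port}/v2/health/ready"


def is_triton_ready(host="localhost", port=8000, timeout=2.0):
    """Return True if the server answers the readiness probe with HTTP 200."""
    try:
        with urllib.request.urlopen(triton_ready_url(host, port), timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        # Covers connection refused, timeouts, and HTTP error statuses.
        return False
```

Calling `is_triton_ready()` in a short retry loop after starting the container is one way to avoid sending requests before the models have finished loading.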
This project implements a web app using only Python's Streamlit and Return Zero's API, with no frontend or backend development knowledge required. The web app converts audio files to text using Return Zero's Speech-to-Text (STT) API, and then summarizes the converted text.
- Requirements: streamlit, requests, pytorch, transformers
- Setup: Install the required libraries, run Streamlit, and submit your information. For more information, visit the project's README.
Most project dependencies can be installed via pip:
pip install -r requirements.txt
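After installing, a quick way to see which of the packages listed above are still missing is to ask Python whether each one can be imported. This is a small sketch using only the standard library. The import names are assumptions: PyAudio is taken to import as `pyaudio` and PyTorch as `torch`, and the others are taken to import under their package names.

```python
from importlib.util import find_spec

# Import names for the requirements listed above (assumed, see lead-in).
MODULES = [
    "pyaudio", "pydub", "tritonclient", "tritony",
    "streamlit", "requests", "torch", "transformers",
]


def missing_modules(names):
    """Return the names that cannot be found on the current Python path."""
    return [name for name in names if find_spec(name) is None]


# Lists whatever is not installed in the active environment.
print("Missing:", missing_modules(MODULES))
```

Any name that shows up in the output likely needs its own installation step, as described in the corresponding project's README.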