Speech Activity Detection (SAD) service

Network Protocol: HTTP

Server Framework: Flask

Endpoint: http://<host>:<port>/sad (for example, http://localhost:5005/sad)

Message Format:

A FormData object including two fields:

file: An audio data file (MP3 or WAV).
threshold: A floating-point number between 0 and 1.
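A minimal client sketch for the message format above, using only the Python standard library. The function names (build_sad_request, detect_speech), the default threshold of 0.5, and the use of the POST method are assumptions made for illustration; the service description only specifies the two form fields.

```python
import json
import os
import urllib.request
import uuid


def build_sad_request(audio_bytes, filename, threshold):
    """Encode the `file` and `threshold` fields as a multipart/form-data body."""
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must be between 0 and 1")
    boundary = uuid.uuid4().hex
    # First part: the audio file itself.
    file_header = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode("utf-8")
    # Second part: the threshold, followed by the closing boundary.
    threshold_part = (
        f"\r\n--{boundary}\r\n"
        'Content-Disposition: form-data; name="threshold"\r\n\r\n'
        f"{threshold}\r\n"
        f"--{boundary}--\r\n"
    ).encode("utf-8")
    body = file_header + audio_bytes + threshold_part
    content_type = f"multipart/form-data; boundary={boundary}"
    return body, content_type


def detect_speech(url, audio_path, threshold=0.5):
    """Send an audio file to the SAD endpoint and return the decoded JSON."""
    with open(audio_path, "rb") as f:
        audio_bytes = f.read()
    body, content_type = build_sad_request(
        audio_bytes, os.path.basename(audio_path), threshold
    )
    # Assumption: the endpoint accepts POST, as is usual for form uploads.
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": content_type}, method="POST"
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

With the server running, a call such as detect_speech("http://localhost:5005/sad", "sample.wav", 0.5) would return the parsed response; sample.wav is a placeholder file name.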

Output Format:

A JSON object with a field named sad_annotation, which lists the speech segments detected in the audio file. Each segment has begin and end fields giving the segment's start and end times in seconds.

Output Example:

{
  "sad_annotation": [
    {
      "begin": 0.76,
      "end": 1.06
    },
    {
      "begin": 1.66,
      "end": 2.36
    }
  ]
}
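One way to consume this output is sketched below in Python: it parses the example response and sums the segment durations. The helper name total_speech_seconds is hypothetical and not part of the service.

```python
import json

# The example response from above, as it would arrive over HTTP.
response_text = """
{
  "sad_annotation": [
    {"begin": 0.76, "end": 1.06},
    {"begin": 1.66, "end": 2.36}
  ]
}
"""


def total_speech_seconds(response):
    """Sum the durations (end - begin) of all speech segments, in seconds."""
    return sum(seg["end"] - seg["begin"] for seg in response["sad_annotation"])


response = json.loads(response_text)
for segment in response["sad_annotation"]:
    print(f"speech from {segment['begin']:.2f}s to {segment['end']:.2f}s")
print(f"total speech: {total_speech_seconds(response):.2f}s")
```

For the example above, the two segments add up to roughly one second of speech.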

Limits and Constraints:

Audio chunks should not exceed 10 MB.
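A client can check the size before uploading. The sketch below assumes that 10 MB means 10 * 1024 * 1024 bytes; the server may count differently, so treat the constant and the helper name fits_upload_limit as illustrative.

```python
import os

# Assumption: "10 MB" is interpreted as 10 * 1024 * 1024 bytes.
MAX_UPLOAD_BYTES = 10 * 1024 * 1024


def fits_upload_limit(size_in_bytes, limit=MAX_UPLOAD_BYTES):
    """Return True if a payload of the given size is within the limit."""
    return size_in_bytes <= limit


def file_fits_upload_limit(path):
    """Check an audio file on disk against the limit before sending it."""
    return fits_upload_limit(os.path.getsize(path))
```

Longer recordings that fail this check would need to be split into smaller chunks before being sent.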

Launching the Server:

To launch the server using Docker and docker-compose, run the following command in the root directory of the project, where the docker-compose.yml file is located:

sudo docker-compose up

This command starts the server container based on the configuration in the docker-compose.yml file. Once the container is running, the server is reachable at http://localhost:5005/sad, or at the URL and port set in your configuration.

Make sure Docker and docker-compose are installed and properly configured on your machine before running this command.

Feel free to adjust the configuration in the Dockerfile and docker-compose.yml file based on your specific server setup and requirements.

Acknowledgements

This project includes code from SincNet by mravanelli. The following file has been adapted from the original repository:

dnn_models.py
