🎙️ SpeechLine

SpeechLine is a speech labeling pipeline that handles end-to-end, offline, batch audio categorization, transcription, segmentation, and logging. It supports multiple state-of-the-art speech recognition models, including Wav2Vec2, Whisper, Parakeet, Parakeet TDT, and Canary.

Pipeline Overview

Available Modules

Figure inspired by BERTopic's Modularity Diagram

Examples

SpeechLine on AWS SageMaker

