Audio Tools API (by Nobz)

This repository provides a small, self-contained Flask-based API exposing three audio tools useful for building audio applications:

Audio analysis (tempo/BPM and musical key detection)
Audio format conversion (MP3, WAV, FLAC)
Stem separation (vocals, drums, bass, melody) using Demucs

This project is an open-source, developer-focused API that other applications can call to add audio processing features. Anyone may download, modify, and redistribute the code under the terms of the MIT License (see the License section).

Contents

Overview
Requirements
Quick start (macOS / Linux)
Configuration
Running the API
API: endpoints and examples
File lifecycle and cleanup
Development notes and recommendations
Troubleshooting
Security considerations
Contributing
License
Contact and further help

Overview

The codebase is organized with a clear separation of concerns:

app.py — Main Flask application that registers routes and blueprints.
routes/audio.py — Blueprint exposing the HTTP API endpoints for audio tools.
services.py — Service-layer helpers and higher-level wrappers.
utils.py — File I/O, audio analysis, FFmpeg-based conversion, health checks.
config.py — Basic runtime configuration (upload and converted folders).

Requirements

Python dependencies are listed in requirements.txt. This project also depends on system tools:

Python 3.11.13 (must be this version, for now)
System ffmpeg (install via brew install ffmpeg on macOS)
Demucs (either Python package or CLI; see notes below)

Note on PyTorch and Demucs:

demucs and torch may require platform-specific wheels. On macOS, you may want to install a compatible PyTorch wheel for CPU or MPS. If you plan to run stem separation locally, follow Demucs and PyTorch install instructions for your platform.

Quick start (macOS / Linux)

Clone the repository:

git clone https://github.com/nobz226/audio-tools-API.git
cd audio-tools-API

Create and activate a virtual environment:

python -m venv venv
source venv/bin/activate

Install Python dependencies:

pip install -r requirements.txt

Install system dependencies (macOS example):

# install ffmpeg
brew install ffmpeg

# optional: follow Demucs installation docs; you can install the demucs pip package
pip install demucs

Start the API:

python app.py

app.py calls app.run(..., port=5002), so the server listens on port 5002 when started this way.

Configuration

Basic runtime settings are in config.py:

UPLOAD_FOLDER — where uploaded files are saved (default: static/uploads)
CONVERTED_FOLDER — where converted or generated files are stored (default: static/converted)
SESSION_TIMEOUT — session timeout value used by the original app (may be unused in the stripped API)

You can edit config.py to change these directories, or override settings on application startup by modifying app.config.
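
One way to override the folders at startup without editing config.py is to read them from environment variables. The sketch below is an illustration only: the AUDIO_* variable names and the load_settings and ensure_folders helpers are hypothetical and not part of this repository. The defaults are the ones documented above.

```python
import os

# Defaults as documented for config.py. The AUDIO_* environment variable
# names are hypothetical and only used for this illustration.
DEFAULTS = {
    "UPLOAD_FOLDER": "static/uploads",
    "CONVERTED_FOLDER": "static/converted",
}


def load_settings(env=None):
    """Return folder settings, letting AUDIO_<NAME> variables override defaults."""
    env = os.environ if env is None else env
    return {name: env.get(f"AUDIO_{name}", default) for name, default in DEFAULTS.items()}


def ensure_folders(settings):
    """Create the configured folders if they do not exist yet."""
    for path in settings.values():
        os.makedirs(path, exist_ok=True)


# Assuming app.py exposes a Flask object named `app`, the result could be
# applied with: app.config.update(load_settings())
```

Keeping the lookup in a plain function makes it easy to test with a dictionary instead of the real environment.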

Running the API

Start the app as shown in Quick start. Once running, the following endpoints are available (all under the /audio blueprint):

GET /audio/test — simple health endpoint for the blueprint
POST /audio/analyze — analyze an uploaded audio file
POST /audio/convert — convert an uploaded audio file to a target format
POST /audio/separate — perform stem separation on an uploaded audio file

There is also a general health endpoint at /api/health which returns diagnostics about external tools.

API: endpoints and examples

All POST endpoints accept multipart form uploads. Below are cURL examples.

Analyze audio (tempo and key)

curl -X POST "http://localhost:5002/audio/analyze" \
  -F "file=@/path/to/audio.mp3"

Successful response (JSON):

{ "success": true, "analysis": { "success": true, "tempo": 120, "key": "Em" } }
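
The same multipart upload can be issued from Python using only the standard library. This is a minimal client sketch, not code from the repository: build_multipart and make_request are hypothetical helper names, and the form field name "file" is taken from the cURL example above.

```python
import mimetypes
import os
import urllib.request
import uuid


def build_multipart(fields, file_field, filename, file_bytes):
    """Encode form fields plus one file as multipart/form-data.

    Returns (body_bytes, content_type_header).
    """
    boundary = uuid.uuid4().hex
    file_type = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    parts = []
    for name, value in fields.items():
        parts.append(
            f'--{boundary}\r\nContent-Disposition: form-data; name="{name}"\r\n\r\n{value}\r\n'.encode()
        )
    parts.append(
        (
            f'--{boundary}\r\nContent-Disposition: form-data; name="{file_field}"; '
            f'filename="{filename}"\r\nContent-Type: {file_type}\r\n\r\n'
        ).encode()
        + file_bytes
        + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode())
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def make_request(url, path, fields=None):
    """Build (but do not send) a POST request uploading the file at `path`."""
    with open(path, "rb") as fh:
        body, content_type = build_multipart(
            fields or {}, "file", os.path.basename(path), fh.read()
        )
    return urllib.request.Request(
        url, data=body, headers={"Content-Type": content_type}, method="POST"
    )


# Example (requires the server to be running):
#   req = make_request("http://localhost:5002/audio/analyze", "/path/to/audio.mp3")
#   with urllib.request.urlopen(req) as resp:
#       print(resp.read().decode())
```

Extra form fields such as format=mp3 for /audio/convert can be passed through the fields argument.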

Convert audio (format conversion)

curl -X POST "http://localhost:5002/audio/convert" \
  -F "file=@/path/to/audio.wav" \
  -F "format=mp3"

The endpoint returns the converted file as a download (Content-Disposition: attachment).

Separate stems (Demucs)

curl -X POST "http://localhost:5002/audio/separate" \
  -F "file=@/path/to/song.mp3" \
  -F "model=htdemucs"   # optional — defaults to htdemucs

This endpoint runs stem separation and returns a ZIP file containing stems (if the separate_audio helper is used). Stem separation may be slow (minutes) depending on model and hardware.
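
On the client side, the ZIP response can be unpacked in memory with the standard library. The sketch below assumes the response body has already been read into bytes. The helper names are hypothetical, and the member names inside the archive depend on the server, so none are assumed here.

```python
import io
import zipfile


def list_stems(zip_bytes):
    """Return the member names of a ZIP archive held in memory."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        return archive.namelist()


def extract_stems(zip_bytes, dest_dir):
    """Extract the archive into dest_dir and return the extracted member names."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        archive.extractall(dest_dir)
        return archive.namelist()
```

Because separation can take minutes, a client would typically also set a generous timeout on the HTTP request itself.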

File lifecycle and cleanup

Uploaded files are saved into the UPLOAD_FOLDER with UUID-prefixed names. Converted or generated files are saved under CONVERTED_FOLDER.
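
A UUID prefix keeps uploads with the same original name from overwriting each other. The following is a rough sketch of the idea, not the repository's actual implementation. The unique_upload_name helper and the exact "<uuid>_<name>" layout are assumptions.

```python
import os
import uuid


def unique_upload_name(original_name):
    """Return '<uuid hex>_<basename>' for a client-supplied file name."""
    # Keep only the final path component so a name such as
    # '../../etc/passwd' cannot point outside the upload folder.
    base = os.path.basename(original_name.replace("\\", "/"))
    return f"{uuid.uuid4().hex}_{base or 'upload'}"
```

In a Flask app, Werkzeug's secure_filename is a common way to clean the base name further before saving.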

The code schedules temporary files for cleanup, and some files are deleted after a short delay. If you need to keep outputs, adjust the cleanup policy in services.py or utils.py.
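
Delayed cleanup of this kind is often built on a background timer. The sketch below shows one possible shape using threading.Timer. The schedule_delete helper is hypothetical, and the repository's own mechanism and delay may differ.

```python
import os
import threading


def schedule_delete(path, delay_seconds):
    """Delete `path` after `delay_seconds` on a background timer thread."""

    def _remove():
        try:
            os.remove(path)
        except FileNotFoundError:
            pass  # already removed elsewhere

    timer = threading.Timer(delay_seconds, _remove)
    timer.daemon = True  # do not keep the process alive just for cleanup
    timer.start()
    return timer  # call .cancel() on the result to keep the file
```

Returning the timer lets a caller cancel the deletion, which is one way to keep selected outputs.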

Development notes and recommendations

Consolidate conversion logic: There are two conversion helper implementations in the repository. One (utils.convert_audio) invokes FFmpeg via subprocess, and another wrapper exists in services.py. Consider unifying them into a single well-documented function.
Use safe subprocess invocation: avoid shell=True and prefer argument lists with subprocess.run([...], check=True) to prevent shell injection and quoting issues.
Demucs: choose either the CLI invocation or the Python API and standardize output directories and filenames for predictability.
Add upload size limits: set app.config['MAX_CONTENT_LENGTH'] to protect the server from very large uploads.
For production: run behind a WSGI server (Gunicorn / uWSGI) and add HTTPS/authorization if the API will be publicly accessible.
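
As an illustration of the safe subprocess recommendation above, the sketch below passes FFmpeg an argument list instead of a shell string. It is a hypothetical stand-in and not the repository's utils.convert_audio. The function names and the 300-second timeout are assumptions.

```python
import subprocess


def build_ffmpeg_command(src_path, dst_path):
    """Return an ffmpeg argument list; ffmpeg infers the output format from dst_path."""
    # -y overwrites an existing output file, -i names the input file.
    return ["ffmpeg", "-y", "-i", src_path, dst_path]


def convert_with_ffmpeg(src_path, dst_path, timeout=300):
    """Run ffmpeg without a shell; raises CalledProcessError if ffmpeg fails."""
    subprocess.run(
        build_ffmpeg_command(src_path, dst_path),
        check=True,           # non-zero exit status raises an exception
        capture_output=True,  # keep ffmpeg's output out of the server log
        timeout=timeout,      # raises TimeoutExpired for runaway conversions
    )
    return dst_path
```

Because each path is a separate list element, spaces and quotes in file names need no escaping.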

Troubleshooting

If ffmpeg is not found: install it system-wide (brew install ffmpeg on macOS).
If demucs errors or model files are missing: install or download the required Demucs model weights per Demucs documentation.
PyTorch installation: installing the correct torch wheel for your platform/GPU is critical. Refer to https://pytorch.org/get-started/locally/ for platform-specific instructions.
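
A quick local check of the external tools can narrow down which of the problems above applies. This is a standalone sketch and not the implementation behind /api/health. The tool_diagnostics name and the keys it returns are hypothetical.

```python
import importlib.util
import shutil


def tool_diagnostics():
    """Report whether the external tools this API relies on can be found."""
    return {
        "ffmpeg": shutil.which("ffmpeg") is not None,
        "demucs_cli": shutil.which("demucs") is not None,
        "demucs_module": importlib.util.find_spec("demucs") is not None,
        "torch_module": importlib.util.find_spec("torch") is not None,
    }


if __name__ == "__main__":
    for name, found in tool_diagnostics().items():
        print(f"{name}: {'found' if found else 'missing'}")
```

Run it inside the activated virtual environment so the module checks reflect the packages the API will actually use.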

Security considerations

This project is a developer API and is intentionally minimal. Before exposing it publicly, consider:

Authentication and authorization (API keys, OAuth, or JWT)
Rate limiting to prevent abuse
Strict upload size limits and timeouts
Input validation — the code currently saves uploaded files and relies on ALLOWED_AUDIO_EXTENSIONS but more checks (MIME type, scanning) can help
Error handling — do not expose stack traces in production responses; log them securely instead
Run the app in a sandboxed environment or container if you process untrusted audio
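
To illustrate the input validation point above, an extension check can be paired with a look at the first bytes of the upload. The sketch below is a heuristic only. The ALLOWED_AUDIO_EXTENSIONS value shown here is assumed from the formats this README lists, and the helper names are hypothetical.

```python
import os

# Assumed to mirror the formats listed in this README; the repository's
# actual ALLOWED_AUDIO_EXTENSIONS value may differ.
ALLOWED_AUDIO_EXTENSIONS = {"mp3", "wav", "flac"}


def sniff_audio_type(header):
    """Guess the format from the first bytes of a file, or return None."""
    if header[:4] == b"RIFF" and header[8:12] == b"WAVE":
        return "wav"
    if header[:4] == b"fLaC":
        return "flac"
    # MP3 files start with an ID3v2 tag or an MPEG frame sync (11 set bits).
    if header[:3] == b"ID3":
        return "mp3"
    if len(header) >= 2 and header[0] == 0xFF and (header[1] & 0xE0) == 0xE0:
        return "mp3"
    return None


def looks_like_audio(filename, header):
    """True when the extension is allowed and the leading bytes match it."""
    ext = os.path.splitext(filename)[1].lower().lstrip(".")
    return ext in ALLOWED_AUDIO_EXTENSIONS and sniff_audio_type(header) == ext
```

The frame-sync test is loose and can match other MPEG-style streams, so treat a positive result as a sanity check and not as proof that the file is safe.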

Contributing

Contributions are welcome. Suggested steps:

Fork the repository
Create a feature branch
Add tests and documentation for changes
Open a pull request with a clear description of your changes

License

MIT License

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

Contact and further help

If you run into problems, open an issue in the repository with details about your environment, the commands you ran, and the errors you saw. Include python --version, pip freeze output, and ffmpeg -version where relevant.
