This is a web interface for the Mel-Band-Roformer Vocal Model that separates vocals from music tracks.
- Simple web interface
- Supports multiple audio formats (MP3, WAV, OGG, FLAC, M4A)
- Automatic model and configuration download
- Real-time processing status updates
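
The format list above can be turned into a quick pre-upload check. The following is a minimal sketch, not code from this repository: the helper name `is_supported_audio` is hypothetical, and it only inspects the file extension, not the actual file contents.

```python
from pathlib import Path

# Extensions taken from the supported-formats list above.
SUPPORTED_EXTENSIONS = {".mp3", ".wav", ".ogg", ".flac", ".m4a"}


def is_supported_audio(path):
    """Return True if the file extension is one of the listed formats.

    Hypothetical helper: checks the extension only (case-insensitive).
    """
    return Path(path).suffix.lower() in SUPPORTED_EXTENSIONS
```

For example, `is_supported_audio("song.MP3")` is true, while a `.txt` file is rejected.
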
- Clone the repository:
git clone https://github.com/Atm4x/Mel-Band-Roformer-Vocal-Model-GUI
cd Mel-Band-Roformer-Vocal-Model-GUI
- Create and activate a virtual environment (optional but recommended):
python -m venv venv
source venv/bin/activate # On Windows use: venv\Scripts\activate
- Install requirements:
pip install -r requirements.txt
Simply run:
python inference.py
The application will:
- Automatically download the required model and configuration files
- Start a web server at http://localhost:5000
- Open your web browser and go to http://localhost:5000
- Upload an audio file using the web interface
- Wait for processing to complete
- Download the separated vocal and instrumental tracks
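
Because the first start can take a while, it may help to wait until the server accepts connections before opening the browser. The sketch below is an illustration under stated assumptions, not part of the application: `wait_for_port` is a hypothetical helper, and the only detail taken from this README is the address `localhost:5000`.

```python
import socket
import time


def wait_for_port(host="localhost", port=5000, timeout=60.0, interval=1.0):
    """Return True once host:port accepts a TCP connection.

    Hypothetical helper: retries every `interval` seconds and returns
    False if no connection succeeds within `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            pass  # Not listening yet (refused or timed out); retry below.
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

Once `wait_for_port()` returns true, the page can be opened by hand or with the standard library call `webbrowser.open("http://localhost:5000")`.
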
- The first run will download the model (~900MB) and configuration files
- Processing time depends on your computer's specifications and the length of the audio file
- Generated files are stored in the outputs folder
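
To see what has been generated so far, the outputs folder can be listed from Python. This is a minimal sketch that assumes the folder sits in the current working directory. The helper name `list_outputs` is hypothetical, and no assumption is made about how the application names its files.

```python
from pathlib import Path


def list_outputs(folder="outputs"):
    """Return the files in `folder`, newest first.

    Hypothetical helper: returns an empty list if the folder
    does not exist yet (for example, before the first run).
    """
    root = Path(folder)
    if not root.is_dir():
        return []
    files = [p for p in root.iterdir() if p.is_file()]
    # Sort by modification time so the most recent results come first.
    return sorted(files, key=lambda p: p.stat().st_mtime, reverse=True)
```
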
- Python 3.7 or higher
- At least 6GB RAM
- GPU is recommended but not required
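
The requirements above can be partly checked before the first run. The sketch below is an assumption-laden illustration: the helper names are hypothetical, and the GPU check assumes the model runs on PyTorch with CUDA, which this README does not state. The RAM requirement is not checked here.

```python
import importlib.util
import sys

# Minimum version taken from the requirements list above.
MIN_PYTHON = (3, 7)


def python_ok(version=None):
    """Return True if `version` (default: the running interpreter) is >= 3.7."""
    version = sys.version_info if version is None else version
    return tuple(version[:2]) >= MIN_PYTHON


def gpu_available():
    """Best-effort GPU check, assuming PyTorch is the backend.

    Returns False when PyTorch is not installed or sees no CUDA device.
    """
    if importlib.util.find_spec("torch") is None:
        return False
    import torch  # Imported lazily so the check works without PyTorch.
    return torch.cuda.is_available()
```

If `gpu_available()` returns false, the application should still work on the CPU, since a GPU is recommended but not required.
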
This project is licensed under the MIT License - see the LICENSE file for details.