
Tone Tinker

Tone Tinker is a machine learning project that predicts Native Instruments Massive VST parameters from audio samples. It uses a two-stage model architecture to reverse-engineer synthesizer settings from raw audio.

Overview

The project consists of two main components:

  1. Variational Autoencoder: Compresses audio spectrograms into a lower-dimensional latent space
  2. Sound Designer: Predicts synthesizer parameters from the encoded audio representation
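
At inference time the two models are chained: the VAE encoder compresses a spectrogram into a latent vector, and the Sound Designer maps that vector to parameter predictions. A minimal sketch of that flow, assuming hypothetical `VAE` and `SoundDesigner` PyTorch modules with the interfaces described above (the real classes live in src/ and may differ):

```python
import torch

# Hypothetical imports; the actual module layout in src/ may differ.
from vae import VAE                        # encoder/decoder over log spectrograms
from sound_designer import SoundDesigner   # MLP over the latent vector

@torch.no_grad()
def predict_parameters(log_spec: torch.Tensor, vae: VAE, designer: SoundDesigner):
    """Map a log spectrogram of shape (freq_bins, frames) to Massive parameters."""
    # Stage 1: compress the spectrogram into a latent vector (use the mean,
    # ignoring the sampled noise at inference time).
    mu, _logvar = vae.encode(log_spec.unsqueeze(0))

    # Stage 2: predict 2 continuous parameters and 5 wavetable logits.
    continuous, wavetable_logits = designer(mu)
    wavetable_idx = wavetable_logits.argmax(dim=-1)
    return continuous.squeeze(0), wavetable_idx.item()
```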

Architecture

(Architecture diagram: tone-tinker-arch)

Data Pipeline

  1. Raw audio (.wav) files → Log spectrograms
  2. Spectrograms → Variational Autoencoder → Latent representation
  3. Latent representation → Sound Designer → Synthesizer parameters
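
The first stage (raw audio to log spectrograms) is a standard STFT followed by a log. A minimal sketch using librosa, where the FFT size, hop length, and log floor are placeholder values rather than the project's actual settings (those live in src/preprocess.py):

```python
import numpy as np
import librosa

def wav_to_log_spectrogram(path: str, sr: int = 44100) -> np.ndarray:
    """Load a .wav file and return a log-magnitude spectrogram."""
    audio, _ = librosa.load(path, sr=sr, mono=True)
    # STFT settings here are illustrative only.
    stft = librosa.stft(audio, n_fft=1024, hop_length=256)
    magnitude = np.abs(stft)
    return np.log(magnitude + 1e-6)  # small floor avoids log(0)
```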

Models

  • Variational Autoencoder: Convolutional neural network that learns to compress and reconstruct audio spectrograms
  • Sound Designer: Multi-Layer Perceptron that predicts:
    • 2 continuous parameters (regression)
    • 1 categorical parameter (5-class classification for wavetable selection)
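
A Sound Designer with this output structure can be written as an MLP with two heads: a regression head for the two continuous parameters and a 5-way classification head for the wavetable. A minimal sketch, with the latent and hidden sizes chosen arbitrarily:

```python
import torch
import torch.nn as nn

class SoundDesigner(nn.Module):
    """MLP that maps a VAE latent vector to synth parameter predictions."""

    def __init__(self, latent_dim: int = 64, hidden_dim: int = 256, num_wavetables: int = 5):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Linear(latent_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim),
            nn.ReLU(),
        )
        # Two continuous parameters in [0, 1], as VST knobs typically are.
        self.regression_head = nn.Sequential(nn.Linear(hidden_dim, 2), nn.Sigmoid())
        # Logits over the 5 candidate wavetables.
        self.classification_head = nn.Linear(hidden_dim, num_wavetables)

    def forward(self, latent: torch.Tensor):
        features = self.backbone(latent)
        return self.regression_head(features), self.classification_head(features)
```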

Dataset

The dataset was generated programmatically using:

  • Spotify's Pedalboard library to interface with the Massive VST
  • 1000 training presets with randomized parameters
  • Parameters sampled:
    • 2 continuous parameters
    • Wavetable selection (5 options: "Squ-Sw I", "Sin-Tri", "Plysaw II", "Esca II", "A.I.")
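
Dataset generation with Pedalboard boils down to loading the Massive VST3, randomizing the chosen parameters, rendering a note, and saving the audio alongside the parameter labels. The sketch below is illustrative only: the plugin path, parameter names, and note/duration values are assumptions, not the ones used in research/20240627_generate_dataset.py.

```python
import random
from mido import Message
from pedalboard import load_plugin
from pedalboard.io import AudioFile

SAMPLE_RATE = 44100
WAVETABLES = ["Squ-Sw I", "Sin-Tri", "Plysaw II", "Esca II", "A.I."]

# Path is a placeholder; inspect `massive.parameters` to find the real parameter names.
massive = load_plugin("/Library/Audio/Plug-Ins/VST3/Massive.vst3")

for i in range(1000):
    # Randomize the sampled parameters (names here are hypothetical).
    params = {
        "osc1_wt_position": random.random(),
        "filter1_cutoff": random.random(),
        "osc1_wavetable": random.choice(WAVETABLES),
    }
    for name, value in params.items():
        setattr(massive, name, value)

    # Render a short note through the instrument plugin.
    audio = massive(
        [Message("note_on", note=60), Message("note_off", note=60, time=2.0)],
        duration=3.0,
        sample_rate=SAMPLE_RATE,
    )

    with AudioFile(f"preset_{i:04d}.wav", "w", SAMPLE_RATE, audio.shape[0]) as f:
        f.write(audio)
    # The sampled params would be saved alongside (e.g. as JSON) as training labels.
```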

Examples

Original:

output_0.mp4
output_1.mp4
output_2.mp4

Reconstructed:

reconstructed_output_0.mp4
reconstructed_output_1.mp4
reconstructed_output_2.mp4

Getting Started

Prerequisites

  • Python 3.11+
  • Poetry 1.8.3
  • Native Instruments Massive VST3

Installation

git clone https://github.com/dariowsz/tone-tinker.git
cd tone-tinker
poetry install
pip install -r requirements.macos.txt

Training

  1. Generate the dataset: Follow the instructions in research/20240627_generate_dataset.py.

  2. Preprocess the data (this will generate the spectrograms and save them to disk):

python src/preprocess.py

  3. Train the variational autoencoder:

python src/vae_trainer.py

  4. Train the sound designer:

python src/sound_designer_trainer.py
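
Because the Sound Designer mixes regression and classification outputs, its trainer presumably combines a regression loss with a cross-entropy term. A hedged sketch of the combined loss, using the hypothetical SoundDesigner module sketched earlier and an arbitrary loss weighting:

```python
import torch.nn.functional as F

def sound_designer_loss(model, latents, target_params, target_wavetables, ce_weight=1.0):
    """MSE on the two continuous knobs plus cross-entropy on the wavetable class."""
    pred_params, wavetable_logits = model(latents)
    regression_loss = F.mse_loss(pred_params, target_params)
    classification_loss = F.cross_entropy(wavetable_logits, target_wavetables)
    return regression_loss + ce_weight * classification_loss
```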

Demo

To run the Streamlit demo, use the following command:

streamlit run demo.py
(Demo preview: tone-tinker-demo)
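
demo.py is not reproduced here, but a Streamlit front end for this pipeline typically uploads an audio file, runs the preprocessing and both models, and displays the predicted parameters. A rough sketch under those assumptions (the helper modules and functions imported below are hypothetical):

```python
import streamlit as st
import torch

# Hypothetical helpers; the real demo.py may be organized differently.
from preprocess import wav_to_log_spectrogram
from inference import load_models, predict_parameters

st.title("Tone Tinker")
uploaded = st.file_uploader("Upload a Massive sample (.wav)", type=["wav"])

if uploaded is not None:
    st.audio(uploaded)
    vae, designer = load_models()
    log_spec = torch.from_numpy(wav_to_log_spectrogram(uploaded)).float()
    continuous, wavetable_idx = predict_parameters(log_spec, vae, designer)

    st.subheader("Predicted parameters")
    st.write({"continuous": continuous.tolist(), "wavetable_index": wavetable_idx})
```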

Project Status

This is an initial proof of concept with a limited parameter set. Future improvements may include:

  • Expanding to more synthesizer parameters
  • Exploring more complex model architectures
  • Using other AI techniques like Reinforcement Learning to train the sound designer

Acknowledgments

  • Spotify's Pedalboard library for VST automation
  • Native Instruments Massive VST
