Audiogenesis is a background music generation model built on Recurrent Neural Networks (RNNs), specifically Long Short-Term Memory (LSTM) units. It generates soothing background music tailored to text prompts and advertisements, having learned from a large database of MIDI files.
- MIDI Encoding: MIDI files contain the notes, timing, and other parameters needed for music generation. Audiogenesis first encodes each file into a sequence the RNN can consume, representing every note by three features: step, pitch, and duration.
- LSTM Architecture: The core of Audiogenesis is its LSTM units. LSTMs are a type of RNN designed to capture dependencies over long sequences, which makes them well suited to generating music with coherent structure.
- Text Prompt Embedding: Text prompts and advertisements are embedded into a format compatible with the LSTM network. This embedding serves as the input to the LSTM units, guiding the generation process towards music that complements the given text.
- Training on MIDI Database: Audiogenesis is trained on a diverse database of MIDI files encompassing various musical styles, genres, and compositions. This extensive training enables the model to learn the nuances of musical composition and style.
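To make the MIDI Encoding step more concrete, here is a minimal sketch of how notes might be turned into (pitch, step, duration) triples. It assumes that step is the time since the previous note's onset and that duration is how long the note is held; these definitions, the `Note` type, and the `encode_notes` helper are illustrative assumptions, not necessarily how the repository implements them. Parsing the MIDI file itself is left out, so the notes are given directly.

```python
from typing import List, NamedTuple, Tuple


class Note(NamedTuple):
    """Hypothetical note record; a MIDI parser would normally supply these."""
    pitch: int    # MIDI note number, 0-127
    start: float  # onset time in seconds
    end: float    # release time in seconds


def encode_notes(notes: List[Note]) -> List[Tuple[int, float, float]]:
    """Convert absolute-time notes into (pitch, step, duration) triples.

    Assumed definitions:
      step     = time since the previous note's onset (0.0 for the first note)
      duration = end - start of the note itself
    """
    ordered = sorted(notes, key=lambda n: n.start)
    features = []
    prev_start = ordered[0].start if ordered else 0.0
    for note in ordered:
        features.append((note.pitch, note.start - prev_start, note.end - note.start))
        prev_start = note.start
    return features


# A three-note C major arpeggio: C4, E4, G4
melody = [Note(60, 0.0, 0.5), Note(64, 0.5, 1.0), Note(67, 1.0, 2.0)]
encoded = encode_notes(melody)
print(encoded)  # [(60, 0.0, 0.5), (64, 0.5, 0.5), (67, 0.5, 1.0)]
```

Encoding onsets as relative steps rather than absolute times keeps the values in a similar range throughout a piece, which is generally easier for a sequence model to learn from.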
celeste_piano.mp4
music_hype.mp4
drum.mp4
The architecture of Audiogenesis draws on earlier LSTM-based music generation models.
To use Audiogenesis to generate background music from text prompts, follow these steps:
Inside the repository, you will find:
- Model Definition: Details regarding the architecture of Audiogenesis, including LSTM units and text prompt embedding techniques.
- Training Scripts: Scripts and code for training the Audiogenesis model using your own MIDI dataset or pre-existing data.
- Inference Code: Code for generating background music using the trained model. You can input your text prompts to receive musical compositions.
- Evaluation Techniques: Techniques for evaluating the quality of generated music, ensuring coherence and relevance to the given text prompts.
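As a rough illustration of what the inference side has to do after the model has produced a sequence, the sketch below turns predicted (pitch, step, duration) triples back into notes with absolute start and end times, ready to be written to a MIDI file. The `decode_features` helper, the clamping rules, and the sample values are assumptions made for this example; the repository's inference code may handle these details differently.

```python
from typing import List, Tuple


def decode_features(
    features: List[Tuple[float, float, float]],
    start_time: float = 0.0,
) -> List[Tuple[int, float, float]]:
    """Turn (pitch, step, duration) triples into (pitch, start, end) notes.

    Model outputs are continuous, so this sketch rounds the pitch to the
    nearest MIDI note number, clamps it to 0-127, and treats negative
    steps or durations as zero.
    """
    notes = []
    prev_start = start_time
    for pitch, step, duration in features:
        midi_pitch = max(0, min(127, int(round(pitch))))
        start = prev_start + max(0.0, step)
        end = start + max(0.0, duration)
        notes.append((midi_pitch, start, end))
        prev_start = start
    return notes


# Made-up model output, including an out-of-range pitch and a negative step
generated = [(60.2, 0.0, 0.5), (64.0, 0.5, 0.5), (130.0, -0.1, 1.0)]
decoded = decode_features(generated)
print(decoded)  # [(60, 0.0, 0.5), (64, 0.5, 1.0), (127, 0.5, 1.5)]
```

Clamping is a defensive choice: a regression-style output head can drift outside the valid MIDI range, and a negative step would otherwise move a note backwards in time.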
The repository includes MIDI files alongside corresponding text prompts for training and evaluation purposes.