MSTransformer

This project introduces two transformer-based architectures for music source separation: MSTransformer and MSTU. MSTransformer is a purely transformer-based architecture, following the method of "Attention Is All You Need" (Vaswani et al., 2017). MSTU combines a transformer encoder with the U-Net architecture to encode additional information and improve performance.
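To make the hybrid idea concrete, here is a minimal PyTorch sketch of a U-Net-style model with a transformer encoder as the bottleneck. It is an illustration only, not the repository's actual mstu.py; the class name, layer sizes, and shapes are all hypothetical.

```python
import torch
import torch.nn as nn

class TinyMSTU(nn.Module):
    """Illustrative hybrid: conv downsampling -> transformer encoder -> conv upsampling.
    Hypothetical sketch; not the repository's actual mstu.py."""

    def __init__(self, channels=1, d_model=64, nhead=4, num_layers=2):
        super().__init__()
        # Downsampling path (U-Net style): halve freq/time, grow channels.
        self.down = nn.Sequential(
            nn.Conv2d(channels, d_model // 2, kernel_size=3, stride=2, padding=1),
            nn.ReLU(),
            nn.Conv2d(d_model // 2, d_model, kernel_size=3, stride=2, padding=1),
            nn.ReLU(),
        )
        # Transformer encoder as the bottleneck over the flattened feature map.
        layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=nhead, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=num_layers)
        # Upsampling path: mirror the downsampling with transposed convolutions.
        self.up = nn.Sequential(
            nn.ConvTranspose2d(d_model, d_model // 2, kernel_size=4, stride=2, padding=1),
            nn.ReLU(),
            nn.ConvTranspose2d(d_model // 2, channels, kernel_size=4, stride=2, padding=1),
        )

    def forward(self, x):                      # x: (batch, channels, freq, time)
        z = self.down(x)                       # (batch, d_model, F', T')
        b, c, f, t = z.shape
        seq = z.flatten(2).transpose(1, 2)     # (batch, F'*T', d_model)
        seq = self.encoder(seq)                # attention across the whole feature map
        z = seq.transpose(1, 2).reshape(b, c, f, t)
        return self.up(z)                      # estimate at input resolution

out = TinyMSTU()(torch.randn(2, 1, 128, 256))  # -> (2, 1, 128, 256)
```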

Layout

Transformer

The Transformer directory contains all sublayers and model implementations:

  • decoder.py: Individual transformer decoder block; the Decoder class stacks consecutive decoder blocks.
  • encoder.py: Individual transformer encoder block; the Encoder class stacks consecutive encoder blocks (see the sketch after this list).
  • mstransformer.py: MSTransformer implementation combining the preprocessing layer, encoder, decoder, and postprocessing layer.
  • mstu.py: MSTU implementation combining downsampling blocks, a transformer encoder, a bottleneck, and upsampling blocks.
  • position.py: Positional encodings for the transformer encoder.
  • preprocess.py: Both the preprocessing and postprocessing layers for MSTransformer.
  • sampling.py: Upsampling and downsampling blocks for MSTU.
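As a reference point for position.py and the Encoder class, here is a sketch of the fixed sinusoidal positional encoding from Vaswani et al. (2017) applied before a stack of encoder blocks. The module name and hyperparameters are illustrative, not the repository's exact code.

```python
import math
import torch
import torch.nn as nn

class SinusoidalPositionalEncoding(nn.Module):
    """Fixed sinusoidal encodings (Vaswani et al., 2017); a sketch of what a
    module like position.py might provide, not the repository's exact code."""

    def __init__(self, d_model: int, max_len: int = 5000):
        super().__init__()
        position = torch.arange(max_len).unsqueeze(1)                   # (max_len, 1)
        div_term = torch.exp(torch.arange(0, d_model, 2) * (-math.log(10000.0) / d_model))
        pe = torch.zeros(max_len, d_model)
        pe[:, 0::2] = torch.sin(position * div_term)                    # even dimensions
        pe[:, 1::2] = torch.cos(position * div_term)                    # odd dimensions
        self.register_buffer("pe", pe)                                  # fixed, not trained

    def forward(self, x):                      # x: (batch, seq_len, d_model)
        return x + self.pe[: x.size(1)]

# Stacking consecutive encoder blocks, as the Encoder class does:
layer = nn.TransformerEncoderLayer(d_model=128, nhead=8, batch_first=True)
encoder = nn.TransformerEncoder(layer, num_layers=6)
x = SinusoidalPositionalEncoding(128)(torch.randn(4, 100, 128))
out = encoder(x)                               # -> (4, 100, 128)
```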

The Utils directory contains additional utilities for training, preprocessing, and analysis:

  • analysis.py: SDR (signal-to-distortion ratio) evaluation and parameter counting for a model.
  • dataset.py: A torch Dataset wrapper class for the MUSDB18 dataset.
  • fourier.py: Short-Time Fourier Transform (STFT) and inverse STFT (see the sketch after this list).
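For context on fourier.py, here is a minimal sketch of the STFT/ISTFT round trip using PyTorch's built-in torch.stft and torch.istft. The n_fft and hop_length values are illustrative assumptions, not the repository's settings.

```python
import torch

# Illustrative STFT settings; the repository's actual values may differ.
n_fft, hop_length = 2048, 512
window = torch.hann_window(n_fft)

waveform = torch.randn(2, 44100)  # (channels, samples), ~1 s of stereo audio

# Forward STFT: complex spectrogram of shape (channels, n_fft // 2 + 1, frames).
spec = torch.stft(waveform, n_fft, hop_length=hop_length,
                  window=window, return_complex=True)

# A separation model would operate on the spectrogram here.

# Inverse STFT back to the time domain; `length` trims padding to the input size.
recovered = torch.istft(spec, n_fft, hop_length=hop_length,
                        window=window, length=waveform.size(1))

assert torch.allclose(waveform, recovered, atol=1e-4)  # near-perfect round trip
```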

Diagrams

We provide diagrams of both the MSTransformer and MSTU architectures for clarity.

MSTransformer

(Diagram of the MSTransformer architecture.)

MSTU

(Diagram of the MSTU architecture.)
