Code for the paper: High-dimensional Asymptotics of Denoising Autoencoders (link to paper)
(Figs. 1 - 3, solid lines)
- Theory.ipynb provides a Jupyter notebook implementation of equations (13) for $K=2, p=1$, returning a sharp theoretical characterization of the denoising test mean-squared error and associated summary statistics. The statistics of the data distribution can be specified in the variables $\mu,\Sigma p,\Sigma m$.
(Figs. 1 - 3, dots)
- simulations.py contains a PyTorch implementation of the related numerical experiments, for synthetic Gaussian mixture data.
- simulations_MNIST.py contains a similar implementation for the MNIST dataset, see Fig. 2. To train on the FashionMNIST dataset, simply change the data loading line to:
mnist_trainset = datasets.FashionMNIST(root='data', train=True, download=True, transform=None)
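As a rough illustration of the kind of synthetic data the simulations refer to, the sketch below draws points from a two-cluster Gaussian mixture and corrupts them with additive Gaussian noise. It is not taken from simulations.py: the function names, the symmetric means `+mu`/`-mu`, the isotropic covariance, and all numerical values are assumptions made for the example, and it uses only the Python standard library rather than PyTorch.

```python
import random


def sample_mixture(n, mu, sigma, rng):
    """Draw n points from an equal-weight mixture of N(+mu, sigma^2 I) and N(-mu, sigma^2 I).

    Hypothetical helper: the symmetric two-cluster, isotropic setup is an assumption.
    """
    samples, labels = [], []
    for _ in range(n):
        sign = rng.choice((-1.0, 1.0))  # pick one of the two clusters uniformly
        samples.append([sign * m + rng.gauss(0.0, sigma) for m in mu])
        labels.append(sign)
    return samples, labels


def corrupt(samples, noise_std, rng):
    """Add independent Gaussian noise of standard deviation noise_std to every coordinate."""
    return [[x + rng.gauss(0.0, noise_std) for x in row] for row in samples]


def mse(a, b):
    """Mean-squared error per coordinate between two equally shaped lists of vectors."""
    total = sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb))
    return total / (len(a) * len(a[0]))


rng = random.Random(0)             # fixed seed for reproducibility
d = 20                             # illustrative dimension
mu = [1.0 / d ** 0.5] * d          # hypothetical cluster mean with unit norm
clean, labels = sample_mixture(n=500, mu=mu, sigma=0.5, rng=rng)
noisy = corrupt(clean, noise_std=0.7, rng=rng)

# MSE of the trivial "identity" denoiser that returns its noisy input unchanged.
baseline = mse(noisy, clean)
```

The value `baseline` is an empirical estimate of the noise variance `noise_std**2`, which gives a simple reference level that a trained denoiser would be expected to improve on.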
Versions: These notebooks employ Python 3.12 and PyTorch 2.5. The theory code uses the quadpy package for multidimensional numerical integration. This integration scheme is not adaptive, so convergence issues may appear, e.g. for large cluster variance.
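The following sketch illustrates why a non-adaptive scheme can struggle when the variance grows. It does not use quadpy or reproduce the notebook's integrals: it is a one-dimensional stand-in, using a trapezoid rule on a fixed grid to approximate $E[x^2]$ for $x \sim N(0, \sigma^2)$. The function name, grid width, and node count are assumptions chosen for the example.

```python
import math


def gaussian_second_moment(sigma, half_width=8.0, n_nodes=401):
    """Approximate E[x^2] for x ~ N(0, sigma^2) with a trapezoid rule on a fixed grid.

    The grid [-half_width, half_width] does not adapt to sigma (illustrative assumption).
    """
    h = 2.0 * half_width / (n_nodes - 1)
    total = 0.0
    for i in range(n_nodes):
        x = -half_width + i * h
        density = math.exp(-0.5 * (x / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))
        weight = 0.5 if i in (0, n_nodes - 1) else 1.0  # trapezoid end-point weights
        total += weight * x * x * density
    return h * total


# The exact value of E[x^2] is sigma^2 in both cases.
rel_err_small = abs(gaussian_second_moment(1.0) - 1.0) / 1.0
rel_err_large = abs(gaussian_second_moment(5.0) - 25.0) / 25.0
```

With the same fixed nodes, the relative error is negligible for `sigma = 1.0` but substantial for `sigma = 5.0`, because a large share of the probability mass then lies outside the grid. If similar symptoms appear in the theory notebook, increasing the resolution or range of the integration scheme is a reasonable first thing to try.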
