Generative Models

Autoencoders (basic research domain independent)

Tutorial: Deriving the Standard Variational Autoencoder (VAE) Loss Function (Jul 2019)
Tutorial for the loss function of VAEs

beta-VAE: Learning Basic Visual Concepts with a Constrained Variational Framework (Apr 2017)
Variational autoencoder with an adjustable hyperparameter beta that balances latent channel capacity and independence constraints against reconstruction accuracy. It helps to disentangle the latent space.
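
As a rough illustration of how beta enters the objective, here is a minimal sketch of a beta-weighted VAE loss for a single sample. It assumes a diagonal Gaussian posterior, a standard normal prior, and a squared-error reconstruction term; the function names and the default `beta=4.0` are illustrative choices, not taken from the paper.

```python
import math

def gaussian_kl(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ), summed over latent dimensions."""
    return 0.5 * sum(m * m + math.exp(lv) - 1.0 - lv for m, lv in zip(mu, log_var))

def beta_vae_loss(x, x_recon, mu, log_var, beta=4.0):
    """Squared-error reconstruction (an assumption) plus a beta-weighted KL term."""
    recon = sum((a - b) ** 2 for a, b in zip(x, x_recon))
    return recon + beta * gaussian_kl(mu, log_var)
```

With `beta=1.0` this reduces to the usual VAE objective under the same assumptions; larger values put more pressure on the latent code to match the prior.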

Adversarial Autoencoders [code] (Nov 2015)
Similar to variational autoencoders, but the distribution of the latent space is enforced with a discriminator, as in GANs.

GANs Generative Adversarial Networks (basic research domain independent)

MSG-GAN: Multi-Scale Gradients for Generative Adversarial Networks [code] (Jun 2020)
Flow of gradients from the discriminator to the generator at multiple scales.

Detecting GAN Generated Errors (Dec 2019)
New network which discriminates each element (pixel) as real or fake.

Improved Precision and Recall Metric for Assessing Generative Models (Apr 2019)
Precision and recall metrics for generative models, estimated from manifolds built on feature vectors (from an already trained network such as Inception) of samples from the generated and the real dataset distributions.
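
The manifold idea can be sketched with k-nearest-neighbour balls: each reference point gets a ball whose radius is the distance to its k-th neighbour, and a query counts as covered if it falls inside any ball. The sketch below works on plain tuples rather than real network features; `coverage`, `kth_nn_radii`, and the default `k=3` are hypothetical names and settings chosen for illustration.

```python
import math

def kth_nn_radii(points, k):
    """Distance from each point to its k-th nearest neighbour within the same set."""
    radii = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        radii.append(dists[k - 1])
    return radii

def coverage(queries, reference, k=3):
    """Fraction of queries that fall inside at least one k-NN ball of the reference set."""
    radii = kth_nn_radii(reference, k)
    hits = sum(
        any(math.dist(q, r) <= rad for r, rad in zip(reference, radii))
        for q in queries
    )
    return hits / len(queries)

def precision_recall(real, generated, k=3):
    """Precision: generated samples on the real manifold. Recall: the reverse."""
    return coverage(generated, real, k), coverage(real, generated, k)
```

Both sets need more than `k` points for the radii to be defined.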

Towards a Deeper Understanding of Adversarial Losses (Jan 2019)
Comparison of different GAN loss functions and gradient penalties for regularization.

MetaGAN: An Adversarial Approach to Few-Shot Learning (Dec 2018)
New approach for few-shot learners: it uses GANs to generate images resembling the sample domain, which the classifier must label as fake. These generated images help the classifier to improve.

Robustness via curvature regularization, and vice versa (Nov 2018)
New regularizer that directly minimizes curvature of the loss surface, and leads to adversarial robustness that is on par with adversarial training.

Self-Attention Generative Adversarial Networks (May 2018)
Adds self-attention to the GAN in order to create more realistic images. Attention helps to keep spatial contextual information and avoid things like dogs with 2 heads or 8 legs.

Primal-Dual Wasserstein GAN (May 2018)
WGAN that uses the primal formulation of the optimal transport problem in the generator and the dual formulation in the discriminator.

Evolutionary Generative Adversarial Networks (Mar 2018)
Evolutionary algorithm applied to GANs. The offspring are generated by mutations, where each mutation is a different way to train the generator.

Training Generative Adversarial Networks via Primal-Dual Subgradient Methods: A Lagrangian Perspective on GAN (Feb 2018)
They relate the minimax game of GANs to finding the saddle points of the Lagrangian function for a convex optimization problem.

Spectral Normalization for Generative Adversarial Networks (Feb 2018)
New technique for weight normalization that helps to stabilize the training of the discriminator.
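
A minimal sketch of the underlying idea: estimate the largest singular value of a weight matrix by power iteration and divide the weights by it. It uses nested lists instead of tensors and a fixed starting vector for reproducibility; both are simplifications assumed here, and the helper names are illustrative.

```python
import math

def matvec(W, v):
    """Compute W @ v for a matrix stored as a list of rows."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def matvec_t(W, u):
    """Compute W.T @ u."""
    return [sum(W[i][j] * u[i] for i in range(len(W))) for j in range(len(W[0]))]

def l2_normalize(v, eps=1e-12):
    norm = math.sqrt(sum(x * x for x in v))
    return [x / (norm + eps) for x in v]

def spectral_norm(W, n_iters=50):
    """Estimate the largest singular value of W by power iteration."""
    u = l2_normalize([1.0] * len(W))  # fixed start vector (assumption)
    for _ in range(n_iters):
        v = l2_normalize(matvec_t(W, u))
        u = l2_normalize(matvec(W, v))
    return sum(a * b for a, b in zip(u, matvec(W, v)))

def spectrally_normalize(W, n_iters=50):
    """Rescale W so that its largest singular value is approximately 1."""
    sigma = spectral_norm(W, n_iters)
    return [[w / sigma for w in row] for row in W]
```

In practice deep learning frameworks ship their own implementations; this version is only meant to show the computation.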

MINE: Mutual Information Neural Estimation (Jan 2018)
Adds a new loss term based on the KL divergence between the latent variables of the generator and the output of a statistics network whose inputs are samples from the generated distribution (they use the discriminator network for this).

Wasserstein GAN (Jan 2017)
Improved stability of learning in GANs by using the Wasserstein distance.
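
For orientation, the losses can be sketched from critic scores alone: the critic widens the gap between its mean score on real and generated samples, and the generator raises the critic's score on its samples. The functions below operate on plain lists of scores; the names are illustrative, and the weight clipping with `c=0.01` follows the original WGAN recipe rather than later gradient-penalty variants.

```python
def mean(values):
    return sum(values) / len(values)

def critic_loss(real_scores, fake_scores):
    """Negated Wasserstein estimate; the critic minimizes this value."""
    return mean(fake_scores) - mean(real_scores)

def generator_loss(fake_scores):
    """The generator minimizes the negated critic score on generated samples."""
    return -mean(fake_scores)

def clip_weights(weights, c=0.01):
    """Clamp critic weights to [-c, c] to keep the critic roughly Lipschitz."""
    return [max(-c, min(c, w)) for w in weights]
```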

InfoGAN: Interpretable Representation Learning by Information Maximizing Generative Adversarial Nets (Jun 2016)
Generative adversarial network that also maximizes the mutual information between a small subset of the latent variables and the observation.

Image

GANSpace: Discovering Interpretable GAN Controls [code] (Apr 2020)
Identifies PCA components in latent spaces to modify features in the generated images.

Adversarial Latent Autoencoders [code] (Apr 2020)
Adversarial Latent Autoencoder (ALAE) which mixes an autoencoder and GAN to generate images.

MSG-GAN: Multi-Scale Gradients for Generative Adversarial Networks [code] (Nov 2019)
Skip connections between the generator and discriminator (as in the U-net) in order to transfer the gradients to all layers in the generator. It can generate big images, as the progressive GANs do.

SinGAN: Learning a Generative Model from a Single Natural Image (Sep 2019)
Creates new images with a cascade of GANs, from lower resolution to higher. The image is improved in each GAN. Can be used for paint-to-image, editing, super-resolution, ...

HoloGAN: Unsupervised learning of 3D representations from natural images [code] (Apr 2019)
3D generation based on unlabeled 2D images with explicit control over the pose of the generated objects. It includes rotation and projection modules in the generator to achieve the results.

SC-FEGAN: Face Editing Generative Adversarial Network with User’s Sketch and Color [code] (Feb 2019)
Uses image restoration based on sketches to change details of faces.

InstaGAN: Instance-aware Image-to-Image Translation [code] (Dec 2018)
Image translation that is aware of the shape of the elements to transform, using segmentation masks and a loss that preserves the background.

A Style-Based Generator Architecture for Generative Adversarial Networks [code] (Dec 2018)
Face generation based on a source image to have the style of a target image.

Large Scale GAN Training for High Fidelity Natural Image Synthesis (Sep 2018)
GAN to create images trained over ImageNet and JFT-300M.

Glow: Generative Flow with Invertible 1×1 Convolutions [code] (Jul 2018)
Flow-based generative model of realistic images (the decoder is the inverse of the encoder, with special ways to calculate the inverse of the matrix) which uses invertible 1x1 convolutions and affine coupling layers.
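
To show why the decoder can be the exact inverse of the encoder, here is a toy affine coupling layer on a flat list of numbers. `toy_scale_and_shift` is a hypothetical stand-in for the neural network that predicts the log-scale and shift, and the exponential scaling is one common parameterization assumed here. The input length is assumed to be even.

```python
import math

def toy_scale_and_shift(x1):
    """Hypothetical stand-in for the network predicting log-scale and shift from x1."""
    log_s = [math.tanh(v) for v in x1]
    t = [0.5 * v for v in x1]
    return log_s, t

def coupling_forward(x):
    """Keep the first half unchanged; scale and shift the second half based on it."""
    half = len(x) // 2
    x1, x2 = x[:half], x[half:]
    log_s, t = toy_scale_and_shift(x1)
    y2 = [b * math.exp(ls) + sh for b, ls, sh in zip(x2, log_s, t)]
    log_det = sum(log_s)  # log-determinant of the Jacobian
    return x1 + y2, log_det

def coupling_inverse(y):
    """Undo coupling_forward exactly; the first half is available unchanged."""
    half = len(y) // 2
    y1, y2 = y[:half], y[half:]
    log_s, t = toy_scale_and_shift(y1)
    x2 = [(b - sh) * math.exp(-ls) for b, ls, sh in zip(y2, log_s, t)]
    return y1 + x2
```

Because the first half passes through untouched, the inverse can recompute the same scale and shift without ever inverting the network itself.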

Progressive Growing of GANs for Improved Quality, Stability, and Variation [code] (Apr 2018)
The key idea is to grow both the generator and discriminator progressively: starting from a low resolution, we add new layers that model increasingly fine details as training progresses. This both speeds the training up and greatly stabilizes it. Uses the CelebA dataset.
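
One way to picture the growing step is a fade-in: while a new higher-resolution layer is being introduced, its output is blended with an upsampled copy of the previous resolution, with the blend weight moving from 0 to 1. The sketch below uses 2D lists as single-channel images and nearest-neighbour upsampling; the function names and the blending schedule are illustrative assumptions.

```python
def upsample_nearest(img):
    """Double the resolution of a 2D list image by repeating each pixel 2x2."""
    out = []
    for row in img:
        wide = [p for p in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out

def fade_in(low_res_img, high_res_img, alpha):
    """Blend the upsampled old output with the new layer's output (alpha in [0, 1])."""
    up = upsample_nearest(low_res_img)
    return [
        [(1.0 - alpha) * u + alpha * h for u, h in zip(up_row, hi_row)]
        for up_row, hi_row in zip(up, high_res_img)
    ]
```

At `alpha=0.0` the result is just the upsampled low-resolution image, and at `alpha=1.0` the new layer has fully taken over.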

Attention-GAN for Object Transfiguration in Wild Images (Mar 2018)
Translation (or style transfer) of images using attention to focus on the part of the image that needs to change.

StarGAN: Unified Generative Adversarial Networks for Multi-Domain Image-to-Image Translation [code] (Nov 2017)
Image-to-image translation with variational inference to capture attributes. Used CelebA for facial attribute translation.

Text

Fake Sentence Detection as a Training Task for Sentence Encoding (Aug 2018)
Using fake sentences (word shuffling, word dropping) to improve the encoding of sentences.

Audio

High Fidelity Speech Synthesis with Adversarial Networks (Sep 2019)
GAN-TTS: text-to-speech model created with GANs. New quantitative metric based on Frechet Inception Distance. Slightly worse than WaveNet.

Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis [code] (Jan 2019)
Creates audio based on text using the voice of a speaker. It only requires a few seconds of real audio to create an embedding and use it to produce a similar voice.

StarGAN-VC: Non-parallel many-to-many voice conversion with star generative adversarial networks [code] (Jun 2018)
StarGAN for voices. It requires only several minutes of speech, is fast, and learns many-to-many mappings.

Video

Efficient Video Generation on Complex Datasets (Jul 2019)
DVD-GAN: BigGAN for videos. It uses 2 discriminators: a Spatial Discriminator (for frames) and a Temporal Discriminator on a downscaled version of the video (for movement). It also uses a special attention mechanism for efficiency.

Video-to-Video Synthesis [code] (Dec 2018)
Video-to-video synthesis with GANs with a spatio-temporal adversarial objective to maintain the spatial coherence of the elements in the video.

Towards Accurate Generative Models of Video: A New Metric & Challenges (Dec 2018)
A new metric for evaluating generated video, the Frechet Video Distance (based on the Frechet Inception Distance), and a dataset for video generation using StarCraft.
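
Frechet-style metrics compare Gaussians fitted to features of real and generated samples. The real metrics work on multivariate features from a pretrained network and need a matrix square root; the sketch below is the one-dimensional special case only, where the squared distance reduces to the squared difference of means plus the squared difference of standard deviations. The function name and the use of population statistics are assumptions made for this illustration.

```python
import statistics

def frechet_distance_1d(real_feats, gen_feats):
    """Squared Frechet distance between 1D Gaussians fitted to two feature lists."""
    m1, m2 = statistics.fmean(real_feats), statistics.fmean(gen_feats)
    s1, s2 = statistics.pstdev(real_feats), statistics.pstdev(gen_feats)
    return (m1 - m2) ** 2 + (s1 - s2) ** 2
```

Identical feature lists give a distance of zero, and the value grows as the means or spreads of the two sets drift apart.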

Photo Wake-Up: 3D Character Animation from a Single Photo (Dec 2018)
Method and application for animating a human subject from a single photo. E.g., the character can walk out, run, sit, or jump in 3D.

3D

3D Point-Capsule Networks (Dec 2018)
An autoencoder for 3D objects using 3D capsule networks.

Graphs

Neural Turtle Graphics for Modeling City Road Layouts (Oct 2019)
A GAN to generate graphs for road modeling with a recurrent encoder and decoder.

How to add a paper / dataset:

Check in which category the paper fits
Check in which subcategory the paper fits (create a new one if needed)
Add the title, link, the month and year it was published, a link to the code if it exists, and the contribution of the paper. Papers should be sorted with the most recent first in each category.

Examples:

Title of the paper [code] (Jun 2018)
A couple of lines describing the main contribution of the paper. Do not copy the abstract or write more than 2 lines in order to keep the wiki tidy.

Title of the paper (Jan 2018)
A couple of lines describing the main contribution of the paper. Do not copy the abstract or write more than 2 lines in order to keep the wiki tidy.
