gouthamvgk/adversarial_net

Adversarial image generation using Stack GAN

Overview

This repository uses Stack GAN, a modified GAN proposed in the StackGAN paper, to generate adversarial images conditioned on text descriptions. The model also applies conditioning augmentation to the text embedding, which makes the generated images more robust.

Dependencies

  • Python 3.6
  • PyTorch
  • PIL (Pillow)

Installation

All dependencies can be installed with pip install. If the Anaconda distribution is installed, conda install can be used instead.

Data

The Oxford flowers dataset is used to train the model. Every image in the dataset has 10 text descriptions, which are used for conditioning. The descriptions are converted into sentence embeddings. The pretrained sentence embeddings and text descriptions can be downloaded here. The image data can be downloaded from here.

Preprocessing and Augmentation

Each text description is converted to a pretrained sentence embedding vector. Conditioning augmentation is then applied to that vector, as detailed in the StackGAN paper mentioned above.
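
As a rough illustration of the conditioning augmentation step, the sketch below follows the formulation in the StackGAN paper: the conditioning vector is sampled as c = mu + sigma * eps with eps drawn from a standard normal, and a KL term pulls that distribution toward N(0, I). It is written in plain Python so it runs without PyTorch and is not taken from this repository's code. The function names, the 4-dimensional vectors, and the fixed mu and log_var values are assumptions made for the example; in the model, mu and log_var would be produced by a learned layer applied to the sentence embedding.

```python
import math
import random


def conditioning_augmentation(mu, log_var, rng=random):
    """Sample c = mu + sigma * eps with eps ~ N(0, I) (reparameterization)."""
    sigma = [math.exp(0.5 * lv) for lv in log_var]
    eps = [rng.gauss(0.0, 1.0) for _ in mu]
    return [m + s * e for m, s, e in zip(mu, sigma, eps)]


def kl_to_standard_normal(mu, log_var):
    """KL(N(mu, diag(sigma^2)) || N(0, I)), summed over dimensions."""
    return 0.5 * sum(m * m + math.exp(lv) - lv - 1.0
                     for m, lv in zip(mu, log_var))


# Hypothetical mean and log-variance for one text description.
mu = [0.2, -0.1, 0.0, 0.5]
log_var = [0.0, -1.0, -2.0, 0.3]

rng = random.Random(0)  # seeded only to make the example repeatable
c1 = conditioning_augmentation(mu, log_var, rng)
c2 = conditioning_augmentation(mu, log_var, rng)  # same text, different sample
kl = kl_to_standard_normal(mu, log_var)
```

Because each call draws fresh noise, the same description yields slightly different conditioning vectors, which is the source of the added variation the generator sees during training.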

Architecture

Stack GAN is a two-stage model. Each stage consists of a generator and a discriminator. The image generated by stage 1 is refined by stage 2. In the original paper's implementation, stage 1 generates a 64x64 image and stage 2 generates a 256x256 image. Due to GPU limitations, the model implemented here produces a 64x64 image in both stages.
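
To give a sense of why the larger output is more demanding, the short calculation below compares the number of values in a single RGB image at the two resolutions. It is only back-of-the-envelope arithmetic under the assumption of 3 colour channels; it does not measure the actual GPU memory used by this model, which also depends on batch size and intermediate feature maps. The helper name image_elements is made up for the example.

```python
def image_elements(side, channels=3):
    """Number of values in one square image with the given side length."""
    return channels * side * side


low = image_elements(64)    # 64x64 output used in both stages here
high = image_elements(256)  # 256x256 stage 2 output in the original paper
ratio = high // low         # how many times larger the 256x256 image is
```

Under these assumptions a 256x256 image holds 16 times as many values as a 64x64 one.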

Training

The hyperparameters for training and sampling can be changed by editing the corresponding values in configuration.py and then running python configuration.py. The two stages of Stack GAN are trained separately. To train stage 1, run python train_samp_stage1.py from a terminal. This trains the stage 1 model and saves a checkpoint. To train stage 2, run python train_samp_stage2.py. This loads the trained stage 1 model and uses it to train the stage 2 model.
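
The order of the steps above can be sketched as a small driver script. This is a convenience wrapper that is not part of the repository: the script file names come from the instructions above, while build_commands, run_all, and the runner parameter are hypothetical names introduced for the example. It assumes it is run from the repository root so the three scripts can be found.

```python
import subprocess
import sys

# Scripts named in the training instructions, in the order they should run.
STEPS = ["configuration.py", "train_samp_stage1.py", "train_samp_stage2.py"]


def build_commands(python=sys.executable):
    """Return one command per step, using the given Python interpreter."""
    return [[python, script] for script in STEPS]


def run_all(runner=subprocess.run):
    """Run each step in order; check=True stops at the first failure."""
    for cmd in build_commands():
        runner(cmd, check=True)


# Call run_all() from the repository root to execute the full sequence.
```

Stopping at the first failure matters here because stage 2 depends on the checkpoint written by stage 1.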

References

The following papers were referred:
