A framework for tabular data generation using GANs, featuring conditional generation and benchmarking tools.

ZanderNic/TabDataGAN

Conditional GAN for Tabular Data Generation


Info

This repository hosts a Tabular GAN project that I created for fun and learning. It was inspired by the CTGAN paper (Paper) and other works in the GAN domain, including CTAB-GAN (Paper). My implementation supports conditional data generation and includes extra loss functions and additional features that I experimented with.

This is not an exact reimplementation of CTGAN: the training-by-sampling procedure introduced in that paper is not implemented here. If you need that feature, use the authors' official implementation.
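For readers unfamiliar with the missing piece, here is a minimal sketch of the idea behind training-by-sampling as I understand it from the CTGAN paper: pick a categorical column at random, pick one of its categories with log-scaled frequency weights, then draw a real row that matches. The function name `sample_condition`, the toy data, and the `log(count + 1)` weighting are illustrative assumptions; none of this is code from this repository or from the official implementation.

```python
import math
import random
from collections import Counter


def sample_condition(rows, cat_cols, rng):
    """Hypothetical helper: return (column, value, matching_row)."""
    # 1. Choose one categorical column uniformly at random.
    col = rng.choice(cat_cols)
    # 2. Weight each category by log(count + 1) (an assumption made here so
    #    that no category gets zero weight); rare categories are then picked
    #    more often than their raw frequency would allow.
    counts = Counter(row[col] for row in rows)
    values = sorted(counts)
    weights = [math.log(counts[v] + 1) for v in values]
    value = rng.choices(values, weights=weights, k=1)[0]
    # 3. Draw a real training row that satisfies the chosen condition.
    matching = [row for row in rows if row[col] == value]
    return col, value, rng.choice(matching)


# Toy, imbalanced data: 95 rows of class 0 and 5 rows of class 1.
rows = [{"target": 0}] * 95 + [{"target": 1}] * 5
rng = random.Random(0)
draws = [sample_condition(rows, ["target"], rng)[1] for _ in range(2000)]
minority_share = draws.count(1) / len(draws)
```

On this toy data the minority class makes up 5% of the rows but is chosen as the condition far more often than that, which is the point of the log weighting.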

It's worth noting that GANs can be quite challenging to train effectively, especially for tabular data. This difficulty is illustrated in Figure 1 of the GReaT paper (Paper), which demonstrates how even simple datasets can be difficult to model with GANs.

Due to time constraints, I haven’t included a detailed tutorial on the various loss functions and features, but feel free to explore the code and try them out on your own.

🚀 Sample Usage

$ git clone https://github.com/ZanderNic/SATGan.git
$ cd SATGan
$ pip install .

Generate data

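  • Load the data and train a WCTGan
from sklearn.datasets import load_iris

from table_gan.Data.dataset import CTGan_data_set
from table_gan.Model.Gans.WCTGan import WCTGan

# Load the Iris dataset as a pandas DataFrame (features plus "target").
iris = load_iris(as_frame=True)
df = iris['frame']

# "target" is treated as categorical and is also the column to condition on.
data_set = CTGan_data_set(
    data=df,
    cond_cols=["target"],
    cat_cols=["target"],
)

wctgan = WCTGan()

# Train for 10 epochs; fit returns the critic and generator losses.
crit_loss, gen_loss = wctgan.fit(
    data_set,
    n_epochs=10,
)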
  • Generate new synthetic tabular data
import pandas as pd

# Request 160 synthetic rows, all conditioned on target == 1.
cond_df = pd.DataFrame([{"target": 1}] * 160)
syn_df = wctgan.gen(cond_df=cond_df)
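
The generation step above conditions every row on a single class. If you want a fixed number of rows per class instead, the condition rows can be built with a small helper. This is only a sketch: `make_condition_rows` is a hypothetical name and not part of this package, and it assumes that `gen` accepts any DataFrame containing the conditioning column, as in the example above.

```python
def make_condition_rows(column, counts):
    """Hypothetical helper: one {column: value} dict per requested row."""
    rows = []
    for value, n in counts.items():
        # Repeat the condition n times, creating a fresh dict for each row.
        rows.extend({column: value} for _ in range(n))
    return rows


# 50 rows for each of the three Iris classes.
cond_rows = make_condition_rows("target", {0: 50, 1: 50, 2: 50})
```

Wrapping the result as `pd.DataFrame(cond_rows)` gives a frame with the same shape as `cond_df` above, which can then be passed to `wctgan.gen(cond_df=...)`.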

Credits and Acknowledgments

This project is inspired by CTGAN and the methods described in their work.
