This repository hosts a Tabular GAN project that I created for fun and learning. It was inspired by the CTGAN paper (Paper) and other works in the GAN domain, including CTAB-GAN (Paper). My implementation supports conditional data generation and includes extra loss functions and additional features that I experimented with.
The implementation is not an exact copy of CTGAN: the training-by-sampling procedure introduced in the paper is not implemented here. If you need that feature, use the authors' official implementation.
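For readers curious about what is missing, here is a rough sketch of the training-by-sampling idea as described in the CTGAN paper: pick a discrete column at random, pick one of its categories with probability proportional to the log of its frequency, then draw a real row that has that category. The function name `sample_condition_and_row` and the `+ 1` inside the logarithm are assumptions made for this illustration. This is not code from this repository or from the official implementation.

```python
import numpy as np
import pandas as pd


def sample_condition_and_row(df, cat_cols, rng):
    """Hypothetical helper: pick one (column, category) condition and a matching real row."""
    # 1) Choose one discrete column uniformly at random.
    col = cat_cols[rng.integers(len(cat_cols))]
    # 2) Weight each category by log(count + 1) so rare categories are seen more often.
    #    The "+ 1" keeps the weight of a singleton category above zero (assumption of this sketch).
    counts = df[col].value_counts()
    weights = np.log(counts.to_numpy(dtype=float) + 1.0)
    probs = weights / weights.sum()
    # 3) Sample a category from that distribution.
    value = counts.index[rng.choice(len(counts), p=probs)]
    # 4) Sample a real row that carries the chosen category.
    matching = df.index[df[col] == value]
    row_idx = matching[rng.integers(len(matching))]
    return col, value, row_idx


rng = np.random.default_rng(0)
toy_df = pd.DataFrame({"target": [0] * 90 + [1] * 10, "x": np.arange(100.0)})
col, value, row_idx = sample_condition_and_row(toy_df, ["target"], rng)
```

In the toy frame the minority class makes up 10% of the rows, but the log weighting selects it far more often than that, which is the point of the procedure.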
It's worth noting that GANs can be quite challenging to train effectively, especially for tabular data. This difficulty is illustrated in Figure 1 of the GReaT paper (Paper), which demonstrates how even simple datasets can be difficult to model with GANs.
Due to time constraints, I haven’t included a detailed tutorial on the various loss functions and features, but feel free to explore the code and try them out on your own.
$ git clone https://github.com/ZanderNic/SATGan.git
$ cd SATGan
$ pip install .

- Load and train a CTGAN
from table_gan.Model.Gans.WCTGan import WCTGan
from table_gan.Data.dataset import CTGan_data_set
from sklearn.datasets import load_iris
iris = load_iris(as_frame=True)
df = iris['frame']
data_set = CTGan_data_set(
    data=df,
    cond_cols=["target"],
    cat_cols=["target"],
)
wctgan = WCTGan()
crit_loss, gen_loss = wctgan.fit(
    data_set,
    n_epochs=10,
)

- Generate new synthetic tabular data
import pandas as pd

# Request 160 synthetic rows, all conditioned on target == 1
cond_df = pd.DataFrame([{"target": 1}] * 160)
syn_df = wctgan.gen(cond_df=cond_df)

This project is inspired by CTGAN and the methods described in their work.
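The generation example in this README conditions every row on a single class. If you would rather request a class mix that follows the real data, the condition frame can be built by resampling the condition column. The sketch below uses only pandas. The helper name `make_cond_df` is made up for this illustration, and it assumes that `gen` accepts a frame with mixed values in the condition column, which is not confirmed here.

```python
import pandas as pd


def make_cond_df(real_df, cond_col, n_rows, seed=0):
    """Hypothetical helper: draw n_rows condition values following the class frequencies in real_df."""
    sampled = real_df[cond_col].sample(n=n_rows, replace=True, random_state=seed)
    # to_frame() keeps the column name, so the result has a single column called cond_col.
    return sampled.reset_index(drop=True).to_frame()


# Small stand-in for the real training frame.
real_df = pd.DataFrame({"target": [0, 0, 1, 2, 2, 2]})
cond_df = make_cond_df(real_df, "target", n_rows=160)
```

The resulting `cond_df` has the same shape as the hand-built one and could be passed to `wctgan.gen(cond_df=cond_df)` in the same way.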