@Siddharth7269
Collaborator

Description:
This PR brings in my custom Tabular Denoising Diffusion Probabilistic Model (TabDDPM) implementation, organized in Models/TabDDPMSiddharth. It includes:

Core model code

tabddpm.py – the main diffusion class defining the forward (noising) and reverse (denoising) processes

modules.py – neural network building blocks (MLP layers, embedding layers, etc.)

utils.py – data loaders, preprocessing, and helper functions
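
For reviewers less familiar with diffusion models, the forward process mentioned for tabddpm.py can be sketched as below. This is a minimal illustration under stated assumptions, not the code in this PR: the linear beta schedule, its start/end values, and the names linear_betas, alpha_bars, and q_sample are all hypothetical. It covers only the Gaussian case for normalized numerical features; the original TabDDPM paper treats categorical features with a separate multinomial diffusion, which is not shown here.

```python
import math
import random


def linear_betas(timesteps, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced noise variances beta_1..beta_T (assumed schedule)."""
    if timesteps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (timesteps - 1)
    return [beta_start + i * step for i in range(timesteps)]


def alpha_bars(betas):
    """Running product of (1 - beta_t), i.e. abar_t for each timestep."""
    out, prod = [], 1.0
    for b in betas:
        prod *= 1.0 - b
        out.append(prod)
    return out


def q_sample(x0, t, abar, rng=random):
    """Draw x_t ~ N(sqrt(abar_t) * x0, (1 - abar_t) * I).

    x0 is a list of already-normalized numerical features; t is 0-indexed.
    """
    scale = math.sqrt(abar[t])
    sigma = math.sqrt(1.0 - abar[t])
    return [scale * v + sigma * rng.gauss(0.0, 1.0) for v in x0]


# Hypothetical usage: noise one row all the way to the last timestep.
betas = linear_betas(1000)
abar = alpha_bars(betas)
row = [0.5, -1.2, 0.3]
noisy = q_sample(row, t=999, abar=abar, rng=random.Random(0))
```

At the last timestep abar is close to zero, so the noised row is almost pure Gaussian noise; the reverse process is the part the network has to learn.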

Training & evaluation

train_tabddpm.py – end-to-end training script with CLI arguments

evaluate_tabddpm.py – script to compute statistical and ML-based metrics on synthetic vs. real data
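
The description does not say which metrics evaluate_tabddpm.py computes. As one example of what a statistical metric on synthetic vs. real data can look like, the sketch below computes the total variation distance between the category frequencies of a single column. The function name and the sample values are assumptions made for illustration and are not taken from this PR.

```python
from collections import Counter


def total_variation_distance(real_values, synth_values):
    """Half the L1 distance between two empirical category distributions.

    Returns 0.0 for identical frequencies and 1.0 for disjoint supports.
    """
    n_real, n_synth = len(real_values), len(synth_values)
    if n_real == 0 or n_synth == 0:
        raise ValueError("both columns must be non-empty")
    real_counts = Counter(real_values)
    synth_counts = Counter(synth_values)
    categories = set(real_counts) | set(synth_counts)
    # Counter returns 0 for categories missing from one side.
    return 0.5 * sum(
        abs(real_counts[c] / n_real - synth_counts[c] / n_synth)
        for c in categories
    )


# Hypothetical column values, for illustration only.
real = ["Private", "Private", "State-gov", "Self-emp"]
synth = ["Private", "State-gov", "State-gov", "Self-emp"]
tvd = total_variation_distance(real, synth)
```

Averaging this value over all categorical columns gives a single fidelity score where lower is better.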

Example notebooks

demo_tabddpm_adult.ipynb – demo on the UCI Adult dataset

demo_tabddpm_car.ipynb – demo on the Car Evaluation dataset

Datasets & outputs

data/Adult.csv, data/car.csv, etc. – raw UCI CSVs

synthetic/ – generated synthetic datasets (e.g. Adult_synth.csv)

checkpoints/ – saved model weights for reproducibility

Configuration

config.yaml – default hyperparameters (learning rate, batch size, timesteps)

How to test / run:

Install the dependencies listed in requirements.txt (e.g. torch, pandas, numpy).

Train on a small dataset:

python Models/TabDDPMSiddharth/train_tabddpm.py \
  --config Models/TabDDPMSiddharth/config.yaml \
  --data Models/TabDDPMSiddharth/data/Adult.csv \
  --epochs 50
Generate synthetic samples:

python Models/TabDDPMSiddharth/evaluate_tabddpm.py \
  --checkpoint Models/TabDDPMSiddharth/checkpoints/last.pt \
  --output Models/TabDDPMSiddharth/synthetic/Adult_synth.csv
Review notebooks by opening demo_tabddpm_*.ipynb in Jupyter Lab.
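
The training command above passes --config, --data, and --epochs. A CLI accepting those flags could be declared with argparse roughly as follows. This is a sketch and not the parser in train_tabddpm.py: the help strings, the required settings, and the default of 50 epochs are assumptions.

```python
import argparse


def build_parser():
    """Argument parser mirroring the flags used in the training command."""
    parser = argparse.ArgumentParser(
        description="Train TabDDPM (illustrative CLI sketch)"
    )
    parser.add_argument("--config", required=True,
                        help="path to a YAML file with default hyperparameters")
    parser.add_argument("--data", required=True,
                        help="path to the training CSV")
    # The default value here is an assumption, not taken from the PR.
    parser.add_argument("--epochs", type=int, default=50,
                        help="number of training epochs")
    return parser


# Parse the same arguments as the command shown in the description.
args = build_parser().parse_args([
    "--config", "Models/TabDDPMSiddharth/config.yaml",
    "--data", "Models/TabDDPMSiddharth/data/Adult.csv",
    "--epochs", "50",
])
```

Declaring --epochs with type=int lets argparse reject non-numeric values before any training starts.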

Notes & Next Steps:

The raw CSVs are ~100 MB in total; consider moving to Git LFS or external storage.

We may refactor shared utilities (e.g. data loading) into a common Models/common directory.

Future enhancements: support for conditional sampling and integration with the main benchmarking suite.

Please let me know if you’d like any restructuring or additional documentation!

pooyafo and others added 30 commits April 6, 2025 10:39
Signed-off-by: pooyafo <80544904+pooyafo@users.noreply.github.com>
Signed-off-by: pooyafo <80544904+pooyafo@users.noreply.github.com>
Signed-off-by: pooyafo <80544904+pooyafo@users.noreply.github.com>
Signed-off-by: kalshana <168319665+kalshana@users.noreply.github.com>
Add Kamala Ctgan and Kamala Tabddpm modules
Upload updated GANBLR model to Models folder
Added Muzamils MedGAN model to Models folder
pooyafo and others added 26 commits May 16, 2025 16:19
…load

Added sachin-ganblrpp module in models folder
Signed-off-by: pasinduambegoda1 <142980744+pasinduambegoda1@users.noreply.github.com>
Add AshbinMedgan(UI) folder for MedGAN UI
…ions

Add Khushi's TableGan and CTAB-GAN-Plus models
Final code including data pipeline
Signed-off-by: ashbinbenoy91 <s223968166@deakin.edu.au>
Signed-off-by: ashbinbenoy91 <s223968166@deakin.edu.au>
Signed-off-by: ashbinbenoy91 <s223968166@deakin.edu.au>
Signed-off-by: ashbinbenoy91 <s223968166@deakin.edu.au>
CRGAN Implementations and Datasets for Adult, Credit and Nursery
@Siddharth7269 Siddharth7269 changed the title Add Siddharth’s TabDDPM implementation under Models/TabDDPMSiddharth-main Added TabDDPM model implementation under Models/TabDDPMSiddharth-main May 24, 2025
Signed-off-by: viloshini89 <127585383+viloshini89@users.noreply.github.com>
Signed-off-by: viloshini89 <127585383+viloshini89@users.noreply.github.com>
Signed-off-by: viloshini89 <127585383+viloshini89@users.noreply.github.com>