Create Siddharth_TabDDPM #88
Open
Added the completed TabDDPM model
Implemented TabDDPM training pipeline for synthetic tabular data generation
Built an end-to-end diffusion pipeline following Kotelnikov et al.’s architecture (arXiv:2209.15421v2), including scheduler, sampler, and noise-schedule modules.
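The scheduler and sampler modules themselves are not shown in this PR description; as a minimal sketch of the DDPM building blocks they implement, here is a linear beta schedule and the closed-form forward noising step q(x_t | x_0) for the numerical columns (function names are hypothetical, not taken from the repo):

```python
import numpy as np

def linear_beta_schedule(timesteps: int, beta_start: float = 1e-4,
                         beta_end: float = 0.02) -> np.ndarray:
    """Linear variance schedule beta_1..beta_T, standard in the DDPM family."""
    return np.linspace(beta_start, beta_end, timesteps)

def q_sample(x0: np.ndarray, t: int, alphas_cumprod: np.ndarray,
             rng: np.random.Generator) -> np.ndarray:
    """Forward process: noise x_0 to step t in one shot using alpha-bar_t."""
    a_bar = alphas_cumprod[t]
    noise = rng.standard_normal(x0.shape)
    return np.sqrt(a_bar) * x0 + np.sqrt(1.0 - a_bar) * noise

betas = linear_beta_schedule(1000)
alphas_cumprod = np.cumprod(1.0 - betas)   # alpha-bar_t, decreasing toward 0

rng = np.random.default_rng(0)
x0 = rng.standard_normal((4, 8))           # small batch of numerical features
x_t = q_sample(x0, t=999, alphas_cumprod=alphas_cumprod, rng=rng)
```

At t = T the signal coefficient is nearly zero, so x_T is approximately pure Gaussian noise, which is what the reverse sampler starts from.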
Completed experiments using two evaluation protocols
50/50 real–synthetic split with 2-fold cross-validation, repeated 3 times per dataset
70/30 train–test split matching the paper’s original setup
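The 2-fold-by-3-repeats protocol above maps directly onto scikit-learn's repeated CV splitter; a sketch, with a synthetic stand-in dataset and a single classifier in place of the full evaluation suite:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import RepeatedStratifiedKFold

# Hypothetical stand-in: in the pipeline this is the combined real/synthetic table.
X, y = make_classification(n_samples=400, n_features=10, random_state=0)

# 2 folds x 3 repeats -> 6 train/test evaluations per dataset.
cv = RepeatedStratifiedKFold(n_splits=2, n_repeats=3, random_state=0)
scores = []
for train_idx, test_idx in cv.split(X, y):
    clf = LogisticRegression(max_iter=1000).fit(X[train_idx], y[train_idx])
    scores.append(f1_score(y[test_idx], clf.predict(X[test_idx])))

mean_f1 = float(np.mean(scores))
```

Averaging over the six runs is what produces the per-dataset numbers that later get aggregated into the final report.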
Integrated comprehensive performance metrics
TSTR accuracy (MLP, Logistic Regression, XGBoost, Random Forest)
Jensen–Shannon Divergence (JSD)
Wasserstein Distance (WD)
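Both divergence metrics are available in SciPy; a sketch of how they can be computed for one numerical column, with random stand-ins for the real and synthetic data (the actual pipeline code may bin or weight differently):

```python
import numpy as np
from scipy.spatial.distance import jensenshannon
from scipy.stats import wasserstein_distance

rng = np.random.default_rng(0)
real = rng.normal(0.0, 1.0, 5000)     # stand-in for a real column
synth = rng.normal(0.1, 1.1, 5000)    # stand-in for its synthetic counterpart

# Wasserstein distance works directly on the empirical samples.
wd = wasserstein_distance(real, synth)

# JSD needs discrete distributions: histogram both columns on shared bins.
bins = np.histogram_bin_edges(np.concatenate([real, synth]), bins=50)
p, _ = np.histogram(real, bins=bins, density=True)
q, _ = np.histogram(synth, bins=bins, density=True)
jsd = jensenshannon(p, q) ** 2        # scipy returns the JS *distance*; square it
```

Note the squaring: `scipy.spatial.distance.jensenshannon` returns the square root of the divergence, so reporting JSD requires squaring the result.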
Developed class-injection logic for missing labels
Automatically detect underrepresented classes in synthetic outputs and inject real samples to ensure compatibility with XGBoost and other downstream models.
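The injection step can be sketched as follows, assuming pandas tables and a shared label column (function and parameter names are illustrative, not the repo's):

```python
import pandas as pd

def inject_missing_classes(real: pd.DataFrame, synth: pd.DataFrame,
                           label_col: str, n_inject: int = 5) -> pd.DataFrame:
    """If the sampler dropped a class entirely, copy a few real rows of that
    class into the synthetic table so label encoders and XGBoost see all classes."""
    missing = set(real[label_col].unique()) - set(synth[label_col].unique())
    parts = [synth]
    for cls in missing:
        donors = real[real[label_col] == cls]
        parts.append(donors.sample(min(n_inject, len(donors)), random_state=0))
    return pd.concat(parts, ignore_index=True)

real = pd.DataFrame({"x": range(6), "y": [0, 0, 1, 1, 2, 2]})
synth = pd.DataFrame({"x": range(4), "y": [0, 1, 0, 1]})  # class 2 missing
fixed = inject_missing_classes(real, synth, "y")
```

Without this guard, XGBoost raises an error when the training labels it was fit on include a class that never appears in the synthetic sample.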
Added dynamic epoch configuration
100 epochs for small / medium datasets
150 epochs for large datasets
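As a sketch, the size-based epoch budget reduces to a one-line rule; note the row-count cutoff below is an assumed threshold, since the PR does not state where "large" begins:

```python
def epochs_for(n_rows: int, large_threshold: int = 50_000) -> int:
    # 100 epochs for small/medium datasets, 150 for large ones.
    # The 50k-row cutoff is an assumption, not stated in the PR.
    return 150 if n_rows >= large_threshold else 100
```

Keeping the threshold as a parameter lets each dataset config override it without touching the training loop.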
Aligned hyperparameters and benchmarking with the paper
Matched the learning-rate schedule, batch size, and model depth exactly to those in 2209.15421v2 to validate reproducibility.
Surpassed published benchmarks
Achieved an F1-score of 0.80 on the UCI Adult dataset versus the paper’s 0.795 benchmark.
Containerized the full pipeline with Docker
Created Dockerfile and Compose scripts to encapsulate all dependencies for environment-independent execution.
Modularized dataset handling and preprocessing scripts
Encapsulated loading, cleaning, encoding and splitting logic into reusable modules for rapid onboarding of new datasets.
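A minimal sketch of what such a reusable module boils down to, assuming pandas and scikit-learn (the helper name and its signature are hypothetical):

```python
import pandas as pd
from sklearn.model_selection import train_test_split

def load_and_split(df: pd.DataFrame, label_col: str,
                   test_size: float = 0.3, seed: int = 0):
    """Clean, encode, and split one dataset: drop missing rows,
    one-hot encode categoricals, then stratified train/test split."""
    df = df.dropna().copy()
    cat_cols = df.select_dtypes(include="object").columns.drop(
        label_col, errors="ignore")
    df = pd.get_dummies(df, columns=list(cat_cols))
    X = df.drop(columns=[label_col])
    y = df[label_col]
    return train_test_split(X, y, test_size=test_size,
                            random_state=seed, stratify=y)

demo = pd.DataFrame({"age": range(10), "job": ["a", "b"] * 5,
                     "income": [0, 1] * 5})
X_tr, X_te, y_tr, y_te = load_and_split(demo, "income")
```

Because each new dataset only needs its own loading step, everything downstream (encoding, splitting) comes for free from the shared module.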
Managed experiments via GitHub
Employed feature branches, structured commits and CI-driven validation to track code, hyperparameters and results.
Automated final result aggregation
Wrote scripts to compile averaged evaluation metrics and divergence scores across repeats into a consolidated CSV report.
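The aggregation step amounts to a groupby-mean over the per-repeat results; a sketch with hypothetical per-run numbers (the real scripts read these from the experiment outputs):

```python
import pandas as pd

# Hypothetical per-run results; in the pipeline these come from each CV repeat.
runs = pd.DataFrame({
    "dataset": ["adult", "adult", "adult", "churn", "churn", "churn"],
    "f1":      [0.79, 0.80, 0.81, 0.70, 0.72, 0.71],
    "jsd":     [0.03, 0.04, 0.03, 0.06, 0.05, 0.05],
    "wd":      [0.10, 0.12, 0.11, 0.20, 0.19, 0.21],
})

# Average every metric over the repeats and write one consolidated report.
report = runs.groupby("dataset", as_index=False).mean(numeric_only=True)
report.to_csv("final_report.csv", index=False)
```

One row per dataset with averaged F1, JSD, and WD is exactly the shape the benchmark comparison tables need.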