Add full Covertype hyperparameter sweep notebook with Colab + WeightWatcher integration by charlesmartin14 · Pull Request #67 · CalculatedContent/xgboost2ww

charlesmartin14 · 2026-03-05T06:16:29Z

Provide a runnable Colab-ready notebook to sweep/tune XGBoost on the UCI Covertype dataset and export WeightWatcher diagnostics for representative models.
Make the sweep robust to different runtimes and to large stacked multiclass matrices, and add graceful fallbacks for when xgboost2ww/WeightWatcher conversion fails.

Replace and extend notebooks/XGBoost2WWCovertypeHyperparameterSweep.ipynb with a complete Colab-friendly notebook that includes a Colab badge, Drive-mount logic, optional pip install cell, and notebook-level runtime knobs (USE_SAMPLING, SAMPLE_N, SWEEP_SEEDS, T_POINTS_SWEEP, WW_MAX_N, etc.).
Add deterministic data loading with a fetch_openml fallback, stratified train/val/test split helpers (build_split) and reproducible training/evaluation functions (fit_eval_config_once, fit_eval_config) that return accuracy/log-loss/F1 plus WeightWatcher metrics.
Integrate xgboost2ww.convert and WeightWatcher analysis with try/fallback handling (falling back to multiclass='avg' and reduced eval settings), add conversion diagnostics and torch output handling, and raise the WeightWatcher matrix cap (WW_MAX_N).
Add a debug cell that trains one config, prints training/test stats, converts the model to torch, and runs deep WeightWatcher diagnostics (power-law fit, layer info, and plots); the notebook retains example outputs from a debug run.

Executed the initial Colab mounting and checkpoint setup cells successfully (Drive mounted and checkpoint paths printed).
Ran the debug training cell: the model trained with early stopping and printed best_iteration, train_acc, and test_acc, and xgboost2ww.convert returned a torch Sequential with the expected weight shapes.
Ran WeightWatcher analysis on the debug model and observed successful analysis logs (power-law fit started) and captured metrics/plots without crashing; the notebook shows these diagnostics as outputs.

Update sweep to aggregate across split seeds and log uncertainty
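Aggregating a configuration's metrics across split seeds might be sketched as below. The function name `aggregate_over_seeds`, the metric keys, and the numbers are illustrative assumptions, not code or results from the notebook; the sketch reports the mean, sample standard deviation, and standard error per metric as one plausible way to log uncertainty.

```python
import math
import statistics
from typing import Dict, List


def aggregate_over_seeds(per_seed: List[Dict[str, float]]) -> Dict[str, Dict[str, float]]:
    """Summarize each metric across split seeds as mean, sample std, and standard error."""
    if not per_seed:
        raise ValueError("need at least one per-seed result")
    summary: Dict[str, Dict[str, float]] = {}
    for metric in per_seed[0]:
        values = [run[metric] for run in per_seed]
        n = len(values)
        # The sample standard deviation is undefined for a single run; report 0.0 there.
        std = statistics.stdev(values) if n > 1 else 0.0
        summary[metric] = {
            "mean": statistics.fmean(values),
            "std": std,
            "sem": std / math.sqrt(n),
            "n": n,
        }
    return summary


# Made-up per-seed results for one configuration (one dict per split seed).
runs = [
    {"accuracy": 0.90, "log_loss": 0.30},
    {"accuracy": 0.92, "log_loss": 0.28},
    {"accuracy": 0.94, "log_loss": 0.26},
]
summary = aggregate_over_seeds(runs)
```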

0bbe11c

charlesmartin14 added the codex label Mar 5, 2026 — with ChatGPT Codex Connector
