
Add full Covertype hyperparameter sweep notebook with Colab + WeightWatcher integration #67

Open
charlesmartin14 wants to merge 1 commit into main from codex/add-plots-with-restricted-y-axis-for-accuracy

@charlesmartin14
Member

Motivation

  • Provide a runnable, Colab-ready notebook that sweeps XGBoost hyperparameters on the UCI Covertype dataset and exports WeightWatcher diagnostics for representative models.
  • Make the sweep robust to different runtimes and to large stacked multiclass matrices, and fall back gracefully when xgboost2ww/WeightWatcher conversion fails.

Description

  • Replace and extend notebooks/XGBoost2WWCovertypeHyperparameterSweep.ipynb with a complete Colab-friendly notebook that includes a Colab badge, Drive-mount logic, an optional pip install cell, and notebook-level runtime knobs (USE_SAMPLING, SAMPLE_N, SWEEP_SEEDS, T_POINTS_SWEEP, WW_MAX_N, etc.).
  • Add deterministic data loading with a fetch_openml fallback, stratified train/val/test split helpers (build_split), and reproducible training/evaluation functions (fit_eval_config_once, fit_eval_config) that return accuracy, log-loss, and F1 plus WeightWatcher metrics.
  • Integrate xgboost2ww.convert and WeightWatcher analysis with try/fallback handling (falling back to multiclass='avg' and reduced eval settings), include convert diagnostics and torch output handling, and increase the WW matrix cap (WW_MAX_N).
  • Add a debug cell that trains one config, prints training/test stats, converts the model to torch, and runs deep WeightWatcher diagnostics (power-law fit, layer info, and plots). Many example outputs from a debug run are captured in the notebook.
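
For reviewers, the try/fallback handling described above can be pictured roughly as follows. This is only a minimal sketch of the pattern, not the notebook's actual code: `run_with_fallbacks`, `fake_analyze`, the `max_n` keyword, and the `"stacked"` mode name are hypothetical and used purely for illustration. Only `multiclass='avg'` as the fallback setting comes from this PR, and the real xgboost2ww/WeightWatcher calls are replaced by a stand-in function.

```python
from typing import Any, Callable, Dict, List, Optional, Tuple


def run_with_fallbacks(
    analyze: Callable[..., Any],
    settings_chain: List[Dict[str, Any]],
) -> Tuple[Optional[Any], Optional[Dict[str, Any]], List[str]]:
    """Try each settings dict in order and return the first successful result.

    Returns (result, settings_used, errors). If every attempt fails, result
    and settings_used are None so the caller can record missing metrics and
    keep the sweep running.
    """
    errors: List[str] = []
    for settings in settings_chain:
        try:
            return analyze(**settings), settings, errors
        except Exception as exc:  # broad on purpose: one bad config must not stop the sweep
            errors.append(f"{settings}: {type(exc).__name__}: {exc}")
    return None, None, errors


# Hypothetical stand-in for the convert + analysis step; it fails on the
# large primary setting to exercise the fallback path.
def fake_analyze(multiclass: str, max_n: int) -> Dict[str, Any]:
    if multiclass == "stacked" and max_n > 1000:
        raise MemoryError("stacked matrix too large")
    return {"multiclass": multiclass, "max_n": max_n}


chain = [
    {"multiclass": "stacked", "max_n": 5000},  # hypothetical preferred settings
    {"multiclass": "avg", "max_n": 500},       # fallback: averaged classes, reduced size
]
result, used, errors = run_with_fallbacks(fake_analyze, chain)
print(result, used, errors)
```

Returning the error strings alongside the result keeps the failure reason available for the sweep's results table without interrupting the remaining configurations.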

Testing

  • Executed the initial Colab mounting and checkpoint setup cells successfully (Drive mounted and checkpoint paths printed).
  • Ran the debug training cell: the model trained with early stopping and printed best_iteration, train_acc, and test_acc, and xgboost2ww.convert returned a torch Sequential with the expected weight shapes.
  • Ran WeightWatcher analysis on the debug model: the logs showed the analysis proceeding (power-law fit started), and metrics and plots were captured without crashing. The notebook shows these diagnostics as outputs.

Codex Task
