├── experiment - all the code is in here
│   ├── LanguageDataModule.py - all the dataset loading and creation of next-token prediction labels happens here (see the labels sketch below the tree)
│   ├── LMLightningModule.py - all the model logic, including making layers recurrent, is in here
│   ├── __main__.py - entry point for python -m experiment
│   ├── RecurrentTransformerLayer.py - this module is used by LMLightningModule for recurrent Transformer layers
│   ├── SSMTransformerLayer.py - SSM-based layer to replace a Transformer layer with (see the SSM sketch below the tree)
│   └── utils
│       ├── accuracy.py - next-token accuracy function (see the accuracy sketch below the tree)
│       ├── hippo_init.py - initializes a matrix using the HiPPO framework (see the HiPPO sketch below the tree)
│       ├── add_pad_token.py - adds a padding token to the tokenizer
│       ├── args.py - command-line arguments class (so that my Python linter can infer the argument types)
│       ├── get_num_workers.py - determines the number of DataLoader workers
│       ├── get_training_args.py - all command-line arguments are parsed with this function
│       ├── print_mean_std.py - prints the mean and standard deviation across runs
│       ├── run_different_seeds.py - runs the experiment once per seed
│       ├── run.py - a distinct run of the application is initiated with this function
│       └── set_seed.py - seeds the random number generators for reproducibility
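A minimal sketch of how LanguageDataModule.py might create the next-token prediction labels, assuming a Hugging Face-style batch with input_ids and attention_mask (the function name and batch keys are illustrative; Hugging Face causal-LM models shift the labels internally):

```python
def add_next_token_labels(batch: dict) -> dict:
    """Copy input_ids to labels and mask padding so the loss ignores it."""
    labels = batch["input_ids"].clone()
    labels[batch["attention_mask"] == 0] = -100  # -100 is ignored by cross-entropy
    batch["labels"] = labels  # causal-LM models shift labels by one position internally
    return batch
```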
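For reference, a bare-bones recurrence of the kind SSMTransformerLayer.py implements. This is only a sketch assuming a single dense, non-selective SSM with a sequential scan; the class name and dimensions are illustrative, and the real layer may discretize, parallelize, or gate differently:

```python
import torch
import torch.nn as nn

class MinimalSSM(nn.Module):
    """h_t = h_{t-1} A^T + x_t B^T, y_t = h_t C^T, scanned over the sequence."""

    def __init__(self, d_model: int, d_state: int):
        super().__init__()
        self.A = nn.Parameter(0.9 * torch.eye(d_state))                         # state transition
        self.B = nn.Parameter(torch.randn(d_state, d_model) / d_model ** 0.5)   # input map
        self.C = nn.Parameter(torch.randn(d_model, d_state) / d_state ** 0.5)   # readout

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (batch, seq, d_model)
        h = x.new_zeros(x.size(0), self.A.size(0))
        outputs = []
        for t in range(x.size(1)):                 # sequential scan over time
            h = h @ self.A.T + x[:, t] @ self.B.T  # update hidden state
            outputs.append(h @ self.C.T)           # project state back to model dim
        return torch.stack(outputs, dim=1)
```

With --use_hippo, the state matrix A would presumably be initialized via hippo_init.py instead of the identity-based default above (see the next sketch).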
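hippo_init.py presumably builds something like the HiPPO-LegS state matrix from the HiPPO paper; here is a sketch of that standard construction (the function name is illustrative, and which HiPPO variant the repo uses is an assumption):

```python
import numpy as np

def hippo_legs(n: int) -> np.ndarray:
    """HiPPO-LegS matrix: A[i,j] = sqrt(2i+1)*sqrt(2j+1) if i > j, i+1 if i == j, else 0.

    Returned negated, since the SSM dynamics are dx/dt = -A x + B u.
    """
    i = np.arange(n)[:, None]
    j = np.arange(n)[None, :]
    A = np.sqrt(2 * i + 1) * np.sqrt(2 * j + 1) * (i > j) + np.diag(np.arange(1, n + 1))
    return -A
```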
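And a sketch of the next-token accuracy in utils/accuracy.py, assuming logits of shape (batch, seq, vocab) and labels aligned with the inputs, with padding marked as -100 (the signature is illustrative):

```python
import torch

def next_token_accuracy(logits: torch.Tensor, labels: torch.Tensor,
                        ignore_index: int = -100) -> torch.Tensor:
    """Fraction of positions whose argmax prediction matches the next token."""
    preds = logits[:, :-1].argmax(dim=-1)  # prediction at position t targets token t+1
    targets = labels[:, 1:]                # the actual next tokens
    mask = targets != ignore_index         # skip padding positions
    return (preds[mask] == targets[mask]).float().mean()
```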
All available Snellius job scripts are in the jobs directory.
The experiment can be invoked like this:
usage: python -m experiment [-h] [--seeds SEEDS] [--num_runs NUM_RUNS] [--model_name MODEL_NAME]
                            [--finetune_layers FINETUNE_LAYERS] [--remove_layers REMOVE_LAYERS]
                            [--make_layer_recurrent MAKE_LAYER_RECURRENT] [--use_ssm] [--use_hippo]
                            [--dataset {ultrafeedback,csqa_full,arc_full,piqa_full,siqa_full,openhermes,alpaca,gsm8k}]
                            [--seq_length SEQ_LENGTH] [--train_batch_size TRAIN_BATCH_SIZE]
                            [--eval_batch_size EVAL_BATCH_SIZE] [--no_logger]
                            [--experiment_name EXPERIMENT_NAME] [--max_epochs MAX_EPOCHS]
                            [--warmup_steps WARMUP_STEPS]
Training arguments
options:
  -h, --help            show this help message and exit
  --seeds SEEDS         Random seeds
  --num_runs NUM_RUNS   The number of runs
  --model_name MODEL_NAME
                        The model name to be used
  --finetune_layers FINETUNE_LAYERS
                        The layers to fine-tune
  --remove_layers REMOVE_LAYERS
                        The layers to remove
  --make_layer_recurrent MAKE_LAYER_RECURRENT
                        The layer to make recurrent
  --use_ssm             Whether to use an SSM as the recurrent layer
  --use_hippo           Whether to initialize the SSM using HiPPO
  --dataset {ultrafeedback,csqa_full,arc_full,piqa_full,siqa_full,openhermes,alpaca,gsm8k}
                        The dataset to use for training
  --seq_length SEQ_LENGTH
                        The maximum sequence length
  --train_batch_size TRAIN_BATCH_SIZE
                        The training batch size
  --eval_batch_size EVAL_BATCH_SIZE
                        The evaluation batch size
  --no_logger           Disable the logger
  --experiment_name EXPERIMENT_NAME
                        The name of the experiment
  --max_epochs MAX_EPOCHS
                        The maximum number of epochs
  --warmup_steps WARMUP_STEPS
                        The number of warmup steps
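For example, an illustrative run that replaces one layer with a HiPPO-initialized SSM (all flag values here are hypothetical):

```
python -m experiment --dataset alpaca --make_layer_recurrent 5 --use_ssm --use_hippo --num_runs 3
```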