## Install the Python project's dependencies

```shell
conda create -n recurrent-env python=3.11
conda activate recurrent-env
pip install -r requirements.txt
```
## Use the following folder structure

Note: the default Hugging Face folder (`~/.cache/huggingface`) is not supported natively, because huggingface.co is not accessible from the author's location, making automated downloads infeasible. For now, please download the model weights and datasets from Hugging Face using `huggingface-cli`. If a dataset includes Python files, you need to download those files manually, as `huggingface-cli` will not download them.

```
WORKSPACE/
|- PROJECT/
|- transformers/
|- datasets/
|- experiments/
```

- The `$WORKSPACE/transformers` folder is where you store the model weights. For example: `$WORKSPACE/transformers/huginn-0125/model-00003-of-00004.safetensors`.
- The `$WORKSPACE/datasets` folder is where you store the datasets, using the format `$WORKSPACE/datasets/<user-name>/<dataset-name>`. For example: `$WORKSPACE/datasets/cais/mmlu`.
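As a sanity check on the layout above, the mapping from a Hugging Face repo id to these folders can be sketched as follows. This is a minimal illustration; `model_dir` and `dataset_dir` are hypothetical helpers, not part of the project:

```python
from pathlib import Path

def model_dir(workspace: str, model_name: str) -> Path:
    """Model weights live directly under transformers/<model-name>."""
    return Path(workspace) / "transformers" / model_name

def dataset_dir(workspace: str, repo_id: str) -> Path:
    """Datasets keep the <user-name>/<dataset-name> layout from the Hub."""
    user, name = repo_id.split("/")
    return Path(workspace) / "datasets" / user / name

print(model_dir("/ws", "huginn-0125"))    # /ws/transformers/huginn-0125 (on POSIX)
print(dataset_dir("/ws", "cais/mmlu"))    # /ws/datasets/cais/mmlu (on POSIX)
```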
## Manually download `datasets.zip`

- Upload the `datasets.zip` to GitHub LFS for future proofing.
- Extract the `datasets.zip` to `$WORKSPACE/datasets/lirefs`.

Note: we use `datasets.zip` for a fair performance comparison, as the method we compare against uses these datasets.
See the subsections below for examples.

You can see the list of jobs in the `runner/jobs` folder. A job consists of multiple sub-jobs. If you want to run a specific sub-job, see the `runner/jobs/[job_name]/_[job_name].py` file.
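To make the job/sub-job relationship concrete, here is a purely hypothetical sketch of a runner that maps a job name to its named sub-jobs. The real definitions live in the `runner/jobs/[job_name]/_[job_name].py` files and will differ:

```python
# Hypothetical sketch only: a job is a named collection of sub-jobs.
JOBS = {
    "save_hidden_states": {
        "save_hidden_states_model_name_Meta-Llama-3-8B": lambda: "llama",
        "save_hidden_states_model_name_huginn-0125": lambda: "huginn",
    },
}

def run_job(job_name, sub_job_name=None):
    """Run one sub-job when a name is given, otherwise all sub-jobs in order."""
    sub_jobs = JOBS[job_name]
    names = [sub_job_name] if sub_job_name else list(sub_jobs)
    return {name: sub_jobs[name]() for name in names}
```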
## How to save hidden states?
```shell
WORKSPACE_PATH="/media/npu-tao/disk4T/jason"
python runner/main.py \
    --workspace_path "$WORKSPACE_PATH" \
    --jobs save_hidden_states_model_name_Meta-Llama-3-8B \
    --output_path "$WORKSPACE_PATH/experiments/runner"
```

Jobs:

- save_hidden_states
- save_hidden_states_model_name_Meta-Llama-3-8B
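Conceptually, this job runs the model and persists the hidden state after every layer. The toy numpy sketch below captures that idea with random stand-in "layers"; the shapes, file name, and layers are all illustrative, not the project's model:

```python
import os
import tempfile

import numpy as np

def forward_collect(x, layers):
    """Apply each toy layer in turn, recording the hidden state after each."""
    hidden_states = []
    for w in layers:
        x = np.tanh(x @ w)  # toy layer: linear map + nonlinearity
        hidden_states.append(x)
    return hidden_states

rng = np.random.default_rng(0)
layers = [rng.normal(size=(8, 8)) for _ in range(4)]  # 4 toy layers
x = rng.normal(size=(2, 8))                           # a "batch" of 2 embeddings
states = forward_collect(x, layers)

# Persist one array per layer, mirroring the idea of "saving hidden states".
out_path = os.path.join(tempfile.gettempdir(), "hidden_states.npz")
np.savez(out_path, **{f"layer_{i}": h for i, h in enumerate(states)})
```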
## How to train linear probes?

```shell
WORKSPACE_PATH="/media/npu-tao/disk4T/jason"
python runner/main.py \
    --workspace_path "$WORKSPACE_PATH" \
    --jobs train_linear_probes_model_name_Meta-Llama-3-8B \
    --output_path "$WORKSPACE_PATH/experiments/runner"
```

Jobs:

- train_linear_probes_model_name_huginn-0125
- train_linear_probes_model_name_Meta-Llama-3-8B
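A linear probe is simply a linear classifier fit on saved hidden states. The sketch below trains a logistic-regression probe with plain gradient descent on synthetic activations; everything here is synthetic, and the project's probe architecture and trainer may differ:

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic "hidden states": two classes separated along one direction.
direction = rng.normal(size=16)
X = rng.normal(size=(200, 16))
y = (X @ direction > 0).astype(float)

w = np.zeros(16)
b = 0.0
for _ in range(500):  # plain gradient descent on the logistic loss
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
    w -= 0.5 * (X.T @ (p - y) / len(y))
    b -= 0.5 * float(np.mean(p - y))

accuracy = float(np.mean((X @ w + b > 0) == (y == 1)))
print(f"train accuracy: {accuracy:.2f}")  # should be close to 1.0
```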
## How to save candidate directions?

```shell
WORKSPACE_PATH="/root/autodl-fs"
python runner/main.py \
    --workspace_path "$WORKSPACE_PATH" \
    --jobs save_candidate_directions \
    --output_path "$WORKSPACE_PATH/experiments/runner"
```

Jobs:

- save_candidate_directions
- save_candidate_directions_model_name_Meta-Llama-3-8B
- save_candidate_directions_model_name_huginn-0125

Note:

- mmlu_pro_save_candidate_directions_all_tokens with batch size 2 requires 1x 4090 or equivalent
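A common recipe for a candidate steering direction is the difference between the mean hidden states of the two behaviors (difference-in-means). The toy sketch below assumes that recipe; the project's exact method may differ:

```python
import numpy as np

rng = np.random.default_rng(0)
# Toy hidden states for two behaviors at one layer: (examples, hidden_dim).
h_reasoning = rng.normal(loc=0.5, size=(100, 16))
h_memorizing = rng.normal(loc=-0.5, size=(100, 16))

# Difference-in-means candidate direction, normalized to unit length.
direction = h_reasoning.mean(axis=0) - h_memorizing.mean(axis=0)
direction /= np.linalg.norm(direction)
```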
## How to analyze the steering effect per layer?

```shell
WORKSPACE_PATH="/media/npu-tao/disk4T/jason"
python runner/main.py \
    --workspace_path "$WORKSPACE_PATH" \
    --jobs analyze_steering_effect_per_layer_model_name_Meta-Llama-3-8B \
    --output_path "$WORKSPACE_PATH/experiments/runner"
```

Jobs:

- mmlu_pro_analyze_steering_effect_per_layer
- mmlu_pro_analyze_steering_effect_per_layer_all_tokens
- mmlu_pro_meta-llama-3-8b_analyze_steering_effect_per_layer
- mmlu_pro_meta-llama-3-8b_analyze_steering_effect_per_layer_all_tokens
- analyze_steering_effect_per_layer_model_name_huginn-0125
- analyze_steering_effect_per_layer_model_name_Meta-Llama-3-8B

Note:

- mmlu_pro_analyze_steering_effect_per_layer and its derivatives with batch size 2 require 4x 3090 (2x 3090) or equivalent
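The intervention itself amounts to adding a scaled direction to the hidden states at a chosen layer; the `layer_indices_*` and `scale_*` suffixes in the job names name exactly those knobs. A toy numpy sketch, where `steer` is an illustrative helper rather than the project's hook:

```python
import numpy as np

def steer(hidden, direction, scale):
    """Shift every token's hidden state along `direction` by `scale`."""
    return hidden + scale * direction

rng = np.random.default_rng(0)
hidden = rng.normal(size=(4, 16))  # (tokens, hidden_dim) at one layer
direction = rng.normal(size=16)
direction /= np.linalg.norm(direction)

steered = steer(hidden, direction, scale=-5e-2)
# With a unit-norm direction, each token's projection onto it moves by `scale`.
delta = (steered - hidden) @ direction
```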
## How to evaluate accuracy on reasoning vs. memorizing?

```shell
WORKSPACE_PATH="/media/npu-tao/disk4T/jason"
python runner/main.py \
    --workspace_path "$WORKSPACE_PATH" \
    --jobs evaluate_accuracy_reasoning_memorizing_model_name_Meta-Llama-3-8B_with_intervention_use_linear_probes_with_hidden_states_post_hook_layer_indices_12_modification_mode_last_token_scale_-5e-2 \
    --output_path "$WORKSPACE_PATH/experiments/runner"
```

Jobs:

- mmlu_pro_evaluate_accuracy_reasoning_memorizing
- mmlu_pro_evaluate_accuracy_reasoning_memorizing_with_intervention
- mmlu_pro_evaluate_accuracy_reasoning_memorizing_with_intervention_129
- mmlu_pro_evaluate_accuracy_reasoning_memorizing_with_intervention_1
- mmlu_pro_evaluate_accuracy_reasoning_memorizing_with_intervention_1_all_tokens
- mmlu_pro_meta-llama-3-8b_evaluate_accuracy_reasoning_memorizing
- mmlu_pro_meta-llama-3-8b_evaluate_accuracy_reasoning_memorizing_with_intervention
- evaluate_accuracy_reasoning_memorizing_model_name_Meta-Llama-3-8B
- evaluate_accuracy_reasoning_memorizing_model_name_Meta-Llama-3-8B_with_intervention_layer_indices_8_scale_-5e-2
- evaluate_accuracy_reasoning_memorizing_model_name_Meta-Llama-3-8B_with_intervention_layer_indices_2_scale_-5e-2
- evaluate_accuracy_reasoning_memorizing_model_name_Meta-Llama-3-8B_with_intervention_use_linear_probes_with_hidden_states_post_hook_layer_indices_12_modification_mode_last_token_scale_-5e-2

Note:

- mmlu_pro_evaluate_accuracy_reasoning_memorizing and its derivatives with batch size 1 require 2x 3090 (1x 3090) or equivalent
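The long job names above encode their configuration as suffixes. The hypothetical parser below documents the naming convention for two of those suffixes; the runner's real parsing logic may differ:

```python
import re

def parse_job_name(name: str) -> dict:
    """Pull layer_indices_* and scale_* settings out of a job name."""
    params = {}
    m = re.search(r"layer_indices_(\d+)", name)
    if m:
        params["layer_indices"] = int(m.group(1))
    m = re.search(r"scale_(-?[\d.]+e?-?\d*)", name)
    if m:
        params["scale"] = float(m.group(1))
    return params
```

For example, `..._layer_indices_8_scale_-5e-2` yields a layer index of 8 and a scale of -0.05.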
## How to evaluate lm-eval tasks?

```shell
WORKSPACE_PATH="/media/npu-tao/disk4T/jason"
python runner/main.py \
    --workspace_path "$WORKSPACE_PATH" \
    --jobs evaluate_lm_eval_tasks_piqa_use_linear_probes \
    --output_path "$WORKSPACE_PATH/experiments/runner" \
    --shutdown_after_experiment
```

Jobs:

- mmlu_pro_evaluate_lm_eval
- mmlu_pro_evaluate_lm_eval_with_intervention
- mmlu_pro_evaluate_lm_eval_with_intervention_129
- mmlu_evaluate_lm_eval
- mmlu_evaluate_lm_eval_few_shots_1
- mmlu_evaluate_lm_eval_with_intervention
- mmlu_evaluate_lm_eval_with_intervention_scale_with_overall_magnitude_feature_amplification
- mmlu_evaluate_lm_eval_with_intervention_1
- mmlu_evaluate_lm_eval_with_intervention_127
- mmlu_evaluate_lm_eval_with_intervention_129
- mmlu_evaluate_lm_eval_with_intervention_1_all_tokens
- mmlu_evaluate_lm_eval_with_intervention_1_all_tokens_scale_with_overall_magnitude
- mmlu_evaluate_lm_eval_with_intervention_use_linear_probes
- mmlu_evaluate_lm_eval_with_intervention_use_linear_probes_few_shots_1
- evaluate_lm_eval_tasks_piqa
- evaluate_lm_eval_tasks_piqa_use_linear_probes
- evaluate_lm_eval_model_name_Meta-Llama-3-8B
- evaluate_lm_eval_model_name_Meta-Llama-3-8B_with_intervention_user_linear_probes

Note:

- mmlu_pro_evaluate_lm_eval and its derivatives with batch size 4 require 2x 3060 or equivalent
## How to run the tests?

```shell
python -m unittest discover tests
```

## Sources

- `datasets.zip` is downloaded from this repository.
- The `models/recpre` folder is downloaded from this repository. However, the `raven_config_minimal.py` and `raven_modeling_minimal.py` files are downloaded from this repository. This is required because we need `hidden_states` per layer.
## Hardware

2x NVIDIA 3060 was used for the following jobs:

- mmlu_evaluate_lm_eval
- mmlu_evaluate_lm_eval_with_intervention

1x NVIDIA 3090 was used for the following jobs:

- mmlu_pro_save_hidden_states

2x NVIDIA 3090 was used for the following jobs:

- mmlu_pro_evaluate_accuracy_reasoning_memorizing
- mmlu_pro_evaluate_accuracy_reasoning_memorizing_with_intervention
- mmlu_pro_evaluate_lm_eval
- mmlu_pro_evaluate_lm_eval_with_intervention

4x NVIDIA 3090 was used for the following jobs:

- mmlu_pro_analyze_steering_effect_per_layer