jasonrichdarmawan/steering-model-with-latent-reasoning

Installation

  1. Install the project's Python dependencies

    conda create -n recurrent-env python=3.11
    conda activate recurrent-env
    pip install -r requirements.txt
  2. Use the following folder structure

    • TODO: support the default Hugging Face cache folder, ~/.cache/huggingface

      It is not supported natively because huggingface.co is not accessible from the author's location, making automated downloads infeasible. For now, please download the model weights and datasets from Hugging Face with `huggingface-cli`.

      Note: If a dataset includes Python files, you need to download them manually, because `huggingface-cli` does not download them.
      
    WORKSPACE/
    |- PROJECT/
    |- transformers/
    |- datasets/
    |- experiments/
    

    The $WORKSPACE/transformers folder is where you store the model weights, for example $WORKSPACE/transformers/huginn-0125/model-00003-of-00004.safetensors.

    The $WORKSPACE/datasets folder is where you store the datasets, in the format $WORKSPACE/datasets/<user-name>/<dataset-name>, for example $WORKSPACE/datasets/cais/mmlu.

  3. Manually download datasets.zip

    • TODO: upload datasets.zip to GitHub LFS for future-proofing

    Extract the datasets.zip to $WORKSPACE/datasets/lirefs.

    Note: We use datasets.zip for a fair performance comparison, because the method we compare against uses these datasets.
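The folder layout above can be created with a short script. A minimal sketch; the workspace path below is illustrative, so substitute your own WORKSPACE location:

```python
# Sketch: create the expected workspace layout (the root path is illustrative).
from pathlib import Path

workspace = Path.home() / "workspace"  # substitute your own WORKSPACE path
for sub in ("PROJECT", "transformers", "datasets", "experiments"):
    (workspace / sub).mkdir(parents=True, exist_ok=True)

# Model weights then go under e.g. workspace/transformers/huginn-0125/,
# datasets under e.g. workspace/datasets/cais/mmlu/, and the extracted
# datasets.zip under workspace/datasets/lirefs/.
```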

How to reproduce the experiment?

See the subsections below for examples.

You can see the list of jobs in the runner/jobs folder.

A job consists of multiple sub-jobs. If you want to run a specific sub-job, see the runner/jobs/[job_name]/_[job_name].py file.

How to save hidden states?

WORKSPACE_PATH="/media/npu-tao/disk4T/jason"

python runner/main.py \
--workspace_path "$WORKSPACE_PATH" \
--jobs save_hidden_states_model_name_Meta-Llama-3-8B \
--output_path "$WORKSPACE_PATH/experiments/runner"

Jobs:

  • save_hidden_states
  • save_hidden_states_model_name_Meta-Llama-3-8B

How to train linear probes?

WORKSPACE_PATH="/media/npu-tao/disk4T/jason"

python runner/main.py \
--workspace_path "$WORKSPACE_PATH" \
--jobs train_linear_probes_model_name_Meta-Llama-3-8B \
--output_path "$WORKSPACE_PATH/experiments/runner"

Jobs:

  • train_linear_probes_model_name_huginn-0125
  • train_linear_probes_model_name_Meta-Llama-3-8B
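For intuition, a linear probe is just a linear classifier trained on hidden states. A minimal numpy sketch with synthetic hidden states, independent of the repository's implementation:

```python
# Minimal sketch of a linear probe: logistic regression on hidden states,
# trained with plain gradient descent (numpy only; not the repository's code).
import numpy as np

rng = np.random.default_rng(0)
d = 16                                   # hidden size (illustrative)
n = 200                                  # examples per class

# Synthetic "hidden states": two classes separated along a random direction.
direction = rng.normal(size=d)
pos = rng.normal(size=(n, d)) + direction
neg = rng.normal(size=(n, d)) - direction
X = np.vstack([pos, neg])
y = np.concatenate([np.ones(n), np.zeros(n)])

w = np.zeros(d)
b = 0.0
lr = 0.1
for _ in range(500):                     # gradient descent on logistic loss
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
    w -= lr * (X.T @ (p - y)) / len(y)
    b -= lr * np.mean(p - y)

accuracy = np.mean(((X @ w + b) > 0.0) == y)
print(f"train accuracy: {accuracy:.2f}")
```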

How to save candidate directions?

WORKSPACE_PATH="/root/autodl-fs"

python runner/main.py \
--workspace_path "$WORKSPACE_PATH" \
--jobs save_candidate_directions \
--output_path "$WORKSPACE_PATH/experiments/runner"

Jobs:

  • save_candidate_directions
  • save_candidate_directions_model_name_Meta-Llama-3-8B
  • save_candidate_directions_model_name_huginn-0125

Note:

  • mmlu_pro_save_candidate_directions_all_tokens with batch size 2 requires 1x 4090 or equivalent
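For intuition, a common way to obtain a candidate steering direction is difference-in-means: the mean hidden state on reasoning inputs minus the mean on memorizing inputs. A numpy sketch with synthetic hidden states; the repository's exact procedure may differ:

```python
# Difference-in-means sketch for one layer's candidate direction
# (synthetic data; illustrative only).
import numpy as np

rng = np.random.default_rng(0)
d = 16                                         # hidden size (illustrative)
reasoning_h = rng.normal(size=(100, d)) + 1.0  # hidden states, reasoning set
memorizing_h = rng.normal(size=(100, d)) - 1.0 # hidden states, memorizing set

candidate_direction = reasoning_h.mean(axis=0) - memorizing_h.mean(axis=0)
print(candidate_direction.shape)  # in practice, one direction per layer
```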

How to analyze steering effect per layer?

WORKSPACE_PATH="/media/npu-tao/disk4T/jason"

python runner/main.py \
--workspace_path "$WORKSPACE_PATH" \
--jobs analyze_steering_effect_per_layer_model_name_Meta-Llama-3-8B \
--output_path "$WORKSPACE_PATH/experiments/runner"

Jobs:

  • mmlu_pro_analyze_steering_effect_per_layer
  • mmlu_pro_analyze_steering_effect_per_layer_all_tokens
  • mmlu_pro_meta-llama-3-8b_analyze_steering_effect_per_layer
  • mmlu_pro_meta-llama-3-8b_analyze_steering_effect_per_layer_all_tokens
  • analyze_steering_effect_per_layer_model_name_huginn-0125
  • analyze_steering_effect_per_layer_model_name_Meta-Llama-3-8B

Note:

  • mmlu_pro_analyze_steering_effect_per_layer and its derivatives with batch size 2 require 4x 3090 (2x 3090) or equivalent

How to evaluate accuracy on reasoning and memorization?

WORKSPACE_PATH="/media/npu-tao/disk4T/jason"

python runner/main.py \
--workspace_path "$WORKSPACE_PATH" \
--jobs evaluate_accuracy_reasoning_memorizing_model_name_Meta-Llama-3-8B_with_intervention_use_linear_probes_with_hidden_states_post_hook_layer_indices_12_modification_mode_last_token_scale_-5e-2 \
--output_path "$WORKSPACE_PATH/experiments/runner"

Jobs:

  • mmlu_pro_evaluate_accuracy_reasoning_memorizing
  • mmlu_pro_evaluate_accuracy_reasoning_memorizing_with_intervention
  • mmlu_pro_evaluate_accuracy_reasoning_memorizing_with_intervention_129
  • mmlu_pro_evaluate_accuracy_reasoning_memorizing_with_intervention_1
  • mmlu_pro_evaluate_accuracy_reasoning_memorizing_with_intervention_1_all_tokens
  • mmlu_pro_meta-llama-3-8b_evaluate_accuracy_reasoning_memorizing
  • mmlu_pro_meta-llama-3-8b_evaluate_accuracy_reasoning_memorizing_with_intervention
  • evaluate_accuracy_reasoning_memorizing_model_name_Meta-Llama-3-8B
  • evaluate_accuracy_reasoning_memorizing_model_name_Meta-Llama-3-8B_with_intervention_layer_indices_8_scale_-5e-2
  • evaluate_accuracy_reasoning_memorizing_model_name_Meta-Llama-3-8B_with_intervention_layer_indices_2_scale_-5e-2
  • evaluate_accuracy_reasoning_memorizing_model_name_Meta-Llama-3-8B_with_intervention_use_linear_probes_with_hidden_states_post_hook_layer_indices_12_modification_mode_last_token_scale_-5e-2

Note:

  • mmlu_pro_evaluate_accuracy_reasoning_memorizing and its derivatives with batch size 1 require 2x 3090 (1x 3090) or equivalent
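The job names above encode an intervention with modification mode last_token and scale -5e-2. A numpy sketch of that idea, assuming the intervention simply adds the scaled direction to the final token's hidden state; the repository applies it through hooks at the chosen layer indices:

```python
# Sketch of the "last token" modification mode (illustrative shapes).
import numpy as np

scale = -5e-2                       # matches the scale in the job names above
rng = np.random.default_rng(0)
hidden = rng.normal(size=(8, 16))   # (sequence length, hidden size)
direction = rng.normal(size=16)     # a candidate steering direction

steered = hidden.copy()
steered[-1] += scale * direction    # modify only the last token's hidden state

# Earlier positions are untouched:
assert np.allclose(steered[:-1], hidden[:-1])
```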

How to evaluate with lm_eval?

WORKSPACE_PATH="/media/npu-tao/disk4T/jason"

python runner/main.py \
--workspace_path "$WORKSPACE_PATH" \
--jobs evaluate_lm_eval_tasks_piqa_use_linear_probes \
--output_path "$WORKSPACE_PATH/experiments/runner" \
--shutdown_after_experiment

Jobs:

  • mmlu_pro_evaluate_lm_eval
  • mmlu_pro_evaluate_lm_eval_with_intervention
  • mmlu_pro_evaluate_lm_eval_with_intervention_129
  • mmlu_evaluate_lm_eval
  • mmlu_evaluate_lm_eval_few_shots_1
  • mmlu_evaluate_lm_eval_with_intervention
  • mmlu_evaluate_lm_eval_with_intervention_scale_with_overall_magnitude_feature_amplification
  • mmlu_evaluate_lm_eval_with_intervention_1
  • mmlu_evaluate_lm_eval_with_intervention_127
  • mmlu_evaluate_lm_eval_with_intervention_129
  • mmlu_evaluate_lm_eval_with_intervention_1_all_tokens
  • mmlu_evaluate_lm_eval_with_intervention_1_all_tokens_scale_with_overall_magnitude
  • mmlu_evaluate_lm_eval_with_intervention_use_linear_probes
  • mmlu_evaluate_lm_eval_with_intervention_use_linear_probes_few_shots_1
  • evaluate_lm_eval_tasks_piqa
  • evaluate_lm_eval_tasks_piqa_use_linear_probes
  • evaluate_lm_eval_model_name_Meta-Llama-3-8B
  • evaluate_lm_eval_model_name_Meta-Llama-3-8B_with_intervention_user_linear_probes

Note:

  • mmlu_pro_evaluate_lm_eval and its derivatives with batch size 4 require 2x 3060 or equivalent

How to test?

$ python -m unittest discover tests

Disclaimer

  • datasets.zip is downloaded from this repository
  • The models/recpre folder is downloaded from this repository; however, the raven_config_minimal.py and raven_modeling_minimal.py files are downloaded from this repository. This is required because we need the hidden_states of every layer.

Hardware used

2x NVIDIA 3060 was used for the following jobs:

  • mmlu_evaluate_lm_eval
  • mmlu_evaluate_lm_eval_with_intervention

1x NVIDIA 3090 was used for the following jobs:

  • mmlu_pro_save_hidden_states

2x NVIDIA 3090 was used for the following jobs:

  • mmlu_pro_evaluate_accuracy_reasoning_memorizing
  • mmlu_pro_evaluate_accuracy_reasoning_memorizing_with_intervention
  • mmlu_pro_evaluate_lm_eval
  • mmlu_pro_evaluate_lm_eval_with_intervention

4x NVIDIA 3090 was used for the following jobs:

  • mmlu_pro_analyze_steering_effect_per_layer

About

Minimal code to extract and apply steering vectors to a model with latent reasoning from the paper "Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach"
