Issue 39: Support OthelloGPT in SAELens #317
Conversation
Thanks @decandido and @zhenningdavidliu! This has some overlap with #294. Specifically, I'd like to use the seqpos slicing instead of the start and end pos offsets. Other than that this looks good, I think! Requests:
Once that's done I'll accept :)
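For context, a rough sketch of the seqpos-style slicing from #294 that this request refers to. The tuple name and exact semantics are assumptions here, inferred from the "seqpos tuple implementation" mentioned later in this PR, not a quote of the actual code:

```python
import torch

# Sketch of seqpos-style slicing (per #294), replacing start/end pos offsets.
# A tuple is unpacked into a slice object and applied along the position axis.
seqpos = (5, -5)  # slice(5, -5): keep positions 5 .. n_context - 5

activations = torch.randn(4, 59, 512)  # [batch, position, d_in], made-up sizes
sliced = activations[:, slice(*seqpos), :]
print(sliced.shape)  # torch.Size([4, 49, 512])
```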
chanind
left a comment
If I understand this correctly, these offset parameters only make sense in cases where the model is given sequences of a fixed length, and context_size is set to that length, e.g. OthelloGPT?
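A quick illustration of the concern, with made-up numbers:

```python
# With fixed-length inputs (every OthelloGPT game has the same length),
# fixed offsets select the same positions in every sequence.
n_context = 59  # OthelloGPT's fixed sequence length
start_pos_offset, end_pos_offset = 5, 5
kept = list(range(start_pos_offset, n_context - end_pos_offset))
print(kept[0], kept[-1])  # 5 53 -- identical for every game

# For variable-length text padded or truncated to context_size, position
# n_context - end_pos_offset can land on padding or mid-document, so the
# offsets no longer pick out comparable positions across sequences.
```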
stacked_activations = torch.zeros((n_batches, n_context, 1, self.d_in))
# For some models, we might want to exclude some positions from the sequence to train on
training_context_slice = list(
It's inefficient to create a list of indices like this. It's better to create an actual slice object, like training_context_slice = slice(self.start_pos_offset, n_context - self.end_pos_offset). Alternatively, you can probably just do something like the following directly when selecting the specific indices:
stacked_activations[:, :, 0] = layerwise_activations[self.hook_name][
    :, start_pos:end_pos, self.hook_head_index
]
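A minimal sketch contrasting the two approaches from this comment; shapes and offsets are made-up for illustration:

```python
import torch

n_batches, n_context, d_in = 4, 59, 512
start_pos_offset, end_pos_offset = 5, 5
acts = torch.randn(n_batches, n_context, d_in)

# List-of-indices approach (as in the diff): builds a Python list and
# triggers advanced indexing, which materializes an index tensor and copies.
idx = list(range(start_pos_offset, n_context - end_pos_offset))
via_list = acts[:, idx, :]

# slice-object approach: basic slicing, no index list, returns a view.
training_context_slice = slice(start_pos_offset, n_context - end_pos_offset)
via_slice = acts[:, training_context_slice, :]

assert torch.equal(via_list, via_slice)  # same values, cheaper to compute
```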
Hey David, thanks for your thorough review of our PR! As Joseph mentioned in his comment above, we've rebased on Callum's PR and added a few tests of our own. These problems are not present in the rebased code, but your comments are really helpful for @zhenningdavidliu and me! 🙏
Force-pushed from 7c333e8 to 1b39d82: fix: Changing the activations store to handle context sizes smaller than dataset lengths for tokenized datasets.
Force-pushed from 8539145 to 9130ff9.
Thanks @jbloomAus for reviewing our PR! We've rebased on #294 and removed the start and end pos offsets. We also added a short script to train SAEs on OthelloGPT.
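A minimal sketch of what such a training script could look like. SAETrainingRunner and LanguageModelSAERunnerConfig are the SAELens classes named in this PR, but the hook point, dataset path, and hyperparameters below are illustrative guesses, not the actual script added here:

```python
from sae_lens import LanguageModelSAERunnerConfig, SAETrainingRunner

cfg = LanguageModelSAERunnerConfig(
    model_name="othello-gpt",              # loaded through TransformerLens
    hook_name="blocks.6.hook_resid_post",  # illustrative hook point
    hook_layer=6,
    d_in=512,                              # OthelloGPT residual stream width
    dataset_path="<tokenized-othello-games>",  # placeholder dataset path
    is_dataset_tokenized=True,
    context_size=59,                       # OthelloGPT's fixed game length
    seqpos_slice=(5, -5),                  # assumed seqpos field from #294
    training_tokens=10_000_000,
)

sae = SAETrainingRunner(cfg).run()
```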
Hey, thanks for all your awesome work here! Could I possibly bump this PR? Just because it would also be super useful for the final bits of the ARENA material to run smoothly (-:
chanind
left a comment
LGTM! Will leave it to Joseph to merge
Thanks! (I'm actually also realizing that the context length issue wasn't part of this PR, it was an earlier one, and that was the main thing making things a bit clunky, so it's less important than I thought - but this would still be great!)
* support seqpos slicing
* add basic tests, ensure it's in the SAE config
* format
* fix tests
* fix tests 2
* fix: Changing the activations store to handle context sizes smaller than dataset lengths for tokenized datasets.
* fix: Found bug which allowed for negative context lengths. Removed the bug
* Update pytest to test new logic for context size of tokenized dataset
* Reformat code to pass CI tests
* Add warning for when context_size is smaller than the dataset context_size
* feat: adding support for start and end position offsets for token sequences
* Add start_pos_offset and end_pos_offset to the SAERunnerConfig
* Add tests for start_pos_offset and end_pos_offset in the LanguageModelSAERunnerConfig
* feat: start and end position offset support for SAELens.
* Add test for CacheActivationsRunnerConfig with start and end pos offset
* Test cache activation runner with valid start and end pos offset
* feat: Enabling loading of start and end pos offset from saes. Adding tests for this
* fix: Renaming variables and a test
* adds test for position offsets for saes
* reformats files with black
* Add start and end pos offset to the base sae dict
* fix test for sae training runner config with position offsets
* add a benchmark test to train an SAE on OthelloGPT
* Remove double import from typing
* change dead_feature_window to int
* remove print statements from test file
* Rebase on seqpos tuple implementation and remove start/end pos offset
* Reword docstring for seqpos to be clearer.
* Added script to train an SAE on othelloGPT

---------

Co-authored-by: callummcdougall <cal.s.mcdougall@gmail.com>
Co-authored-by: jbloomAus <jbloomaus@gmail.com>
Co-authored-by: liuman <zhenninghimme@gmail.com>
Description
Adds support for OthelloGPT: logic to train on sequences with a start and end position offset, similar to PR #294, plus tests for the added logic and a benchmark for the OthelloGPT SAETrainingRunner.
Fixes #39
Checklist:
You have tested formatting, typing and unit tests (acceptance tests not currently in use)
You have run make check-ci to check format and linting. (You can run make format to format code if needed.)