Proposal
It would be nice to include OLMo (1B and 7B) and their checkpoints as available compatible models for HookedTransformer.
Motivation
OLMo-1B would be a great model for mechanistic interpretability work, especially as it is fully open-source (training data, training code, and intermediate checkpoints are all released), which would let us study the relationship between training data/processes, checkpoints, and model behaviour. Its architecture should be fairly similar to models that are already compatible. If it is already possible to get it running, I would really appreciate a link to relevant information; I have looked through the documentation myself in the meantime but haven't found anything.
Pitch
Add OLMo-1B and OLMo-7B, as well as OLMo-2-7B and OLMo-2-13B. Ideally, also support loading their intermediate training checkpoints. A rough sketch of what the intended usage might look like is below.
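For illustration, here is a minimal sketch of the desired usage, assuming OLMo support were added. The `allenai/OLMo-1B-hf` repo name is the AllenAI Hugging Face release, and the `checkpoint_index` argument is assumed to work the way it already does for checkpointed models like Pythia; none of this works today, it is just the target interface.

```python
from transformer_lens import HookedTransformer

# Hypothetical: load the final OLMo-1B weights into a HookedTransformer
# (model name is the AllenAI HF release; OLMo is not yet supported).
model = HookedTransformer.from_pretrained("allenai/OLMo-1B-hf")

# Hypothetical: load an intermediate training checkpoint, assuming
# checkpoint support mirrors the existing Pythia-style interface.
model_ckpt = HookedTransformer.from_pretrained(
    "allenai/OLMo-1B-hf",
    checkpoint_index=10,
)
```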
Checklist