v3.0.0a1
Pre-release
Big release coming up! This release adds a new module named TransformerBridge, which will greatly increase the flexibility and extensibility of TransformerLens. The module is still very experimental, but we are looking for people who are ready to test it. This version already supports more models than any of the existing HookedModules, and we are working through a number of scripts to ensure full compatibility with any existing code that uses those HookedModules.
If you are interested in helping us test some of this, let us know on the Slack channel! If you want to use models not currently supported by the HookedModules, please feel free to submit any scripts currently running with an existing HookedModule to https://github.com/TransformerLensOrg/BridgeComaptibilityScripts. All scripts in this repo will be confirmed to run and to match the current HookedTransformer output before the final 3.0.0 release is published.
What's Changed
- Refactor the utilities file into utilities folder by @starship006 in #628
- Raise exception when BERT is loaded with HookedTransformer instead of… by @degenfabian in #795
- Circular dependency resolution by @bryce13950 in #803
- fixed corner param by @bryce13950 in #817
- bumped python min version by @bryce13950 in #802
- Updates torch to use the most recent version by @bryce13950 in #822
- updated python requirements by @bryce13950 in #821
- Recent releases by @bryce13950 in #841
- updated mypy limit by @bryce13950 in #880
- Activation utils cleanup by @bryce13950 in #879
- Restore consistency of hook_normalized between LayerNorm and RMSNorm by @degenfabian in #770
- Fix that padding_side always defaults to "right" when no value is explicitly passed by @degenfabian in #814
- Unified conversions by @bryce13950 in #881
- Flatten state dictionary for proper weight loading by @degenfabian in #860
- enabled actions on action pr by @bryce13950 in #882
- Add weight conversion for Phi model by @degenfabian in #863
- Add weight conversion for T5 models by @degenfabian in #859
- Visualize weight conversions by @degenfabian in #852
- Fixed test for ensuring weight conversions are provided by @bryce13950 in #883
- Drop python 3.9 by @bryce13950 in #885
- Conversion improved test coverage by @bryce13950 in #886
- Component test coverage by @bryce13950 in #890
- Bug new loading by @bryce13950 in #891
- Weight conversion llama by @bryce13950 in #892
- Refactor supported models module by @bryce13950 in #893
- Bug neox by @bryce13950 in #895
- added conditional check for hugging face by @bryce13950 in #919
- created a separate list of models to test for public PRs by @bryce13950 in #920
- added alternative when hf token is not included by @bryce13950 in #921
- shrunk loss test by @bryce13950 in #922
- Fix broken test, per issue #913 by @JasonBenn in #914
- Fix loading on specific device by @mntss in #906
- Feature model adapter by @bryce13950 in #928
- added test for making sure formatting works well by @bryce13950 in #932
- Refactor final issues by @bryce13950 in #933
- restored tokenizer content by @bryce13950 in #935
- Refactor weight conversion by @bryce13950 in #931
- Add qwen3 by @mntss in #937
- Improve ActivationCache docs by @BorisTheBrave in #901
- Feature: Get the value for rotary base from the hugging face config, only for Qwen for now. by @Gusanidas in #887
- added python 3.13 to CI by @bryce13950 in #843
- updated mypy by @bryce13950 in #940
- updated numpy dependency by @bryce13950 in #943
- updated torch by @bryce13950 in #942
- updated transformers by @bryce13950 in #939
- Fixed Qwen 3 docs issues by @bryce13950 in #946
- upstream fixes from dev by @bryce13950 in #941
- Flexible component mapping by @bryce13950 in #938
- updated sphinx by @bryce13950 in #948
- removed dependency by @bryce13950 in #951
- Move flatten dictionary to architecture_conversion by @degenfabian in #936
- made new transformer bridge extend nn module properly by @bryce13950 in #955
- brought in remaining hooked transformer functions by @bryce13950 in #954
- Setup tokenizer in boot function by @degenfabian in #959
- Bridged Robust Model Structure by @bryce13950 in #960
- Remove transformers dependency from bridge tokenization by @degenfabian in #963
- Dynamically add boot function to bridge by @degenfabian in #964
New Contributors
- @JasonBenn made their first contribution in #914
- @BorisTheBrave made their first contribution in #901
- @Gusanidas made their first contribution in #887
Full Changelog: v2.15.4...v3.0.0a1