This is the official code repository for the paper:
Blending Complementary Memory Systems in Hybrid Quadratic-Linear Transformers
- fla (contains common model implementations used in all the experiments)
- language_modeling (Table 1 and 3)
- synthetic_algorithmic (Table 2)
- reinforcement_learning (Figure 2)
Please refer to the README file in the corresponding directory for further instructions.
- Each of the experimental directories, language_modeling, synthetic_algorithmic, and reinforcement_learning, uses fla. Please create a symbolic link (e.g., ln -s ../fla/fla .; i.e., a link to the fla directory under fla) under each of these directories.
- We used the same conda environment for all the settings. The corresponding environment file is environment.yml. We used Python 3.10.16.
- Weights & Biases is used in all our experiments.
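
The symbolic-link step above can be scripted rather than repeated by hand. The following is a minimal sketch, assuming the directory layout listed earlier and a POSIX-compatible shell; the function name link_fla is hypothetical and is not part of the repository.

```shell
# link_fla ROOT
# Create a symbolic link named "fla" in each experiment directory under ROOT,
# pointing at ROOT/fla/fla (same target as: ln -s ../fla/fla .).
# NOTE: link_fla is an illustrative helper, not something the repository ships.
link_fla() {
    root="${1:-.}"

    # The link target must exist, otherwise the links would dangle.
    if [ ! -d "$root/fla/fla" ]; then
        echo "error: $root/fla/fla not found" >&2
        return 1
    fi

    for dir in language_modeling synthetic_algorithmic reinforcement_learning; do
        if [ ! -d "$root/$dir" ]; then
            echo "error: $root/$dir not found" >&2
            return 1
        fi
        # -s: symbolic link; -f: replace an existing link;
        # -n: do not follow an existing "fla" link when replacing it.
        # The relative target is resolved from inside $dir, hence ../fla/fla.
        ln -sfn ../fla/fla "$root/$dir/fla"
    done
}
```

After sourcing the snippet, calling link_fla . from the repository root should leave a fla link in each of the three experiment directories. A relative target is used so that the links stay valid if the repository is moved.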
This repository contains forks of code from the following publicly available resources. LICENSE files are included in the corresponding directories. We extend our heartfelt thanks to all the corresponding authors for making these very useful toolkits openly available.
- language_modeling is a fork of fla-org/flame.
- fla is a fork of fla-org/flash-linear-attention.
- synthetic_algorithmic is a fork of automl/unlocking_state_tracking, which itself is based on NX-AI/xlstm.
- reinforcement_learning is a fork of twni2016/Memory-RL.