Large Language Models Share Representations of Latent Grammatical Concepts Across Typologically Diverse Languages

This repository contains the code, data, and links to sparse autoencoders needed to replicate the experiments in the paper Large Language Models Share Representations of Latent Grammatical Concepts Across Typologically Diverse Languages.

Setup

Counterfactual Data. We design datasets consisting of minimal pairs of inputs that differ only with respect to the presence of a grammatical concept. For example, we generate counterfactual pairs that elicit singular or plural verbs based on the grammatical number of the subject:

a. The parents near the cars were
b. The parent near the cars was

This is an adaptation and translation of data from Arora et al. (2024).
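As a rough illustration of the minimal-pair format, the sketch below stores the example pair in a small record and checks that the two prefixes differ in exactly one whitespace-separated token. The `MinimalPair` class, its field names, and `differing_positions` are hypothetical names chosen for this example; they are not taken from the repository's data files or code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MinimalPair:
    """Hypothetical container: two prefixes and the verb each should elicit."""
    base_prefix: str
    base_verb: str
    counterfactual_prefix: str
    counterfactual_verb: str


def differing_positions(pair: MinimalPair) -> list[int]:
    """Return the whitespace-token indices at which the two prefixes differ."""
    base = pair.base_prefix.split()
    counterfactual = pair.counterfactual_prefix.split()
    if len(base) != len(counterfactual):
        raise ValueError("prefixes must have the same number of tokens")
    return [i for i, (b, c) in enumerate(zip(base, counterfactual)) if b != c]


# The subject-verb agreement example from the text above.
pair = MinimalPair(
    base_prefix="The parents near the cars",
    base_verb="were",
    counterfactual_prefix="The parent near the cars",
    counterfactual_verb="was",
)
positions = differing_positions(pair)  # only the subject noun differs
```

A check like this can help catch pairs that accidentally differ in more than the intended grammatical concept, although whitespace tokenization is only an approximation and may not suit every language.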

Universal Dependencies. For our experiments, we selected 23 languages from Universal Dependencies 2.1 (UD; Nivre et al., 2017), a multilingual treebank containing dependency-parsed sentences. These correspond to the 23 languages that Aya-23 was trained on. The dataset can be downloaded from the Universal Dependencies project. Each word in each sentence in UD is annotated with its part of speech and morphosyntactic features, as defined in the UniMorph schema.
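To make the per-word annotations concrete, here is a minimal sketch of reading the part of speech and morphosyntactic features from one token line in the CoNLL-U format that UD treebanks are distributed in (ten tab-separated columns, with features in the sixth column as `Name=Value` pairs joined by `|`). The function name `parse_conllu_token` and the example line are illustrative assumptions written by hand; the line is not copied from a treebank, and this is not the repository's loading code.

```python
def parse_conllu_token(line: str) -> dict:
    """Extract the form, universal POS tag, and features from one CoNLL-U token line."""
    columns = line.rstrip("\n").split("\t")
    if len(columns) != 10:
        raise ValueError("a CoNLL-U token line has exactly 10 tab-separated columns")
    feats = {}
    # An underscore in the FEATS column means the token has no features.
    if columns[5] != "_":
        for item in columns[5].split("|"):
            name, value = item.split("=", 1)
            feats[name] = value
    return {"form": columns[1], "upos": columns[3], "feats": feats}


# Hand-written example line (columns: ID, FORM, LEMMA, UPOS, XPOS, FEATS,
# HEAD, DEPREL, DEPS, MISC).
example_line = "2\tparents\tparent\tNOUN\tNNS\tNumber=Plur\t1\tnsubj\t_\t_"
token = parse_conllu_token(example_line)
```

Labels such as `Number=Plur` are the kind of annotation a probe for a grammatical concept could be trained on.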

Sparse Autoencoders. To run experiments with Llama-3-8B or Aya-23-8B, you will need to either train or download sparse autoencoders for each model. You can download dictionaries for Aya-23-8B and Llama-3-8B from HuggingFace.
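For readers unfamiliar with the term, the following sketch shows what a dictionary of this kind computes in the generic case: a model activation is encoded into non-negative feature activations and then decoded back into an approximate reconstruction. It assumes a plain ReLU autoencoder with toy hand-picked weights, written in pure Python for clarity. The architecture, parameter names, and shapes of the released dictionaries may differ, and `sae_forward` is a hypothetical name, not a function from this repository.

```python
def matvec(matrix, vector):
    """Multiply a matrix (given as a list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def sae_forward(activation, W_enc, b_enc, W_dec, b_dec):
    """Generic ReLU sparse autoencoder: encode to features, then reconstruct."""
    pre_activation = [p + b for p, b in zip(matvec(W_enc, activation), b_enc)]
    features = [max(0.0, v) for v in pre_activation]  # ReLU keeps features non-negative
    reconstruction = [r + b for r, b in zip(matvec(W_dec, features), b_dec)]
    return features, reconstruction


# Toy weights (assumed for illustration): 2-dimensional activation, 3 features.
W_enc = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
b_enc = [0.0, 0.0, 0.0]
W_dec = [[1.0, 0.0, -1.0], [0.0, 1.0, -1.0]]
b_dec = [0.0, 0.0]

features, reconstruction = sae_forward([2.0, 0.5], W_enc, b_enc, W_dec, b_dec)
```

In this toy case the third feature stays inactive for the given input, which illustrates the sparsity such dictionaries are trained to encourage.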

Demo Notebooks

Citation

If you use any of the code or ideas presented here, please cite our paper:

@misc{brinkmann2025largelanguagemodelsshare,
      title={Large Language Models Share Representations of Latent Grammatical Concepts Across Typologically Diverse Languages}, 
      author={Jannik Brinkmann and Chris Wendler and Christian Bartelt and Aaron Mueller},
      year={2025},
      eprint={2501.06346},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2501.06346}, 
}

If you use the dataset, please also cite:

@inproceedings{arora-etal-2024-causalgym,
    title = "{C}ausal{G}ym: Benchmarking causal interpretability methods on linguistic tasks",
    author = "Arora, Aryaman and Jurafsky, Dan and Potts, Christopher",
    editor = "Ku, Lun-Wei and Martins, Andre and Srikumar, Vivek",
    booktitle = "Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)",
    month = aug,
    year = "2024",
    address = "Bangkok, Thailand",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2024.acl-long.785",
    doi = "10.18653/v1/2024.acl-long.785",
    pages = "14638--14663"
}
