Support for functional localization #240

BKHMSI · 2024-07-06T05:43:58Z

Users can now perform functional localization as described in "Brain-Like Language Processing via a Shallow Untrained Multihead Attention Network".

Changes:

Localization stimuli are saved in data/fedorenko2010_localization and can be loaded via the data_registry.
The localization script in model_helpers/localize computes the language mask according to the paper mentioned above.
The language mask is cached in .brainio.
The HuggingfaceSubject class was adapted to extract activations from multiple layers at once and to use the localization script when the use_localizer flag is set to True. In that case, only the language-selective units are extracted from all the activations.
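
As a rough illustration of what computing a language mask can look like, the sketch below ranks units by a Welch t-statistic comparing activations for sentences against non-words and keeps the top_k highest-ranked units pooled across all layers. This is an assumption about the general approach, not the code in model_helpers/localize; the function names welch_t and language_mask and the toy data are hypothetical.

```python
import numpy as np


def welch_t(a, b):
    """Welch t-statistic per unit; a and b have shape (num_stimuli, num_units)."""
    var_a = a.var(axis=0, ddof=1) / a.shape[0]
    var_b = b.var(axis=0, ddof=1) / b.shape[0]
    # Small constant guards against division by zero for constant units.
    return (a.mean(axis=0) - b.mean(axis=0)) / np.sqrt(var_a + var_b + 1e-12)


def language_mask(sentences, non_words, top_k):
    """Boolean mask of shape (num_layers, num_units) marking the top_k units
    with the largest sentences-vs-non-words t-statistic across all layers."""
    layer_names = list(sentences)
    t_values = np.stack([welch_t(sentences[n], non_words[n]) for n in layer_names])
    flat = t_values.ravel()
    top = np.argpartition(flat, -top_k)[-top_k:]  # indices of the top_k largest values
    mask = np.zeros(flat.shape, dtype=bool)
    mask[top] = True
    return mask.reshape(t_values.shape)


# Toy data: 2 layers, 8 units each, 20 stimuli per condition.
rng = np.random.default_rng(0)
layers = ["layer_a", "layer_b"]
sentences = {n: rng.normal(size=(20, 8)) for n in layers}
non_words = {n: rng.normal(size=(20, 8)) for n in layers}
sentences["layer_a"][:, 3] += 5.0  # plant one strongly sentence-selective unit
mask = language_mask(sentences, non_words, top_k=4)
```

Indexing a stacked activation array with such a mask would then keep only the selected units.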

Usage:

An example script can be found in examples/score_localization:

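from brainscore_language import ArtificialSubject, load_benchmark
from brainscore_language.model_helpers.huggingface import HuggingfaceSubject

benchmark = load_benchmark('Pereira2018.243sentences-linear')

# GPT-2 (small) has 12 transformer blocks; record four sublayers in each
num_blocks = 12
layer_names = [
    f'transformer.h.{block}.{layer_type}'
    for block in range(num_blocks)
    for layer_type in ['ln_1', 'attn', 'ln_2', 'mlp']
]

model = HuggingfaceSubject(
    model_id='gpt2',
    region_layer_mapping={ArtificialSubject.RecordingTarget.language_system: layer_names},
    use_localizer=True,  # keep only the language-selective units
    localizer_kwargs={
        'hidden_dim': 768,
        'batch_size': 16,
        'top_k': 4096,
    },
)

model_score = benchmark(model)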
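
To put the top_k value in the example above into perspective, the arithmetic below counts the candidate units. It assumes each recorded layer contributes one hidden_dim-sized vector per stimulus and that selection is pooled across all layers; both are assumptions, not something this PR description states.

```python
num_blocks = 12
layer_types = ['ln_1', 'attn', 'ln_2', 'mlp']
hidden_dim = 768
top_k = 4096

num_layers = num_blocks * len(layer_types)    # 12 * 4 = 48 recorded layers
candidate_units = num_layers * hidden_dim     # 48 * 768 = 36864 units in total
selected_fraction = top_k / candidate_units   # 4096 / 36864 = 1/9

print(f"{num_layers} layers, {candidate_units} units, {selected_fraction:.1%} selected")
```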

mschrimpf · 2024-07-30T14:38:07Z

brainscore_language/data/fedorenko2010_localization/__init__.py

+    for stimuli_idx in range(3, 14):
+        data["sent"] += " " + data[f"stim{stimuli_idx}"].apply(str.lower)


what does this do? add comment
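
For readers following the thread: the loop appears to join one-word-per-column stimuli into a single lowercase sentence per row. The sketch below reproduces that on a hypothetical three-column table. The column contents and the way "sent" is initialized are assumptions, since that line is not part of the quoted diff.

```python
import pandas as pd

# Hypothetical miniature of the stimulus table: one word per column.
data = pd.DataFrame({
    "stim2": ["THE", "A"],
    "stim3": ["DOG", "CAT"],
    "stim4": ["BARKED", "SLEPT"],
})

# Assumption: "sent" starts from the first word column before the loop runs.
data["sent"] = data["stim2"].apply(str.lower)

# The diff iterates range(3, 14), i.e. stim3..stim13; the toy table stops at stim4.
for stimuli_idx in range(3, 5):
    data["sent"] += " " + data[f"stim{stimuli_idx}"].apply(str.lower)

sentences = data["sent"].tolist()
print(sentences)
```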

mschrimpf · 2024-07-30T14:40:58Z

brainscore_language/model_helpers/localize.py

+from brainscore_language import load_dataset
+
+BRAINIO_CACHE = os.environ.get("BRAINIO", f"{Path.home()}/.brainio")
+os.environ["TOKENIZERS_PARALLELISM"] = "False"


comment why this is necessary
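
A likely reason, offered as an assumption rather than the author's stated rationale: the Hugging Face tokenizers library warns and disables its own parallelism when the process is forked after a tokenizer has already been used (for example by DataLoader worker processes), to avoid deadlocks. Setting TOKENIZERS_PARALLELISM explicitly before any tokenizer is used avoids that situation. A minimal sketch of both environment settings, with the hypothetical name brainio_cache:

```python
import os
from pathlib import Path

# Set before any tokenizer is used, so forked worker processes
# do not trigger the tokenizers fork warning.
os.environ["TOKENIZERS_PARALLELISM"] = "false"

# Cache directory: honor the BRAINIO variable if set, else fall back to
# ~/.brainio (the same fallback logic as the quoted diff).
brainio_cache = Path(os.environ.get("BRAINIO", Path.home() / ".brainio"))
```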

mschrimpf · 2024-07-30T14:45:05Z

brainscore_language/model_helpers/localize.py

+
+class Fed10_langlocDataset(Dataset):
+    def __init__(self):
+        self.num_samples = 240


where is this being used?

line #103 in the extract_representations function

ok I'm not actually sure what this does -- looks like it's just used to zero-fill layer_name (??)
Could this not also be derived from self.sentences?

final_layer_representations = {
    "sentences": {
        layer_name: np.zeros((langloc_dataset.num_samples, hidden_dim))
        for layer_name in layer_names
    },
    "non-words": {
        layer_name: np.zeros((langloc_dataset.num_samples, hidden_dim))
        for layer_name in layer_names
    },
}

replaced langloc_dataset.num_samples with len(langloc_dataset.sentences)
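
A minimal sketch of that change, deriving the sample count from the stimuli instead of a hard-coded attribute. ToyLanglocDataset and empty_representations are hypothetical stand-ins, not the classes or functions in this PR.

```python
import numpy as np


class ToyLanglocDataset:
    """Hypothetical stand-in for Fed10_langlocDataset with paired stimuli."""

    def __init__(self, sentences, non_words):
        self.sentences = sentences
        self.non_words = non_words

    def __len__(self):
        # Derived from the data, so it cannot drift out of sync with it.
        return len(self.sentences)


def empty_representations(dataset, layer_names, hidden_dim):
    """Zero-filled buffers, one (num_samples, hidden_dim) array per layer and condition."""
    num_samples = len(dataset.sentences)
    return {
        condition: {name: np.zeros((num_samples, hidden_dim)) for name in layer_names}
        for condition in ("sentences", "non-words")
    }


dataset = ToyLanglocDataset(["a b c", "d e f"], ["x y z", "q r s"])
reps = empty_representations(dataset, ["transformer.h.0.mlp"], hidden_dim=4)
```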

mschrimpf

looks good; but please check comments

mschrimpf · 2025-01-27T12:18:19Z

.gitignore

 ### project specific additions:

-brainscore_language/data
+# brainscore_language/data


uncommented in the new commit

mschrimpf · 2025-01-27T12:20:00Z

brainscore_language/metrics/rdm/__init__.py

@@ -0,0 +1,13 @@
+from brainscore_language import metric_registry


could we just import this metric from brain-score-vision?

that would require brain-score-vision to be a dependency for brain-score-language, not sure if that's a good idea?

mschrimpf · 2025-03-12T19:42:04Z

@BKHMSI OK to merge?

BKHMSI · 2025-03-12T19:46:42Z

Yes, it was tested using Python 3.11

BKHMSI added 3 commits July 6, 2024 07:32

added support for localization

ea5a6d8

changed variable names in localization example

f98c6ab

Update .gitignore

5961de0

mschrimpf approved these changes Jul 30, 2024

BKHMSI added 6 commits August 8, 2024 09:21

added comments

da7672e

removed num_samples from Fed10_langlocDataset

d35d254

SUMA now supported

a07c9d2

added support for ridge regression

aa4fac8

added rdm and cka metrics

856f530

tuckute2024 and fedorenko2016 benchmarks added

8185c86

mschrimpf approved these changes Jan 27, 2025

data in gitignore

59cc2bc

mschrimpf merged commit c505c70 into brain-score:main Mar 13, 2025
0 of 3 checks passed
