Problem in reproducing attention analysis from the paper "What Do They Capture? - A Structural Analysis of Pre-Trained Language Models for Source Code"

Hi,

First of all, thank you for such a detailed write-up on pre-trained models for source code.

I am currently trying to reproduce the results, but line 133 of `compute_edge_features.py` refers to the path `../data/code_new/code_contact_map/noneighbor/train.json`, which I could not find anywhere.

I tried changing the path to the `train.ast` file provided in the Python AST dataset instead, but that raises a different error:
```
Layers: 12
Heads: 12
Loading dataset
100% 5000/5000 [00:00<00:00, 1458178.28it/s]
  0% 0/5000 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "compute_edge_features.py", line 155, in <module>
    min_attn=min_attn)
  File "compute_edge_features.py", line 64, in compute_mean_attention
    feature_map=item['feature_map']
KeyError: 'feature_map'
```
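
In case it helps narrow this down, a minimal sketch like the one below would show whether the records in a given file carry the `feature_map` key that `compute_mean_attention` reads. The helper name `missing_keys`, the one-JSON-object-per-line layout, and the sample records are assumptions made for illustration; none of them come from the repository.

```python
import json


def missing_keys(lines, required=("feature_map",)):
    """Return (index, missing key names) for each record lacking a required key.

    Assumes one JSON value per line; the real file layout may differ.
    """
    problems = []
    for idx, line in enumerate(lines):
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        if not isinstance(record, dict):
            # A non-object record (e.g. a list) cannot hold any named key.
            problems.append((idx, list(required)))
            continue
        absent = [key for key in required if key not in record]
        if absent:
            problems.append((idx, absent))
    return problems


# Hypothetical sample records, for illustration only.
sample = [
    '{"feature_map": [[0, 1], [1, 0]], "tokens": ["a", "b"]}',
    '{"tokens": ["def", "f"]}',
]
print(missing_keys(sample))
```

To check a real file, the same function can be given an open file handle, for example `with open(path) as f: print(missing_keys(f)[:5])`, where `path` is whichever dataset file is being tried.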

I hope you can give me some instructions on how to resolve this problem.

Many thanks!

Problem in reproducing attention analysis from the paper "What Do They Capture? - A Structural Analysis of Pre-Trained Language Models for Source Code" #19

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Problem in reproducing attention analysis from the paper "What Do They Capture? - A Structural Analysis of Pre-Trained Language Models for Source Code" #19

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions