Problem in reproducing attention analysis from the paper "What Do They Capture? - A Structural Analysis of Pre-Trained Language Models for Source Code" #19

@dfighter1312

Hi,

First of all, thank you for such a detailed write-up and discussion of pre-trained models for source code.

I am currently trying to reproduce the results, but line 133 of compute_edge_features.py refers to the path ../data/code_new/code_contact_map/noneighbor/train.json, which I could not find anywhere.

I tried changing the path to the train.ast file provided in the Python AST dataset instead, but that raises a different error:

Layers: 12
Heads: 12
Loading dataset
100% 5000/5000 [00:00<00:00, 1458178.28it/s]
  0% 0/5000 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "compute_edge_features.py", line 155, in <module>
    min_attn=min_attn)
  File "compute_edge_features.py", line 64, in compute_mean_attention
    feature_map=item['feature_map']
KeyError: 'feature_map'
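
The KeyError suggests that the records I loaded simply do not carry a feature_map entry. Below is a minimal sketch of how this could be checked, assuming each loaded record is a dict (as the item['feature_map'] lookup in the traceback implies). The helper name summarize_keys and the sample records are hypothetical and only illustrate the check; they are not taken from the repository or the dataset.

```python
from collections import Counter


def summarize_keys(records, required_key="feature_map"):
    """Count how often each top-level key occurs and how many records lack `required_key`."""
    key_counts = Counter()
    missing = 0
    for record in records:
        key_counts.update(record.keys())
        if required_key not in record:
            missing += 1
    return key_counts, missing


# Hypothetical sample records; the real data layout is unknown to me.
sample = [
    {"tokens": ["def", "f"], "feature_map": [[0, 1], [1, 0]]},
    {"tokens": ["x", "=", "1"]},
]
key_counts, missing = summarize_keys(sample)
print(dict(key_counts), "records missing feature_map:", missing)
```

If every record turns out to lack the key, the file is probably a raw input that still needs a preprocessing step to produce the feature maps, rather than the file the script expects.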

Could you tell me where to find this file, or how to generate it, so that I can resolve the problem?

Many thanks!
