KeyError in /code/BERT_NER/utils_fine_tune/labels_seg.txt #5

@cuevasclemente

Hi,

I'm trying to run E2E_SoftNER.py. I think I have resolved the references to the locations of most of the models and files associated with the repo. However, I'm getting an error. Here's the traceback:

Exception has occurred: KeyError
8
  File "/Users/clemente/src/python/github/StackOverflowNER/code/BERT_NER/softner_segmenter_preditct_from_file.py", line 298, in evaluate
    preds_list[i].append(label_map[preds[i][j]])
  File "/Users/clemente/src/python/github/StackOverflowNER/code/BERT_NER/softner_segmenter_preditct_from_file.py", line 638, in predict_segments
    result, predictions = evaluate(args, model, tokenizer, labels, pad_token_label_id, mode="", path=input_file)
  File "/Users/clemente/src/python/github/StackOverflowNER/code/BERT_NER/E2E_SoftNER.py", line 186, in Extract_NER
    softner_segmenter_preditct_from_file.predict_segments(segmenter_input_file, segmenter_output_file)
  File "/Users/clemente/src/python/github/StackOverflowNER/code/BERT_NER/E2E_SoftNER.py", line 206, in <module>
    Extract_NER(input_file)

It looks like there might be a mismatch between the format this code expects for './utils_fine_tune/labels_seg.txt' and what the file actually contains. The label_map built here has only six entries (keys 0 through 5), so it has no key for 8:

> label_map
{0: 'B-Name', 1: 'O', 2: 'CTC_PRED:0', 3: 'CTC_PRED:1', 4: 'md_label:O', 5: 'md_label:Name'}

whereas preds is an array of predicted label ids that go well beyond 5 (at least up to 13):

> preds
array([[ 0,  8, 13, ..., 10,  1,  0],
       [ 4, 13,  1, ...,  7,  3,  9],
       [ 9,  2,  0, ...,  9,  1,  9],
       ...,
       [ 0,  2, 13, ...,  0, 12,  0],
       [ 4,  2,  5, ..., 10,  5,  1],
       [ 4,  2,  6, ...,  9,  9,  9]])
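A minimal sketch of a check that would confirm the mismatch: it lists every predicted id that has no entry in label_map. The helper name find_unmapped_label_ids is hypothetical and not part of the repo, and the preds value below is a small stand-in built from the values visible above, not the real array.

```python
def find_unmapped_label_ids(preds, label_map):
    """Return the sorted predicted label ids that are missing from label_map.

    Hypothetical helper for debugging; `preds` can be a nested list or a
    2-D NumPy array of integer class ids.
    """
    seen = {int(label_id) for row in preds for label_id in row}
    return sorted(seen - set(label_map))


# label_map exactly as printed in the debugger.
label_map = {0: 'B-Name', 1: 'O', 2: 'CTC_PRED:0', 3: 'CTC_PRED:1',
             4: 'md_label:O', 5: 'md_label:Name'}

# Stand-in for preds (assumption: only the values visible in the first two rows).
preds = [[0, 8, 13, 10, 1, 0],
         [4, 13, 1, 7, 3, 9]]

missing = find_unmapped_label_ids(preds, label_map)
print(missing)
```

A non-empty result means the model emits more classes than the labels file defines, which points at the labels file (or the way it is parsed) rather than at the lookup on line 298.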

Everything in the utils_fine_tune directory came from the megaupload link you provided, so it's possible that there is an issue with either the archive or the data.

Thanks very much for contributing this code to the community. If you find the time to take a look at this issue, please let me know if there is anything else I can provide to help debug or further understand it. Hopefully it's just some misunderstanding on my end.
