Hi,
I'm trying to run E2E_SoftNER.py. I believe I have resolved the paths to most of the models and files the repo depends on, but I'm now getting a KeyError. Here's the traceback:
Exception has occurred: KeyError
8
File "/Users/clemente/src/python/github/StackOverflowNER/code/BERT_NER/softner_segmenter_preditct_from_file.py", line 298, in evaluate
preds_list[i].append(label_map[preds[i][j]])
File "/Users/clemente/src/python/github/StackOverflowNER/code/BERT_NER/softner_segmenter_preditct_from_file.py", line 638, in predict_segments
result, predictions = evaluate(args, model, tokenizer, labels, pad_token_label_id, mode="", path=input_file)
File "/Users/clemente/src/python/github/StackOverflowNER/code/BERT_NER/E2E_SoftNER.py", line 186, in Extract_NER
softner_segmenter_preditct_from_file.predict_segments(segmenter_input_file, segmenter_output_file)
File "/Users/clemente/src/python/github/StackOverflowNER/code/BERT_NER/E2E_SoftNER.py", line 206, in <module>
Extract_NER(input_file)
It looks like there might be something off with the format this code expects for './utils_fine_tune/labels_seg.txt'. At the point of failure, label_map is a dictionary with only six entries (keys 0 through 5), so it has no key for 8:
> label_map
{0: 'B-Name', 1: 'O', 2: 'CTC_PRED:0', 3: 'CTC_PRED:1', 4: 'md_label:O', 5: 'md_label:Name'}
whereas preds contains class indices well outside that range (values up to 13 are visible in the printout):
> preds
array([[ 0, 8, 13, ..., 10, 1, 0],
[ 4, 13, 1, ..., 7, 3, 9],
[ 9, 2, 0, ..., 9, 1, 9],
...,
[ 0, 2, 13, ..., 0, 12, 0],
[ 4, 2, 5, ..., 10, 5, 1],
[ 4, 2, 6, ..., 9, 9, 9]])
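A minimal sketch of how the mismatch could be confirmed, using only the values visible in the truncated printout above. The names `find_unmapped_ids` and `preds_sample` are hypothetical and not part of the repo; the real preds array is larger, so this only shows a lower bound on the missing ids.

```python
def find_unmapped_ids(preds, label_map):
    """Return the sorted prediction ids that have no entry in label_map."""
    # Flatten the 2D predictions and collect the distinct class ids.
    seen = {int(p) for row in preds for p in row}
    return sorted(seen - set(label_map))


# label_map exactly as shown in the debugger output above.
label_map = {
    0: "B-Name",
    1: "O",
    2: "CTC_PRED:0",
    3: "CTC_PRED:1",
    4: "md_label:O",
    5: "md_label:Name",
}

# Assumption: only the values visible in the truncated printout of preds.
preds_sample = [
    [0, 8, 13, 10, 1, 0],
    [4, 13, 1, 7, 3, 9],
    [9, 2, 0, 9, 1, 9],
    [0, 2, 13, 0, 12, 0],
    [4, 2, 5, 10, 5, 1],
    [4, 2, 6, 9, 9, 9],
]

missing = find_unmapped_ids(preds_sample, label_map)
print(missing)  # [6, 7, 8, 9, 10, 12, 13]
```

If this pattern holds for the full array, the model is emitting at least 14 classes while the labels file only yields 6, which would suggest the labels file and the segmenter checkpoint do not belong together.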
Everything in the utils_fine_tune directory came from the megaupload link you provided, so it's possible that there is an issue with either the archive or the data in it.
Thanks very much for contributing this code to the community. If you find the time to take a look at this issue, please let me know if there is anything else I can provide to help debug or better understand it. Hopefully it's just a misunderstanding on my end.