Description
Based on the code provided in usage_example.ipynb, it is not clear what the source and target datasets should be. I presume the source dataset is a list of strings and the target is an ontology or a terminology; however, in the code below the target dataset also appears to be a text file rather than a terminology.
Similarly, it is not clear how the source and target weights are computed. To be able to run this tool, it would be useful to clarify these issues and to have some code to process inputs and generate BioBERT weights accordingly.
source_dataset="../data/input.txt"
target_dataset="../data/target.txt"
source_weight="../data/input.jsonl"
target_weight="../data/target.jsonl"
The data folder used in the given example usage is missing. What exactly should these files be and how can they be generated by potential users?
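To make the question concrete, here is a minimal sketch of the layout I am guessing at, not the tool's documented format. It assumes each .txt file holds one term per line and each .jsonl file holds one JSON object per term. The field names `text` and `embedding`, the helper names, and the hash-based `placeholder_embedding` (a stand-in for a real BioBERT embedding) are all hypothetical.

```python
import hashlib
import json
from pathlib import Path


def placeholder_embedding(text, dim=8):
    """Hypothetical stand-in for a BioBERT embedding.

    Returns a deterministic pseudo-vector derived from a hash of the text,
    only so that the file layout can be illustrated without a model.
    """
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return [b / 255.0 for b in digest[:dim]]


def write_dataset(terms, text_path, weight_path):
    """Write terms one per line, plus one JSON record per term (assumed layout)."""
    text_path = Path(text_path)
    weight_path = Path(weight_path)
    text_path.parent.mkdir(parents=True, exist_ok=True)
    weight_path.parent.mkdir(parents=True, exist_ok=True)

    # Assumed dataset format: plain text, one term per line.
    text_path.write_text("\n".join(terms) + "\n", encoding="utf-8")

    # Assumed weight format: JSON Lines, one record per term, in the same order.
    with weight_path.open("w", encoding="utf-8") as fh:
        for term in terms:
            record = {"text": term, "embedding": placeholder_embedding(term)}
            fh.write(json.dumps(record) + "\n")
```

Under these assumptions, `write_dataset(["heart attack", "diabetes"], "../data/input.txt", "../data/input.jsonl")` would produce the source pair, and a second call would produce the target pair. If the actual format differs, a corrected version of this script in the repository would answer the question.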