
Label ascending #4

@sj584


Hi, I'm trying to compare my model with your work using the same data split.

I am in the process of replicating the label-ascending evaluation.

However, the description in the paper is not detailed enough for me to reproduce it.

Could you clarify it?

This paragraph is from the Methods section of your article:
"
The second approach is the label-ascending dataset split method for the experiment to figure out the robustness of our model. This approach offers consistent partitioning protocols across different datasets, reducing the need for extensive training and evaluation times while simultaneously assessing the model’s capacity to handle diverse data intricacies. With this method, data are initially organized based on the antigen–antibody taxonomy. Subsequently, mutant labels for each complex are arranged in ascending order. By adhering to predefined ratios, we select the subset containing the highest labels within the mutant group for each complex as our test set. The remaining subset, comprising a smaller proportion of data, is designated as the training set. After this division, our model and baselines are trained and evaluated using these distinct datasets. This strategy ensures a comprehensive evaluation of our model’s performance across different data distributions and complexities.
"

Here is what I understood, along with the questions it raises.

[By adhering to predefined ratios, we select the subset containing the highest labels within the mutant group for each complex as our test set.]

  1. Given the *order.csv file, I take some portion of the highest labels in each PDB set as the test set.

Is the portion 20%? I saw a 20% ratio somewhere in the GitHub repository,

but the function data.load.split does not seem to be present in this repository.
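
To make the question concrete, here is a minimal sketch of how I currently read the procedure. The 20% ratio, the rounding rule, the function name `label_ascending_split`, and the `(complex_id, label)` record layout are all my own assumptions for illustration; none of them is confirmed by the paper or the repository.

```python
import math
from collections import defaultdict


def label_ascending_split(records, test_ratio=0.2):
    """Split (complex_id, label) records per complex, by ascending label.

    Hypothetical helper: for each complex, the records with the highest
    labels (a `test_ratio` fraction, rounded up) go to the test set and
    the rest go to the training set.
    """
    # Group records by complex (e.g. by PDB ID).
    by_complex = defaultdict(list)
    for complex_id, label in records:
        by_complex[complex_id].append((complex_id, label))

    train, test = [], []
    for group in by_complex.values():
        # Arrange the mutant labels of this complex in ascending order.
        group.sort(key=lambda record: record[1])
        # Assumed rounding rule: at least one test sample per complex.
        n_test = math.ceil(len(group) * test_ratio)
        cut = len(group) - n_test
        train.extend(group[:cut])   # lower labels
        test.extend(group[cut:])    # highest labels
    return train, test


if __name__ == "__main__":
    # Made-up labels, only to show the call.
    demo = [("A", 0.5), ("A", -1.0), ("A", 2.0), ("A", 1.5), ("A", 0.1)]
    train_set, test_set = label_ascending_split(demo, test_ratio=0.2)
    print(train_set)
    print(test_set)
```

If this matches what was done, then with a 20% ratio the training set would be the larger part, which is why the sentence quoted next confuses me.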

[The remaining subset, comprising a smaller proportion of data, is designated as the training set.]

  1. The remaining data becomes the training set, but why is the training set the smaller portion compared to the test set?

Thank you in advance for your reply.
