Description
I am currently replicating the experiments from your paper, specifically those in Table 3, and have encountered an unexpected phenomenon on the Photo_RW dataset.
The accuracies I observe under the default time split (`split=time`) and the random split (`split=random`), compared against the values reported in Table 3 of your paper, are:

| Method | split=time | split=random | Paper (Table 3) |
| --- | --- | --- | --- |
| PLM-Based (tiny) | ≈66% | ≈73% | ≈73% |
| GNN-Based (T-SAGE) | ≈83% | ≈88% | ≈83% |
| Co-Training Based (SAGE(T)) | ≈82% | ≈87% | ≈86% |

Under both split methods, the GNN-Based accuracy is consistently higher than the Co-Training Based accuracy. This contradicts Table 3 of your paper, where Co-Training Based (SAGE(T), ≈86%) outperforms GNN-Based (T-SAGE, ≈83%).
I have included my parameter configurations below for reference. Do you have any suggestions or insights into this inconsistency? Alternatively, could you share the exact run commands or the wandb experiment records for further clarification?