Description
Hi! Thank you for providing such an efficient framework. I’ve encountered an issue where running the same code twice produces different outcomes, even after using `seed_everything` with the same seed and setting `loc_seed` in `sample_core.cpp` to a constant. I found that this inconsistency is caused by two main factors:
- The `update_memory` function of class `MailBox` may update the same indices of a tensor simultaneously (specifically, the lines `self.node_memory[nid.long()] = memory` and `self.node_memory_ts[nid.long()] = ts`), since `nid` may contain duplicate IDs.
- The `update_mailbox` function of class `MailBox` attempts to handle the issue mentioned above. However, the implementation does not work as expected: the line `perm = inv.new_empty(uni.size(0)).scatter_(0, inv, perm)` will not return the same `perm` each time, for the same reason as the first factor.
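To make the ambiguity concrete, here is a minimal NumPy sketch (the variable names are illustrative, not from the repository): when the index array contains duplicates, a scatter-style write has no single well-defined result, because each duplicated position receives a value from whichever colliding writer happens to run last on a parallel backend.

```python
import numpy as np

nid = np.array([3, 1, 3, 2])              # node ID 3 appears twice
memory = np.array([10.0, 20.0, 30.0, 40.0])
node_memory = np.zeros(5)

# On CPU, NumPy's fancy assignment happens to apply writes in order,
# but CUDA scatter kernels give no such guarantee: either 10.0 or
# 30.0 could end up at node_memory[3] depending on thread scheduling.
node_memory[nid] = memory
print(node_memory[3])
```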
To resolve these issues and obtain consistent results across runs (in addition to using seed_everything and setting loc_seed), I made the following modifications.
For the `update_memory` function, I added the following before the lines `self.node_memory[nid.long()] = memory` and `self.node_memory_ts[nid.long()] = ts`:

```python
# Keep only the last occurrence of each node ID so that the
# subsequent scatter writes are deterministic.
np_nid = nid.detach().cpu().numpy()
reversed_indices = np_nid[::-1]
unique_indices, first_indices = np.unique(
    reversed_indices, return_index=True)
last_indices = len(nid) - first_indices - 1
tc_last_indices = torch.from_numpy(
    last_indices).to(self.device)
nid = nid[tc_last_indices]
memory = memory[tc_last_indices]
ts = ts[tc_last_indices]
```

Similar modifications were applied to `update_mailbox`. With these modifications, I obtained identical outcomes across two runs.
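The core of the fix can be isolated as a small pure-NumPy helper, shown below as a sketch (the function name is mine, not from the repository). Reversing the ID array turns `np.unique`'s "first occurrence" indices into "last occurrence" indices in the original order, so a later scatter write deterministically keeps the most recent update per node:

```python
import numpy as np

def keep_last_occurrence(nid):
    """Return indices selecting the LAST occurrence of each ID in nid,
    so a subsequent indexed write is deterministic (last write wins)."""
    reversed_ids = nid[::-1]
    # np.unique reports the FIRST index within the reversed array,
    # which is the LAST occurrence in the original order.
    _, first_in_reversed = np.unique(reversed_ids, return_index=True)
    return len(nid) - first_in_reversed - 1

nid = np.array([3, 1, 3, 2])
keep = keep_last_occurrence(nid)
# keep points at index 2 (the second occurrence of ID 3), never index 0.
print(sorted(nid[keep].tolist()))  # → [1, 2, 3]
```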
I hope this clarifies the problem, and I look forward to an improved solution from you. Thank you for your attention to this matter.