Hi,
I am currently trying to reproduce the HippoRAG results reported in the paper using the MultihopRAG dataset.
However, I am observing significantly lower performance:
- Accuracy: ~42 (vs. 53 in the paper)
- Recall: ~21 (vs. 47 in the paper)
I have a few questions regarding reproducibility:
- Is there a specific commit hash corresponding to the experiments reported in the paper?
- Could you clarify which hyperparameters in HippoRAG.yaml were used for the reported results?
  - I first ran it without changing the hyperparameters, then I set:
    - llm_model_max_token_size = 8000
    - top_k = 4 (for all three parameters)
    - other parameters unchanged
  - Both lead to very similar (low) results for me.
- My generated knowledge graph has ~22k nodes and ~15k edges, while the preprint reports ~35,953 nodes and ~37,173 edges. Were those numbers obtained using Llama-3-8B, or a different model?
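For reference, this is a sketch of the overrides I applied in my second run (key names as they appear in my local copy of HippoRAG.yaml; everything not shown was left at its default):

```yaml
# Overrides used in my second run; all other keys unchanged from the repo defaults.
llm_model_max_token_size: 8000
top_k: 4   # set to 4 for all three top_k-style parameters
```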
Any clarification would be greatly appreciated!
Thanks for making the code available.