Problems Reproducing HippoRAG Results #87

@IsabelleKonrad

Hi,

I am currently trying to reproduce the HippoRAG results reported in the paper using the MultihopRAG dataset.

However, I am observing significantly lower performance:

  • Accuracy: ~42 (vs. 53 in the paper)
  • Recall: ~21 (vs. 47 in the paper)

I have a few questions regarding reproducibility:

  1. Is there a specific commit hash corresponding to the experiments reported in the paper?

  2. Could you clarify which hyperparameters in HippoRAG.yaml were used for the reported results?

    • I first ran it with the default hyperparameters; then I set:
      • llm_model_max_token_size = 8000
      • top_k = 4 (for all three parameters)
      • other parameters unchanged
    • Both runs gave very similar (low) results.

  3. My generated knowledge graph has ~22k nodes and ~15k edges,
    while the preprint reports 35,953 nodes and 37,173 edges.

    • Were those numbers obtained using Llama-3-8B, or a different model?
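
To put the gaps in one place, here is a minimal sketch that expresses each of my numbers as a fraction of the reported one. The figures are the ones quoted above (my graph sizes are approximate); the dictionary keys and the helper name `ratio_to_reported` are only illustrative and are not part of HippoRAG.

```python
# Figures quoted in this issue as (my run, reported in the paper/preprint).
# The 22k / 15k graph sizes are approximate values from my run.
observed_vs_reported = {
    "accuracy": (42, 53),
    "recall": (21, 47),
    "kg_nodes": (22_000, 35_953),
    "kg_edges": (15_000, 37_173),
}


def ratio_to_reported(mine: float, reported: float) -> float:
    """Return my value as a fraction of the reported value."""
    return mine / reported


ratios = {
    name: ratio_to_reported(mine, reported)
    for name, (mine, reported) in observed_vs_reported.items()
}

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.0%} of reported")
```

The edge count falls short by more than the node count (roughly 40% versus roughly 60% of the reported size), and recall falls short by more than accuracy, which is why I suspect the graph construction step rather than the retrieval settings.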

Any clarification would be greatly appreciated!

Thanks for making the code available.
