World knowledge versus RAG knowledge #6

@drdsgvo

As I understand it, the main purpose of RAG is the following use case:

You have a knowledge base containing several documents.
These documents are treated as the ground truth. There is no other ground truth.
You simply cannot trust the world knowledge that was fed into an LLM during pretraining.
We want the LLM to respond in a trustworthy way. That cannot be accomplished by relying on the knowledge that already resides in the LLM.

Just one example: "every" website states that cookies are text files.
This is wrong, yet almost every LLM is fed this wrong knowledge during pretraining.
Cookies have never been text files as such. In earlier days, popular browsers happened to store them as text files. But who knows whether every browser did it that way? Nowadays, cookies are stored in databases, which are not, or need not be, text files.

HotpotQA, like many other benchmarks, is built on Wikipedia, and thus on the same dataset that in most cases was used to pretrain the very LLM that is then tested against HotpotQA questions.

So trying to detect hallucinations does not make much sense if the same knowledge resides both in the LLM being asked and in the RAG documents you feed into that LLM.

A solution could be to rephrase key words in the questions that are fed into DRAGIN.
Just before the RAG step/index search (Elasticsearch or SGPT), reverse that rephrasing (i.e. search with the original question). For the fetched RAG documents, apply the rephrasing again before analyzing/using them in the DRAGIN loop.

So the solution would be to eliminate as much "knowledge" as possible from the LLM by using different terms when asking the LLM. Ideally, these "different terms" do not appear in the documents that the LLM was pretrained on.
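
To make the proposal concrete, here is a minimal sketch of the rephrase / restore / rephrase-again flow in Python. Everything in it is an assumption for illustration: the `TERM_MAP` placeholder terms are invented, and `retrieve` and `generate` are hypothetical callables standing in for the index search (Elasticsearch or SGPT) and the DRAGIN generation loop. It does not use DRAGIN's actual API.

```python
import re

# Hypothetical mapping from well-known terms to invented placeholder terms
# that are unlikely to occur in any pretraining corpus.
TERM_MAP = {
    "cookie": "zorblet",
    "cookies": "zorblets",
    "browser": "quimber",
    "browsers": "quimbers",
}
REVERSE_MAP = {masked: original for original, masked in TERM_MAP.items()}


def _swap_terms(text, mapping):
    """Replace whole-word occurrences of mapping keys (case-insensitive)."""
    # Longest keys first, so "cookies" is tried before "cookie".
    keys = sorted(mapping, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(k) for k in keys) + r")\b",
        re.IGNORECASE,
    )
    # Note: the replacement is always lowercase; original casing is not kept.
    return pattern.sub(lambda m: mapping[m.group(0).lower()], text)


def rephrase(text):
    """Swap original terms for placeholder terms."""
    return _swap_terms(text, TERM_MAP)


def restore(text):
    """Swap placeholder terms back to the original terms."""
    return _swap_terms(text, REVERSE_MAP)


def answer_with_masked_terms(question, retrieve, generate):
    """Run one retrieval-augmented step with masked terminology.

    retrieve(query) -> list of document strings   (hypothetical index search)
    generate(question, docs) -> model output       (hypothetical LLM/DRAGIN step)
    """
    masked_question = rephrase(question)
    # The index contains the original wording, so search with original terms.
    docs = retrieve(restore(masked_question))
    # The LLM should only ever see the placeholder terms.
    masked_docs = [rephrase(doc) for doc in docs]
    return generate(masked_question, masked_docs)
```

With this sketch, `rephrase("Are cookies text files?")` yields `"Are zorblets text files?"`, while the retriever still receives the original question. A simple word map like this ignores synonyms, inflections beyond the listed forms, and capitalization, so a real setup would likely need a more careful term-selection step.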
