WikiChat is part of a research project at Stanford University's Open Virtual Assistant Lab.
Large language model (LLM) chatbots like ChatGPT and GPT-4 are great tools for quick access to knowledge. But they get things wrong a lot, especially if the information you are looking for is recent ("Tell me about the 2024 Super Bowl.") or about less popular topics ("What are some good movies to watch from [insert your favorite foreign director]?").
WikiChat uses an LLM as its backbone, but it makes sure the information it provides comes from a reliable source like Wikipedia, so that its responses are more factual.
We are hosting WikiChat to better understand the system in the wild. Thank you for giving it a try! For further research on factual chatbots, we store conversations conducted on this website in a secure database. Only the text that you submit is stored. We do NOT collect or store any other information.
Given the user input and the history of the conversation, WikiChat performs the following actions:
- Searches Wikipedia to retrieve relevant information.
- Summarizes and filters the retrieved passages.
- Generates a response using a large language model (LLM).
- Extracts claims from the LLM response.
- Fact-checks the claims in the LLM response using additional evidence retrieved from Wikipedia.
- Drafts a response.
- Refines the drafted response.
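The steps above can be sketched as a simple pipeline. This is only an illustrative outline: every function name and all the stub logic below are hypothetical placeholders, not the actual WikiChat implementation, which uses an LLM and a Wikipedia retrieval index at each stage.

```python
# Hypothetical sketch of the seven-stage WikiChat pipeline.
# All functions are simplified stubs standing in for LLM calls and
# Wikipedia retrieval; only the control flow mirrors the steps above.

def retrieve(query, corpus):
    # Stage 1: retrieve passages that share words with the query
    # (stand-in for real Wikipedia search).
    words = set(query.lower().split())
    return [p for p in corpus if words & set(p.lower().split())]

def summarize(passages):
    # Stage 2: summarize and filter retrieved passages
    # (stub: keep non-empty passages as-is).
    return [p for p in passages if p]

def llm_generate(history, user_input):
    # Stage 3: generate a candidate response with the LLM (stub).
    return f"Response about: {user_input}"

def extract_claims(response):
    # Stage 4: break the LLM response into individual claims
    # (stub: split on sentence boundaries).
    return [s.strip() for s in response.split(".") if s.strip()]

def fact_check(claims, corpus):
    # Stage 5: keep only claims supported by retrieved evidence
    # (stub: a claim passes if any passage overlaps with it).
    return [c for c in claims if retrieve(c, corpus)]

def draft(evidence, verified_claims):
    # Stage 6: draft a response from filtered evidence and verified claims.
    return " ".join(evidence + verified_claims)

def refine(draft_text):
    # Stage 7: refine the draft (stub: normalize whitespace).
    return " ".join(draft_text.split())

def wikichat_turn(user_input, history, corpus):
    # One conversational turn through all seven stages.
    evidence = summarize(retrieve(user_input, corpus))
    claims = extract_claims(llm_generate(history, user_input))
    verified = fact_check(claims, corpus)
    return refine(draft(evidence, verified))
```

The key design point this sketch preserves is that generation and verification are separate: the LLM's output is decomposed into claims and checked against retrieved evidence before anything reaches the final response.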
The following figure shows how these steps are applied during a sample conversation about a movie that was upcoming at the time, edited for brevity.
Check out our paper!
Sina J. Semnani, Violet Z. Yao*, Heidi C. Zhang*, and Monica S. Lam. 2023. WikiChat: Stopping the Hallucination of Large Language Model Chatbots by Few-Shot Grounding on Wikipedia. In Findings of the Association for Computational Linguistics: EMNLP 2023, Singapore. Association for Computational Linguistics. [arXiv] [ACL Anthology]
Email: genie@cs.stanford.edu