v-factscore

Evaluates LLM outputs by extracting atomic facts, identifying key entities, and calculating the percentage of facts supported by the provided database.

Building on the original FActScore, this enhanced pipeline quantifies LLM truthfulness in three steps:

Extracting atomic facts from generations;
Retrieving supporting entities with Named Entity Recognition (NER);
Verifying facts against a provided knowledge source.
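The final score described above is the share of atomic facts that the knowledge source supports. The snippet below is only a minimal sketch of that arithmetic. `FactJudgement` and `support_rate` are hypothetical names used for illustration and are not part of the v-factscore API, and the example facts are toy data.

```python
from dataclasses import dataclass


@dataclass
class FactJudgement:
    """One atomic fact and its verdict (hypothetical type, not from v-factscore)."""
    fact: str
    supported: bool


def support_rate(judgements: list[FactJudgement]) -> float:
    """Return the fraction of atomic facts marked as supported."""
    if not judgements:
        # No facts were extracted, so there is nothing to score.
        return 0.0
    return sum(j.supported for j in judgements) / len(judgements)


# Toy verdicts standing in for the output of the verification step.
judgements = [
    FactJudgement("Marie Curie was born in Warsaw.", True),
    FactJudgement("Marie Curie won two Nobel Prizes.", True),
    FactJudgement("Marie Curie was born in 1900.", False),
    FactJudgement("Marie Curie discovered polonium.", True),
]
print(f"{support_rate(judgements):.0%}")
```

Returning 0.0 for an empty list is a choice made for this sketch; the actual pipeline may treat generations without extractable facts differently.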

This version also improves on the original in several ways:

Higher throughput through asynchronous API queries;
Higher accuracy through Named Entity Recognition (NER) integration;
More reliable document retrieval using a sharded FAISS vector index that matches titles by semantic similarity rather than character-level comparison;
Automatic topic extraction.
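To illustrate why asynchronous queries help, the sketch below issues several simulated requests concurrently with `asyncio`, so the total wait is close to one round trip rather than the sum of all of them. It is an assumption-laden illustration, not the project's code: `verify_fact` is a hypothetical stand-in that sleeps instead of calling an API, and its verdict is a placeholder.

```python
import asyncio


async def verify_fact(fact: str) -> tuple[str, bool]:
    """Stand-in for one API round trip (hypothetical; makes no network call)."""
    await asyncio.sleep(0.01)  # simulates network latency
    return fact, "Paris" in fact  # placeholder verdict, not a real check


async def verify_all(facts: list[str], max_concurrency: int = 8) -> list[tuple[str, bool]]:
    """Verify facts concurrently, capping the number of in-flight requests."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def bounded(fact: str) -> tuple[str, bool]:
        async with semaphore:
            return await verify_fact(fact)

    # gather returns results in the same order as the input facts.
    return await asyncio.gather(*(bounded(f) for f in facts))


results = asyncio.run(verify_all(["Paris is in France.", "Berlin is in Spain."]))
print(results)
```

The semaphore is one common way to respect API rate limits while still overlapping requests; the cap of 8 is an arbitrary value chosen for the example.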

Prerequisites

Knowledge source. A reference database containing a table with two columns: title and text. A pre-built .db file of the Wikipedia 2023/04/01 dump can be downloaded directly from here.
Embeddings. Vector representations of knowledge source titles (article titles). Pre-computed embeddings from the Wikipedia 2023/04/01 dump, generated using the sentence-transformers/all-mpnet-base-v2 model, are available here.
Trained FAISS Index. A FAISS IVF index trained on the embeddings above. The index must be trained on the same embeddings to ensure compatibility and good retrieval performance. If the trained index is too large (>5GB), it may not fit in RAM; see factscore/create_index.py for how to handle this case.
API Configuration. As this implementation uses model APIs, you must set base URLs and API keys in their corresponding environment variables before execution.

export EMBEDDINGS_API_KEY="key-for-embeddings"
export COMPLETIONS_API_KEY="key-for-completions"

export EMBEDDINGS_BASE_URL="https://embeddings-api.url"
export COMPLETIONS_BASE_URL="https://completions-api.url"
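Since all four variables must be set before execution, a quick check can catch a missing one early. The helper below is a minimal sketch; `missing_env_vars` is a hypothetical function written for this example and is not provided by v-factscore.

```python
import os

# Variable names taken from the export lines above.
REQUIRED_VARS = (
    "EMBEDDINGS_API_KEY",
    "COMPLETIONS_API_KEY",
    "EMBEDDINGS_BASE_URL",
    "COMPLETIONS_BASE_URL",
)


def missing_env_vars(environ=os.environ) -> list[str]:
    """Return the required variable names that are unset or empty."""
    return [name for name in REQUIRED_VARS if not environ.get(name)]


# Example with an explicit mapping; call missing_env_vars() to check the real environment.
print(missing_env_vars({"EMBEDDINGS_API_KEY": "key-for-embeddings"}))
```

An empty list means the configuration is complete.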

Run

Create a new Python 3.11+ environment with conda

Install the requirements

cd v-factscore
pip install -r requirements.txt

Initialize the factscore instance

from factscore.factscorer import FactScorer

fs = FactScorer()

Use the knowledge source database:

fs.register_knowledge_source(faiss_index="path/to/index",
                             data_db="path/to/database", 
                             table_name="tablename")
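The data_db and table_name arguments above point at a database table with title and text columns. As a rough sketch, assuming the .db file is a SQLite database (this README does not name the engine), a small custom knowledge source could be built as follows. `build_knowledge_db`, the `documents` table name, and the sample row are hypothetical and only illustrate the two-column layout.

```python
import os
import sqlite3
import tempfile


def build_knowledge_db(path: str, table_name: str, rows: list[tuple[str, str]]) -> None:
    """Create a table with title and text columns and insert the given rows."""
    # Table names cannot be bound as SQL parameters, so validate before formatting.
    if not table_name.isidentifier():
        raise ValueError(f"unsafe table name: {table_name!r}")
    conn = sqlite3.connect(path)
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                f"CREATE TABLE IF NOT EXISTS {table_name} (title TEXT, text TEXT)"
            )
            conn.executemany(
                f"INSERT INTO {table_name} (title, text) VALUES (?, ?)", rows
            )
    finally:
        conn.close()


# Build a one-row toy database in a temporary directory.
db_path = os.path.join(tempfile.mkdtemp(), "toy.db")
build_knowledge_db(
    db_path,
    "documents",
    [("Marie Curie", "Marie Curie was a physicist and chemist.")],
)
```

A database built this way would still need matching title embeddings and a FAISS index trained on them, as described under Prerequisites.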

Score generations

res = fs.get_score(generations=[generation1, generation2], k=1)

See demo.ipynb for more details.

License

This project is licensed under the MIT License — see the LICENSE file for details.
