An AI-powered fact-checking tool for professional journalism
xFC-LLM is an LLM-based framework designed to emulate expert-driven fact-checking processes through automated verification techniques. The tool integrates three core methodologies from manual fact-checking:
- Cross-Checking
- Discourse Marker Analysis
- Task-Specific Guardrails for Fact-Checking
Example usage:
./install.sh
source .venv/bin/activate
python3 run.py --claim "Climate change is a hoax"
API Integration
- GoogleFactCheck module aggregates data from Google Fact Check Tools to get credibility scores
- ClaimBuster module aggregates data from the ClaimBuster API to get credibility scores
Example usage:
from integrations.scores import GoogleFactCheck, ClaimBuster
google_checker = GoogleFactCheck()
claimbuster = ClaimBuster()
sample_text = "Climate change is a hoax"
print("Google FactCheck Results:", google_checker.get_results(sample_text))
print("ClaimBuster Score:", claimbuster.get_score(sample_text))
Sample model output:
Google FactCheck Results: ['False', 'False', 'Exaggerates', 'Misleading', 'False', 'Not the Whole Story', 'Spins the Facts', 'Four Pinocchios', 'False', 'False']
ClaimBuster Score: 0.5001685622
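The Google FactCheck result is a list of free-text ratings from different publishers, so it can help to condense it before reading it next to the numeric ClaimBuster score. The snippet below is only a sketch of one way to do that: `summarize_ratings` is a hypothetical helper, not part of the tool, and the ratings are copied from the sample output above.

```python
from collections import Counter

# Ratings copied from the sample Google FactCheck output above.
ratings = ['False', 'False', 'Exaggerates', 'Misleading', 'False',
           'Not the Whole Story', 'Spins the Facts', 'Four Pinocchios',
           'False', 'False']


def summarize_ratings(ratings):
    """Hypothetical helper: count each textual rating and report the most frequent one."""
    if not ratings:
        return {"counts": {}, "most_common": None, "share": 0.0}
    counts = Counter(ratings)
    top_label, top_count = counts.most_common(1)[0]
    return {
        "counts": dict(counts),
        "most_common": top_label,
        # Fraction of all returned ratings that carry the most frequent label.
        "share": top_count / len(ratings),
    }


summary = summarize_ratings(ratings)
print(summary["most_common"], summary["share"])  # False 0.5
```

Counting exact strings treats 'False' and 'Four Pinocchios' as unrelated labels; mapping publisher-specific ratings onto a shared scale would need an explicit, hand-maintained table.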
Scoring
- The tool queries the Google Fact Check Tools API and the ClaimBuster API to obtain credibility scores for the user query
- The tool searches vectorized versions of the FEVER and LIAR datasets for claims similar to the user query
Vector Database
We provide vectorized versions of two popular fact-checking datasets, FEVER and LIAR.
See scripts/vectorized.py for our vectorization script, and the indices and data folders for the resulting data storage (faiss and parquet files used for vector similarity search).
Vector Similarity Search
Our script searches LIAR and FEVER for the samples most similar to the query and returns each match together with its L2 distance (a lower distance means a closer match).
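To illustrate what an L2 nearest-neighbour lookup does, here is a minimal brute-force sketch in pure Python. It is not the project's implementation, which relies on faiss indices. The function `l2_search`, the toy 3-dimensional vectors, and the record fields are all assumptions made for illustration, and the embedding model is not shown.

```python
import math


def l2_search(query_vec, index_vecs, records, k=3):
    """Return the k records closest to query_vec by Euclidean (L2) distance."""
    scored = []
    for vec, record in zip(index_vecs, records):
        # math.dist computes the Euclidean distance between two points.
        scored.append({"distance": math.dist(query_vec, vec), **record})
    # Smaller distance means a more similar sample, so sort ascending.
    scored.sort(key=lambda item: item["distance"])
    return scored[:k]


# Toy "embeddings" with made-up values; real ones would come from an embedding model.
index_vecs = [[0.0, 0.0, 1.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
records = [
    {"text": "claim A", "label": "true"},
    {"text": "claim B", "label": "half-true"},
    {"text": "claim C", "label": "barely true"},
]

query = [0.9, 0.1, 0.0]
results = l2_search(query, index_vecs, records, k=2)
for hit in results:
    print(round(hit["distance"], 3), hit["label"], hit["text"])
```

A linear scan like this only works for small collections; an index library such as faiss is what makes the same lookup practical on full datasets.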
Sample scoring output:
{'discourse_annotation': {'checks': [], 'markers': []},
'scoring_results': {'chunks': [{'results': [{'distance': 1.3203424215316772,
'label': 'SUPPORTS',
'text': 'Proponents of globalism '
'tend to advocate for '
'modification of '
'economic policy'},
{'distance': 1.4274979829788208,
'label': 'NOT ENOUGH INFO',
'text': 'There exists a 2000 '
'page novel called The '
'Winds of Winter.'},
{'distance': 1.4332990646362305,
'label': 'NOT ENOUGH INFO',
'text': 'There exists a vegan '
'and an atheist called '
'Ted Cruz.'}],
'source': 'FEVER'},
{'results': [{'distance': 0.243112251162529,
'label': 'true',
'text': 'Global warming is a '
'hoax.'},
{'distance': 0.45072153210639954,
'label': 'barely true',
'text': 'Says Donald Trump says '
'climate change is a '
'hoax invented by the '
'Chinese.'},
{'distance': 0.6204410791397095,
'label': 'half-true',
'text': 'Only 3 percent of '
'voters 18 to 34 dont '
'believe that climate '
'change is really '
'happening.'}],
'source': 'LIAR'}],
'scores': [{'result': ['Spins the Facts',
'Misleading',
'False',
'Not the Whole Story',
'False',
'False',
'Exaggerates',
'Four Pinocchios',
'False',
'False'],
'source': 'GoogleFactCheck'},
{'result': 0.5001685622,
'source': 'ClaimBuster'}]}}
The app creates temporary JSON storage for credibility scores, chunks, consistency checks, and discourse analysis results.
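As a rough sketch of what temporary JSON storage can look like, the snippet below writes a result dictionary to a temporary file and reads it back using only the standard library. The helper names `save_results` and `load_results` are hypothetical, and the app's actual file layout and cleanup policy may differ; the keys mirror the sample scoring output.

```python
import json
import os
import tempfile


def save_results(results):
    """Write results to a temporary JSON file and return its path."""
    # delete=False keeps the file on disk after the handle is closed.
    with tempfile.NamedTemporaryFile(
        "w", suffix=".json", delete=False, encoding="utf-8"
    ) as fh:
        json.dump(results, fh, ensure_ascii=False, indent=2)
        return fh.name


def load_results(path):
    """Read results back from a JSON file."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


# Trimmed-down structure using the same top-level keys as the sample output.
results = {
    "discourse_annotation": {"checks": [], "markers": []},
    "scoring_results": {
        "scores": [{"source": "ClaimBuster", "result": 0.5001685622}]
    },
}

path = save_results(results)
restored = load_results(path)
os.remove(path)  # the caller is responsible for cleanup in this sketch
```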
