Claim2Source at CheckThat! 2025: Zero-Shot Style Transfer for Scientific Claim-Source Retrieval

This repository contains the implementation, documentation, results, and datasets for Claim2Source's participation in the CheckThat! 2025 Lab Task 4b on scientific claim-source retrieval.

Abstract

In this paper, we present Claim2Source’s participation in the CheckThat! 2025 Task 4b on scientific claim-source retrieval. Our work systematically explores the impact of style transfer on retrieval performance using a dataset of COVID-19-related tweets and scientific publications. We apply seven distinct style transfer methods, distributed across claims and sources, to assess their combined impact on retrieval performance. These style transfer methods are evaluated across 15 retrieval systems, including 1 sparse, 7 dense, and 7 hybrid models, by testing each system with all combinations of claim and source styles. To guide the style transfer process, we employ a modular zero-shot prompting template with detailed instructions using a large language model (LLM). Our results show that GritLM-7B achieves the best performance without style transfer, suggesting strong robustness to informal text. In contrast, most other models, especially sparse and hybrid ones, benefit from applying a formal writing style to claims. Notably, we observe hybrid retrieval models tend to outperform their dense counterparts in most cases, highlighting the potential advantage of integrating sparse and dense retrieval paradigms.

Installation

Python 3.10 and PyTorch 2.1.2 were used to perform the experiments. To install the required dependencies, run the following command:

pip install -r requirements.txt

Datasets

The dataset for the CheckThat! 2025 Subtask 4b comprises a query set of tweets and a collection set of candidate papers drawn from the CORD-19 corpus. The query set includes 14,399 tweets with implicit references to scientific literature, partitioned into training, development (1,400 tweets), and test (1,446 tweets) subsets, each annotated with the unique identifier of the referenced paper. The collection set consists of metadata for 7,718 CORD-19 papers. We applied style transfer to both the queries and the source abstracts. The resulting datasets can be found in out/synthetic_abstracts and out/synthetic_queries.

Retrieval Systems

For our experiments, we implemented 15 different retrieval models: 1 sparse, 7 dense, and 7 hybrid.
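The hybrid models combine a sparse ranking (e.g., BM25) with a dense one. The repository's exact fusion method is not shown in this README, so the sketch below only illustrates one common scheme, reciprocal rank fusion (RRF), as an example of how two rankings can be merged:

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids via RRF.

    Each document's fused score is the sum of 1 / (k + rank)
    over all rankings it appears in (rank is 1-based).
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

sparse = ["d1", "d2", "d3"]  # hypothetical BM25 ranking
dense = ["d3", "d1", "d4"]   # hypothetical dense ranking
fused = reciprocal_rank_fusion([sparse, dense])
# d1 (ranks 1 and 2) ends up ahead of d3 (ranks 3 and 1)
```

The constant k dampens the influence of top ranks; 60 is the value commonly used in the RRF literature.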

Indexing

For all dense retrieval models we created one Faiss index per source style.

synthetic_abstracts = False

# One Faiss index is created per source style (style set in config.py)
retrieval_systems = [
    SentenceTransformerRetrievalSystem(model_name="intfloat/e5-large-v2"),
    SentenceTransformerRetrievalSystem(model_name="all-MiniLM-L6-v2"),
    SentenceTransformerRetrievalSystem(model_name="all-mpnet-base-v2"),
    SentenceTransformerRetrievalSystem(model_name="gtr-t5-xl"),
    GritLMRetrievalSystem(),
    SentenceTransformerRetrievalSystem(model_name="malteos/scincl"),
    SentenceTransformerRetrievalSystem(model_name="allenai/specter"),
]
for st in retrieval_systems:
    st.save_corpus_embeddings(synthetic_abstracts=synthetic_abstracts)
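At query time, a dense index like the ones built above answers maximum inner-product search between a query embedding and the stored corpus embeddings. A minimal NumPy sketch of that lookup (the repository uses Faiss for this; the brute-force version below is only meant to show what the index computes):

```python
import numpy as np

def top_k_inner_product(query_emb, corpus_embs, k=5):
    """Return indices of the k corpus vectors with the highest
    inner product to the query vector (brute-force equivalent
    of a flat inner-product index)."""
    scores = corpus_embs @ query_emb       # one score per document
    return np.argsort(-scores)[:k].tolist()

corpus = np.array([[1.0, 0.0],
                   [0.0, 1.0],
                   [0.7, 0.7]])            # toy corpus embeddings
query = np.array([0.9, 0.1])               # toy query embedding
top = top_k_inner_product(query, corpus, k=2)
```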

Ranking

Run retrieval to get rankings using the following code:

k_queries = -1  # Number of queries to use (-1: all)
synthetic_abstracts = False  # Set to True if style transfer should be used (specify style in config.py)
synthetic_queries = False  # Set to True if style transfer should be used (specify style in config.py)

# Run ranking for every retrieval system
for rs in [BM25RetrievalSystem(),
           SentenceTransformerRetrievalSystem(model_name="intfloat/e5-large-v2"),
           SentenceTransformerRetrievalSystem(model_name="all-MiniLM-L6-v2"),
           SentenceTransformerRetrievalSystem(model_name="all-mpnet-base-v2"),
           SentenceTransformerRetrievalSystem(model_name="gtr-t5-xl"),
           GritLMRetrievalSystem(),
           SentenceTransformerRetrievalSystem(model_name="malteos/scincl"),
           SentenceTransformerRetrievalSystem(model_name="allenai/specter")]:
    ranking(retrieval_system=rs, k_queries=k_queries,
            synthetic_queries=synthetic_queries, synthetic_abstracts=synthetic_abstracts)

Run ranking for the CheckThat! 2025 Task 4b submission using the following code (make sure the ranking files already exist):

retrieval_systems = [GritLMRetrievalSystem()]
checkthat_submission(retrieval_systems=retrieval_systems, synthetic_queries=True, synthetic_abstracts=False)

Style Transfer

We developed a modular zero-shot prompting template for style transfer, covering four styles for claims and three styles for source documents. The template consists of four components: context, task, instructions, and output specification. The context describes the overall retrieval objective and is adapted depending on whether the input is a claim or a source. The task defines the LLM's role, for example, generating a scientific question from a tweet. The instructions provide style transfer guidelines, including tone and structure. The output specification ensures uniform response formatting. All prompts can be found in the folder prompts/. For all experiments, we use the LLaMA 3.3 70B Instruct model.
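The four-component structure can be illustrated with a small assembly function. The component texts below are placeholders for illustration only, not the repository's actual prompts (those live in prompts/):

```python
def build_prompt(context, task, instructions, output_spec):
    """Assemble a zero-shot style-transfer prompt from its four
    modular components, separated by blank lines."""
    return "\n\n".join([context, task, instructions, output_spec])

# All four component strings below are hypothetical wording
prompt = build_prompt(
    context="We retrieve scientific papers implicitly referenced by tweets.",
    task="Rewrite the tweet below as a scientific question.",
    instructions="Use a formal, academic tone and complete sentences.",
    output_spec="Return only the rewritten text, without commentary.",
)
```

Because the components are independent strings, swapping the instructions block is enough to switch between target styles while keeping the context, task, and output specification fixed.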


Perform style transfer for claims and/or sources using the following code (the style can be specified in config.py):

run_style_transfer_query(top_k=-1)
run_style_transfer_corpus(top_k=-1)

We prompt LLaMA 3.3 70B Instruct via an API; set the base URL in llm/llm_request.py before running style transfer.

Evaluation

Style transfer for claim-source retrieval is evaluated using the MRR@5 score. Run the following code for evaluation:

retrieval_systems = [BM25RetrievalSystem(),
                     SentenceTransformerRetrievalSystem(model_name="all-MiniLM-L6-v2"),
                     SentenceTransformerRetrievalSystem(model_name="all-mpnet-base-v2"),
                     SentenceTransformerRetrievalSystem(model_name="intfloat/e5-large-v2"),
                     SentenceTransformerRetrievalSystem(model_name="gtr-t5-xl"),
                     GritLMRetrievalSystem(),
                     SentenceTransformerRetrievalSystem(model_name="malteos/scincl"),
                     SentenceTransformerRetrievalSystem(model_name="allenai/specter")]

results = evaluate_retrieval_systems(retrieval_systems=retrieval_systems, k_queries=-1, synthetic_queries=True,
                                     synthetic_abstracts=False)
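MRR@5 averages, over all queries, the reciprocal rank of the gold paper within the top 5 retrieved documents (contributing 0 when it is not retrieved in the top 5). A minimal reference implementation, independent of the repository's evaluation code:

```python
def mrr_at_k(rankings, gold_ids, k=5):
    """Mean reciprocal rank at cutoff k.

    rankings: list of ranked document-id lists, one per query
    gold_ids: list of the correct document id for each query
    """
    total = 0.0
    for ranking, gold in zip(rankings, gold_ids):
        top = ranking[:k]
        if gold in top:
            total += 1.0 / (top.index(gold) + 1)  # 1-based rank
    return total / len(rankings)

# Toy example: gold paper ranked first, third, and absent
score = mrr_at_k([["p1", "p2"], ["p4", "p5", "p3"], ["p9"]],
                 ["p1", "p3", "p7"])
# -> (1 + 1/3 + 0) / 3
```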

License
