This folder implements two core reliability components for MicroTraitLLM:
- Article Validation – scores and filters retrieved papers before they are passed to the LLM.
- Citation Accuracy Checking – post-processes model outputs to detect and correct hallucinated or mismatched citations.
These components correspond to the Article Validation and Citation Accuracy tasks in the overall MicroTraitLLM project.
This stage ensures that only high-quality, relevant, and accessible articles are used as context in the RAG pipeline.
Inputs:
- `query` – user query string.
- `articles` – list of `Article` objects, each containing: `pmcid`, `doi`, `title`, `abstract`, `journal`, `year`, `citation_count`, `is_peer_reviewed`, `is_retracted`.
Outputs:
- The same list of `Article` objects, each annotated with:
  - `validation_score` ∈ [0, 1]
  - `confidence_label` ∈ {`high`, `medium`, `low`, `invalid_id`, `unknown`}
By default, low-confidence articles are filtered out.
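The score-to-label mapping and default filtering can be sketched as follows; the 0.75/0.5 cutoffs here are illustrative assumptions, not the module's actual thresholds:

```python
def label_for_score(score: float) -> str:
    """Map a composite validation_score in [0, 1] to a confidence label.

    The cutoffs below are placeholders for the module's real thresholds.
    """
    if score >= 0.75:
        return "high"
    if score >= 0.5:
        return "medium"
    return "low"


def filter_low_confidence(articles: list[dict]) -> list[dict]:
    """Drop articles labeled 'low' (the default filtering behavior)."""
    return [a for a in articles
            if label_for_score(a["validation_score"]) != "low"]
```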
The composite score is a weighted combination of:
- Recency – newer articles score higher.
- Source reputation – tiered by journal and peer-review status.
- Citation count – log-scaled and normalized.
- Topic relevance – cosine similarity between query and article text embeddings.
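The four components above might be combined roughly as in the sketch below; the weight values and normalization constants (`WEIGHTS`, `horizon`, `cap`) are illustrative assumptions, since the real weighting is a configuration point of the module:

```python
import math
from datetime import date

# Illustrative weights only -- the module's actual weighting is configurable.
WEIGHTS = {"recency": 0.25, "reputation": 0.25, "citations": 0.2, "relevance": 0.3}


def recency_score(year: int, horizon: int = 20) -> float:
    """Newer articles score higher; articles older than `horizon` years get 0."""
    age = max(0, date.today().year - year)
    return max(0.0, 1.0 - age / horizon)


def citation_score(count: int, cap: int = 10_000) -> float:
    """Log-scale the raw citation count and normalize to [0, 1]."""
    return min(1.0, math.log1p(count) / math.log1p(cap))


def composite_score(recency: float, reputation: float,
                    citations: float, relevance: float) -> float:
    """Weighted combination of the four component scores, each in [0, 1]."""
    parts = {"recency": recency, "reputation": reputation,
             "citations": citations, "relevance": relevance}
    return sum(WEIGHTS[k] * parts[k] for k in WEIGHTS)
```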
The implementation is exposed via:

```python
from validation import validate_articles

filtered_articles = validate_articles(query, retrieved_articles)
```

Configuration points:
- Journal tiers (`JOURNAL_TIERS`)
- Weighting of score components
- Identifier accessibility check (`check_identifier_accessibility`)
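A tier table used for the source-reputation component might look like the sketch below; the journal names, tier assignments, and the `reputation_score` helper are illustrative, not the module's actual `JOURNAL_TIERS` data:

```python
# Illustrative tier table -- the real mapping lives in JOURNAL_TIERS.
JOURNAL_TIERS = {
    "Nature Microbiology": 1,
    "The ISME Journal": 1,
    "Applied and Environmental Microbiology": 2,
}
TIER_SCORES = {1: 1.0, 2: 0.7, 3: 0.4}  # unknown journals fall into tier 3


def reputation_score(journal: str, is_peer_reviewed: bool) -> float:
    """Tiered journal score, penalized when the venue is not peer reviewed."""
    base = TIER_SCORES[JOURNAL_TIERS.get(journal, 3)]
    return base if is_peer_reviewed else base * 0.5
```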
This stage reduces citation hallucinations by verifying that:
- Each citation in the answer corresponds to a real article in the retrieved corpus.
- The cited passage is semantically consistent with the referenced article.
Inputs:
- `answer_body` – main answer text (with inline numeric citations like `[1]`).
- `ref_text` – reference list generated by the model (e.g., lines starting with `[1]`, `[2]`, etc.).
- `retrieved_articles` – the same `Article` objects used in retrieval.
Outputs:
- `cleaned_answer_body` – answer text with hallucinated citations removed.
- `report_list` – per-citation diagnostics:
  - `raw_citation` (e.g. `[1]`)
  - `identifier` (PMCID/DOI or `None`)
  - `status` – `valid`, `mismatch`, or `not_found`
  - `similarity` – embedding similarity score
Checking steps:
- Extract numeric citations from the answer (e.g. `[1]`, `[2]`).
- Parse the reference list and map each number to a PMCID or DOI.
- Match each identifier to a retrieved article.
- Extract a local context window around the citation token.
- Compute embedding similarity between the context and the article's title + abstract.
- Flag citations as:
  - `valid` (similarity above threshold)
  - `mismatch` (article found but content does not align)
  - `not_found` (no article with that identifier in the corpus)
Usage:

```python
from validation import check_citations

cleaned_answer, report = check_citations(
    answer_body=answer_body,
    ref_text=ref_section,
    retrieved_articles=filtered_articles,
    similarity_threshold=0.6,
)
```

These subsystems plug into the MicroTraitLLM RAG pipeline as follows:
- Retrieval – given a query, retrieve candidate articles.
- Article Validation – call `validate_articles` to score and filter the candidates.
- LLM Generation – pass validated articles as context to the model.
- Citation Accuracy – call `check_citations` on the model's answer and reference list.
- Final Output – return the cleaned answer plus an optional citation report.
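Gluing the stages together might look like the following; `retrieve`, `generate_answer`, and `split_answer` are hypothetical stand-ins, and the validate/check functions here are simplified mocks of the real API, kept only so the sketch is self-contained:

```python
def retrieve(query):  # stand-in for the retrieval stage
    return [{"pmcid": "PMC1", "title": "Example", "validation_score": 0.9}]

def validate_articles(query, articles):  # simplified mock of the real API
    return [a for a in articles if a["validation_score"] >= 0.5]

def generate_answer(query, articles):  # stand-in for the LLM call
    return "Microbes do X [1].\n\nReferences:\n[1] Example. PMC1"

def split_answer(raw):  # assumed helper separating body from reference list
    body, _, refs = raw.partition("\nReferences:\n")
    return body, refs

def check_citations(answer_body, ref_text, retrieved_articles):
    """Simplified mock: mark a reference valid if its trailing token is a
    known identifier in the retrieved corpus."""
    known = {a["pmcid"] for a in retrieved_articles}
    report = [{"identifier": line.rsplit(" ", 1)[-1],
               "status": "valid" if line.rsplit(" ", 1)[-1] in known
                         else "not_found"}
              for line in ref_text.splitlines() if line]
    return answer_body, report

def answer_query(query):
    candidates = retrieve(query)                      # Retrieval
    validated = validate_articles(query, candidates)  # Article Validation
    raw = generate_answer(query, validated)           # LLM Generation
    body, refs = split_answer(raw)
    return check_citations(body, refs, validated)     # Citation Accuracy
```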
- Replace the `embed_text` stub with the production embedding model.
- Implement real PMCID/DOI checks in `check_identifier_accessibility` using NCBI/PMC APIs or a local metadata cache.
- Allow scoring weights and journal tiers to be configured via YAML/JSON.
- Add unit tests and benchmark evaluation:
  - Precision/recall for valid article retention.
  - Precision/recall/F1 for citation validation.
- Extend citation parsing to handle additional formats (e.g., inline PMCIDs, author-year styles).