This repository contains notebooks and resources that demonstrate how to build RAG (Retrieval-Augmented Generation) applications for medical literature analysis using Google Cloud BigQuery vector search and Vertex AI Gemini models.
The project converts the user experience from the Capricorn Medical Research Application into interactive Colab notebooks, making it accessible for both clinicians and data scientists.
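
As a rough illustration of the retrieval step in a RAG pipeline, the sketch below ranks a few in-memory article embeddings by cosine similarity to a query embedding and places the best matches in a prompt. It is a toy stand-in, not the notebooks' code: the notebooks retrieve with BigQuery vector search and generate with Gemini, while the vectors, article IDs, and the `cosine_similarity` and `top_k_articles` helpers here are invented for illustration.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # Treat a zero vector as matching nothing.
    return dot / (norm_a * norm_b)


def top_k_articles(query_vec, articles, k=2):
    """Hypothetical helper: IDs of the k articles most similar to the query."""
    scored = [
        (cosine_similarity(query_vec, vec), article_id)
        for article_id, vec in articles.items()
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [article_id for _, article_id in scored[:k]]


# Made-up 3-dimensional embeddings; real embeddings have hundreds of dimensions.
ARTICLES = {
    "article_a": [0.9, 0.1, 0.0],
    "article_b": [0.1, 0.9, 0.0],
    "article_c": [0.7, 0.3, 0.1],
}

query = [1.0, 0.0, 0.0]
retrieved = top_k_articles(query, ARTICLES, k=2)

# The retrieved articles would then be passed to the generative model as context.
prompt = "Answer using these articles: " + ", ".join(retrieved)
```

In a production setting the similarity search runs inside the database rather than in Python, so only the top matches leave storage.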
- Open the Clinician Example notebook
- Click Runtime → Run all (or press Ctrl/Cmd + F9)
- Authenticate with your Google account
- Use the interactive Gradio app to:
  - Paste your medical case notes
  - Extract disease and events automatically
  - Search and analyze PubMed literature
  - Generate comprehensive analysis reports
- Open the Data Scientist Example notebook
- Configure your Google Cloud project
- Customize the analysis pipeline:

    # Define custom scoring criteria
    CUSTOM_CRITERIA = [
        {"name": "clinical_trial", "weight": 50},
        {"name": "pediatric_focus", "weight": 60},
        # Add your own criteria
    ]

    # Process medical case
    results = process_medical_case(
        case_text,
        default_articles=10,
        min_per_event=3
    )

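
One plausible reading of the `weight` values is that an article earns those points for each criterion it satisfies. The sketch below assumes that additive interpretation; the notebook may combine criteria differently. The `score_article` function, the boolean `matches` dictionaries, and the article IDs are hypothetical and not part of the notebook's API.

```python
CUSTOM_CRITERIA = [
    {"name": "clinical_trial", "weight": 50},
    {"name": "pediatric_focus", "weight": 60},
]


def score_article(matches, criteria):
    """Hypothetical scorer: sum the weights of every criterion the article meets.

    `matches` maps a criterion name to True/False; missing names count as False.
    """
    return sum(c["weight"] for c in criteria if matches.get(c["name"], False))


# Made-up per-article criterion flags, for illustration only.
article_matches = {
    "article_a": {"clinical_trial": True, "pediatric_focus": False},
    "article_b": {"clinical_trial": True, "pediatric_focus": True},
    "article_c": {},
}

scores = {
    article_id: score_article(flags, CUSTOM_CRITERIA)
    for article_id, flags in article_matches.items()
}

# Rank articles from highest to lowest score.
ranked = sorted(scores, key=scores.get, reverse=True)
```

Under this assumption, raising a criterion's weight moves articles that satisfy it higher in the ranking relative to the others.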
See CONTRIBUTING.md for details.
Apache 2.0; see LICENSE for details.
If you use this work in your research, please cite:
@software{pubmed_rag_2025,
author = {Zhang, Willis and Jiang, Stone},
title = {PubMed RAG: Medical Literature Analysis with BigQuery and Gemini},
year = {2025},
url = {https://github.com/google/pubmed-rag}
}

This project is not an official Google project. It is not supported by Google and Google specifically disclaims all warranties as to its quality, merchantability, or fitness for a particular purpose.
