An AI-powered daily academic paper recommendation system that helps researchers stay up to date with the latest arXiv publications. PaperSurf fetches daily papers from arXiv, uses topic modeling to calculate similarity with papers from your Zotero library, and generates personalized email recommendations to help you keep pace with the daily paper waves.
- Daily arXiv Monitoring: Automatically fetches new papers from specified arXiv categories
- Intelligent Matching: Uses embedding models to calculate similarity between new papers and your research interests
- Multi-Directory Zotero Integration: Analyzes papers from multiple Zotero directories to understand different research interests and generate targeted recommendations
- Email Notifications: Generates HTML email digests with top paper recommendations
- Configurable: Customize categories, models, and recommendation parameters
The easiest way to use PaperSurf is through GitHub Actions, which runs the pipeline automatically in the cloud:
- EMAIL_SENDER: Your Gmail address
- EMAIL_RECEIVER: Email address to receive recommendations
- EMAIL_PASSWORD: Gmail app-specific password
- ZOTERO_API_KEY: Your Zotero API key
- ZOTERO_LIBRARY_ID: Your Zotero library ID
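
Before a run, it can help to confirm that the five variables named above are actually set. The following is a minimal sketch, not part of PaperSurf itself; the helper name `missing_vars` is hypothetical.

```python
import os

# Variable names are taken from the list above.
REQUIRED_VARS = [
    "EMAIL_SENDER",
    "EMAIL_RECEIVER",
    "EMAIL_PASSWORD",
    "ZOTERO_API_KEY",
    "ZOTERO_LIBRARY_ID",
]


def missing_vars(env=None):
    """Return the names of required variables that are unset or empty.

    Hypothetical helper for illustration; `env` defaults to os.environ.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


# Report what is missing in the current environment without aborting.
print("Missing variables:", missing_vars() or "none")
```

Passing a plain dictionary instead of `os.environ` makes the check easy to try without touching your real credentials.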
For local development or manual runs:
- Python 3.8+
- uv (for dependency management)
- Zotero account with API access
- Clone the repository:

      git clone https://github.com/chenxshuo/paper-surf
      cd paper-surf

- Install dependencies:

      uv venv venv
      source venv/bin/activate
      uv pip install .

- Configure environment variables:

      # Copy and edit the setup script
      cp setup_envs_temp.sh setup_envs.sh
      vi setup_envs.sh  # and add your credentials there
      source setup_envs.sh

- Configure the system: Edit config.yaml to customize (see Configuration section for details):
  - arXiv categories and search parameters
  - Zotero collections and directories
  - Embedding model settings
  - Email notification preferences
  - Output and digest formatting
Run the daily recommendation system:

    python run_daily.py

With custom parameters:

    python run_daily.py --date 2025-07-21 --topk 10

    .
    ├── app/                # Application modules
    │   ├── fetcher.py      # arXiv paper fetching
    │   ├── embedder.py     # Text embedding and similarity
    │   ├── recommender.py  # Recommendation engine
    │   └── notifier.py     # Email notification
    ├── data/               # Data files (embeddings, papers)
    ├── output/             # Generated HTML digests
    ├── config.yaml         # Configuration file
    ├── run_daily.py        # Main application script
    └── pyproject.toml      # Project dependencies
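
For readers curious how flags such as `--date` and `--topk` can be handled, here is a minimal `argparse` sketch. It is an illustration only and not the parser in run_daily.py; the defaults (no date meaning "today", and 5 to mirror `top_k` in config.yaml) are assumptions.

```python
import argparse
from datetime import datetime


def parse_date(text):
    """Parse a YYYY-MM-DD string into a date, as in `--date 2025-07-21`."""
    return datetime.strptime(text, "%Y-%m-%d").date()


def build_parser():
    parser = argparse.ArgumentParser(description="Daily paper recommendations")
    # Assumption: leaving --date unset stands for "use today's date".
    parser.add_argument("--date", type=parse_date, default=None,
                        help="target date, YYYY-MM-DD")
    # Assumption: the default of 5 mirrors top_k in config.yaml.
    parser.add_argument("--topk", type=int, default=5,
                        help="number of recommendations")
    return parser


args = build_parser().parse_args(["--date", "2025-07-21", "--topk", "10"])
print(args.date, args.topk)  # prints "2025-07-21 10"
```

Using a `type=` callable means a malformed date is rejected by the parser itself rather than deeper in the pipeline.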
- Get your Zotero API key from https://www.zotero.org/settings/keys
- Find your library ID from your Zotero profile
- Set these in your environment variables
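
To check that the key and library ID work together, you can build a request against the Zotero Web API (version 3). The sketch below only constructs the URL and headers; it assumes a personal (user) library rather than a group library, and `build_items_request` is a hypothetical helper, not a PaperSurf function. Note that the API addresses collections by key, so mapping a collection path from config.yaml to its key is not shown here.

```python
from urllib.parse import urlencode

ZOTERO_API_BASE = "https://api.zotero.org"


def build_items_request(library_id, api_key, collection_key=None, limit=25):
    """Return (url, headers) for listing items in a personal Zotero library.

    Hypothetical helper; assumes a user library (/users/<id>), not a group.
    """
    path = f"/users/{library_id}"
    if collection_key:
        # Restrict the listing to a single collection, addressed by its key.
        path += f"/collections/{collection_key}"
    query = urlencode({"format": "json", "limit": limit})
    url = f"{ZOTERO_API_BASE}{path}/items?{query}"
    headers = {"Zotero-API-Key": api_key, "Zotero-API-Version": "3"}
    return url, headers


# Placeholder credentials; in practice read them from the environment.
url, headers = build_items_request("1234567", "your-api-key", limit=5)
print(url)  # prints "https://api.zotero.org/users/1234567/items?format=json&limit=5"
```

The returned pair can be passed to any HTTP client, for example `urllib.request.Request(url, headers=headers)`.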
Configure SMTP settings for email notifications:
- Email sender and receiver addresses
- App-specific password for Gmail or other providers
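
As a rough illustration of what these settings are used for, the sketch below builds an HTML email with a plain-text fallback and sends it over SMTP with STARTTLS. It uses only the Python standard library; the function names are hypothetical and this is not the code in app/notifier.py. The server and port defaults match the values shown in config.yaml.

```python
import smtplib
from email.message import EmailMessage


def build_digest_email(sender, receiver, subject, html_body):
    """Build a multipart message: plain-text fallback plus HTML body."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = receiver
    msg["Subject"] = subject
    msg.set_content("This digest is best viewed in an HTML-capable mail client.")
    msg.add_alternative(html_body, subtype="html")
    return msg


def send_email(msg, password, server="smtp.gmail.com", port=587):
    """Send the message over SMTP, upgrading the connection with STARTTLS."""
    with smtplib.SMTP(server, port) as smtp:
        smtp.starttls()
        smtp.login(msg["From"], password)
        smtp.send_message(msg)


# Building the message needs no network access; sending does.
message = build_digest_email(
    "sender@example.com", "receiver@example.com",
    "PaperSurf Daily Digest", "<h1>Today's papers</h1>",
)
print(message.get_content_type())  # prints "multipart/alternative"
```

Calling `send_email(message, password)` with the app-specific password would then deliver the digest; it is left uncalled here so the example runs offline.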
The config.yaml file contains all system settings:
    arxiv:
      categories: ["cs.CL", "cs.LG", "cs.AI", "stat.ML", "cs.CV", "cs.IR"]
      lookback_days: 2  # How many days back to search for papers
      max_results: 10   # Maximum papers to fetch per category

    interest_papers:
      zotero_collections:
        deep_research: "0-PhD-LLM/9-agent-robust/deep-research"
        # Add more collections for different research areas
        # nlp_research: "path/to/nlp/collection"
        # cv_research: "path/to/cv/collection"

    embedding_model: "allenai/specter2_base"  # Embedding model for similarity
    top_k: 5                                  # Number of top recommendations
    use_seen_papers: false                    # Track previously seen papers

    digest:
      title: "PaperSurf Daily Digest"
      subtitle: "AI & Machine Learning Paper Recommendations"
      output_dir: "output"
      open_in_browser: false  # Auto-open digest after generation
      output_path: "output/digest.html"

    pipeline:
      skip_fetch: false  # Skip fetching new papers (for testing)
      skip_embed: false  # Skip embedding generation (for testing)

    notification:
      method: email
      auto_send: true  # Automatically send email notifications
      dry_run: false   # Test mode without sending emails
      email:
        smtp_server: "smtp.gmail.com"
        smtp_port: 587

- Fetch: Downloads new papers from specified arXiv categories
- Embed: Generates embeddings for paper abstracts using transformer models
- Match: Calculates similarity between new papers and your Zotero directories
- Rank: Scores and ranks papers based on relevance to your interests
- Notify: Sends HTML email digest with top recommendations
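
The Match and Rank steps above can be sketched with plain cosine similarity. This is a simplified illustration under stated assumptions, not the logic in app/recommender.py: embeddings are given as lists of floats, each new paper is scored by its mean similarity to the papers in one Zotero collection, and the names `cosine` and `rank_papers` are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rank_papers(candidates, interest_vectors, top_k=5):
    """Score each (title, vector) candidate and return the top_k, best first.

    Assumption: the score is the mean cosine similarity to the interest
    vectors; the real system may aggregate similarities differently.
    """
    scored = []
    for title, vec in candidates:
        sims = [cosine(vec, ref) for ref in interest_vectors]
        score = sum(sims) / len(sims) if sims else 0.0
        scored.append((title, score))
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]


# Toy 2-dimensional "embeddings" standing in for real model output.
interest = [[1.0, 0.0], [1.0, 0.0]]
new_papers = [("A", [1.0, 0.0]), ("B", [0.0, 1.0]), ("C", [1.0, 1.0])]
for title, score in rank_papers(new_papers, interest, top_k=2):
    print(title, round(score, 3))
```

In this toy run, paper A points in the same direction as the interest vectors and ranks first, C is partially aligned and ranks second, and B is orthogonal and falls outside the top two.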













