Overwhelmed by the waves of new papers from arXiv? Try out PaperSurf to get recommended papers! No local setup required thanks to GitHub Actions!

chenxshuo/paper-surf

PaperSurf 🏄‍♂️📄

An AI-powered daily academic paper recommendation system that helps researchers stay up-to-date with the latest arXiv publications. PaperSurf fetches daily papers from arXiv, uses topic modeling to calculate similarity with papers from your Zotero library, and generates personalized email recommendations to help you keep pace with the daily paper waves. 🌊

Features ✨

  • 📅 Daily arXiv Monitoring: Automatically fetches new papers from specified arXiv categories
  • 🧠 Intelligent Matching: Uses embedding models to calculate similarity between new papers and your research interests
  • 📚 Multi-Directory Zotero Integration: Analyzes papers from multiple Zotero directories to understand different research interests and generate targeted recommendations
  • 📧 Email Notifications: Generates HTML email digests with top paper recommendations
  • ⚙️ Configurable: Customize categories, models, and recommendation parameters

Setup 🚀

Option 1: GitHub Actions (Recommended - No Coding, Free, No Local Setup) ☁️

The easiest way to use PaperSurf is through GitHub Actions, which runs the pipeline automatically in the cloud:

1. Fork and Star this repository to your GitHub account

2. Configure secrets in your repository settings (Settings > Secrets and variables > Actions):

  • EMAIL_SENDER: Your Gmail address
  • EMAIL_RECEIVER: Email address to receive recommendations
  • EMAIL_PASSWORD: Gmail app-specific password
  • ZOTERO_API_KEY: Your Zotero API key
  • ZOTERO_LIBRARY_ID: Your Zotero library ID

How to get EMAIL_PASSWORD: create a Gmail app password at https://myaccount.google.com/apppasswords

How to get ZOTERO_API_KEY and ZOTERO_LIBRARY_ID: see https://www.zotero.org/settings

3. Customize configuration by editing config.yaml in your fork

Option 2: Local Installation 💻

For local development or manual runs:

Prerequisites

  • Python 3.8+
  • uv (for dependency management)
  • Zotero account with API access

Installation Steps

  1. Clone the repository:

    git clone https://github.com/chenxshuo/paper-surf
    cd paper-surf
  2. Install dependencies:

    uv venv venv
    source venv/bin/activate
    uv pip install .
  3. Configure environment variables:

    # Copy and edit the setup script
    cp setup_envs_temp.sh setup_envs.sh 
    vi setup_envs.sh # and add your credentials there
    source setup_envs.sh
  4. Configure the system: Edit config.yaml to customize (see Configuration section for details):

    • arXiv categories and search parameters
    • Zotero collections and directories
    • Embedding model settings
    • Email notification preferences
    • Output and digest formatting
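Before the first run, it can help to confirm that the credentials from step 3 are actually visible to Python. The following is a minimal sketch, assuming setup_envs.sh exports the same five names listed as GitHub secrets above; `missing_credentials` is a hypothetical helper, not part of the repository.

```python
import os

# Variable names taken from the secrets list in this README.
# Assumption: setup_envs.sh exports these same names.
REQUIRED_VARS = [
    "EMAIL_SENDER",
    "EMAIL_RECEIVER",
    "EMAIL_PASSWORD",
    "ZOTERO_API_KEY",
    "ZOTERO_LIBRARY_ID",
]


def missing_credentials(env=None):
    """Return the required variable names that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]
```

Calling `missing_credentials()` with no arguments checks the current process environment; an empty list means every credential is set.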

Usage 🎯

Run the daily recommendation system:

python run_daily.py

With custom parameters:

python run_daily.py --date 2025-07-21 --topk 10
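For readers curious how flags like these are typically handled, here is a hedged sketch using argparse. It is not the actual contents of run_daily.py; the `parse_args` function, the defaults, and the help strings are assumptions made for illustration. Only the flag names `--date` and `--topk` come from the command above.

```python
import argparse
from datetime import date


def parse_args(argv=None):
    """Parse --date (YYYY-MM-DD) and --topk (integer) from the command line."""
    parser = argparse.ArgumentParser(description="Illustrative flag parsing")
    parser.add_argument(
        "--date",
        type=date.fromisoformat,  # raises ValueError on bad input, which argparse reports
        default=None,             # assumption: None means "use today's date"
        help="Target date in YYYY-MM-DD format",
    )
    parser.add_argument(
        "--topk",
        type=int,
        default=5,                # assumption: mirrors top_k in config.yaml
        help="Number of recommendations to keep",
    )
    return parser.parse_args(argv)
```

With this sketch, `parse_args(["--date", "2025-07-21", "--topk", "10"])` yields a namespace holding a `datetime.date` and the integer 10.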

Project Structure 📁

.
├── app/                # Application modules
│   ├── fetcher.py      # arXiv paper fetching
│   ├── embedder.py     # Text embedding and similarity
│   ├── recommender.py  # Recommendation engine
│   └── notifier.py     # Email notification
├── data/               # Data files (embeddings, papers)
├── output/             # Generated HTML digests
├── config.yaml         # Configuration file
├── run_daily.py        # Main application script
└── pyproject.toml      # Project dependencies

Configuration ⚙️

Zotero Setup 📚

  1. Get your Zotero API key from https://www.zotero.org/settings/keys
  2. Find your library ID from your Zotero profile
  3. Set these in your environment variables

Email Setup 📧

Configure SMTP settings for email notifications:

  • Email sender and receiver addresses
  • App-specific password for Gmail or other providers

Configuration File (config.yaml) 📄

The config.yaml file contains all system settings:

arXiv Settings

arxiv:
  categories: ["cs.CL", "cs.LG", "cs.AI", "stat.ML", "cs.CV", "cs.IR"]
  lookback_days: 2          # How many days back to search for papers
  max_results: 10           # Maximum papers to fetch per category
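To make these three settings concrete, the sketch below shows how they could map onto a request to the public arXiv API and a date filter. Whether app/fetcher.py calls the API this way is an assumption; `build_query_url` and `within_lookback` are hypothetical helpers. The query parameters (`search_query`, `sortBy`, `sortOrder`, `start`, `max_results`) are those of the arXiv API.

```python
from datetime import date, timedelta
from urllib.parse import urlencode

ARXIV_API = "http://export.arxiv.org/api/query"


def build_query_url(category, max_results=10):
    """Build an arXiv API URL for the newest submissions in one category."""
    params = {
        "search_query": f"cat:{category}",
        "sortBy": "submittedDate",
        "sortOrder": "descending",
        "start": 0,
        "max_results": max_results,
    }
    return f"{ARXIV_API}?{urlencode(params)}"


def within_lookback(published, today, lookback_days=2):
    """True if `published` falls within the last `lookback_days` days (inclusive)."""
    return today - timedelta(days=lookback_days) <= published <= today
```

One URL would be built per entry in `categories`, and papers older than the lookback window would be dropped after parsing the response.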

Interest Papers & Zotero Collections

interest_papers:
  zotero_collections:
    deep_research: "0-PhD-LLM/9-agent-robust/deep-research"
    # Add more collections for different research areas
    # nlp_research: "path/to/nlp/collection"
    # cv_research: "path/to/cv/collection"
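The example value above looks like a slash-separated path of nested collection names. As a rough sketch of how such a path could be resolved to a Zotero collection key, the function below walks a list of collection records shaped like those returned by the Zotero Web API (each with a `key` and a `data` dict holding `name` and `parentCollection`, which is `false` for top-level collections). `resolve_collection_key` is a hypothetical helper, and it assumes collection names do not themselves contain a slash.

```python
def resolve_collection_key(collections, path):
    """Follow a slash-separated path of collection names; return the final key or None."""
    parent = None  # None stands for "top level"
    for name in path.split("/"):
        match = next(
            (
                c
                for c in collections
                if c["data"]["name"] == name
                # Top-level collections report parentCollection as False
                and (c["data"].get("parentCollection") or None) == parent
            ),
            None,
        )
        if match is None:
            return None
        parent = match["key"]
    return parent


# Made-up sample records for illustration; the keys are not real.
sample = [
    {"key": "AAAA1111", "data": {"name": "0-PhD-LLM", "parentCollection": False}},
    {"key": "BBBB2222", "data": {"name": "9-agent-robust", "parentCollection": "AAAA1111"}},
    {"key": "CCCC3333", "data": {"name": "deep-research", "parentCollection": "BBBB2222"}},
]
```

Matching on the parent at every level keeps two collections with the same name in different branches from being confused.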

Model & Recommendation Settings

embedding_model: "allenai/specter2_base"  # Embedding model for similarity
top_k: 5                                  # Number of top recommendations
use_seen_papers: false                    # Track previously seen papers

Output & Digest Settings

digest:
  title: "PaperSurf Daily Digest"
  subtitle: "AI & Machine Learning Paper Recommendations"
  output_dir: "output"
  open_in_browser: false                  # Auto-open digest after generation

output_path: "output/digest.html"

Pipeline Control

pipeline:
  skip_fetch: false                       # Skip fetching new papers (for testing)
  skip_embed: false                       # Skip embedding generation (for testing)

Notification Settings

notification:
  method: email
  auto_send: true                         # Automatically send email notifications
  dry_run: false                          # Test mode without sending emails
  email:
    smtp_server: "smtp.gmail.com"
    smtp_port: 587
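Port 587 on smtp.gmail.com expects a STARTTLS upgrade before login. The sketch below shows one way these settings could be used with Python's standard library; it is not the code in app/notifier.py, and `build_digest_message` and `send_digest` are hypothetical names. The `dry_run` argument mirrors the config flag of the same name.

```python
import smtplib
from email.message import EmailMessage


def build_digest_message(sender, receiver, html, subject="PaperSurf Daily Digest"):
    """Create a multipart message with a plain-text fallback and an HTML body."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = receiver
    msg["Subject"] = subject
    msg.set_content("This digest is best viewed in an HTML-capable mail client.")
    msg.add_alternative(html, subtype="html")
    return msg


def send_digest(msg, password, server="smtp.gmail.com", port=587, dry_run=False):
    """Send the message over SMTP with STARTTLS; return False when dry_run skips sending."""
    if dry_run:
        return False
    with smtplib.SMTP(server, port) as smtp:
        smtp.starttls()                   # upgrade the connection before authenticating
        smtp.login(msg["From"], password)  # password is the app-specific password
        smtp.send_message(msg)
    return True
```

Building the message separately from sending it makes the digest easy to inspect in dry-run mode without touching the network.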

How It Works 🔄

  1. 📥 Fetch: Downloads new papers from specified arXiv categories
  2. 🔤 Embed: Generates embeddings for paper abstracts using transformer models
  3. 🔍 Match: Calculates similarity between new papers and your Zotero directories
  4. 📊 Rank: Scores and ranks papers based on relevance to your interests
  5. 📨 Notify: Sends HTML email digest with top recommendations
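
The Match and Rank steps can be illustrated with a small, dependency-free sketch. It assumes cosine similarity as the measure and scores each new paper by its best match against the library; the source does not say which similarity measure or aggregation the recommender uses, so both are assumptions, as are the names `cosine` and `rank_papers`. Real embeddings would come from the configured model rather than the toy vectors used here.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rank_papers(candidates, library, top_k=5):
    """Score each candidate by its highest similarity to any library vector.

    candidates: dict mapping paper title -> embedding vector
    library:    list of embedding vectors from the user's Zotero papers
    Returns the top_k (title, score) pairs, best first.
    """
    scored = [
        (title, max((cosine(vec, ref) for ref in library), default=0.0))
        for title, vec in candidates.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]
```

Taking the maximum favors papers that closely resemble at least one library item; averaging over the library instead would favor papers near the center of the collection.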
