A Python application that automatically generates a newspaper-style PDF from various RSS feeds and Hacker News articles.
- Fetches articles from configurable RSS feeds
- Includes top stories from Hacker News
- Extracts full article content from webpages
- Generates a nicely formatted newspaper-style PDF
- Fully customizable templates and styling
- Handles images (optional)
- Site-specific content extraction for better quality
- Memory-efficient content processing
- Secure automated email delivery of generated papers
- Clone this repository:

      git clone https://github.com/yourusername/morning-paper-generator.git
      cd morning-paper-generator

- Create and activate a virtual environment:

      python -m venv venv
      # On Windows:
      venv\Scripts\activate
      # On macOS/Linux:
      source venv/bin/activate

- Install the required dependencies:

      pip install -r requirements.txt

- Configure your feeds and settings in config.json (a default configuration will be created if it doesn't exist)
- Run the application:

      from morning import MorningPaperGenerator

      generator = MorningPaperGenerator()
      pdf_path = generator.run()
      print(f"Generated PDF: {pdf_path}")

  Alternatively, create a script like generate_paper.py:

      #!/usr/bin/env python3
      from morning import MorningPaperGenerator

      if __name__ == "__main__":
          generator = MorningPaperGenerator()
          pdf_path = generator.run()
          if pdf_path:
              print(f"Successfully generated paper at: {pdf_path}")
          else:
              print("Failed to generate paper")

- Find your generated PDF in the papers directory (or your custom output directory specified in config)
You can set up automatic email delivery of your morning papers using the included email script with secure credential management:
Create a restricted-permission configuration file:
# Create a secure config file only readable by your user
touch ~/.morning_paper_email.conf
chmod 600 ~/.morning_paper_email.conf

Add your email configuration to this file:
RECIPIENT=your@email.com
SENDER=sender@email.com
SMTP_SERVER=smtp.your-provider.com
SMTP_PORT=587
USERNAME=your_username
PASSWORD=your_password

Then run the email script with this configuration file:
python email_morning_paper.py --config ~/.morning_paper_email.conf

To automate both paper generation and email delivery using secure credentials:
# Open crontab editor
crontab -e
# Add these entries (adjust times and paths as needed):
# Generate paper at 5:00 AM
0 5 * * * cd /path/to/your/project && ./main.py
# Email the paper at 6:00 AM using secure config
0 6 * * * cd /path/to/your/project && ./email_morning_paper.py --config ~/.morning_paper_email.conf

For systems where environment variables are preferred:
# Create a file with export statements
# (quoting 'EOF' stops the shell from expanding any $ characters in the values)
cat > ~/.morning_paper_env << 'EOF'
export MORNING_PAPER_RECIPIENT="your@email.com"
export MORNING_PAPER_SENDER="sender@email.com"
export MORNING_PAPER_SMTP_SERVER="smtp.your-provider.com"
export MORNING_PAPER_SMTP_PORT="587"
export MORNING_PAPER_USERNAME="your_username"
export MORNING_PAPER_PASSWORD="your_password"
EOF
# Secure the file
chmod 600 ~/.morning_paper_env
# In your crontab, load the environment variables before running the script.
# Cron usually runs jobs with /bin/sh, where "source" may not exist, so use the portable "." command.
0 6 * * * . ~/.morning_paper_env && cd /path/to/your/project && ./email_morning_paper.py --use-env

When setting up your configuration file, use these settings for common providers:

Gmail (use an App Password, not your regular password):
SMTP_SERVER=smtp.gmail.com
SMTP_PORT=587
USERNAME=youremail@gmail.com
PASSWORD=your_app_password

Outlook/Office 365:
SMTP_SERVER=smtp.office365.com
SMTP_PORT=587
USERNAME=your@office365.com
PASSWORD=your_password
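
For illustration, the settings above could be consumed roughly as follows. This is a minimal sketch, not the code of `email_morning_paper.py`: the helper names `load_email_config`, `build_message`, and `send_message` are hypothetical, the subject line is made up, and the file is assumed to contain plain `KEY=VALUE` lines as shown. The permission check relies on POSIX mode bits, so it only makes sense on macOS/Linux.

```python
import smtplib
import stat
from email.message import EmailMessage
from pathlib import Path


def load_email_config(path):
    """Parse KEY=VALUE lines, refusing files accessible to group or others."""
    path = Path(path).expanduser()
    mode = stat.S_IMODE(path.stat().st_mode)
    if mode & 0o077:
        raise PermissionError(f"{path} has mode {oct(mode)}; run chmod 600 on it")
    config = {}
    for line in path.read_text().splitlines():
        line = line.strip()
        # Skip blank lines, comment lines, and anything without a key/value pair.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)  # split once so values may contain "="
        config[key.strip()] = value.strip()
    return config


def build_message(config, pdf_path):
    """Create an email with the generated PDF attached."""
    pdf_path = Path(pdf_path)
    msg = EmailMessage()
    msg["From"] = config["SENDER"]
    msg["To"] = config["RECIPIENT"]
    msg["Subject"] = "Your Morning Paper"  # hypothetical subject line
    msg.set_content("Today's paper is attached.")
    msg.add_attachment(
        pdf_path.read_bytes(),
        maintype="application",
        subtype="pdf",
        filename=pdf_path.name,
    )
    return msg


def send_message(config, msg):
    """Send over SMTP with STARTTLS, matching the port 587 settings above."""
    with smtplib.SMTP(config["SMTP_SERVER"], int(config["SMTP_PORT"])) as server:
        server.starttls()
        server.login(config["USERNAME"], config["PASSWORD"])
        server.send_message(msg)
```

Keeping parsing, message construction, and sending in separate functions means the first two can be checked without contacting a mail server.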
The config.json file allows you to customize:
- RSS feeds and number of articles per feed
- Hacker News inclusion settings
- PDF formatting options
- Content extraction settings
- HTML templates
- Output directory
- Timeout settings
- Site-specific content selectors
- And more!
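
As a rough illustration of the "created if it doesn't exist" behaviour, loading such a file might look like the sketch below. The function name `load_config` and the `DEFAULT_CONFIG` values are assumptions made for this example (the keys are borrowed from this README's example configuration); the application's real loader may differ, for instance by validating the data with pydantic.

```python
import json
from pathlib import Path

# Hypothetical defaults; keys mirror the example configuration in this README.
DEFAULT_CONFIG = {
    "rss_feeds": [],
    "hacker_news": {"include": True, "max_articles": 5, "only_self_posts": True},
    "output_directory": "./papers",
    "newspaper_title": "Morning Paper",
    "columns": 1,
    "extract_full_content": True,
    "include_images": False,
}


def load_config(path="config.json"):
    """Load the config file, writing the defaults first if it is missing."""
    path = Path(path)
    if not path.exists():
        path.write_text(json.dumps(DEFAULT_CONFIG, indent=2))
    user_config = json.loads(path.read_text())
    # Shallow merge: user values win, missing top-level keys fall back to defaults.
    return {**DEFAULT_CONFIG, **user_config}
```

The merge is deliberately shallow, so a user-supplied `hacker_news` object replaces the default one as a whole rather than being combined key by key.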
Example configuration:
{
  "rss_feeds": [
    {"name": "BBC News", "url": "http://feeds.bbci.co.uk/news/world/rss.xml", "max_articles": 5},
    {"name": "New York Times", "url": "https://rss.nytimes.com/services/xml/rss/nyt/World.xml", "max_articles": 5}
  ],
  "hacker_news": {
    "include": true,
    "max_articles": 5,
    "only_self_posts": true
  },
  "output_directory": "./papers",
  "newspaper_title": "Morning Paper",
  "columns": 1,
  "extract_full_content": true,
  "include_images": false
}

The application uses Jinja2 templates to generate the newspaper. You can customize the appearance by editing:
- templates/paper_template.html: overall newspaper layout
- templates/article_template.html: individual article formatting
Default templates are created automatically if they don't exist in the configured templates directory.
- Python 3.7+
- All dependencies listed in requirements.txt, including:
- pydantic (v2+)
- feedparser
- requests
- beautifulsoup4
- markdownify
- Jinja2
- WeasyPrint
- readability-lxml (optional but recommended)
Run the test suite with pytest:
pytest tests/
For coverage information:
pytest --cov=morning tests/
WeasyPrint requires additional system dependencies for PDF generation:
- Ubuntu/Debian:

      sudo apt-get install build-essential python3-dev python3-pip python3-setuptools python3-wheel python3-cffi libcairo2 libpango-1.0-0 libpangocairo-1.0-0 libgdk-pixbuf2.0-0 libffi-dev shared-mime-info

- macOS (using Homebrew):

      brew install cairo pango gdk-pixbuf libffi

- Windows: Follow the WeasyPrint Windows installation instructions
See the WeasyPrint installation documentation for more details.
For large feeds or many articles, you can:
- Reduce the max_articles per feed
- Decrease the number of feeds
- Set extract_full_content to false to use summaries instead
- Set include_images to false to skip image processing
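
These adjustments can also be applied in code before a run. The helper below is a hypothetical sketch (`low_memory_config` is not part of the project) that assumes the configuration is a plain dictionary shaped like the example config.json:

```python
import copy


def low_memory_config(config, max_articles=2):
    """Return a copy of config with the memory-saving settings applied."""
    slim = copy.deepcopy(config)  # leave the caller's config untouched
    for feed in slim.get("rss_feeds", []):
        # Cap each feed, keeping any limit that is already lower.
        feed["max_articles"] = min(feed.get("max_articles", max_articles), max_articles)
    slim["extract_full_content"] = False  # use feed summaries instead
    slim["include_images"] = False  # skip image processing
    return slim
```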
- Configuration File Permissions: Ensure your config file has the correct permissions (chmod 600)
- Authentication Errors: Double-check your username and password. For Gmail, use an App Password.
- Connection Issues: Verify the SMTP server and port are correct for your email provider.
- Cron Environment: If using cron, ensure paths are absolute and environment is properly set up.
- Missing PDFs: Make sure the PDF generation completed successfully before sending emails.
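
For the missing-PDF case, a small pre-flight check can confirm that a recent paper exists before the email step runs. This is a sketch under assumptions: `latest_pdf` is a hypothetical helper, the `./papers` default comes from the example configuration, and the 24-hour freshness window is an arbitrary choice.

```python
import time
from pathlib import Path


def latest_pdf(output_dir="./papers", max_age_hours=24):
    """Return the newest PDF in output_dir, or None if none is recent enough."""
    pdfs = sorted(Path(output_dir).glob("*.pdf"), key=lambda p: p.stat().st_mtime)
    if not pdfs:
        return None
    newest = pdfs[-1]
    age_seconds = time.time() - newest.stat().st_mtime
    return newest if age_seconds <= max_age_hours * 3600 else None
```

A cron wrapper could call this first and exit with an error when it returns `None`, so a failed generation run does not cause a stale paper to be sent.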
Contributions are welcome! Please feel free to submit a Pull Request.
- Fork the repository
- Create a feature branch (git checkout -b feature/amazing-feature)
- Commit your changes (git commit -m 'Add some amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
This project is licensed under the GNU General Public License v3.0 - see the LICENSE file for details.
- feedparser for RSS parsing
- readability-lxml for content extraction
- WeasyPrint for HTML to PDF conversion
- Hacker News API for HN integration