CryptoDigest is an automated system that keeps the user informed about the latest news in the crypto world. This Python-based tool scrapes cryptocurrency news articles (specifically Bitcoin-related ones), summarizes them for quick reading, stores them in a MongoDB database for future access, and sends the user a daily digest email with the summaries.
Automated News Scraping: Daily scraping of the latest cryptocurrency news from Yahoo Finance.
NLP Summarization: Uses the T5 transformer and SimpleT5 to condense articles into summaries.
Database Storage: News summaries, titles, and URLs are stored in a MongoDB database, allowing for data analysis and retrieval.
Daily Email Digests: Running the script sends the user an email with the latest news summaries.
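As a rough illustration of how the stored fields (title, URL, summary) could feed the digest email, here is a minimal sketch using only the Python standard library. The names `Article` and `build_digest` are hypothetical and are not taken from this project's code; scraping, summarization, database access, and the actual sending step are left out.

```python
from dataclasses import dataclass
from email.message import EmailMessage


@dataclass
class Article:
    """One news item with the same three fields the database stores."""
    title: str
    url: str
    summary: str


def build_digest(articles, sender, receiver):
    """Assemble a plain-text digest email. This only builds the message; it does not send it."""
    message = EmailMessage()
    message["Subject"] = "CryptoDigest daily digest"
    message["From"] = sender
    message["To"] = receiver
    # One block per article: title, summary, then the source link.
    blocks = [f"{a.title}\n{a.summary}\n{a.url}" for a in articles]
    message.set_content("\n\n".join(blocks))
    return message
```

The returned `EmailMessage` could then be handed to `smtplib` for delivery; how this project actually sends mail may differ.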
To set up CryptoDigest, follow these steps:
⚡ git clone https://github.com/yourusername/CryptoDigest.git
⚡ cd CryptoDigest
⚡ Install the required Python packages
Set up environment variables for MongoDB and email configuration by adding them to your .bashrc or .bash_profile:
⚡ export MONGO_URI="your_mongodb_uri"
⚡ export SENDER_EMAIL="your_email@example.com"
⚡ export RECEIVER_EMAIL="receiver_email@example.com"
⚡ export EMAIL_PASSWORD="your_email_password"
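One way a script might read these four variables and fail early when one is missing is sketched below. The helper name `load_config` is hypothetical; the project's own code may read the variables differently.

```python
import os

# The four variables exported in the setup step above.
REQUIRED_VARS = ("MONGO_URI", "SENDER_EMAIL", "RECEIVER_EMAIL", "EMAIL_PASSWORD")


def load_config(environ=os.environ):
    """Return the four settings as a dict, or raise if any is unset or empty."""
    missing = [name for name in REQUIRED_VARS if not environ.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: environ[name] for name in REQUIRED_VARS}
```

Checking all variables up front means a missing value is reported before any scraping or summarization work starts, rather than at the moment the email is sent.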
⚡ If I had more time to work on this project, I would incorporate more data to further fine-tune the model and generate more precise summaries. The generated summaries are mostly correct, but their grammar and capitalization can be improved.
⚡ The training_data-_actual.csv file contains data I collected myself; users can supply their own CSV file to train their own model.
⚡ The test.py file is for experimenting with the model's training parameters to see how different values influence the generated summaries.
This project is licensed under the GPL-2.0 License - see the LICENSE.md file for details.