Regularly collects data from Yandex Direct and the Pravoved website, and stores it in a Google SpreadSheet.
income_daily.py collects data once a day. income_hourly.py collects data every hour.
Script uses:
- Yandex Direct API
- BeautifulSoup and the Python requests module to scrape the Pravoved website
- Selenium to scrape the Lexprofit website
- Gspread for authorization in Google SpreadSheets
-
Create Google SpreadSheet
-
In SpreadSheet:
- Create two tabs named 'hourly' and 'daily'.
- Add headers in the first row: 'Date/Time', 'Lexprofit', 'Pravoved', 'Google', 'Yandex'
- In the first column of the 'daily' tab, write yesterday's date, e.g. '08.01.18'
- In the first column of the 'hourly' tab, write today's date and time, e.g. '09.01 12:00'
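The date formats in the examples above can be reproduced with Python's datetime module. The sketch below is only an illustration, not code from the repository: the names daily_label, hourly_label and build_row are hypothetical, and rounding the hourly label down to the full hour is an assumption based on the '09.01 12:00' example.

```python
from datetime import datetime, timedelta

# Formats inferred from the examples in this README (assumption).
DAILY_FORMAT = "%d.%m.%y"      # e.g. '08.01.18'
HOURLY_FORMAT = "%d.%m %H:%M"  # e.g. '09.01 12:00'


def daily_label(now):
    """Return yesterday's date, as written in the first column of the 'daily' tab."""
    return (now - timedelta(days=1)).strftime(DAILY_FORMAT)


def hourly_label(now):
    """Return the current date and hour, as written in the first column of the 'hourly' tab."""
    # Rounding down to the full hour is an assumption.
    on_the_hour = now.replace(minute=0, second=0, microsecond=0)
    return on_the_hour.strftime(HOURLY_FORMAT)


def build_row(label, lexprofit, pravoved, google, yandex):
    """Order the values to match the header row: Date/Time, Lexprofit, Pravoved, Google, Yandex."""
    return [label, lexprofit, pravoved, google, yandex]


if __name__ == "__main__":
    now = datetime.now()
    print(build_row(daily_label(now), 0, 0, 0, 0))
    print(build_row(hourly_label(now), 0, 0, 0, 0))
```

A row built this way could then be passed to a gspread worksheet's append_row method.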
-
Create project
mkdir income
cd income
(create virtualenv)
git clone https://github.com/iakovleva/vu_income
cd vu_income
pip install -r requirements.txt
- Get Google API credentials for gspread authorization. Follow https://gspread.readthedocs.io/en/latest/oauth2.html and save the JSON file with the credentials in the current directory.
- Fill in the example tokens file and rename it to 'tokens.py'.
- Run the scripts
python income_daily.py
python income_hourly.py
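
To check that the credentials and the spreadsheet are set up correctly before running the scripts, something like the following sketch may help. It is not taken from the repository: the function name open_income_tabs is hypothetical, it assumes the saved JSON file is a service account key, and it assumes a gspread release that provides gspread.service_account (older releases authorize differently).

```python
import os


def open_income_tabs(credentials_path, spreadsheet_title):
    """Open the 'daily' and 'hourly' tabs of the spreadsheet (hypothetical helper)."""
    # Fail early with a clear message if the credentials file is missing.
    if not os.path.isfile(credentials_path):
        raise FileNotFoundError(
            "Credentials file not found: {}".format(credentials_path)
        )

    # Imported here so the path check above works even without gspread installed.
    import gspread

    # Assumes a service account key; see the gspread authentication docs.
    client = gspread.service_account(filename=credentials_path)
    spreadsheet = client.open(spreadsheet_title)
    return spreadsheet.worksheet("daily"), spreadsheet.worksheet("hourly")
```

With a service account, the spreadsheet has to be shared with the account's email address, otherwise client.open will not find it.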