ssarahreyes/ih_datamadpt1120_project_m1

📊 Data Pipeline Basic Income Survey

This project builds a data pipeline that combines survey results stored in a database with data obtained through an API connection and a web scraping process, in order to enrich those results.

⏩ One-liner

To obtain the results of the survey for just one country:

python main.py -p /data/raw/raw_data_project_m1.db -c Spain
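The command above passes two flags, `-p` and `-c`. A minimal sketch of how such flags could be parsed with `argparse` follows. The long option names `--path` and `--country`, the help texts, and the choice to make `-c` optional are assumptions for illustration, not taken from the project's `main.py`.

```python
import argparse


def build_parser():
    """Build a parser for the two flags shown in the one-liner."""
    parser = argparse.ArgumentParser(
        description="Basic income survey pipeline (illustrative sketch)"
    )
    # -p: location of the survey database file (long name is an assumption).
    parser.add_argument("-p", "--path", required=True,
                        help="path to the survey .db file")
    # -c: country to filter on; treated as optional here (assumption).
    parser.add_argument("-c", "--country", default=None,
                        help="country name used to filter the results")
    return parser


def parse_args(argv=None):
    """Parse argv (or sys.argv[1:] when argv is None)."""
    return build_parser().parse_args(argv)


# Example: the same arguments as the one-liner, passed explicitly.
args = parse_args(["-p", "/data/raw/raw_data_project_m1.db", "-c", "Spain"])
print(args.path, args.country)
```

Passing the argument list explicitly, as in the last lines, makes the parser easy to exercise without touching the real command line.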

💻 Technology stack

  • Python==3.8.5
  • pandas==1.1.3
  • sqlalchemy==1.3.20
  • requests==2.25.1
  • seaborn==0.11.0
  • numpy==1.19.2
  • argparse==3.2

⚡ Data

Three different data sources are involved:

  • Tables (.db) with the results of the survey. You can find the data in the data/raw folder.
  • API. We use the API from the Open Skills Project.
  • Web scraping. Finally, we retrieve information about country codes from the Eurostat website.
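As a rough illustration of how the database tables and the scraped country codes could be combined, here is a sketch that uses only the standard library's `sqlite3` (the project itself lists pandas and sqlalchemy). The table name `country_info`, its columns, and the sample rows are hypothetical. The `COUNTRY_NAMES` dictionary stands in for the lookup that would be scraped from Eurostat.

```python
import sqlite3

# Hypothetical stand-in for the code -> name lookup scraped from Eurostat.
COUNTRY_NAMES = {"ES": "Spain", "FR": "France"}


def demo_connection():
    """Create an in-memory database with a hypothetical survey table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE country_info (uuid TEXT, country_code TEXT)")
    conn.executemany(
        "INSERT INTO country_info VALUES (?, ?)",
        [("a1", "ES"), ("b2", "FR"), ("c3", "ES")],
    )
    return conn


def load_rows(conn):
    """Read (uuid, country_code) pairs, ordered by uuid."""
    query = "SELECT uuid, country_code FROM country_info ORDER BY uuid"
    return conn.execute(query).fetchall()


def enrich(rows, names):
    """Replace each country code with its full name; unknown codes are kept."""
    return [{"uuid": uuid, "country": names.get(code, code)}
            for uuid, code in rows]


def filter_country(records, country=None):
    """Keep only records for one country, or all records when country is None."""
    if country is None:
        return records
    return [r for r in records if r["country"] == country]


records = enrich(load_rows(demo_connection()), COUNTRY_NAMES)
print(filter_country(records, "Spain"))
```

The enrichment step is a plain dictionary lookup here. With pandas, the same idea would typically be expressed as a merge between the survey table and the country code table.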

📁 Folder structure

└── project
    ├── .gitignore
    ├── requirements.txt
    ├── README.md
    ├── main.py
    ├── p_acquisition
    │   ├── __init__.py
    │   └── m_acquisition.py
    ├── p_wrangling
    │   ├── __init__.py
    │   └── m_wrangling.py
    ├── p_analysis
    │   ├── __init__.py
    │   └── m_analysis.py
    ├── p_reporting
    │   ├── __init__.py
    │   └── m_reporting.py
    └── data
        ├── raw
        ├── processed
        └── results
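The four packages in the tree suggest a pipeline of four stages: acquisition, wrangling, analysis and reporting. The sketch below shows one way such stages could be chained from a `main.py`-style entry point. All function names, the sample rows and the CSV layout are hypothetical stand-ins, not the contents of the actual modules.

```python
import csv
import io
from collections import Counter


def acquire():
    """Stand-in for p_acquisition: return raw survey rows (hypothetical data)."""
    return [
        {"uuid": "a1", "country_code": "ES"},
        {"uuid": "b2", "country_code": "FR"},
        {"uuid": "c3", "country_code": "ES"},
    ]


def wrangle(rows):
    """Stand-in for p_wrangling: add a full country name to each row."""
    names = {"ES": "Spain", "FR": "France"}  # hypothetical lookup
    return [{**row, "country": names.get(row["country_code"], row["country_code"])}
            for row in rows]


def analyze(rows, country=None):
    """Stand-in for p_analysis: count responses per country, optionally filtered."""
    selected = [r for r in rows if country is None or r["country"] == country]
    return Counter(r["country"] for r in selected)


def report(counts):
    """Stand-in for p_reporting: render the counts as CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["country", "responses"])
    for country, n in sorted(counts.items()):
        writer.writerow([country, n])
    return buffer.getvalue()


def run_pipeline(country=None):
    """Chain the four stages in the order the folder names suggest."""
    return report(analyze(wrangle(acquire()), country))


print(run_pipeline("Spain"))
```

Keeping each stage as a function that takes the previous stage's output makes the stages easy to test in isolation and to swap for the real module implementations.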

📨 Contact info

If you have any questions, email me at sarisinhache@gmail.com!

About

Creating a data pipeline for a data set.
