forked from jvalue/made-template
week_03
Rafoolin edited this page Nov 21, 2023
We have three main data sources, and each of them may need a side dataset that explains, for example, its special units or geolocation abbreviations.
First, we download the data from the data providers.
Then we drop the rows with NA values and rename columns that share a name but have different meanings. We can also change row values, for example by replacing a unit abbreviation with its actual numerical value.
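The cleaning step above could look roughly like the following sketch. It assumes the pipeline uses pandas, which this page does not state. The column names (`geo`, `unit`, `value`), the `UNIT_FACTORS` mapping, and the `clean` helper are hypothetical and only illustrate the idea; the sample rows are made up.

```python
import pandas as pd

# Hypothetical mapping from unit abbreviations to numerical factors.
UNIT_FACTORS = {"THS": 1_000, "MIO": 1_000_000}


def clean(df: pd.DataFrame, value_name: str) -> pd.DataFrame:
    """Drop incomplete rows, rename the generic value column, decode units."""
    # Remove every row that contains at least one NA value.
    cleaned = df.dropna()
    # Give the generic "value" column a name specific to this dataset,
    # so it does not clash with "value" columns of other datasets.
    cleaned = cleaned.rename(columns={"value": value_name})
    # Replace each unit abbreviation with its numerical factor.
    cleaned = cleaned.assign(unit=cleaned["unit"].map(UNIT_FACTORS))
    return cleaned


# Made-up sample rows; the last one has a missing geo code.
raw = pd.DataFrame(
    {
        "geo": ["DE", "FR", None],
        "unit": ["THS", "MIO", "THS"],
        "value": [1.5, 2.0, 3.0],
    }
)
cleaned = clean(raw, "population")
```

Renaming before merging matters because two sources can both ship a column called `value` that measures different things.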
Then we merge the datasets on their shared columns and drop the rows with NA values.
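A minimal sketch of the merge step, again assuming pandas. The two frames, their column names, and their values are made up for illustration, and the join type (`inner`) is an assumption, since this page does not say which one the pipeline uses.

```python
import pandas as pd

# Two made-up datasets that share the "geo" and "year" columns.
left = pd.DataFrame(
    {"geo": ["DE", "FR", "IT"], "year": [2020, 2020, 2020], "a": [1.0, 2.0, 3.0]}
)
right = pd.DataFrame(
    {"geo": ["DE", "FR", "ES"], "year": [2020, 2020, 2020], "b": [4.0, None, 6.0]}
)

# Columns present in both frames, in the order they appear in `left`.
shared = [col for col in left.columns if col in right.columns]

# Keep only rows whose key appears in both frames, then drop rows
# that still contain NA values, and renumber the index.
merged = left.merge(right, on=shared, how="inner").dropna().reset_index(drop=True)
```

Here only the `DE` row survives: `IT` and `ES` have no match in the other frame, and `FR` has a missing `b` value.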
Finally, we save the result in a new SQLite database.
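Saving the result could be sketched as follows, assuming pandas together with the standard-library `sqlite3` module. The helper names, the table name `merged`, and the file name `result.sqlite` are hypothetical; the example writes to a temporary directory rather than the project's data directory.

```python
import os
import sqlite3
import tempfile
from contextlib import closing

import pandas as pd


def save_to_sqlite(df: pd.DataFrame, db_path: str, table: str) -> None:
    """Write the frame to a SQLite table, replacing the table if it exists."""
    with closing(sqlite3.connect(db_path)) as conn:
        df.to_sql(table, conn, if_exists="replace", index=False)
        conn.commit()


def load_from_sqlite(db_path: str, table: str) -> pd.DataFrame:
    """Read a whole table back, for example to check what was written."""
    with closing(sqlite3.connect(db_path)) as conn:
        return pd.read_sql_query(f'SELECT * FROM "{table}"', conn)


# Made-up result frame standing in for the merged data.
result = pd.DataFrame({"geo": ["DE", "FR"], "value": [1.0, 2.0]})

with tempfile.TemporaryDirectory() as tmp:
    db_path = os.path.join(tmp, "result.sqlite")
    save_to_sqlite(result, db_path, "merged")
    restored = load_from_sqlite(db_path, "merged")
```

Using `if_exists="replace"` keeps repeated pipeline runs from failing on an existing table, at the cost of overwriting earlier results.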
There is a bash script inside the project directory that creates a virtual environment, installs the packages listed in requirements.txt, and runs the pipeline.
Note
All data and downloaded files are stored in the \data directory.