This project aims to centralise and make accessible the sales data of a multinational company that sells various goods across the globe. The current sales data is distributed across multiple sources, making it challenging to analyse. The organisation's goal is to become more data-driven, and this project addresses its data-centralisation needs.
The project involves the following tasks:
- Extracting and cleaning user data
- Extracting and cleaning user card details
- Extracting and cleaning details of each store
- Extracting and cleaning product details
- Retrieving and cleaning the orders table
- Retrieving and cleaning date events data
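The cleaning tasks above all follow a common shape: pull rows from a source, drop invalid records, and normalise fields such as dates. A minimal, hypothetical sketch of that shape is below; `clean_rows` and `parse_date` are illustrative names only, not the actual API of `data_cleaning.py`:

```python
from datetime import datetime

def clean_rows(rows, required_fields):
    """Keep only rows where every required field has a usable value.

    Hypothetical helper: the project's real cleaning logic lives in
    data_cleaning.py; this just illustrates the filtering step.
    """
    cleaned = []
    for row in rows:
        if all(row.get(f) not in (None, "", "NULL") for f in required_fields):
            cleaned.append(row)
    return cleaned

def parse_date(value, fmt="%Y-%m-%d"):
    """Return a datetime for valid date strings, or None for bad input."""
    try:
        return datetime.strptime(value, fmt)
    except (TypeError, ValueError):
        return None
```

Each real cleaning method would chain steps like these per table (users, cards, stores, products, orders, date events) before loading.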
To run this project, follow these steps:
- Clone the repository:
  git clone https://github.com/emma-luk/multinational-retail-data-centralisation946.git
  cd Python
- Install the required dependencies:
  pip install -r requirements.txt
- Set up the necessary database credentials. Refer to db_creds.yaml and local_credentials.yaml for examples.
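A db_creds.yaml might look like the following. The key names here are illustrative assumptions; match them to whatever keys database_utils.py actually reads:

```yaml
# Hypothetical credentials layout -- adjust key names to your setup
RDS_HOST: example-host.rds.amazonaws.com
RDS_PORT: 5432
RDS_USER: admin
RDS_PASSWORD: secret
RDS_DATABASE: sales_data
```

Keep real credential files out of version control (e.g. via .gitignore).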
- Execute the main script, main.py, to perform the data extraction, transformation, and loading (ETL) tasks. Ensure that the required API key and URLs are configured in the script:
  python main.py
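Under the hood, database_utils.py presumably turns the credentials from db_creds.yaml into a SQLAlchemy-style connection URL. A sketch under that assumption follows; the key names and the `build_db_url` helper are hypothetical, not the project's confirmed API:

```python
# Stand-in for yaml.safe_load(open("db_creds.yaml")) -- hypothetical keys.
creds = {
    "RDS_HOST": "example-host.rds.amazonaws.com",
    "RDS_PORT": 5432,
    "RDS_USER": "admin",
    "RDS_PASSWORD": "secret",
    "RDS_DATABASE": "sales_data",
}

def build_db_url(c):
    """Assemble a postgresql+psycopg2 URL of the form SQLAlchemy's
    create_engine accepts: dialect+driver://user:password@host:port/db."""
    return (f"postgresql+psycopg2://{c['RDS_USER']}:{c['RDS_PASSWORD']}"
            f"@{c['RDS_HOST']}:{c['RDS_PORT']}/{c['RDS_DATABASE']}")
```

With a URL like this, `sqlalchemy.create_engine(build_db_url(creds))` would give main.py a connection for both extraction and upload.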
multinational-retail-data-centralisation/
|-- Python/
|   |-- main.py
|   |-- data_extraction.py
|   |-- data_cleaning.py
|   |-- database_utils.py
|   |-- db_creds.yaml
|   |-- local_credentials.yaml
|-- README.md
|-- requirements.txt