This is an ETL pipeline that pulls bitcoin exchange data from the CoinCap API and loads it into our data warehouse. For more details, check out the blog post at https://startdataengineering.com/post/data-engineering-project-to-impress-hiring-managers/
The code is available in the bitcoinMonitor repository.
You can run this data pipeline using GitHub Codespaces by following the instructions below.
- Create a codespace by going to the bitcoinMonitor repository, forking it, and then clicking the `Create codespaces on main` button.
- Wait for the codespace to start, then type `make up` in the terminal.
- Wait for `make up` to complete, and then wait another 30s to give Metabase some time to set up.
- After 30s, go to the `ports` tab and click the link exposing port `3000` to access the Metabase UI (the username and password are `sdeuser` and `sdepassword1234` respectively). See the `metabase connection settings` screenshot below for connection details.
Note: The screenshots show how to run a project on codespaces; please make sure to use the instructions above for this specific project.
The Metabase UI will look like the following:

Note: Make sure to switch off your codespaces instance when you are done, since you only have limited free usage; see docs here.
To run locally, you need:
- git
- GitHub account
- Docker with at least 4GB of RAM and Docker Compose v1.27.0 or later
Clone the repo and run the following commands to start the data pipeline:
git clone https://github.com/josephmachado/bitcoinMonitor.git
cd bitcoinMonitor
make up
sleep 30 # wait for Metabase to start
make ci # run checks and tests
Go to http://localhost:3000 to see the Metabase UI.
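The fixed 30-second wait in the commands above is an estimate of how long Metabase takes to start. As an optional alternative, here is a minimal Python sketch that polls an HTTP endpoint until it answers with status 200. The helper name `wait_for_http` is hypothetical and not part of the repository, and the `/api/health` path in the usage note is an assumption about Metabase's health check that you should confirm against the Metabase documentation.

```python
import http.client
import time
import urllib.error
import urllib.request


def wait_for_http(url: str, timeout: float = 120.0, interval: float = 2.0) -> bool:
    """Poll `url` until it returns HTTP 200; return False if `timeout` elapses first.

    Hypothetical helper for illustration; it is not part of the bitcoinMonitor repo.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=interval) as resp:
                if resp.status == 200:
                    return True
        except (urllib.error.URLError, http.client.HTTPException, OSError):
            pass  # service is not reachable or not ready yet
        time.sleep(interval)
    return False


if __name__ == "__main__":
    # Assumed health endpoint; adjust the URL if your Metabase version differs.
    ready = wait_for_http("http://localhost:3000/api/health")
    print("Metabase is ready" if ready else "Timed out waiting for Metabase")
```

Running this script after `make up` would replace the `sleep 30` step; it returns as soon as the service responds instead of always waiting the full 30 seconds.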
We use Python to pull, transform, and load data. Our warehouse is Postgres. We also spin up a Metabase instance for our presentation layer.
All of the components run as Docker containers.
Read this post for information on setting up CI/CD, IaC (Terraform), "make" commands, and automated testing.
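To make the pull, transform, and load flow concrete, here is a minimal Python sketch of the three steps. It is not the repository's actual code: the endpoint URL, the field names (`exchangeId`, `name`, `volumeUsd`), the function names, and the `exchange` table layout are all assumptions for illustration, and an in-memory SQLite database stands in for Postgres so the example runs without a server.

```python
import json
import sqlite3
from datetime import datetime, timezone
from urllib.request import urlopen

# Assumed endpoint; check the CoinCap docs for the current URL and any API key requirement.
EXCHANGES_URL = "https://api.coincap.io/v2/exchanges"


def extract(url: str = EXCHANGES_URL) -> list:
    """Pull raw exchange records from the API (requires network access)."""
    with urlopen(url, timeout=10) as resp:
        return json.load(resp)["data"]  # assumes records sit under a "data" key


def transform(records: list, pulled_at: datetime) -> list:
    """Keep a few assumed fields and cast the numeric string to a float."""
    rows = []
    for rec in records:
        volume = rec.get("volumeUsd")
        rows.append(
            (
                rec["exchangeId"],
                rec.get("name"),
                float(volume) if volume is not None else None,
                pulled_at.isoformat(),
            )
        )
    return rows


def load(conn: sqlite3.Connection, rows: list) -> None:
    """Insert rows into a hypothetical `exchange` table (SQLite stands in for Postgres)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS exchange "
        "(exchange_id TEXT, name TEXT, volume_usd REAL, pulled_at TEXT)"
    )
    conn.executemany("INSERT INTO exchange VALUES (?, ?, ?, ?)", rows)
    conn.commit()


if __name__ == "__main__":
    # Sample records are used here so the sketch runs offline; swap in extract() for live data.
    sample = [{"exchangeId": "binance", "name": "Binance", "volumeUsd": "123.5"}]
    conn = sqlite3.connect(":memory:")
    load(conn, transform(sample, datetime.now(timezone.utc)))
    print(conn.execute("SELECT * FROM exchange").fetchall())
```

Keeping `transform` free of network and database calls makes it easy to unit test, which fits the automated testing setup mentioned in this README. With a Postgres driver, the insert placeholders would typically be `%s` rather than `?`.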




