This is a task to show case my data modelling skills from a task I was asked to complete.
All models apart from base_pricing__rules.sql and intm_pricing__tariffs.sql were designed by me.
I made significant changes to the _base.yml and _intm.yml files.
The rest were previously designed.
The task, database and framework for the task were designed by someone else. If you would like more details about who designed these things, please get in contact.
The aim was to build some DBT models that fulfil the objectives listed below:
- allow us to easily look up the current price that would be charged on different days of the week for parking sessions of various durations, at a specific location
- allow us to easily look up how the price for a one hour parking session at a specific location has changed over time
You will find my solutions in the models/ directory. More specifically they are the .sql and .yml files.
They represent my data models and documentation.
Once you've run poetry install and poetry run dbt run, you can run poetry run dbt show --select {model.sql} to get a
preview of my models.
First, defining some terms:
- a parking location is a parking spot or car park where parking is offered
- a parking session is when a parking spot is used for a particular duration, starting at some specific time/date
- a price is a set of rules you can use to calculate the cost of a parking session
- a tariff is member of the set of rules (e.g. 'if parking on a Saturday for 2 hours starting after 4pm, the hourly cost is £4.00')
I was asked to use two key technologies:
- DuckDB
- DBT
For this project we're managing everything via poetry. You can install it [using these instructions]
(https://python-poetry.org/docs/#installing-with-the-official-installer). If you run poetry install you will end up with
duckdb and dbt installed in a local virtual environment.
There is an example dataset in the warehouse file pricing.duckdb. It contains
- two source tables
main.pricesandmain.rules - two example, incomplete DBT models
main_base.base_pricing__rulesandmain_intm.intm_pricing__tariffs
In brief:
pricesrepresents the history of which pricing logic applied at particular times in the pastrulescontains the actual pricing rules, in a nested JSON structure
The example DBT models are defined in models/base/base_pricing__rules.sql and models/intm/intm_pricing__tariffs.sql.
These are there mostly to demonstrate some DuckDB syntax for working with JSON data (since candidates are not expected to
be familiar with this - here is a reference
in case needed).
There is also pricing_original.duckdb, this is the original warehouse file before the example DBT models have been run.
There are many ways to read duckdb files. One way is to use Python to read the table into a DataFrame:
import duckdb
conn = duckdb.connect("./pricing.duckdb")
prices = conn.execute("SELECT * FROM prices;).fetchdf()
conn.close()
ENJOY MY DATA MODELS AND PLAYING AROUND WITH DUCK DB AND DBT