GitHub - THeneghan/bauer

Installation and Usage

Poetry is the recommended way to run this project. Docker is also required, because the project creates a Postgres instance in a Docker container.

Connection details are listed below. To view the data, connect with a database client such as DBeaver.

Host: localhost

User: postgres

Password: password
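For scripted access rather than a GUI client, the details above can be assembled into a connection URL. This is a minimal sketch: the port (5432) and database name (postgres) are assumed Postgres defaults and are not stated in this README, and build_postgres_url is a hypothetical helper, not part of this repository.

```python
from urllib.parse import quote


def build_postgres_url(
    host: str = "localhost",
    user: str = "postgres",
    password: str = "password",
    port: int = 5432,            # assumption: default Postgres port
    database: str = "postgres",  # assumption: default database name
) -> str:
    """Build a postgresql:// URL, percent-encoding the credentials."""
    safe_user = quote(user, safe="")
    safe_password = quote(password, safe="")
    return f"postgresql://{safe_user}:{safe_password}@{host}:{port}/{database}"


if __name__ == "__main__":
    print(build_postgres_url())
```

A URL of this form is accepted by most Postgres clients and drivers. Adjust the port and database name if the container is configured differently.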

To install all the required dependencies and ensure pre-commit is installed, run:

poetry install
 
pre-commit install

Why you chose this schema/approach

I chose this schema because it allows checking for both primary key and foreign key violations. It also means that we don't get bogged down with potentially varied data in JSON form.
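To illustrate what primary key and foreign key checks look like in practice, here is a small sketch. It uses Python's built-in sqlite3 module as a stand-in for Postgres so that it runs without Docker, and the artists and tracks tables are hypothetical examples, not this project's actual schema.

```python
import sqlite3


def demo_constraint_checks() -> dict:
    """Return which invalid inserts were rejected by the database."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is on;
    # Postgres enforces them by default.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute(
        "CREATE TABLE artists (artist_id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE tracks ("
        " track_id INTEGER PRIMARY KEY,"
        " artist_id INTEGER NOT NULL REFERENCES artists(artist_id),"
        " title TEXT NOT NULL)"
    )
    # Valid rows: the track points at an existing artist.
    conn.execute("INSERT INTO artists VALUES (1, 'Example Artist')")
    conn.execute("INSERT INTO tracks VALUES (10, 1, 'Example Track')")

    attempts = {
        # Reuses artist_id 1, so the primary key constraint should reject it.
        "duplicate_primary_key": "INSERT INTO artists VALUES (1, 'Duplicate')",
        # References artist_id 999, which does not exist.
        "missing_foreign_key": "INSERT INTO tracks VALUES (11, 999, 'Orphan')",
    }
    rejected = {}
    for label, statement in attempts.items():
        try:
            conn.execute(statement)
            rejected[label] = False
        except sqlite3.IntegrityError:
            rejected[label] = True
    conn.close()
    return rejected


if __name__ == "__main__":
    print(demo_constraint_checks())
```

Both invalid inserts raise an integrity error, so bad rows are stopped at the database rather than discovered later.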

Alternative approaches you considered

I considered using a DynamoDB table (which can be run in a Docker container) but decided against it because of its lack of join capabilities.

Trade-offs in terms of performance, cost, and scalability

This project uses Postgres and SQLAlchemy. SQLAlchemy can be a bit slow on inserts. Postgres is still one of the best tools for transactional data, and although it can be costly, the cost is predictable. Analytical work might be better suited to Snowflake, which offers fast but expensive querying. Some people use Snowflake for transactional purposes, but this is controversial. Postgres will handle high volumes of traffic, but DynamoDB might be better suited because of its more reactive throughput.
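If insert speed becomes a concern, one commonly used mitigation is to send rows in batches instead of one statement per row. The sketch below shows the batching pattern only. It uses the built-in sqlite3 module as a stand-in for Postgres, the events table and both helper functions are hypothetical, and it makes no claim about how this project performs its inserts.

```python
import sqlite3
from itertools import islice
from typing import Iterable, Iterator


def chunked(rows: Iterable, size: int) -> Iterator[list]:
    """Yield successive lists of at most `size` items from `rows`."""
    iterator = iter(rows)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


def insert_in_batches(conn: sqlite3.Connection, rows: Iterable, batch_size: int = 1000) -> int:
    """Insert rows batch by batch, committing after each batch."""
    total = 0
    for batch in chunked(rows, batch_size):
        # executemany runs the same statement once per row in the batch.
        conn.executemany(
            "INSERT INTO events (event_id, payload) VALUES (?, ?)", batch
        )
        conn.commit()
        total += len(batch)
    return total


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    connection.execute(
        "CREATE TABLE events (event_id INTEGER PRIMARY KEY, payload TEXT)"
    )
    rows = ((i, f"payload-{i}") for i in range(2500))
    print(insert_in_batches(connection, rows, batch_size=1000))
```

The batch size is a tuning knob: larger batches mean fewer commits, at the cost of more work being redone if a batch fails.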

How you would handle additional data sources (e.g., listening events)

Postgres would still be a brilliant tool for additional sources. The pipeline could use an Airflow instance for orchestrated jobs, ECS for API-related jobs, or Lambda for simple triggered jobs.

Troubleshooting

Some IDEs, such as PyCharm, may struggle to identify modules elsewhere in this repository. Make sure to mark the src directory as 'Sources Root' by right-clicking the folder and selecting Mark Directory as Sources Root.

Testing

Pytest is used for testing. Running the tests creates a dedicated test Docker container, separate from the one created in src/main.py. Run the tests with:

pytest

