Stephen137/Liga-MT-cloud

Liga Młodych Talentów

⚽ ⚽ ⚽ Liga Młodych Talentów (The Young Talent League) is a football competition held across six locations in Poland. Matches are played every fortnight over five rounds. The winter season came to an end last weekend, with just under 600 teams competing in almost 4,500 matches across 23 leagues. Results and league standings are currently maintained in Google Sheets, with a publicly available URL for each city. ⚽ ⚽ ⚽

Project Scope

My aim was to bring the results and standings from all six cities together in one place and give users real-time access to them.

LigaMT ETL Architecture

From raw data to insights, here’s what I did:

  • ✅ Pulled raw data from six Google Sheets URLs
  • ✅ Cleaned & transformed it using Apache Spark in Databricks 🔥
  • ✅ Followed the Medallion architecture (Bronze → Silver → Gold)
  • ✅ Stored the processed data in AWS S3 (Parquet format) 🏗️
  • ✅ Built a Streamlit app to serve insights in real time! 🎨📊
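As a rough illustration of what each Medallion layer is responsible for, here is a minimal sketch in plain Python. It is not the Spark code from this repo: the standard library stands in for Spark, and the column names, sample teams, and the 3/1/0 points rule are all assumptions made for the example, since the real sheet layout and scoring rules are not described here.

```python
import csv
import io
from collections import defaultdict

# Hypothetical raw export for one city (the real sheet layout is an assumption).
RAW_CSV = """city,league,home,away,home_goals,away_goals
Warszawa,U10, Orly ,Sokoly,3,1
Warszawa,U10,Sokoly,Jastrzebie,2,2
Warszawa,U10,Jastrzebie,Orly,,
"""


def bronze(raw_text):
    """Bronze: keep every row as received, as dictionaries of strings."""
    return list(csv.DictReader(io.StringIO(raw_text)))


def silver(rows):
    """Silver: trim whitespace, cast scores to int, drop unplayed matches."""
    cleaned = []
    for row in rows:
        row = {key: (value or "").strip() for key, value in row.items()}
        if not row["home_goals"] or not row["away_goals"]:
            continue  # no result entered yet
        row["home_goals"] = int(row["home_goals"])
        row["away_goals"] = int(row["away_goals"])
        cleaned.append(row)
    return cleaned


def gold(matches):
    """Gold: aggregate matches into standings (assumed 3/1/0 points rule)."""
    table = defaultdict(lambda: {"played": 0, "points": 0, "goal_diff": 0})
    for m in matches:
        diff = m["home_goals"] - m["away_goals"]
        for team, d in ((m["home"], diff), (m["away"], -diff)):
            entry = table[(m["city"], m["league"], team)]
            entry["played"] += 1
            entry["goal_diff"] += d
            entry["points"] += 3 if d > 0 else 1 if d == 0 else 0
    rows = [
        {"city": c, "league": l, "team": t, **stats}
        for (c, l, t), stats in table.items()
    ]
    # Best team first within each city and league.
    return sorted(
        rows,
        key=lambda r: (r["city"], r["league"], -r["points"], -r["goal_diff"], r["team"]),
    )


standings = gold(silver(bronze(RAW_CSV)))
for row in standings:
    print(row)
```

Keeping the three steps as separate functions mirrors the Bronze, Silver and Gold split: the raw rows stay untouched, so cleaning rules can change without re-pulling the sheets.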

Databricks Workflow

[Image: LIGA_MT]

The pipeline scripts are included in this repo under the Bronze, Silver and Gold folders.

Write to Parquet in AWS S3

[Image: s3]
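The write step might look something like the sketch below. It uses the standard Spark `DataFrameWriter` chain (`mode`, `partitionBy`, `parquet`); the bucket name, folder layout, `season` label, and the `city` partition column are hypothetical, and the helper names `gold_path` and `write_gold` are invented for this example rather than taken from the repo.

```python
def gold_path(bucket, table, season):
    """Build the S3 prefix for one gold table (naming scheme is hypothetical)."""
    return f"s3://{bucket}/gold/{table}/season={season}"


def write_gold(df, path, partition_col="city"):
    """Write a Spark DataFrame to `path` as Parquet, partitioned by one column.

    `df` is expected to be a pyspark.sql.DataFrame. "overwrite" replaces any
    previous output at `path`, so rerunning the job does not append duplicates.
    """
    df.write.mode("overwrite").partitionBy(partition_col).parquet(path)
    return path
```

In a Databricks notebook this could be called as `write_gold(standings_df, gold_path("my-bucket", "standings", "winter"))`, assuming `standings_df` exists and the cluster already has credentials for the bucket.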

Serve as a Streamlit app hosted on Streamlit Community Cloud

https://liga-mt-cloud-development-mode-stephen-barrie.streamlit.app/
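For readers curious how such an app could be laid out, here is a minimal sketch, not the code behind the deployed app. The sample rows, city and team names, and the helpers `options`, `filter_rows` and `render` are all hypothetical. The `streamlit` module is passed in as the `st` argument so the filtering logic can be exercised without Streamlit installed; only `st.title`, `st.selectbox` and `st.dataframe` are used.

```python
# Hypothetical gold-layer rows; a real app would load these from the Parquet files in S3.
ROWS = [
    {"city": "Warszawa", "league": "U10", "team": "Orly", "points": 3},
    {"city": "Warszawa", "league": "U10", "team": "Sokoly", "points": 1},
    {"city": "Krakow", "league": "U12", "team": "Smoki", "points": 6},
]


def options(rows, column):
    """Sorted unique values of one column, for use in a select box."""
    return sorted({row[column] for row in rows})


def filter_rows(rows, city, league):
    """Keep only the standings for the chosen city and league."""
    return [r for r in rows if r["city"] == city and r["league"] == league]


def render(st, rows):
    """Draw the page. `st` is the streamlit module, passed in by the caller."""
    st.title("Liga Młodych Talentów")
    city = st.selectbox("City", options(rows, "city"))
    # Only offer leagues that exist in the selected city.
    in_city = [r for r in rows if r["city"] == city]
    league = st.selectbox("League", options(in_city, "league"))
    table = filter_rows(rows, city, league)
    st.dataframe(table)
    return table


# In app.py, started with `streamlit run app.py`:
#   import streamlit as st
#   render(st, ROWS)
```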

About

End-to-end cloud ETL pipeline: clean and transform with Spark in Databricks, write to AWS S3 storage, serve as a Streamlit app
