As part of my Master's in Software and Data Engineering (MSDE), this assignment focuses on exploring, cleaning, and visualizing data using Python libraries introduced in class, such as Pandas and Matplotlib.
The goal of this assignment is to use Python and Jupyter Notebook to explore, analyze, and visualize the provided datasets. You can find the datasets at the following links:
- [Download Airport Dataset](https://drive.google.com/file/d/1MUJrvA0dRDoWGXlIJY9BxhjobL0O8Mg1/view?usp=share_link)
- [Download Countries Dataset](https://drive.google.com/file/d/1mAyCkM2_Y_kLTWpb3dQ2-xBymgsgzQT2/view?usp=share_link)
- [Download Energy Dataset](https://drive.google.com/file/d/12BvtMOuuRCzPawqgSe-nHnHeXGnN73e_/view?usp=share_link)
- [Download Europe GeoJSON Dataset](https://drive.google.com/file/d/1MK3yuScG26-6RcJUR2PU-GZlGUwi_-tz/view?usp=share_link)
- [Download Market Value Decline Dataset](https://drive.google.com/file/d/1BTolE3CDJpe_lP0IBQ9TPnCTATWLFtFI/view?usp=share_link)
- [Download Routes Dataset](https://drive.google.com/file/d/1admk_UHq7fZaFMY7-LAs9L3GKBzmLL8f/view?usp=share_link)
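A minimal sketch of the ingest-and-clean workflow with Pandas and Matplotlib. The column names (`name`, `country`, `passengers`) and the tiny inline frame are illustrative assumptions standing in for the real CSVs, which would normally be loaded with `pd.read_csv`:

```python
import pandas as pd
import matplotlib
matplotlib.use("Agg")  # headless backend so the script also runs without a display
import matplotlib.pyplot as plt

# Illustrative sample standing in for the airport CSV (column names are assumptions)
airports = pd.DataFrame({
    "name": ["Heathrow", "Malpensa", None],
    "country": ["United Kingdom", "Italy", "Italy"],
    "passengers": [79_000_000, None, 9_500_000],
})

# Typical cleaning steps: drop rows missing a name, fill missing numeric values
cleaned = airports.dropna(subset=["name"])
cleaned = cleaned.assign(passengers=cleaned["passengers"].fillna(0).astype(int))

# Quick visual check of the cleaned data
cleaned.plot.bar(x="name", y="passengers")
plt.savefig("passengers.png")
```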
- Clone the repository:

```shell
git clone https://github.com/your-username/visual-analytics.git
cd visual-analytics
```
This project combines data querying, visualization, and the extension of Elasticsearch capabilities through a custom ingestion plugin.
The aim of this assignment is to:
- Explore and analyze a dataset of restaurants using Elasticsearch queries and aggregations.
- Visualize insights through Kibana dashboards and canvas presentations.
- Develop a custom Elasticsearch ingest plugin that performs lookup-based text substitutions during document ingestion.
The dataset used in this assignment is provided as a CSV file at the links below:

- [Download Restaurants Dataset](https://drive.google.com/file/d/1-SQEOkNKFW5VhdHM69CWF9m5nKYuRvHw/view?usp=share_link)
- [Download NDJSON of Restaurants Dataset](https://drive.google.com/file/d/1vNQueoWjHDXbvuk973xo_kC3b9ESu3rh/view?usp=share_link)
- [Download NYC Boroughs GeoJSON Dataset](https://drive.google.com/file/d/18aTi575vHgVT1-XzEm4hsNmshvMyYc9G/view?usp=share_link)
Please ensure your Elasticsearch index is named `restaurants`.
- Indexed `restaurants.csv` into JSON documents
- Crafted advanced search queries and filters (e.g., geolocation, string patterns, numeric ranges)
- Built aggregations for:
- Weighted averages
- Top-N groupings
- Bucket-based analysis
- Created interactive dashboards with:
- Review sentiment trends
- Cost distribution across continents
- Map with vote-based markers
- Heatmap for ratings vs price
- Built a Canvas with:
- Custom filters
- Categorized cost metrics
- Visual summaries of review quality
- Implemented a lookup ingest processor
- Enabled dynamic replacement of coded fields with human-readable values during indexing
- Configured pipeline setup and document transformation via custom plugin
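To give a flavor of the queries and aggregations involved, here is a sketch of two request bodies built as Python dicts. The field names (`votes`, `location`, `cuisine`, `average_cost`) are assumptions about the index mapping, not the actual schema:

```python
# Geo + numeric filter: well-reviewed restaurants within 5 km of a point
geo_query = {
    "query": {
        "bool": {
            "filter": [
                {"range": {"votes": {"gte": 100}}},
                {"geo_distance": {"distance": "5km",
                                  "location": {"lat": 40.73, "lon": -73.99}}},
            ]
        }
    }
}

# Top-N grouping: the 5 most common cuisines with their average cost
top_cuisines_agg = {
    "size": 0,
    "aggs": {
        "by_cuisine": {
            "terms": {"field": "cuisine.keyword", "size": 5},
            "aggs": {"avg_cost": {"avg": {"field": "average_cost"}}},
        }
    },
}
```

Either body would be sent to the `restaurants` index via `POST /restaurants/_search`.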
Example of the substitution performed by the plugin:

Input document:

```json
{ "field1": "Need to optimize the C001 temperature. C010 needs to be changed." }
```

Lookup map:

```json
{ "C001": "tyre", "C010": "front wing" }
```

Output document:

```json
{ "field1": "Need to optimize the tyre temperature. front wing needs to be changed." }
```
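A minimal Python sketch of the substitution the ingest processor performs (the actual plugin is written in Java against the Elasticsearch ingest API; this only mirrors the logic):

```python
import re

def lookup_substitute(text: str, lookup: dict) -> str:
    """Replace every coded token found in `lookup` with its value,
    mirroring what the custom ingest processor does at index time."""
    pattern = re.compile("|".join(re.escape(code) for code in lookup))
    return pattern.sub(lambda m: lookup[m.group(0)], text)

lookup = {"C001": "tyre", "C010": "front wing"}
doc = {"field1": "Need to optimize the C001 temperature. C010 needs to be changed."}
doc["field1"] = lookup_substitute(doc["field1"], lookup)
# → "Need to optimize the tyre temperature. front wing needs to be changed."
```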
As part of my Master's in Software and Data Engineering (MSDE), this assignment focuses on learning how to process and analyze large datasets using Polars and Apache Spark, two powerful tools for high-performance data manipulation.
The goal of this assignment is to:
- Ingest and clean a large dataset using Polars and Apache Spark.
- Perform efficient computations and transformations on the data.
- Generate insightful visualizations based on the processed data.
The datasets used in this assignment are provided in CSV format:

- [Download Trip Data Dataset](https://drive.google.com/file/d/14jnhbmcfedFyaj6EksPQr_DtVx8qiBdb/view?usp=share_link)
- [Download Trip Fare Dataset](https://drive.google.com/file/d/14jnhbmcfedFyaj6EksPQr_DtVx8qiBdb/view?usp=share_link)
- Handled missing values, inconsistent formatting, and outliers.
- Applied efficient column-wise transformations using Polars and Spark APIs.
- Computed key statistics and trends across multiple dimensions.
- Generated visual insights from the processed data using Python tools.
- Clone the repository:

```shell
git clone https://github.com/your-username/your-repo-name.git
cd your-repo-name
```
For this group project, we explored the relationship between tech job markets and cost of living by creating an interactive dashboard using Python tools.
The objective of this group project was to:
- Find and analyze a real-world dataset.
- Use Python libraries covered in class to clean, process, and visualize the data.
- Build an interactive dashboard to explore key insights.
We used several datasets to compute cost-of-living statistics from 2015 to 2025. A folder inside the group project directory contains all the datasets used. They were ingested and merged using Pandas.
- Pandas: for data ingestion, cleaning, and merging
- Dash (by Plotly): for building the interactive web-based dashboard
- CSV: as the data source format
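The ingestion-and-merge step looks roughly like the sketch below. The two inline frames and their column names (`city`, `avg_salary`, `cost_of_living`) are hypothetical stand-ins for the real CSVs in the project folder:

```python
import pandas as pd

# Hypothetical minimal versions of two of the source CSVs
salaries = pd.DataFrame({"city": ["Zurich", "Lisbon"], "avg_salary": [110_000, 45_000]})
living_costs = pd.DataFrame({"city": ["Zurich", "Lisbon"], "cost_of_living": [3_200, 1_400]})

# Join the tables on the shared city column, keeping only cities present in both
merged = salaries.merge(living_costs, on="city", how="inner")
```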
The dashboard allows users to:
- Compare average tech salaries vs cost of living across cities
- Filter by region or job title
- Explore affordability and job density visually
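Behind the affordability view sits a simple derived metric, sketched here with made-up numbers (the column names and figures are illustrative, not the project's real data):

```python
import pandas as pd

# Hypothetical merged table feeding the dashboard (numbers are made up)
df = pd.DataFrame({
    "city": ["Zurich", "Lisbon", "Austin"],
    "avg_salary": [110_000, 45_000, 95_000],
    "monthly_cost": [3_200, 1_400, 2_100],
})

# Affordability index: annual salary divided by annual cost of living
df["affordability"] = df["avg_salary"] / (df["monthly_cost"] * 12)
most_affordable = df.sort_values("affordability", ascending=False).iloc[0]["city"]
```

In the dashboard, Dash callbacks recompute this table after each region or job-title filter and feed it to the charts.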
- Clone the repository:

```shell
git clone https://github.com/your-username/your-repo-name.git
cd your-repo-name
```