
This repository presents three practical projects from the Computer Data Analysis course, showcasing data exploration, clustering, and classification techniques using Python and the Iris dataset.


XEN00000/ComputerDataAnalysis


📘 Computer Data Analysis – Project Summary

This repository contains the full implementation of three projects completed for the Computer Data Analysis course. The projects are based on the well-known Iris dataset and focus on the practical application of data exploration, visualization, and machine learning techniques using Python.

🧩 Project Overview

✅ Project 1 – Exploratory Data Analysis

  • Loaded and preprocessed the dataset.
  • Calculated key descriptive statistics (mean, median, min, max, quartiles, standard deviation).
  • Visualized data distributions using histograms and boxplots.
  • Investigated relationships between features using Pearson correlation and linear regression.
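
The statistics and the correlation/regression step listed above can be sketched with NumPy alone. This is a minimal illustration, not the repository's code: the helper names `describe`, `pearson_r`, and `fit_line` are hypothetical, and the two arrays are made-up values standing in for a pair of Iris columns.

```python
import numpy as np


def describe(values):
    """Return the descriptive statistics listed above for a 1-D array."""
    values = np.asarray(values, dtype=float)
    q1, median, q3 = np.percentile(values, [25, 50, 75])
    return {
        "mean": values.mean(),
        "median": median,
        "min": values.min(),
        "max": values.max(),
        "q1": q1,
        "q3": q3,
        "std": values.std(ddof=1),  # sample standard deviation
    }


def pearson_r(x, y):
    """Pearson correlation: covariance divided by the product of the spreads."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx, dy = x - x.mean(), y - y.mean()
    return (dx * dy).sum() / np.sqrt((dx ** 2).sum() * (dy ** 2).sum())


def fit_line(x, y):
    """Least-squares line y = slope * x + intercept."""
    slope, intercept = np.polyfit(x, y, deg=1)
    return slope, intercept


# Made-up values standing in for two feature columns (not real Iris data).
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([0.5, 1.0, 1.5, 2.0, 2.5])

stats = describe(x)
r = pearson_r(x, y)
slope, intercept = fit_line(x, y)
```

With a pandas DataFrame, most of these statistics are also available through `DataFrame.describe()` and `DataFrame.corr()`.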

✅ Project 2 – Clustering with k-Means

  • Normalized the dataset using min-max scaling.
  • Used the elbow method to determine the optimal number of clusters.
  • Applied the k-means algorithm and visualized the clusters across different feature pairings.
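
The clustering steps above can be sketched as follows. This is an illustrative version under stated assumptions, not the contents of `lib_ksrednich.py`: the function names `min_max_scale` and `k_means` are hypothetical, the data are synthetic 2-D blobs rather than Iris measurements, and the elbow method is represented by the within-cluster sum of squares (WCSS) recorded for each k.

```python
import numpy as np


def min_max_scale(X):
    """Rescale every column to the range [0, 1]."""
    X = np.asarray(X, dtype=float)
    lo, hi = X.min(axis=0), X.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)  # avoid dividing by zero on constant columns
    return (X - lo) / span


def k_means(X, k, n_iter=100, seed=0):
    """Plain Lloyd's algorithm; returns labels, centroids, and WCSS."""
    rng = np.random.default_rng(seed)
    # Start from k distinct data points chosen at random.
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Move each centroid to the mean of its points; keep it if the cluster is empty.
        new_centroids = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    # Final assignment and within-cluster sum of squares for the returned centroids.
    dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    labels = dists.argmin(axis=1)
    wcss = float((dists[np.arange(len(X)), labels] ** 2).sum())
    return labels, centroids, wcss


# Synthetic stand-in data: two blobs in 2-D (not the Iris dataset).
rng = np.random.default_rng(42)
X = np.vstack([rng.normal(0.0, 0.1, (20, 2)), rng.normal(5.0, 0.1, (20, 2))])
X_scaled = min_max_scale(X)

# Elbow method: record WCSS for each k and look for the bend in the curve.
wcss_by_k = {k: k_means(X_scaled, k)[2] for k in range(1, 6)}
```

Plotting `wcss_by_k` against k (for example with matplotlib) gives the elbow curve; the k where the decrease flattens is the usual choice. Because the initial centroids are random, results can vary with `seed`.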

✅ Project 3 – Classification with k-Nearest Neighbors (k-NN)

  • Built a k-NN classifier using a custom implementation.
  • Evaluated classification accuracy for k values from 1 to 15.
  • Generated confusion matrices and plotted accuracy metrics.
  • Repeated the classification for various pairs of features.
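
A k-NN classifier along the lines described above can be sketched in a few NumPy functions. This is not the contents of `lib_knn.py`: the names `knn_predict`, `confusion_matrix`, and `accuracy` are hypothetical, the distance metric (Euclidean), the tie-breaking rule, and the 60/30 train/test split are assumptions, and the data are three synthetic blobs standing in for the three Iris species.

```python
import numpy as np


def knn_predict(X_train, y_train, X_test, k):
    """Predict integer class labels by majority vote among the k nearest training points."""
    dists = np.linalg.norm(X_test[:, None, :] - X_train[None, :, :], axis=2)
    nearest = np.argsort(dists, axis=1)[:, :k]   # indices of the k closest training points
    votes = y_train[nearest]                     # shape (n_test, k)
    n_classes = int(y_train.max()) + 1
    counts = np.array([np.bincount(row, minlength=n_classes) for row in votes])
    return counts.argmax(axis=1)                 # ties go to the lowest class index


def confusion_matrix(y_true, y_pred, n_classes):
    """Rows are true classes, columns are predicted classes."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    np.add.at(cm, (y_true, y_pred), 1)
    return cm


def accuracy(y_true, y_pred):
    return float(np.mean(y_true == y_pred))


# Synthetic stand-in data: three well-separated 2-D blobs (not the Iris dataset).
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [4.0, 4.0], [8.0, 0.0]])
X = np.vstack([rng.normal(c, 0.3, (30, 2)) for c in centers])
y = np.repeat(np.arange(3), 30)

# Shuffle, then hold out the last 30 samples for testing.
order = rng.permutation(len(X))
train, test = order[:60], order[60:]

# Accuracy for k = 1..15, as in the evaluation described above.
accuracy_by_k = {
    k: accuracy(y[test], knn_predict(X[train], y[train], X[test], k))
    for k in range(1, 16)
}
cm = confusion_matrix(y[test], knn_predict(X[train], y[train], X[test], 5), 3)
```

To repeat the experiment for a pair of features, pass only the two chosen columns (for example `X[:, [0, 1]]`) to `knn_predict`.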

🛠 Technologies Used

  • Language: Python
  • Libraries: pandas, numpy, matplotlib, seaborn
  • Custom Modules:
    • lib_ksrednich.py – for clustering
    • lib_knn.py – for classification

📈 Learning Outcomes

This repository demonstrates a full data analysis workflow – from data preprocessing and visualization, through unsupervised clustering, to supervised classification and performance evaluation. It reflects the practical skills and analytical thinking developed throughout the Computer Data Analysis course.

