KNN Iris Classification

Introduction

A k-nearest neighbors (KNN) classifier implemented from scratch in Python and tested on the Iris dataset.

Data description

The dataset contains 50 samples from each of three Iris species (Iris setosa, Iris virginica and Iris versicolor). Each sample has four features, all measured in centimeters: sepal length, sepal width, petal length and petal width. The British statistician and biologist Ronald Fisher introduced it in his 1936 paper 'The use of multiple measurements in taxonomic problems' as an example of linear discriminant analysis.

Methods used

  • Reading data from a CSV file using pandas
  • Exploratory Data Analysis (EDA) using seaborn: pairplot, boxplot, violinplot
  • Implementing the K-nearest neighbors algorithm from scratch
  • Plotting the algorithm's accuracy for different values of k using matplotlib
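
The last two steps can be illustrated with a minimal sketch. This is not the repository's code: the function names (`euclidean_distance`, `knn_predict`, `accuracy`), the choice of Euclidean distance with majority voting, and the handful of hand-typed toy samples are all assumptions made for illustration. It uses only the standard library so it runs on its own.

```python
import math
from collections import Counter


def euclidean_distance(a, b):
    """Straight-line distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_predict(train_X, train_y, query, k=3):
    """Predict the label of `query` by majority vote among its k nearest training points."""
    # Pair each training point's distance to the query with its label, closest first.
    distances = sorted(
        (euclidean_distance(x, query), label) for x, label in zip(train_X, train_y)
    )
    # Keep the labels of the k closest points and return the most common one.
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


def accuracy(train_X, train_y, test_X, test_y, k):
    """Fraction of test points whose predicted label matches the true label."""
    hits = sum(knn_predict(train_X, train_y, x, k) == y for x, y in zip(test_X, test_y))
    return hits / len(test_y)


# Toy samples for illustration only (sepal length, sepal width, petal length, petal width in cm).
train_X = [
    (5.1, 3.5, 1.4, 0.2), (4.9, 3.0, 1.4, 0.2),
    (7.0, 3.2, 4.7, 1.4), (6.4, 3.2, 4.5, 1.5),
    (6.3, 3.3, 6.0, 2.5), (5.8, 2.7, 5.1, 1.9),
]
train_y = ["setosa", "setosa", "versicolor", "versicolor", "virginica", "virginica"]
test_X = [(4.7, 3.2, 1.3, 0.2), (6.9, 3.1, 4.9, 1.5), (7.1, 3.0, 5.9, 2.1)]
test_y = ["setosa", "versicolor", "virginica"]

# Accuracy for a few values of k; this mapping is what would be plotted against k.
accuracy_by_k = {k: accuracy(train_X, train_y, test_X, test_y, k) for k in (1, 3)}
print(accuracy_by_k)
```

With the full dataset, the same `accuracy_by_k` mapping could be drawn with matplotlib, for example by passing its keys and values to `plt.plot`.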

Technologies used

  • Python 3.8.8
  • Pandas 1.2.4
  • NumPy 1.20.1
  • Matplotlib 3.3.4
  • Seaborn 0.11.1