This project preprocesses several datasets and applies classification, clustering, and association rule mining to them.
Classification for discrete data. The dataset used here is the Balance Scale Weight and Distance dataset.
Classification for continuous data. The dataset used here is the Rice Seed Dataset (Gonen & Jasmine).
Clustering. The dataset used here is Online Retail Customer Dataset.
Association Rule Mining. The dataset used here is Groceries Market Basket Dataset.
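
As a rough illustration of what association rule mining measures, the sketch below computes support, confidence, and lift for one rule in plain Python. The baskets are made up for the example and are not taken from the Groceries Market Basket Dataset, and the helper names (`support`, `confidence`, `lift`) are hypothetical, not the project's own code.

```python
# Toy baskets, invented for illustration only (not real Groceries data).
baskets = [
    {"milk", "bread"},
    {"milk", "butter"},
    {"milk", "bread", "butter"},
    {"bread"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(antecedent, consequent, transactions):
    """support(antecedent and consequent together) / support(antecedent)."""
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / support(antecedent, transactions)

def lift(antecedent, consequent, transactions):
    """Confidence of the rule relative to the consequent's baseline support."""
    return confidence(antecedent, consequent, transactions) / support(
        consequent, transactions
    )

# Rule under test: {milk} -> {bread}
rule_support = support({"milk", "bread"}, baskets)
rule_confidence = confidence({"milk"}, {"bread"}, baskets)
rule_lift = lift({"milk"}, {"bread"}, baskets)
print(rule_support, rule_confidence, rule_lift)
```

A lift below 1, as in this toy case, suggests the two items appear together less often than they would if they were independent. On a full dataset, a library implementation of Apriori or FP-Growth would normally replace these hand-written helpers.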
The dataset used here is the Breast Cancer Wisconsin Diagnostic Data Set. This dataset contains 569 records and 32 features (including the Id). The features represent various parameters that might be useful in predicting whether a tumor is malignant or benign.
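
A minimal preprocessing sketch for a dataset of this shape might look like the following. It assumes the file is a CSV with an `id` column and a `diagnosis` column holding `M` (malignant) or `B` (benign). Those column names, the two feature names, and the sample rows are assumptions made for illustration; they are not taken from the project.

```python
import csv
import io

# Two made-up rows in the assumed layout; the real file has far more
# rows and columns.
raw = io.StringIO(
    "id,diagnosis,radius_mean,texture_mean\n"
    "1,M,17.5,10.2\n"
    "2,B,12.1,18.4\n"
)

def preprocess(rows):
    """Drop the identifier, encode the label, and cast features to float."""
    features, labels = [], []
    for row in rows:
        row = dict(row)
        row.pop("id", None)  # the Id carries no predictive information
        labels.append(1 if row.pop("diagnosis") == "M" else 0)
        features.append({name: float(value) for name, value in row.items()})
    return features, labels

features, labels = preprocess(csv.DictReader(raw))
print(labels)
print(features[0])
```

Dropping the Id before training keeps the classifier from fitting to an arbitrary record identifier, and encoding the diagnosis as 0/1 gives the classifier a numeric target.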