In this repo you will learn how to deal with outliers, skewness, and inconsistent data, parse dates, handle missing values with different techniques, and work with different data types (e.g. numerical and categorical data).

tawfikhammad/Data-cleaning-tutorial

Data Cleaning

This repository contains code and resources for handling data inconsistencies, parsing dates, applying data transformations (log, square root, Box-Cox), handling outliers, and imputing missing values.


Usage

To use the code in this repository, follow these steps:

  • Clone the repository: git clone https://github.com/tawfikhammad/Data-cleaning-tutorial.git

  • Install the required dependencies. Assuming you have Python and pip installed, run the following command: pip install -r requirements.txt


Content

The code and resources in this repository are organized by topic:

Data Inconsistency: This section provides techniques and code snippets to identify and handle inconsistent data. Refer to Data Inconsistency for detailed information.
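As an illustration of the kind of cleanup this topic covers, here is a minimal sketch that normalizes a text column so that variants differing only in case or whitespace collapse to one value. It assumes pandas is among the dependencies; the `country` column and the `normalize_text` helper are hypothetical and not taken from the repository.

```python
import pandas as pd


def normalize_text(series: pd.Series) -> pd.Series:
    """Trim, lowercase, and collapse repeated whitespace in a text column."""
    return (
        series.astype("string")
        .str.strip()
        .str.lower()
        .str.replace(r"\s+", " ", regex=True)
    )


# Hypothetical data: the same two countries written five different ways.
df = pd.DataFrame(
    {"country": ["Germany", " germany", "GERMANY ", "South  Korea", "south korea"]}
)
df["country_clean"] = normalize_text(df["country"])

# Inspect the distinct values before and after cleaning.
unique_before = df["country"].nunique()
unique_after = sorted(df["country_clean"].unique().tolist())
print(unique_before, unique_after)
```

Listing the unique values first is usually the quickest way to spot inconsistencies; misspellings that differ by more than case or spacing would need an explicit mapping or fuzzy matching, which this sketch does not attempt.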

Parsing Dates: If you need to parse dates from different formats, check out Parsing Dates for code examples and guidelines.
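A minimal sketch of parsing a column that mixes date formats, assuming pandas is available. The `parse_dates` helper, the sample values, and the list of formats are hypothetical; the approach tries each explicit format in turn and keeps the first successful parse, leaving unparseable entries as `NaT`.

```python
import pandas as pd


def parse_dates(series: pd.Series, formats: list[str]) -> pd.Series:
    """Try each format in order; values no format matches become NaT."""
    result = pd.to_datetime(series, format=formats[0], errors="coerce")
    for fmt in formats[1:]:
        # Fill only the entries that earlier formats could not parse.
        result = result.fillna(pd.to_datetime(series, format=fmt, errors="coerce"))
    return result


# Hypothetical raw column with two formats and one invalid entry.
raw = pd.Series(["2021-03-15", "15/03/2021", "unknown"])
parsed = parse_dates(raw, ["%Y-%m-%d", "%d/%m/%Y"])
print(parsed)
```

Passing explicit formats avoids silently guessing between day-first and month-first dates. Once parsed, components such as `parsed.dt.month` become available.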

Data Transformation:

  • Log Transformation: Learn how to apply a logarithmic transformation to your data using the techniques outlined in Log Transformation.
  • Square Root Transformation: Explore the benefits of applying a square root transformation to your data. Refer to Square Root Transformation for more information.
  • Box-Cox Transformation: Understand how the Box-Cox transformation can bring skewed data closer to a normal distribution. It requires strictly positive values. Find implementation details in Box-Cox Transformation.
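The three transformations listed above can be compared side by side with a short sketch. It assumes NumPy, pandas, and SciPy are installed, and it uses synthetic right-skewed data rather than any dataset from the repository. `np.log1p` is used here as one common variant of the log transform because it tolerates zeros; that choice is an assumption.

```python
import numpy as np
import pandas as pd
from scipy import stats

# Synthetic right-skewed, strictly positive data (hypothetical example).
rng = np.random.default_rng(0)
values = pd.Series(rng.lognormal(mean=0.0, sigma=1.0, size=1000))

log_values = np.log1p(values)   # log(1 + x), defined for x >= 0
sqrt_values = np.sqrt(values)   # milder than log, defined for x >= 0

# Box-Cox needs strictly positive input; it also returns the fitted lambda.
boxcox_array, fitted_lambda = stats.boxcox(values.to_numpy())
boxcox_values = pd.Series(boxcox_array, index=values.index)

# Compare skewness before and after each transformation.
skewness = pd.Series(
    {
        "raw": values.skew(),
        "log1p": log_values.skew(),
        "sqrt": sqrt_values.skew(),
        "boxcox": boxcox_values.skew(),
    }
)
print(skewness.round(3))
print("fitted lambda:", round(float(fitted_lambda), 3))
```

A skewness closer to zero indicates a more symmetric distribution. Which transformation works best depends on the data, so comparing them on the actual column is more reliable than picking one by default.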

Handling Outliers: Discover effective methods for detecting and handling outliers in your dataset. See Handling Outliers for code examples and recommendations.
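One widely used detection method is the interquartile range (IQR) rule, sketched below with pandas. This is an illustrative assumption, not necessarily the method used in the repository; the `iqr_bounds` helper, the sample values, and the multiplier `k=1.5` are hypothetical.

```python
import pandas as pd


def iqr_bounds(series: pd.Series, k: float = 1.5) -> tuple[float, float]:
    """Return (lower, upper) limits at k * IQR beyond the quartiles."""
    q1 = series.quantile(0.25)
    q3 = series.quantile(0.75)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


# Hypothetical measurements with one obviously extreme value.
values = pd.Series([10, 12, 11, 13, 12, 11, 95])
lower, upper = iqr_bounds(values)

# Detection: flag values outside the limits.
is_outlier = (values < lower) | (values > upper)

# Two common ways to handle them: drop the rows, or cap them at the limits.
without_outliers = values[~is_outlier]
capped = values.clip(lower=lower, upper=upper)
print(values[is_outlier].tolist(), lower, upper)
```

Dropping removes information and shrinks the dataset, while capping keeps every row but distorts the extreme values. Whether either is appropriate depends on whether the outliers are errors or genuine observations.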

Imputation: Imputation methods aim to estimate the missing values based on the available information in the dataset.
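A minimal sketch of simple imputation with pandas: the median for a numerical column and the most frequent value (mode) for a categorical one. The column names and values are hypothetical, and these two strategies are only examples of the "different techniques" mentioned above.

```python
import numpy as np
import pandas as pd

# Hypothetical dataset with one missing value per column.
df = pd.DataFrame(
    {
        "age": [25.0, np.nan, 35.0, 40.0],
        "city": ["Cairo", "Giza", None, "Cairo"],
    }
)

filled = df.copy()

# Numerical column: the median is less sensitive to outliers than the mean.
filled["age"] = filled["age"].fillna(filled["age"].median())

# Categorical column: use the most frequent category.
most_common_city = filled["city"].mode().iloc[0]
filled["city"] = filled["city"].fillna(most_common_city)

print(filled)
```

Single-value imputation like this reduces the variance of the column, so model-based approaches may be preferable when a large share of the values is missing.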

Images: This repository contains images related to the project. You can find them in the images folder.
