komalverma183/Data-cleaning-using-SQL

Data Cleaning Project with SQL

Overview

This repository contains the code, data, and documentation for a project that cleans and prepares a dataset for analysis using SQL.

Goals

The primary goals of this project are to:

- Identify and address data quality issues within the dataset
- Enhance the accuracy and consistency of the data
- Transform the data into a format suitable for further analysis and visualization

Dataset

The dataset used in this project is the Nashville Housing Data.

Data Exploration: Examine the structure and content of the dataset to understand its characteristics and identify potential issues. Use descriptive statistics, visualizations, and SQL queries to examine data distributions and detect anomalies.
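A minimal sketch of the exploration step, run through Python's built-in sqlite3 module so that it is self-contained. The table name `housing`, its columns, and the sample rows are hypothetical stand-ins and are not taken from the repository; the queries count rows, missing addresses, and distinct parcels, and list the labels used in a categorical column.

```python
import sqlite3

# In-memory database with a tiny, made-up sample (not the real dataset).
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE housing (
        parcel_id        TEXT,
        property_address TEXT,
        sale_date        TEXT,
        sale_price       INTEGER,
        sold_as_vacant   TEXT
    )
    """
)
rows = [
    ("A1", "100 MAIN ST, NASHVILLE", "2013-04-09", 240000, "No"),
    ("A1", None, "2014-06-10", 366000, "N"),
    ("B2", "22 OAK AVE, GOODLETTSVILLE", "2016-09-26", 435000, "Yes"),
    ("C3", "5 ELM CT, NASHVILLE", "2016-01-29", 255000, "Y"),
]
conn.executemany("INSERT INTO housing VALUES (?, ?, ?, ?, ?)", rows)

# Basic profile: row count, missing addresses, distinct parcels, price range.
profile = conn.execute(
    """
    SELECT COUNT(*)                      AS total_rows,
           SUM(property_address IS NULL) AS missing_addresses,
           COUNT(DISTINCT parcel_id)     AS distinct_parcels,
           MIN(sale_price)               AS min_price,
           MAX(sale_price)               AS max_price
    FROM housing
    """
).fetchone()

# Frequency of each label; mixed spellings show up as separate groups.
label_counts = conn.execute(
    """
    SELECT sold_as_vacant, COUNT(*)
    FROM housing
    GROUP BY sold_as_vacant
    ORDER BY sold_as_vacant
    """
).fetchall()

print(profile)
print(label_counts)
```

In this sample the label query returns four groups (`N`, `No`, `Y`, `Yes`) for what should be two values, which is the kind of inconsistency the cleaning step then addresses.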

Data Cleaning:
- Handle missing values: Apply strategies like imputation or removal based on the nature of missing data.
- Correct inconsistencies: Identify and rectify discrepancies in data formatting, naming conventions, or value ranges.
- Remove duplicates: Identify and eliminate identical records.
- Handle outliers: Detect and address extreme values that may distort analysis.
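Three of the cleaning steps above (missing values, inconsistencies, duplicates) might look like the following sketch. It assumes a hypothetical `housing` table with made-up rows, and it assumes that rows sharing a `parcel_id` share an address; neither assumption comes from the repository. The duplicate removal relies on SQLite's implicit `rowid`, so other databases would need a different key or a window function.

```python
import sqlite3

# Hypothetical table and sample rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE housing (
        parcel_id        TEXT,
        property_address TEXT,
        sale_date        TEXT,
        sale_price       INTEGER,
        sold_as_vacant   TEXT
    )
    """
)
rows = [
    ("A1", "100 MAIN ST, NASHVILLE", "2013-04-09", 240000, "No"),
    ("A1", None, "2014-06-10", 366000, "N"),                        # missing address
    ("B2", "22 OAK AVE, GOODLETTSVILLE", "2016-09-26", 435000, "Y"),
    ("B2", "22 OAK AVE, GOODLETTSVILLE", "2016-09-26", 435000, "Y"),  # exact duplicate
]
conn.executemany("INSERT INTO housing VALUES (?, ?, ?, ?, ?)", rows)

# 1. Missing values: copy the address from another row with the same parcel.
conn.execute(
    """
    UPDATE housing
    SET property_address = (
        SELECT MAX(h2.property_address)
        FROM housing AS h2
        WHERE h2.parcel_id = housing.parcel_id
          AND h2.property_address IS NOT NULL
    )
    WHERE property_address IS NULL
    """
)

# 2. Inconsistencies: map the short labels onto the long ones.
conn.execute(
    """
    UPDATE housing
    SET sold_as_vacant = CASE sold_as_vacant
                             WHEN 'Y' THEN 'Yes'
                             WHEN 'N' THEN 'No'
                             ELSE sold_as_vacant
                         END
    """
)

# 3. Duplicates: keep the first row of each identical group (SQLite rowid).
conn.execute(
    """
    DELETE FROM housing
    WHERE rowid NOT IN (
        SELECT MIN(rowid)
        FROM housing
        GROUP BY parcel_id, property_address, sale_date, sale_price, sold_as_vacant
    )
    """
)

cleaned = conn.execute(
    """
    SELECT parcel_id, property_address, sale_date, sale_price, sold_as_vacant
    FROM housing
    ORDER BY rowid
    """
).fetchall()
for row in cleaned:
    print(row)
```

The order matters here: filling addresses and standardizing labels first means that rows differing only by a missing value or a label spelling are recognized as duplicates in the final step.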

Data Transformation:
- Standardize data formats: Ensure consistency in data types, date formats, and units of measurement.
- Create new features: Derive additional variables or aggregate data as needed for analysis.
- Normalize or scale data: Adjust values to a common scale for certain statistical analyses.
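The transformation steps above can be sketched in the same way. The table, column names, and rows below are hypothetical, and the address split assumes every address contains exactly one comma separating street and city. Min-max scaling is used as one possible normalization; the repository does not state which method it applies.

```python
import sqlite3

# Hypothetical raw table; sale_date carries a redundant time component.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE housing (
        parcel_id        TEXT,
        property_address TEXT,
        sale_date        TEXT,
        sale_price       INTEGER
    )
    """
)
rows = [
    ("A1", "100 MAIN ST, NASHVILLE", "2013-04-09 00:00:00", 240000),
    ("B2", "22 OAK AVE, GOODLETTSVILLE", "2016-09-26 00:00:00", 435000),
    ("C3", "5 ELM CT, NASHVILLE", "2016-01-29 00:00:00", 337500),
]
conn.executemany("INSERT INTO housing VALUES (?, ?, ?, ?)", rows)

# Build a transformed copy instead of overwriting the raw table.
conn.execute(
    """
    CREATE TABLE housing_clean AS
    SELECT
        parcel_id,
        -- New features: split "street, city" at the comma.
        TRIM(SUBSTR(property_address, 1, INSTR(property_address, ',') - 1)) AS street,
        TRIM(SUBSTR(property_address, INSTR(property_address, ',') + 1))    AS city,
        -- Standardized format: keep only the ISO date part.
        DATE(sale_date)                               AS sale_date,
        CAST(STRFTIME('%Y', sale_date) AS INTEGER)    AS sale_year,
        sale_price,
        -- Min-max scaling of the price to the range 0..1.
        (sale_price - (SELECT MIN(sale_price) FROM housing)) * 1.0
            / ((SELECT MAX(sale_price) FROM housing)
               - (SELECT MIN(sale_price) FROM housing))  AS price_scaled
    FROM housing
    """
)

transformed = conn.execute(
    """
    SELECT parcel_id, street, city, sale_date, sale_year, price_scaled
    FROM housing_clean
    ORDER BY parcel_id
    """
).fetchall()
for row in transformed:
    print(row)
```

Writing the result to a separate `housing_clean` table keeps the raw data available for comparison, at the cost of extra storage. The scaling expression divides by zero if all prices are equal, so a real pipeline would need to guard against that case.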
