mkhalidh/ML_Working-with-CSV-files

Working with CSV Files in Python using Pandas

This repository contains Jupyter Notebook tutorials demonstrating various techniques and parameters for working with CSV files using the Pandas library in Python.

Topics Covered

  1. Basic CSV Operations

    • Importing pandas
    • Reading local CSV files
    • Reading CSV files from URLs
  2. CSV Reading Parameters

    • sep parameter - Specifying different delimiters
    • index_col parameter - Setting specific columns as index
    • header parameter - Handling column headers
    • usecols parameter - Selecting specific columns to read
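    • squeeze parameter - Returning a Series when the result has a single column (removed in pandas 2.0; call .squeeze("columns") on the DataFrame instead)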
    • skiprows/nrows parameters - Controlling row reading
    • encoding parameter - Handling different file encodings
    • error_bad_lines parameter - Handling parsing errors (removed in pandas 2.0; replaced by on_bad_lines)
    • dtype parameter - Specifying column data types
    • parse_dates parameter - Handling date columns
    • converters parameter - Transforming data during import
    • na_values parameter - Specifying custom NA/NaN values
    • Working with large datasets using chunksize
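
As a rough illustration of how several of these parameters combine in one call, the sketch below reads a small in-memory table. The data, column names, and the "missing" placeholder are made up for this example and do not come from the repository's datasets.

```python
import io

import pandas as pd

# Hypothetical semicolon-delimited data standing in for a real CSV file.
raw = io.StringIO(
    "id;city;joined;score;notes\n"
    "1;Lahore;2021-03-01;7.5;a\n"
    "2;Karachi;2021-04-15;missing;b\n"
    "3;Multan;2021-05-20;9.0;c\n"
)

df = pd.read_csv(
    raw,
    sep=";",                                    # delimiter is ';' instead of ','
    index_col="id",                             # use the 'id' column as the index
    usecols=["id", "city", "joined", "score"],  # skip the 'notes' column
    dtype={"city": "string"},                   # force a specific column dtype
    parse_dates=["joined"],                     # parse 'joined' as datetime
    na_values=["missing"],                      # treat 'missing' as NaN
)

print(df)
print(df.dtypes)
```

The same keyword arguments apply when the first argument is a local path or a URL instead of a `StringIO` object.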

Files in the Repository

  • 1st.ipynb: Main Jupyter notebook containing all examples and explanations
  • aug_train.csv: Sample dataset used for demonstrations
  • Psl_Complete_Dataset(2016-2024).csv: PSL (Pakistan Super League) dataset used for specific examples

Usage

To run these examples:

  1. Make sure you have Python installed on your system
  2. Install required libraries:
    pip install pandas notebook
    
  3. Open the Jupyter notebook 1st.ipynb to see the examples and run them interactively

Key Features Demonstrated

  • Reading CSV files with different parameters
  • Handling different types of data
  • Working with large datasets efficiently
  • Data transformation during import
  • Error handling and data cleaning
  • Custom data conversions

Getting Started

Clone this repository and open the Jupyter notebook to start learning about different ways to work with CSV files using Pandas:

git clone https://github.com/mkhalidh/ML_Working-with-CSV-files.git
cd ML_Working-with-CSV-files
jupyter notebook

Prerequisites

  • Python 3.x
  • Pandas library
  • Jupyter Notebook

Author

  • Khalid

Acknowledgments

These notes are based on a comprehensive tutorial about working with CSV files using Pandas. The examples demonstrate practical use cases and common scenarios when dealing with CSV data in Python.


Feel free to contribute to this repository by creating pull requests or reporting issues!