Unlocking the dynamics of Indian flight pricing through data, visuals, and storytelling!
Flight fares in India fluctuate with many factors: airline, route, timing, booking lead time, and more. This repository aims to uncover patterns and insights from Indian airfare data using a blend of Python, SQL, visualization, and statistical analysis.
| Step | Notebook | Description |
|---|---|---|
| 1 | 1. Data_cleaning_preprocessing.ipynb | Import & cleanse data, handle missing values, and encode features. |
| 2 | 2. Exploratory_data_analysis.ipynb | Dive into fare distributions and route-level breakdowns. |
| 3 | 3. Statistical_analysis.ipynb | Apply statistical tests (t-tests, ANOVA) to identify factors that significantly influence fares. |
| 4 | 4. sql_analysis.ipynb | Query structured data for airline and route aggregations. |
| 5 | 5. Visualization.ipynb | Final charts and dashboards for storytelling. |
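As a rough illustration of what step 1 involves, the sketch below handles missing values and encodes categorical features on a tiny invented table. The column names (`airline`, `stops`, `fare`) and all values are assumptions made for the example; the actual notebook and dataset may use different names and a different cleaning strategy.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the raw data. Column names and values are hypothetical.
raw = pd.DataFrame({
    "airline": ["IndiGo", "Air India", None, "Vistara"],
    "stops": ["non-stop", "1 stop", "non-stop", None],
    "fare": [4500.0, np.nan, 5200.0, 7100.0],
})

# Drop rows with no airline, since the carrier cannot be inferred.
clean = raw.dropna(subset=["airline"]).copy()

# Fill missing fares with the median of the remaining observed fares.
clean["fare"] = clean["fare"].fillna(clean["fare"].median())

# Mark missing categorical values with an explicit placeholder.
clean["stops"] = clean["stops"].fillna("unknown")

# One-hot encode the categorical columns for downstream analysis.
encoded = pd.get_dummies(clean, columns=["airline", "stops"])

print(encoded.columns.tolist())
```

Median imputation and one-hot encoding are only one reasonable choice here; dropping incomplete rows or using ordinal codes for `stops` would be equally valid depending on the analysis.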
- cleaned_flight_data.csv: the primary processed dataset (30 MB), used by all analysis notebooks.
- Indian Airlines.csv: the original raw airfare data, included for transparency.
- Language: Python (Jupyter Notebooks)
- Libraries:
  - `pandas`, `numpy` (Data Manipulation)
  - `matplotlib`, `seaborn` (Visualization)
  - `scipy.stats` (Statistical Rigor)
- SQL: Embedded SQLite/PandasSQL queries for structured analysis.
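To give a feel for the embedded SQL approach, here is a minimal sketch that loads a DataFrame into an in-memory SQLite database and runs an airline-level aggregation. It uses the standard-library `sqlite3` module rather than PandasSQL, and the table name `flights`, the column names, and the fares are all invented for the example.

```python
import sqlite3

import pandas as pd

# Hypothetical sample rows. The real dataset's columns may differ.
flights = pd.DataFrame({
    "airline": ["IndiGo", "IndiGo", "Air India", "Air India"],
    "route": ["DEL-BOM", "DEL-BLR", "DEL-BOM", "DEL-BLR"],
    "fare": [4000.0, 5000.0, 6000.0, 8000.0],
})

# Load the DataFrame into an in-memory SQLite database.
conn = sqlite3.connect(":memory:")
flights.to_sql("flights", conn, index=False)

# Aggregate flight counts and average fare per airline.
query = """
    SELECT airline, COUNT(*) AS n_flights, AVG(fare) AS avg_fare
    FROM flights
    GROUP BY airline
    ORDER BY avg_fare DESC
"""
by_airline = pd.read_sql_query(query, conn)
conn.close()

print(by_airline)
```

Swapping `GROUP BY airline` for `GROUP BY route` gives the route-level counterpart of the same aggregation.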
- How fare variation differs by airline, route, and number of stops.
- Whether pricing differences between carriers are statistically significant.
- Identification of the most expensive and cheapest flight corridors in India.
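The questions above could be approached along the following lines. This is a sketch on invented data, not the notebooks' actual code: the carrier labels `A`, `B`, `C`, the route codes, and the fares are placeholders, and a one-way ANOVA via `scipy.stats.f_oneway` is assumed as the test for carrier-level differences.

```python
import pandas as pd
from scipy import stats

# Invented fares for three placeholder carriers on two placeholder routes.
flights = pd.DataFrame({
    "airline": ["A"] * 4 + ["B"] * 4 + ["C"] * 4,
    "route": ["DEL-BOM", "DEL-BOM", "DEL-BLR", "DEL-BLR"] * 3,
    "fare": [4000, 4200, 5000, 5100,
             6000, 6300, 7000, 7200,
             4100, 4300, 5050, 5200],
})

# One-way ANOVA: do mean fares differ across airlines?
groups = [g["fare"].to_numpy() for _, g in flights.groupby("airline")]
f_stat, p_value = stats.f_oneway(*groups)
print(f"F = {f_stat:.2f}, p = {p_value:.4f}")

# Rank corridors by mean fare to find the extremes.
route_means = flights.groupby("route")["fare"].mean()
most_expensive = route_means.idxmax()
cheapest = route_means.idxmin()
print(f"Most expensive: {most_expensive}, cheapest: {cheapest}")
```

A small p-value only indicates that at least one carrier's mean fare differs from the others; a post-hoc comparison would be needed to say which one.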