
Baybordi/Flight-Price-Prediction


Flight Price Prediction with Random Forest 🚀

Overview

This project predicts flight prices with a Random Forest Regressor, an ensemble of decision trees. Missing values in the dataset are imputed during preprocessing, and the model is evaluated with 5-fold cross-validation using the R-squared score as the metric.

Key Features

  • ✅ **Random Forest Regressor:** An ensemble learning model used to predict continuous variables such as flight prices.
  • ✅ **Cross-Validation:** Uses 5-fold cross-validation to estimate how well the model generalizes and to detect overfitting.
  • ✅ **R-squared Metric:** Uses R-squared to measure how much of the variance in prices the predictions explain.
  • ✅ **Efficient Model Training:** The model is trained on imputed data, so missing values do not block training.
  • ✅ **Scalable:** The trees in a Random Forest are trained independently, so training can run in parallel on larger datasets, and the same approach extends to other regression problems.
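
The features above can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: the synthetic data from `make_regression`, the median imputation strategy, and `n_estimators=50` are all assumptions chosen to keep the example self-contained. Placing the imputer inside a pipeline means it is refit on the training part of each fold.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.impute import SimpleImputer
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

# Synthetic stand-in for the flight data (assumption: the real dataset is not shown here).
X, y = make_regression(n_samples=300, n_features=6, noise=10.0, random_state=0)

# Blank out roughly 5% of the feature values to mimic missing data.
rng = np.random.default_rng(0)
X[rng.random(X.shape) < 0.05] = np.nan

# Impute missing values, then fit the forest (hypothetical settings).
model = make_pipeline(
    SimpleImputer(strategy="median"),
    RandomForestRegressor(n_estimators=50, random_state=0),
)

# 5-fold cross-validation scored with R-squared.
scores = cross_val_score(model, X, y, cv=5, scoring="r2")
print("R-squared per fold:", np.round(scores, 3))
print("Mean R-squared:", round(scores.mean(), 3))
```

The per-fold scores give a range comparable to the one reported under Results & Insights, although the numbers from synthetic data will differ from those of the real dataset.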

Technologies Used

  • Python 3.x
  • scikit-learn: For machine learning models and evaluation metrics
  • pandas and numpy: For data manipulation and preprocessing
  • Random Forest Regressor: For predicting flight prices
  • Cross-validation: For model evaluation

Results & Insights

  • 📌 Cross-validated R-squared scores: Across the 5 folds, R-squared scores range from 0.82 to 0.88, so performance is consistent from fold to fold.
  • 📌 Final R-squared score: After training on the entire dataset, the Random Forest model achieved an R-squared score of 0.8542, meaning it explains about 85.42% of the variance in flight prices.
  • 📌 Imputed Data Handling: Missing values are imputed before training, so the model can learn from records with incomplete information.
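
For readers unfamiliar with the metric, R-squared is one minus the ratio of the residual sum of squares to the total sum of squares. The short sketch below computes it by hand; the function name `r_squared` and the four example prices are made up for illustration and are not taken from the project's data.

```python
def r_squared(y_true, y_pred):
    """Return 1 - SS_res / SS_tot for two equal-length sequences."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # unexplained variation
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)             # total variation
    return 1.0 - ss_res / ss_tot

# Hypothetical prices and predictions, for illustration only.
prices = [100.0, 150.0, 200.0, 250.0]
predicted = [110.0, 140.0, 210.0, 240.0]

score = r_squared(prices, predicted)
print(round(score, 3))  # 0.968
```

A score of 1.0 means perfect predictions, and a score of 0.0 means the model does no better than always predicting the mean price. Note that a score computed on the same data the model was trained on tends to be more optimistic than a cross-validated one.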

Future Work

  • 🔹 Hyperparameter Tuning: Explore tuning Random Forest parameters such as n_estimators and max_depth to improve model performance further.
  • 🔹 Feature Engineering: Experiment with additional features such as flight duration, airline, or departure time to enhance the model's predictive power.
  • 🔹 Comparison with Other Models: Compare the performance of Random Forest with other models such as Gradient Boosting or XGBoost to assess which provides the best results for flight price prediction.
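
One possible way to approach the tuning and comparison items is sketched below. It is an assumption-laden example rather than a plan from the repository: the data is synthetic, the parameter grid is deliberately tiny, and scikit-learn's `GradientBoostingRegressor` stands in for the boosting models mentioned above.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.model_selection import GridSearchCV, cross_val_score

# Synthetic stand-in for the flight data (assumption).
X, y = make_regression(n_samples=200, n_features=6, noise=10.0, random_state=0)

# Small illustrative grid over the two parameters named above.
param_grid = {"n_estimators": [25, 50], "max_depth": [4, None]}
search = GridSearchCV(
    RandomForestRegressor(random_state=0),
    param_grid,
    cv=5,
    scoring="r2",
)
search.fit(X, y)
print("Best parameters:", search.best_params_)
print("Best mean R-squared:", round(search.best_score_, 3))

# Score a gradient boosting model with the same 5-fold setup for comparison.
gb_scores = cross_val_score(
    GradientBoostingRegressor(random_state=0), X, y, cv=5, scoring="r2"
)
print("Gradient Boosting mean R-squared:", round(gb_scores.mean(), 3))
```

Using the same folds and the same metric for every candidate keeps the comparison fair. On the real dataset a wider grid would likely be worthwhile.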

Clone the repository: git clone https://github.com/Baybordi/Flight-Price-Prediction.git
