IronKaggle

One-day competition from Ironhack's Data Analytics bootcamp. The goal was to build a sales prediction model that would then be verified. Cleaned a raw dataset of sales from different stores, applied feature engineering to select features, then trained two different models, xgboost and Random Forest Regressor, and compared their scores. Weighed the bias / variance trade-off to decide between them: chose the second.

The model was later verified by the teacher on a new dataset and ended up being the winner.
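
The comparison described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the data is synthetic, R² is an assumed metric (the README does not say which score was used), and scikit-learn's GradientBoostingRegressor stands in for xgboost so the example needs nothing beyond sklearn and NumPy. The gap between train and test score is one simple way to eyeball variance (overfitting).

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the cleaned sales features (assumption, not the real data).
rng = np.random.default_rng(0)
X = rng.normal(size=(400, 4))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.5, size=400)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

models = {
    # GradientBoostingRegressor stands in for xgboost here (assumption).
    "gradient_boosting": GradientBoostingRegressor(random_state=0),
    "random_forest": RandomForestRegressor(n_estimators=100, random_state=0),
}

scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    train_r2 = r2_score(y_train, model.predict(X_train))
    test_r2 = r2_score(y_test, model.predict(X_test))
    # A large train/test gap hints at high variance (overfitting).
    scores[name] = {
        "train_r2": train_r2,
        "test_r2": test_r2,
        "gap": train_r2 - test_r2,
    }

for name, s in scores.items():
    print(
        f"{name}: train R2={s['train_r2']:.3f}, "
        f"test R2={s['test_r2']:.3f}, gap={s['gap']:.3f}"
    )
```

With real data, the same loop would run on the engineered features, and the model with the better balance of test score and train/test gap would be kept.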

Technical Requirements

Data Cleaning and Manipulation: checking and dropping null values / rows / columns, dealing with duplicates, formatting and filtering data;
Combining and Structuring Data:
Data Aggregation and Filtering;
Libraries imported:
- Pandas: importing and exporting shark_attack.csv (the baseline for the project) and manipulating data;
- matplotlib: plotting histograms to verify hypotheses;
- NumPy;
- Seaborn;
- sklearn: metrics, ensemble and model_selection.
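
The cleaning steps listed above (dropping nulls, handling duplicates, formatting and filtering) might look something like the sketch below. The column names (`store_id`, `date`, `open`, `sales`, `notes`) and the sample rows are hypothetical, chosen only to illustrate the pandas calls; they are not taken from the project's dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical raw sales records; column names and values are assumptions.
raw = pd.DataFrame(
    {
        "store_id": [1, 1, 2, 2, 3, 3],
        "date": ["2015-01-01", "2015-01-01", "2015-01-02",
                 "2015-01-03", "2015-01-03", None],
        "open": [1, 1, 1, 0, 1, 1],
        "sales": [5200.0, 5200.0, 4100.0, 0.0, np.nan, 3900.0],
        "notes": [None, None, None, None, None, None],
    }
)


def clean_sales(df: pd.DataFrame) -> pd.DataFrame:
    """Drop empty columns, null rows and duplicates, then format and filter."""
    out = df.dropna(axis="columns", how="all")  # drop columns with no data at all
    out = out.dropna(axis="index", how="any")   # drop rows with any missing value
    out = out.drop_duplicates()                 # remove exact duplicate rows
    out = out.assign(date=pd.to_datetime(out["date"]))  # parse dates
    out = out[out["open"] == 1]                 # keep only days the store was open
    return out.reset_index(drop=True)


clean = clean_sales(raw)
print(clean)
```

Keeping the steps in one function makes it easy to apply the same cleaning to a new dataset before scoring a model on it.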
