This project aims to design a solution to the AXA Driver Telematics challenge on Kaggle.
The repository includes the code of my solution, along with additional code implementing ideas that were released after the competition had ended.
- scripts/sample_the_dataset.m - creates positive and negative data sets
for local evaluation purposes. It lets you define and import:
  - The batch of trips belonging to the driver of interest (positive data set)
  - The number of arbitrary, unrelated trips (negative data set)
- scripts/local_evaluation.m - uses k-fold cross-validation to assess the model's AUC performance
- functions/getTelematicMeasurements - takes the Cartesian coordinates $(X,Y)$ of a trip, rotates them with regard to the centroid (mean), and calculates measurements with respect to the time dimension, such as speed, acceleration, distance, and orientation at each second
- functions/getSpatialMeasurements - takes the Cartesian coordinates $(X,Y)$ of a trip, simplifies the trip by removing the time dimension using RDP (Ramer–Douglas–Peucker), and produces coordinate pairs of equally spaced length
- scripts/preprocess_the_dataset - splits the 547,200 csv files into N mat files on disk for further analysis
- unittests/test_all.m - runs all unit tests
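
As an illustration of the kind of computation that getTelematicMeasurements and getSpatialMeasurements describe, here is a minimal Python sketch. It is not the repository's code. It assumes positions in metres sampled once per second; the names `telematic_measurements`, `rdp`, and `epsilon` are hypothetical, and the centroid rotation and the equal-length resampling steps are left out.

```python
import math


def telematic_measurements(xs, ys, dt=1.0):
    """Time-based measurements from a trip's (x, y) samples.

    Assumption: one sample every `dt` seconds, coordinates in metres.
    """
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    dx = [b - a for a, b in zip(xs, xs[1:])]
    dy = [b - a for a, b in zip(ys, ys[1:])]
    step = [math.hypot(u, v) for u, v in zip(dx, dy)]  # metres per step
    speed = [s / dt for s in step]
    accel = [(b - a) / dt for a, b in zip(speed, speed[1:])]
    distance, total = [], 0.0
    for s in step:  # cumulative distance travelled
        total += s
        distance.append(total)
    heading = [math.atan2(v, u) for u, v in zip(dx, dy)]  # radians
    return {"speed": speed, "acceleration": accel,
            "distance": distance, "heading": heading}


def _point_line_distance(p, a, b):
    """Perpendicular distance from p to the line through a and b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    length = math.hypot(bx - ax, by - ay)
    if length == 0.0:
        return math.hypot(px - ax, py - ay)
    return abs((bx - ax) * (ay - py) - (ax - px) * (by - ay)) / length


def rdp(points, epsilon):
    """Ramer-Douglas-Peucker simplification (iterative, order-preserving)."""
    if len(points) < 3:
        return list(points)
    keep = [False] * len(points)
    keep[0] = keep[-1] = True
    stack = [(0, len(points) - 1)]
    while stack:
        start, end = stack.pop()
        dmax, index = 0.0, None
        for i in range(start + 1, end):
            d = _point_line_distance(points[i], points[start], points[end])
            if d > dmax:
                dmax, index = d, i
        # Keep the farthest point only if it deviates by more than epsilon.
        if index is not None and dmax > epsilon:
            keep[index] = True
            stack.append((start, index))
            stack.append((index, end))
    return [p for p, k in zip(points, keep) if k]
```

A larger `epsilon` discards more points, so it trades shape fidelity for a shorter, time-independent description of the route.
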

To evaluate a model locally:

- Download the project files
- Download all the data files from Kaggle
- Run scripts/sample_the_dataset.m
- Run the different scripts and reports, especially scripts/local_evaluation.m
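
The k-fold AUC assessment used for local evaluation can be sketched as follows. This is a minimal Python illustration, not the repository's code: `fit` and `predict` are hypothetical callbacks standing in for any model, and the AUC is computed over the pooled out-of-fold scores, which is only one of several reasonable conventions.

```python
import random


def auc(labels, scores):
    """Area under the ROC curve by pairwise comparison (ties count 0.5)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def k_fold_indices(n, k, seed=0):
    """Shuffle 0..n-1 and deal the indices into k folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def cross_validated_auc(features, labels, fit, predict, k=5, seed=0):
    """Score every sample with a model that never saw it, then compute AUC."""
    scores = [0.0] * len(labels)
    for fold in k_fold_indices(len(labels), k, seed):
        held_out = set(fold)
        train = [i for i in range(len(labels)) if i not in held_out]
        model = fit([features[i] for i in train], [labels[i] for i in train])
        for i in fold:
            scores[i] = predict(model, features[i])
    return auc(labels, scores)


# Toy model (hypothetical): score a trip by closeness to the mean positive feature.
def fit_mean(xs, ys):
    positives = [x for x, y in zip(xs, ys) if y == 1]
    return sum(positives) / len(positives)


def predict_closeness(mean, x):
    return -abs(x - mean)
```

Here the positive class would be the trips of the driver of interest and the negative class the sampled trips from the negative data set.
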

To generate a submission:

- Download the project files
- Download all the data files from Kaggle
- Run scripts/preprocess_the_dataset.m
- Run the different models within the models directory
- By default, the submission file will appear in the submission folder.
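
For orientation, writing a submission file might look like the Python sketch below. It is not the repository's code, and it assumes the competition's layout is a `driver_trip,prob` header followed by one row per trip; check the sample submission on Kaggle before relying on that. The function name `write_submission` and the six-decimal formatting are hypothetical choices.

```python
import csv
import io


def write_submission(probabilities, out):
    """Write {(driver, trip): prob} as rows of `driver_trip,prob`.

    Assumption: the header and the `driver_trip` key format follow the
    competition's sample submission.
    """
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["driver_trip", "prob"])
    for (driver, trip), prob in sorted(probabilities.items()):
        writer.writerow([f"{driver}_{trip}", f"{prob:.6f}"])


# Usage: write to an in-memory buffer (swap in open(path, "w", newline="") for a file).
buffer = io.StringIO()
write_submission({(1, 2): 0.25, (1, 1): 1.0}, buffer)
```
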