jamesmnguyen704/TripleTenProgram

TripleTenProgram

TripleTen Data Scientist Portfolio

A comprehensive collection of all projects completed during the TripleTen Data Science professional training program.

| Project Name | Description | Libraries and Models Used |
| --- | --- | --- |
| Project 01: Music Preferences Analysis | Analyzed and compared music preferences between two cities to identify patterns and trends in listening habits. | Library: pandas |
| Project 02: Credit Scoring Analysis | Evaluated various metrics to predict the likelihood of a customer defaulting on a loan, helping financial institutions make informed lending decisions. | Library: pandas |
| Project 03: Vehicle Price Analysis | Investigated factors influencing vehicle prices by analyzing classified ads data to support better pricing strategies. | Libraries: pandas, numpy, matplotlib |
| Project 04: Cell Plans Analysis | Examined client behavior and identified which telecom packages generate the most income, providing insights for marketing and sales strategies. | Libraries: pandas, numpy, matplotlib, math, scipy |
| Project 05: Video Games Analysis | Tested hypotheses regarding video game users and critics to determine promising projects and plan effective advertising campaigns. | Libraries: pandas, numpy, scipy, matplotlib |
| Project 06: Taxi Trip Analysis | Analyzed taxi trip durations in relation to weather conditions to test and validate hypotheses, aiding in optimizing taxi services. | Libraries: pandas, numpy, scipy, matplotlib |
| Project 07: Cell Phone Plans Prediction | Developed a classification model to help clients select the best cell phone plan, achieving a performance metric of at least 0.75. | Models: decision tree, random forest, logistic regression; Libraries: pandas, sklearn |
| Project 08: Client Retention Prediction | Created a prediction model for client retention with an F1 score of at least 0.59, helping businesses improve their customer retention strategies. | Models: decision tree, random forest, logistic regression; Libraries: pandas, matplotlib, sklearn |
| Project 09: Oil Well Location Prediction | Validated oil reserve volume prediction models and calculated profits and risks for different regions, aiding in strategic decision-making for OilyGiant's operations. | Models: random forest, linear regression; Libraries: pandas, numpy, scipy, matplotlib, sklearn |
| Project 10: Gold Production Modeling | Modeled the production process in a mine to predict gold extraction efficiency and developed a prototype machine learning model for industrial applications. | Models: linear regression, decision tree regressor, random forest regressor; Libraries: pandas, numpy, matplotlib, sklearn |
| Project 11: Insurance Modeling | Identified similar customers and predicted insurance benefits while ensuring data privacy, enhancing customer service and risk management in insurance. | Models: K-Nearest Neighbors (classifier); Libraries: numpy, pandas, seaborn, matplotlib, sklearn, math, IPython |
| Project 12: Car Price Prediction | Built a model to determine the market value of cars with an emphasis on prediction quality and speed, supporting automotive market analysis. | Models: decision tree, random forest regressor, linear regression, xgboost, catboost, LGB; Libraries: numpy, pandas, matplotlib, sklearn, catboost, xgboost, lightgbm, time |
| Project 13: Taxi Orders Prediction | Predicted the number of taxi orders in the next hour with an RMSE not exceeding 48, helping optimize taxi fleet management. | Models: decision tree, random forest regressor, linear regression, LGBM; Libraries: numpy, pandas, matplotlib, sklearn, statsmodels |
| Project 14: Movie Reviews Categorization | Trained models to automatically detect negative movie reviews, aiding in sentiment analysis and customer feedback management. | Models: logistic regression, LGBM classifier; Libraries: numpy, re, pandas, seaborn, matplotlib, nltk, transformers, tqdm, spacy, sklearn, lightgbm |
| Project 15: Face Identification | Built and evaluated a neural network regression model to estimate age based on photographs, supporting biometric applications. | Models: Convolutional Neural Network; Libraries: pandas, numpy, matplotlib, PIL, tensorflow.keras (ImageDataGenerator, ResNet50, Sequential, GlobalAveragePooling2D, Dense, Dropout, Flatten, Adam) |
| Project 17: Customer Retention Prediction | Developed a model predicting contract cancellations with an AUC-ROC greater than or equal to 0.75, helping businesses reduce churn rates. | Models: Logistic Regression, Decision Tree, Random Forest, GridSearchCV, Convolutional Neural Network (keras, fashion_mnist, dense); Libraries: numpy, pandas, matplotlib, tensorflow, Catboost, sklearn |
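
Several projects above are judged against a numeric target, such as the F1 score of at least 0.59 in Project 08 and the root-mean-square error threshold of 48 in Project 13. For readers unfamiliar with these metrics, here is a minimal standard-library sketch of how they are defined. The function names `rmse` and `f1_binary` and the sample values are hypothetical illustrations, not code or results from the projects themselves, which list sklearn among their libraries.

```python
import math


def rmse(y_true, y_pred):
    """Root-mean-square error between two equal-length numeric sequences."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def f1_binary(y_true, y_pred, positive=1):
    """F1 score (harmonic mean of precision and recall) for one positive class."""
    if len(y_true) != len(y_pred):
        raise ValueError("inputs must be of equal length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision and recall are both zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Made-up sample data, only to show how a target would be checked.
sample_orders = [100, 120, 90]
sample_forecast = [110, 115, 95]
print("RMSE within target:", rmse(sample_orders, sample_forecast) <= 48)

sample_labels = [1, 0, 1, 1]
sample_predictions = [1, 0, 0, 1]
print("F1 meets target:", f1_binary(sample_labels, sample_predictions) >= 0.59)
```

In practice, `sklearn.metrics` provides `f1_score` and `roc_auc_score`, which handle multi-class and probabilistic inputs that this sketch does not.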

This portfolio demonstrates the application of various data science techniques and machine learning models across different industries and use cases, highlighting the practical skills and knowledge acquired during the TripleTen Data Science training program.
