Machine Learning Projects
- Home Credit Default Risk
- Developed a LightGBM (Light Gradient Boosting Machine) model trained with k-fold cross-validation.
- Predicted repayment capabilities of each client.
- Achieved an AUC (area under the curve) of 0.787, compared with 0.671 for a Logistic Regression model and 0.678 for a Random Forest model.
- Training data has 307,511 observations and 122 features; testing data has 48,744 observations and 121 features.
- NLP Sentiment Analysis with Emoji Labels
- Converted training and testing features into lists of vectors using a pre-trained GloVe model.
- Added a Keras Embedding layer built from the word-to-vector mapping.
- Built and trained a 2-layer LSTM network.
- Classified the sentiment of sentences and labeled each with an emoji.
- Game Recommendation System
- Scraped user data (17.3MB) and app data (510.2MB) from Steam.
- Stored app data in a MySQL database.
- Trained a content-based model and an item-based model.
- Created a website that provides 5 recommendations for each user in the list.
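
As an illustration of the item-based approach named in the Game Recommendation System project, here is a minimal sketch that ranks unplayed games by cosine similarity between items and returns up to 5 suggestions. It is not the project's actual code. The toy playtime data, the choice of cosine similarity, the playtime weighting, and the function names (item_vectors, cosine, recommend) are all assumptions made for this example.

```python
from collections import defaultdict
from math import sqrt

# Hypothetical toy data: user -> {game: hours played}. Not the scraped Steam data.
PLAYTIME = {
    "alice": {"Portal": 10.0, "Half-Life": 8.0},
    "bob": {"Portal": 6.0, "Half-Life": 7.0, "Dota 2": 1.0},
    "carol": {"Dota 2": 50.0, "Team Fortress 2": 20.0},
    "dave": {"Portal": 3.0, "Team Fortress 2": 4.0},
}


def item_vectors(playtime):
    """Invert user -> {game: hours} into game -> {user: hours}."""
    vectors = defaultdict(dict)
    for user, games in playtime.items():
        for game, hours in games.items():
            vectors[game][user] = hours
    return dict(vectors)


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    shared = set(a) & set(b)
    dot = sum(a[u] * b[u] for u in shared)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recommend(user, playtime, n=5):
    """Rank games the user has not played by similarity-weighted playtime."""
    vectors = item_vectors(playtime)
    owned = playtime[user]
    scores = {}
    for candidate, cand_vec in vectors.items():
        if candidate in owned:
            continue  # only suggest games the user does not already have
        scores[candidate] = sum(
            cosine(cand_vec, vectors[game]) * hours
            for game, hours in owned.items()
        )
    # Highest score first; ties are broken alphabetically for determinism.
    ranked = sorted(scores, key=lambda g: (-scores[g], g))
    return ranked[:n]


print(recommend("alice", PLAYTIME))
```

With only four games in the toy data, fewer than 5 recommendations come back; on a full catalog the same call would return the top 5.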