A Study on Comparative Evaluation of Credit Card Fraud Detection Using Tree-Based Machine Learning Models, 2021 [paper][code]
Thitiwat Ruangsakorn, Yu Song
Abstract: Credit card fraud is a severe problem that distresses financial companies and cardholders around the world, and it is becoming more serious as technology develops. The annual loss due to these fraudulent acts amounts to billions of dollars. Fraud detection has been an interesting topic in machine learning. In this study, we focus on a comparative evaluation of tree-based machine learning models (decision tree, random forest, and XGBoost) for detecting fraudulent card behavior. In addition, we apply the SMOTE technique to handle imbalanced data. Numerical tests show that the accuracies of decision tree, random forest, and XGBoost are 96.82%, 97.06%, and 98.35%, respectively. Hence, we conclude that XGBoost outperforms the other algorithms.
- Download the dataset from https://www.kaggle.com/competitions/ieee-fraud-detection.
- Extract the dataset and run `fraudDetection.ipynb` (using a conda environment).
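
The abstract mentions SMOTE for handling the class imbalance. As a rough illustration of the idea only, the sketch below generates synthetic minority-class rows by interpolating between a minority sample and one of its nearest minority neighbours. It is not the notebook's code: the function name `smote`, its parameters, and the toy data are all assumptions made up for this example, and it uses only the Python standard library (3.8+).

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Hypothetical helper: create n_synthetic rows by interpolating
    between minority-class rows and their nearest minority neighbours.

    minority: list of equal-length numeric feature lists (e.g. fraud rows).
    """
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    if k < 1:
        raise ValueError("need at least two minority samples")

    # For every minority row, find the indices of its k nearest minority rows.
    neighbours = []
    for i, x in enumerate(minority):
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: math.dist(x, minority[j]))
        neighbours.append(others[:k])

    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))   # random minority row
        j = rng.choice(neighbours[i])      # one of its k nearest neighbours
        gap = rng.random()                 # position along the segment i -> j
        x, z = minority[i], minority[j]
        synthetic.append([a + gap * (b - a) for a, b in zip(x, z)])
    return synthetic


# Toy data (invented for illustration): 3 "fraud" rows versus 9 "legitimate" rows.
fraud = [[1.0, 1.0], [1.2, 0.9], [0.8, 1.1]]
legit = [[float(i), float(i % 3) + 5.0] for i in range(9)]

# Oversample the fraud class until both classes have the same number of rows.
new_fraud = smote(fraud, n_synthetic=len(legit) - len(fraud), k=2)
balanced_fraud = fraud + new_fraud
print(len(balanced_fraud), len(legit))
```

In practice, oversampling of this kind is normally applied to the training split only, so that the test set keeps the original class distribution. A maintained implementation is also available in the imbalanced-learn package, which may be preferable to a hand-written version.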