MACHINE LEARNING TECHNIQUE FOR CREDIT CARD SCAM DETECTION
DOI:
https://doi.org/10.51903/jtie.v1i1.143
Keywords:
Scam Detection, Credit Card, Financial Scam, Machine Learning
Abstract
Credit card (CC) scams are a growing nuisance in financial markets. They are increasing rapidly and cause large financial losses for organizations, governments, and public institutions, especially now that e-commerce purchases can be paid for far more easily through digital payment methods. The purpose of this study is therefore to detect scam CC transactions in a given dataset by performing a predictive analysis of the CC transaction data using machine learning techniques. The method is a predictive modeling approach: logistic regression models (LR-M), random forests (RF), and XGBoost, combined with particular resampling techniques, are applied to classify CC transactions as scam or genuine. Model performance was evaluated using recall, precision, F1-score, and the precision-recall (PR) and ROC curves.
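As an illustration of the evaluation metrics named above, the following minimal sketch computes precision, recall, and F1-score for the scam class from a list of true and predicted labels. The function name and the toy labels are assumptions made for this example and are not taken from the study.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Precision, recall and F1-score for the positive (scam) class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Toy labels (made up): 1 = scam, 0 = genuine; 2 scams among 10 transactions.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 0, 0, 0, 1, 1, 1, 0]

precision, recall, f1 = precision_recall_f1(y_true, y_pred)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(f"accuracy={accuracy:.2f} precision={precision:.2f} "
      f"recall={recall:.2f} f1={f1:.2f}")
```

On these toy labels, plain accuracy looks acceptable even though only one in three flagged transactions is a real scam, which is why metrics focused on the scam class are more informative on imbalanced data.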
The experimental results show that random forest combined with the hybrid resampling approach of SMOTE and Tomek Links removal works better than the other models. Without resampling, random forest and XGBoost outperform LR-M in terms of overall F1-score. This demonstrates the strength of these techniques, which can achieve good performance even in the presence of class imbalance. Every model, when used with random undersampling (Ran-Under), gives a high recall score but fails badly in terms of precision. Compared with the baseline models without resampling, precision and the recall score (RS) do not improve when Tomek Links removal is used. RF and XGBoost perform quite well in terms of F1-score when random oversampling (Ran-Over) is used. SMOTE increases the recall score of random forest and XGBoost, but the precision score (PS) decreases slightly.
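The two resampling steps behind the hybrid approach can be sketched in plain Python as follows. This is a simplified illustration under stated assumptions, not the implementation used in the study: the function names, the toy data, the choice to apply SMOTE first and then remove only the majority-class member of each Tomek link, and the parameter k=3 are all assumptions made for this example.

```python
import math
import random


def nearest(i, points, candidates):
    """Index in `candidates` of the point closest to points[i] (excluding i)."""
    return min((j for j in candidates if j != i),
               key=lambda j: math.dist(points[i], points[j]))


def remove_tomek_links(X, y, majority=0):
    """Drop majority-class points that are part of a Tomek link.

    Two points form a Tomek link when they have different labels and are
    each other's nearest neighbour.
    """
    everyone = range(len(X))
    nn = [nearest(i, X, everyone) for i in everyone]
    drop = {i for i in everyone
            if y[i] == majority and y[nn[i]] != majority and nn[nn[i]] == i}
    keep = [i for i in everyone if i not in drop]
    return [X[i] for i in keep], [y[i] for i in keep]


def smote(X, y, minority=1, n_new=None, k=3, seed=0):
    """Add synthetic minority points by interpolating towards minority neighbours.

    Needs at least two minority points.
    """
    rng = random.Random(seed)
    mino = [i for i, label in enumerate(y) if label == minority]
    if n_new is None:  # by default, balance the two classes
        n_new = (len(y) - len(mino)) - len(mino)
    X_new, y_new = list(X), list(y)
    for _ in range(n_new):
        i = rng.choice(mino)
        # up to k nearest minority neighbours of point i
        neighbours = sorted((j for j in mino if j != i),
                            key=lambda j: math.dist(X[i], X[j]))[:k]
        j = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from X[i] to X[j]
        X_new.append(tuple(a + gap * (b - a) for a, b in zip(X[i], X[j])))
        y_new.append(minority)
    return X_new, y_new


# Toy 2-D data (made up): label 0 = genuine (majority), 1 = scam (minority).
X = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (2.0, 2.0), (3.6, 3.9),
     (4.0, 4.0), (5.0, 4.0), (4.0, 5.0)]
y = [0, 0, 0, 0, 0, 0, 1, 1, 1]

X_clean, y_clean = remove_tomek_links(X, y)                # Tomek Links removal alone
X_smote, y_smote = smote(X, y)                             # SMOTE alone
X_hybrid, y_hybrid = remove_tomek_links(X_smote, y_smote)  # SMOTE, then Tomek removal
```

In this toy data the genuine point (3.6, 3.9) and the scam point (4.0, 4.0) are each other's nearest neighbours, so Tomek Links removal alone drops that genuine point. Resampling of this kind should be applied to the training split only, so that evaluation still reflects the original class distribution.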
Finally, when the hybrid of Tomek Links removal and SMOTE was applied with random forest, it gave balanced precision and recall scores, as measured by PR-AUC. XGBoost failed to increase its precision score even though the same resampling technique was used. For future research, a cost-sensitive learning method can be applied that takes misclassification costs into account. Future work also needs to consider behavior change over time when developing predictive models. In addition, much larger datasets are needed so that the non-stationary properties of CC scam detection can be studied in more detail.
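One simple form of cost-sensitive decision making is to move the classification threshold according to the misclassification costs. The sketch below is only an illustration of that idea and is not a method proposed in the study: the cost values, scores, and function names are hypothetical, and it assumes that correct decisions cost nothing and that the model scores are reasonably calibrated probabilities.

```python
def cost_sensitive_threshold(cost_fp, cost_fn):
    """Probability threshold that minimises expected cost.

    Assuming correct decisions cost nothing, flagging a transaction is
    cheaper when p * cost_fn >= (1 - p) * cost_fp, which gives
    p >= cost_fp / (cost_fp + cost_fn).
    """
    return cost_fp / (cost_fp + cost_fn)


def total_cost(y_true, scores, threshold, cost_fp, cost_fn):
    """Sum of misclassification costs when flagging scores >= threshold."""
    cost = 0.0
    for t, s in zip(y_true, scores):
        flagged = s >= threshold
        if flagged and t == 0:
            cost += cost_fp      # genuine transaction flagged as scam
        elif not flagged and t == 1:
            cost += cost_fn      # scam transaction missed
    return cost


# Hypothetical costs: a missed scam is 9 times as costly as a false alarm.
COST_FP, COST_FN = 1.0, 9.0
threshold = cost_sensitive_threshold(COST_FP, COST_FN)

# Made-up labels (1 = scam) and predicted scam probabilities.
y_true = [0, 0, 0, 0, 1, 1]
scores = [0.02, 0.05, 0.20, 0.40, 0.30, 0.90]

cost_default = total_cost(y_true, scores, 0.5, COST_FP, COST_FN)
cost_tuned = total_cost(y_true, scores, threshold, COST_FP, COST_FN)
print(f"threshold={threshold:.2f} default={cost_default} tuned={cost_tuned}")
```

With these made-up numbers, lowering the threshold trades two extra false alarms for catching the scam that the default threshold of 0.5 misses, which reduces the total cost.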