Improving the quality of payment fraud detection by using a combined approach of transaction analysis
Main Authors: | , |
---|---|
Format: | Article |
Language: | English |
Published: | Kharkiv National University of Radio Electronics, 2024-12-01 |
Series: | Сучасний стан наукових досліджень та технологій в промисловості |
Subjects: | |
Online Access: | https://itssi-journal.com/index.php/ittsi/article/view/523 |
Summary: | Subject matter: The study focuses on methods for detecting fraudulent transactions. Goal: To improve the accuracy of machine learning models for fraud detection by combining methods of transaction analysis. Tasks: Investigate methods for detecting fraudulent transactions and propose methods that improve accuracy. Methods: artificial intelligence methods, machine learning. Results: Methods for detecting fraudulent transactions are investigated. Methods based on data classification are considered (XGBoost, SVC, Logistic Regression, AdaBoostClassifier, K-Nearest Neighbors, Isolation Forest), and software models of them are built. The dataset used is "creditcard.csv", which contains transactions made by European cardholders over two days and includes 492 fraud cases out of 284,807 transactions. The best result is obtained with the gradient boosting model, which can handle imbalanced data: with a weight parameter applied to the minority class, the F1-score for that class is 86%. To improve the accuracy of fraud detection, the labeled data were clustered into subclasses using the k-means method. The number of clusters, twelve, was determined by the elbow method. This improved the accuracy of multiclass classification: the F1-score ranges from 96% to 100% across subclasses. Feature importance within each subclass is evaluated with the gradient boosting algorithm. The experiment showed that features influence subclass membership differently, which allows a more detailed analysis of the data to identify hidden structures. Conclusions: The scientific novelty of the results lies in the combined use of data classification and clustering methods to detect fraudulent transactions, which reduced the number of type II errors.
Assessing the informative value of features within different types (subclasses) of fraudulent transactions makes it possible to determine which features have the greatest impact on an object's membership in a particular subclass.
|
---|---|
ISSN: | 2522-9818 2524-2296 |
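
The summary reports the class counts of the dataset and an F1-score obtained with a minority-class weight, but it does not say how that weight was set. The following is a minimal sketch, not the authors' code. It assumes the weight follows the common negatives-to-positives ratio heuristic (the value often suggested for XGBoost's `scale_pos_weight`), and it shows how an F1-score is computed from confusion counts. The function names and the confusion counts are hypothetical and serve only to illustrate the arithmetic; false negatives correspond to the type II errors mentioned in the conclusions.

```python
# Illustrative sketch only; not taken from the article.

# Class counts stated in the summary for creditcard.csv.
TOTAL_TRANSACTIONS = 284_807
FRAUD_CASES = 492


def minority_class_weight(total: int, positives: int) -> float:
    """Ratio of majority to minority examples (assumed weighting heuristic)."""
    negatives = total - positives
    return negatives / positives


def f1_score(true_pos: int, false_pos: int, false_neg: int) -> float:
    """Harmonic mean of precision and recall for a single class."""
    if true_pos == 0:
        return 0.0
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)  # lowered by type II errors
    return 2 * precision * recall / (precision + recall)


weight = minority_class_weight(TOTAL_TRANSACTIONS, FRAUD_CASES)
fraud_share = FRAUD_CASES / TOTAL_TRANSACTIONS

# Hypothetical confusion counts, chosen only to exercise the formula.
example_f1 = f1_score(true_pos=80, false_pos=10, false_neg=20)

print(f"fraud share: {fraud_share:.4%}")
print(f"minority-class weight: {weight:.1f}")
print(f"example F1: {example_f1:.3f}")
```

With fraud making up well under 1% of the records, a weight of this magnitude makes each missed fraud case count far more heavily during training than a misclassified legitimate transaction, which is why F1 on the minority class, rather than overall accuracy, is the figure quoted in the summary.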