CTGANBoost: Credit Fraud Detection Based on CTGAN and Boosting
In the financial industry, credit fraud detection is an important task that can substantially reduce economic losses for banks and consumer institutions. However, credit data suffer from class imbalance and from overlapping features between positive and negative samples, which lead to low sensitivity in recognizing the minority class and to poor data discriminability. To address these problems, a CTGANBoost method is proposed for credit fraud detection. First, in each boosting iteration of AdaBoost, a conditional tabular generative adversarial network (CTGAN) constrained by class label information is introduced to learn the feature distribution and augment the minority class data. Second, based on the augmented data set synthesized by CTGAN, a weight normalization method is designed to ensure that the distribution characteristics and relative weights of the original data set are maintained during sample weighting. Experimental results on three open-source datasets show that CTGANBoost outperforms other mainstream credit fraud detection methods, with AUC values increasing by 0.5% to 2.0% and F1 values increasing by 0.6% to 1.8%, which verifies the effectiveness and generalization ability of the CTGANBoost method.
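
To illustrate the second step, the following Python sketch shows one possible way to renormalize sample weights after synthetic minority samples are appended in a boosting round. The abstract does not give the normalization formula, so the function name `normalize_augmented_weights`, the `synthetic_share` parameter, and the choice to weight synthetic samples uniformly are all assumptions made for illustration, not the authors' method. The only property taken from the text is that the relative weights of the original samples are preserved.

```python
import numpy as np


def normalize_augmented_weights(orig_weights, n_synthetic, synthetic_share=None):
    """Hypothetical weight normalization for an augmented training set.

    Returns (original_weights, synthetic_weights) whose combined sum is 1.
    The ratios between original weights are left unchanged.
    """
    orig_weights = np.asarray(orig_weights, dtype=float)
    total = orig_weights.sum()
    if total <= 0:
        raise ValueError("original weights must have a positive sum")

    # No synthetic samples: plain AdaBoost-style normalization.
    if n_synthetic == 0:
        return orig_weights / total, np.empty(0)

    # Assumption: by default the synthetic samples receive a total mass
    # equal to their fraction of the augmented data set.
    if synthetic_share is None:
        synthetic_share = n_synthetic / (orig_weights.size + n_synthetic)
    if not 0.0 < synthetic_share < 1.0:
        raise ValueError("synthetic_share must lie strictly between 0 and 1")

    # Scale the original weights by a single constant, so that every
    # ratio w_i / w_j among original samples is preserved.
    orig_scaled = orig_weights / total * (1.0 - synthetic_share)

    # Assumption: synthetic samples share their mass uniformly.
    synth = np.full(n_synthetic, synthetic_share / n_synthetic)
    return orig_scaled, synth
```

Under these assumptions, the two returned arrays can be concatenated and passed as per-sample weights to the weak learner for the current iteration. Because the original weights are rescaled by one constant, the hard-example emphasis accumulated by AdaBoost in earlier rounds carries over unchanged.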