Credit Scoring Based on Semi-supervised Support Vector Machine
Labeled samples in credit scoring are difficult and costly to obtain. To address this problem, a new credit scoring model based on semi-supervised support vector machines is proposed. By introducing new parameters for the unlabeled samples, the model does not need to satisfy the missing-at-random assumption and is therefore widely applicable. Meanwhile, a semi-supervised term added to the loss function encourages similarity between the coefficients of the labeled and unlabeled samples, which effectively incorporates the information in the unlabeled samples and improves estimation. In addition, Group LASSO is used for variable selection, which makes full use of group structure information and identifies important variables. Numerical simulations and a real-data example of credit card default risk prediction demonstrate the feasibility of the proposed method and its strong performance in variable selection, coefficient estimation, and classification prediction.
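The abstract does not give the objective function, so the following is only an illustrative sketch of how the three ingredients it names could be combined: a hinge loss on the labeled samples, a Group LASSO penalty, and a term that pulls the labeled and unlabeled coefficient vectors toward each other. The function names, the squared-difference form of the similarity term, the square-root group-size weights, and the tuning parameters `lam` and `gamma` are all assumptions made for illustration, not the paper's actual formulation.

```python
import math


def hinge_loss(X, y, beta, intercept=0.0):
    """Mean hinge loss over labeled samples; labels y must be in {-1, +1}."""
    total = 0.0
    for row, label in zip(X, y):
        score = sum(x * b for x, b in zip(row, beta)) + intercept
        total += max(0.0, 1.0 - label * score)
    return total / len(y)


def group_lasso_penalty(beta, groups):
    """Sum over groups of sqrt(group size) times the group's Euclidean norm.

    `groups` is a list of index lists, one per variable group (assumed layout).
    """
    penalty = 0.0
    for idx in groups:
        norm = math.sqrt(sum(beta[j] ** 2 for j in idx))
        penalty += math.sqrt(len(idx)) * norm
    return penalty


def similarity_penalty(beta_labeled, beta_unlabeled):
    """Squared Euclidean distance between the two coefficient vectors.

    This specific form is an assumption; the abstract only says that
    similarity between the coefficients is encouraged.
    """
    return sum((a - b) ** 2 for a, b in zip(beta_labeled, beta_unlabeled))


def objective(X, y, beta_labeled, beta_unlabeled, groups, lam=0.1, gamma=1.0):
    """Hypothetical combined objective: loss + lam * group penalty + gamma * similarity."""
    return (
        hinge_loss(X, y, beta_labeled)
        + lam * group_lasso_penalty(beta_labeled, groups)
        + gamma * similarity_penalty(beta_labeled, beta_unlabeled)
    )
```

In this sketch, a larger `gamma` forces the two coefficient vectors closer together, while a larger `lam` drives whole groups of coefficients to zero, which is the mechanism by which Group LASSO performs variable selection at the group level.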