Medical insurance fraud identification based on isolation loss and deep autoencoder

Abstract: Medical insurance fraud identification faces two problems: fraudulent and normal samples are highly similar and hard to discriminate, and marginal normal samples are easily confused with fraud. To address them, this paper proposes a medical insurance fraud identification algorithm based on isolation loss and a deep autoencoder (DAE), named ISDAE. Exploiting the fact that marginal and sparse fraud samples are easy to isolate, the algorithm first proposes a sample isolation measure that quantifies the difference between the two classes of samples from the perspective of feature distribution. On this basis, it uses the DAE's ability to mine linear and nonlinear features of medical insurance data and, taking into account the interference of marginal normal samples with model training, defines an isolation loss in the latent feature space that aggregates central normal samples and separates marginal normal samples, thereby enlarging the difference between fraudulent and normal samples. The fraud degree of each sample is then evaluated by integrating its isolation value and its reconstruction error, which improves the model's fraud identification performance. Finally, the algorithm is verified on the Tianchi medical insurance dataset. The results show that the overall fraud identification performance of ISDAE is better than that of the comparison methods and that its performance is more stable.
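The final scoring step described in the abstract, integrating an isolation value with a reconstruction error, can be illustrated with a small sketch. The abstract does not give the paper's isolation measure, loss, or integration rule, so everything here is an assumption made for illustration: `knn_isolation` is a hypothetical stand-in that uses mean distance to the nearest neighbours, the reconstruction errors are made-up numbers rather than the output of a trained autoencoder, and the weighted sum in `fraud_scores` with its `weight` parameter is one plausible way to combine the two quantities, not the authors' formula.

```python
import math

def min_max(values):
    """Rescale a list of numbers to [0, 1]; a constant list maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def knn_isolation(points, k=2):
    """Hypothetical isolation proxy (not the paper's measure):
    mean Euclidean distance from each point to its k nearest neighbours."""
    scores = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(sum(dists[:k]) / k)
    return scores

def fraud_scores(isolation, recon_error, weight=0.5):
    """Assumed integration rule: weighted sum of the min-max normalised
    isolation values and reconstruction errors."""
    iso_n, rec_n = min_max(isolation), min_max(recon_error)
    return [weight * a + (1.0 - weight) * b for a, b in zip(iso_n, rec_n)]

# Toy latent codes: four clustered points and one far-away point.
latent = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (3.0, 3.0)]
# Made-up reconstruction errors standing in for a trained autoencoder's output.
recon_error = [0.02, 0.03, 0.02, 0.04, 0.50]

scores = fraud_scores(knn_isolation(latent), recon_error)
# Index of the sample with the highest combined score.
most_suspicious = max(range(len(scores)), key=scores.__getitem__)
```

In this toy setting the far-away point is both the most isolated and the worst reconstructed, so it receives the highest combined score. Normalising both quantities before summing keeps either one from dominating purely because of its scale.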

Keywords:

medical insurance fraud identification; isolation loss; deep autoencoder; unsupervised learning

Authors:

Liu Ye (柳叶), Wang Yanan (王亚楠), Hou Wenhui (候文慧), Liu Hui (刘慧), Wang Jianqiang (王坚强)

Affiliation:

School of Business, Central South University, Changsha 410083

Publication year:

2024

DOI:

10.12011/SETP2023-1899

Systems Engineering - Theory & Practice (系统工程理论与实践)

Systems Engineering Society of China (中国系统工程学会)

CSTPCD; CSSCI; PKU Core (北大核心)

Impact factor: 1.575

ISSN: 1000-6788

Year, volume (issue): 2024, 44(11)