Abstract
Social assistance is one of the key livelihood programs for safeguarding social equity. To address the problem of accurately identifying the type of assistance recipient in social assistance operations, this study takes the perspective of data elements driving livelihood services. First, K-means clustering is used to grade the assistance indicators. Second, a household poverty index is defined on the basis of mutual information theory to quantify the degree of household poverty. Third, mutual information and correlation analysis are used to select the key features for determining the assistance type, and the poverty index is added to complete the feature reconstruction; on this basis, a social assistance prediction model, KM-WRF (K-means mutual information-weighted random forest), is proposed. Finally, historical social assistance records from Shanghai are used to verify the validity and feasibility of the model for social assistance classification. The experimental results show that, compared with several commonly used models, KM-WRF achieves higher prediction accuracy and greater stability in the precise identification of social assistance types. The household poverty index effectively assesses the degree of household hardship, providing a scientific and effective decision-support tool for precise social assistance from the data-element perspective.
Key words
data elements/data-driven/mutual information/random forest/precise governance of social assistance
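The abstract names the first two steps of the pipeline (K-means grading of indicators and a mutual-information-based household poverty index) without giving formulas. The following is a minimal sketch of how such steps could look, not the authors' implementation. The function names `kmeans_1d`, `mutual_information`, and `poverty_index` are hypothetical, and the index is assumed here to be a weighted sum of graded indicators with weights proportional to each indicator's mutual information with the assistance type; the paper's actual definition may differ.

```python
import math
from collections import Counter


def kmeans_1d(values, k, iters=100):
    """Grade one numeric indicator into k levels with 1-D K-means.

    Returns one grade per value; grade 0 is the cluster with the smallest centre.
    """
    distinct = sorted(set(values))
    if len(distinct) < k:
        raise ValueError("need at least k distinct values")
    # Deterministic initialisation: evenly spaced positions among distinct values.
    centres = [distinct[(2 * j + 1) * len(distinct) // (2 * k)] for j in range(k)]

    def assign(v):
        # Index of the nearest current centre.
        return min(range(k), key=lambda j: abs(v - centres[j]))

    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for v in values:
            groups[assign(v)].append(v)
        # Empty clusters keep their previous centre.
        updated = [sum(g) / len(g) if g else c for g, c in zip(groups, centres)]
        if updated == centres:
            break
        centres = updated

    # Relabel clusters so that grades follow the order of the centres.
    order = sorted(range(k), key=lambda j: centres[j])
    rank = {j: r for r, j in enumerate(order)}
    return [rank[assign(v)] for v in values]


def mutual_information(xs, ys):
    """Empirical mutual information (in nats) between two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * math.log((c * n) / (px[x] * py[y]))
        for (x, y), c in pxy.items()
    )


def poverty_index(graded_rows, assistance_types):
    """ASSUMED definition: mutual-information-weighted sum of graded indicators.

    graded_rows holds one list of grades per household, oriented so that a
    higher grade means greater hardship. Returns (scores, weights).
    """
    columns = list(zip(*graded_rows))
    mi = [mutual_information(col, assistance_types) for col in columns]
    total = sum(mi)
    # Fall back to uniform weights if no indicator carries information.
    weights = [m / total for m in mi] if total > 0 else [1 / len(mi)] * len(mi)
    scores = [sum(w * g for w, g in zip(weights, row)) for row in graded_rows]
    return scores, weights
```

In a fuller pipeline, the resulting score would be appended to the selected key features before training a classifier such as a random forest. Note that `kmeans_1d` orders grades by cluster centre, so an indicator like income would need its grades reversed before being passed to `poverty_index`.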