A feature extraction algorithm based on positive region and voting attribute importance
Irrelevant or redundant information in high-dimensional data sets makes feature extraction computationally expensive, and reducing this cost has become a research hotspot. The neighborhood rough set model improves computational efficiency by deleting redundant information from large-scale data. To further improve the efficiency of existing neighborhood rough set models in feature extraction from continuous high-dimensional databases, we propose a feature extraction algorithm based on the positive region and voting attribute importance. First, because the positive region remains unchanged before and after attribute reduction, and because the intra-class merging and inter-class differentiation of decision classes in the positive region are essentially related to attribute reduction, the algorithm improves the calculation of voting attribute importance. It then incorporates an attribute granularity threshold to evaluate the importance of conditional attributes from three aspects: inter-domain differentiation, inter-class differentiation and intra-class differentiation. This reduces the influence of distance on the voting results for conditional attributes with different distribution densities. Finally, the importance of all conditional attributes is obtained in a single round of voting, so the calculation of conditional attribute importance is reduced from k dimensions to one dimension, which lowers the computational complexity. Experimental analysis on seven UCI test data sets shows that the proposed algorithm improves the efficiency of attribute importance calculation and outperforms existing algorithms in both classification accuracy and running time.
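To make the notion of a positive region concrete, the following is a minimal sketch of the standard neighborhood rough set definition, not of the voting importance measure proposed here, whose details the abstract does not give. It assumes Euclidean distance and a fixed neighborhood radius delta; the function names, the toy data and the value of delta are hypothetical choices for illustration only.

```python
from math import sqrt

def neighborhood(data, i, attrs, delta):
    """Indices of samples within distance delta of sample i, measured on attrs only."""
    xi = data[i]
    return [
        j for j, xj in enumerate(data)
        if sqrt(sum((xi[a] - xj[a]) ** 2 for a in attrs)) <= delta
    ]

def positive_region(data, labels, attrs, delta):
    """Samples whose entire delta-neighborhood carries the same decision class."""
    return [
        i for i in range(len(data))
        if all(labels[j] == labels[i] for j in neighborhood(data, i, attrs, delta))
    ]

def dependency(data, labels, attrs, delta):
    """Fraction of samples that fall in the positive region."""
    return len(positive_region(data, labels, attrs, delta)) / len(data)

# Toy data (hypothetical): attribute 0 separates the classes, attribute 1 does not.
data = [
    [0.10, 0.50],
    [0.15, 0.90],
    [0.80, 0.52],
    [0.85, 0.88],
]
labels = ["a", "a", "b", "b"]

if __name__ == "__main__":
    for attrs in ([0, 1], [0], [1]):
        print(attrs, positive_region(data, labels, attrs, 0.2))
```

In this toy case the positive region under attribute 0 alone equals the positive region under both attributes, so attribute 1 could be dropped without changing it. This is the invariance that attribute reduction relies on.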