Contribution Rate of Data Asset Value Evaluation Index Based on DP-FS-BP Prediction Framework and SHAP Algorithm
Data asset valuation is of strategic significance to the development of data elementalization. To clarify the contribution rate of data asset valuation indicators and to balance the accuracy and interpretability of machine learning models, a data preprocessing-feature selection-back propagation neural network (DP-FS-BP) prediction framework is proposed, and the Shapley Additive exPlanations (SHAP) algorithm is used to explain how each indicator contributes to the model's predictions. Taking the transaction block data collected by Youe data network as an example, data preprocessing and feature selection were used to clean the data and select indicators, and the R², root mean squared error (RMSE), and mean absolute error (MAE) values were then compared with those obtained on the original data for the linear regression, support vector machine (SVM), decision tree, k-nearest neighbors (KNN), random forest, XGBoost, and DP-FS-BP models. The results show that the DP-FS-BP model obtains the best prediction results and has a significant advantage over the other models in prediction accuracy. Explaining the BP neural network model with the SHAP algorithm shows that the mean absolute SHAP values for scientific research techniques and data sample size are 209.25 and 191.24, respectively, ranking first and second. Visualizing the contribution rate of each feature to the output provides a decision-making basis for establishing a corresponding data asset value evaluation index system.
data preprocessing; feature selection; model interpretability; back propagation neural network; contribution rate
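
The abstract relies on two computations: the three regression metrics used to compare models (R², RMSE, MAE) and the ranking of indicators by mean absolute SHAP value. The following is a minimal sketch of both in plain Python, not the authors' implementation. The function names `regression_metrics` and `rank_by_mean_abs`, the feature names, and all numbers are hypothetical and chosen only for illustration; the attribution matrix stands in for per-sample SHAP values that an explainer would produce.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return R^2, RMSE and MAE for paired lists of targets and predictions."""
    n = len(y_true)
    mean_true = sum(y_true) / n
    # Residual and total sums of squares, as used in the usual R^2 definition.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return {
        "r2": 1.0 - ss_res / ss_tot,
        "rmse": math.sqrt(ss_res / n),
        "mae": sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n,
    }


def rank_by_mean_abs(attributions, feature_names):
    """Rank features by the mean absolute value of their per-sample attributions.

    `attributions` is a list of rows, one per sample, with one attribution
    (for example a SHAP value) per feature.
    """
    n = len(attributions)
    means = {
        name: sum(abs(row[j]) for row in attributions) / n
        for j, name in enumerate(feature_names)
    }
    # Largest mean absolute attribution first.
    return sorted(means.items(), key=lambda kv: kv[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical toy data, not taken from the paper.
    print(regression_metrics([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 6.0]))
    print(rank_by_mean_abs([[2.0, -1.0], [-4.0, 1.0]], ["feature_a", "feature_b"]))
```

Taking the absolute value before averaging keeps positive and negative attributions from cancelling, so the ranking reflects the magnitude of each indicator's influence rather than its direction.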