Determination of Protein Content in Wheat Based on CNN and Feature Selection Regression Methods
Wheat protein content is generally determined by the Kjeldahl method, which involves a complicated operating procedure and a long analysis time and cannot support batch testing of samples. Both linear and nonlinear mapping relationships exist between the near-infrared spectral features of wheat and its protein content. In this research, an algorithm with both linear and nonlinear mapping capabilities was proposed, i.e., a combined algorithm based on a convolutional neural network and feature selection regression. Multiple preprocessing methods were used to improve the signal-to-noise ratio of the near-infrared spectral data of wheat kernels. The one-dimensional preprocessed spectral data were collapsed into a two-dimensional matrix, a two-dimensional convolutional neural network model was used to predict the protein content of wheat, and the feature information output by the neurons in the middle part of the intermediate layer was extracted and combined with the preprocessed spectral dataset to form an integrated dataset. On the integrated dataset, the lasso, Minimax Concave Penalty (MCP) regression, and Smoothly Clipped Absolute Deviation (SCAD) regression methods were used to construct models for determining the protein content of wheat, and these models were compared and analyzed against Multiple Linear Regression (MLR) and Partial Least Squares Regression (PLSR) models. The experimental results indicated that introducing the convolutional neural network gave the model nonlinear mapping ability and improved the performance of the wheat protein content prediction model.
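
As an illustration only, the following minimal Python sketch shows two ingredients named in the abstract: collapsing a one-dimensional spectrum into a two-dimensional matrix, and the univariate thresholding rules commonly associated with the lasso, MCP, and SCAD penalties for a standardized predictor. The function names, the zero-padding of the last row, and the default values `gamma=3.0` and `a=3.7` are assumptions made for this sketch and are not taken from the paper; the paper's actual matrix shape, network, and solver are not specified here.

```python
def fold_spectrum(spectrum, n_cols):
    """Collapse a 1-D spectrum into rows of n_cols values.
    Zero-padding the last row is an assumption of this sketch."""
    padded = list(spectrum) + [0.0] * (-len(spectrum) % n_cols)
    return [padded[i:i + n_cols] for i in range(0, len(padded), n_cols)]


def soft_threshold(z, lam):
    """Lasso rule: shrink z toward zero by lam, and set it to zero if |z| <= lam."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def mcp_threshold(z, lam, gamma=3.0):
    """MCP rule (gamma > 1): rescaled soft threshold for small |z|,
    no shrinkage at all once |z| exceeds gamma * lam."""
    if abs(z) <= gamma * lam:
        return soft_threshold(z, lam) / (1.0 - 1.0 / gamma)
    return z


def scad_threshold(z, lam, a=3.7):
    """SCAD rule (a > 2): lasso-like for small |z|, a transition zone,
    then no shrinkage once |z| exceeds a * lam."""
    if abs(z) <= 2.0 * lam:
        return soft_threshold(z, lam)
    if abs(z) <= a * lam:
        return soft_threshold(z, a * lam / (a - 1.0)) / (1.0 - 1.0 / (a - 1.0))
    return z


if __name__ == "__main__":
    # Hypothetical five-point spectrum folded into rows of two values.
    print(fold_spectrum([0.11, 0.12, 0.15, 0.14, 0.13], 2))
    # Compare how each rule treats a small, a moderate, and a large coefficient.
    for z in (0.5, 3.0, 5.0):
        print(z, soft_threshold(z, 1.0), mcp_threshold(z, 1.0), scad_threshold(z, 1.0))
```

The comparison loop shows the practical difference between the penalties: the lasso shrinks every surviving coefficient by the same amount, whereas MCP and SCAD leave large coefficients unshrunk, which is the usual motivation for considering them alongside the lasso.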