X-ray fluorescence spectrometry combined with principal component analysis-linear discriminant analysis for identification of coking coal origins
Coking coal is a key raw material in steel production, and both its quality and the accurate tracing of its origin are crucial to the stability and safety of the industrial chain. In this study, X-ray fluorescence spectrometry (XRF) data were collected for a total of 78 groups of coking coal samples from five coal-producing areas in four countries: Russia (South Yakutian Basin, Kuznetsk Basin), Australia (Bowen Basin), the United States (Appalachian Basin), and Canada (Elk Valley). A model for identifying the origin of coking coal was established by combining principal component analysis with linear discriminant analysis, enabling rapid identification of coking coal origins. Outliers were detected and corrected using the box plot method and filled using the nearest neighbor method. The spectral data were preprocessed by Savitzky-Golay (SG) smoothing and by baseline correction with a fitted quadratic function. The first three principal components were used as input vectors, and the four countries of origin were used as target vectors. The samples were randomly divided into a training set and a test set in a ratio of 70% to 30%. The training set underwent 5-fold cross-validation, and linear discriminant analysis was used to establish the identification model. The results showed that the accuracies on the validation set and the test set were 98.2% and 100%, respectively. The proposed model could accurately and rapidly identify the origins of coking coal from Russia (South Yakutian Basin, Kuznetsk Basin), Australia (Bowen Basin), the United States (Appalachian Basin), and Canada (Elk Valley).
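The workflow described above (SG smoothing, quadratic baseline correction, three principal components, linear discriminant analysis, a 70%/30% split, and 5-fold cross-validation) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes NumPy, SciPy, and scikit-learn, and it runs on synthetic spectra produced by a hypothetical helper, make_synthetic_spectra, because the measured XRF data are not part of this text. The SG window length and polynomial order, the channel count, the stratified split, and the random seeds are all assumptions. The outlier correction step is omitted because the abstract does not give enough detail to reproduce it.

```python
import numpy as np
from scipy.signal import savgol_filter
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.pipeline import make_pipeline


def make_synthetic_spectra(n_samples=78, n_channels=200, n_classes=4, seed=0):
    """Hypothetical stand-in for XRF data: one Gaussian peak per class
    on top of a quadratic baseline, plus noise. Not real measurements."""
    rng = np.random.default_rng(seed)
    x = np.linspace(-1.0, 1.0, n_channels)
    labels = np.arange(n_samples) % n_classes
    centers = np.linspace(-0.6, 0.6, n_classes)
    spectra = np.empty((n_samples, n_channels))
    for i, label in enumerate(labels):
        peak = np.exp(-((x - centers[label]) ** 2) / 0.005)
        baseline = 0.3 * x**2 + 0.1 * x + 0.5
        spectra[i] = peak + baseline + rng.normal(0.0, 0.02, n_channels)
    return spectra, labels


def preprocess(spectra, window=11, polyorder=2):
    """SG smoothing, then subtraction of a quadratic fitted to each spectrum.
    The window and polyorder values are assumed, not taken from the paper."""
    smoothed = savgol_filter(spectra, window_length=window,
                             polyorder=polyorder, axis=1)
    x = np.linspace(-1.0, 1.0, spectra.shape[1])
    # polyfit accepts one column per spectrum and fits them all at once
    coeffs = np.polyfit(x, smoothed.T, deg=2)   # shape (3, n_samples)
    baseline = np.vander(x, 3) @ coeffs         # shape (n_channels, n_samples)
    return smoothed - baseline.T


def run_demo(seed=0):
    """Return (mean 5-fold CV accuracy on the training set, test accuracy)."""
    spectra, labels = make_synthetic_spectra(seed=seed)
    features = preprocess(spectra)
    # 70/30 split; stratification is an assumption added for stable class counts
    x_train, x_test, y_train, y_test = train_test_split(
        features, labels, test_size=0.3, random_state=seed, stratify=labels)
    # PCA sits inside the pipeline so it is refitted on each training fold
    model = make_pipeline(PCA(n_components=3), LinearDiscriminantAnalysis())
    cv_acc = cross_val_score(model, x_train, y_train, cv=5).mean()
    test_acc = model.fit(x_train, y_train).score(x_test, y_test)
    return cv_acc, test_acc


cv_acc, test_acc = run_demo()
print(f"mean CV accuracy: {cv_acc:.3f}, test accuracy: {test_acc:.3f}")
```

Placing PCA inside the pipeline keeps the cross-validation estimate free of information from the held-out folds. Fitting a single quadratic to the whole spectrum is a crude form of baseline correction and may differ from the procedure used in the study.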