Journal information
Chemometrics and Intelligent Laboratory Systems
Elsevier BV
ISSN: 0169-7439

Indexed in: SCI, ISTP, EI
Status: formally published

    A sparse fused group lasso regression model for Fourier-transform infrared spectroscopic data with application to purity prediction in olive oil blends

    Soh, Chin Gi; Zhu, Ying
    11 pages
    Abstract: The percentage of olive oil present in an oil blend is of interest in the quality control of oils sold to consumers. One way to measure it is infrared spectroscopy. Analysis of the resulting data is challenging because of the high dimensionality of the data and the multicollinearity caused by issues such as the similarities between the chemical constituents of vegetable oils. This paper develops a sparse fused group lasso model for simultaneous feature selection and model fitting on Fourier-transform infrared spectroscopic data, and applies it to the task of percentage purity prediction in oil blends. The resulting optimization problem is solved via the alternating direction method of multipliers algorithm. The sparse fused group lasso method is seen to improve the interpretability of the resulting models while providing comparable prediction performance. Most importantly, it provides a flexible model that can capture group structure and smoothness in the coefficient structure.
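
As a rough illustration of the three ingredients named in the abstract (sparsity, fusion of neighbouring coefficients, and group structure), the sketch below evaluates one common form of such a penalty. The abstract does not give the exact formulation, so the three weights, the square-root group-size weighting, and the grouping itself are assumptions made for illustration only.

```python
import math

def sfgl_penalty(beta, groups, lam_sparse, lam_fused, lam_group):
    """Evaluate an illustrative sparse fused group lasso penalty.

    beta   : coefficients ordered along the spectrum (e.g. by wavenumber)
    groups : list of index lists partitioning beta (hypothetical grouping)
    lam_*  : non-negative weights (hypothetical names, not from the paper)
    """
    # Lasso term: encourages individual coefficients to be exactly zero.
    sparse = sum(abs(b) for b in beta)
    # Fused term: encourages neighbouring coefficients to be similar (smoothness).
    fused = sum(abs(beta[j] - beta[j - 1]) for j in range(1, len(beta)))
    # Group term: encourages whole groups of coefficients to vanish together.
    group = sum(
        math.sqrt(len(g)) * math.sqrt(sum(beta[j] ** 2 for j in g))
        for g in groups
    )
    return lam_sparse * sparse + lam_fused * fused + lam_group * group
```

A fitting routine would add this penalty to a squared-error loss and minimise the sum; the abstract states that the authors do so with the alternating direction method of multipliers, which is not reproduced here.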

    Efficacy of Transfer Learning-based ResNet models in Chest X-ray image classification for detecting COVID-19 Pneumonia

    Showkat, Sadia; Qureshi, Shaima
    10 pages
    Abstract: Because of COVID-19's effect on pulmonary tissues, Chest X-ray (CXR) and Computed Tomography (CT) images have become the preferred imaging modality for detecting COVID-19 infections at the early diagnosis stages, particularly when the symptoms are not specific. A significant fraction of individuals with COVID-19 have negative polymerase chain reaction (PCR) test results; therefore, imaging studies coupled with epidemiological, clinical, and laboratory data assist in decision making. With newer variants of COVID-19 emerging, the burden on diagnostic laboratories has increased manifold. Therefore, it is important to employ beyond-laboratory measures to solve complex CXR image classification problems. One such tool is the Convolutional Neural Network (CNN), one of the most dominant Deep Learning (DL) architectures. DL entails training a CNN for a task such as classification using extensive datasets. However, labelled data for COVID-19 is scarce, which is a prime impediment to applying DL-assisted analysis. The available datasets are either scarce or too diversified to learn effective feature representations; therefore, a Transfer Learning (TL) approach is utilized. The TL-based ResNet architecture has a powerful representational ability, making it popular in Computer Vision. The aim of this study is twofold: first, to assess the performance of ResNet models for classifying Pneumonia cases from CXR images, and second, to build a customized ResNet model and evaluate its contribution to the performance improvement. The global accuracies achieved by the five models, i.e., ResNet18_v1, ResNet34_v1, ResNet50_v1, ResNet101_v1, and ResNet152_v1, are 91.35%, 90.87%, 92.63%, 92.95%, and 92.95%, respectively. ResNet50_v1 displayed the highest sensitivity of 97.18%, ResNet101_v1 showed a specificity of 94.02%, and ResNet18_v1 had the highest precision of 93.53%. The findings are encouraging, demonstrating the effectiveness of ResNet in the automatic detection of Pneumonia for COVID-19 diagnosis. The customized ResNet model presented in this study achieved 95% global accuracy, 95.65% precision, 92.74% specificity, and 95.9% sensitivity, thereby allowing a reliable analysis of CXR images to facilitate the clinical decision-making process. All simulations were carried out in PyTorch utilizing a Quadro 4000 GPU with an Intel(R) Xeon(R) CPU E5-1650 v4 @ 3.60 GHz processor and 63.9 GB usable RAM.
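
The four figures of merit quoted above (accuracy, sensitivity, specificity, precision) all derive from a binary confusion matrix. The sketch below shows the standard definitions; the function name and the counts used in the usage note are hypothetical and are not taken from the study.

```python
def binary_metrics(tp, fp, tn, fn):
    """Standard binary classification metrics from confusion-matrix counts.

    tp/fp/tn/fn: true positives, false positives, true negatives,
    false negatives. Assumes no denominator is zero.
    """
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),   # recall on the positive class
        "specificity": tn / (tn + fp),   # recall on the negative class
        "precision": tp / (tp + fp),     # share of positive calls that are correct
    }
```

For example, `binary_metrics(tp=90, fp=10, tn=80, fn=20)` gives an accuracy of 0.85 and a precision of 0.9 for those made-up counts.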

    Classifying COVID-19 based on amino acids encoding with machine learning algorithms

    Alkady, Walaa; ElBahnasy, Khaled; Leiva, Victor; Gad, Walaa...
    11 pages
    Abstract: COVID-19 disease causes serious respiratory illnesses. Therefore, accurate identification of the viral infection cycle plays a key role in designing appropriate vaccines. The risk of this disease depends on proteins that interact with human receptors. In this paper, we formulate a novel model for COVID-19 named "amino acid encoding based prediction" (AAPred). This model is accurate, classifies the various coronavirus types, and distinguishes SARS-CoV-2 from other coronaviruses. With the AAPred model, we reduce the number of features to enhance its performance by selecting the most important ones using statistical criteria. The protein sequence of SARS-CoV-2 is analyzed to understand the viral infection cycle. Six machine learning classifiers (decision trees, k-nearest neighbors, random forest, support vector machine, bagging ensemble, and gradient boosting) are used to evaluate the model in terms of accuracy, precision, sensitivity, and specificity. We implement the obtained results computationally and apply them to real data from the National Genomics Data Center. The experimental results show that the AAPred model reduces the features to seven. The average accuracy of the 10-fold cross-validation is 98.69%, precision is 98.72%, sensitivity is 96.81%, and specificity is 97.72%. The features are selected using information gain and classified with random forest. The proposed model predicts the type of coronavirus and reduces the number of extracted features. We identify that SARS-CoV-2 has physicochemical characteristics similar to those of SARS-CoV in some regions. We also report that SARS-CoV-2 has infection cycles and sequences similar to those of SARS-CoV in some regions, indicating the effectiveness of vaccines on SARS-CoV-2. A comparison with deep learning shows results similar to those of our method.

    Channel and band attention embedded 3D CNN for model development of hyperspectral image in object-scale analysis

    Zhu, Fengle; Cai, Jianping; He, Mengzhu; Li, Xiaoli...
    11 pages
    Abstract: Recently, there has been a rising trend of employing convolutional neural networks (CNN) for modeling complex high-dimensional hyperspectral images in object-scale analysis. Compared with 1D CNN and 2D CNN, which extract only spectral or spatial features, the 3D CNN naturally offers a more effective method for simultaneously extracting integrated deep spectral-spatial features. Because convolution operates within a local receptive field, computer vision studies have incorporated the attention mechanism into 2D CNN to exploit the relationship between features for adaptive feature refinement. No exploration has been reported on incorporating the attention mechanism into 3D CNN for model development of hyperspectral images in object-scale analysis. In this study, we investigated an improved 3D CNN architecture with embedded attention modules for adaptive feature refinement in object-scale hyperspectral image modeling. Besides the adapted channel attention, a band attention module was specially designed to learn the band-wise relationship. Based on the 3D ResNet architecture, various modifications to the arrangement and structure of the channel and band attention modules were explored systematically for higher modeling performance. An exemplar hyperspectral image dataset of basil leaves, used to predict their relative chlorophyll content (RCC), was applied to evaluate the proposed model. Comprehensive comparison experiments showed performance improvement after adding attention modules into the residual block of 3D ResNet, demonstrating the effectiveness of adaptive feature refinement along the channel and band dimensions through the learned attention maps. The sequential channel-band attention module achieved the highest model performance, with a testing determination coefficient (R²) of 0.8998. The results indicated the effectiveness of the channel and band attention embedded 3D CNN for model development of hyperspectral images in object-scale analysis.

    Rock lithological instance classification by hyperspectral images using dimensionality reduction and deep learning

    Galdames, Francisco J.; Perez, Claudio A.; Estevez, Pablo A.; Adams, Martin...
    11 pages
    Abstract: Mining operations are part of the Industry 4.0 revolution, and there is a need to develop new ways to produce a flow of information among all the processes of a plant. In this context, the lithological classification of rocks just after extraction provides information related to their chemical composition and physical properties. Hyperspectral imaging is an exceptional tool for acquiring information to perform this characterization. We present a method based on deep learning and hyperspectral images, within the short-wavelength infrared range of 900-2500 nm, to perform lithological classification. The method performs an instance segmentation of the rocks, thus segmenting and classifying the rocks at the same time. A transfer learning methodology was applied by using a deep neural network pretrained with millions of color images to classify the rocks. To use this network, the dimensionality of the hyperspectral images is reduced from 268 to only 3 channels by another neural network. In addition, these 3-channel images can be used for human interpretation. We compare various deep network architectures and classical methods for performing dimensionality reduction. The method was tested on our hyperspectral image database with 13 different lithological classes, obtaining an F1-score above 96% and 98% in the instance and pixel-wise performance, respectively.

    Multi-classification deep CNN model for diagnosing COVID-19 using iterative neighborhood component analysis and iterative ReliefF feature selection techniques with X-ray images

    Aslan, Narin; Koca, Gonca Ozmen; Kobat, Mehmet Ali; Dogan, Sengul...
    11 pages
    Abstract: Background: The severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) disease has seriously affected worldwide health. It remains an important worldwide concern as the number of patients infected with this virus and the death rate are increasing rapidly. Early diagnosis is very important to hinder the spread of the coronavirus. Therefore, this article is intended to help radiologists automatically detect COVID-19 early on X-ray images. Iterative Neighborhood Component Analysis (INCA) and Iterative ReliefF (IRF) feature selection methods are applied to improve the performance criteria of trained deep Convolutional Neural Networks (CNN). Materials and methods: The COVID-19 dataset consists of a total of 15153 X-ray images for 4961 patient cases. The work includes thirteen different deep CNN model architectures. Normalized lung X-ray image data are analyzed for each deep CNN model to classify disease status into the categories Normal, Viral Pneumonia, and COVID-19. The performance criteria are improved by applying the INCA and IRF feature selection methods to the trained CNN in order to improve the analysis and forecasting results and to support faster and more accurate decisions. Results: Thirteen different deep CNN experiments and evaluations are successfully performed using an 80%/20% split of lung X-ray images for training and testing, respectively. The highest predictive values are seen in the analysis using INCA feature selection in the VGG16 network. The mean values of the performance criteria, namely accuracy, sensitivity, F-score, precision, MCC, dice, Jaccard, and specificity, are 99.14%, 97.98%, 99.58%, 98.80%, 97.81%, 98.83%, 97.68%, and 99.56%, respectively. This study indicates the usefulness of deep CNN models for classifying COVID-19 in X-ray images.

    Near-infrared spectroscopy with chemometrics for identification and quantification of adulteration in high-quality stingless bee honey

    Raypah, Muna E.; Zhi, Loh Jing; Loon, Lim Zi; Omar, Ahmad Fairuz...
    8 pages
    Abstract: This study presents a simple, rapid, and non-destructive approach that combines near-infrared spectroscopy (NIRS) with chemometrics for the evaluation of adulteration levels in stingless bee honey (SBH). Three high-quality SBH samples were obtained directly from the northern part of Malaysia and then adulterated with water (W) and apple cider vinegar (AC) over an adulteration range of 4.76%-50%. The NIRS analysis was performed in the spectral region of 700-1100 nm. The chemometric tools used include principal component analysis (PCA), hierarchical cluster analysis (HCA), principal component analysis-linear discriminant analysis (PCA-LDA), and partial least squares regression (PLSR). Using the first three principal components (PCs), with a total of 97.95% of explained variance, a complete distinction between adulterants, pure samples, and adulterated samples was achieved. A PCA-LDA model with 100% classification accuracy in prediction was able to discriminate each SBH adulterated with W and AC. A general PLSR model to quantify the level of adulteration was developed. The best prediction model used 7 factors, with a high correlation coefficient 'R' and a low root mean square error of prediction 'RMSEP' (R_P = 0.995 and RMSEP = 1.350%). These results confirm the combined ability of NIRS in the 700-1100 nm region and chemometrics to effectively discriminate and quantify adulterated SBHs.
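
The two regression figures of merit quoted above can be computed from reference and predicted adulteration levels as follows. This is a minimal sketch using the usual definitions of RMSEP and the Pearson correlation coefficient; the function names are illustrative, the abstract does not say how the authors computed the values, and the data in the usage note are invented.

```python
import math

def rmsep(y_true, y_pred):
    """Root mean square error of prediction over a test set."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)

def pearson_r(y_true, y_pred):
    """Pearson correlation between reference and predicted values."""
    mt = sum(y_true) / len(y_true)
    mp = sum(y_pred) / len(y_pred)
    cov = sum((t - mt) * (p - mp) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mt) ** 2 for t in y_true)
    var_p = sum((p - mp) ** 2 for p in y_pred)
    return cov / math.sqrt(var_t * var_p)
```

With invented reference levels `[0, 10, 20, 30]` and predictions `[1, 9, 21, 29]` (in percent), every prediction is off by one unit, so `rmsep` returns 1.0.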

    A quasi-qualitative strategy for FT-NIR discriminant prediction: Case study on rapid detection of soil organic matter

    Chen, Huazhou; Xu, Lili; Gu, Jie; Meng, Fangxiu...
    8 pages
    Abstract: Fourier transform near infrared (FT-NIR) spectroscopy is a technology that provides direct and rapid quantitative determination of soil organic matter (SOM). In this paper, a new discriminant method is proposed for quasi-qualitative determination by combining the interval search principal component analysis algorithm with logistic regression (iPCA-LR). We first predict the SOM content of soil samples using partial least squares (PLS) regression. To build a quasi-qualitative analytical strategy, we design various fault-tolerant thresholds. Samples are marked as accurate or non-accurate according to the predicted values from the prior PLS model and the thresholds. The quantitative calibration model is thereby transformed into a quasi-qualitative discriminant model. We then use iPCA-LR to select informative FT-NIR wavebands with parameter optimization, according to the optimal discriminant accuracy. Results show that the FT-NIR quasi-qualitative discriminant predictive accuracy varies significantly with the threshold, but that the optimal accuracy climbed to above 74%. Furthermore, testing different informative wavebands yields optimal calibration models with an accuracy above 88%. In the FT-NIR prediction of SOM content, iPCA-LR converts the quantitative problem into a quasi-qualitative discriminant problem when combined with the threshold-transformed PLS results. The quasi-qualitative strategy helps to overcome over-idealistic modeling in PLS quantitative analysis. It is more beneficial for the real-time application of spectroscopy technology.
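
The threshold step described above, which turns quantitative PLS predictions into accurate/non-accurate marks, can be sketched as below. The abstract does not state whether the tolerance is applied to the absolute or the relative error, so the absolute-error rule here is an assumption, as are the function name and the numbers in the usage note.

```python
def label_by_threshold(reference, predicted, threshold):
    """Mark each sample 1 (accurate) or 0 (non-accurate).

    A sample counts as accurate when its prediction lies within
    `threshold` of the reference value (absolute error; assumed rule).
    """
    return [
        1 if abs(p - r) <= threshold else 0
        for r, p in zip(reference, predicted)
    ]
```

For instance, with reference values `[2.0, 3.5, 1.2]`, predictions `[2.3, 3.0, 1.25]`, and a tolerance of 0.4, the labels are `[1, 0, 1]`. These binary labels would then serve as the target of the discriminant model, and loosening the tolerance changes how many samples count as accurate.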

    Deep reinforced neural network model for cyto-spectroscopic analysis of epigenetic markers for automated oral cancer risk prediction

    Ghosh, Aritri; Chaudhuri, Dwiteeya; Adhikary, Shreya; Chatterjee, Kabita...
    11 pages
    Abstract: Understanding epigenetic changes can provide vital information for early-stage oral cancer diagnosis. Vibrational spectroscopy methods such as Raman spectroscopy (RS) and Fourier Transform Infrared Spectroscopy (FTIR) offer several advantages over conventional molecular biology methods by incorporating information from fingerprint regions. Moreover, advanced spectral analysis tools such as deep learning (DL) techniques can be applied efficiently to analyze large spectral datasets and extract the vital features. Epigenetic changes are identified in oral epithelial cells of healthy individuals, oral leucoplakia patients, and squamous cell carcinoma patients through analysis of Raman (400-1800 cm⁻¹) and FTIR (700-2000 cm⁻¹) data. A deep reinforced neural network (DRNN) model is employed to classify the epigenetic changes identified from the Raman and FTIR spectra. The feature extraction layer of the DL model uses a peak detection layer and a reinforced learning layer to identify significant epigenetic features. The classification layer is made up of N back-propagated Artificial Neural Network (ANN) layers. The DL model developed is fully automated and overcomes the wave shift problem of spectroscopic data. The testing accuracy of the proposed DRNN model is 83.33%. Class-wise accuracies for NRML, OLPK, and OSCC are 83.3%, 87%, and 95.24%, respectively. The proposed DRNN model attains an overall ROC of 0.88. The present study employs a combination of two complementary vibrational spectroscopy methods for oral cancer detection and chemometric analysis of the spectral features with the DRNN model. Identification of the epigenetic changes and use of this knowledge in cancer prediction will enable the development of a smart point-of-care diagnostic system.

    Response oriented covariates selection (ROCS) for fast block order- and scale-independent variable selection in multi-block scenarios

    Mishra, Puneet; Metz, Maxime; Marini, Federico; Biancolillo, Alessandra...
    9 pages
    Abstract: Multi-block datasets are widely encountered in the chemometrics domain, and several data fusion approaches have recently been proposed to treat them. Apart from exploratory and predictive modelling, a key task in this context is feature selection, which involves finding complementary variables across multiple data blocks that jointly provide a good explanation of the response variables, revealing the key variables of the system. To that end, a new method called response-oriented covariate selection (ROCS) is proposed here. ROCS is a direct extension of the covariance selection (CovSel) approach to multi-block scenarios, where the choice is based on a competition between variables in different blocks, as is done in the response-oriented sequential alternation (ROSA) method. The distinguishing features of the ROCS method are its simplicity, fast execution speed, insensitivity to block order, and scale invariance. The evaluation of ROCS is presented using several multi-block modelling cases and by comparison with other variable selection methods.
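
The idea of variables from different blocks competing for selection can be illustrated with a generic greedy scheme in the spirit of covariance selection. This is not the published ROCS algorithm, whose details the abstract does not give: the scoring rule (squared covariance with the response residual divided by the variable's own variance, which makes the score independent of each variable's scale), the single-response setting, and all names are assumptions made for this sketch.

```python
def _center(v):
    m = sum(v) / len(v)
    return [x - m for x in v]

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def greedy_block_selection(blocks, y, n_select):
    """Greedy variable selection across blocks (illustrative only).

    blocks   : list of blocks, each a list of columns (lists of floats)
    y        : single response vector
    Returns (block_index, column_index) pairs in selection order.
    """
    cols = {(b, j): _center(col)
            for b, block in enumerate(blocks)
            for j, col in enumerate(block)}
    resid = _center(y)
    chosen = []
    for _ in range(n_select):
        best, best_score = None, 0.0
        for key, col in cols.items():
            norm2 = _dot(col, col)
            if key in chosen or norm2 < 1e-12:
                continue  # skip selected or fully deflated variables
            # All variables from all blocks compete on one scale-free score.
            score = _dot(col, resid) ** 2 / norm2
            if score > best_score:
                best, best_score = key, score
        if best is None:
            break
        chosen.append(best)
        t = cols[best]
        tt = _dot(t, t)
        # Deflate the response and the remaining variables by the winner.
        cy = _dot(t, resid) / tt
        resid = [r - cy * ti for r, ti in zip(resid, t)]
        for key, col in cols.items():
            if key != best:
                c = _dot(t, col) / tt
                cols[key] = [x - c * ti for x, ti in zip(col, t)]
    return chosen
```

Because every candidate is scored in one pool, swapping the order of the blocks or rescaling a block changes only the labels of the selected variables in this sketch, not which variables win.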