Journal information
Chemometrics and Intelligent Laboratory Systems
Elsevier BV
ISSN: 0169-7439

Indexed in: SCI, ISTP, EI
Officially published

    A calibrant-free drift compensation method for gas sensor arrays

    Maho, Pierre; Herrier, Cyril; Livache, Thierry; Comon, Pierre; ...
    11 pages
    Abstract: Gas sensors lack repeatability over time. They are affected by drift, the result of changes at the sensor level and in the environment. A solution is to design software methods that compensate for the drift. Existing methods are often based on calibration samples acquired at the start of each new measurement session. However, finding a good reference compound is a difficult task and generating calibration samples is time-consuming. We propose a model-based correction method which does not require any calibration sample over time, operating 'blindly'. In this study, we focus on the drift affecting electronic noses. To this end, we built a real data set acquired over 9 months in real-life conditions. By using the proposed method, we show that the drift is partly compensated, thus increasing the reliability of the electronic nose. We also show that the algorithm can easily adapt if the target compounds are not all sampled during every session.

    Online Nonnegative and Sparse Canonical Polyadic Decomposition of Fluorescence Tensors

    Sanou, Isaac Wilfried; Redon, Roland; Luciani, Xavier; Mounier, Stephane; ...
    14 pages
    Abstract: The NonNegative Canonical Polyadic Decomposition (NN-CPD) is used in many fields such as chemistry, biology and medicine. The data coming from these fields can be dynamic, which leads to the use of real-time or "online" decomposition. Even though there are a variety of online tensor decomposition algorithms, the main assumption of all these algorithms is that the rank of the decomposition is known and/or does not vary over time. However, this should not be the case in experimental conditions. In this work, we propose three algorithms to compute the online NN-CPD based on sparse dictionary learning for tracking chemical components in water by using a set of Emission and Excitation Matrices (EEMs) of fluorescence. The methods developed in this work are not limited to this application field, and they address the major challenges posed by the variation of the CPD rank in real time. First, the algorithms take into account the unknown factors and the variation of tensor rank. Second, previously extracted information is used to decompose upcoming new tensors. In addition to the development of these algorithms, one of the contributions of this paper is the real-time acquisition of fluorescence data in a semi-controlled environment. These algorithms were applied to these real datasets and compared to state-of-the-art algorithms.
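
    For readers unfamiliar with the model named in this abstract, the standard third-order nonnegative CPD can be written as below. The index roles (sample, excitation wavelength, emission wavelength) and the symbols a, b, c, R are the usual convention for stacks of fluorescence matrices and are an assumption here, not notation taken from the paper.

```latex
x_{ijk} \;\approx\; \sum_{r=1}^{R} a_{ir}\, b_{jr}\, c_{kr},
\qquad a_{ir} \ge 0,\quad b_{jr} \ge 0,\quad c_{kr} \ge 0
```

    Here R is the rank of the decomposition, the quantity that the abstract says existing online algorithms assume to be known or constant over time.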

    Fault detection and recognition by hybrid nonnegative matrix factorizations

    Jia, Qilong
    11 pages
    Abstract: Fault detection and recognition aim to identify which operating mode, among the normal and faulty operating modes of an industrial process, the current operating mode belongs to. This paper develops a method for fault detection and recognition using hybrid nonnegative matrix factorizations (HNMF), where the term 'hybrid' refers to the fact that these models utilize nonnegative matrix factorization objective functions built upon ideas from graph theory and information theory. Although HNMF absorb a variety of advanced theories and are significantly different from the existing nonnegative matrix factorizations (NMF), they are still convergent in theory. To this end, the paper designs a feasible technical roadmap for performing fault detection and recognition using HNMF. Due to the incorporation of NMF, graph theory, and information theory, HNMF show advantages over the existing NMF in terms of fault detection and recognition. More importantly, the proposed fault detection and recognition approach has advantages over NMF-based approaches, which is demonstrated through a case study on a penicillin fermentation process.
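
    As background for the NMF baseline this abstract builds on, here is a minimal sketch of plain NMF with the classical multiplicative updates for the Frobenius loss. It is not the hybrid (graph- and information-theoretic) objective of the paper, whose details the abstract does not give; the function name, parameters and synthetic data are hypothetical.

```python
import numpy as np

def nmf_multiplicative(X, rank, n_iter=200, eps=1e-9, seed=0):
    """Plain NMF, X ~ W @ H, via multiplicative updates (Frobenius loss).

    This is the textbook baseline, NOT the HNMF objective from the paper.
    """
    rng = np.random.default_rng(seed)
    n, m = X.shape
    W = rng.random((n, rank)) + eps   # nonnegative random initialization
    H = rng.random((rank, m)) + eps
    errors = []
    for _ in range(n_iter):
        # Each update multiplies by a nonnegative ratio, so W and H stay >= 0.
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
        errors.append(float(np.linalg.norm(X - W @ H)))
    return W, H, errors

# Synthetic nonnegative data of exact rank 3 (illustration only).
rng = np.random.default_rng(1)
X = rng.random((30, 3)) @ rng.random((3, 12))
W, H, errors = nmf_multiplicative(X, rank=3)
```

    In a monitoring setting, one would typically fit such a model on normal operating data and track the reconstruction error of new samples; how HNMF does this is described in the paper itself.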

    Development of a data-driven scientific methodology: From articles to chemometric data products

    Carballo-Meilan, Ara; McDonald, Lewis; Pragot, Wanawan; Starnawski, Lukasz Michal; ...
    13 pages
    Abstract: Information and data science algorithms were combined to predict the outcome of an experiment in chemical engineering. Using the Scientific Method workflow, we started the journey with the formulation of a specific question. At the research stage, the common process of querying and reading articles on scientific databases was substituted by a systematic review with a built-in recursive data mining method. This procedure identifies a specific community of knowledge with the key concepts and experiments that are necessary to address the formulated question. A small subset of relevant articles from a very specific topic among thousands of papers was identified while ensuring minimal loss of information through the process. The secondary dataset was bigger than a common individual study. The process revealed the main ideas currently under study and identified optimal synthesis conditions to produce a chemical substance. Once the research step was finished, the experimental information was compiled and prepared for meta-analysis using a supervised learning algorithm. This is a hypothesis generation stage whereby the secondary dataset was transformed into experimental knowledge about a particular chemical reaction. Finally, the predicted sets of optimal conditions to produce the desired chemical compound were validated in the laboratory.

    Characterization of uncertainties and model generalizability for convolutional neural network predictions of uranium ore concentrate morphology

    McDonald, Luther W.; Nizinski, Cody A.; Ly, Cuong; Vachet, Clement; ...
    12 pages
    Abstract: As the capabilities of convolutional neural networks (CNNs) for image classification tasks have advanced, interest in applying deep learning techniques for determining the natural and anthropogenic origins of uranium ore concentrates (UOCs) and other unknown nuclear materials by their surface morphology characteristics has grown. But before CNNs can join the nuclear forensics toolbox alongside more traditional analytical techniques - such as scanning electron microscopy (SEM), X-ray diffractometry, mass spectrometry, radiation counting, and any number of spectroscopic methods - a deeper understanding of "black box" image classification will be required. This paper explores uncertainty quantification for convolutional neural networks and their ability to generalize to out-of-distribution (OOD) image data sets. For prediction uncertainty, Monte Carlo (MC) dropout and random image crops as variational inference techniques are implemented and characterized. Convolutional neural networks and classifiers using image features from unsupervised vector-quantized variational autoencoders (VQVAE) are trained using SEM images of pure, unaged, unmixed uranium ore concentrates considered "unperturbed." OOD data sets are developed containing perturbations from the training data with respect to the chemical and physical properties of the UOCs or data collection parameters; predictions made on the perturbation sets identify where significant shortcomings exist in the current training data and techniques used to develop models for classifying uranium process history, and provide valuable insights into how datasets and classification models can be improved for better generalizability to out-of-distribution examples.
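
    The MC dropout idea mentioned in this abstract can be illustrated with a toy example: dropout is left active at prediction time, and the spread of the class probabilities over many stochastic passes serves as an uncertainty estimate. The sketch below uses a small dense network with random weights, not a CNN and not the authors' model; all names, sizes and the dropout rate are assumptions.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)   # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def mc_dropout_predict(x, W1, W2, p_drop=0.5, n_passes=100, seed=0):
    """Mean and standard deviation of class probabilities over stochastic passes."""
    rng = np.random.default_rng(seed)
    probs = []
    for _ in range(n_passes):
        h = np.maximum(x @ W1, 0.0)            # hidden layer with ReLU
        mask = rng.random(h.shape) >= p_drop   # dropout stays ON at prediction time
        h = h * mask / (1.0 - p_drop)          # inverted-dropout rescaling
        probs.append(softmax(h @ W2))
    probs = np.stack(probs)                    # shape: (n_passes, n_classes)
    return probs.mean(axis=0), probs.std(axis=0)

# Toy two-layer network with random weights (hypothetical, for illustration).
rng = np.random.default_rng(42)
W1 = rng.normal(size=(8, 16))
W2 = rng.normal(size=(16, 3))
x = rng.normal(size=8)
mean_prob, std_prob = mc_dropout_predict(x, W1, W2)
```

    A large per-class standard deviation signals an input on which the stochastic sub-networks disagree, which is one way to flag possibly out-of-distribution samples.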

    Discrimination and source correspondence of black gel inks using Raman spectroscopy and chemometric analysis with UMAP and PLS-DA

    Asri, Muhammad Naeim Mohamad; Verma, Rajesh; Mahat, Naji Arafat; Nor, Nor Azman Mohd; ...
    9 pages
    Abstract: In the examination of forged documents, ink analysis plays an important role and the forensic scientist is required to opine on the origin of ink and colorants when the physical appearance is similar. It is also necessary to link the ink with its source, as this has an important bearing on solving cases involving documents. In this study, we explore a relatively new type of writing instrument, the gel ink pen, which is commonly used by perpetrators of fraud. The favored approach in ink analysis is a non-destructive technique such as Raman spectroscopy combined with chemometrics for objective and automated examinations. Most studies so far have used unsupervised chemometrics such as PCA and HCA for data exploration, without any use of supervised methods for classification and source prediction of inks. In recent years, more complex unsupervised algorithms such as t-Distributed Stochastic Neighbor Embedding (t-SNE) and Uniform Manifold Approximation and Projection (UMAP) have emerged, which are frequently employed in big data scenarios. These strategies are also appropriate for the type of data we used in this study. Partial Least Squares-Discriminant Analysis (PLS-DA) is a supervised classification technique commonly used to classify samples into known groups and predict the class of unknown samples. The performance of PLS-DA has been reported to be near perfect for datasets with few classes; however, its applicability to large datasets remains to be explored. In this study, we report the application of PLS-DA for the classification of black gel ink samples (n = 140) from 14 different brands. To demonstrate the applicability of PLS-DA in a forensic investigation involving unknown ink deposited on documents, we tested 11 unknown samples against the trained PLS-DA model and achieved a 91% correct classification rate. We also demonstrated that misclassification due to large datasets can be mitigated by UMAP exploration followed by PLS-DA applied to datasets with a reduced number of classes. The procedure of using UMAP and PLS-DA may prove useful for unveiling the identity of black gel ink deposited on forged documents.

    RKPCA-based approach for fault detection in large scale systems using variogram method

    Kalb, Mohammed Tahar Habib; Kouadri, Abdelmalek; Harkat, Mohamed Faouzi; Bensmail, Abderazak; ...
    8 pages
    Abstract: The Principal Component Analysis (PCA)-based approach for fault detection is a simple and accurate data-driven technique for feature extraction and selection. However, PCA performs poorly if the data used has nonlinear characteristics, a type of data widely present in most industrial processes. To overcome this drawback, Kernel PCA (KPCA) is an alternative technique for this type of data, but it requires more computation time and memory storage space for large data sets. Many size reduction techniques have been developed to select the most relevant observations that will be employed by KPCA. This approach, known as Reduced KPCA (RKPCA), consequently requires less computation time and memory storage space than KPCA. Besides, it possesses the advantages of both KPCA and standard PCA. In this paper, a reduction in the size of a data set based on a multivariate variogram is proposed. According to its conventional formalism, the uncorrelated observations are selected and kept to form a reduced training data set. Afterward, the KPCA model is built from this data set for fault detection purposes. The proposed RKPCA scheme is tested using an actual involuntary process fault and various simulated sensor faults in a cement plant. Compared to other RKPCA techniques, the developed one yields better results.
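
    To make the RKPCA pipeline in this abstract concrete, the sketch below builds a kernel PCA model on a reduced training set and computes a Hotelling-type T² statistic for a new observation. The reduction step is a plain every-fourth-sample placeholder, NOT the multivariate variogram selection proposed in the paper, and the RBF kernel, its gamma value, and all function names are assumptions.

```python
import numpy as np

def rbf_kernel(A, B, gamma):
    """Gaussian (RBF) kernel matrix between the rows of A and the rows of B."""
    sq = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def fit_kpca(X, n_components, gamma):
    n = X.shape[0]
    K = rbf_kernel(X, X, gamma)
    J = np.eye(n) - np.full((n, n), 1.0 / n)      # centering matrix
    Kc = J @ K @ J                                # kernel centered in feature space
    eigval, eigvec = np.linalg.eigh(Kc)           # eigenvalues in ascending order
    idx = np.argsort(eigval)[::-1][:n_components]
    lam, V = eigval[idx], eigvec[:, idx]
    alphas = V / np.sqrt(lam)                     # unit-norm feature-space directions
    # lam / n is the variance of the training scores along each component.
    return {"X": X, "K": K, "alphas": alphas, "var": lam / n, "gamma": gamma}

def t2_statistic(model, x_new):
    """Hotelling-type T^2 of one observation in the kernel principal subspace."""
    K = model["K"]
    k = rbf_kernel(x_new[None, :], model["X"], model["gamma"])[0]
    kc = k - K.mean(axis=0) - k.mean() + K.mean() # center the test kernel vector
    t = kc @ model["alphas"]                      # kernel principal scores
    return float(np.sum(t**2 / model["var"]))

# Synthetic data (illustration only).
rng = np.random.default_rng(0)
X_train = rng.normal(size=(40, 3))
X_reduced = X_train[::4]   # placeholder reduction, NOT the variogram-based selection
model = fit_kpca(X_reduced, n_components=2, gamma=0.5)
t2_new = t2_statistic(model, X_train[1])
```

    The cost of the eigendecomposition grows with the cube of the number of retained observations, which is why building the model on the reduced set saves time and memory.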

    Root cause analysis of industrial faults based on binary extreme gradient boosting and temporal causal discovery network

    Qin, Kai; Chen, Lei; Shi, Jintao; Li, Zhenxing; ...
    13 pages
    Abstract: As modern industry develops, fault identification and root cause analysis have become a significant problem. In this work, a new framework is developed to analyze the root cause of faults in the absence of historical fault information. First, binary-extreme gradient boosting (Bi-Xgboost) is proposed to analyze the fault contribution of variables. When a new fault occurs, the changes in the importance of variables before and after the fault occurrence are compared, and the contribution of the process variables to the fault is calculated. Second, a fault variable screening method based on the number of variables, called mean contribution threshold (MCT), is proposed for screening the appropriate number of fault variables. In addition, a temporal causal discovery network (TCDN) is introduced for root cause analysis with causal time lag information. The proposed framework was validated on the Tennessee Eastman process, and results show that it can identify exact root causes and propagation paths of faults without historical fault data modeling.
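
    The screening step in this abstract can be sketched as follows. This is one plausible reading, not the paper's definition: contributions are taken as normalized absolute changes in variable importance, and a variable is kept when its contribution exceeds the mean contribution, which equals 1 divided by the number of variables. The function names, variable names and importance values are all hypothetical.

```python
def fault_contributions(importance_before, importance_after):
    """Normalized absolute change in importance per variable (assumed definition)."""
    changes = {v: abs(importance_after[v] - importance_before[v])
               for v in importance_before}
    total = sum(changes.values())
    if total == 0:
        return {v: 0.0 for v in changes}
    return {v: c / total for v, c in changes.items()}

def screen_by_mean_threshold(contrib):
    """Keep variables whose contribution exceeds the mean contribution 1/n."""
    threshold = 1.0 / len(contrib)   # the threshold depends only on the variable count
    kept = [v for v, c in contrib.items() if c > threshold]
    return sorted(kept, key=lambda v: -contrib[v])

# Hypothetical importances before and after a fault (illustration only).
before = {"x1": 0.30, "x2": 0.25, "x3": 0.25, "x4": 0.20}
after = {"x1": 0.05, "x2": 0.60, "x3": 0.20, "x4": 0.15}
contrib = fault_contributions(before, after)
selected = screen_by_mean_threshold(contrib)
```

    In the framework described above, the importances would come from the boosted-tree models and the selected variables would be passed on to the causal discovery stage.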

    Strategies for robust designs in toxicological tests

    Pozuelo-Campos, Sergio; Casero-Alonso, Victor; Amo-Salas, Mariano
    10 pages
    Abstract: Toxicological tests are widely used to study toxicity in aquatic environments. Reproduction is a possible endpoint of this type of experiment, whose response variable is given by counts. There is a body of literature on the most suitable probability distribution to be used for analyzing the data. In the theory of optimal experimental design, the assumption of this probability distribution is essential, and when this assumption is not appropriate, there may be a loss of efficiency in the design obtained. The main objective of this study is to propose robust designs when there is uncertainty about the probability distribution of the response variable. Three different strategies for attaining this goal are introduced and compared, and they are then applied to toxicological tests based on Ceriodaphnia dubia and Lemna minor. In addition, a simulation study is performed to test the estimation properties of the robust designs obtained.

    UV spectroscopy as a quantitative monitoring tool in a dairy side-stream fractionation process

    Tonolini, Margherita; Skou, Peter Baek; van den Berg, Frans W. J.
    8 pages
    Abstract: In this work, we investigate the feasibility of multi-wavelength ultraviolet spectroscopy for the quantification of beta-lactoglobulin and alpha-lactalbumin in ultrafiltration permeates from protein fractionation processes. Spectra from solutions of pure proteins were compared and distinctive characteristics for the two proteins were identified. Subsequently, two different calibration approaches were tested to overcome the "cage of covariance" that is inherent in protein fractionation and up-concentration processes. Selection of wavelength regions allowed for the prediction of the beta-lactoglobulin and alpha-lactalbumin concentration with high precision and accuracy, reaching a root mean square error of cross-validation of 0.26 w/w% (concentration range 0-10 w/w%) protein for alpha-lactalbumin and 0.11 w/w% (0-10 w/w%) protein for beta-lactoglobulin. This proves the potential of the methods developed for implementation as rapid monitoring of protein composition in permeates from ultrafiltration processes. The developed Partial Least Squares (PLS) regression models were used to predict protein composition in a continuous mode during two lab-scale filtration experiments. The results obtained show that UV spectroscopy can be used, along with tailored chemometrics techniques, for monitoring protein composition in protein fractionation processes both at-line and potentially in-line.
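
    The figure of merit quoted in this abstract, the root mean square error of cross-validation (RMSECV), is commonly defined as below. The symbols are assumptions for illustration: n is the number of calibration samples, y_i the reference concentration of sample i, and \hat{y}_i^{\mathrm{CV}} the prediction for sample i from a model fitted without the cross-validation segment containing it. The abstract does not state which segmentation scheme was used.

```latex
\mathrm{RMSECV} \;=\; \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( y_i - \hat{y}_i^{\mathrm{CV}} \right)^{2}}
```

    The result carries the units of the response, which is why the reported values are given in w/w% alongside the 0-10 w/w% concentration range.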