Abstract: Transfer learning is an important research topic in machine learning and has recently been introduced into evolutionary computation to form evolutionary multi-task optimization (EMTO). EMTO focuses on tackling multiple optimization tasks simultaneously based on knowledge transfer and reuse. Many EMTO algorithms have been proposed, mainly for single-objective optimization problems, and have achieved success in various fields, yet multi-objective EMTO remains a big challenge. The existing multi-objective EMTO algorithms tend to suffer from issues such as slow convergence and degraded performance on less correlated tasks. To alleviate these issues, this paper proposes a new multi-objective EMTO algorithm that introduces cross-dimensional variable search and prediction-based individual search for efficient knowledge transfer. The cross-dimensional variable search optimizes a decision variable using information collected from other variables. The prediction-based individual search performs individual mapping, in which the offspring solutions and the corresponding parent solutions are symmetrized about the predicted population center to maintain population diversity. The proposed algorithm is tested on benchmark problems, and the experimental results demonstrate its effectiveness and efficiency. (c) 2021 Elsevier Inc. All rights reserved.
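The symmetrization step described in this abstract can be read as a point reflection of each parent through a predicted center. The following is a minimal sketch under that reading, not the paper's operator: the linear-extrapolation predictor and the names `predict_center` and `reflect_about_center` are assumptions introduced here for illustration.

```python
import numpy as np

def predict_center(prev_center, curr_center):
    """Assumed predictor: extrapolate the center one step along its last move."""
    return curr_center + (curr_center - prev_center)

def reflect_about_center(parents, center):
    """Point reflection: each offspring mirrors its parent through `center`."""
    return 2.0 * center - parents

# Two consecutive generations of a toy 2-D population (hypothetical data).
prev_pop = np.array([[0.0, 0.0], [2.0, 2.0]])
curr_pop = np.array([[1.0, 1.0], [3.0, 3.0]])

center = predict_center(prev_pop.mean(axis=0), curr_pop.mean(axis=0))
offspring = reflect_about_center(curr_pop, center)
```

By construction, the midpoint of every parent and its offspring is the predicted center, which is one way to place new solutions on the far side of where the population is expected to move.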
Abstract: Studies on the modeling and analysis of time series based on fuzzy granulation have shown that fuzzy granulation is an effective approach to data mining of time series. However, the investigation of fuzzy granulation of interval-valued time series (ITS) and its applications has begun only in recent years, and few research results have appeared. Unlike the existing studies, this paper investigates fuzzy granulation of ITS in interval number space instead of real number space. Two distinct concepts, namely static fuzzy information granules and dynamic fuzzy information granules of ITS, are first proposed. Then approaches for constructing a static fuzzy information granule and a dynamic fuzzy information granule are designed. After that, two fuzzy granulation methods for ITS based on these two approaches are presented in the framework of interval analysis, under the guidance of the principle of justifiable granularity. Based on a specific proposed fuzzy granule, called the linear dynamic fuzzy granule, a long-term forecasting model for ITS is developed with the aid of an artificial neural network. Experiments conducted on several ITS from stock markets with different dynamic characteristics show the superior performance of the proposed long-term forecasting model. (c) 2021 Elsevier Inc. All rights reserved.
Abstract: The uncertain graph is widely used to model and analyze graph data in which the relation between objects is uncertain. Here we study structural clustering in uncertain graphs. As an important method in graph clustering, structural clustering can discover not only the densely connected core vertices, but also the hub vertices and the outliers. We propose a new clustering model, named stable structural clustering, to solve a problem in previous models: the mined core vertex is a 'real' core vertex in only a small number of possible worlds of the uncertain graph. Specifically, we first propose the concept of probability core reliability, which measures the probability of a vertex being a core vertex in the uncertain graph. On the basis of probability core reliability, we define the stable core vertex and formulate the stable structural clustering problem. Compared with other structural clustering models, the proposed stable structural clustering performs better on crucial indicators that reflect the quality of clustering. We develop two algorithms to compute stable core vertices, an exact algorithm based on dynamic programming and a sampling-based algorithm with some effective pruning techniques, on which we base our structural clustering algorithm. Extensive experiments show that, compared with other structural clustering algorithms for uncertain graphs, the proposed stable structural clustering algorithms obtain better clusterings to a certain extent. (c) 2021 Elsevier Inc. All rights reserved.
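To give a feel for the dynamic-programming idea, the sketch below computes the probability that at least `mu` of a vertex's incident edges exist when edges are independent. This is a deliberately simplified stand-in: the abstract does not give the paper's definition of a core vertex, and the structural-similarity condition used in structural clustering is ignored here. The function name and the threshold `mu` are assumptions.

```python
def prob_at_least(edge_probs, mu):
    """P(at least `mu` of the independent edges exist), via DP over the edges."""
    # dp[k] = probability that exactly k of the edges processed so far exist
    dp = [1.0] + [0.0] * len(edge_probs)
    for i, p in enumerate(edge_probs, start=1):
        # Update from high k to low k so dp[k - 1] is still the previous row.
        for k in range(i, 0, -1):
            dp[k] = dp[k] * (1.0 - p) + dp[k - 1] * p
        dp[0] *= 1.0 - p
    return sum(dp[mu:])

# A vertex with two incident edges, each present with probability 0.5.
reliability = prob_at_least([0.5, 0.5], mu=1)
```

The table has one entry per possible edge count, so the cost is quadratic in the vertex degree, whereas enumerating possible worlds would be exponential.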
Castan-Lascorz, M. A.; Jimenez-Herrera, P.; Troncoso, A.; Asencio-Cortes, G....
17 pages
Abstract: Time series forecasting has become indispensable for multiple applications and industrial processes. A large number of algorithms have been developed to forecast time series, each suitable depending on the characteristics and patterns to be inferred in a given case. In this work, a new algorithm is proposed to predict both univariate and multivariate time series based on a combination of clustering, classification and forecasting techniques. The proposed algorithm first groups windows of time series values with similar patterns by applying a clustering process. Then, a specific forecasting model is built for each pattern, and it is trained only with the time windows corresponding to that pattern. The new algorithm has been designed using a flexible framework that allows the model to be generated using any combination of approaches within multiple machine learning techniques. To evaluate the model, several experiments are carried out using different configurations of its clustering, classification and forecasting methods. The results are analyzed and compared to classical prediction models, such as autoregressive integrated moving average and Holt-Winters models, to very recent forecasting methods, including deep long short-term memory neural networks, and to well-known methods in the literature, such as k-nearest neighbors, classification and regression trees, and random forest. (c) 2021 The Author(s). Published by Elsevier Inc. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
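The cluster-then-forecast pipeline described above can be sketched as follows. This is an illustrative sketch, not the authors' implementation: plain k-means, nearest-centroid assignment and per-cluster least-squares models are stand-ins chosen here for the clustering, classification and forecasting components, and all function names and the parameters `w` and `k` are assumptions.

```python
import numpy as np

def make_windows(series, w):
    """Return (windows of w consecutive values, the value following each window)."""
    X = np.array([series[i:i + w] for i in range(len(series) - w)])
    return X, series[w:]

def assign(X, centers):
    """Index of the nearest center for every row of X (the classification step)."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
    return d2.argmin(axis=1)

def kmeans(X, k, iters=20, seed=0):
    """Plain Lloyd's k-means, a stand-in for any clustering method."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(iters):
        labels = assign(X, centers)
        for j in range(k):
            if np.any(labels == j):  # an empty cluster keeps its old center
                centers[j] = X[labels == j].mean(axis=0)
    return centers

def fit(series, w=4, k=2):
    """Cluster the windows, then fit one linear model per cluster."""
    series = np.asarray(series, dtype=float)
    X, y = make_windows(series, w)
    centers = kmeans(X, k)
    labels = assign(X, centers)
    A = np.hstack([X, np.ones((len(X), 1))])  # window values plus intercept
    fallback = np.linalg.lstsq(A, y, rcond=None)[0]
    coefs = []
    for j in range(k):
        mask = labels == j
        # Train only on the windows of this pattern; fall back if none exist.
        coefs.append(np.linalg.lstsq(A[mask], y[mask], rcond=None)[0]
                     if mask.any() else fallback)
    return centers, coefs

def predict_next(window, centers, coefs):
    """Route the window to its nearest pattern and apply that pattern's model."""
    window = np.asarray(window, dtype=float)
    j = int(assign(window[None, :], centers)[0])
    return float(np.append(window, 1.0) @ coefs[j])

series = np.sin(0.3 * np.arange(60))  # toy univariate series
centers, coefs = fit(series)
forecast = predict_next(series[-4:], centers, coefs)
```

Each component is isolated in its own function, so any of the three could be swapped for another method without touching the rest, which mirrors the flexibility the abstract emphasizes.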
Kim, Suhyeon; Kim, Hangyeol; Lee, Eun-Sol; Lim, Chiehyeon...
16 pages
Abstract: The health index measures a person's overall health status and provides useful information for people to manage their health, so developing a precise and relevant health index is urgent. Many researchers have studied the estimation of biological age (BA), one of the beneficial health indices, by applying machine learning and deep learning techniques to health data. However, most of them have focused on chronological age prediction or basic latent feature extraction methods. In this paper, we present a new algorithm to estimate BA, called Risk Score-Embedded Autoencoder-based BA (RSAE-BA). RSAE-BA can provide an accurate health index by using deep representation learning with an individual's health risk. We first propose a notion of risk score (RS) calculation to monitor a person's health risk. Then we extract latent features by using an autoencoder that embeds the RS, and use them to generate BA. To evaluate RSAE-BA, we present a new BA validation method using the RS, which is applicable to both unlabeled and labeled data. We compare the results of RSAE-BA with existing methods, and demonstrate the accuracy of RSAE-BA and its applicability to predicting disease incidence. We believe that RSAE-BA will be a useful alternative method for measuring a person's health. (c) 2021 The Author(s). Published by Elsevier Inc. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
Abstract: Hierarchical classification identifies a sample from the root node to a leaf node along the hierarchical structure of labels. It is often difficult to perform leaf-node prediction owing to ambiguous or incomplete information. In such scenarios, a multi-granularity decision needs to be designed to stop the sample at a coarse-grained node rather than sending it rashly to a wrong leaf node. Conventionally, the probabilities of assigning the sample to its child nodes are regarded as evidence for deciding whether to let the sample stop or go. However, existing methods overlook the fact that the output of an unreliable classifier cannot appropriately form the basis of decision making, especially in the presence of data uncertainties. Therefore, we model the multi-granularity decision problem from an uncertainty perspective and consider that the uncertainties of a multi-granularity decision emanate from the prediction of the classifier and the intrinsic information of the data. Inspired by the theory of fuzzy rough sets, we propose a new measure to describe the intrinsic uncertainty in data. Integrating this measure with the prediction uncertainty of the classifier, we use the hierarchical structure to design an effective optimization method that ensures proper multi-granularity decisions. Experiments show that the proposed algorithm achieves state-of-the-art performance. (c) 2021 Elsevier Inc. All rights reserved.
Abstract: Feature selection is a significant preprocessing technique that discards redundant and irrelevant features so as to reduce the data dimensionality and build a better understanding of the data. Pruning superfluous features tends to yield models that generalize better while greatly improving computational efficiency. In practice, obtaining the labels of data is time-consuming and labor-intensive, which poses great challenges for feature selection. In this paper, we present a novel robust unsupervised feature selection method for dimensionality reduction of unlabeled data, which obtains feature importance information by predicting cluster labels of the data with matrix factorization. The orthogonal constraints on the two decomposing matrices facilitate more accurate cluster labels, so as to select features with high discriminative power. Then, the local preserving term is integrated into the projected data for selecting features with local retention ability. Independent of that, an alternative iterative algorithm is incorporated into the optimization of the l_{2,1}-norm objective function for efficient and robust feature selection. Extensive comparative experiments are carried out on six benchmark datasets to evaluate the performance of the proposed method. The results show that our method outperforms several well-known unsupervised feature selection methods in terms of both clustering accuracy and normalized mutual information. (c) 2021 Elsevier Inc. All rights reserved.
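For readers unfamiliar with the l2,1-norm mentioned in this abstract: for a matrix W it is the sum of the Euclidean norms of its rows. The sketch below computes it and ranks features by row norm. The ranking rule is a common convention in l2,1-regularized feature selection and is an assumption here, since the abstract does not state how the paper scores features. The matrix `W` is hypothetical data with one row per feature.

```python
import numpy as np

def l21_norm(W):
    """Sum of the Euclidean norms of the rows of W."""
    return float(np.linalg.norm(W, axis=1).sum())

def rank_features(W):
    """Feature indices sorted by decreasing row norm (assumed scoring rule)."""
    return np.argsort(-np.linalg.norm(W, axis=1)).tolist()

# Three features (rows), two latent dimensions (columns).
W = np.array([[3.0, 4.0],
              [0.0, 0.0],
              [1.0, 0.0]])

total = l21_norm(W)        # row norms are 5, 0 and 1
order = rank_features(W)   # strongest feature first
```

Penalizing this norm pushes entire rows toward zero, which is why it is popular for selecting features rather than individual coefficients.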
Zhang, Hua-Peng; Ouyang, Yao; Wang, Zhudeng; Baets, Bernard De...
12 pages
Abstract: We characterize the class of idempotent nullnorms on a bounded lattice in terms of particular common solutions to two equations related to the underlying meet and join operations. When this common solution is unique, it is an idempotent nullnorm if and only if it is increasing on a particular set. As an application of this characterization, we present several construction methods for idempotent nullnorms on a bounded lattice. These construction methods unify and generalize several known ones in the literature. (c) 2021 Elsevier Inc. All rights reserved.
Abstract: Despite the multiple benefits of the Fisher information matrix, it is generally disregarded and replaced by the identity matrix or an approximation. However, when dealing with complicated real-world applications, ignoring the correlation between data features may compromise the modeling capability. To address this problem, we present the exact calculation of the Fisher information matrix (EFIM) for the generalized Dirichlet multinomial (GDM) mixture, which has proven its efficiency in modeling count data. We present a parametrization of the GDM mixture model that allows the elements of the Fisher matrix to be determined by means of the Beta-binomial probability function. We also propose a novel count data modeling approach that benefits from the EFIM. In particular, we tackle the problem of mixture model estimation and selection using the Fisher scoring algorithm and minimum message length within the Deterministic Annealing Expectation Maximization learning framework. Experiments on detecting depression in tweets, dialogue-based emotion recognition, and image-based sentiment analysis confirm the capability of the proposed approach and the merits of using the EFIM as compared with existing state-of-the-art methods and techniques that ignore the full determination of the elements of the Fisher information matrix. (c) 2021 Elsevier Inc. All rights reserved.
Abstract: In this paper we deal with conditional aggregation-based survival functions recently introduced by Boczek et al. (2020). The concept is worth studying because of its possible implementation in real-life situations as well as in mathematical theory. The aim of this paper is to compare this new notion with the standard survival function. We state sufficient and necessary conditions under which the generalized and the standard survival functions are equal. The main result is the characterization of the family of conditional aggregation operators (on a discrete space) for which these functions coincide. (c) 2021 Elsevier Inc. All rights reserved.
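As background for the comparison above, the standard survival function of a nonnegative function with respect to a monotone measure is usually written as below. The symbols X, f, \mu and S are notation assumed here, not taken from the abstract, and the generalized (conditional aggregation-based) notion is not reproduced because the abstract does not define it.

```latex
% Standard survival function of f : X \to [0, \infty) with respect to
% a monotone measure \mu on X (notation assumed for illustration).
\[
  S_{\mu,f}(\alpha) \;=\; \mu\bigl(\{\, x \in X : f(x) > \alpha \,\}\bigr),
  \qquad \alpha \ge 0 .
\]
% Worked example on a discrete space: X = \{1, 2, 3\},
% f(1) = 1,\ f(2) = 3,\ f(3) = 2, and \mu the counting measure. Then
\[
  S_{\mu,f}(1.5) \;=\; \mu(\{2, 3\}) \;=\; 2 ,
  \qquad
  S_{\mu,f}(3) \;=\; \mu(\emptyset) \;=\; 0 .
\]
```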