Abstract: Outcomes with a natural order commonly occur in prediction problems, and often the available input data are a mixture of complex data, such as images, and tabular predictors. Deep Learning (DL) models are state-of-the-art for image classification tasks but frequently treat ordinal outcomes as unordered and lack interpretability. In contrast, classical ordinal regression models consider the outcome's order and yield interpretable predictor effects but are limited to tabular data. We present ordinal neural network transformation models (ONTRAMs), which unite DL with classical ordinal regression approaches. ONTRAMs are a special case of transformation models and trade off flexibility and interpretability by additively decomposing the transformation function into terms for image and tabular data using jointly trained neural networks. The performance of the most flexible ONTRAM is by definition equivalent to a standard multiclass DL model trained with cross-entropy while being faster in training when facing ordinal outcomes. Lastly, we discuss how to interpret model components for both tabular and image data on two publicly available datasets. (C) 2021 Published by Elsevier Ltd.
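The additive decomposition mentioned in this abstract can be illustrated with a minimal sketch. It assumes a logistic link and the parameterization P(Y <= k) = F(theta_k - x·beta - eta(image)), which is one common convention for ordinal transformation models and may differ from the paper's exact formulation. The cutpoints, coefficients, and the scalar `eta_image` standing in for an image network's output are hypothetical values.

```python
import math


def sigmoid(z):
    """Standard logistic CDF, used here as the assumed link function F."""
    return 1.0 / (1.0 + math.exp(-z))


def ordinal_class_probs(cutpoints, shift):
    """Class probabilities from increasing cutpoints and an additive shift.

    Assumed model: P(Y <= k) = sigmoid(theta_k - shift); the last class
    has cumulative probability 1.
    """
    cdf = [sigmoid(theta - shift) for theta in cutpoints] + [1.0]
    return [cdf[0]] + [cdf[k] - cdf[k - 1] for k in range(1, len(cdf))]


def nll(cutpoints, shift, label):
    """Negative log-likelihood of one observed class index."""
    return -math.log(ordinal_class_probs(cutpoints, shift)[label])


# Hypothetical parameters: three cutpoints give four ordered classes.
theta = [-1.0, 0.5, 2.0]
beta = [0.8, -0.3]   # tabular effects (interpretable as log odds ratios)
x = [1.0, 2.0]       # tabular predictors for one observation
eta_image = 0.4      # stand-in for the image network's scalar output

# Additive decomposition: tabular term plus image term.
shift = sum(b * xi for b, xi in zip(beta, x)) + eta_image
probs = ordinal_class_probs(theta, shift)
print(probs, nll(theta, shift, label=1))
```

Because the image and tabular terms enter as a sum, each coefficient in `beta` keeps its usual interpretation from classical ordinal regression in this sketch, while a larger shift moves probability mass toward higher classes.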
Abstract: Micro-expression recognition has become challenging, as it is extremely difficult to extract the subtle facial changes of micro-expressions. Recently, several approaches have proposed various expression-shared feature algorithms for micro-expression recognition. However, these approaches do not reveal the specific discriminative characteristics, which leads to sub-optimal performance. This paper proposes a novel Feature Refinement (FeatRef) with expression-specific feature learning and fusion for micro-expression recognition that aims to obtain salient and discriminative features for specific expressions and predicts expressions by fusing expression-specific features. FeatRef consists of an expression proposal module with an attention mechanism and a classification branch. First, an inception module is designed based on optical flow to obtain expression-shared features. Second, to extract salient and discriminative features for specific expressions, expression-shared features are fed into an expression proposal module with attention factors and proposal loss. Last, in the classification branch, category labels are predicted via a fusion of expression-specific features. Experiments on three publicly available databases validate the effectiveness of FeatRef under different protocols. The results on public benchmarks demonstrate that FeatRef provides salient and discriminative information for micro-expression recognition. The results also show that FeatRef achieves performance better than or competitive with existing state-of-the-art methods on micro-expression recognition. (c) 2021 Elsevier Ltd. All rights reserved.
Abstract: Foreground-background segmentation (FBS) is one of the prime tasks for automated video-based applications like traffic analysis and surveillance. Practical scenarios such as weather-degraded videos, irregularly moving objects, and dynamic backgrounds make FBS a challenging task. The existing FBS algorithms mainly suffer from one of three limitations: (1) a complicated training process, (2) reliance on additional modules trained for other applications, or (3) neglect of the inter-frame spatio-temporal structural dependencies. In this paper, a novel multi-frame-based adversarial learning network is proposed with multi-scale inception and residual modules for FBS. As FBS is a temporal enlightenment-based problem, a temporal encoding mechanism with decreasing variable intervals is proposed for the input frame selection. The proposed network comprises multi-scale inception and residual connection-based dense modules to learn prominent features of the foreground object(s). Also, feedback of the estimated foreground map of the previous frame is utilized to achieve more temporal consistency. The network is trained in different ways, namely cross-data, disjoint, and global training-testing for FBS. Qualitative and quantitative experimental analysis on three benchmark datasets for FBS demonstrates the significance of the proposed approach compared with state-of-the-art FBS approaches. (c) 2021 Elsevier Ltd. All rights reserved.
Abstract: Measures of distance, or of how data points are positioned relative to each other, are fundamental in pattern recognition. The concept of depth measures how deep an arbitrary point is positioned in a dataset, and is an interesting concept in this regard. However, while this concept has received a lot of attention in the statistical literature, its application within pattern recognition is still limited. To increase the applicability of the depth concept in pattern recognition, we address the well-known computational challenges associated with the depth concept by suggesting to estimate depth using incremental quantile estimators. The suggested algorithm can not only estimate depth when the dataset is known in advance, but can also track depth for dynamically varying data streams by using recursive updates. The tracking ability of the algorithm was demonstrated based on a real-life application associated with detecting changes in human activity from real-time accelerometer observations. Given the flexibility of the suggested approach, it can detect virtually any kind of change in the distributional patterns of the observations, and thus outperforms detection approaches based on the Mahalanobis distance. (c) 2021 The Authors. Published by Elsevier Ltd. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
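The idea of tracking a quantile with recursive updates can be sketched as follows. This is a generic stochastic-approximation quantile update chosen for illustration, not necessarily the estimator used in the article, and it covers only the quantile-tracking building block rather than the full depth computation. The function names, step size, and synthetic stream are assumptions.

```python
import random


def update_quantile(estimate, observation, tau, step):
    """One recursive update of a running tau-quantile estimate.

    The estimate moves up by step * tau when the observation lies above it
    and down by step * (1 - tau) otherwise, so it settles near the point
    where a fraction tau of observations fall below it.
    """
    indicator = 1.0 if observation <= estimate else 0.0
    return estimate + step * (tau - indicator)


def track_quantile(stream, tau=0.5, step=0.01, start=0.0):
    """Return the sequence of estimates after each observation."""
    estimate = start
    history = []
    for observation in stream:
        estimate = update_quantile(estimate, observation, tau, step)
        history.append(estimate)
    return history


# Synthetic stream whose distribution changes halfway through.
rng = random.Random(0)
before = [rng.uniform(0.0, 1.0) for _ in range(20000)]
after = [rng.uniform(5.0, 6.0) for _ in range(20000)]

history = track_quantile(before + after)
median_before = history[len(before) - 1]  # estimate just before the change
median_after = history[-1]                # estimate after adapting
print(median_before, median_after)
```

A constant step size is what lets the estimate follow the distributional change: each update costs constant time and memory, at the price of a small residual fluctuation around the true quantile.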
Cina, Antonio Emanuele; Torcinovich, Alessandro; Pelillo, Marcello
11 pages
Abstract: Clustering algorithms play a fundamental role as tools in decision-making and sensible automation processes. Due to the widespread use of these applications, a robustness analysis of this family of algorithms against adversarial noise has become imperative. To the best of our knowledge, however, only a few works have currently addressed this problem. In an attempt to fill this gap, in this work, we propose a black-box adversarial attack for crafting adversarial samples to test the robustness of clustering algorithms. We formulate the problem as a constrained minimization program, general in its structure and customizable by the attacker according to her capability constraints. We do not assume any information about the internal structure of the victim clustering algorithm, and we allow the attacker to query it as a service only. In the absence of any derivative information, we perform the optimization with a custom approach inspired by the Abstract Genetic Algorithm (AGA). In the experimental part, we demonstrate the sensitivity of different single and ensemble clustering algorithms to our crafted adversarial samples in different scenarios. Furthermore, we perform a comparison of our algorithm with a state-of-the-art approach, showing that we are able to reach or even outperform its performance. Finally, to highlight the general nature of the generated noise, we show that our attacks are transferable even against supervised algorithms such as SVMs, random forests and neural networks. (c) 2021 Elsevier Ltd. All rights reserved.
Abstract: Fine-grained action recognition involves comparing similar actions of variable length that consist of subtle interactions between humans and specific objects. Hence, we propose a dynamic kernel-based approach to handle the variable-length patterns for effective recognition of fine-grained actions. Initially, we extract local spatio-temporal features for each video to capture appearance and motion information effectively. An action-independent Gaussian mixture model (AIGMM) is trained on the extracted features of all fine-grained actions to analyze spatio-temporal information and preserve the local similarities among fine-grained actions. Then, the statistics of the AIGMM, namely mean, covariance, and posteriors, are used to build the kernels for finding the similarity between any two fine-grained actions by mapping the statistics to kernel feature space. We demonstrate the effectiveness of the proposed approach using three dynamic kernels, i.e., the GMM mean interval kernel, the supervector kernel, and the intermediate matching kernel, on four varieties of fine-grained action datasets, namely MERL, JIGSAWS, KSCGR, and MPII cooking2. (c) 2021 Elsevier Ltd. All rights reserved.
Abstract: Optical flow, which expresses pixel displacement, is widely used in many computer vision tasks to provide pixel-level motion information. However, with the remarkable progress of convolutional neural networks, recent state-of-the-art approaches are proposed to solve problems directly at the feature level. Since the displacement of a feature vector is not consistent with the pixel displacement, a common approach is to forward optical flow to a neural network and fine-tune this network on the task dataset. With this method, the fine-tuned network is expected to produce tensors encoding feature-level motion information. In this paper, we rethink this de facto paradigm and analyze its drawbacks in the video object detection task. To mitigate these issues, we propose a novel network (IFF-Net) with an In-network Feature Flow estimation module (IFF module) for video object detection. Without resorting to pre-training on any additional dataset, our IFF module is able to directly produce feature flow, which indicates the feature displacement. Our IFF module consists of a shallow module, which shares the features with the detection branches. This compact design enables our IFF-Net to accurately detect objects while maintaining a fast inference speed. Furthermore, we propose a transformation residual loss (TRL) based on self-supervision, which further improves the performance of our IFF-Net. Our IFF-Net outperforms existing methods and achieves new state-of-the-art performance on ImageNet VID. (c) 2021 Elsevier Ltd. All rights reserved.
Abstract: In this paper, a novel non-parametric clustering algorithm based on the concept of divide-and-merge is proposed. After data cleaning, the proposed algorithm proceeds in two primary phases: (i) the Division phase and (ii) the Merging phase. In the initial phase of division, the data is divided into an optimized number of small sub-clusters utilizing all the dimensions of the data. In the second phase of merging, the small sub-clusters obtained as a result of division are merged according to an advanced statistical metric to form the actual clusters in the data. The proposed algorithm has the following merits: (i) the ability to discover both convex and non-convex shaped clusters, (ii) the ability to discover clusters of different densities, (iii) the ability to detect and remove outliers/noise in the data, (iv) easily tunable or fixed hyperparameters, and (v) usability for high-dimensional data. The proposed algorithm is extensively tested on 20 benchmark datasets, including both synthetic and real datasets, and is found to be better than or competitive with the existing state-of-the-art parametric and non-parametric clustering algorithms. (c) 2021 The Authors. Published by Elsevier Ltd. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
Abstract: As one of the important dimensionality reduction techniques, unsupervised feature selection (UFS) has enjoyed considerable popularity over the last few decades, as it can not only improve learning performance but also enhance interpretability and reduce computational costs. The existing UFS methods often model the data in the original feature space, which cannot fully exploit the discriminative information. In this paper, to address this issue, we investigate how to strengthen the relationship between UFS and the feature subspace, so as to select relevant features more straightforwardly and effectively. Methodologically, a novel UFS approach, referred to as Graph Regularized Local Linear Embedding (GLLE), is proposed by integrating local linear embedding (LLE) and manifold regularization constrained in the feature subspace into a unified framework. To be more specific, we explicitly define a feature selection matrix composed of 0 and 1, which can realize the process of UFS. For the purpose of modelling the feature selection matrix, we propose to preserve the local linear reconstruction relationship among neighboring data points in the feature subspace, which corresponds to LLE constrained in the feature subspace. To make the feature selection matrix more accurate, we propose to use manifold regularization as an assistant of LLE to find the relevant and representative features, such that the selected features make each sample in the feature subspace accord with the manifold assumption. A tailored iterative algorithm based on the Alternating Direction Method of Multipliers (ADMM) is designed to solve the proposed optimization problem. Extensive experiments on twelve real-world benchmark datasets are conducted, and more promising results are achieved compared with the state-of-the-art approaches. (c) 2021 Elsevier Ltd. All rights reserved.
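How a feature selection matrix composed of 0 and 1 realizes feature selection can be shown with a small sketch. It assumes data stored as samples × features and a selection matrix with exactly one 1 per column; the abstract does not spell out these conventions, and the helper names and toy data are hypothetical. Learning the matrix (the LLE, manifold regularization, and ADMM parts) is not covered here.

```python
def selection_matrix(n_features, selected):
    """Build a 0/1 matrix W of shape (n_features, len(selected)).

    Column j has a single 1 in the row of the j-th selected feature,
    so multiplying data by W keeps exactly those features.
    """
    return [
        [1 if row == feature else 0 for feature in selected]
        for row in range(n_features)
    ]


def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [
        [sum(a_ik * b[k][j] for k, a_ik in enumerate(a_row))
         for j in range(len(b[0]))]
        for a_row in a
    ]


# Toy data: two samples with four features each.
X = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
]

W = selection_matrix(n_features=4, selected=[0, 2])
X_selected = matmul(X, W)  # keeps feature 0 and feature 2 of every sample
print(W)
print(X_selected)
```

Writing selection as a matrix product is what allows it to sit inside a single objective alongside reconstruction and regularization terms, at the cost of a discrete constraint on W.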
Abstract: Hashing-based methods have gained great success in cross-modal similarity search, due to their fast query speed and low storage cost. However, there are some challenging problems that need to be further solved: 1) Many approaches are sensitive to noise and outliers, because when the ℓ2 norm is utilized in the objective function, the error may be amplified. 2) Most existing methods adopt a relaxation or rounding scheme to generate binary codes, causing a large quantization loss. 3) Many supervised cross-media algorithms use a large n × n matrix to preserve the similarity relationship, leading to heavy computation and making them unscalable. To mitigate these challenges, we develop a novel cross-media search algorithm, i.e., robust and discrete matrix factorization hashing, dubbed RDMH. The method takes a two-step strategy. In the first phase, the ℓ2,1 norm is utilized to improve the robustness, which makes our model insensitive to noise and outliers. We can learn the hash codes directly by the proposed discrete optimization method instead of a relaxation scheme, avoiding the large quantization loss. Moreover, RDMH correlates the hash codes and semantic labels directly instead of manipulating the large similarity matrix. In the second phase, we propose an autoencoder strategy to learn the hash functions, so that more valuable information can be preserved and the hash functions become more powerful. Comprehensive experiments on several databases demonstrate the superior performance and efficacy of the developed RDMH. (c) 2021 Elsevier Ltd. All rights reserved.
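The robustness argument for the ℓ2,1 norm can be made concrete with a small numerical sketch. It assumes the common row-wise definition, the sum over rows of each row's Euclidean norm (some papers sum over columns instead), and compares it with a squared ℓ2 (Frobenius) loss on a hypothetical residual matrix containing one outlier row.

```python
import math


def l21_norm(matrix):
    """Sum of the Euclidean norms of the rows (row-wise l2,1 convention)."""
    return sum(math.sqrt(sum(v * v for v in row)) for row in matrix)


def squared_frobenius(matrix):
    """Sum of squared entries, i.e. the usual squared l2 loss."""
    return sum(v * v for row in matrix for v in row)


# Hypothetical residuals: one row per sample.
clean = [
    [3.0, 4.0],
    [0.0, 0.0],
]
with_outlier = clean + [[30.0, 40.0]]  # one sample with a 10x larger error

print(l21_norm(clean), squared_frobenius(clean))
print(l21_norm(with_outlier), squared_frobenius(with_outlier))
```

In this toy case the outlier row's contribution grows linearly with its magnitude under the ℓ2,1 norm but quadratically under the squared ℓ2 loss, which is the amplification effect the abstract refers to.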