Abstract: How to integrate heterogeneous features for better recognition performance is increasingly critical for automatic target recognition. Existing integration methods have the following drawbacks: (1) most feature integration methods ignore the information, both common and discriminative knowledge, among different types of features; (2) most decision integration methods ignore the fact that different knowledge contributes differently; (3) the feature weights of an integration model learned in the source domain do not perform well in the target domain. To tackle these problems, we propose a deep Knowledge Integration framework that combines heterogeneous features for Domain Adaptive synthetic aperture radar (SAR) target recognition (KIDA). In the training phase, we implement deep knowledge integration at both the feature and decision levels. At the feature level, to exploit the common and discriminative knowledge, multiple heterogeneous features are projected from the feature space into a unified label space by exploring the shared and specific structures simultaneously. The shared structure integrates the common information in different features, while the specific structure preserves the discriminative information of each type of feature. At the decision level, to reveal the relative importance of different knowledge, a decision integration strategy with feature weights is adopted in the label space. In the online testing phase, to improve the generalization of the model in dynamic environments, we employ online learning with sequential target domain knowledge to update the feature weights, thus achieving domain adaptation. Extensive experiments on different datasets validate the effectiveness and advantages of the proposed KIDA, especially in noisy environments. (c) 2022 Elsevier Ltd. All rights reserved.
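The decision-level step and the online weight update described above can be illustrated with a small sketch. This is not the KIDA algorithm: the abstract does not give the update rule, so the sketch swaps in a standard multiplicative-weights (Hedge) update and assumes a label becomes available for each target-domain sample. The function names, the learning rate `eta`, and the toy scores are all hypothetical.

```python
import math


def fuse(scores, weights):
    """Decision-level fusion: weighted sum of per-feature class-score vectors."""
    n_classes = len(scores[0])
    return [sum(w * s[c] for w, s in zip(weights, scores)) for c in range(n_classes)]


def argmax(values):
    """Index of the largest entry."""
    return max(range(len(values)), key=lambda i: values[i])


def hedge_update(weights, scores, true_label, eta=0.5):
    """One Hedge step (an assumed stand-in for the paper's online learner).

    Feature types whose own prediction was wrong are down-weighted by
    exp(-eta); the weights are then renormalised to sum to one.
    """
    updated = [
        w * math.exp(-eta * (0.0 if argmax(s) == true_label else 1.0))
        for w, s in zip(weights, scores)
    ]
    total = sum(updated)
    return [w / total for w in updated]


# Two feature types, two classes (toy numbers, not from the paper).
weights = [0.5, 0.5]
scores = [[0.7, 0.3], [0.2, 0.8]]
fused = fuse(scores, weights)
new_weights = hedge_update(weights, scores, true_label=1)
```

In this toy run the second feature type predicts the assumed true class and the first does not, so the update shifts weight towards the second feature type while keeping the weights normalised.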
Abstract: Efficient neural networks have received ever-increasing attention with the evolution of convolutional neural networks (CNNs), especially with regard to their deployment on embedded and mobile platforms. One of the biggest obstacles to obtaining such efficient neural networks is computational efficiency: even recent differentiable neural architecture search (DNAS) requires sampling a small number of candidate neural architectures to select the optimal one. To address this computational efficiency issue, we introduce a novel architecture parameterization based on the scaled sigmoid function, and propose a general Differentiable Neural Architecture Learning (DNAL) method to obtain efficient neural networks without the need to evaluate candidate neural networks. Specifically, for stochastic supernets as well as conventional CNNs, we build a new channel-wise module layer whose architecture components are controlled by a scaled sigmoid function. We train these neural network models from scratch. The network optimization is decoupled into weight optimization and architecture optimization, which avoids the interaction between the two types of parameters and alleviates the vanishing gradient problem. We address the non-convex optimization problem of efficient neural networks with the continuous scaled sigmoid method instead of the common softmax method. Extensive experiments demonstrate that our DNAL method delivers superior performance in terms of efficiency, and adapts to conventional CNNs (e.g., VGG16 and ResNet50), lightweight CNNs (e.g., MobileNetV2) and stochastic supernets (e.g., ProxylessNAS). The optimal neural networks learned by DNAL surpass those produced by state-of-the-art methods on the benchmark CIFAR-10 and ImageNet-1K datasets in accuracy, model size and computational complexity. Our source code is available at https://github.com/QingbeiGuo/DNAL.git .
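To make the idea of a scaled sigmoid gate concrete, here is a minimal sketch. It assumes the gate has the form sigmoid(delta * s), where `s` is a per-channel architecture parameter and `delta` is a scale factor; the abstract does not spell out the exact form or schedule, so the names `scaled_sigmoid`, `gate_channels`, `arch_params` and `delta` are hypothetical. The point shown is only that a larger scale pushes the continuous gate towards a binary keep-or-drop decision.

```python
import math


def scaled_sigmoid(s, delta):
    """sigmoid(delta * s), evaluated in a numerically stable way."""
    z = delta * s
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def gate_channels(channel_outputs, arch_params, delta):
    """Multiply each channel's output by its scaled-sigmoid gate."""
    return [scaled_sigmoid(s, delta) * x for x, s in zip(channel_outputs, arch_params)]


# Hypothetical architecture parameters for three channels.
arch_params = [0.8, -0.3, 0.05]
for delta in (1.0, 10.0, 1000.0):
    gates = [round(scaled_sigmoid(s, delta), 3) for s in arch_params]
    print(delta, gates)
```

With a small scale the gates stay close to 0.5 and remain easy to optimize by gradient descent; with a large scale, channels with positive parameters are effectively kept and those with negative parameters are effectively removed.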
Abstract: With the rapid development of multimedia technologies (e.g. deep learning), Feature Selection (FS) is now playing a critical role in acquiring discriminative features from massive data. Traditional FS methods score feature importance and select the top-ranked features by treating all instances equally; hence, valuable instances like directional outliers (DOs), which are specific outliers closer to other class centres than to their own, seldom receive particular attention during feature selection. Based on our observation, DOs derive from "misclassified instances", which lead to misclassification. In this paper, we present a novel supervised feature selection method entitled Feature Selection via Directional Outliers Correcting (FSDOC), for accurate data classification. The proposed FSDOC includes an optimization algorithm to capture DOs, and two correcting algorithms to reasonably capture redundant features by correcting DOs with intraclass deviation minimization and interclass relative distance maximization. We give theoretical guarantees and adequate analysis of all algorithms to show the effectiveness of FSDOC. Extensive experiments on fifteen public datasets, and two case studies of deep feature selection and very high-dimensional Fisher Vector selection, demonstrate the superior performance of FSDOC.
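The definition of a directional outlier given above (an instance closer to another class centre than to its own) can be checked directly. The sketch below is only that definition applied with class means and Euclidean distance, which are assumptions; it is not the optimization algorithm FSDOC uses to capture DOs, and the function names and toy data are hypothetical.

```python
import math
from collections import defaultdict


def class_centres(X, y):
    """Mean vector of each class."""
    groups = defaultdict(list)
    for x, label in zip(X, y):
        groups[label].append(x)
    return {
        label: [sum(col) / len(rows) for col in zip(*rows)]
        for label, rows in groups.items()
    }


def directional_outliers(X, y):
    """Indices of instances that lie closer to another class centre than to their own."""
    centres = class_centres(X, y)
    flagged = []
    for i, (x, label) in enumerate(zip(X, y)):
        own = math.dist(x, centres[label])
        if any(math.dist(x, c) < own for other, c in centres.items() if other != label):
            flagged.append(i)
    return flagged


# Toy data: the third point carries label 0 but sits next to class 1.
X = [[0.0, 0.0], [0.2, 0.1], [4.0, 4.0], [5.0, 5.0], [5.2, 4.9], [4.8, 5.1]]
y = [0, 0, 0, 1, 1, 1]
print(directional_outliers(X, y))
```

Note that the flagged instance also drags its own class centre towards the other class, which is one intuition for why such instances deserve separate treatment during feature scoring.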
Abstract: Object detection methods are drawing increasing attention in deep learning based visual tracking algorithms due to their robust discrimination and powerful regression ability. To further explore the potential of object detection methods in the visual tracking task, two gaps need to be bridged. The first is the difference in object definition. Object detection is class-specific while visual tracking is class-agnostic. Moreover, visual tracking needs to differentiate the target from intra-class distractors. The second is the difference in the temporal dimension. Unlike object detection, which processes still images, visual tracking concentrates on objects that vary continuously with time. In this paper, we propose a Detection to Tracking (D2T) framework to address the above issues and effectively transfer existing advanced detection methods to the visual tracking task. Specifically, to bridge the gap in object definition, we propose a general-to-specific network that separates the learning of general object features from that of instance-level features. To make full use of the contextual information while adapting to the appearance variation of targets, we propose a temporal strategy combining a short-term constraint and long-term updating. To the best of our knowledge, our D2T framework is the first universal framework that directly transfers deep learning based object detectors to the visual tracking task. It provides a novel solution to visual object tracking, and it achieves superior performance on several public datasets.
Abstract: Establishing reliable correspondences is a fundamental task in computer vision, and it requires rich contextual information. In this paper, we propose a Channel-Spatial Difference Augment Network (CSDA-Net), which selectively aggregates information from the spatial and channel aspects to seek reliable correspondences for feature matching. Specifically, we first introduce the spatial and channel attention mechanism to construct a simple yet effective block for discriminatively extracting the global context. After that, we design an Overlay Attention block by further exploiting the spatial and channel attention mechanism with different squeeze operations, to gather more comprehensive contextual information. Finally, the proposed CSDA-Net is able to obtain feature maps with a strong representative ability for feature matching due to the integration of the two novel blocks. Extensive experiments on outlier rejection and relative pose estimation show that our CSDA-Net outperforms current state-of-the-art methods on both outdoor and indoor datasets.
Abstract: The goal of molecule optimization is to optimize molecular properties by modifying molecule structures. Conditional generative models provide a promising way to transfer the input molecules to ones with better properties. However, molecular properties are highly sensitive to small changes in molecular structures. This leads to an interesting thought that we can improve the properties of molecules with limited modification in structure. In this paper, we propose a structure-aware conditional Variational Auto-Encoder, namely SCVAE, which exploits the topology of molecules as a structure condition and optimizes the molecular properties with constrained structural modification. SCVAE leverages graph alignment of two-level molecule structures in an unsupervised manner to bind the structure conditions between two molecules. Then, this structure condition facilitates molecule optimization with limited structural modification, namely, constrained molecule optimization, under a novel variational auto-encoder framework. Extensive experimental evaluations demonstrate that structure-aware CVAE generates new molecules with high similarity to the original ones and better molecular properties.
Abstract: Recent years have witnessed a resurgence of neural networks. Many functional layers are stacked hierarchically to learn high-level representations. Yet large collections of radar images with label information are scarce, so the fitting power of deep architectures is limited. Additionally, the coherent imaging mechanism inevitably produces speckle, which has the statistical characteristics of multiplicative noise and hence makes image interpretation difficult. To solve these problems, this paper presents a new hierarchical receptive neural network. A signal-wise receptive module is first built from a family of carefully designed convolutional filters, with which empirical features and knowledge are encoded. The receptive features are further refined in a patch-wise receptive unit, where several convolutional blocks are configured sequentially. The refined representations are finally used to make the inference. Multiple comparative studies are performed to demonstrate the advantage of the proposed strategy.
Abstract: An effective approach to face recognition is proposed in this paper, which formulates the problem as an enhanced nuclear-norm-based matrix regression model and explores the low-rank property of the reconstructed image. Previous works have already leveraged the nuclear norm to obtain a low-rank representation of the error image and achieve a promising recognition rate. Motivated by the low-rank property of the reconstructed image, established through theoretical observation, our model imposes nuclear norm constraints not only on the representation residual but also on the reconstructed image. The proposed method preserves the 2D structural information of the error images and reconstructed images, which is significant for face recognition tasks. To further improve the performance of the proposed model, we explore the impact of different regularization terms under various scenarios. Extensive experiments on several benchmark datasets show the efficacy of the proposed model, especially in terms of robustness against contiguous occlusion and illumination changes, where it achieves superior performance over the most competitive methods.
Abstract: In this paper, we propose a new topic, Human-Centric Captioning, which focuses on describing the human behavior in an image. Human activities and relationships are the primary objectives of visual understanding in daily applications. However, existing image captioning systems do not treat humans differently from other objects, which limits their ability to understand and describe diverse human activities. As the first to explore this new task, we build a novel Human-Centric COCO dataset concentrating on humans. Accordingly, we propose a novel Human-Centric Captioning Model (HCCM) that focuses on human-centric feature hierarchization and sentence generation. Specifically, our model first utilizes human body part level knowledge to hierarchize the image features and then applies a novel three-branch captioning model to process these hierarchical features independently to calibrate the descriptions of human actions. Comprehensive experiments demonstrate that our HCCM achieves state-of-the-art performance, with BLEU-4, CIDEr and SPICE scores of 41.5, 127.3 and 23.5, respectively. Dataset and code are publicly available at https://github.com/JohnDreamer/HCCM/.
Abstract: Multi-instance learning (MIL) is able to cope with weakly supervised problems where the training data is represented by labeled bags consisting of multiple unlabeled instances. Due to its practical significance, MIL has recently drawn increasing attention. Introducing bag representations is an attractive way to learn from MIL data. However, it is difficult for existing MIL methods to utilize both implicit and explicit bag representations simultaneously. In this paper, we propose a bag dissimilarity regularized (BDR) framework that incorporates multiple bag representations regardless of explicitness or implicitness. Here, the implicit bag representations are incorporated into a regularization term that contains the intrinsic geometric information provided by the bag dissimilarities. The regularization term can be added to the objective function of supervised classifiers. An effective method for explicit bag embedding is also proposed, which exploits the Fisher score derived from factor analysis. Finally, we propose two specific BDR methods based on the support vector machine and the broad learning system. The proposed BDR methods are evaluated on 14 datasets, and achieve competitive results at limited computational cost. We also discuss the effectiveness and the characteristics of the BDR framework.
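As an illustration of what a bag dissimilarity can look like, the sketch below computes a symmetrised mean-minimum Euclidean distance between bags and collects the pairwise values into a matrix. The abstract does not state which dissimilarity the BDR framework uses or how the regularization term is built from it, so this particular measure, the function names and the toy bags are assumptions.

```python
import math


def mean_min_distance(bag_a, bag_b):
    """Average, over instances in bag_a, of the distance to the nearest instance in bag_b."""
    return sum(min(math.dist(a, b) for b in bag_b) for a in bag_a) / len(bag_a)


def bag_dissimilarity(bag_a, bag_b):
    """Symmetrised mean-minimum distance (one common choice, assumed here)."""
    return 0.5 * (mean_min_distance(bag_a, bag_b) + mean_min_distance(bag_b, bag_a))


def dissimilarity_matrix(bags):
    """Pairwise bag dissimilarities; bags may contain different numbers of instances."""
    return [[bag_dissimilarity(p, q) for q in bags] for p in bags]


# Three toy bags of 2-D instances: the first two overlap, the third is far away.
bags = [
    [[0.0, 0.0], [1.0, 0.0]],
    [[0.0, 0.1], [1.0, 0.1], [0.5, 0.0]],
    [[5.0, 5.0]],
]
D = dissimilarity_matrix(bags)
```

A matrix of this kind describes the geometry of the bags without requiring an explicit vector for each bag, which is the sense in which such a representation is implicit.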