Abstract: Current face recognition tasks are usually carried out on high-quality face images, but in reality most face images are captured under unconstrained or poor conditions, e.g., by video surveillance. Existing methods either learn data uncertainty to avoid overfitting the noise, or add margins in the angle or cosine space of the normalized softmax loss to penalize the target logit, which enforces intra-class compactness and inter-class discrepancy. In this paper, we propose a deep Rival Penalized Competitive Learning (RPCL) method for deep face recognition in low-resolution (LR) images. Inspired by the idea of RPCL, our method additionally regulates the rival logit, defined as the largest non-target logit for an input image. Unlike existing methods, which only penalize the target logit, our method not only strengthens learning towards the target label but also enforces learning in the reverse direction, i.e., de-learning, away from the rival label. Comprehensive experiments demonstrate that our method makes existing state-of-the-art methods considerably more robust for LR face recognition. (C) 2022 Elsevier Ltd. All rights reserved.
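The rival logit described in the abstract above can be illustrated with a short sketch. The abstract does not give the actual loss, so the function below is only an assumption for illustration: the names `rival_index` and `penalized_cross_entropy`, the additive margins `target_margin` and `rival_margin`, and the `scale` factor are all hypothetical and are not taken from the paper.

```python
import math


def rival_index(logits, target):
    """Return the index of the largest non-target logit (the 'rival')."""
    candidates = [i for i in range(len(logits)) if i != target]
    return max(candidates, key=lambda i: logits[i])


def penalized_cross_entropy(logits, target, target_margin=0.35,
                            rival_margin=0.1, scale=30.0):
    """Softmax cross-entropy with hypothetical margins on target and rival.

    Assumption: the target logit is lowered by `target_margin` (as in
    margin-based softmax losses) and the rival logit is raised by
    `rival_margin`, so the model must push harder away from the rival.
    """
    adjusted = list(logits)
    adjusted[target] -= target_margin
    adjusted[rival_index(logits, target)] += rival_margin
    scaled = [scale * z for z in adjusted]
    # Numerically stable log-sum-exp.
    peak = max(scaled)
    log_sum = peak + math.log(sum(math.exp(z - peak) for z in scaled))
    return log_sum - scaled[target]
```

With both margins set to zero the function reduces to ordinary scaled softmax cross-entropy, which makes it easy to compare the penalized and unpenalized losses on the same logits.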
Abstract: High-quality end-to-end speech translation models rely on large-scale speech-to-text training data, which is usually scarce or even unavailable for some low-resource language pairs. To overcome this, we propose a target-side data augmentation method for low-resource language speech translation. In particular, we first generate large-scale target-side paraphrases with a paraphrase generation model that incorporates several statistical machine translation (SMT) features and the commonly used recurrent neural network (RNN) feature. Then, a filtering model based on semantic similarity and speech-word pair co-occurrence selects the highest-scoring source speech-target paraphrase pairs from the candidates. Experimental results on English, Arabic, German, Latvian, Estonian, Slovenian and Swedish paraphrase generation show that the proposed method achieves significant and consistent improvements over several strong baseline models on PPDB datasets (http://paraphrase.org/). To introduce the paraphrase generation results into low-resource speech translation, we propose two strategies: audio-text pair recombination and multiple-reference training. Experimental results show that speech translation models trained on the new audio-text datasets, which incorporate the paraphrase generation results, achieve substantial improvements over baselines, especially on low-resource languages. (C) 2022 Elsevier Ltd. All rights reserved.
Abstract: Object tracking with Siamese networks has gained popularity for its outstanding performance and considerable potential. However, most existing Siamese architectures face great difficulties in scenes where the target undergoes dramatic shape or environmental changes. In this work, we propose a novel and concise generative adversarial learning method to solve this problem, especially when the target undergoes drastic appearance changes, illumination variations and background clutter. We treat these situations as distractors for tracking and integrate a distractor generator into the traditional Siamese network. This component simulates the distractors, and more robust tracking performance is achieved by eliminating them from the input instance search image. In addition, we use the generalized intersection over union (GIoU) as our training loss. GIoU is a stricter metric for bounding box regression than the traditional IoU, and it can be used as a training loss for more accurate tracking results. Experiments on five challenging benchmarks show favorable, state-of-the-art results against other trackers in different aspects. (C) 2022 Elsevier Ltd. All rights reserved.
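For reference, GIoU can be computed as IoU minus the fraction of the smallest enclosing box that is not covered by the union of the two boxes. The sketch below is a minimal plain-Python version of that standard definition, not the authors' training code; it assumes boxes are given as `(x1, y1, x2, y2)` corner coordinates with positive area, and the function names are chosen here for illustration.

```python
def giou(box_a, box_b):
    """Generalized IoU of two axis-aligned boxes given as (x1, y1, x2, y2).

    Assumes both boxes have positive width and height.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)

    # Intersection rectangle (zero if the boxes do not overlap).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = area_a + area_b - inter

    # Smallest axis-aligned box enclosing both inputs.
    enclose = (max(ax2, bx2) - min(ax1, bx1)) * (max(ay2, by2) - min(ay1, by1))

    iou = inter / union
    return iou - (enclose - union) / enclose


def giou_loss(box_a, box_b):
    """Loss in [0, 2]: zero for identical boxes, larger for worse matches."""
    return 1.0 - giou(box_a, box_b)
```

Unlike plain IoU, which is zero for every pair of non-overlapping boxes, GIoU keeps decreasing as the boxes move further apart, so the loss still distinguishes near misses from far misses.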
Abstract: We consider restricted Boltzmann machines (RBMs) trained over an unstructured dataset made of blurred copies of definite but unavailable "archetypes", and we show that there exists a critical sample size beyond which the RBM can learn the archetypes, namely the machine can successfully act as a generative model or as a classifier, according to the operational routine. In general, assessing a critical sample size (possibly in relation to the quality of the dataset) is still an open problem in machine learning. Here, restricting to the random theory, where shallow networks suffice and the "grandmother-cell" scenario is correct, we leverage the formal equivalence between RBMs and Hopfield networks to obtain a phase diagram for both neural architectures. The diagram highlights the regions, in the space of the control parameters (i.e., number of archetypes, number of neurons, size and quality of the training set), where learning can be accomplished. Our investigations are led by analytical methods based on the statistical mechanics of disordered systems, and the results are further corroborated by extensive Monte Carlo simulations. (C) 2022 Elsevier Ltd. All rights reserved.
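The setting can be made concrete with a toy Hopfield-side experiment: store only blurred copies of an archetype with a Hebbian rule, then check whether the network relaxes to the archetype itself. This is a rough sketch under assumptions made here, not the paper's model or simulation code: the function names, the independent spin-flip noise controlled by `flip_prob`, and the zero-temperature asynchronous dynamics are all illustrative choices.

```python
import random


def blurred_copy(archetype, flip_prob, rng):
    """Return a noisy copy: each +/-1 entry is flipped with prob. flip_prob."""
    return [-s if rng.random() < flip_prob else s for s in archetype]


def hebbian_couplings(examples):
    """Hebbian couplings built from the examples only (no self-coupling)."""
    n = len(examples[0])
    couplings = [[0.0] * n for _ in range(n)]
    for example in examples:
        for i in range(n):
            for j in range(n):
                if i != j:
                    couplings[i][j] += example[i] * example[j]
    return couplings


def relax(couplings, state, sweeps=5):
    """Zero-temperature asynchronous dynamics for a fixed number of sweeps."""
    state = list(state)
    n = len(state)
    for _ in range(sweeps):
        for i in range(n):
            field = sum(couplings[i][j] * state[j] for j in range(n))
            state[i] = 1 if field >= 0 else -1
    return state


def overlap(a, b):
    """Normalized overlap in [-1, 1]; 1 means the patterns coincide."""
    return sum(x * y for x, y in zip(a, b)) / len(a)
```

Sweeping the number of examples and `flip_prob` in such an experiment gives an informal feel for the trade-off between training-set size and quality that the phase diagram quantifies.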
Nebli, Ahmed; Gharsallaoui, Mohammed Amine; Gurler, Zeynep; Rekik, Islem...
12 pages
Abstract: Graph neural networks (GNNs) have witnessed an unprecedented proliferation in tackling several problems in computer vision, computer-aided diagnosis and related fields. While prior studies have focused on boosting model accuracy, quantifying the reproducibility of the most discriminative features identified by GNNs remains an unexplored problem, which raises concerns about their reliability, in clinical applications in particular. Specifically, the reproducibility of biological markers across clinical datasets and distribution shifts across classes (e.g., healthy and disordered brains) is of paramount importance in revealing the underpinning mechanisms of diseases as well as propelling the development of personalized treatment. Motivated by these issues, we propose, for the first time, reproducibility-based GNN selection (RG-Select), a framework for GNN reproducibility assessment via the quantification of the most discriminative features (i.e., biomarkers) shared between different models. To ascertain the soundness of our framework, the reproducibility assessment covers variations in factors such as training strategies and data perturbations. Despite these challenges, our framework yielded replicable conclusions across different training strategies and various clinical datasets. Our findings could thus pave the way for the development of biomarker trustworthiness and reliability assessment methods for computer-aided diagnosis and prognosis tasks. RG-Select code is available on GitHub at https://github.com/basiralab/RG-Select. (C) 2022 Elsevier Ltd. All rights reserved.
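One simple way to picture "discriminative features shared between different models" is a top-k overlap between per-feature importance scores. The sketch below is only an illustration of that general idea and is not the RG-Select procedure or its API; the function names, the use of absolute weight as importance, and the choice of `k` are assumptions made here.

```python
def top_k_features(weights, k):
    """Indices of the k features with the largest absolute weight."""
    ranked = sorted(range(len(weights)),
                    key=lambda i: abs(weights[i]), reverse=True)
    return set(ranked[:k])


def top_k_overlap(weights_a, weights_b, k):
    """Fraction of the top-k features that two models have in common."""
    shared = top_k_features(weights_a, k) & top_k_features(weights_b, k)
    return len(shared) / k
```

A value of 1.0 means both models rank the same k features highest, while a value near 0.0 means their most discriminative features barely coincide.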
Abstract: As a rotary-percussion system, the vibro-impact drilling (VID) system utilises resonantly induced high-frequency periodic impacts alongside existing drill-string rotation to cut through downhole rock layers. Due to the inhomogeneous nature of the rock layers, the system often experiences multi-stability, which generates different categories of impact motions as drilling continues downhole. Some impact motions yield better drilling performance than others in terms of rate of penetration (ROP) and bit life-span. As an optimisation strategy, the present study adopts feature-based classification algorithms, including multi-layer perceptron, support vector machine and long short-term memory network, as intelligent models for categorising impact motions from a one-degree-of-freedom impact oscillator representing the percussive bit-rock impacts of the VID system. In this way, high-performance impacts can be easily detected and maintained while undesirable low-performance impacts are avoided, increasing ROP, improving bit life-span and saving cost. In this study, scarce and limited classes of experimental impact data are merged with inexhaustibly simulated impact data to train different network models. By means of cross-validation, the trained networks were tested on separate sets of simulation-only and experiment-only data. Results show that extracting appropriate features from raw impact data is essential for optimising the performance of each network model. About 42% of the feature-based networks yield accuracies greater than 91%, while about 67% yield accuracies greater than 77%, on both simulation and experimental impact motion data. (C) 2022 Elsevier Ltd. All rights reserved.