Journal information
Neural Networks
Pergamon Press
ISSN: 0893-6080

Neural Networks (Journal Neural Networks). Indexed in: SCI, AHCI, EI, ISTP
Officially published
Years covered

    Learning policy scheduling for text augmentation

    Li S., Ao X., Pan F., He Q., ...
    7 pages
    Abstract: © 2021 Elsevier Ltd. When training deep learning models, data augmentation is an important technique for improving performance and alleviating overfitting. In natural language processing (NLP), existing augmentation methods often use fixed strategies. However, it may be preferable to use different augmentation policies at different stages of training, and different datasets may require different augmentation policies. In this paper, we take dynamic policy scheduling into consideration. We design a search space over augmentation policies by integrating several common augmentation operations. Then, we adopt a population-based training method to search for the best augmentation schedule. We conduct extensive experiments on five text classification tasks and two machine translation tasks. The results show that the optimized dynamic augmentation schedules achieve significant improvements over previous methods.
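
The idea of searching an augmentation schedule with population-based training can be illustrated with a toy sketch. This is not the authors' implementation: `toy_score` is a hypothetical stand-in for validation accuracy, the single `magnitude` hyperparameter replaces the paper's full policy search space, and the exploit/explore constants (bottom quarter copies top quarter, perturbation by 0.8 or 1.2) are assumptions chosen for illustration.

```python
import random


def toy_score(magnitude, epoch):
    # Hypothetical stand-in for validation accuracy: the best
    # augmentation magnitude drifts upward as training progresses.
    target = min(1.0, 0.1 + 0.05 * epoch)
    return -(magnitude - target) ** 2


def pbt_search(pop_size=8, epochs=10, seed=0):
    """Return the best member; its 'schedule' lists one magnitude per epoch."""
    rng = random.Random(seed)
    population = [{"magnitude": rng.random(), "schedule": []}
                  for _ in range(pop_size)]
    quarter = max(1, pop_size // 4)
    for epoch in range(epochs):
        # "Train" one epoch with the current magnitude and evaluate.
        for member in population:
            member["schedule"].append(member["magnitude"])
            member["score"] = toy_score(member["magnitude"], epoch)
        population.sort(key=lambda m: m["score"], reverse=True)
        # Exploit: the worst members copy a top member's history.
        # Explore: they then perturb the copied magnitude.
        for loser in population[-quarter:]:
            winner = rng.choice(population[:quarter])
            loser["schedule"] = list(winner["schedule"])
            factor = rng.choice([0.8, 1.2])
            loser["magnitude"] = min(1.0, max(0.0, winner["magnitude"] * factor))
    # population[0] is the top-scoring member of the final evaluation.
    return population[0]


best = pbt_search()
print(best["schedule"])
```

The returned schedule is the sequence of magnitudes the surviving lineage used at each epoch, which is what makes the resulting policy dynamic rather than fixed.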

    Cross-attention-map-based regularization for adversarial domain adaptation

    Li J., Wang H., Wu K., Liu C., ...
    11 pages
    Abstract: © 2021 Elsevier Ltd. In unsupervised domain adaptation (UDA), much effort has gone into pulling the source domain and the target domain closer through adversarial training. Most methods focus on aligning distributions or features between the source domain and the target domain. However, little attention is paid to interaction at finer-grained levels, such as the classes or samples of the two domains. In contrast to UDA, another transfer learning task, few-shot learning (FSL), takes full advantage of finer-grained alignment. Many FSL methods implement interaction between samples of support sets and query sets, leading to significant improvements. We ask whether we can draw inspiration from these methods and bring such ideas from FSL to UDA. To this end, we first take a closer look at the differences between FSL and UDA and bridge the gap between them by high-confidence sample selection (HCSS). Then we propose a cross-attention map generation module (CAMGM) to let the samples selected by HCSS interact. Moreover, we propose a simple but efficient method called cross-attention-map-based regularization (CAMR) to regularize the feature maps generated by the feature extractor. Experiments on three challenging datasets demonstrate that CAMR brings solid improvements when added to the original objective. More specifically, the proposed CAMR outperforms the original methods by 1% to 2% in most tasks without bells and whistles.

    New effective approach to quasi synchronization of coupled heterogeneous complex networks

    Chen T.
    5 pages
    Abstract: © 2021 Elsevier Ltd. This short paper addresses quasi synchronization of linearly coupled heterogeneous systems. The similarities and differences between the complete synchronization of linearly coupled homogeneous systems and the quasi synchronization of linearly coupled heterogeneous systems are revealed.

    Zenithal isotropic object counting by localization using adversarial training

    Rodriguez-Vazquez J., Alvarez-Fernandez A., Molina M., Campoy P., ...
    9 pages
    Abstract: © 2021 Elsevier Ltd. Counting objects in images is a very time-consuming task for humans and is prone to errors caused by repetitiveness and boredom. In this paper, we present a novel object counting method that, unlike most recent works, which focus on the regression of a density map, performs the counting procedure by localizing each single object. This key difference allows us to provide not only an accurate count but also the position of every counted object, information that can be critical in some areas such as precision agriculture. The method is designed in two steps: first, a CNN is in charge of mapping arbitrary objects to blob-like structures. Then, using a Laplacian of Gaussian (LoG) filter, we are able to gather the position of all detected objects. We also propose a semi-adversarial training procedure that, combined with the former design, improves the result by a large margin. After evaluating the method on two public benchmarks of isometric objects, we stay on par with the state of the art while being able to provide extra position information.
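
The second step described above, locating blobs with a Laplacian of Gaussian filter and counting the detections, can be sketched in plain Python. This is only an illustration under stated assumptions: the CNN output is replaced by a synthetic blob map (`make_blob_map`, hypothetical), and the kernel radius, `sigma`, and the relative peak threshold of 0.5 are illustrative choices, not values from the paper.

```python
import math


def log_kernel(sigma, radius):
    """Laplacian-of-Gaussian kernel (unnormalized), shifted to zero mean."""
    size = 2 * radius + 1
    kernel = []
    for y in range(-radius, radius + 1):
        row = []
        for x in range(-radius, radius + 1):
            r2 = x * x + y * y
            row.append((r2 - 2 * sigma ** 2) / sigma ** 4
                       * math.exp(-r2 / (2 * sigma ** 2)))
        kernel.append(row)
    mean = sum(map(sum, kernel)) / (size * size)
    return [[v - mean for v in row] for row in kernel]


def log_response(img, kernel):
    """Negated LoG filtering with zero padding: bright blobs give positive peaks."""
    h, w, r = len(img), len(img[0]), len(kernel) // 2
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            s = 0.0
            for di in range(-r, r + 1):
                for dj in range(-r, r + 1):
                    y, x = i + di, j + dj
                    if 0 <= y < h and 0 <= x < w:
                        s += img[y][x] * kernel[di + r][dj + r]
            out[i][j] = -s
    return out


def locate_objects(img, sigma=1.5, radius=4, rel_threshold=0.5):
    """Return (row, col) of strict local maxima above a relative threshold."""
    resp = log_response(img, log_kernel(sigma, radius))
    cutoff = rel_threshold * max(max(row) for row in resp)
    peaks = []
    for i in range(1, len(resp) - 1):
        for j in range(1, len(resp[0]) - 1):
            v = resp[i][j]
            neighbours = [resp[i + a][j + b]
                          for a in (-1, 0, 1) for b in (-1, 0, 1) if (a, b) != (0, 0)]
            if v >= cutoff and all(v > n for n in neighbours):
                peaks.append((i, j))
    return peaks


def make_blob_map(size, centers, sigma=1.5):
    """Synthetic stand-in for the CNN output: one Gaussian blob per object."""
    return [[sum(math.exp(-((i - cy) ** 2 + (j - cx) ** 2) / (2 * sigma ** 2))
                 for cy, cx in centers)
             for j in range(size)] for i in range(size)]


blob_map = make_blob_map(20, [(5, 5), (13, 14)])
positions = locate_objects(blob_map)
print(len(positions), positions)
```

The count is simply the number of detected peaks, so the same pass yields both the total and the position of each object.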

    Epistemic uncertainty quantification in deep learning classification by the Delta method

    Nilsen G.K., Munthe-Kaas A.Z., Skaug H.J., Brun M., ...
    13 pages
    Abstract: © 2021 The Author(s). The Delta method is a classical procedure for quantifying epistemic uncertainty in statistical models, but its direct application to deep neural networks is prevented by the large number of parameters P. We propose a low-cost approximation of the Delta method applicable to L2-regularized deep neural networks based on the top K eigenpairs of the Fisher information matrix. We address efficient computation of full-rank approximate eigendecompositions in terms of the exact inverse Hessian, the inverse outer-products of gradients approximation and the so-called Sandwich estimator. Moreover, we provide bounds on the approximation error for the uncertainty of the predictive class probabilities. We show that when the smallest computed eigenvalue of the Fisher information matrix is near the L2-regularization rate, the approximation error will be close to zero even when K ≪ P. A demonstration of the methodology is presented using a TensorFlow implementation, and we show that meaningful rankings of images based on predictive uncertainty can be obtained for two LeNet and ResNet-based neural networks using the MNIST and CIFAR-10 datasets. Further, we observe that false positives have on average a higher predictive epistemic uncertainty than true positives. This suggests that there is supplementary information in the uncertainty measure not captured by the classification alone.
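
For orientation, the classical first-order Delta method that the abstract builds on can be written as follows. The notation is assumed here and is not taken from the paper: F(x, θ) denotes the vector of C predicted class probabilities, θ̂ the fitted parameter vector of length P, and Σ a P × P covariance estimate for θ̂ (for example one of the three estimators the abstract names).

```latex
\hat{p}(x) = F(x, \hat{\theta}), \qquad
J(x) = \left.\frac{\partial F(x, \theta)}{\partial \theta}\right|_{\theta = \hat{\theta}} \in \mathbb{R}^{C \times P}, \qquad
\operatorname{Cov}\!\left[\hat{p}(x)\right] \approx J(x)\, \Sigma\, J(x)^{\top}
```

Forming Σ explicitly costs on the order of P² memory, which is why a direct application to networks with many parameters is impractical and a low-rank approximation becomes attractive.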

    IC neuron: An efficient unit to construct neural networks

    An J., Liu F., Shen F., Zhao J., ...
    12 pages
    Abstract: © 2021 Elsevier Ltd. As a popular machine learning method, neural networks can be used to solve many complex tasks. Their strong generalization ability comes from the representation ability of the basic neuron models. The most popular neuron model is the McCulloch–Pitts (MP) neuron, which uses a simple transformation to process the input signal. A common trend is to use the MP neuron to design various neural networks. However, the optimization of the neuron structure is rarely considered. Inspired by the elastic collision model in physics, we propose a new neuron model that can represent more complex distributions. We term it the Inter-layer Collision (IC) neuron, which divides the input space into multiple subspaces to represent different linear transformations. Through this operation, the IC neuron enhances the non-linear representation ability and emphasizes useful input features for a given task. We build the IC networks by integrating the IC neurons into the fully-connected, the convolutional, and the recurrent structures. The IC networks outperform the traditional neural networks in a wide range of tasks. Besides, we combine the IC neuron with deep learning models and show the superiority of the IC neuron. Our research shows that the IC neuron can be an effective basic unit for building network structures and improving network performance.

    Exponential synchronization of coupled neural networks under stochastic deception attacks

    Zhang H., Li L., Li X.
    10 pages
    Abstract: © 2021 Elsevier Ltd. In this paper, the issue of synchronization is investigated for coupled neural networks subject to stochastic deception attacks. Firstly, a general differential inequality with delayed impulses is given. Then, the established differential inequality is further extended to the case of delayed stochastic impulses, in which both the impulsive instants and the impulsive intensity are stochastic. Secondly, by modeling the stochastic discrete-time deception attacks as stochastic impulses, synchronization criteria for the coupled neural networks under the corresponding attacks are given. Finally, two numerical examples are provided to demonstrate the correctness of the theoretical results.

    Detecting out-of-distribution samples via variational auto-encoder with reliable uncertainty estimation

    Ran X., Xu M., Mei L., Xu Q., ...
    10 pages
    Abstract: © 2021 Elsevier Ltd. Variational autoencoders (VAEs) are influential generative models with rich representation capabilities from the deep neural network architecture and Bayesian method. However, VAE models have a weakness: they assign a higher likelihood to out-of-distribution (OOD) inputs than to in-distribution (ID) inputs. To address this problem, a reliable uncertainty estimation is considered to be critical for in-depth understanding of OOD inputs. In this study, we propose an improved noise contrastive prior (INCP) that can be integrated into the encoder of VAEs; the resulting model is called INCPVAE. INCP is scalable, trainable and compatible with VAEs, and it also adopts the merits of the INCP for uncertainty estimation. Experiments on various datasets demonstrate that, compared to standard VAEs, our model is superior in uncertainty estimation for OOD data and is robust in anomaly detection tasks. The INCPVAE model obtains reliable uncertainty estimation for OOD inputs and solves the OOD problem in VAE models.

    GuidedStyle: Attribute knowledge guided style manipulation for semantic face editing

    Hou X., Zhang X., Liang H., Shen L., ...
    12 pages
    Abstract: © 2021 Elsevier Ltd. Although significant progress has been made in synthesizing high-quality and visually realistic face images by unconditional Generative Adversarial Networks (GANs), there is still a lack of control over the generation process in order to achieve semantic face editing. In this paper, we propose a novel learning framework, called GuidedStyle, to achieve semantic face editing on pretrained StyleGAN by guiding the image generation process with a knowledge network. Furthermore, we allow an attention mechanism in the StyleGAN generator to adaptively select a single layer for style manipulation. As a result, our method is able to perform disentangled and controllable edits along various attributes, including smiling, eyeglasses, gender, mustache, hair color, and attractiveness. Both qualitative and quantitative results demonstrate the superiority of our method over other competing methods for semantic face editing. Moreover, we show that our model can also be applied to different types of real and artistic face editing, demonstrating strong generalization ability.

    Sparsity-control ternary weight networks

    Deng X., Zhang Z.
    12 pages
    Abstract: © 2021 Elsevier Ltd. Deep neural networks (DNNs) have been widely and successfully applied to various applications, but they require large amounts of memory and computational power. This severely restricts their deployment on resource-limited devices. To address this issue, many efforts have been made to train low-bit weight DNNs. In this paper, we focus on training ternary weight {−1, 0, +1} networks, which can avoid multiplications and dramatically reduce the memory and computation requirements. A ternary weight network can be considered as a sparser version of the binary weight counterpart by replacing some −1s or 1s in the binary weights with 0s, thus leading to more efficient inference but more memory cost. However, the existing approaches to training ternary weight networks cannot control the sparsity (i.e., percentage of 0s) of the ternary weights, which undermines the advantage of ternary weights. In this paper, we propose, to the best of our knowledge, the first sparsity-control approach (SCA) to train ternary weight networks, which is simply achieved by a weight discretization regularizer (WDR). SCA is different from all the existing regularizer-based approaches in that it can control the sparsity of the ternary weights through a controller α and does not rely on gradient estimators. We theoretically and empirically show that the sparsity of the trained ternary weights is positively related to α. SCA is extremely simple and easy to implement, and is shown to consistently and significantly outperform the state-of-the-art approaches on several benchmark datasets and even to match the performance of the full-precision weight counterparts.
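
To make the terms concrete, the sketch below shows what ternary weights, their sparsity, and multiplication-free inference look like. It uses a plain magnitude threshold to ternarize, which is a generic technique and not the paper's SCA or its weight discretization regularizer; the `threshold` parameter is hypothetical and is not the controller α from the abstract.

```python
def ternarize(weights, threshold):
    """Map each weight to -1, 0, or +1 using a symmetric magnitude threshold."""
    ternary = []
    for w in weights:
        if w > threshold:
            ternary.append(1)
        elif w < -threshold:
            ternary.append(-1)
        else:
            ternary.append(0)
    return ternary


def sparsity(ternary):
    """Fraction of weights equal to 0."""
    return sum(1 for t in ternary if t == 0) / len(ternary)


def ternary_dot(ternary, inputs):
    """Dot product without multiplications: add, subtract, or skip each input."""
    acc = 0.0
    for t, x in zip(ternary, inputs):
        if t == 1:
            acc += x
        elif t == -1:
            acc -= x
    return acc


weights = [0.9, -0.05, 0.3, -0.7, 0.02, -0.4]
t = ternarize(weights, threshold=0.35)
print(t, sparsity(t), ternary_dot(t, [1, 2, 3, 4, 5, 6]))
```

Raising the threshold zeroes more weights, so more inputs are skipped entirely at inference time; this is the sparsity that the abstract says existing training approaches cannot control directly.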