As two important types of generalized convex functions, pseudoconvex and quasiconvex functions appear in many practical optimization problems. The lack of convexity poses difficulties in solving pseudoconvex optimization problems with quasiconvex constraint functions. In this paper, we propose a one-layer recurrent neural network for solving such problems. We prove that the state of the proposed neural network converges from the feasible region to an optimal solution of the given optimization problem. We show that the proposed neural network has several advantages over existing neural networks for pseudoconvex optimization. Specifically, it is applicable to optimization problems with quasiconvex inequality constraints as well as affine equality constraints. In addition, parameter matrix inversion is avoided, and some assumptions on the objective function and inequality constraints required in existing results are relaxed. We demonstrate the superior performance and characteristics of the proposed neural network through simulation results on three numerical examples.
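To make the dynamical-system view concrete, here is a minimal sketch (not the paper's exact network dynamics): a penalty-type gradient flow, integrated by forward Euler, minimizing a linear-fractional (hence pseudoconvex) objective under a convex (hence quasiconvex) inequality constraint. The objective, constraint, penalty weight `sigma`, and step size are illustrative assumptions.

```python
import numpy as np

# Toy pseudoconvex objective: linear-fractional f(x) = (x0 + x1) / (x0 + 2),
# which is pseudoconvex on the half-space x0 > -2.
def grad_f(x):
    x0, x1 = x
    return np.array([(2.0 - x1) / (x0 + 2.0) ** 2, 1.0 / (x0 + 2.0)])

# Quasiconvex (here convex) inequality constraint g(x) = ||x||^2 - 1 <= 0.
def g(x):
    return x @ x - 1.0

def grad_g(x):
    return 2.0 * x

# Penalty-type flow x' = -grad f(x) - sigma * grad g(x) when g(x) > 0,
# integrated by forward Euler from a feasible initial state.
x, dt, sigma = np.array([0.5, 0.5]), 1e-3, 10.0
for _ in range(20000):
    drift = -grad_f(x)
    if g(x) > 0.0:
        drift = drift - sigma * grad_g(x)
    x = x + dt * drift
print(x, g(x))  # the state settles near a feasible minimizer of f
```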
Time delays are inevitable in the neural processing of sensorimotor systems, and even small delays can severely degrade movement accuracy and stability. It is strongly suggested that the cerebellum compensates for delays in neural signal processing and performs predictive control. Neural computational theories have explored the concept of internal models of control objects, believed to avoid delays by providing internal feedback information, although no clear link to neural processing has been established. The timing-dependent plasticity of parallel fiber–Purkinje cell synapses is well known: long-term depression of the synapse is observed when parallel fiber activation precedes climbing fiber activation within approximately 50–300 ms, and is greatest within 50–200 ms. This paper presents a theory that this temporal difference of 50–200 ms is the basis for an associative anticipation of as many milliseconds. Associative learning can theoretically connect an input signal to a desired signal; therefore, an input signal arriving 50–200 ms earlier can be connected to a desired output signal through temporally asymmetric plasticity. After learning is completed, an input signal generates the desired output signal that would otherwise appear 50–200 ms later. For the associative learning of temporally continuous signals, this study integrates the universal function approximation capability of the cerebellar cortex model with temporally asymmetric synaptic plasticity to create a theory of associative anticipatory learning in the cerebellum. The effectiveness of this learning for motor control is demonstrated by adaptively stabilizing an inverted pendulum under a delay, much as humans do.
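The anticipation mechanism can be illustrated with a toy associative learner (hypothetical, not the paper's cerebellar model): an LMS-style update pairs an input from 150 ms earlier with the current teaching signal, so that after learning the current input reproduces the signal that used to arrive 150 ms later.

```python
import numpy as np

dt, delay = 0.001, 150                 # 1 ms steps; ~150 ms plasticity window
t = np.arange(0.0, 10.0, dt)
pf = np.stack([np.sin(2 * np.pi * t),  # parallel-fiber input basis
               np.cos(2 * np.pi * t)])
teach = np.sin(2 * np.pi * t)          # climbing-fiber teaching signal

w, lr = np.zeros(2), 0.002
for k in range(delay, len(t)):
    x_past = pf[:, k - delay]          # input that preceded the teacher by 150 ms
    w += lr * (teach[k] - w @ x_past) * x_past   # LMS form of asymmetric plasticity

out = w @ pf                           # after learning: out[k] ~ teach[k + delay]
k0 = len(t) - 2000                     # check over the last 2 s (post-convergence)
print(np.abs(out[k0:len(t) - delay] - teach[k0 + delay:]).max())  # small residual
```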
Although approaches based on Convolutional Neural Networks (CNNs) have been successful in object detection, they predominantly focus on localizing discriminative regions while overlooking the holistic part-whole associations within objects. This ultimately neglects the feature relationships between an object and its parts, as well as among those parts, both of which are significantly helpful for detecting discriminative parts. In this paper, we propose to "look inside the objects" by digging into part-whole feature correlations, and we attempt to leverage the correlations endowed by the Capsule Network (CapsNet) for robust object detection. Highly correlated capsules across adjacent layers share high familiarity and are more likely to be routed together. In light of this, we take the correlations between different capsules of the preceding training samples as an awareness signal to constrain the candidate voting scope during the subsequent routing procedure, and we propose a Feature Correlation-Steered CapsNet (FCS-CapsNet) with Locally-Constrained Expectation-Maximization (EM) Routing Agreement (LCEMRA). Different from conventional EM routing, LCEMRA stipulates that only those relevant low-level capsules (parts) meeting a quantified intra-object cohesiveness requirement can be clustered to make up high-level capsules (objects). In doing so, part-object associations can be uncovered through the transformation weighting matrices between capsule layers during this "part backtracking" procedure. LCEMRA enables low-level capsules to selectively gather projections from a non-spatially-fixed set of high-level capsules. Experiments on VOC2007, VOC2012, HKU-IS, DUTS, and COCO show that FCS-CapsNet achieves promising object detection results across multiple evaluation metrics, on par with the state of the art.
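As a rough illustration of the locally constrained routing idea (a simplified dynamic-routing stand-in, not the paper's EM formulation), the sketch below restricts each low-level capsule's voting scope with a binary mask playing the role of the correlation-derived constraint:

```python
import numpy as np

def squash(v, axis=-1):
    n2 = (v ** 2).sum(axis=axis, keepdims=True)
    return (n2 / (1.0 + n2)) * v / np.sqrt(n2 + 1e-9)

def masked_routing(u_hat, mask, iters=3):
    """u_hat: (n_low, n_high, d) votes; mask: (n_low, n_high) binary scope.
    Masked-out (uncorrelated) part-object pairs never exchange agreement."""
    b = np.where(mask > 0, 0.0, -1e9)                # routing logits
    for _ in range(iters):
        c = np.exp(b - b.max(axis=1, keepdims=True))
        c = c / c.sum(axis=1, keepdims=True)         # coupling per low capsule
        v = squash((c[..., None] * u_hat).sum(axis=0))
        b = b + mask * (u_hat * v[None]).sum(-1)     # agreement on allowed pairs only
    return v

rng = np.random.default_rng(0)
u_hat = rng.normal(size=(6, 3, 8))    # 6 part capsules voting for 3 object capsules
mask = np.zeros((6, 3))               # each part may vote for only 2 "correlated" objects
for i in range(6):
    mask[i, i % 3] = mask[i, (i + 1) % 3] = 1.0
print(masked_routing(u_hat, mask).shape)   # (3, 8) object capsule poses
```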
Writing style is an abstract attribute of handwritten text. It plays an important role in recognition systems but is not easy to define explicitly. Considering the effect of writing style, a writer adaptation method is proposed to transform a writer-independent recognizer toward a particular writer. This transformation has the potential to significantly increase accuracy. In this paper, under the deep learning framework, we propose a general fast writer adaptation solution. Specifically, without depending on other complex techniques, a well-designed style extractor network (SEN) trained with an identification loss (IDL) is introduced to explicitly extract personalized writer information. The SEN consists of a stack of convolutional layers followed by a recurrent neural network with gated recurrent units (GRUs), which removes semantic context and retains writer information. The outputs of the GRU are then integrated into a one-dimensional vector that represents writing style. Finally, the extracted style information is fed into the writer-independent recognizer to achieve adaptation. Validated on offline handwritten text recognition, the proposed fast sentence-level adaptation achieves remarkable improvements on both Chinese and English text recognition tasks. For the handwritten English text recognition (HETR) task, we further propose a multi-information fusion network equipped with a hybrid attention mechanism that integrates visual features, context features, and writing style. In addition, under the same condition (only one writer-specific text line used as adaptation data), the proposed solution significantly outperforms the previous multiple-pass decoding method without consuming extra time. The code is available at https://github.com/Wukong90/Handwritten-Text-Recognition.
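A hedged sketch of an SEN-like architecture (layer sizes, dimensions, and the writer count are guesses, not the paper's configuration): a convolutional stack feeds a GRU, whose outputs are pooled into a single style vector and trained through a linear head with cross-entropy over writer identities (the IDL).

```python
import torch
import torch.nn as nn

class StyleExtractor(nn.Module):
    """SEN-like sketch: conv stack -> GRU -> mean-pool to one style vector."""
    def __init__(self, style_dim=128):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(1, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.gru = nn.GRU(64, style_dim, batch_first=True)

    def forward(self, img):                  # img: (B, 1, H, W) text-line image
        f = self.conv(img)                   # (B, 64, H/4, W/4)
        f = f.mean(dim=2).transpose(1, 2)    # collapse height -> sequence (B, W/4, 64)
        h, _ = self.gru(f)
        return h.mean(dim=1)                 # (B, style_dim) writing-style vector

sen = StyleExtractor()
id_head = nn.Linear(128, 100)                # hypothetical 100 training writers
style = sen(torch.randn(4, 1, 64, 256))      # batch of 4 text-line images
logits = id_head(style)                      # trained with cross-entropy (the IDL)
```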
Measure-preserving neural networks are well-developed invertible models; however, their approximation capabilities remain unexplored. This paper rigorously analyzes the approximation capabilities of existing measure-preserving neural networks, including NICE and RevNets. It is shown that, for a compact set U ⊂ ℝ^D with D ≥ 2, measure-preserving neural networks can approximate, in the L^p-norm, any measure-preserving map ψ: U → ℝ^D that is bounded and injective. In particular, any continuously differentiable injective map whose Jacobian determinant is ±1 is measure-preserving and can thus be approximated.
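The connection is easy to check numerically for the simplest such model: a NICE-style additive coupling layer (x1, x2) ↦ (x1, x2 + m(x1)) has a triangular Jacobian with unit determinant, so it preserves Lebesgue measure. The sketch below verifies this by finite differences; the nonlinearity m is an arbitrary illustrative choice.

```python
import numpy as np

rng = np.random.default_rng(0)
w1, b1, w2 = rng.normal(size=8), rng.normal(size=8), rng.normal(size=8)

def m(x1):                                    # arbitrary scalar nonlinearity
    return np.tanh(x1 * w1 + b1) @ w2

def coupling(x):                              # (x1, x2) -> (x1, x2 + m(x1))
    return np.array([x[0], x[1] + m(x[0])])

def jacobian(F, x, eps=1e-6):                 # central finite differences
    cols = [(F(x + eps * e) - F(x - eps * e)) / (2 * eps) for e in np.eye(2)]
    return np.stack(cols, axis=1)

x = rng.normal(size=2)
print(np.linalg.det(jacobian(coupling, x)))   # ~1.0: Lebesgue measure is preserved
```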
An extension of the Neural Additive Model (NAM), called SurvNAM, and its modifications are proposed to explain the predictions of a black-box machine learning survival model. The method applies the original NAM to the explanation problem in the framework of survival analysis. The basic idea behind SurvNAM is to train the network with a specific expected loss function that takes into account the peculiarities of survival model predictions. Moreover, the loss function approximates the black-box model by an extension of the Cox proportional hazards model that uses the well-known Generalized Additive Model (GAM) in place of a simple linear combination of covariates. SurvNAM supports both local and global explanations. The global explanation uses the whole training dataset, whereas for the local explanation a set of synthetic examples is randomly generated around the example being explained. The proposed modifications of SurvNAM are based on Lasso-based regularization of the GAM functions and on a special representation of the GAM functions by their weighted linear and non-linear parts, implemented as a shortcut connection. Numerical experiments illustrate the efficiency of SurvNAM.
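A minimal sketch of the surrogate model structure (illustrative; the paper's actual training objective approximates the black-box model rather than fitting data directly): one small network per covariate produces an additive risk score, which can be plugged into the Cox partial likelihood.

```python
import torch
import torch.nn as nn

class GAMCox(nn.Module):
    """One small net g_k per covariate; additive risk = sum_k g_k(x_k)."""
    def __init__(self, n_features, hidden=32):
        super().__init__()
        self.shape_fns = nn.ModuleList(
            nn.Sequential(nn.Linear(1, hidden), nn.ReLU(), nn.Linear(hidden, 1))
            for _ in range(n_features))

    def forward(self, x):                     # x: (B, n_features)
        return sum(g(x[:, k:k + 1]) for k, g in enumerate(self.shape_fns)).squeeze(-1)

def cox_partial_nll(risk, time, event):
    """Negative Cox partial log-likelihood (no handling of tied times)."""
    order = torch.argsort(time, descending=True)    # risk sets via reverse-time cumsum
    r = risk[order]
    log_risk_set = torch.logcumsumexp(r, dim=0)
    return -((r - log_risk_set) * event[order]).sum() / event.sum().clamp(min=1)
```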
Neural network pruning can effectively trim over-parameterized neural networks by removing a number of network parameters. However, traditional rule-based approaches depend on manual experience, and existing heuristic search methods over discrete search spaces are usually time-consuming and sub-optimal. In this paper, we develop a differentiable multi-pruner and predictor (DMPP) to prune neural networks automatically. The pruner, composed of learnable parameters, generates the pruning ratios of all convolutional layers as a continuous representation of the network. A neural network-based predictor is employed to predict the performance of different structures, which accelerates the search process. Together, the pruner and predictor enable us to directly employ gradient-based optimization to find a better structure. In addition, a multi-pruner scheme is presented to improve search efficiency, and knowledge distillation is leveraged to improve the performance of the pruned network. To evaluate the effectiveness of the proposed method, extensive experiments are performed on the CIFAR-10, CIFAR-100, and ImageNet datasets with VGGNet and ResNet. Results show that DMPP achieves better performance than many previous state-of-the-art methods.
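A hedged sketch of the gradient-based search loop (all dimensions, networks, and the budget term are illustrative assumptions; in DMPP the predictor would first be fitted on sampled structure-accuracy pairs):

```python
import torch
import torch.nn as nn

n_layers = 13                                        # e.g., VGG-like conv layers
logits = nn.Parameter(torch.zeros(n_layers))         # the pruner's parameters
predictor = nn.Sequential(nn.Linear(n_layers, 64), nn.ReLU(), nn.Linear(64, 1))
for p in predictor.parameters():                     # predictor is fixed during search
    p.requires_grad_(False)

opt = torch.optim.Adam([logits], lr=0.01)
for _ in range(200):
    ratios = torch.sigmoid(logits)                   # per-layer pruning ratios in (0, 1)
    pred_acc = predictor(ratios).squeeze()           # predicted accuracy of the structure
    budget = torch.relu(0.5 - ratios.mean())         # hypothetical FLOPs-style budget term
    loss = -pred_acc + 10.0 * budget
    opt.zero_grad(); loss.backward(); opt.step()
print(torch.sigmoid(logits).detach())                # searched pruning ratios
```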
While previous network compression methods have achieved great success, most rely on abundant training data, which is often unavailable in practice due to, e.g., privacy issues, storage constraints, and transmission limitations. A promising way to solve this problem is to perform compression with a few unlabeled data. Following this direction, we propose a novel few-shot network compression framework named Few-Shot Slimming (FSS). FSS follows the student/teacher paradigm and contains two steps: (1) construct the student by inheriting principal feature maps from the teacher; (2) refine the student's feature representation by knowledge distillation with an enhanced mixing data augmentation method called GridMix. Specifically, in the first step, we employ normalized cross-correlation to perform principal feature analysis and theoretically construct a new indicator to select the most informative feature maps from the teacher for the student. The indicator is based on the variances of feature maps, which efficiently quantify the information richness of the input feature maps in a feature-agnostic manner. In the second step, we perform knowledge distillation for the student initialized in the first step with a novel grid-based mixing data augmentation technique that greatly extends the limited sample set. In this way, the student refines its feature representation and achieves a better result. Extensive experiments on multiple benchmarks demonstrate the state-of-the-art performance of FSS. For example, using only 0.2% of the full training set as label-free data, FSS yields a 60% FLOPs reduction for DenseNet-40 on CIFAR-10 with a loss of only 0.8% in top-1 accuracy, on par with results obtained by conventional full-data methods.
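The two steps can be caricatured in a few lines (illustrative assumptions throughout: the paper's indicator also involves normalized cross-correlation, and its GridMix mixes patches rather than the fixed checkerboard used here):

```python
import torch

def select_informative_channels(feats, k):
    """Rank teacher feature maps by activation variance over the few-shot
    data and keep the top-k for the student to inherit."""
    var = feats.var(dim=(0, 2, 3))       # feats: (N, C, H, W) teacher activations
    return var.topk(k).indices

def gridmix(x1, x2, grid=4):
    """Toy checkerboard mix of two image batches (H, W divisible by grid)."""
    _, _, h, w = x1.shape
    mask = torch.zeros(1, 1, h, w)
    gh, gw = h // grid, w // grid
    for i in range(grid):
        for j in range(grid):
            if (i + j) % 2 == 0:
                mask[..., i * gh:(i + 1) * gh, j * gw:(j + 1) * gw] = 1.0
    return mask * x1 + (1.0 - mask) * x2
```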
In this paper, we propose an efficient boosting method with theoretical guarantees for binary classification. The proposed method has three key ingredients: a fully corrective greedy (FCG) update, a differentiable squared hinge (also called truncated quadratic) loss function, and an efficient alternating direction method of multipliers (ADMM) solver. Compared with traditional boosting methods, the FCG update accelerates the numerical convergence rate, while the squared hinge loss inherits the robustness of the hinge loss for classification and maintains the theoretical benefits of the square loss in regression. The ADMM solver, with guaranteed fast convergence, then provides an efficient implementation of the proposed boosting method. We conduct both theoretical analysis and numerical verification to show the advantages of the proposed method. Theoretically, a fast learning rate of order O((m/log m)^(-1/2)) is proved under certain standard assumptions, where m is the sample size. Numerically, a series of toy simulations and real-data experiments are carried out to verify the developed theory.
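A minimal sketch of fully corrective greedy boosting with the squared hinge loss (the corrective step is solved here by plain gradient descent as a stand-in for the paper's ADMM solver; `H` holds precomputed weak-learner outputs):

```python
import numpy as np

def sq_hinge_grad(margin):                     # derivative of max(0, 1 - m)^2
    return -2.0 * np.maximum(0.0, 1.0 - margin)

def fcg_boost(H, y, rounds=10, steps=300, lr=0.1):
    n, _ = H.shape
    active, w = [], np.zeros(0)
    for _ in range(rounds):
        margin = y * (H[:, active] @ w)
        score = H.T @ (y * sq_hinge_grad(margin)) / n   # gradient per weak learner
        j = int(np.argmax(np.abs(score)))               # greedy selection
        if j not in active:
            active.append(j)
            w = np.append(w, 0.0)
        for _ in range(steps):                          # fully corrective refit
            margin = y * (H[:, active] @ w)
            w -= lr * H[:, active].T @ (y * sq_hinge_grad(margin)) / n
    return active, w

rng = np.random.default_rng(0)
H = rng.normal(size=(200, 50))                          # weak-learner outputs
y = np.sign(H[:, 3] + 0.5 * H[:, 7] + 0.1 * rng.normal(size=200))
active, w = fcg_boost(H, y)
print(active[:2])   # the informative learners (3 and 7) are typically picked first
```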
This paper investigates adaptive tracking control for a class of nonlinear multi-input multi-output (MIMO) state-constrained systems with input delay and saturation. In the control design, a neural network is employed to approximate the unknown nonlinear uncertainties, and an appropriate barrier Lyapunov function is introduced to prevent violation of the state constraints. For the issue of input saturation with time delay, a smooth non-affine approximating function and a novel auxiliary system are utilized, respectively. Moreover, adaptive neural tracking control is developed by combining the command-filtering backstepping approach, which effectively avoids the explosion of differentiation and reduces the computational burden. The introduced filtering-error compensation system brings a significant improvement in tracking performance. Finally, simulation results are presented to verify the feasibility of the proposed strategy.
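To see why a barrier Lyapunov function enforces state constraints, consider the common log-type form (illustrative; the paper's exact BLF may differ): V(z) = ½ log(k_b² / (k_b² − z²)) stays small near z = 0 but diverges as the error z approaches the bound k_b, so any controller that keeps V bounded keeps |z| < k_b.

```python
import numpy as np

kb = 1.0                                   # constraint bound on the tracking error z
z = np.array([0.0, 0.5, 0.9, 0.99, 0.999]) * kb
V = 0.5 * np.log(kb ** 2 / (kb ** 2 - z ** 2))
print(np.round(V, 3))                      # values grow without bound as |z| -> kb
```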