Abstract: Building a human-like integrative artificial cognitive system, that is, an artificial general intelligence (AGI), is the holy grail of the artificial intelligence (AI) field. Furthermore, a computational model that enables an artificial system to achieve cognitive development would be an excellent reference for brain and cognitive science. This paper describes an approach to developing a cognitive architecture by integrating elemental cognitive modules so that the modules can be trained as a whole. This approach is based on two ideas: (1) brain-inspired AI, which learns from human brain architecture to build human-level intelligence, and (2) a probabilistic generative model (PGM)-based cognitive architecture, which develops a cognitive system for developmental robots by integrating PGMs. The proposed development framework is called a whole brain PGM (WB-PGM), which differs fundamentally from existing cognitive architectures in that it can learn continuously through a system based on sensory-motor information. In this paper, we describe the rationale for WB-PGM, the current status of PGM-based elemental cognitive modules, their relationship with the human brain, the approach to the integration of the cognitive modules, and future challenges. Our findings can serve as a reference for brain studies. As PGMs describe explicit informational relationships between variables, WB-PGM provides interpretable guidance from computational sciences to brain science. With such information, researchers in neuroscience can give feedback to researchers in AI and robotics on what the current models lack with reference to the brain. Further, it can facilitate collaboration among researchers in neuro-cognitive sciences as well as AI and robotics. (C) 2022 The Author(s). Published by Elsevier Ltd.
Abstract: Accurate classification of childhood epilepsy syndromes is vital to the diagnosis and treatment of epilepsy. However, the existing literature mainly focuses on seizure detection, and little attention has been paid to the classification of childhood epilepsy syndromes. In this paper, we present a study on the classification of two of the most common epilepsy syndromes: benign childhood epilepsy with centrotemporal spikes (BECT) and infantile spasms (also known as WEST syndrome), using data recorded at the Children's Hospital, Zhejiang University School of Medicine (CHZU). A novel feature fusion model based on deep transfer learning and the conventional time-frequency representation of the scalp electroencephalogram (EEG) is developed for epilepsy syndrome characterization. A fully connected network is constructed for feature learning and syndrome classification. Experiments on the CHZU database show that the proposed algorithm achieves an average classification accuracy of 92.35% on the BECT and WEST syndromes and their corresponding normal cases. (C) 2022 Elsevier Ltd. All rights reserved.
Abstract: This paper proposes a new hierarchical approach to learning rate adaptation in gradient methods, called learning rate optimization (LRO). LRO formulates the learning rate adaptation problem as a hierarchical optimization problem that minimizes the loss function with respect to the learning rate for the current model parameters and gradients. LRO then optimizes the learning rate using the alternating direction method of multipliers (ADMM). In this process, LRO requires neither second-order information nor a probabilistic model, so it is highly efficient. Furthermore, LRO does not require any additional hyperparameters compared with the vanilla gradient method with simple exponential learning rate decay. In the experiments, we integrated LRO with vanilla SGD and Adam. We then compared their optimization performance with state-of-the-art learning rate adaptation methods and with the most commonly used adaptive gradient methods. SGD and Adam with LRO outperformed all the competitors on the benchmark datasets in image classification tasks. (c) 2022 The Author(s). Published by Elsevier Ltd.
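The baseline named in the abstract above, a vanilla gradient method with simple exponential learning rate decay, can be sketched as follows. This is only an illustration: the function names, default values, and the grid-based learning rate choice are assumptions introduced here, and the grid search merely illustrates the idea of treating the loss as a function of the learning rate for fixed parameters and gradients. It is not the ADMM-based LRO procedure.

```python
def gradient_descent_exp_decay(grad, x0, lr0=0.1, decay=0.99, steps=200):
    """Vanilla gradient descent on a scalar parameter with exponential decay.

    The names lr0, decay and steps are illustrative assumptions.
    """
    x = x0
    lr = lr0
    for _ in range(steps):
        x = x - lr * grad(x)   # plain gradient step
        lr *= decay            # exponential learning rate decay
    return x


def best_lr_on_grid(loss, x, g, candidates):
    """Pick the candidate learning rate that minimizes loss(x - lr * g).

    A naive stand-in for "optimizing the loss with respect to the learning
    rate" at fixed parameters x and gradient g; not the method of the paper.
    """
    return min(candidates, key=lambda lr: loss(x - lr * g))


if __name__ == "__main__":
    loss = lambda x: (x - 3.0) ** 2
    grad = lambda x: 2.0 * (x - 3.0)
    print(gradient_descent_exp_decay(grad, x0=0.0))
    print(best_lr_on_grid(loss, 0.0, grad(0.0), [0.01, 0.1, 0.5, 1.0]))
```

The decay schedule needs two hyperparameters chosen by hand (`lr0` and `decay`), which is the tuning burden that learning rate adaptation methods aim to reduce.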
Abstract: Constrained convex optimization problems arise in a variety of scientific and engineering applications that often require efficient and fast solutions. Inspired by Nesterov's accelerated method for solving unconstrained convex and strongly convex optimization problems, in this paper we propose two novel accelerated projection neurodynamic approaches for constrained smooth convex and strongly convex optimization based on the variational approach. First, for smooth convex optimization problems, a non-autonomous accelerated projection neurodynamic approach (NAAPNA) is presented, and the existence, uniqueness, and feasibility of its solution are analyzed rigorously. We prove that the NAAPNA has a convergence rate that is inversely proportional to the square of the running time. In addition, we present a novel autonomous accelerated projection neurodynamic approach (AAPNA) for constrained, smooth, strongly convex optimization problems and prove the existence and uniqueness of the strong global solution of AAPNA based on the Cauchy-Lipschitz-Picard theorem. Furthermore, we prove the global convergence of AAPNA with different exponential convergence rates for different parameters. Compared with existing projection neurodynamic approaches based on Brouwer's fixed point theorem, both NAAPNA and AAPNA apply the projection operator to an auxiliary variable to map the primal variables to the constrained feasible region, which makes acceleration easier to realize. Finally, the effectiveness of NAAPNA and AAPNA is illustrated with several numerical examples. (C) 2022 Published by Elsevier Ltd.
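For readers unfamiliar with combining Nesterov-style acceleration with projection, a minimal discrete-time sketch is given below. It is a standard accelerated projected gradient iteration on a box constraint, not the continuous-time NAAPNA or AAPNA dynamics of the paper. The function names, the box constraint, and the momentum weight `(k - 1) / (k + 2)` are assumptions chosen for illustration.

```python
def project_box(x, lo, hi):
    """Euclidean projection of x onto the box [lo, hi] (componentwise clamp)."""
    return [min(max(xi, l), h) for xi, l, h in zip(x, lo, hi)]


def accelerated_projected_gradient(grad, x0, lo, hi, step, iters=500):
    """Nesterov-style accelerated projected gradient on a box constraint.

    grad: function returning the gradient as a list
    step: step size, typically 1/L for an L-smooth objective
    """
    x = project_box(x0, lo, hi)
    y = list(x)  # extrapolated (auxiliary) point
    for k in range(1, iters + 1):
        g = grad(y)
        # Gradient step from the auxiliary point, then project to stay feasible.
        x_new = project_box([yi - step * gi for yi, gi in zip(y, g)], lo, hi)
        beta = (k - 1) / (k + 2)  # momentum weight
        y = [xn + beta * (xn - xo) for xn, xo in zip(x_new, x)]
        x = x_new
    return x


if __name__ == "__main__":
    # Minimize (x0 - 2)^2 + (x1 + 1)^2 subject to 0 <= x0, x1 <= 1.
    grad = lambda x: [2.0 * (x[0] - 2.0), 2.0 * (x[1] + 1.0)]
    print(accelerated_projected_gradient(grad, [0.5, 0.5], [0.0, 0.0], [1.0, 1.0], step=0.5))
```

Every iterate `x` is feasible because it is the output of the projection, while the auxiliary point `y` is allowed to leave the box.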
Abstract: Deep Neural Networks (DNNs) have been widely and successfully employed in various artificial intelligence and machine learning applications (e.g., image processing and natural language processing). As DNNs become deeper and include more filters per layer, they incur high computational costs and large memory consumption to store their large number of parameters. Moreover, present processing platforms (e.g., CPU, GPU, and FPGA) do not have enough internal memory, so external memory storage is needed. Deploying DNNs in mobile applications is therefore difficult, considering the limited storage space, computation power, energy supply, and real-time processing requirements. In this work, network parameters are compressed using a method based on tensor decomposition, thereby reducing access to external memory. This compression method decomposes the weight tensor of each network layer into a limited number of principal vectors such that (i) almost all the initial parameters can be retrieved, (ii) the network structure does not change, and (iii) the quality of the network after the parameters are reproduced is nearly the same as that of the original network in terms of detection accuracy. To optimize the realization of this method on FPGA, the tensor decomposition algorithm was modified without affecting its convergence, and the reproduction of network parameters on FPGA is straightforward. The proposed algorithm reduced the parameters of the ResNet50, VGG16, and VGG19 networks trained on Cifar10 and Cifar100 by almost 10 times. (C) 2022 Elsevier Ltd. All rights reserved.
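The storage saving from decomposing a weight tensor into a small number of vectors can be illustrated with a short calculation. The sketch below assumes a CP-style (sum of rank-one terms) decomposition, which the abstract does not name explicitly. The layer shape, the rank, and all function names are hypothetical.

```python
from math import prod


def dense_param_count(shape):
    """Number of parameters in a dense weight tensor of the given shape."""
    return prod(shape)


def cp_param_count(shape, rank):
    """Parameters stored by a rank-`rank` CP-style decomposition:
    one vector per mode for each rank-one term."""
    return rank * sum(shape)


def cp_entry(factors, index):
    """Rebuild one tensor entry from its factor vectors.

    factors[mode][r] is the r-th vector of that mode; the entry is the sum
    over r of the product of the selected components.
    """
    rank = len(factors[0])
    total = 0.0
    for r in range(rank):
        term = 1.0
        for mode, i in enumerate(index):
            term *= factors[mode][r][i]
        total += term
    return total


if __name__ == "__main__":
    shape = (64, 64, 3, 3)  # hypothetical conv layer: out, in, kernel h, kernel w
    rank = 16               # hypothetical rank
    ratio = dense_param_count(shape) / cp_param_count(shape, rank)
    print(dense_param_count(shape), cp_param_count(shape, rank), ratio)
```

Only the factor vectors need to be kept in memory; individual weights are reproduced on demand, as in `cp_entry`, which is the property that reduces external memory access.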
Abstract: In a competitive game scenario, a set of agents have to learn decisions that maximize their goals and minimize their adversaries' goals at the same time. Besides dealing with the increased dynamics of the scenarios due to the opponents' actions, they usually have to understand how to overcome the opponents' strategies. Most common solutions, usually based on continual learning or centralized multi-agent experiences, do not allow the development of personalized strategies to face individual opponents. In this paper, we propose a novel model composed of three neural layers that learn a representation of a competitive game, learn how to map the strategies of specific opponents, and learn how to disrupt them. The entire model is trained online, using a composed loss based on contrastive optimization, to learn competitive and multiplayer games. We evaluate our model on a Pokémon duel scenario and the four-player competitive Chef's Hat card game. Our experiments demonstrate that our model achieves better performance when playing against offline, online, and competitive-specific models, in particular when playing against the same opponent multiple times. We also discuss the impact of our model, in particular how well it handles opponent-specific strategy learning in each of the two scenarios. (c) 2022 The Authors. Published by Elsevier Ltd. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
Abstract: The propagation of slowly varying firing rates has been shown to be significant for the development of the central nervous system. Recent reports have shown that the passive membrane properties of dendrites play a key role in the computation of the single neuron, which is of great importance to the function of neural networks. However, it is still unclear how dendritic passive properties affect the ability of cortical networks to propagate slowly varying spiking activity. Here, we use two-compartment biophysical models to construct multilayered feedforward neural networks (FFNs) and investigate how dendritic passive properties affect the propagation of slowly varying inputs. In the two-compartment biophysical models, one compartment represents the apical dendrites, and the other describes the soma plus the axon initial segment. The area proportion occupied by the somatic compartment and the coupling conductance between the dendritic and somatic compartments are abstracted to capture the dendritic passive properties. A time-varying signal is injected into the first layer of the FFNs, and the fidelity of the signal during propagation is used to quantify the ability of the FFN to transmit wave-like signals. Numerical results reveal an optimal value of the coupling conductance between the dendritic and somatic compartments that maximizes the fidelity of the initial spiking activity. An increase in the dendritic area enhances the initial firing rate of neurons in the first layer by increasing the response of neurons to slowly varying wave-like input, which delays the attenuation of the firing rate and thus promotes the transmission of signals in the FFN. Using a mean-field approach, we show that changes in the area proportion occupied by the somatic compartment and in the coupling conductance between the dendritic and somatic compartments affect the signal propagation ability of the FFN by adjusting the input-output transform of a single neuron. In the presence of external noise, a wide range of initial firing rates maintains a unique representation during propagation, which ensures the reliable transmission of slowly varying signals in FFNs. These findings help to explain how the passive properties of dendrites participate in the propagation of slowly varying signals in the cerebellum. (c) 2022 Elsevier Ltd. All rights reserved.
Abstract: In this paper, a novel dual-channel system for multi-class text emotion recognition is proposed, and a novel technique to explain its training and predictions is developed. The architecture of the proposed system contains an embedding module, a dual-channel module, an emotion classification module, and an explainability module. The embedding module extracts textual features from the input sentences in the form of embedding vectors using the pre-trained Bidirectional Encoder Representations from Transformers (BERT) model. The embedding vectors are then fed as inputs to the dual-channel network, which contains two network channels made up of a convolutional neural network (CNN) and a bidirectional long short-term memory (BiLSTM) network. The intuition behind using CNN and BiLSTM in both channels is to exploit the strengths of the convolutional layer for feature extraction and of the BiLSTM layer for extracting order and sequence-related information from the text. The outputs of both channels are embedding vectors, which are concatenated and fed to the emotion classification module. The architecture of the proposed system has been determined by thorough ablation studies, and a framework has been developed to discuss its computational cost. The emotion classification module learns and projects the emotion embeddings on a hyperplane in the form of clusters. The proposed explainability technique explains the training and predictions of the proposed system by analyzing the inter- and intra-cluster distances and the intersection of these clusters. The consistent accuracy, precision, recall, and F1 score results of the proposed approach on the ISEAR, Aman, AffectiveText, and EmotionLines datasets support its applicability to diverse texts. (C) 2022 Elsevier Ltd. All rights reserved.
Abstract: Deep Reinforcement Learning (RL) is often criticised for being data inefficient and inflexible to changes in task structure. Part of the reason for these issues is that Deep RL typically learns end-to-end using backpropagation, which results in task-specific representations. One approach for circumventing these problems is to apply Deep RL to existing representations that have been learned in a more task agnostic fashion. However, this only partially solves the problem as the Deep RL algorithm learns a function of all pre-existing representations and is therefore still susceptible to data inefficiency and a lack of flexibility. Biological agents appear to solve this problem by forming internal representations over many tasks and only selecting a subset of these features for decision-making based on the task at hand; a process commonly referred to as selective attention. We take inspiration from selective attention in biological agents and propose a novel algorithm called Selective Particle Attention (SPA), which selects subsets of existing representations for Deep RL. Crucially, these subsets are not learned through backpropagation, which is slow and prone to overfitting, but instead via a particle filter that rapidly and flexibly identifies key subsets of features using only reward feedback. We evaluate SPA on two tasks that involve raw pixel input and dynamic changes to the task structure, and show that it greatly increases the efficiency and flexibility of downstream Deep RL algorithms. (C) 2022 The Authors. Published by Elsevier Ltd.
Abstract: If left untreated, Alzheimer's disease (AD) is a leading cause of slowly progressive dementia. Therefore, it is critical to detect AD to prevent its progression. In this study, we propose a bidirectional progressive recurrent network with imputation (BiPro) that uses longitudinal data, including patient demographics and biomarkers of magnetic resonance imaging (MRI), to forecast clinical diagnoses and phenotypic measurements at multiple timepoints. To compensate for missing observations in the longitudinal data, we use an imputation module to inspect both temporal and multivariate relations associated with the mean and forward relations inherent in the time series data. To encode the imputed information, we define a modification of the long short-term memory (LSTM) cell by using a progressive module to compute the progression score of each biomarker between the given timepoint and the baseline through a negative exponential function. These features are used for the prediction task. The proposed system is an end-to-end deep recurrent network that can accomplish multiple tasks at the same time, including (1) imputing missing values, (2) forecasting phenotypic measurements, and (3) predicting the clinical status of a patient based on longitudinal data. We experimented on 1,335 participants from The Alzheimer's Disease Prediction of Longitudinal Evolution (TADPOLE) challenge cohort. The proposed method achieved a mean area under the receiver-operating characteristic curve (mAUC) of 78% for predicting the clinical status of patients, a mean absolute error (MAE) of 3.5 ml for forecasting MRI biomarkers, and an MAE of 6.9 ml for missing value imputation. The results confirm that our proposed model outperforms prevalent approaches, and can be used to minimize the progression of Alzheimer's disease. (C) 2022 The Author(s). Published by Elsevier Ltd. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).