Journal: Neural Networks, Volume 146 (2022) | National Academic Search

Journal information

Neural Networks

Publisher: Pergamon Press

ISSN: 0893-6080

Neural Networks / Journal Neural Networks
Indexed in: SCI, AHCI, EI, ISTP

Publication status: officially published

Multi-layer information fusion based on graph convolutional network for knowledge-driven herb recommendation

Yang Y., Rao Y., Yu M., Kang Y., ...

10 pages

Abstract: © 2021 Elsevier Ltd. Prescription of Traditional Chinese Medicine (TCM) is a precious treasure accumulated over the long-term development of TCM. Artificial intelligence (AI) technology is used to build herb recommendation models that capture regularities in prescriptions, which is of great significance to the clinical application of TCM and the discovery of new prescriptions. Most earlier herb recommendation models ignored the properties of herbs, and most relied on statistical models based on bag-of-words, which makes it difficult for the model to perceive the complex correlations between symptoms and herbs. In this paper, we introduce the properties of herbs as auxiliary information by constructing a herb knowledge graph, and propose a graph convolution model with multi-layer information fusion to obtain symptom and herb feature representations with rich information and less noise. We apply the proposed model to a TCM prescription dataset, and the experimental results show that our model outperforms the baseline models in terms of Precision@5 by 6.2%, Recall@5 by 16.0% and F1-Score@5 by 12.0%.
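
For readers unfamiliar with the top-k metrics quoted in this abstract, the following is a minimal sketch of how Precision@k, Recall@k and F1@k are commonly computed for a single ranked recommendation list. It is not taken from the paper: the function name and the herb names are hypothetical, and the paper may normalize or average the metrics differently.

```python
def precision_recall_f1_at_k(recommended, relevant, k=5):
    """Return (precision@k, recall@k, F1@k) for one ranked recommendation list."""
    top_k = recommended[:k]
    relevant_set = set(relevant)
    # Count how many of the top-k recommendations are in the ground truth.
    hits = sum(1 for item in top_k if item in relevant_set)
    precision = hits / k
    recall = hits / len(relevant_set) if relevant_set else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if hits else 0.0
    return precision, recall, f1


# Hypothetical herb names, for illustration only.
ranked = ["gancao", "fuling", "baizhu", "chenpi", "banxia", "danggui"]
truth = ["gancao", "baizhu", "danggui", "huangqi"]
p, r, f1 = precision_recall_f1_at_k(ranked, truth, k=5)
print(p, r, round(f1, 4))
```

Here two of the top five recommendations are correct, so precision is 2/5 and recall is 2/4. Dataset-level scores are usually the mean of these per-prescription values.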

Original links: NSTL, Elsevier

CSITime: Privacy-preserving human activity recognition using WiFi channel state information

Rathore H., Tiwari K., Pandey H.M., Mathur M., ...

11 pages

Abstract: © 2021 Elsevier Ltd. Human activity recognition (HAR) is an important task in many applications such as smart homes, sports analysis, and healthcare services. Popular HAR modalities in the literature rely on computer vision and inertial sensors; however, they face serious limitations with respect to illumination, background, clutter, obtrusiveness, and other factors. In recent years, WiFi channel state information (CSI) based activity recognition has been gaining momentum due to its many advantages, including easy deployability and cost-effectiveness. This work proposes CSITime, a modified InceptionTime network architecture, as a generic architecture for CSI-based human activity recognition. We treat CSI activity recognition as a multivariate time series problem. The methodology of CSITime is threefold. First, we pre-process CSI signals and then apply data augmentation using two label-mixing strategies, mixup and cutmix, to enhance the neural network's learning. Second, in the basic block of CSITime, features from multiple convolutional kernels are concatenated and passed through a self-attention layer followed by a fully connected layer with Mish activation. The CSITime network consists of six such blocks followed by a global average pooling layer and a final fully connected layer for classification. Third, in training the neural network, instead of adopting general training procedures such as early stopping, we use the one-cycle policy and cosine annealing to schedule the learning rate. The proposed model has been tested on the publicly available benchmark datasets ARIL, StanWiFi, and SignFi. CSITime achieved accuracies of 98.20%, 98%, and 95.42% on ARIL, StanWiFi, and SignFi, respectively, for WiFi-based activity recognition. This is an improvement on state-of-the-art accuracies of 3.3%, 0.67%, and 0.82%, respectively. In the lab-5 users scenario of the SignFi dataset, where the training and testing data come from different distributions, our model's accuracy was 2.17% higher than the state of the art, which shows the comparative robustness of our model.
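
Two of the ingredients named in this abstract, mixup augmentation and the Mish activation, have simple standard definitions, sketched below in plain Python. This is an illustration of the general techniques only, not the CSITime implementation: the function names, the `alpha` default and the toy inputs are assumptions, and the abstract does not say how the authors parameterize either step.

```python
import math
import random


def mish(x):
    """Mish activation: x * tanh(softplus(x)), with softplus(x) = ln(1 + e^x)."""
    # Numerically stable softplus: max(x, 0) + ln(1 + e^(-|x|)).
    softplus = math.log1p(math.exp(-abs(x))) + max(x, 0.0)
    return x * math.tanh(softplus)


def mixup(x_a, y_a, x_b, y_b, alpha=0.2, rng=random):
    """Blend two samples and their one-hot labels with weight lam ~ Beta(alpha, alpha)."""
    lam = rng.betavariate(alpha, alpha)
    x = [lam * a + (1.0 - lam) * b for a, b in zip(x_a, x_b)]
    y = [lam * a + (1.0 - lam) * b for a, b in zip(y_a, y_b)]
    return x, y, lam


# Toy example: two short "signals" from two different classes (hypothetical data).
mixed_x, mixed_y, lam = mixup([1.0, 2.0], [1.0, 0.0], [3.0, 4.0], [0.0, 1.0],
                              rng=random.Random(0))
```

The mixed label stays a valid probability vector because both inputs and labels are combined with the same convex weight.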

Imitation and mirror systems in robots through Deep Modality Blending Networks

Asada M., Oztop E., Ugur E., Seker M.Y., ...

14 pages

Abstract: © 2021 Elsevier Ltd. Learning to interact with the environment not only empowers the agent with manipulation capability but also generates information that facilitates the building of action understanding and imitation capabilities. This seems to be a strategy adopted by biological systems, in particular primates, as evidenced by the existence of mirror neurons that seem to be involved in multi-modal action understanding. How to benefit from the interaction experience of robots to enable understanding of the actions and goals of other agents is still a challenging question. In this study, we propose a novel method, deep modality blending networks (DMBN), that creates a common latent space from the multi-modal experience of a robot by blending multi-modal signals with a stochastic weighting mechanism. We show for the first time that deep learning, when combined with a novel modality blending scheme, can facilitate action recognition and produce structures to sustain anatomical and effect-based imitation capabilities. Our proposed system, which is based on conditional neural processes, can be conditioned on any desired sensory/motor value at any time step, and can generate a complete multi-modal trajectory consistent with the desired conditioning in one shot by querying the network for all the sampled time points in parallel, avoiding the accumulation of prediction errors. Based on simulation experiments with an arm-gripper robot and an RGB camera, we showed that DMBN could make accurate predictions about any missing modality (camera or joint angles) given the available ones, outperforming recent multimodal variational autoencoder models in terms of long-horizon high-dimensional trajectory predictions. We further showed that given desired images from different perspectives, i.e., images generated by the observation of other robots placed on different sides of the table, our system could generate image and joint angle sequences that correspond to either anatomical or effect-based imitation behavior. To achieve this mirror-like behavior, our system does not perform pixel-based template matching but rather relies on the common latent space constructed from both joint and image modalities, as shown by additional experiments. Moreover, we showed that mirror learning (in our system) does not depend only on visual experience and cannot be achieved without proprioceptive experience. Our experiments showed that out of ten training scenarios with different initial configurations, the proposed DMBN model could achieve mirror learning in all of the cases, whereas the model that only uses visual information failed in half of them. Overall, the proposed DMBN architecture not only serves as a computational model for sustaining mirror neuron-like capabilities, but also stands as a powerful machine learning architecture for high-dimensional multi-modal temporal data, with robust retrieval capabilities operating with partial information in one or multiple modalities.

ARCNN framework for multimodal infodemic detection

Raj C., Meel P.

33 pages

Abstract: © 2021 Elsevier Ltd. Fake news and misinformation have adopted various propagation media over time, nowadays spreading predominantly through online social networks. During the ongoing COVID-19 pandemic, false information is affecting human life in many spheres. The world needs automated detection technology, and efforts are being made to meet this requirement with the use of artificial intelligence. Neural network detection mechanisms are robust and durable and hence are used extensively in fake news detection. Deep learning algorithms demonstrate efficiency when they are provided with a large amount of training data. Given the scarcity of relevant fake news datasets, we built the Coronavirus Infodemic Dataset (CovID), which contains fake news posts and articles related to coronavirus. This paper presents a novel framework, the Allied Recurrent and Convolutional Neural Network (ARCNN), to detect fake news based on two different modalities: text and image. Our approach uses recurrent neural networks (RNNs) and convolutional neural networks (CNNs) and combines both streams to generate a final prediction. We present extensive research on various popular RNN and CNN models and their performance on six coronavirus-specific fake news datasets. To exhaustively analyze performance, we present the experiments performed and the results obtained by combining both modalities using early fusion and four types of late fusion techniques. The proposed framework is validated by comparisons with state-of-the-art fake news detection mechanisms, and our models outperform each of them.

Multi-view Teacher–Student Network

Tian Y., Sun S., Tang J.

16 pages

Abstract: © 2021 Elsevier Ltd. Multi-view learning aims to fully exploit view-consistency and view-discrepancy for performance improvement. Knowledge Distillation (KD), characterized by the so-called “Teacher–Student” (T-S) learning framework, can transfer information learned from one model to another. Inspired by knowledge distillation, we propose a Multi-view Teacher–Student Network (MTS-Net), which combines knowledge distillation and multi-view learning into a unified framework. We first redefine the teacher and student for the multi-view case. Then MTS-Net is built by optimizing both the view classification loss and the knowledge distillation loss in an end-to-end training manner. We further extend MTS-Net to image recognition tasks and present a multi-view Teacher–Student framework with convolutional neural networks called MTSCNN. To the best of our knowledge, MTS-Net and MTSCNN bring new insight into extending the Teacher–Student framework to tackle the multi-view learning problem. We theoretically verify the mechanism of MTS-Net and MTSCNN, and comprehensive experiments demonstrate the effectiveness of the proposed methods.
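
As background for the "classification loss plus knowledge distillation loss" objective mentioned in this abstract, here is a minimal sketch of the classic single-teacher distillation loss (cross-entropy on the hard label combined with a temperature-softened KL term). It is the textbook formulation, not MTS-Net itself: the abstract does not specify how teacher and student are redefined for multiple views, and the function names, `temperature` and `weight` defaults are assumptions.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax, shifted by the max for numerical stability."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, label,
                      temperature=2.0, weight=0.5):
    """(1 - weight) * CE(student, label) + weight * T^2 * KL(teacher || student)."""
    # Hard-label cross-entropy at temperature 1.
    hard = -math.log(softmax(student_logits)[label])
    # Soft-target KL divergence between temperature-softened distributions.
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    soft = sum(p * math.log(p / q) for p, q in zip(p_teacher, p_student) if p > 0)
    return (1.0 - weight) * hard + weight * temperature ** 2 * soft


# When student and teacher agree, only the hard-label term remains.
agree = distillation_loss([0.0, 0.0], [0.0, 0.0], label=0)
disagree = distillation_loss([0.0, 0.0], [2.0, 0.0], label=0)
```

The `T^2` factor is the usual scaling that keeps the soft-target gradient magnitude comparable across temperatures.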

Functional connectivity inference from fMRI data using multivariate information measures

Li Q.

13 pages

Abstract: © 2021 Elsevier Ltd. Shannon's entropy, or an extension of it, can be used to quantify information transmission between or among variables. Mutual information is the pair-wise information that captures nonlinear relationships between variables. It is more robust than linear correlation methods. Beyond mutual information, two generalizations are defined for multivariate distributions: interaction information (or co-information) and total correlation (or multi-mutual information). In comparison to mutual information, interaction information and total correlation are underutilized and poorly studied in applied neuroscience research. How interaction information and total correlation quantify information flow between brain regions has not been explicitly explained in neuroscience. This article aims to clarify the distinctions between the neuroscience concepts of mutual information, interaction information, and total correlation. Additionally, we propose a novel method for determining the interaction information between three variables using total correlation and conditional mutual information. We also address how to apply it properly in practical situations. We provide both simulation experiments and real neural studies to estimate functional connectivity in the brain with the above three higher-order information-theoretic approaches. We found that interaction information and total correlation were both robust in capturing redundant information among multiple variables, and that they could capture both well-known and yet-to-be-discovered functional brain connections.
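
The three quantities contrasted in this abstract can be made concrete on a small discrete distribution. The sketch below uses the standard textbook definitions in terms of Shannon entropy; it is not the estimator proposed in the paper, and the function names and the XOR toy distribution are assumptions chosen for illustration. Note that sign conventions for interaction information differ between authors.

```python
import math
from collections import defaultdict


def entropy(joint, axes):
    """Shannon entropy (bits) of the marginal over the given variable indices."""
    marginal = defaultdict(float)
    for outcome, p in joint.items():
        marginal[tuple(outcome[i] for i in axes)] += p
    return -sum(p * math.log2(p) for p in marginal.values() if p > 0)


def mutual_information(joint, a, b):
    """I(A;B) = H(A) + H(B) - H(A,B)."""
    return entropy(joint, [a]) + entropy(joint, [b]) - entropy(joint, [a, b])


def conditional_mutual_information(joint, a, b, c):
    """I(A;B|C) = H(A,C) + H(B,C) - H(A,B,C) - H(C)."""
    return (entropy(joint, [a, c]) + entropy(joint, [b, c])
            - entropy(joint, [a, b, c]) - entropy(joint, [c]))


def total_correlation(joint, axes):
    """Sum of marginal entropies minus the joint entropy."""
    return sum(entropy(joint, [i]) for i in axes) - entropy(joint, list(axes))


def co_information(joint, a, b, c):
    """I(A;B) - I(A;B|C); some authors use the opposite sign."""
    return (mutual_information(joint, a, b)
            - conditional_mutual_information(joint, a, b, c))


# Toy distribution: X and Y are independent fair bits and Z = X XOR Y.
xor_joint = {(x, y, x ^ y): 0.25 for x in (0, 1) for y in (0, 1)}
mi_xy = mutual_information(xor_joint, 0, 1)
cmi_xy_z = conditional_mutual_information(xor_joint, 0, 1, 2)
tc_xyz = total_correlation(xor_joint, [0, 1, 2])
co_xyz = co_information(xor_joint, 0, 1, 2)
```

In the XOR example the pairwise mutual information between X and Y is zero even though the three variables share one bit of total correlation, which is the kind of higher-order dependence that pairwise measures miss.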

An inertial neural network approach for robust time-of-arrival localization considering clock asynchronization

Xu C., Liu Q.

9 pages

Abstract: © 2021 Elsevier Ltd. This paper presents an inertial neural network to solve the source localization optimization problem with an l1-norm objective function based on the time-of-arrival (TOA) localization technique. The convergence and stability of the inertial neural network are analyzed by the Lyapunov function method. An inertial neural network iterative approach is further used to find a better solution among the solutions obtained with different inertial parameters. Furthermore, clock asynchronization is considered in the TOA l1-norm model for more general real applications, and the corresponding inertial neural network iterative approach is addressed. Both numerical simulations and real data are considered in the experiments. In the simulation experiments, the noise contains uncorrelated zero-mean Gaussian noise and uniformly distributed outliers. In the real experiments, the data are obtained using ultra-wideband (UWB) hardware modules. Whether or not there is clock asynchronization, the results show that the proposed approach can always find a more accurate source position than some of the existing algorithms, which implies that the proposed approach is more effective than the compared ones.
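
To make the "l1-norm objective" concrete, the sketch below evaluates a common form of the TOA cost: the sum of absolute differences between measured ranges and the distances from a candidate position to each sensor. This is an assumed, generic formulation rather than the model in the paper; in particular, treating clock asynchronization as a single additive range `offset` is a simplifying assumption, and the sensor layout is hypothetical. The inertial neural network solver itself is not reproduced here.

```python
import math


def toa_l1_objective(position, anchors, ranges, offset=0.0):
    """Sum of absolute range residuals |r_i - (||p - a_i|| + offset)|."""
    return sum(abs(r - (math.dist(position, a) + offset))
               for a, r in zip(anchors, ranges))


anchors = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]  # hypothetical sensor positions
source = (3.0, 4.0)                                # hypothetical true source
ranges = [math.dist(source, a) for a in anchors]   # noise-free range measurements

cost_at_source = toa_l1_objective(source, anchors, ranges)
cost_elsewhere = toa_l1_objective((0.0, 0.0), anchors, ranges)

# With a common range bias, the cost vanishes only if the offset is modeled too.
biased = [r + 0.5 for r in ranges]
cost_biased = toa_l1_objective(source, anchors, biased, offset=0.5)
```

Because each residual enters through an absolute value rather than a square, a single outlying measurement contributes linearly to the cost, which is the usual motivation for l1 objectives under outliers.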

Dilated projection correction network based on autoencoder for hyperspectral image super-resolution

Wang X., Ma J., Jiang J., Zhang X.-P., ...

13 pages

Abstract: © 2021 Elsevier Ltd. This paper focuses on improving the spatial resolution of hyperspectral images (HSIs) by taking prior information into consideration. In recent years, single HSI super-resolution (SR) methods based on deep learning have achieved good performance. However, most of them simply apply general image super-resolution deep networks to hyperspectral data, thus ignoring some specific characteristics of hyperspectral data itself. In order to make full use of the spectral information of the HSI, we transform the HSI SR problem from the image domain into the abundance domain using a dilated projection correction network with an autoencoder, termed aeDPCN. In particular, we first encode the low-resolution HSI into an abundance representation and preserve the spectral information in the decoder network, which can largely reduce the computational complexity. Then, to enhance the spatial resolution of the abundance embedding, we super-resolve the embedding in a coarse-to-fine manner with the dilated projection correction network, where a back-projection strategy is introduced to further eliminate spectral distortion. Finally, the predicted images are derived by the same decoder, which increases the stability of our method, even at a large upscaling factor. Extensive experiments on real hyperspectral image scenes demonstrate the superiority of our method over the state of the art in terms of accuracy and efficiency.

Event-centric multi-modal fusion method for dense video captioning

Chang Z., Zhao D., Chen H., Li J., ...

10 pages

Abstract: © 2021 Elsevier Ltd. Dense video captioning aims to automatically describe several events that occur in a given video, which most state-of-the-art models accomplish by locating and describing multiple events in an untrimmed video. Despite much progress in this area, most current approaches only encode visual features in the event location phase and neglect the relationships between events, which may degrade the consistency of the descriptions within the same video. Thus, in the present study, we attempted to exploit visual–audio cues to generate event proposals and to enhance event-level representations by capturing their temporal and semantic relationships. Furthermore, to compensate for the major limitation of not fully utilizing multi-modal information in the description process, we developed an attention-gating mechanism that dynamically fuses and regulates the multi-modal information. In summary, we propose an event-centric multi-modal fusion approach for dense video captioning (EMVC) to capture the relationships between events and effectively fuse multi-modal information. We conducted comprehensive experiments to evaluate the performance of EMVC on the benchmark ActivityNet Captions and YouCook2 data sets. The experimental results showed that our model achieved impressive performance compared with state-of-the-art methods.

A hybridization of distributed policy and heuristic augmentation for improving federated learning approach

Polap D., Wozniak M.

11 pages

Abstract: © 2021 The Author(s). Modifying existing models of classifier operation is primarily aimed at increasing effectiveness and minimizing training time. An additional advantage is the ability to quickly adapt a given solution to the real needs of the market. In this paper, we propose a method that can implement various classifiers using the federated learning concept while taking parallelism into account. Another important element is the analysis and selection of the best classifier depending on its reliability, measured on separate datasets extended with new, augmented samples. The proposed augmentation technique involves image processing techniques, neural architectures, and heuristic methods, and it improves operation in federated learning by increasing the role of the server. The proposal has been presented and tested on the fruit image classification problem. The experiments conducted show that the described technique can be very useful as an implementation method even in the case of a small database. The results are discussed with respect to their advantages and disadvantages in practical applications, such as higher accuracy.
