Journal information
Pattern Recognition
Pergamon
ISSN: 0031-3203

Indexed in: SCI, AHCI, ISTP, EI
Officially published
Years indexed

    Erlang planning network: An iterative model-based reinforcement learning with multi-perspective

    Wang, Jiao; Zhang, Lemin; He, Zhiqiang; Zhu, Can...
    10 pages
    Abstract: For model-based reinforcement learning (MBRL), one of the key challenges is modeling error, which cripples the effectiveness of model planning and causes poor robustness during training. In this paper, we propose a bi-level Erlang Planning Network (EPN) architecture, which is composed of an upper-level agent and several multi-scale parallel sub-agents, trained in an iterative way. The proposed method focuses upon the expansion of representation by environment: a multi-perspective over the world model, which presents a varied way to represent an agent's knowledge about the world that alleviates the problem of falling into local optimal points and enhances robustness during the progress of model planning. Moreover, our experiments evaluate EPN on a range of continuous-control tasks in MuJoCo; the evaluation results show that the proposed framework finds exemplar solutions faster and consistently reaches state-of-the-art performance. (c) 2022 Elsevier Ltd. All rights reserved.

    Interpolation-based nonrigid deformation estimation under manifold regularization constraint

    Zhou, Huabing; Xu, Zhichao; Tian, Yulu; Yu, Zhenghong...
    13 pages
    Abstract: This paper addresses the image/surface deformation problem by estimating interpolation functions pixel by pixel (or voxel by voxel) between control point pairs using labeled control points and unlabeled feature points as input. The labeled control points are usually selected by users and labeled through user operations; the unlabeled feature points are extracted from the source image. We formulate the interpolation function estimation at each pixel as a weighted semi-supervised learning problem. Specifically, we employ moving least squares to estimate the nonrigid deformation function according to the weights between each pixel and the labeled control points and exploit manifold regularization to preserve the intrinsic geometric information of the unlabeled feature points contained in the object. Moreover, we define the nonrigid deformation function in a reproducing kernel Hilbert space to derive a closed-form solution. To reduce the computational complexity, we also adopt a sparse approximation to realize a fast implementation. It is worth mentioning that our proposed method is a unified framework with two different basis functions. Both basis-function-based methods are applied to 2D image deformation, 3D surface deformation, and medical image registration. Extensive experiments on the data and the resulting mean opinion score (MOS) on the 2D deformation demonstrate that our methods are superior to state-of-the-art ones. (c) 2022 Elsevier Ltd. All rights reserved.
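The abstract names moving least squares as the estimator for the per-pixel deformation. As a rough illustration only, the sketch below implements plain affine moving least squares in 2D with inverse-distance weights. The weight form, the `alpha` and `eps` parameters, and the function name `mls_affine` are assumptions made for this example; the manifold regularization, the reproducing kernel Hilbert space formulation, and the sparse approximation described in the abstract are not included.

```python
def mls_affine(v, src, dst, alpha=1.0, eps=1e-12):
    """Map point v with an affine moving-least-squares fit (illustrative sketch).

    src, dst: lists of matching 2D control points (x, y); at least three
    non-collinear source points are needed for the 2x2 system to be invertible.
    """
    vx, vy = v
    # Assumed inverse-distance weights: control points near v dominate the fit.
    w = [1.0 / (((px - vx) ** 2 + (py - vy) ** 2) ** alpha + eps) for px, py in src]
    sw = sum(w)

    # Weighted centroids of the source and target control points.
    pcx = sum(wi * p[0] for wi, p in zip(w, src)) / sw
    pcy = sum(wi * p[1] for wi, p in zip(w, src)) / sw
    qcx = sum(wi * q[0] for wi, q in zip(w, dst)) / sw
    qcy = sum(wi * q[1] for wi, q in zip(w, dst)) / sw

    # A = sum_i w_i * p_hat_i^T p_hat_i and B = sum_i w_i * p_hat_i^T q_hat_i (both 2x2).
    a11 = a12 = a22 = 0.0
    b11 = b12 = b21 = b22 = 0.0
    for wi, (px, py), (qx, qy) in zip(w, src, dst):
        ux, uy = px - pcx, py - pcy
        sx, sy = qx - qcx, qy - qcy
        a11 += wi * ux * ux
        a12 += wi * ux * uy
        a22 += wi * uy * uy
        b11 += wi * ux * sx
        b12 += wi * ux * sy
        b21 += wi * uy * sx
        b22 += wi * uy * sy

    # M = A^{-1} B, using the closed-form inverse of the symmetric 2x2 matrix A.
    det = a11 * a22 - a12 * a12
    m11 = (a22 * b11 - a12 * b21) / det
    m12 = (a22 * b12 - a12 * b22) / det
    m21 = (-a12 * b11 + a11 * b21) / det
    m22 = (-a12 * b12 + a11 * b22) / det

    # f(v) = (v - p*) M + q*, with points treated as row vectors.
    dx, dy = vx - pcx, vy - pcy
    return (dx * m11 + dy * m21 + qcx, dx * m12 + dy * m22 + qcy)
```

When the target control points are a pure translation of the source points, the fitted matrix is the identity and every pixel is shifted by the same offset, which gives a quick sanity check for the sketch.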

    End-to-end weakly supervised semantic segmentation with reliable region mining

    Zhang, Bingfeng; Xiao, Jimin; Wei, Yunchao; Huang, Kaizhu...
    13 pages
    Abstract: Weakly supervised semantic segmentation is a challenging task that only takes image-level labels as supervision but produces pixel-level predictions for testing. To address such a challenging task, most current approaches first generate pseudo pixel masks that are then fed into a separate semantic segmentation network. However, these two-step approaches suffer from high complexity and are hard to train as a whole. In this work, we harness the image-level labels to produce reliable pixel-level annotations and design a fully end-to-end network to learn to predict segmentation maps. Concretely, we first leverage an image classification branch to generate class activation maps for the annotated categories, which are further pruned into tiny reliable object/background regions. Such reliable regions then serve directly as ground-truth labels for the segmentation branch, where both global information and local information sub-branches are used to generate accurate pixel-level predictions. Furthermore, a new joint loss is proposed that considers both shallow and high-level features. Despite its apparent simplicity, our end-to-end solution achieves competitive mIoU scores (val: 65.4%, test: 65.3%) on Pascal VOC compared with the two-step counterparts. By extending our one-step method to two-step, we get a new state-of-the-art performance on the Pascal VOC 2012 dataset (val: 69.3%, test: 69.2%). Code is available at: https://github.com/zbf1991/RRM. (c) 2022 Elsevier Ltd. All rights reserved.
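The pruning step mentioned in the abstract, where class activation maps are reduced to small reliable object and background regions, can be pictured with a simple two-threshold rule. The sketch below is only an assumption about how such pruning might look: the function name `mine_reliable_regions`, the threshold values, and the ignore label 255 are hypothetical, and the authors' repository should be consulted for the actual procedure.

```python
def mine_reliable_regions(cam, fg_thresh=0.8, bg_thresh=0.1, ignore_label=255):
    """Turn one normalized class activation map into sparse pseudo-labels.

    cam: 2D list of scores in [0, 1] for a single class.
    Returns a 2D list where 1 marks confident object pixels, 0 marks confident
    background pixels, and ignore_label marks everything in between.
    """
    labels = []
    for row in cam:
        out_row = []
        for score in row:
            if score >= fg_thresh:
                out_row.append(1)             # confidently object
            elif score <= bg_thresh:
                out_row.append(0)             # confidently background
            else:
                out_row.append(ignore_label)  # uncertain: excluded from the loss
        labels.append(out_row)
    return labels
```

Only the confident pixels would then contribute to a segmentation loss; the uncertain band between the two thresholds is left unlabeled rather than guessed.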

    Personalized knowledge-aware recommendation with collaborative and attentive graph convolutional networks

    Dai, Quanyu; Wu, Xiao-Ming; Fan, Lu; Li, Qimai...
    12 pages
    Abstract: Knowledge graphs (KGs) are increasingly used to solve the data sparsity and cold start problems of collaborative filtering. Recently, graph neural networks (GNNs) have been applied to build KG-based recommender systems and achieved competitive performance. However, existing GNN-based methods are either limited in their ability to capture fine-grained semantics in a KG, or insufficient in effectively modeling user-item interactions. To address these issues, we propose a novel framework with collaborative and attentive graph convolutional networks for personalized knowledge-aware recommendation. Particularly, we model the user-item graph and the KG separately and simultaneously with an efficient graph convolutional network and a personalized knowledge graph attention network, where the former aims to extract informative collaborative signals, while the latter is designed to capture fine-grained semantics. Collectively, they are able to learn meaningful node representations for predicting user-item interactions. Extensive experiments on benchmark datasets demonstrate the effectiveness of the proposed method compared with state-of-the-art methods. (c) 2022 Elsevier Ltd. All rights reserved.

    An attention-based framework for multi-view clustering on Grassmann manifold

    Wu, Danyang; Dong, Xia; Nie, Feiping; Wang, Rong...
    16 pages
    Abstract: The key problem of multi-view clustering is to handle the inconsistency among multiple views. This article proposes an attention-based framework for multi-view clustering on Grassmann manifold (AMCGM). To be specific, the proposed AMCGM framework aims to learn a representative element on Grassmann manifold with the following four highlights: 1) AMCGM framework performs an attention-based weighted-learning scheme to capture the difference of views; 2) The clustering results can be directly generated by the structured graph learned via AMCGM, avoiding the randomness caused by traditional label-generation procedures, such as K-means clustering; 3) AMCGM has high extensibility since it can generate many multi-view clustering models on Grassmann manifold; 4) On Grassmann manifold, the relationship between the projection metric (PM)-based multi-view clustering model and squared projection metric (SPM)-based model is studied. Based on AMCGM framework, we propose some generated models and provide some useful conclusions. Moreover, to solve the optimization problems involved in the proposed AMCGM framework and generated models, we propose an efficient iterative algorithm and provide rigorous convergence analysis. Extensive experimental results demonstrate the superb performance of our framework. (c) 2022 Elsevier Ltd. All rights reserved.

    The Cobb-Douglas Learning Machine

    Maldonado, Sebastian; Lopez, Julio; Carrasco, Miguel
    11 pages
    Abstract: In this paper, we propose a novel machine learning approach based on robust optimization. Our proposal defines the task of maximizing the two class accuracies of a binary classification problem as a Cobb-Douglas function. This function is well known in production economics and is used to model the relationship between two or more inputs as well as the quantity produced by those inputs. A robust optimization problem is defined to construct the decision function. The goal of the model is to classify each training pattern correctly, up to a given class accuracy, even for the worst possible data distribution. We demonstrate the theoretical advantages of the Cobb-Douglas function in terms of the properties of the resulting second-order cone programming problem. Important extensions are proposed and discussed, including the use of kernel functions and regularization. Experiments performed on several classification datasets confirm these advantages, leading to the best average performance in comparison to various alternative classifiers. (c) 2022 Elsevier Ltd. All rights reserved.
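To make the idea of scoring two class accuracies with a Cobb-Douglas function concrete, the sketch below evaluates the textbook two-input form with exponents that sum to one. This is an assumption for illustration: the function name `cobb_douglas`, the parameter `alpha`, and the constant-returns restriction are not taken from the paper, and the robust second-order cone formulation is not reproduced.

```python
def cobb_douglas(acc_pos, acc_neg, alpha=0.5):
    """Cobb-Douglas combination of two class accuracies (illustrative sketch).

    acc_pos, acc_neg: accuracies on the positive and negative class, in [0, 1].
    alpha: assumed exponent on the positive-class accuracy; the negative class
    receives 1 - alpha, so alpha = 0.5 reduces to the geometric mean.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return (acc_pos ** alpha) * (acc_neg ** (1.0 - alpha))
```

Because the combination is multiplicative, a classifier that ignores one class entirely scores zero when both exponents are positive, whereas the arithmetic mean of the two accuracies would still report 0.5.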

    Causal GraphSAGE: A robust graph method for classification based on causal sampling

    Zhang, Tao; Shan, Hao-Ran; Little, Max A.
    11 pages
    Abstract: GraphSAGE is a widely-used graph neural network for classification, which generates node embeddings in two steps: sampling and aggregation. In this paper, we introduce causal inference into the GraphSAGE sampling stage, and propose Causal GraphSAGE (C-GraphSAGE) to improve the robustness of the classifier. In C-GraphSAGE, we use causal bootstrapping to obtain a weighting between the target node's neighbors and their label. Then, these weights are used to resample the node's neighbors to enforce the robustness of the sampling stage. Finally, an aggregation function is trained to integrate the features of the selected neighbors to obtain the embedding of the target node. Experimental results on the Cora, Pubmed, and Citeseer citation datasets show that the classification performance of C-GraphSAGE is equivalent to that of GraphSAGE, GCN, GAT, and RL-GraphSAGE in the case of no perturbation, and outperforms these as the perturbation ratio increases. (c) 2022 Elsevier Ltd. All rights reserved.
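The two-step structure described in the abstract, weighted resampling of neighbors followed by aggregation, can be sketched in a few lines. The example below is a simplification under stated assumptions: the per-neighbor weights are taken as given (causal bootstrapping itself is not implemented), the aggregator is a plain untrained mean rather than a learned function, and all names (`resample_neighbors`, `mean_aggregate`, `embed`) are hypothetical.

```python
import random


def resample_neighbors(neighbors, weights, k, rng=None):
    """Draw k neighbors with replacement, proportionally to the given weights."""
    if rng is None:
        rng = random.Random(0)  # fixed seed so the sketch is reproducible
    return rng.choices(neighbors, weights=weights, k=k)


def mean_aggregate(features, node_ids):
    """Average the feature vectors of the sampled nodes, dimension by dimension."""
    dim = len(features[node_ids[0]])
    return [sum(features[n][d] for n in node_ids) / len(node_ids) for d in range(dim)]


def embed(target, graph, features, weights, k, rng=None):
    """Embed a target node from a weighted resample of its neighborhood.

    graph: dict mapping node -> list of neighbor ids.
    weights: dict mapping node -> list of sampling weights aligned with graph[node];
             assumed here to come from some upstream weighting procedure.
    """
    sampled = resample_neighbors(graph[target], weights[target], k, rng)
    return mean_aggregate(features, sampled)
```

Putting all sampling weight on one neighbor makes the embedding equal to that neighbor's features, which shows how the weights steer the aggregation away from neighbors judged unreliable.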

    Searching part-specific neural fabrics for human pose estimation

    Yang, Sen; Yang, Wankou; Cui, Zhen
    14 pages
    Abstract: Neural architecture search (NAS) has emerged in many domains to jointly learn the architectures and weights of neural networks. The core spirit behind NAS is to automatically search neural architectures for target tasks with better performance-efficiency trade-offs. However, existing approaches emphasize searching only a single architecture with less human intervention to replace a human-designed neural network, making the search process almost independent of the domain knowledge. In this paper, we aim to apply NAS for human pose estimation and we ask: when NAS meets this localization task, can the articulated human body structure help to search better task-specific architectures? To this end, we first design a new neural architecture search space, Cell-based Neural Fabric (CNF), to learn micro as well as macro neural architecture using a differentiable search strategy. Then, by viewing locating human parts as multiple disentangled prediction sub-tasks, we exploit the compositionality of human body structure as guidance to search multiple part-specific CNFs specialized for different human parts. After the search, all these part-specific neural fabrics have been tailored with distinct micro and macro architecture parameters. The results show that such knowledge-guided NAS-based model outperforms a hand-crafted part-based baseline model, and the resulting multiple part-specific architectures gain significant performance improvement against a single NAS-based architecture for the whole body. The experiments on MPII and COCO datasets show that our models achieve comparable performance against the state-of-the-art methods while being relatively lightweight. (c) 2022 Elsevier Ltd. All rights reserved.

    HandyPose: Multi-level framework for hand pose estimation

    Gupta, Divyansh; Artacho, Bruno; Savakis, Andreas
    10 pages
    Abstract: Hand pose estimation is a challenging task due to the large number of degrees of freedom and the frequent occlusions of joints. To address these challenges, we propose HandyPose, a single-pass, end-to-end trainable architecture for 2D hand pose estimation using a single RGB image as input. Adopting an encoder-decoder framework with multi-level features, along with a novel multi-level waterfall atrous spatial pooling module for multi-scale representations, our method achieves high accuracy in hand pose while maintaining manageable size complexity and modularity of the network. HandyPose takes a multi-scale approach to representing context by incorporating spatial information at various levels of the network to mitigate the loss of resolution due to pooling. Our advanced multi-level waterfall module leverages the efficiency of progressive cascade filtering while maintaining larger fields-of-view through the concatenation of multi-level features from different levels of the network in the waterfall module. The decoder incorporates both the waterfall and multi-scale features for the generation of accurate joint heatmaps in a single stage. Our results demonstrate state-of-the-art performance on popular datasets and show that HandyPose is a robust and efficient architecture for 2D hand pose estimation. (c) 2022 Elsevier Ltd. All rights reserved.

    Molecular substructure graph attention network for molecular property identification in drug discovery

    Ye, Xian-bin; Guan, Quanlong; Luo, Weiqi; Fang, Liangda...
    13 pages
    Abstract: Molecular machine learning based on graph neural networks has broad prospects in molecular property identification in drug discovery. Molecules contain many types of substructures that may affect their properties. However, conventional methods based on graph neural networks only consider the interaction information between nodes, which may lead to the oversmoothing problem in the multi-hop operations. These methods may not efficiently express the interacting information between molecular substructures. Hence, we develop a Molecular SubStructure Graph ATtention (MSSGAT) network to capture the interacting substructural information, which constructs a composite molecular representation with multi-substructural feature extraction and processes such features effectively with a nested convolution plus readout scheme. We evaluate the performance of our model on 13 benchmark data sets, of which 9 are from the ChEMBL database and 4 are the SIDER, BBBP, BACE, and HIV data sets. Extensive experimental results show that MSSGAT achieves the best results on most of the data sets compared with other state-of-the-art methods. (c) 2022 Elsevier Ltd. All rights reserved.