Journal information
Pattern Recognition
Pergamon
0031-3203

Pattern Recognition (indexed in: SCI, AHCI, ISTP, EI)
Officially published
Years indexed

    Special Issue on Conformal and Probabilistic Prediction with Applications: Preface

    Gammerman, Alexander; Vovk, Vladimir; Cristani, Marco
    1 page

    An attention-enhanced cross-task network to analyse lung nodule attributes in CT images

    Fu, Xiaohang; Bi, Lei; Kumar, Ashnil; Fulham, Michael; ...
    12 pages
    Abstract: Accurate characterization of visual attributes such as spiculation, lobulation, and calcification of lung nodules in computed tomography (CT) images is critical in cancer management. The characterization of these attributes is often subjective, which may lead to high inter- and intra-observer variability. Furthermore, lung nodules are often heterogeneous in the cross-sectional image slices of a 3D volume. Current state-of-the-art methods that score multiple attributes rely on deep learning-based multi-task learning (MTL) schemes. These methods, however, extract shared visual features across attributes and then examine each attribute without explicitly leveraging their inherent intercorrelations. Furthermore, current methods treat each slice with equal importance without considering their relevance or heterogeneity, which limits performance. In this study, we address these challenges with a new convolutional neural network (CNN) based MTL model that incorporates multiple attention-based learning modules to simultaneously score 9 visual attributes of lung nodules in CT image volumes. Our model processes entire nodule volumes of arbitrary depth and uses a slice attention module to filter out irrelevant slices. We also introduce cross attribute and attribute specialization attention modules that learn an optimal amalgamation of meaningful representations to leverage relationships between attributes. We demonstrate that our model outperforms previous state-of-the-art methods at scoring attributes using the well-known public LIDC-IDRI dataset of pulmonary nodules from over 1,000 patients. Our model also performs competitively when repurposed for benign-malignant classification. Our attention modules provide easy-to-interpret weights that offer insights into the predictions of the model. (c) 2022 Elsevier Ltd. All rights reserved.
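The idea of weighting slices by relevance, rather than treating them equally, can be illustrated with a minimal sketch. This is not the authors' module: in the paper the relevance scores are learned by the network, whereas here they are passed in as plain numbers, and the names `softmax` and `slice_attention_pool` are hypothetical. The sketch only shows how softmax weights let a volume of arbitrary depth collapse into one feature vector while down-weighting irrelevant slices.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def slice_attention_pool(slice_features, relevance_scores):
    """Collapse any number of per-slice feature vectors into one vector.

    slice_features: list of equal-length feature vectors, one per slice.
    relevance_scores: one score per slice (assumed given; learned in practice).
    Returns the weighted-average feature vector and the attention weights.
    """
    weights = softmax(relevance_scores)
    dim = len(slice_features[0])
    pooled = [
        sum(w * feat[d] for w, feat in zip(weights, slice_features))
        for d in range(dim)
    ]
    return pooled, weights

# Hypothetical 3-slice volume: the third slice is scored as far more relevant.
pooled, weights = slice_attention_pool(
    [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]], [0.0, 0.0, 10.0]
)
print(weights, pooled)
```

The returned weights are the kind of per-slice quantity that can be inspected directly, which is the sense in which attention weights are described as easy to interpret.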

    Deep attention aware feature learning for person re-Identification

    Chen, Yifan; Wang, Han; Sun, Xiaolu; Fan, Bin; ...
    13 pages
    Abstract: Visual attention has proven to be effective in improving the performance of person re-identification. Most existing methods apply visual attention heuristically by learning an additional attention map to re-weight the feature maps for person re-identification. However, such methods inevitably increase the model complexity and inference time. In this paper, we propose to incorporate the ability to predict attention maps as additional objectives in a person ReID network without changing the original structure, thus maintaining the same inference time and model size. Two kinds of attention maps are considered, making the learned feature maps aware of the person and of related body parts respectively. Globally, a holistic attention branch (HAB) is proposed so that the feature maps obtained by the backbone focus on persons, alleviating the influence of background. Locally, a partial attention branch (PAB) is proposed so that the extracted features can be decoupled into several groups that are separately responsible for different body parts, thus increasing the robustness to pose variation and partial occlusion. These two kinds of attention are universal and can be incorporated into existing ReID networks. We have tested their performance on two typical networks (TriNet [1] and Bag of Tricks [2]) and observed significant performance improvement on five widely used datasets. (c) 2022 Elsevier Ltd. All rights reserved.

    Hierarchical electricity time series prediction with cluster analysis and sparse penalty

    Zhang, Junqi; Sun, Quan; Zheng, Jianbin; Pang, Yue; ...
    10 pages
    Abstract: In big data applications, hierarchical time series prediction is an important element of decision-making and concerns the inherent aggregation consistency, which is maintained by reconciliation methods. The paper proposes a novel hierarchical electricity time series prediction method based on multiple alternative clustering time series analysis. Instead of adhering to the aggregation consistency passively, we first exploit time series mining to construct a hierarchy, and then apply an optimal reconciliation method to improve the prediction accuracy. In particular, the k-means clustering method is employed to cluster the time series many times with different k so as to produce a large number of time series clusters (patterns), and then the cluster (pattern) based hierarchies are constructed respectively. With the large number of cluster hierarchies and the original geographical hierarchy, an optimal aggregation consistency reconciliation based prediction approach is proposed. Furthermore, the sparse penalty is adapted in our method for "ideal" cluster selection to improve the prediction performance. Compared with the state-of-the-art methods on real-life datasets, our method improves the accuracy of one-step-ahead forecasts by 11.13% and 24.07% on electricity load and solar power data respectively. (c) 2022 Elsevier Ltd. All rights reserved.
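The step of clustering the same series several times with different k, then turning each clustering into its own aggregation level, can be sketched as follows. This is an illustrative toy under stated assumptions, not the paper's method: it uses a bare-bones k-means with deterministic initialisation (the first k series), omits reconciliation and the sparse penalty entirely, and the function names and toy data are hypothetical. What it demonstrates is aggregation consistency: for every k, the cluster-level series sum to the same total.

```python
def kmeans(series, k, iters=20):
    """Bare-bones k-means on equal-length series; init = first k series."""
    centers = [list(s) for s in series[:k]]
    labels = [0] * len(series)
    for _ in range(iters):
        # Assign each series to its nearest center (squared Euclidean distance).
        labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(s, centers[c])))
            for s in series
        ]
        # Recompute each center as the mean of its members.
        for c in range(k):
            members = [s for s, lab in zip(series, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def cluster_aggregates(series, labels, k):
    """Sum the member series of each cluster, time step by time step."""
    agg = [[0.0] * len(series[0]) for _ in range(k)]
    for s, lab in zip(series, labels):
        for t, value in enumerate(s):
            agg[lab][t] += value
    return agg

def build_hierarchies(series, ks):
    """One alternative middle level per k: {k: list of cluster aggregate series}."""
    return {k: cluster_aggregates(series, kmeans(series, k), k) for k in ks}

# Hypothetical load profiles: two low-consumption and two high-consumption meters.
meters = [[1.0, 1.0, 1.0], [5.0, 5.0, 5.0], [1.1, 0.9, 1.0], [5.2, 4.8, 5.0]]
hierarchies = build_hierarchies(meters, ks=[1, 2, 3])
for k, level in hierarchies.items():
    print(k, level)
```

Each entry of `hierarchies` is a candidate middle level between the individual meters and the grand total; choosing among such candidates is where a sparse penalty would come in.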

    MoRE: Multi-output residual embedding for multi-label classification

    Liu, Siyu; Song, Xuehua; Ma, Zhongchen; Ganaa, Ernest Domanaanmwi; ...
    13 pages
    Abstract: Multi-label classification (MLC) is one of the challenging tasks in computer vision, where it confronts the high-dimensionality problem in both the output label and input feature spaces. This paper proposes solving MLC through multi-output residual embedding (MoRE), which learns an appropriate distance metric by analyzing the residuals between input and output spaces. Unlike traditional MLC paradigms that learn relationships between label space and feature space, our proposed approach further learns a low-rank structure in the residuals between input and output spaces. It encodes this residual projection to achieve dimension reduction in label space, enhancing the performance of the proposed algorithm on high-dimensional MLC tasks. Furthermore, considering the label correlations between instances and their neighbors, multiple residuals of the instances' neighbors are also incorporated into the proposed model to learn a more appropriate distance metric in the same way. Overall, with residual embedding learned from instances and their neighbors, the obtained metric can learn a more appropriate low-rank structure in label space to handle the high-dimensionality problem in MLC. Experimental results on several data sets, such as Cal500, Corel5k, Bibtex, Delicious, Tmc2007, 20ng, Mirflickr and Rcv1s1, demonstrate the excellent predictive performance of MoRE among state-of-the-art (SOTA) methods, such as LMMO-kNN, M3MDC, KRAM, SEEM, CPLST, CSSP, FaIE. (c) 2022 Elsevier Ltd. All rights reserved.

    Multi-level augmented inpainting network using spatial similarity

    Qin, Jia; Bai, Huihui; Zhao, Yao
    17 pages
    Abstract: Recently, multi-scale neural networks have shown promising improvements in image inpainting. However, most of them adopt a progressive scheme, in which errors at lower scales may be propagated to higher scales. To address this issue, we propose a multi-level augmented inpainting network (MLA-Net) to rationally harmonize the inter- and intra-level contexts. Here, a pyramid reconstruction structure (PRS) with three parallel levels is designed to establish the inter-level relationship, which can boost the representation of the features by integrating the texture details into semantics. Then, we propose a novel spatial similarity based attention mechanism (SSA) to ensure the intra-level local continuity between the holes and related available patches. In SSA, in order to focus on the important textures and structures rather than weighting each pixel of the feature equally, a spatial map is utilized to highlight the corresponding spatial locations during the similarity computation. Experiments on multiple challenging datasets demonstrate that MLA-Net can generate accurate results with better visual quality compared with the state-of-the-art methods. For the 256 x 256 Places2 dataset, PSNR increases by 1.02 dB, while FID decreases by 0.075. For the 256 x 256 CelebA-HQ dataset, there are 0.22 dB and 0.613 improvements in PSNR and FID. (c) 2022 Elsevier Ltd. All rights reserved.
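To put the reported PSNR gains in context, the following sketch computes PSNR under its standard definition, 10 * log10(MAX^2 / MSE), and converts a gain in dB into the implied ratio of mean squared errors. It assumes the paper uses this standard definition with 8-bit images; the function names are hypothetical, and FID is not covered here because it requires a pretrained network.

```python
import math

def psnr(reference, restored, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel lists."""
    mse = sum((a - b) ** 2 for a, b in zip(reference, restored)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)

def mse_ratio_for_gain(gain_db):
    """Ratio new_MSE / old_MSE implied by a PSNR gain of gain_db decibels."""
    return 10.0 ** (-gain_db / 10.0)

# A 1.02 dB gain implies the MSE shrinks to about 79% of its previous value.
print(round(mse_ratio_for_gain(1.02), 3))
```

Under that assumption, a gain of 1.02 dB corresponds to roughly a 21% reduction in mean squared error, and 0.22 dB to roughly 5%.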

    ASMFS: Adaptive-similarity-based multi-modality feature selection for classification of Alzheimer's disease

    Shi, Yuang; Zu, Chen; Hong, Mei; Zhou, Luping; ...
    15 pages
    Abstract: Multimodal classification methods using different modalities have great advantages over traditional single-modality-based ones for the diagnosis of Alzheimer's disease (AD) and its prodromal stage, mild cognitive impairment (MCI). With the increasing amount of high-dimensional heterogeneous data to be processed, multi-modality feature selection has become a crucial research direction for AD classification. However, traditional methods usually depict the data structure using a pre-defined similarity matrix as a prior, which makes it difficult to precisely measure the intrinsic relationship across different modalities in high-dimensional space. In this paper, we propose a novel multimodal feature selection method called Adaptive-Similarity-based Multi-modality Feature Selection (ASMFS), which performs adaptive similarity learning and feature selection simultaneously. Specifically, a similarity matrix is learned by jointly considering different modalities and, at the same time, efficient feature selection is conducted by imposing a group sparsity-inducing ℓ2,1-norm constraint. Evaluated on the Alzheimer's Disease Neuroimaging Initiative (ADNI) database with baseline MRI and FDG-PET imaging data collected from 51 AD, 43 MCI converters (MCI-C), 56 MCI non-converters (MCI-NC) and 52 normal controls (NC), we demonstrate the effectiveness and superiority of our proposed method against other state-of-the-art approaches for multi-modality classification of AD/MCI. (c) 2022 Elsevier Ltd. All rights reserved.
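The group-sparsity constraint mentioned above can be made concrete with a small sketch. The ℓ2,1-norm of a weight matrix is the sum of the Euclidean norms of its rows; penalising it pushes entire rows to zero, so a feature (one row) is either kept or dropped across all columns at once. The code below only computes the norm and reads off the surviving features; it does not perform the optimisation or the similarity learning, and the matrix layout (rows = features) and function names are assumptions made for illustration.

```python
import math

def row_norm(row):
    """Euclidean (l2) norm of one row."""
    return math.sqrt(sum(v * v for v in row))

def l21_norm(weights):
    """l2,1-norm: sum over rows of the l2 norm of each row."""
    return sum(row_norm(row) for row in weights)

def selected_features(weights, tol=1e-8):
    """Indices of features whose whole weight row is not (numerically) zero."""
    return [i for i, row in enumerate(weights) if row_norm(row) > tol]

# Hypothetical weights: 3 features (rows) x 2 columns; feature 1 is zeroed out.
W = [[3.0, 4.0], [0.0, 0.0], [0.0, 5.0]]
print(l21_norm(W))           # 5 + 0 + 5
print(selected_features(W))  # rows 0 and 2 survive
```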

    mSODANet: A network for multi-scale object detection in aerial images using hierarchical dilated convolutions

    Datla, Rajeshreddy; Babu, Sobhan Ch; Mohan, Krishna C.; Jeripothula, Prudviraj; ...
    10 pages
    Abstract: Object detection in aerial images is one of the most commonly used tasks in a wide range of computer vision applications. However, object detection is more challenging here due to the following issues: (a) the pixel occupancy varies among the different scales of objects, (b) the distribution of objects is not uniform in aerial images, (c) the appearance of an object varies with different viewpoints and illumination conditions, and (d) the number of objects, even when they belong to the same type, varies across the images. To address these issues, we propose a novel network for multi-scale object detection in aerial images using hierarchical dilated convolutions, called mSODANet. In particular, we probe a hierarchical dilated network using parallel dilated convolutions to learn the contextual information of different types of objects at multiple scales and multiple fields of view. The introduced hierarchical dilated network captures the visual information of aerial images more effectively and enhances the detection capability of the model. Further, extensive experiments conducted on three challenging publicly available datasets, i.e., Visdrone2019, DOTA (OBB & HBB), and NWPU VHR-10, demonstrate the effectiveness of the proposed mSODANet, which achieves state-of-the-art performance on all three datasets. (c) 2022 Elsevier Ltd. All rights reserved.
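How dilation lets the same kernel see a wider context can be shown with a one-dimensional toy. This is a generic illustration of dilated convolution and of running several dilation rates in parallel, not the mSODANet architecture: the 1-D signal, the all-ones kernel, and the function names are hypothetical, and real detectors apply learned 2-D kernels.

```python
def dilated_conv1d(signal, kernel, dilation=1):
    """'Valid' 1-D dilated convolution (cross-correlation form, no padding)."""
    span = dilation * (len(kernel) - 1)  # distance from first to last tap
    return [
        sum(kernel[j] * signal[i + j * dilation] for j in range(len(kernel)))
        for i in range(len(signal) - span)
    ]

def receptive_field(kernel_size, dilation):
    """Number of input positions covered by one output of a dilated kernel."""
    return dilation * (kernel_size - 1) + 1

def parallel_dilated(signal, kernel, dilations):
    """Apply the same kernel at several dilation rates, one branch per rate."""
    return {d: dilated_conv1d(signal, kernel, d) for d in dilations}

# The same 3-tap kernel covers 3, 5, and 7 inputs at dilation 1, 2, and 3.
branches = parallel_dilated([1, 2, 3, 4, 5, 6, 7], [1, 1, 1], dilations=[1, 2, 3])
print(branches)
```

Each branch uses the same three weights but covers a different field of view, which is why parallel branches are a common way to capture objects at several scales without adding parameters per scale.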

    Geometric imbalanced deep learning with feature scaling and boundary sample mining

    Wang, Zhe; Dong, Qida; Guo, Wei; Li, Dongdong; ...
    11 pages
    Abstract: Data imbalance is a significant factor affecting classification performance in computer vision. In particular, data imbalance is harmful to classification learning and representation learning. To address this issue, this paper proposes a geometric deep learning framework that combines a Feature Scaling Module (FSM) and a Boundary Samples Mining Module (BSMM). Considering the geometric information in the distributions of training samples, FSM is proposed to scale the features by the hypersphere radius of each class, which improves the representation ability of minority classes. Meanwhile, it is noteworthy that the relationships and information between samples are essential for classification. Therefore, BSMM is proposed to mine the boundary samples using a Gabriel graph, which takes these relationships into account. Finally, a loss scheduler is designed to adjust the training process of these two modules. With the scheduler, the model first learns representation and then gradually focuses more on minority classes. Extensive experiments on three benchmark datasets demonstrate the advantages of the proposed learning framework over the state-of-the-art models for solving the imbalance problem. (c) 2022 Elsevier Ltd. All rights reserved.
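A Gabriel graph connects two points when no third point lies strictly inside the ball whose diameter is the segment between them. The sketch below builds that graph by brute force and then marks as "boundary" any sample with a Gabriel neighbour of a different class. The second step is an assumption about what boundary mining could look like, not the paper's BSMM; the function names and the toy data are hypothetical, and the O(n^3) loop is only suitable for small examples.

```python
def sqdist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def gabriel_edges(points):
    """All pairs (i, j) with no other point strictly inside their diameter ball.

    Point k is strictly inside the ball with diameter (i, j) exactly when
    d(i, k)^2 + d(j, k)^2 < d(i, j)^2.
    """
    n = len(points)
    edges = []
    for i in range(n):
        for j in range(i + 1, n):
            d_ij = sqdist(points[i], points[j])
            if all(
                sqdist(points[i], points[k]) + sqdist(points[j], points[k]) >= d_ij
                for k in range(n)
                if k != i and k != j
            ):
                edges.append((i, j))
    return edges

def boundary_samples(points, labels):
    """Assumed rule: a sample is on the boundary if it has a Gabriel
    neighbour carrying a different class label."""
    found = set()
    for i, j in gabriel_edges(points):
        if labels[i] != labels[j]:
            found.update((i, j))
    return sorted(found)

# Four collinear points, two per class: only the middle pair straddles classes.
pts = [(0, 0), (1, 0), (2, 0), (3, 0)]
print(gabriel_edges(pts))
print(boundary_samples(pts, [0, 0, 1, 1]))
```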

    Unsupervised person re-identification via simultaneous clustering and mask prediction

    Yin, Junhui; Zhang, Siqing; Xie, Jiyang; Ma, Zhanyu; ...
    13 pages
    Abstract: Extracting meaningful representations is a key challenge for the person re-identification (re-ID) task, especially in the absence of ground truth labels. However, existing unsupervised approaches simply utilize pseudo labels generated from clustering to supervise the re-ID model and thus have not yet fully explored the semantic information in the data itself. This also limits the representation capabilities of learned models. To address this problem, we propose mask prediction (MaskPre) as a pretext task for unsupervised re-ID, such that the clustering network can capture more semantic information and separate the images into semantic clusters automatically. Specifically, MaskPre masks region-level features with a dynamic dropblock layer to generate differently masked views of a single image. To predict the masked regions and bridge the domain gap across views, we design a mask prediction head and a moving-average model to learn visual consistency from still images and temporal consistency during the training process. Meanwhile, we optimize the model by grouping the two masked views into the same cluster, thus enhancing the consistency across views. Experimental results on three public benchmark datasets show that our proposed method outperforms the existing state-of-the-art approaches. (c) 2022 Elsevier Ltd. All rights reserved.
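A moving-average model is commonly maintained as an exponential moving average (EMA) of the trained network's parameters, which yields a slowly changing copy that smooths out step-to-step fluctuations during training. The sketch below shows that generic update rule on a flat list of parameters. Whether the paper uses exactly this form, and with what momentum, is not stated in the abstract, so the rule, the momentum value, and the names are all assumptions.

```python
def ema_update(average_params, current_params, momentum=0.9):
    """One EMA step: avg <- momentum * avg + (1 - momentum) * current."""
    return [
        momentum * avg + (1.0 - momentum) * cur
        for avg, cur in zip(average_params, current_params)
    ]

# Hypothetical single parameter: the trained model jumps to 1.0 and stays there,
# while the moving-average copy approaches it gradually.
average = [0.0]
for step in range(3):
    average = ema_update(average, [1.0], momentum=0.9)
    print(step, average)
```

After n identical updates the average sits at 1 - momentum^n of the way to the target, so a momentum close to 1 gives a more stable but slower-moving copy.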