Journal information
Pattern Recognition
Pergamon
ISSN: 0031-3203

Indexed in: SCI, AHCI, ISTP, EI
Status: formally published

    Learning upper patch attention using dual-branch training strategy for masked face recognition

    Zhang, Yuxuan; Wang, Xin; Shakeel, M. Saad; Wan, Hao; ...
    15 pages
    Abstract: In the context of the COVID-19 pandemic, recognition of masked face images is a challenging problem, as most of the facial components become invisible. By utilizing the prior information that mask occlusion is located in the lower half of the face, we propose a dual-branch training strategy to guide the model to focus on the upper half of the face to extract robust features for masked face recognition (MFR). During training, the features learned at the intermediate layers of the global branch are fed to our proposed attention module, named Upper Patch Attention (UPA), which acts as a local branch. Both branches are jointly optimized to enhance the feature extraction from non-occluded regions. We also propose a self-attention module, which integrates into the backbone network to enhance the interaction between the channels and spatial locations in the learning process. Extensive experiments on synthetic and real masked face datasets demonstrate the effectiveness of our method. (c) 2022 Elsevier Ltd. All rights reserved.

    Transferring discriminative knowledge via connective momentum clustering on person re-identification

    Lu, Yichen; Deng, Weihong
    10 pages
    Abstract: Unsupervised domain adaptation in person re-identification remains a challenge to learning discriminative representations due to the absence of labels in the target domain. Clustering could provide pseudo-labels, but the limitation mainly comes from imperfect clustering and noisy pseudo-labels. To address this drawback, we propose the Connective Momentum Clustering (CMC) framework, which builds a connection estimator via graph convolutional networks to transfer rich connection knowledge from the annotation space of source data to the target domain. It estimates connections from context to reveal relationships between unlabeled data and helps to discover more reliable clusters. With a momentum mechanism, stable pseudo-labels are updated iteratively with confidence and refined consistently to encourage more discriminative networks. Meanwhile, we notice that the huge domain gap between source and target domains results in severe pollution in BatchNorm layers. To tackle this problem, we normalize the data streams separately to decouple the different distributions and further boost the performance in the target domain. We adopt our CMC framework on mainstream tasks and achieve 80.2% mAP / 91.3% Rank-1 on the Duke -> Market task and 70.4% mAP / 82.4% Rank-1 on the Market -> Duke task. (c) 2022 Elsevier Ltd. All rights reserved.
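
The idea of normalizing source and target data separately can be illustrated with a minimal sketch. This is not the authors' implementation: it uses scalar features, plain batch statistics, and hypothetical function names (`normalize`, `normalize_per_domain`) purely to show why per-domain statistics keep one domain from skewing the other.

```python
import math

def normalize(batch, eps=1e-5):
    """Standardize a list of scalar features with its own mean and variance."""
    mean = sum(batch) / len(batch)
    var = sum((v - mean) ** 2 for v in batch) / len(batch)
    return [(v - mean) / math.sqrt(var + eps) for v in batch]

def normalize_per_domain(source, target):
    """Normalize each domain with its own statistics (illustrative assumption:
    one statistic set per domain, instead of one shared across both)."""
    return normalize(source), normalize(target)

# Source and target features live on very different scales.
source = [0.0, 2.0]
target = [100.0, 104.0]
src_norm, tgt_norm = normalize_per_domain(source, target)
# With shared statistics, the mean and variance would be dominated by the
# gap between the two domains rather than by variation within each domain.
mixed = normalize(source + target)
```

After separate normalization both domains are centered and scaled on their own terms, whereas `mixed` mostly encodes which domain a sample came from.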

    Low-resolution human pose estimation

    Wang, Chen; Zhang, Feng; Zhu, Xiatian; Ge, Shuzhi Sam; ...
    10 pages
    Abstract: Human pose estimation has achieved significant progress on images with high imaging resolution. However, low-resolution imagery data bring nontrivial challenges which are still under-studied. To fill this gap, we start by investigating existing methods and reveal that the most dominant heatmap-based methods suffer more severe performance degradation from low resolution, and that offset learning is an effective strategy. Building on this observation, in this work we propose a novel Confidence-Aware Learning (CAL) method which further addresses two fundamental limitations of existing offset learning methods: inconsistent training and testing, and decoupled heatmap and offset learning. Specifically, CAL selectively weighs the learning of heatmap and offset with respect to the ground truth and the most confident prediction, whilst capturing the statistical importance of model output in a mini-batch learning manner. Extensive experiments conducted on the COCO benchmark show that our method significantly outperforms the state-of-the-art methods for low-resolution human pose estimation. (c) 2022 Elsevier Ltd. All rights reserved.
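
For readers unfamiliar with offset learning, the following is a minimal sketch of how a keypoint is commonly decoded from a coarse heatmap plus a sub-cell offset. It is generic and does not implement CAL; the function name `decode_keypoint`, the convention that offsets are expressed in heatmap cells, and the `stride` parameter are all assumptions made for illustration.

```python
def decode_keypoint(heatmap, offset_x, offset_y, stride):
    """Decode one keypoint location in input-image pixels.

    heatmap, offset_x, offset_y: 2D lists with the same shape (rows x cols).
    stride: downsampling factor between the input image and the heatmap.
    Assumption: offsets are fractions of a heatmap cell.
    """
    rows, cols = len(heatmap), len(heatmap[0])
    # Pick the most confident heatmap cell.
    r, c = max(
        ((i, j) for i in range(rows) for j in range(cols)),
        key=lambda rc: heatmap[rc[0]][rc[1]],
    )
    # Refine the coarse cell index with the predicted offset, then rescale.
    x = (c + offset_x[r][c]) * stride
    y = (r + offset_y[r][c]) * stride
    return x, y
```

At low resolution the heatmap grid is coarse, so the offset term carries most of the localization precision, which is why the offset is read at the same cell that the heatmap selects.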

    Deep face recognition for dim images

    Huang, Yu-Hsuan; Chen, Homer H.
    11 pages
    Abstract: The performance of many state-of-the-art deep face recognition models deteriorates significantly for images captured under low illumination, mainly because the features of dim probe face images cannot match well with those of normal-illumination gallery images. The issue cannot be satisfactorily addressed by enhancing the illumination of face images and performing face recognition on the resulting images alone. We propose a novel deep face recognition framework that consists of a feature restoration network, a feature extraction network, and an embedding matching module. The feature restoration network adopts a two-branch structure based on the convolutional neural network to generate a feature image from the raw image and the illumination-enhanced image. The feature extraction network encodes the feature image into an embedding, which is then used by the embedding matching module for face verification and identification. The overall verification accuracy is improved from 1.1% to 6.7% when tested on the Specs on Faces (SoF) dataset. For face identification, the rank-1 identification accuracy is improved by 2.8%. (c) 2022 Published by Elsevier Ltd.

    Rule extraction with guarantees from regression models

    Johansson, Ulf; Sonstrod, Cecilia; Lofstrom, Tuwe; Bostrom, Henrik; ...
    9 pages
    Abstract: Tools for understanding and explaining complex predictive models are critical for user acceptance and trust. One such tool is rule extraction, i.e., approximating opaque models with less powerful but interpretable models. Pedagogical (or black-box) rule extraction, where the interpretable model is induced using the original training instances, but with the predictions from the opaque model as targets, has many advantages compared to the decompositional (white-box) approach. Most importantly, pedagogical methods are agnostic to the kind of opaque model used, and any learning algorithm producing interpretable models can be employed for the learning step. The pedagogical approach has, however, one main problem, clearly limiting its utility. Specifically, while the extracted models are trained to mimic the opaque model, there are absolutely no guarantees that this will transfer to novel data. This potentially low test set fidelity must be considered a severe drawback, in particular when the extracted models are used for explanation and analysis. In this paper, a novel approach, solving the problem with test set fidelity by utilizing the conformal prediction framework, is suggested for extracting interpretable regression models from opaque models. The extracted models are standard regression trees, but augmented with valid prediction intervals in the leaves. Depending on the exact setup, the use of conformal prediction guarantees that either the test set fidelity or the test set accuracy will be equal to a preset confidence level, in the long run. In the extensive empirical investigation, using 20 publicly available data sets, the validity of the extracted models is demonstrated. In addition, it is shown how normalization can be used to provide individualized prediction intervals, thus providing highly informative extracted models. (c) 2022 The Author(s). Published by Elsevier Ltd. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
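
The kind of guarantee described above can be illustrated with a minimal sketch of split (inductive) conformal regression. This is a generic textbook construction, not the paper's exact procedure: the function name `conformal_interval` is hypothetical, and the calibration residuals are assumed to be absolute errors on a held-out calibration set, measured against either the opaque model's predictions (for fidelity) or the true targets (for accuracy).

```python
import math

def conformal_interval(cal_residuals, prediction, confidence=0.9):
    """Symmetric prediction interval around `prediction`.

    cal_residuals: absolute errors |target - prediction| on a calibration set
                   that was not used to fit the model.
    confidence:    desired long-run coverage, e.g. 0.9.
    """
    n = len(cal_residuals)
    # Rank of the calibration residual that serves as the half-width.
    k = math.ceil((n + 1) * confidence)
    if k > n:
        # Too few calibration points to support this confidence level.
        return (-math.inf, math.inf)
    half_width = sorted(cal_residuals)[k - 1]
    return (prediction - half_width, prediction + half_width)
```

In a regression tree, one way to apply this would be to compute the interval per leaf from the calibration instances that fall into that leaf; that per-leaf arrangement is an assumption about the setup rather than something stated in the abstract.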

    An efficient framework for zero-shot sketch-based image retrieval

    Sridharan, Sridha; Goan, Ethan; Fookes, Clinton; Tursun, Osman; ...
    11 pages
    Abstract: Zero-shot sketch-based image retrieval (ZS-SBIR) has recently attracted the attention of the computer vision community due to its real-world applications, and the more realistic and challenging setting that it presents over SBIR. ZS-SBIR inherits the main challenges of multiple computer vision problems including content-based image retrieval (CBIR), zero-shot learning and domain adaptation. The majority of previous studies using deep neural networks have achieved improved results by either projecting sketches and images into a common low-dimensional space, or transferring knowledge from seen to unseen classes. However, those approaches are trained with complex frameworks composed of multiple deep convolutional neural networks (CNNs) and are dependent on category-level word labels. This increases the requirements for training resources and datasets. In comparison, we propose a simple and efficient framework that does not require high computational training resources, and learns the semantic embedding space from a vision model rather than a language model, as is done by related studies. Furthermore, at training and inference stages our method only uses a single CNN. In this work, a pre-trained ImageNet CNN (i.e., ResNet50) is fine-tuned with three proposed learning objectives: domain-balanced quadruplet loss, semantic classification loss, and semantic knowledge preservation loss. The domain-balanced quadruplet and semantic classification losses are introduced to learn discriminative, semantic and domain-invariant features by considering ZS-SBIR as an object detection and verification problem. To preserve semantic knowledge learned with ImageNet and exploit it for unseen categories, the semantic knowledge preservation loss is proposed. To reduce computational cost and increase the accuracy of the semantic knowledge distillation process, ground-truth semantic knowledge is prepared in a class-oriented fashion prior to training. Extensive experiments are conducted on three challenging ZS-SBIR datasets: Sketchy Extended, TU-Berlin Extended and QuickDraw Extended. The proposed method achieves state-of-the-art results, and outperforms the majority of related works by a substantial margin. (c) 2022 Elsevier Ltd. All rights reserved.

    Robust image matching via local graph structure consensus

    Jiang, Xingyu; Xia, Yifan; Zhang, Xiao-Ping; Ma, Jiayi; ...
    14 pages
    Abstract: Image matching plays a vital role in many computer vision tasks, and this paper focuses on the mismatch removal problem of feature-based matching. We formulate the problem into a general yet effective optimization framework based on graph matching by combining integer quadratic programming with a compensation term for discouraging matches, termed Local Graph Structure Consensus (LGSC). Considering the local area similarity of those potential true matches, we design a local graph structure for preserving geometric topology, which contains a local indicator vector and a local affinity vector for each correspondence. The local indicator vector is utilized for edge construction, while the local affinity vector represents the match correctness of the nodes and edges between two graphs. In particular, the ranking shift with scale and rotation invariance is exploited to represent the node affinity. Ultimately, we derive a closed-form solution with linearithmic time and linear space complexity. Moreover, a multi-scale and iterative graph construction strategy is proposed to promote the performance of our method in terms of robustness and effectiveness. Extensive experiments on various real image datasets demonstrate that our LGSC can achieve superior performance over current state-of-the-art approaches. (c) 2022 Elsevier Ltd. All rights reserved.

    Meta-seg: A survey of meta-learning for image segmentation

    Luo, Shuai; Li, Yujie; Gao, Pengxiang; Wang, Yichuan; ...
    14 pages
    Abstract: A well-performing deep learning model in image segmentation relies on a large amount of labeled data. However, it is hard to obtain sufficient high-quality raw data in industrial applications. Meta-learning, one of the most promising research areas, is recognized as a powerful tool for approaching image segmentation. To this end, this paper reviews the state-of-the-art image segmentation methods based on meta-learning. We first introduce the background of image segmentation, including the methods and metrics of image segmentation. Second, we review the timeline of meta-learning and give a more comprehensive definition of meta-learning. The differences between meta-learning and other similar methods are compared comprehensively. Then, we categorize the existing meta-learning methods into model-based, optimization-based, and metric-based. For each category, the commonly used meta-learning models are discussed in the context of image segmentation. Next, we conduct comprehensive computational experiments to compare these models on two public datasets: ISIC-2018 and Covid-19. Finally, the future trends of meta-learning in image segmentation are highlighted. (c) 2022 Published by Elsevier Ltd.

    AccLoc: Anchor-Free and two-stage detector for accurate object localization

    Piao, Zhengquan; Wang, Junbo; Tanga, Linbo; Zhao, Baojun; ...
    12 pages
    Abstract: Current anchor-free object detectors have obtained detection performances comparable to those of anchor-based object detectors while avoiding the weaknesses of anchor designs. However, two issues limit the localization performance. First, such anchor-free detectors have one stage that predicts the classification and localization results directly. A large regression space reduces the localization performance of such methods. Second, most of the existing detectors extract features which are ineffective for accurate localization. In this paper, for the first issue, we propose two-stage networks to predict regression results stage by stage, thereby reducing the scope of the prediction space. For the second issue, we design two novel modules with the aim of extracting effective features for accurate localization. Experimental results validate that each module in our approach is effective and that our approach has better object localization performance than previous related and advanced methods. (c) 2022 Elsevier Ltd. All rights reserved.

    Prediction with expert advice for a finite number of experts: A practical introduction

    Kalnishkan, Yuri
    10 pages
    Abstract: In this paper, prediction with expert advice is surveyed focusing on Vovk's Aggregating Algorithm. The established theory as well as extensions developed in the recent decade are considered. The paper is aimed at practitioners and covers important application scenarios. (c) 2022 Elsevier Ltd. All rights reserved.
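
As a concrete illustration of prediction with expert advice, here is a minimal sketch of the Aggregating Algorithm in one well-known special case: binary outcomes under logarithmic loss with learning rate 1, where it reduces to a Bayesian mixture of the experts. The function name `aggregate_log_loss` and the data layout are assumptions for this example; the paper itself covers a more general setting.

```python
import math

def aggregate_log_loss(expert_probs, outcomes):
    """Run the mixture forecaster over a sequence of binary outcomes.

    expert_probs[t][i]: probability that expert i assigns to outcome 1 at step t
                        (assumed strictly between 0 and 1).
    outcomes[t]:        observed outcome, 0 or 1.
    Returns (learner_loss, expert_losses) as cumulative log losses.
    """
    n = len(expert_probs[0])
    weights = [1.0 / n] * n          # uniform prior over experts
    learner_loss = 0.0
    expert_losses = [0.0] * n
    for probs, y in zip(expert_probs, outcomes):
        # Learner predicts the weighted average of the experts' probabilities.
        p = sum(w * q for w, q in zip(weights, probs))
        learner_loss += -math.log(p if y == 1 else 1.0 - p)
        # Probability each expert gave to what actually happened.
        likes = [q if y == 1 else 1.0 - q for q in probs]
        for i, like in enumerate(likes):
            expert_losses[i] += -math.log(like)
        # Update: w_i <- w_i * exp(-loss_i) = w_i * like_i, then renormalize.
        total = sum(w * like for w, like in zip(weights, likes))
        weights = [w * like / total for w, like in zip(weights, likes)]
    return learner_loss, expert_losses
```

In this special case the learner's cumulative loss exceeds that of the best expert by at most ln N for N experts, regardless of how the outcomes are generated.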