Journal information

Pattern Recognition
Publisher: Pergamon
ISSN: 0031-3203
Indexed in: SCI, AHCI, ISTP, EI
Publication status: Officially published

    Feature wise normalization: An effective way of normalizing data

    Singh, Dalwinder; Singh, Birmohan
    14 pages
    Abstract: This paper presents a novel Feature Wise Normalization approach for the effective normalization of data. In this approach, each feature is normalized independently with one of the methods from a pool of normalization methods. This is in contrast to the conventional approach, which normalizes all of the data with a single method and, as a result, yields suboptimal performance. Moreover, no single normalization method can be guaranteed to generalize or to be superior to the others, owing to the different mechanisms by which machine learning algorithms solve classification tasks. The proposed approach benefits from the collective response of multiple methods to normalize the data better, as individual features become the normalization unit. The selection of methods is a combinatorial problem that can be solved with optimization algorithms. For this purpose, Antlion optimization is employed, which combines the search over methods with the fine-tuning of classifier parameters. Twelve methods, in addition to the original scale, are used to create the pool, and the obtained data is evaluated with four learning algorithms. Experiments are performed on 18 benchmark datasets to show the efficacy of the proposed approach in contrast to conventional normalization. (C) 2021 Elsevier Ltd. All rights reserved.
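    As a rough illustration of the feature-wise idea (not the authors' code), each column can be scaled by its own method from a small pool; in the paper the per-feature choices and classifier parameters are searched with Antlion optimization, whereas here they are hand-picked:

        # Illustrative sketch of feature-wise normalization (not the authors' code):
        # each column gets its own method from a small pool; in the paper the
        # per-feature choices are found by Antlion optimization, not hand-picked.
        import numpy as np

        def minmax(col):
            rng = col.max() - col.min()
            return (col - col.min()) / rng if rng > 0 else np.zeros_like(col)

        def zscore(col):
            std = col.std()
            return (col - col.mean()) / std if std > 0 else np.zeros_like(col)

        POOL = [lambda c: c, minmax, zscore]   # index 0 keeps the original scale

        def feature_wise_normalize(X, choices):
            # choices[j] selects the pool method applied to feature j
            return np.column_stack([POOL[c](X[:, j]) for j, c in enumerate(choices)])

        X = np.random.rand(100, 3) * np.array([1.0, 50.0, 1000.0])  # mixed scales
        X_norm = feature_wise_normalize(X, choices=[1, 2, 0])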

    Loss functions for pose guided person image generation

    Shi, Haoyue; Wang, Le; Zheng, Nanning; Hua, Gang...
    14 pages
    Abstract: Pose guided person image generation aims to transform a source person image to a target pose. It is an ill-posed problem, as we often need to generate pixels that are invisible in the source image. Recent works focus on designing new deep neural network architectures and show promising performance. However, they simply adopt loss functions widely used in generic image generation tasks, e.g., adversarial loss, L1-norm loss, perceptual loss, and style loss, which fail to consider the unique structural patterns of a person. In addition, it remains unclear how each individual loss and their combinations impact the generated person images. The goal of this paper is to provide a comprehensive study of loss functions for pose guided person image generation. After revisiting these generic loss functions, we consider the structural similarity (SSIM) index as a loss function, since it is widely used as an evaluation metric and can capture the perceptual quality of generated images. In addition, motivated by the observation that a person can be divided into part regions with homogeneous pixel values or texture, we extend the SSIM loss into a novel Part-based SSIM (PSSIM) loss to explicitly account for the articulated body structure. A new PSSIM metric is then naturally proposed to assess the quality of generated person images. To investigate the loss functions in depth, we conduct extensive experiments including single-loss analysis, multi-loss combination analysis, optimal loss combination search, and comparison with state-of-the-art methods. Both quantitative and qualitative results indicate that (1) using different loss functions significantly impacts the generated person images, (2) the combination of adversarial loss, perceptual loss, and PSSIM loss is the optimal choice for person image generation, and (3) the proposed PSSIM loss is complementary to prior losses and helps improve the performance of state-of-the-art methods. We have made the source code publicly available at https://github.com/shyern/Pose-Transfer-pSSIM.git. (c) 2021 Elsevier Ltd. All rights reserved.
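    To make the SSIM-as-loss idea concrete, a minimal sketch follows (simplified single-window global SSIM; standard SSIM is computed over local Gaussian windows, and the paper's PSSIM additionally averages SSIM over body-part regions via part masks, which are not shown here):

        # Simplified global SSIM loss between two image tensors (illustrative only).
        import torch

        def ssim_loss(x, y, c1=0.01 ** 2, c2=0.03 ** 2):
            mu_x, mu_y = x.mean(), y.mean()
            var_x, var_y = x.var(), y.var()
            cov_xy = ((x - mu_x) * (y - mu_y)).mean()
            ssim = ((2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)) / \
                   ((mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2))
            return 1.0 - ssim          # minimizing the loss maximizes similarity

        x = torch.rand(1, 3, 128, 64)  # generated person image, values in [0, 1]
        y = torch.rand(1, 3, 128, 64)  # target image
        loss = ssim_loss(x, y)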

    Biological eagle eye-based method for change detection in water scenes

    Li, Jingchun; Deng, Yimin; Wang, Fei-Yue; Li, Xuan...
    11 pages
    Abstract: Change detection (CD) is an important vision task for the autonomous landing of unmanned aerial vehicles (UAVs) on water. The high-density photoreceptors and lateral inhibition mechanisms of eagle eyes have inspired a novel biologic computational method, based on their structure and properties, which we propose for change detection. We call this method "STabCD"; it ensures spatiotemporal distribution consistency to achieve foreground acquisition, noise reduction, and background adaptability. As a result, our proposed model responds strongly to object information while suppressing noise and wave textures. We then present a cloning method to simulate water scenes and collect a new synthetic dataset (called "Synthetic Boat Sequence") for UAV vision research. In addition, we use the synthetic datasets together with the corresponding real datasets to conduct change detection experiments. The experimental results indicate that: 1) the STabCD model achieves the best results in both real and synthetic water landing scenes; and 2) change detection models for UAVs can be quantitatively analyzed and tested under challenging synthetic scenarios. (c) 2021 Elsevier Ltd. All rights reserved.
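    For readers unfamiliar with the task, a generic running-average background-subtraction baseline (not the eagle-eye STabCD model described above) illustrates what a change-detection output looks like:

        # Generic change-detection baseline, shown only to make the task concrete;
        # this is NOT the STabCD model from the paper.
        import numpy as np

        def detect_changes(frames, alpha=0.05, thresh=30.0):
            """frames: iterable of grayscale arrays of identical shape."""
            background = None
            masks = []
            for frame in frames:
                frame = frame.astype(np.float32)
                if background is None:
                    background = frame.copy()
                mask = np.abs(frame - background) > thresh              # foreground pixels
                background = (1.0 - alpha) * background + alpha * frame  # adapt slowly
                masks.append(mask)
            return masks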

    Protect, show, attend and tell: Empowering image captioning models with ownership protection

    Lim, Jian Han; Chan, Chee Seng; Ng, Kam Woh; Fan, Lixin...
    13 pages
    Abstract: By and large, existing Intellectual Property (IP) protection for deep neural networks typically i) focuses on the image classification task only, and ii) follows a standard digital watermarking framework conventionally used to protect the ownership of multimedia and video content. This paper demonstrates that the current digital watermarking framework is insufficient to protect image captioning, which is often regarded as one of the frontier AI problems. As a remedy, this paper studies and proposes two different embedding schemes in the hidden memory state of a recurrent neural network to protect the image captioning model. Empirically, we show that a forged key yields an unusable image captioning model, defeating the purpose of infringement. To the best of our knowledge, this work is the first to propose ownership protection for the image captioning task. Extensive experiments also show that the proposed method does not compromise the original image captioning performance on any of the common captioning metrics on the Flickr30k and MS-COCO datasets, while at the same time withstanding both removal and ambiguity attacks. Code is available at https://github.com/jianhanlim/ipr-imagecaptioning. (c) 2021 Elsevier Ltd. All rights reserved.
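    The actual embedding schemes are given in the paper and repository; as a purely hypothetical sketch of the general idea (an owner key modulating the decoder's hidden memory state so that a forged key degrades the generated captions), one might gate an LSTM cell as follows. The class, parameter names, and gating scheme here are assumptions for illustration, not the authors' method:

        # Hypothetical sketch only: an owner key gating the hidden memory state of a
        # captioning decoder, so that decoding with a forged key degrades the output.
        # Names and the gating scheme are assumptions, not the paper's actual method.
        import torch
        import torch.nn as nn

        class KeyGatedDecoderCell(nn.Module):
            def __init__(self, input_dim=512, hidden_dim=512, key_dim=512, vocab_size=10000):
                super().__init__()
                self.cell = nn.LSTMCell(input_dim, hidden_dim)
                self.key_proj = nn.Linear(key_dim, hidden_dim)
                self.out = nn.Linear(hidden_dim, vocab_size)

            def forward(self, x, state, key):
                h, c = self.cell(x, state)
                h = h * torch.sigmoid(self.key_proj(key))   # key modulates hidden memory
                return self.out(h), (h, c)

        cell = KeyGatedDecoderCell()
        x = torch.randn(4, 512)                              # per-step input features
        state = (torch.zeros(4, 512), torch.zeros(4, 512))   # initial (h, c)
        key = torch.randn(4, 512)                            # owner key; a forged key degrades captions
        logits, state = cell(x, state, key)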