Journal information
Pattern Recognition
Publisher: Pergamon
ISSN: 0031-3203
Indexed in: SCI, AHCI, ISTP, EI
Officially published
Coverage years

    Effective and efficient pixel-level detection for diverse video copy-move forgery types

    Zhong, Jun-Liu; Gan, Yan-Fen; Vong, Chi-Man; Yang, Ji-Xiang...
    18 pages
    Abstract: Video copy-move forgery detection (VCMFD) is a significant and greatly challenging task due to a variety of difficulties, including a huge amount of video information, diverse forgery types, rich forgery objects, and homogenous forgery sources. These difficulties raise four unresolved key challenges in VCMFD: i) ineffective detection in some popular forgery cases; ii) inefficient matching when processing numerous video pixels with hundred-dimensional features over dozens of matching iterations; iii) a high false-positive (F-p) rate in detecting forgery videos; iv) a poor trade-off between efficiency and effectiveness when filling the forgery region, sometimes even failing to indicate forgeries at the pixel level. In this paper, a novel VCMFD method is proposed to address these issues: i) an improved SIFT structure that enables thorough feature extraction in all video copy-move forgery cases; ii) a novel fast keypoint-label matching (FKLM) algorithm that creates keypoint-label groups and assigns every high-dimensional feature to one of these groups, so that matching of video pixels is performed only within a small number of keypoint-label groups, raising matching efficiency by nearly 500%; iii) a new coarse-to-fine filtering scheme relying on intrinsic attributes of exact keypoint matches, designed to reduce false keypoint matches more effectively; iv) adaptive block filling based on true keypoint matches, which enables accurate and efficient suspicious-region filling, even at the pixel level. Finally, the locations of suspicious regions, combined with the forgery vision persistence concept, indicate forgery videos. Experiments show that, compared to state-of-the-art methods, the proposed method achieves the best detection accuracy and the lowest F-p, and improves F-1 scores by at least 16% and 8% on the GRIP 2.0 dataset and on a combination of the SULFA 2.0 & REWIND datasets, respectively. Furthermore, the proposed method has a low computational cost (4.45 s/Mpixels), about 1/2 to 1/3 that of the recent DFMI-BM (8.02 s/Mpixels) and PM-2D (13.1 s/Mpixels) methods. (C) 2021 Elsevier Ltd. All rights reserved.
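    The abstract does not give the exact grouping rule behind FKLM, so the following is only an illustrative sketch: each SIFT keypoint is assigned a coarse label (here a quantized dominant-orientation bin, an assumption for illustration) and nearest-neighbour matching then runs only inside each label group instead of over all keypoint pairs, which is where the matching speed-up comes from.

# Illustrative sketch of label-grouped keypoint matching for copy-move detection.
# The labeling rule (orientation bins) and the ratio-test threshold are assumptions,
# not the paper's exact FKLM algorithm.
import cv2
import numpy as np

def grouped_matches(frame, n_labels=8, ratio=0.6):
    sift = cv2.SIFT_create()
    keypoints, descriptors = sift.detectAndCompute(frame, None)
    if descriptors is None or len(keypoints) < 3:
        return []

    # Coarse label: quantize each keypoint's dominant orientation into n_labels bins.
    labels = (np.array([kp.angle for kp in keypoints]) / 360.0 * n_labels).astype(int) % n_labels

    matcher = cv2.BFMatcher(cv2.NORM_L2)
    matches = []
    for lab in range(n_labels):
        idx = np.where(labels == lab)[0]
        if len(idx) < 3:
            continue
        desc = descriptors[idx]
        # Match the group against itself; k=3 so the trivial self-match can be skipped.
        for knn in matcher.knnMatch(desc, desc, k=3):
            if len(knn) < 3:
                continue
            _, first, second = knn          # knn[0] is the self-match (distance 0)
            if first.distance < ratio * second.distance:
                matches.append((idx[first.queryIdx], idx[first.trainIdx]))
    return matches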

    Efficient COVID-19 testing via contextual model based compressive sensing

    Hasaninasab, Mehdi; Khansari, Mohammad
    12 pages
    Abstract: The COVID-19 pandemic is threatening billions of lives all over the world. As of March 6, 2021, COVID-19 had been confirmed in 115,653,459 people worldwide. It also has a devastating effect on businesses and social activities. Since there is still no definite cure for this disease, extensive testing is critical for determining the trend of the illness, choosing appropriate medical treatment, and setting social-distancing policies. Besides, testing more people in a shorter time helps to contain the contagion. PCR-based methods are the most popular tests, but each takes about an hour to produce a result; this sharply limits the number of tests and consequently hurts the efficiency of pandemic control. In this paper, we propose a new approach to identify affected individuals with a considerably reduced number of tests. Intuitively, saving time and resources is the main advantage of our approach. We use contextual information to build a graph-based model to be used in model-based compressive sensing (CS). Our proposed model requires fewer tests than traditional testing methods and even group testing. We embed contextual information such as age, underlying disease, symptoms (i.e. cough, fever, fatigue, loss of consciousness), and social contacts into a graph-based model. This model is used in model-based CS to minimize the required tests. We take advantage of Discrete Signal Processing on Graphs (DSPG) to generate the model. Our contextual model makes CS more efficient in both the number of samples and the recovery quality. Moreover, it can be applied in cases where group testing is not applicable due to its severe dependency on sparsity. Experimental results show that the overall testing speed (individuals-per-test ratio) increases by more than 15 times compared to individual testing, with an error of less than 5%, which is dramatically lower than that of traditional compressive sensing. (C) 2021 Elsevier Ltd. All rights reserved.
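    The core idea can be illustrated independently of the contextual graph model: infections are sparse, so a small number of pooled measurements can recover the status of many individuals. The sketch below uses a plain L1 (sparsity) prior via scikit-learn's Lasso; the paper's contribution is to replace this generic prior with a contextual graph model built from age, symptoms, and contacts. All sizes and thresholds are illustrative assumptions.

# Minimal compressive-sensing pooled-testing sketch with a plain sparsity prior.
# The contextual graph model described in the abstract would replace or augment
# the generic L1 penalty used here.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
n_people, n_tests, n_infected = 200, 40, 5

# Sparse ground-truth infection vector (viral load as a nonnegative value).
x_true = np.zeros(n_people)
x_true[rng.choice(n_people, n_infected, replace=False)] = rng.uniform(0.5, 1.0, n_infected)

# Each pooled test mixes a random subset of samples: y = A @ x (+ measurement noise).
A = rng.binomial(1, 0.1, size=(n_tests, n_people)).astype(float)
y = A @ x_true + 0.01 * rng.standard_normal(n_tests)

# L1-regularized recovery of all individuals from far fewer pooled tests.
recovered = Lasso(alpha=0.01, positive=True, max_iter=10000).fit(A, y).coef_
flagged = np.where(recovered > 0.1)[0]
print("true infected:", np.where(x_true > 0)[0], "flagged:", flagged)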

    Deep momentum uncertainty hashing

    Fu, Chaoyou; Wang, Guoli; Wu, Xiang; Zhang, Qian...
    10 pages
    Abstract: Combinatorial optimization (CO) has been a hot research topic because of its theoretical and practical importance. As a classic CO problem, deep hashing aims to find an optimal code for each data point from finite discrete possibilities, while the discrete nature brings a big challenge to the optimization process. Previous methods usually mitigate this challenge by binary approximation, relaxing binary codes into real values via activation functions or regularizations. However, such approximation leads to uncertainty between real values and binary ones, degrading retrieval performance. In this paper, we propose a novel Deep Momentum Uncertainty Hashing (DMUH). It explicitly estimates the uncertainty during training and leverages the uncertainty information to guide the approximation process. Specifically, we model bit-level uncertainty by measuring the discrepancy between the output of a hashing network and that of a momentum-updated network. The discrepancy of each bit indicates the uncertainty of the hashing network about the approximate output of that bit. Meanwhile, the mean discrepancy of all bits in a hashing code can be regarded as image-level uncertainty. It embodies the uncertainty of the hashing network about the corresponding input image. Hashing bits and images with higher uncertainty receive more attention during optimization. To the best of our knowledge, this is the first work to study the uncertainty in hashing bits. Extensive experiments are conducted on four datasets to verify the superiority of our method, including CIFAR-10, NUS-WIDE, MS-COCO, and the million-scale dataset Clothing1M. Our method achieves the best performance on all of the datasets and surpasses existing state-of-the-art methods by a large margin. (c) 2021 Elsevier Ltd. All rights reserved.
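    The uncertainty estimate described above (per-bit discrepancy between an online hashing network and a momentum-updated copy) translates almost directly into code. The sketch below assumes a toy linear hashing head and a standard momentum update; the backbone, the loss, and how the uncertainties reweight training are placeholders, not the paper's exact configuration.

# Sketch of momentum-based uncertainty estimation for hashing bits, following the
# abstract: bit-level uncertainty = |online output - momentum-network output| per bit,
# image-level uncertainty = mean over bits. The backbone is a placeholder.
import copy
import torch
import torch.nn as nn

class HashNet(nn.Module):
    def __init__(self, in_dim=512, n_bits=64):
        super().__init__()
        self.fc = nn.Linear(in_dim, n_bits)

    def forward(self, x):
        return torch.tanh(self.fc(x))      # relaxed codes in (-1, 1)

online = HashNet()
momentum_net = copy.deepcopy(online)
for p in momentum_net.parameters():
    p.requires_grad_(False)

@torch.no_grad()
def update_momentum(online_net, target_net, m=0.999):
    for p_o, p_t in zip(online_net.parameters(), target_net.parameters()):
        p_t.mul_(m).add_(p_o, alpha=1.0 - m)

def uncertainties(features):
    codes_online = online(features)
    with torch.no_grad():
        codes_momentum = momentum_net(features)
    bit_uncertainty = (codes_online - codes_momentum).abs()     # (batch, n_bits)
    image_uncertainty = bit_uncertainty.mean(dim=1)             # (batch,)
    return bit_uncertainty, image_uncertainty

features = torch.randn(8, 512)
bit_u, img_u = uncertainties(features)   # higher values -> more weight during training
update_momentum(online, momentum_net)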

    Two-step domain adaptation for underwater image enhancement

    Jiang, Qun; Zhang, Yunfeng; Bao, Fangxun; Zhao, Xiuyang...
    14 pages
    Abstract: In recent years, underwater image enhancement methods based on deep learning have achieved remarkable results. Since the images obtained in complex underwater scenarios lack a ground truth, these algorithms mainly train models on underwater images synthesized from in-air images. Synthesized underwater images differ from real-world underwater images, and this difference limits the generalizability of the trained model when enhancing real-world underwater images. In this work, we present an underwater image enhancement method that does not require training on synthetic underwater images and eliminates the dependence on underwater ground-truth images. Specifically, a novel domain adaptation framework for real-world underwater image enhancement inspired by transfer learning is presented; it transfers in-air image dehazing to real-world underwater image enhancement. The experimental results on different real-world underwater scenes indicate that the proposed method produces visually satisfactory results. (c) 2021 The Author(s). Published by Elsevier Ltd. This is an open access article under the CC BY-NC-ND license ( http://creativecommons.org/licenses/by-nc-nd/4.0/ )

    A Novel Quasi-Newton Method for Composite Convex Minimization

    Chai, W. H.; Ho, S. S.; Quek, H. C.
    18 pages
    Abstract: A fast parallelizable Jacobi-iteration-type optimization method for non-smooth convex composite optimization is presented. Traditional gradient-based techniques cannot solve this problem directly. Smooth approximate functions have been used as replacements for the non-smooth terms without compromising accuracy. Recently, the proximal mapping concept has been introduced into this field, and techniques that utilize proximal-average-based proximal gradients have been used to solve the problem. State-of-the-art methods utilize only first-order information of the smooth approximate function. We integrate first- and second-order techniques to use both kinds of information and boost the convergence speed. The proposed method achieves a convergence rate of at least O(1/k^2) and enjoys super-linear convergence when proper second-order information is available. In experiments, the proposed method converges significantly faster than state-of-the-art methods, which enjoy O(1/k) convergence. (C) 2021 Elsevier Ltd. All rights reserved.
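    To make the problem class concrete: composite convex minimization is min_x f(x) + g(x), with f smooth and g non-smooth (e.g. an L1 term), and the standard first-order approach is the proximal gradient method. The sketch below is that baseline (ISTA), not the paper's quasi-Newton scheme; the paper's contribution is to replace the plain gradient step with one that exploits second-order information.

# Minimal proximal-gradient (ISTA) sketch for the composite problem
#   min_x f(x) + g(x),  f smooth (least squares here), g non-smooth (L1 here).
# This is the O(1/k)-type first-order baseline the abstract compares against,
# not the proposed quasi-Newton method.
import numpy as np

def soft_threshold(v, t):
    # Proximal operator of the L1 norm scaled by t.
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def ista(A, b, lam=0.05, n_iter=500):
    x = np.zeros(A.shape[1])
    step = 1.0 / np.linalg.norm(A, 2) ** 2        # 1 / Lipschitz constant of grad f
    for _ in range(n_iter):
        grad = A.T @ (A @ x - b)                  # gradient of the smooth part f
        x = soft_threshold(x - step * grad, step * lam)   # prox step for the non-smooth g
    return x

rng = np.random.default_rng(0)
A = rng.standard_normal((60, 200))
x_true = np.zeros(200)
x_true[:5] = 1.0
x_hat = ista(A, A @ x_true)                       # recovers an approximately sparse solution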

    Gradient-Aligned convolution neural network

    Hao, You; Hu, Ping; Li, Shirui; Udupa, Jayaram K....
    10 pages
    Abstract: Although Convolution Neural Networks (CNNs) have achieved great success in many computer vision applications in recent years, rotation invariance is still a difficult problem for CNNs. In some images, such as medical, microscopic, remote sensing, and astronomical images, the content can appear at any angle of rotation. In this paper, we propose a novel convolution operation, called Gradient-Aligned Convolution (GAConv), which helps a CNN achieve rotation invariance by replacing its vanilla convolutions. GAConv is implemented with a pixel-level gradient alignment operation applied before the regular convolution. With GAConv, the Gradient-Aligned CNN (GACNN) can achieve rotation invariance without any data augmentation, feature-map augmentation, or filter enrichment. In GACNN, rotation invariance is not learned from the training set but is built into the network model. Unlike a vanilla CNN, GACNN outputs invariant results for all rotated versions of an object, whether or not the network has been trained. This means that we only need to train the network with one canonical version of an object, and all other rotated versions of this object should be recognized with the same accuracy. Classification experiments have been conducted to compare GACNN with several rotation-invariant approaches. GACNN achieved the best results on the 360-degree rotated test sets of MNIST-rotation, Plankton-sub-rotation, and Galaxy Zoo 2. (C) 2021 Elsevier Ltd. All rights reserved.
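    The alignment step described above can be illustrated as follows: estimate a gradient orientation at every pixel, resample that pixel's neighbourhood in a frame rotated to this orientation, and only then apply the shared convolution kernel. The patch size, Sobel-based orientation estimate, interpolation, and border handling below are assumptions for illustration, and the loop is written for clarity rather than speed.

# Sketch of pixel-level gradient alignment before convolution (the GAConv idea as
# described in the abstract). Each pixel's k x k neighbourhood is resampled in a
# coordinate frame rotated to its local gradient direction, then convolved with a
# shared kernel, so corresponding pixels of a rotated input give similar responses.
import numpy as np
from scipy.ndimage import sobel, map_coordinates

def gradient_aligned_conv(image, kernel):
    k = kernel.shape[0]
    half = k // 2
    gy, gx = sobel(image, axis=0), sobel(image, axis=1)
    theta = np.arctan2(gy, gx)                      # per-pixel gradient orientation

    dy, dx = np.mgrid[-half:half + 1, -half:half + 1]   # canonical sampling offsets
    out = np.zeros_like(image, dtype=float)
    H, W = image.shape
    for y in range(H):
        for x in range(W):
            c, s = np.cos(theta[y, x]), np.sin(theta[y, x])
            # Rotate the sampling grid so the local gradient direction becomes canonical.
            ry = y + c * dy + s * dx
            rx = x - s * dy + c * dx
            patch = map_coordinates(image, [ry.ravel(), rx.ravel()],
                                    order=1, mode='nearest').reshape(k, k)
            out[y, x] = np.sum(patch * kernel)
    return out

image = np.random.default_rng(0).random((32, 32))
response = gradient_aligned_conv(image, np.ones((3, 3)) / 9.0)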

    Exploiting foreground and background separation for prohibited item detection in overlapping X-Ray images

    Shao, Fangtao; Liu, Jing; Wu, Peng; Yang, Zhiwei...
    11 pages
    Abstract: X-ray imagery security screening is an essential component of transportation and logistics. In recent years, some researchers have used computer vision algorithms to replace inefficient and tedious manual baggage inspection. However, X-ray images are complicated, and objects overlap with one another in a semi-transparent state, which degrades the performance of existing object detection frameworks. To address the severe overlapping problem of X-ray images, we propose a foreground and background separation (FBS) X-ray prohibited item detection framework, which separates prohibited items from other items to exclude irrelevant information. First, we design a target foreground and use recursive training to adaptively approximate the real foreground. Thereafter, with the constraints of X-ray imaging characteristics, a decoder is employed to separate the prohibited items from other irrelevant items to obtain the foreground and background (FB). Finally, we use an attention module to make the detection framework focus more on the foreground. Our method is evaluated on a synthetic dataset with FB ground truth and two public datasets with only bounding box annotations. Extensive experimental results demonstrate that our method significantly outperforms state-of-the-art solutions. Furthermore, experiments are performed in the case where only a small number of images contain the FB ground truth. The results indicate that our method requires only a small number of FB ground truths to obtain performance equivalent to that obtained with all FB ground truths. (c) 2021 Elsevier Ltd. All rights reserved.

    Multi-scale spatial-spectral fusion based on multi-input fusion calculation and coordinate attention for hyperspectral image classification

    Yang, Lina; Zhang, Fengqi; Wang, Patrick Shen-Pei; Li, Xichun...
    14 pages
    Abstract: Recently, deep learning methods that integrate image features have gradually become a major trend in hyperspectral image classification. However, these studies did not fully consider the fusion of image features, nor did they remove the interference that differences in object size introduce into the classification process. These factors hinder further improvement of the classification performance. To eliminate these drawbacks, this paper proposes a more effective fusion scheme (MSF-MIF), which realizes fusion from the perspective of location characteristics and channel characteristics through 3D convolution and spatial feature concatenation. In view of the size discrepancy of the objects to be classified, the method extracts features from several input patches of different scales and fuses them with the proposed calculation method, which minimizes the interference caused by size differences. In addition, this research introduces, for the first time, the coordinate attention structure, combining spatial and spectral attention features to further improve classification performance. Experimental results on three commonly used datasets demonstrate that this framework achieves a marked improvement in classification accuracy. (c) 2021 Elsevier Ltd. All rights reserved.
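    For reference, the coordinate attention structure the abstract refers to is a published attention block that factorizes spatial attention into two direction-aware pooled vectors. A minimal 2D sketch is given below; the reduction ratio and how the paper wires it together with 3D convolution and the multi-scale spectral inputs are not specified in the abstract and are assumptions here.

# Minimal coordinate attention block (in the spirit of Hou et al., CVPR 2021): pool
# along height and width separately, encode jointly, then produce per-direction
# attention maps that re-weight the input feature map.
import torch
import torch.nn as nn

class CoordAttention(nn.Module):
    def __init__(self, channels, reduction=8):
        super().__init__()
        mid = max(8, channels // reduction)
        self.conv1 = nn.Conv2d(channels, mid, kernel_size=1)
        self.bn1 = nn.BatchNorm2d(mid)
        self.act = nn.ReLU(inplace=True)
        self.conv_h = nn.Conv2d(mid, channels, kernel_size=1)
        self.conv_w = nn.Conv2d(mid, channels, kernel_size=1)

    def forward(self, x):
        n, c, h, w = x.shape
        # Direction-aware pooling: aggregate along width and along height separately.
        x_h = x.mean(dim=3, keepdim=True)                    # (n, c, h, 1)
        x_w = x.mean(dim=2, keepdim=True).transpose(2, 3)    # (n, c, w, 1)
        y = self.act(self.bn1(self.conv1(torch.cat([x_h, x_w], dim=2))))
        y_h, y_w = torch.split(y, [h, w], dim=2)
        a_h = torch.sigmoid(self.conv_h(y_h))                       # (n, c, h, 1)
        a_w = torch.sigmoid(self.conv_w(y_w.transpose(2, 3)))       # (n, c, 1, w)
        return x * a_h * a_w                                        # same shape as x

attn = CoordAttention(64)
out = attn(torch.randn(2, 64, 16, 16))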

    A cascade reconstruction model with generalization ability evaluation for anomaly detection in videos

    Zhong, Yuanhong; Chen, Xia; Jiang, Jinyang; Ren, Fan...
    13 pages
    Abstract: Anomaly detection plays an important role in surveillance video since it maintains public safety efficiently at low cost. In current work, anomaly detection methods based on reconstruction with deep learning have been extensively studied for their powerful representation capacity. These methods use convolutional neural networks to learn a model describing normality during training and detect anomalies according to the reconstruction error at test time. However, the excessive representation capacity of neural networks also harms anomaly detection when the network is powerful enough to reconstruct abnormal information. For this reason, we propose two solutions. First, a cascade model that performs pixel reconstruction followed by optical flow prediction is designed. The conversion from frame to optical flow learns the correlation between object appearance and motion, while pixel reconstruction enlarges the optical flow prediction error to enable effective anomaly detection. Second, a generalization-ability evaluation based on pseudo-anomalies is proposed, which measures the model's ability to represent anomalies and thus selects an optimal model for anomaly detection. The selected model achieves AUCs of 88.9% on Avenue, 82.6% on Ped1, 97.7% on Ped2, and 70.7% on the ShanghaiTech dataset. Extensive ablation experiments have verified the effectiveness of our method. Code will be released at https://github.com/Xia-Chen/Cascade_Reconstruction. (c) 2021 Elsevier Ltd. All rights reserved.
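    At inference, a cascade of this kind reduces to scoring each frame by the errors of its two stages. The sketch below only shows that scoring step; the reconstruction and flow-prediction networks, the relative weighting of the two errors, and the decision threshold are placeholders rather than the paper's exact configuration.

# Sketch of frame-level anomaly scoring for a two-stage cascade: combine the pixel
# reconstruction error with the optical-flow prediction error, then min-max normalize
# the scores over a video so that high values indicate likely anomalies.
import numpy as np

def anomaly_score(frame, recon_frame, flow, pred_flow, alpha=0.5):
    recon_err = np.mean((frame - recon_frame) ** 2)     # stage 1: pixel reconstruction
    flow_err = np.mean((flow - pred_flow) ** 2)         # stage 2: optical-flow prediction
    return alpha * recon_err + (1.0 - alpha) * flow_err

def normalized_scores(scores):
    scores = np.asarray(scores, dtype=float)
    return (scores - scores.min()) / (scores.max() - scores.min() + 1e-8)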

    Atom correlation based graph propagation for scene graph generation

    Lin, Bingqian; Zhu, Yi; Liang, Xiaodan
    12 pages
    Abstract: The long-tailed distribution of the dataset is one of the major problems of the scene graph generation task. Previous methods attempt to alleviate this by introducing human commonsense knowledge in the form of statistical correlations between object pairs. However, the reasoning paths they use are usually compound and the prior knowledge they employ is generally image-specific, making the knowledge learning less flexible, stable, and holistic. In this paper, we propose Atom Correlation Based Graph Propagation (AC-GP) for the scene graph generation task. Specifically, diverse atom correlations between objects and their relationships are explored by separating relationships to form new semantic nodes and decomposing the compound reasoning paths. Based on these atom correlations, knowledge graphs are introduced for feature enhancement by propagating information in the global category space. By exploiting atom correlations, the introduced prior knowledge becomes more general and easier to learn. Moreover, propagating the knowledge in the global category space makes the model aware of more comprehensive and holistic knowledge. As a result, the model capacity and stability can be effectively improved to mine infrequent and missed relationships. Experimental results on two benchmark datasets, Visual Relation Detection (VRD) and Visual Genome (VG), show the superiority of the proposed AC-GP over strong baseline methods. (c) 2021 Elsevier Ltd. All rights reserved.