Abstract: Providing computers with the ability to process handwriting is both important and challenging, since many difficulties (e.g., different writing styles, alphabets, languages, etc.) need to be overcome to address a variety of problems (text recognition, signature verification, writer identification, word spotting, etc.). This paper reviews the growing literature on off-line handwritten document analysis over the last thirty years. A sample of 5389 articles is examined using bibliometric techniques. This analysis identifies (i) the most influential articles in the area, (ii) the most productive authors and their collaboration networks, (iii) the countries and institutions that have led research on the topic, (iv) the journals and conferences that have published the most papers, and (v) the most relevant research topics (with their related tasks and methodologies) and their evolution over the years. (c) 2021 The Author(s). Published by Elsevier Ltd. This is an open access article under the CC BY-NC-ND license ( http://creativecommons.org/licenses/by-nc-nd/4.0/ )
Abstract: In this paper, an Attentive Occlusion-adaptive Deep Network, hereafter referred to as AODN, is proposed for facial landmark detection, consisting of a geometry-aware module, an attention module, and a low-rank learning module. Facial Landmark Detection (FLD) is a fundamental pre-processing step for face-related tasks. Occlusion, extreme pose, varied expressions, and illumination are the main challenges in FLD-related tasks. Convolutional Neural Network (CNN) based FLD methods have achieved significant improvements in accuracy, but dealing with occlusion remains very challenging even for CNNs, probably because occlusion misleads the CNN during feature representation learning: if faces are partially occluded, localization accuracy drops significantly. Attention plays a vital role in the human visual system, and researchers have proved its significance for computer vision problems. Taking advantage of the geometric relationships among different facial components and of attention, we extend our previously established Occlusion-adaptive Deep Network (ODN). We introduce an attention module consisting of Channel-wise Attention (CA) and Spatial Attention (SA) to improve the model's ability to deal with occlusion while simultaneously enhancing its feature representation ability. The occlusion probability serves as an adaptive weight on the high-level features, minimizing the effect of occlusion and assisting in modelling it. Ablation studies prove the synergistic effect of each module. Our threefold contribution is summarized as follows: i) we introduce an attention mechanism into our previously established ODN model to deal with occlusion more precisely and obtain richer feature representations for better performance; ii) to the best of our knowledge, we are the first to introduce CA and SA for FLD to model occlusion; iii) the proposed methodology reduces the total number of network parameters, which effectively decreases training time and cost, making the model more suitable for scalable data processing. Experimental results demonstrate the superior performance of the proposed AODN on challenging benchmark datasets. (c) 2021 Elsevier Ltd. All rights reserved.
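The abstract above does not detail the internals of the CA and SA modules. As a minimal sketch only, assuming plain sigmoid gates and omitting the learned layers (fully connected or convolutional excitation) that a real implementation would include, the two gating operations on a feature map could look like:

```python
import numpy as np

def channel_attention(x):
    # x: (C, H, W) feature map; squeeze each channel by global average
    # pooling, then rescale it by a sigmoid gate (learned layers omitted)
    squeeze = x.mean(axis=(1, 2))              # (C,)
    gate = 1.0 / (1.0 + np.exp(-squeeze))      # sigmoid gate per channel
    return x * gate[:, None, None]

def spatial_attention(x):
    # pool across channels, then gate every spatial location
    pooled = x.mean(axis=0)                    # (H, W)
    gate = 1.0 / (1.0 + np.exp(-pooled))
    return x * gate[None, :, :]
```

Both operations preserve the feature-map shape and only reweight channels or locations, which is what lets them be inserted into an existing network without changing its interface.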
Abstract: A recent study has shown that the real-time anti-noise challenges faced by molecular activity prediction algorithms can be solved by using the partial structural features of the molecular graph. However, the sub-structures selected by this method are distributed in a scattered manner: although they include as many block features as possible, they do not fully consider the connections between these blocks. Therefore, this study fully considers the physical interpretation of the betweenness-centrality node in the graph, and a sub-structure is obtained by depth-first search (DFS) from this node. This sub-structure not only contains the characteristics of each region but also retains the connections between the regions. Then, a cascading multi-layer perceptron (MLP) model was designed to learn the characteristic representation of the graph from its local structural features. Experiments demonstrate that the performance of our algorithm is superior to that of other algorithms when evaluated on different datasets. (c) 2022 Elsevier Ltd. All rights reserved.
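The two graph operations named above, betweenness centrality and DFS extraction from a chosen node, can be sketched in pure Python. This is an illustrative reading of the abstract (Brandes' algorithm for unweighted betweenness, then a size-bounded DFS from the top-centrality node), not the paper's actual implementation; the size bound `k` is a hypothetical parameter:

```python
from collections import deque

def betweenness(adj):
    """Brandes' betweenness centrality for an unweighted graph given as {node: [neighbors]}."""
    bc = {v: 0.0 for v in adj}
    for s in adj:
        stack, pred = [], {v: [] for v in adj}
        sigma = {v: 0 for v in adj}; sigma[s] = 1    # number of shortest paths
        dist = {v: -1 for v in adj}; dist[s] = 0
        q = deque([s])
        while q:                                     # BFS from s
            v = q.popleft(); stack.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1; q.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]; pred[w].append(v)
        delta = {v: 0.0 for v in adj}
        while stack:                                 # back-propagate dependencies
            w = stack.pop()
            for v in pred[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    return bc

def dfs_substructure(adj, start, k):
    """Collect up to k nodes by DFS from `start`, keeping region connectivity."""
    seen, order, frontier = set(), [], [start]
    while frontier and len(order) < k:
        v = frontier.pop()
        if v in seen:
            continue
        seen.add(v); order.append(v)
        frontier.extend(sorted(adj[v], reverse=True))  # deterministic visit order
    return order
```

On a small path graph the middle node has the highest betweenness, and a DFS rooted there yields a connected sub-structure, matching the abstract's claim that the extracted blocks stay connected.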
Abstract: Label Distribution Learning (LDL) is a popular scenario for solving label ambiguity problems by learning the relative importance of each label to a particular instance. Nevertheless, labels are often incomplete due to the difficulty of annotating label distributions. In this mixed setting of complete and incomplete labels, the learning method is expected to achieve better performance than a baseline that merely utilizes the completely labeled data. In real applications, however, the use of incompletely labeled data may degrade performance. It is therefore vital to design a safe incomplete LDL method whose performance does not deteriorate when exploiting incompletely labeled data. To tackle this important but rarely studied problem, we propose a Safe Incomplete LDL method (SILDL), which learns a classifier that prevents incompletely labeled instances from worsening performance. Concretely, we learn predictions from multiple incomplete supervised learners and design an efficient solving algorithm by formulating the problem as a convex quadratic program. Theoretically, we prove that under mild conditions SILDL obtains the maximal performance gain against the best of the multiple baseline methods. Extensive experimental results validate the safety of the proposed approach and show improvements in performance. (C) 2021 Elsevier Ltd. All rights reserved.
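The abstract does not give the convex quadratic program, but solvers that combine predictions from multiple learners typically constrain the combination weights to the probability simplex. As one hedged building block (not the paper's actual QP), here is the standard Euclidean projection onto the simplex:

```python
def project_simplex(v):
    """Euclidean projection of vector v onto the probability simplex
    (non-negative entries summing to 1), via the sorting method."""
    u = sorted(v, reverse=True)
    css, rho = 0.0, 0
    for i, ui in enumerate(u, 1):
        css += ui
        if ui - (css - 1.0) / i > 0:   # index still inside the support
            rho = i
    theta = (sum(u[:rho]) - 1.0) / rho  # shift that makes the support sum to 1
    return [max(x - theta, 0.0) for x in v]
```

A safe combiner could, for instance, run projected gradient steps with this projection to keep the learner weights valid at every iteration.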
Abstract: Accurately assessing the segmentation quality of automatically generated predictions on medical images is essential for guaranteeing the reliability of computer-assisted diagnosis (CAD). Many researchers have studied segmentation quality estimation without labeled ground truths. Recently, a novel idea was proposed that transforms segmentation quality assessment (SQA) into a pixel-wise or voxel-wise error map segmentation task. However, the simple application of vanilla segmentation structures in the medical domain fails to achieve satisfactory error segmentation results. In this paper, we propose collaborative boundary-aware context encoding networks, called EP-Net, for the error segmentation task. Specifically, we propose a collaborative feature transformation branch for better feature fusion between images and masks and precise localization of error regions. Further, we propose a context encoding module that utilizes the global predictor from the error map to enhance the feature representation and regularize the networks. Extensive experiments on the IBSR V2.0, ACDC, and M&Ms datasets demonstrate that EP-Net achieves better error segmentation results than traditional segmentation patterns. Based on the error prediction results, we obtain a proxy metric of segmentation quality, which has a high Pearson correlation coefficient with the real segmentation accuracy on all datasets. (c) 2021 Elsevier Ltd. All rights reserved.
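The proxy-metric evaluation mentioned above rests on the Pearson correlation coefficient between predicted quality and real accuracy. For reference, that coefficient is computed as follows (the sample values in the test are made up for illustration):

```python
def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)
```

A value near 1 means the proxy metric ranks predictions almost exactly as the true accuracy would, which is the property the abstract claims for EP-Net's error-based proxy.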
Abstract: Currently, existing state-of-the-art 3D object detectors follow a two-stage paradigm. These methods typically comprise two steps: 1) a region proposal network proposes a handful of high-quality proposals in a bottom-up fashion; 2) the semantic features of the proposed regions are resized and pooled to summarize RoI-wise representations for further refinement. Note that the RoI-wise representations in step 2) are treated as uncorrelated entries when fed to the subsequent detection heads. Nevertheless, we observe that the proposals generated by step 1) are somewhat offset from the ground truth and emerge densely in local neighborhoods with an underlying probability. Challenges arise when a proposal largely forsakes its boundary information due to coordinate offset, while existing networks lack a corresponding information compensation mechanism. In this paper, we propose BADet for 3D object detection from point clouds. Specifically, instead of refining each proposal independently as previous works do, we represent each proposal as a node for graph construction within a given cut-off threshold, associating proposals in the form of a local neighborhood graph in which the boundary correlations of an object are explicitly exploited. Besides, we devise a lightweight Region Feature Aggregation Module to fully exploit voxel-wise, pixel-wise, and point-wise features with expanding receptive fields for more informative RoI-wise representations. We validate BADet on both the widely used KITTI Dataset and the highly challenging nuScenes Dataset. As of Apr. 17th, 2021, BADet achieves on-par performance on the KITTI 3D detection leaderboard and ranks 1st on the Moderate difficulty of the Car category on the KITTI BEV detection leaderboard. The source code is available at https://github.com/rui-qian/BADet . (C) 2022 Elsevier Ltd. All rights reserved.
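The local neighborhood graph described above, one node per proposal, with an edge whenever two proposals fall within the cut-off threshold, can be sketched as follows. This is an illustrative reading of the abstract using proposal center distances; the actual distance measure and threshold value in BADet may differ (see the linked repository):

```python
def build_proposal_graph(centers, cutoff):
    """Connect proposals whose center-to-center Euclidean distance is within `cutoff`.
    centers: list of (x, y, z) proposal centers; returns adjacency {index: [indices]}."""
    n = len(centers)
    adj = {i: [] for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            d = sum((a - b) ** 2 for a, b in zip(centers[i], centers[j])) ** 0.5
            if d <= cutoff:
                adj[i].append(j)
                adj[j].append(i)
    return adj
```

Message passing over this adjacency is what lets each proposal borrow boundary information from its dense neighbors instead of being refined in isolation.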
Abstract: Due to the huge commercial interests behind online reviews, a tremendous number of spammers manufacture spam reviews for product reputation manipulation. To further enhance the influence of spam reviews, spammers often collaboratively post spam reviews within a short period of time; such activity is called a collective opinion spam campaign. The goals and members of spam campaigns change frequently, and some spammers also imitate normal purchases to conceal their identity, which makes spammer detection challenging. In this paper, we propose an unsupervised network embedding-based approach that jointly exploits different types of relations, e.g., the direct common-behavior relation and the indirect co-reviewed relation, to effectively represent the relevance between users for detecting collective opinion spammers. The average improvements of our method over the state-of-the-art solutions on the AmazonCn and YelpHotel datasets are [14.09%, 12.04%] and [16.25%, 12.78%] in terms of AP and AUC, respectively. (c) 2022 Elsevier Ltd. All rights reserved.
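The indirect co-reviewed relation mentioned above pairs users who reviewed the same product. As a hedged sketch (the weighting scheme, simple co-review counts here, is an assumption, not necessarily the paper's), it can be built like this:

```python
from collections import defaultdict

def co_review_relations(reviews):
    """reviews: list of (user, product) pairs.
    Returns {(user_a, user_b): count of products both reviewed}, a < b."""
    prod_users = defaultdict(set)
    for user, product in reviews:
        prod_users[product].add(user)
    weight = defaultdict(int)
    for users in prod_users.values():
        us = sorted(users)                     # canonical pair ordering
        for i in range(len(us)):
            for j in range(i + 1, len(us)):
                weight[(us[i], us[j])] += 1
    return dict(weight)
```

Edges with high weight connect users who repeatedly co-review, exactly the kind of relational signal a network embedding can exploit to group campaign members together.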
Abstract: In this paper, we propose an online model optimization algorithm based on reinforcement learning for quantitative trading. The combination of a prediction model and a trading policy is the most commonly used framework in practical quantitative trading. Integrated with machine learning methods, this framework brings huge profits to quantitative trading firms. In this framework, the prediction model predicts the future price trend, and the trading policy determines the price and number of orders. Even so, the shortcomings of machine learning models are obvious, mainly: (1) Slow prediction speed. Computing huge numbers of hand-crafted features and running the model take much time, roughly ten times that of a pure trading policy without a model. (2) Poor generalization. Such models can hardly adapt to the market data of every period, because market traders change over time at the micro level and thus the distribution of market data shifts; a model trained on a long-period dataset achieves the best effect on average but cannot adapt to the different market of each period. To address this problem, we propose a novel online model optimization algorithm. A library of light models is constructed, in which each model corresponds to a different market distribution. By devising an appropriate reward function via an inverse reinforcement learning algorithm, the algorithm accurately estimates the profit of each model. A model can then be selected automatically in real-time trading, so that the trading policy automatically adapts to changes in the market, overcoming the previous shortcomings of manual model updating and slow prediction. Experimental results show that the proposed algorithm achieves state-of-the-art performance on China Commodity Futures Market data. (c) 2022 Elsevier Ltd. All rights reserved.
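The selection step described above, scoring each light model in the library by its estimated reward and picking the best one, reduces to an argmax over the library. A minimal sketch, in which the model objects, market state, and reward estimator are all placeholders rather than the paper's actual components:

```python
def select_model(library, market_state, estimate_reward):
    """Return the name of the light model with the highest estimated reward
    on the current market state. `estimate_reward(model, state)` stands in
    for the paper's inverse-RL-derived reward function."""
    best_name, best_reward = None, float("-inf")
    for name, model in library.items():
        reward = estimate_reward(model, market_state)
        if reward > best_reward:
            best_name, best_reward = name, reward
    return best_name
```

Because only this cheap scoring loop runs online, model switching can happen in real time without the heavy feature computation the abstract identifies as the framework's bottleneck.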
Abstract: Although unsupervised person re-identification (Re-ID) has drawn increasing research attention recently, it remains challenging to learn discriminative features without annotations across disjoint camera views. In this paper, we address unsupervised person Re-ID with a conceptually novel yet simple framework, termed Multi-label Learning guided self-paced Clustering (MLC). MLC learns discriminative features with three crucial modules, namely a multi-scale network, a multi-label learning module, and a self-paced clustering module. Specifically, the multi-scale network generates multi-granularity person features in both global and local views. The multi-label learning module leverages a memory feature bank and assigns each image a multi-label vector based on the similarities between the image and the feature bank. After multi-label training for several epochs, the self-paced clustering module joins training and assigns a pseudo label to each image. The benefits of MLC come from three aspects: i) the multi-scale person features enable better similarity measurement; ii) the multi-label assignment based on the whole dataset ensures that every image can be trained; and iii) the self-paced clustering removes noisy samples for better feature learning. Extensive experiments on three popular large-scale Re-ID benchmarks demonstrate that MLC outperforms previous state-of-the-art methods and significantly improves the performance of unsupervised person Re-ID. (c) 2022 Elsevier Ltd. All rights reserved.
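The multi-label assignment described above, comparing an image feature against every entry of the memory bank, can be sketched with cosine similarity and a threshold. Both the similarity measure and the thresholding rule are assumptions for illustration; the abstract only says the vector is based on similarities to the bank:

```python
def multi_label_assign(feat, bank, threshold):
    """Assign a binary multi-label vector: label k is 1 if the cosine
    similarity between `feat` and the k-th memory bank entry passes `threshold`."""
    def cos(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        na = sum(x * x for x in a) ** 0.5
        nb = sum(y * y for y in b) ** 0.5
        return dot / (na * nb)
    return [1 if cos(feat, m) >= threshold else 0 for m in bank]
```

Because the bank spans the whole dataset, every image receives at least its own near-duplicate entries as positives, which is why the abstract can claim that every image gets trained.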
Abstract: Accurate detection of COVID-19 is one of the challenging research topics in today's healthcare sector for controlling the coronavirus pandemic. Automatic, data-powered insights for COVID-19 localization from a medical imaging modality such as chest CT tremendously augment clinical care. In this research, a Contour-aware Attention Decoder CNN is proposed to segment COVID-19 infected tissues precisely and effectively. It introduces a novel attention scheme that extracts boundary and shape cues from CT contours and leverages these features in refining the infected areas. For every decoded pixel, the attention module harvests contextual information in its spatial neighborhood from the contour feature maps. By incorporating such rich structural detail into decoding via dense attention, the CNN is able to capture even intricate morphological details. The decoder is also augmented with a Cross Context Attention Fusion Upsampling to robustly reconstruct deep semantic features back into a high-resolution segmentation map. It employs a novel pixel-precise attention model that draws on relevant encoder features to aid effective upsampling. The proposed CNN was evaluated on 3D scans from the MosMedData and Jun Ma benchmark datasets. It achieved state-of-the-art performance with a high dice similarity coefficient of 85.43% and a recall of 88.10%. (c) 2022 Elsevier Ltd. All rights reserved.
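The dice similarity coefficient reported above is the standard overlap metric for segmentation; for reference, on binary masks it is computed as:

```python
def dice(pred, gt):
    """Dice similarity coefficient between two binary masks given as flat 0/1 lists:
    twice the intersection over the sum of the two mask sizes."""
    inter = sum(p * g for p, g in zip(pred, gt))
    return 2 * inter / (sum(pred) + sum(gt))
```

A dice score of 85.43% therefore means that, averaged over the evaluation scans, the predicted infected regions overlap the annotated ones to that degree.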