Yilmaz, Cigdem Serifoglu; Yilmaz, Volkan; Gungor, Oguz
43 pages
Abstract: Pansharpening fuses the spatial features of a high-resolution panchromatic (PAN) image with the spectral features of a lower-resolution multispectral (MS) image to generate a spatially enriched MS image. Numerous pansharpening strategies have been developed over more than three decades, which forces analysts who intend to apply pansharpening to choose among many techniques. Hence, this study investigates the performance of many conventional and state-of-the-art pansharpening techniques in order to guide analysts in this choice. To this end, the spectral and spatial fidelity of the pansharpened images produced by a total of 47 pansharpening methods was evaluated qualitatively and quantitatively. The methods examined came from six categories: Multiresolution Analysis (MRA)-based, Component Substitution (CS)-based, Colour-Based (CB), Deep Learning (DL)-based, Variational Optimization (VO)-based, and hybrid techniques. The methods in the MRA, DL, CB, and VO categories were found to exhibit the best pansharpening performance, whereas the hybrid and CS-based techniques showed the poorest. We believe that the outcomes of this study will guide analysts who need to apply pansharpening in their applications.
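For readers unfamiliar with the component-substitution family mentioned in this abstract, the sketch below illustrates the general idea using the classical Brovey transform. It is an illustration only: the abstract does not list the 47 methods evaluated, and the function name `brovey_pansharpen`, the unweighted band mean used as intensity, and the assumption that the MS bands are already resampled to the PAN grid are all choices made here.

```python
def brovey_pansharpen(ms_bands, pan, eps=1e-12):
    """Brovey-style pansharpening on plain nested lists.

    ms_bands: list of bands, each a 2-D list already resampled to the PAN grid
              (the resampling step is assumed to have happened elsewhere).
    pan:      2-D list with the same height and width as each band.
    """
    rows, cols = len(pan), len(pan[0])
    n_bands = len(ms_bands)
    out = [[[0.0] * cols for _ in range(rows)] for _ in range(n_bands)]
    for r in range(rows):
        for c in range(cols):
            # Synthetic intensity: unweighted mean of the MS bands (an assumption).
            intensity = sum(band[r][c] for band in ms_bands) / n_bands
            # Ratio that injects the PAN spatial detail into every band.
            gain = pan[r][c] / (intensity + eps)
            for b in range(n_bands):
                out[b][r][c] = ms_bands[b][r][c] * gain
    return out


# Tiny usage example: two bands, one row, two pixels.
ms = [[[1.0, 2.0]], [[3.0, 2.0]]]
pan = [[4.0, 1.0]]
sharpened = brovey_pansharpen(ms, pan)
```

By construction the band mean of the output equals the PAN value at each pixel, while the ratios between bands are preserved. Matching the PAN intensity this closely is also a known route to spectral distortion in substitution-style methods.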
Perez, Andrea Teresa Espinoza; Rossit, Daniel Alejandro; Tohme, Fernando; Vasquez, Oscar C....
14 pages
Abstract: Mass customized and mass personalized production has been facilitated by the fourth industrial revolution. The resulting industrial environments require information systems that can take customers' specifications and convey them to the production system in a way that helps coordinate all the stakeholders and activities required to fulfill customer orders. This is beyond the capabilities of traditional systems based on MRP and ERP, since the information should be managed in a flexible and decentralized way to exploit the Smart Manufacturing facilities of Industry 4.0. Blockchain, in contrast, is a technology that provides those features, constituting a sound information basis for mass customized/personalized production. Consequently, we explore the potential of blockchain as an information technology able to support industries that base their business models on mass customized/personalized production processes. This survey allows us to identify important challenges for further development, highlighting three issues in the production setting: (i) deepening the interoperability of systems, (ii) generating more implementations, and (iii) developing efficient consensus protocols. In response to these insights, we provide a conceptual design of how blockchain contributes to the efficient management of mass customized production systems. In our design, customer specifications can be fused with data from the production process to generate a plan to fulfill the demand. This design is offered as a solution approach to the three stated problems faced by mass customized production systems.
Abstract: The latest Deep Learning (DL) models for detection and classification have achieved unprecedented performance compared with classical machine learning algorithms. However, DL models are black-box methods that are hard to debug, interpret, and certify. DL alone cannot provide explanations that can be validated by a non-technical audience such as end-users or domain experts. In contrast, symbolic AI systems that convert concepts into rules or symbols, such as knowledge graphs, are easier to explain. However, they offer lower generalization and scaling capabilities. A very important challenge is to fuse DL representations with expert knowledge. One way to address this challenge, as well as the performance-explainability trade-off, is to leverage the best of both streams without obviating domain expert knowledge. In this paper, we tackle this problem by assuming that the symbolic knowledge is expressed in the form of a domain expert knowledge graph. We present the eXplainable Neural-symbolic learning (X-NeSyL) methodology, designed to learn both symbolic and deep representations, together with an explainability metric to assess the level of alignment between machine and human expert explanations. The ultimate objective is to fuse DL representations with expert domain knowledge during the learning process so that it serves as a sound basis for explainability. In particular, the X-NeSyL methodology involves the concrete use of two notions of explanation, at inference and training time, respectively: (1) EXPLANet: Expert-aligned eXplainable Part-based cLAssifier NETwork Architecture, a compositional convolutional neural network that makes use of symbolic representations, and (2) SHAP-Backprop, an explainable AI-informed training procedure that corrects and guides the DL process to align with such symbolic representations in the form of knowledge graphs. We showcase the X-NeSyL methodology using the MonuMAI dataset for monument facade image classification, and demonstrate that with our approach it is possible to improve explainability and performance at the same time.
Abstract: Finger vein recognition is an emerging biometric modality that is considered more secure and reliable. This article reviews the recent research landscape in biometric finger vein recognition systems. It focuses on manuscripts related to the keywords 'Finger Vein Authentication System', 'Anti-spoofing or Presentation Attack Detection', 'Multimodal Biometric Finger Vein Authentication' and their variations in four main digital research libraries: IEEE Xplore, Springer, ACM, and Science Direct. The final set of articles is divided into three main categories: Deep Learning (DL)-based finger vein recognition, Presentation Attack Detection (PAD), and multimodal finger vein authentication systems. Deep learning-based finger vein recognition techniques are further sub-divided into pre-processing (quality assessment and enhancement) based, feature extraction based, and feature extraction and recognition based schemes. Presentation attack detection methods are sub-divided into software-based and hardware-based approaches. Multimodal finger vein biometric systems are sub-categorized into feature-level fusion, matching-level fusion, and hybrid fusion methods. The authors examine the problems of recent algorithms for finger vein biometric systems, and their solutions, as reported in the recent literature. A performance analysis and a selection of the most promising research work from these studies are also presented. Finally, open challenges, opportunities, and suggested solutions related to deep learning, PAD, and multimodal finger vein recognition systems are discussed in the discussion section. This work should help developers, company users, researchers, and readers gain insight into the importance of this technology and the recent problems faced by finger vein authentication technology with respect to deep learning and multimodal systems.
Abstract: The network structure exhibits a variety of changes over time. Fusing this structure and the development of communities in dynamic networks plays an important role in analyzing the evolution and development of the entire network. How to ensure the division of the community structure in social network big data, as well as the continuity of the community between the current and the previous time period, are issues that need to be explored. This problem can be solved by fusing the three characteristics of temporal variability, stability, and continuity in dynamic social network communities, and by adopting a multiobjective optimization method to detect community structures in dynamic networks. A probability fusion method is added to the initial step of the algorithm to generate suitable network partitions and ensure fast convergence and high accuracy. Two neighboring fusion strategies suited to communities are proposed: the neighbor diversity strategy and the neighbor crowd strategy. These two strategies make different changes to the candidate network partitions. A continuity metric for dynamic community evolution is formulated to compare the similarity of the dynamic network communities at two consecutive time steps. Experiments on synthetic and real datasets show that the proposed method performs better than existing methods.
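The abstract does not define its continuity metric, so the following is only a generic sketch of how the similarity of community partitions at two consecutive time steps can be scored. It uses a best-match Jaccard average, which is a stand-in chosen here and not the paper's formulation. The names `jaccard` and `partition_continuity` are hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity of two node sets (1.0 when both are empty)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def partition_continuity(prev_communities, curr_communities):
    """Average, over current communities, of the best Jaccard match
    found among the communities of the previous time step.

    This is an illustrative stand-in, not the metric from the paper.
    """
    if not prev_communities or not curr_communities:
        return 0.0
    best = [
        max(jaccard(curr, prev) for prev in prev_communities)
        for curr in curr_communities
    ]
    return sum(best) / len(best)


# Usage: node 3 moves from the first community to the second.
prev_step = [{1, 2, 3}, {4, 5}]
curr_step = [{1, 2}, {3, 4, 5}]
score = partition_continuity(prev_step, curr_step)
```

A score of 1.0 means every current community already existed unchanged at the previous step; lower values indicate more abrupt reorganization between snapshots.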
Abstract: Single image super-resolution (SISR), which aims to reconstruct a high-resolution (HR) image from a low-resolution (LR) observation, has been an active research topic in image processing in recent decades. In particular, deep learning-based super-resolution (SR) approaches have drawn much attention and have greatly improved reconstruction performance on synthetic data. However, recent studies show that simulation results on synthetic data usually overestimate the capacity to super-resolve real-world images. In this context, more and more researchers are devoting themselves to developing SR approaches for realistic images. This article provides a comprehensive review of real-world single image super-resolution (RSISR). More specifically, this review covers the critical publicly available datasets and assessment metrics for RSISR, and four major categories of RSISR methods, namely degradation modeling-based RSISR, image pairs-based RSISR, domain translation-based RSISR, and self-learning-based RSISR. Representative RSISR methods are also compared on benchmark datasets in terms of both reconstruction quality and computational efficiency. In addition, we discuss challenges and promising research topics in RSISR.
Abstract: The detection of retinal microaneurysms is crucial for the early detection of important diseases such as diabetic retinopathy. However, the detection of these lesions in retinography, the most widely available retinal imaging modality, remains a very challenging task. This is mainly due to the tiny size and low contrast of the microaneurysms in the images. Consequently, the automated detection of microaneurysms usually relies on extensive ad-hoc processing. Although microaneurysms can be detected more easily using fluorescein angiography, this alternative imaging modality is invasive and not adequate for regular preventive screening. In this work, we propose a novel deep learning methodology that takes advantage of unlabeled multimodal image pairs to improve the detection of microaneurysms in retinography. In particular, we propose a novel adversarial multimodal pre-training consisting of the prediction of fluorescein angiography from retinography using generative adversarial networks. This pre-training allows learning about the retina and the microaneurysms without any manually annotated data. Additionally, we propose to approach microaneurysm detection as a heatmap regression, which allows efficient detection and precise localization of multiple microaneurysms. To validate and analyze the proposed methodology, we perform exhaustive experiments on different public datasets. We also provide relevant comparisons against different state-of-the-art approaches. The results show a satisfactory performance of the proposal, achieving an Average Precision of 64.90%, 31.36%, and 33.55% on the E-Ophtha, ROC, and DDR public datasets, respectively. Overall, the proposed approach outperforms existing deep learning alternatives while providing a more straightforward detection method that can be effectively applied to raw unprocessed retinal images.
Abstract: Band selection is one of the most effective methods to reduce the band redundancy of hyperspectral images (HSIs). Most existing band selection methods tend to regard each band as a whole and then explore the band redundancy directly with pixel-wise features. However, since the regions of HSIs corresponding to different objects have diverse spectral properties and spatial structure, such a scheme limits the performance of hyperspectral band selection due to the lack of spatial information. To address these issues, a novel band selection method via region-aware latent features fusion based clustering (RLFFC) is proposed. Specifically, we employ superpixel segmentation to segment HSIs into multiple regions so that the spatial information of HSIs can be fully preserved. To capture the prior information, we construct the corresponding Laplacian matrix, from which a group of low-dimensional latent features is generated to further enhance the separability among different bands. Then, a shared latent feature representation of HSIs is obtained by fusing region-aware latent features to effectively capture the band redundancy of HSIs. Finally, the k-means clustering algorithm is used to obtain the indices of the selected bands from the shared latent feature representation. As a result, the spectral and spatial properties are well exploited in the proposed method. Extensive experiments on four public hyperspectral datasets show that the proposed method achieves superior performance compared with other state-of-the-art methods.
Abstract: Convolutional Neural Networks have dominated the field of computer vision for the last ten years, exhibiting extremely powerful feature extraction capabilities and outstanding classification performance. The main strategy to prolong this trend in the state-of-the-art literature relies on further upscaling networks in size. However, costs increase rapidly while performance improvements may be marginal. Our main hypothesis is that adding further sources of information can help to increase performance and that this approach is more cost-effective than building bigger networks, which involve longer training time, a larger parametrisation space, and higher computational resource requirements. In this paper, an ensemble method for accurate image classification is proposed, fusing features detected automatically by a Convolutional Neural Network with a set of manually defined statistical indicators. By combining the predictions of a CNN and a secondary classifier trained on statistical features, better classification performance can be achieved cheaply. We test five different CNN architectures and multiple learning algorithms on a diverse set of datasets to validate our proposal. According to the results, the inclusion of additional indicators and an ensemble classification approach help to increase performance on all datasets. Both code and datasets are publicly available via GitHub at: https://github.com/jahuerta92/cnn-prob-ensemble.
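The abstract describes combining the predictions of a CNN with those of a secondary classifier but does not state the combination rule. As a minimal sketch of one common option, the code below performs late fusion by a weighted average of the two class-probability vectors. The function name `fuse_predictions` and the weight `w_cnn` are hypothetical and are not taken from the paper or its repository.

```python
def fuse_predictions(p_cnn, p_stat, w_cnn=0.5):
    """Weighted average of two class-probability vectors (late fusion).

    p_cnn:  probabilities from the CNN, one entry per class.
    p_stat: probabilities from the classifier trained on statistical features.
    w_cnn:  weight given to the CNN, in [0, 1] (hypothetical parameter).
    Returns the renormalized fused probabilities and the predicted class index.
    """
    if len(p_cnn) != len(p_stat):
        raise ValueError("both models must score the same set of classes")
    if not 0.0 <= w_cnn <= 1.0:
        raise ValueError("w_cnn must lie in [0, 1]")
    fused = [w_cnn * a + (1.0 - w_cnn) * b for a, b in zip(p_cnn, p_stat)]
    total = sum(fused)
    if total <= 0.0:
        raise ValueError("fused scores must have a positive sum")
    fused = [p / total for p in fused]  # guard against rounding drift
    label = max(range(len(fused)), key=fused.__getitem__)
    return fused, label


# Usage: the secondary classifier overturns a weak CNN preference.
probs, label = fuse_predictions([0.6, 0.4], [0.2, 0.8], w_cnn=0.5)
```

In practice the weight would be tuned on a validation set, and a learned meta-classifier over the concatenated probabilities is an alternative design.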
Abstract: Light field imaging has drawn broad attention since the advent of practical light field capturing systems that facilitate a wide range of applications in computer vision. However, existing learning-based methods for improving the spatial resolution of light field images neglect the shifts in the sub-pixel domain that are widely used by super-resolution techniques, and thus fail to recover rich high-frequency information. To fully exploit the shift information, our method attempts to learn an epipolar shift compensation for light field image super-resolution that allows the restored light field image to remain angularly coherent while its spatial resolution is enhanced. The proposed method first uses the rich surrounding views along some typical epipolar directions to explore the inter-view correlations. We then implement feature-level registration to capture accurate sub-pixel shifts of the central view, using a compensation module equipped with dynamic deformable convolution. Finally, the complementary information from different spatial directions is fused to provide high-frequency details for the target view. By taking each sub-aperture image as a central view, our method can be applied to light field images with any angular resolution. Extensive experiments on both synthetic and real scene datasets demonstrate the superiority of our method over the state of the art, both qualitatively and quantitatively. Moreover, the proposed method performs well in preserving the inherent epipolar structures in light field images. Specifically, our LFESCN method outperforms the state-of-the-art method by about 0.7 dB (PSNR) on average.
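To put the reported 0.7 dB gain in context, the sketch below computes PSNR from its standard definition, PSNR = 10 log10(peak^2 / MSE), and converts a dB gain into the equivalent mean-squared-error ratio. The function names and the `peak=1.0` default are assumptions made here. The conversion also assumes both methods are scored against the same peak value, and it applies per image, so it is only indicative for a gain averaged over a dataset.

```python
import math


def psnr(reference, estimate, peak=1.0):
    """Peak signal-to-noise ratio in dB between two 2-D lists of equal shape."""
    squared_error = 0.0
    count = 0
    for ref_row, est_row in zip(reference, estimate):
        for ref_px, est_px in zip(ref_row, est_row):
            squared_error += (ref_px - est_px) ** 2
            count += 1
    mse = squared_error / count
    if mse == 0.0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)


def mse_ratio_for_gain(gain_db):
    """MSE of the better method divided by MSE of the baseline
    for a given PSNR gain in dB (same peak value assumed)."""
    return 10.0 ** (-gain_db / 10.0)


# Usage: a 0.7 dB gain corresponds to roughly 15% lower MSE.
ratio = mse_ratio_for_gain(0.7)
quality = psnr([[0.0, 1.0]], [[0.1, 0.9]])
```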