Journal Information
ISPRS Journal of Photogrammetry and Remote Sensing
Elsevier

Bimonthly

ISSN: 0924-2716

Indexed in: SCI, AHCI, ISTP, EI
Officially published
Coverage period

    Overcoming the uncertainty challenges in detecting building changes from remote sensing images

    Li, Jiepan; He, Wei; Li, Zhuohong; Guo, Yujun; ...
    Pages 1-17
    Abstract: Detecting building changes with multi-temporal remote sensing (RS) imagery at very high resolution can help us understand urbanization and human activities, supporting informed decisions in urban planning, resource allocation, and infrastructure development. However, existing methods for building change detection (BCD) generally overlook critical uncertainty phenomena present in RS imagery. Specifically, these uncertainties arise from two main sources. First, current building change detection datasets are designed primarily to detect changes in buildings, while changes in other land-cover classes are treated as unchanged background. Because of the manual labeling process, background elements that resemble buildings, such as roads and bridges, are at significant risk of being misclassified as building changes, introducing aleatoric uncertainty at the data level. Second, changes in parts of buildings that affect appearance, texture, or style without altering their semantic meaning, known as pseudo-changes, together with the imbalance between changed and unchanged samples, lead to epistemic uncertainty at the model level. To address these challenges, we present an Uncertainty-Aware BCD (UA-BCD) framework. In detail, we employ a Siamese pyramid vision transformer to extract multi-level features from bi-temporal images, which are then decoded by a general decoder to obtain a coarse change map with inherent uncertainty. Subsequently, we introduce an aleatoric uncertainty estimation module to estimate the aleatoric uncertainty and embed it into the feature space. A knowledge-guided feature enhancement module is then developed to leverage the knowledge encoded in the coarse map to enhance the multi-level features and generate a refined change map. Finally, we propose an epistemic uncertainty estimator that takes the bi-temporal images and the refined change map as input and outputs an estimate of epistemic uncertainty. This estimate is supervised by the entropy calculated from the refined map, ensuring that the UA-BCD framework produces a change map with lower epistemic uncertainty. To comprehensively validate the efficacy of the UA-BCD framework, we adopt a dual-perspective verification approach. Extensive experiments on five public building change datasets demonstrate the significant advantages of the proposed method over current state-of-the-art methods. Additionally, an application in Dongxihu District, Wuhan, China, confirms the outstanding performance of the proposed method in large-scale BCD. The source code of the project is available at https://github.com/Henryjiepanli/UA-BCD.
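
    As a rough illustration of the entropy-based supervision mentioned in the abstract, the sketch below computes pixel-wise binary entropy from a predicted change-probability map and pairs it with an uncertainty head via an MSE loss. The tensor shapes, function name, and loss pairing are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn.functional as F

def change_map_entropy(prob, eps=1e-7):
    """Pixel-wise binary entropy of a predicted change-probability map.

    prob: tensor of shape (B, 1, H, W) with values in (0, 1), e.g. the sigmoid
    output of a refined change map. Entropy peaks where the model is least
    certain (prob ~ 0.5), so it can serve as a supervision target for an
    epistemic-uncertainty head (illustrative assumption).
    """
    p = prob.clamp(eps, 1.0 - eps)
    return -(p * p.log() + (1.0 - p) * (1.0 - p).log())

# Illustrative usage: regress an uncertainty head's output onto the entropy map.
refined_prob = torch.rand(2, 1, 256, 256)       # stand-in for the refined change map
uncertainty_pred = torch.rand(2, 1, 256, 256)   # stand-in for the estimator's output
target = change_map_entropy(refined_prob)
loss = F.mse_loss(uncertainty_pred, target)
```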

    PSO-based fine polarimetric decomposition for ship scattering characterization

    Wang, Junpeng; Quan, Sinong; Xing, Shiqi; Li, Yongzhen; ...
    Pages 18-31
    Abstract: Due to the inappropriate estimation and inadequate awareness of scattering from complex substructures within ships, a reasonable, reliable, and complete interpretation tool to characterize ship scattering for polarimetric synthetic aperture radar (PolSAR) is still lacking. In this paper, a fine polarimetric decomposition with explicit physical meaning is proposed to reveal and characterize the local-structure-related scattering behaviors of ships. To this end, a nine-component decomposition scheme is first established by incorporating the rotated dihedral and planar resonator scattering models, which makes full use of polarimetric information and comprehensively considers the complex structural scattering of ships. To reasonably estimate the scattering components, three practical scattering dominance principles and an explicit objective function are proposed, and a particle swarm optimization (PSO)-based model inversion strategy is subsequently presented. This not only overcomes the underdetermined problem but also alleviates scattering mechanism ambiguity by circumventing the constrained estimation order. Finally, a ship indicator is derived by linearly combining the output scattering contributions, which, together with the proposed decomposition, constitutes a complete ship scattering interpretation approach. Experiments carried out with real PolSAR datasets demonstrate that the proposed method adequately and objectively describes the scatterers on ships, providing an effective means of ship scattering characterization. Moreover, a further application involving quantitative analysis of the scattering components verifies the feasibility of the fine polarimetric decomposition.
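
    The inversion strategy above relies on particle swarm optimization; the following is a generic box-bounded PSO minimizer with a placeholder quadratic objective standing in for the paper's nine-component residual. All names, hyperparameters, and the objective are assumptions made only to illustrate the optimization pattern.

```python
import numpy as np

def pso_minimize(objective, bounds, n_particles=50, n_iter=200,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Generic particle swarm optimizer (minimization) within box bounds."""
    rng = np.random.default_rng(seed)
    lo, hi = np.asarray(bounds, float).T                 # bounds: list of (low, high)
    dim = lo.size
    x = rng.uniform(lo, hi, size=(n_particles, dim))     # particle positions
    v = np.zeros_like(x)                                  # particle velocities
    pbest, pbest_val = x.copy(), np.array([objective(p) for p in x])
    g = pbest[pbest_val.argmin()].copy()                  # global best position
    for _ in range(n_iter):
        r1, r2 = rng.random((2, n_particles, dim))
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (g - x)
        x = np.clip(x + v, lo, hi)
        val = np.array([objective(p) for p in x])
        improved = val < pbest_val
        pbest[improved], pbest_val[improved] = x[improved], val[improved]
        g = pbest[pbest_val.argmin()].copy()
    return g, pbest_val.min()

# Placeholder objective: squared residual between modelled and "observed" powers.
observed = np.array([0.4, 0.3, 0.2, 0.1])
best, best_val = pso_minimize(lambda p: np.sum((p - observed) ** 2),
                              bounds=[(0.0, 1.0)] * 4)
```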

    Target-aware attentional network for rare class segmentation in large-scale LiDAR point clouds

    Zhang, Xinlong; Lin, Dong; Soergel, Uwe
    Pages 32-50
    Abstract: Semantic interpretation of 3D scenes poses a formidable challenge in point cloud processing and is a requisite undertaking across various fields of application involving point clouds. Although a number of point cloud segmentation methods have achieved leading performance, 3D rare class segmentation remains a challenge owing to the imbalanced distribution of fine-grained classes and the complexity of large scenes. In this paper, we present the target-aware attentional network (TaaNet), a novel mask-constrained attention framework to address 3D semantic segmentation of imbalanced classes in large-scale point clouds. Adapting the self-attention mechanism, a hierarchical aggregation strategy is first applied to enhance the learning of point-wise features across various scales, leveraging both global and local perspectives to guarantee the presence of fine-grained patterns in highly complex scenes. Subsequently, rare target masks are imposed on the hierarchical features by a contextual module. Specifically, a target-aware aggregator is proposed to boost discriminative features of rare classes, which constrains hierarchical features with learnable adaptive weights and simultaneously embeds confidence constraints of rare classes. Furthermore, a target pseudo-labeling strategy based on strong contour cues of rare classes is designed, which effectively delivers instance-level supervisory signals restricted to rare targets only. We conducted thorough experiments on four multi-platform LiDAR benchmarks, i.e., airborne, mobile, and terrestrial platforms, to assess the performance of our framework. Results demonstrate that compared with other commonly used advanced segmentation methods, our method obtains not only high segmentation accuracy but also remarkable F1-scores on rare classes. In a submission to the official ranking page of the Hessigheim 3D benchmark, our approach achieves a state-of-the-art mean F1-score of 83.84% and an outstanding overall accuracy (OA) of 90.45%. In particular, the F1-scores of the rare classes vehicles and chimneys exceed the average of other published methods by a wide margin, by 32.00% and 32.46%, respectively. Additionally, extensive experimental analysis on benchmarks collected from multiple platforms, Paris-Lille-3D, Semantic3D, and WHU-Urban3D, validates the robustness and effectiveness of the proposed method.
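
    As a generic illustration of the mask-constrained idea described above, the sketch below reweights point features with a learnable boost restricted to a soft rare-class mask. The module name, shapes, and the simple residual formulation are assumptions for illustration, not the paper's target-aware aggregator.

```python
import torch
import torch.nn as nn

class TargetAwareReweight(nn.Module):
    """Illustrative mask-constrained reweighting of point features.

    Given per-point features and a soft mask of likely rare-class points
    (e.g. derived from coarse predictions), the module boosts rare-target
    features with a learnable adaptive weight. A sketch of the idea only.
    """
    def __init__(self, channels):
        super().__init__()
        self.alpha = nn.Parameter(torch.zeros(1))      # learnable boost strength
        self.proj = nn.Linear(channels, channels)

    def forward(self, feats, rare_mask):
        # feats: (N, C) point features; rare_mask: (N,) values in [0, 1]
        boost = torch.sigmoid(self.alpha) * rare_mask.unsqueeze(-1)
        return feats + boost * self.proj(feats)

feats = torch.randn(4096, 64)
rare_mask = (torch.rand(4096) > 0.95).float()          # stand-in rare-class confidence
out = TargetAwareReweight(64)(feats, rare_mask)
```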

    Unwrapping error and fading signal correction on multi-looked InSAR data

    Ma, Zhangfeng; Wang, Nanxin; Yang, Yingbao; Aoki, Yosuke; ...
    Pages 51-63
    Abstract: Multi-looking, aimed at reducing data size and improving the signal-to-noise ratio, is indispensable for large-scale InSAR data processing. However, the resulting "fading signal" caused by multi-looking breaks the phase consistency among triplet interferograms and introduces bias into the estimated displacements. This inconsistency challenges the assumption that only unwrapping errors are involved in triplet phase closure. Therefore, untangling phase unwrapping errors and fading signals in the triplet phase closure is critical to achieving more precise InSAR measurements. To address this challenge, we propose a new method that mitigates both phase unwrapping errors and fading signals. The method consists of two key steps. The first is triplet phase closure-based stacking, which allows direct estimation of the fading signal in each interferogram. The second is Basis Pursuit Denoising-based unwrapping error correction, which casts unwrapping error correction as sparse signal recovery. Through these two procedures, the new method can be seamlessly integrated into the traditional InSAR workflow. Additionally, the estimated fading signal can be used directly to derive soil moisture as a by-product of our method. Experimental results for the San Francisco Bay area demonstrate that the new method reduces velocity estimation errors by approximately 9%-19%, effectively addressing phase unwrapping errors and fading signals. It outperforms both the ILP and Lasso methods, which account only for unwrapping errors in the triplet closure. Additionally, the derived by-product, soil moisture, shows strong consistency with most external soil moisture products.
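
    The triplet phase closure that the method operates on can be computed directly from three wrapped interferometric phases; the NumPy sketch below does this and checks that a consistent triplet yields (numerically) zero closure. Function and variable names are illustrative, not the authors' code.

```python
import numpy as np

def triplet_phase_closure(phi_12, phi_23, phi_13):
    """Wrapped triplet phase closure of three interferograms (radians).

    For consistent single-look data the closure is zero; for multi-looked data
    a non-zero residual reflects fading signals and/or unwrapping errors,
    which is the quantity the stacking step described above operates on.
    Inputs are the interferometric phases of pairs (1,2), (2,3), and (1,3).
    """
    return np.angle(np.exp(1j * (phi_12 + phi_23 - phi_13)))

# Toy example: a consistent triplet built from three acquisitions closes to zero.
rng = np.random.default_rng(0)
phi_1, phi_2, phi_3 = rng.uniform(-np.pi, np.pi, size=(3, 128, 128))
closure = triplet_phase_closure(phi_2 - phi_1, phi_3 - phi_2, phi_3 - phi_1)
assert np.allclose(closure, 0.0, atol=1e-9)
```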

    ChangeRD: A registration-integrated change detection framework for unaligned remote sensing images

    Jing, Wei; Chi, Kaichen; Li, Qiang; Wang, Qi; ...
    Pages 64-74
    Abstract: Change Detection (CD) is important for natural disaster assessment, urban construction management, ecological monitoring, and related tasks. Nevertheless, CD models based on pixel-level classification depend heavily on the registration accuracy of bi-temporal images. Besides, differences in factors such as imaging sensors and season often result in pseudo-changes in CD maps. To tackle these challenges, we establish a registration-integrated change detection framework called ChangeRD, which can explore spatial transformation relationships between pairs of unaligned images. Specifically, ChangeRD is designed as a multitask network that supervises the learning of the perspective transformation matrix and of the difference regions between images. The proposed Adaptive Perspective Transformation (APT) module is utilized to enhance the spatial consistency of features from different levels of the Siamese network. Furthermore, an Attention-guided Central Difference Convolution (AgCDC) module is proposed to mine deep differences in bi-temporal features, significantly reducing the pseudo-change noise caused by illumination variations. Extensive experiments on unaligned bi-temporal images demonstrate that ChangeRD outperforms other state-of-the-art CD methods in both qualitative and quantitative evaluation. The code for this work will be available on GitHub.
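
    Central difference convolution, which the AgCDC module builds on, is commonly implemented as a vanilla convolution minus a theta-weighted response of the spatially summed kernel; the sketch below follows that common formulation. The attention-guided part is omitted, and the class name and theta value are assumptions rather than the paper's module.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CentralDifferenceConv2d(nn.Module):
    """Central difference convolution as commonly formulated in the literature:
    output = conv(x) - theta * conv_center(x), where conv_center uses the
    spatially summed kernel. Difference-sensitive responses help suppress
    smooth illumination changes.
    """
    def __init__(self, in_ch, out_ch, kernel_size=3, padding=1, theta=0.7):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch, kernel_size, padding=padding, bias=False)
        self.theta = theta

    def forward(self, x):
        out = self.conv(x)
        if self.theta == 0:
            return out
        # Summing the kernel over its spatial extent gives a 1x1 kernel whose
        # response (the local "DC" term) is subtracted from the vanilla output.
        kernel_sum = self.conv.weight.sum(dim=(2, 3), keepdim=True)
        out_center = F.conv2d(x, kernel_sum, bias=None, stride=1, padding=0)
        return out - self.theta * out_center

x = torch.randn(1, 64, 32, 32)
y = CentralDifferenceConv2d(64, 64)(x)          # shape: (1, 64, 32, 32)
```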

    LU5M812TGT: An AI-Powered global database of impact craters ≥ 0.4 km on the Moon

    La Grassa, Riccardo; Martellato, Elena; Cremonese, Gabriele; Re, Cristina; ...
    Pages 75-84
    Abstract: We release a new global catalog of impact craters on the Moon containing about 5 million craters. The catalog was derived using a deep learning model that increases the spatial image resolution, allowing craters to be detected down to sizes as small as 0.4 km. As a result, this database includes ~69.3% craters with diameters below 1 km; ~28.7% of the catalog contains mainly craters in the 1-5 km diameter range, and the remaining fraction (≲1.9%) lies between 5 km and 100 km in diameter. The accuracy of this new crater database was tested against previous well-known global crater catalogs. We found a similar crater size-frequency distribution for craters ≥1 km, providing a validation of the crater identification method applied in this work. The addition of craters as small as half a kilometer is new with respect to other published global catalogs, allowing a finer exploitation of the lunar surface at a global scale. The LU5M812TGT catalog is available at the following link: https://zenodo.org/records/13990480.
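
    The size-frequency comparison used above for validation can be reproduced in spirit as a cumulative count per log-spaced diameter bin; the sketch below uses synthetic diameters as a stand-in for catalog entries, and the function name, binning choices, and power-law draw are illustrative assumptions.

```python
import numpy as np

def cumulative_size_frequency(diameters_km, area_km2, bins_per_decade=10):
    """Cumulative crater size-frequency distribution N(>=D) per unit area.

    diameters_km: 1-D array of crater diameters; area_km2: counting area.
    Returns log-spaced diameter thresholds (km) and the cumulative density
    (craters per km^2) at each threshold.
    """
    d = np.asarray(diameters_km, float)
    edges = np.logspace(np.log10(d.min()), np.log10(d.max()),
                        int(bins_per_decade * np.log10(d.max() / d.min())) + 1)
    cumulative = np.array([(d >= e).sum() for e in edges]) / area_km2
    return edges, cumulative

# Synthetic example: power-law-like diameters between 0.4 and 100 km.
rng = np.random.default_rng(1)
diam = 0.4 * (1.0 - rng.random(100_000)) ** (-1.0 / 2.0)   # Pareto-like draw
diam = diam[diam <= 100.0]
edges, csfd = cumulative_size_frequency(diam, area_km2=3.79e7)  # ~lunar surface area
```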

    MLC-net: A sparse reconstruction network for TomoSAR imaging based on multi-label classification neural network

    Ouyang, Depeng; Zhang, Yueting; Guo, Jiayi; Zhou, Guangyao; ...
    Pages 85-99
    Abstract: Synthetic aperture radar tomography (TomoSAR) has garnered significant interest for its capability to achieve three-dimensional resolution along the elevation direction by collecting a stack of SAR images from different cross-track angles. Compressed sensing (CS) algorithms have been widely introduced into SAR tomography. However, traditional CS-based TomoSAR methods suffer from weak noise resistance, high computational complexity, and insufficient super-resolution capability. To address the efficient TomoSAR imaging problem, this paper proposes an end-to-end neural network-based TomoSAR inversion method, named the Multi-Label Classification-based Sparse Imaging Network (MLC-net). MLC-net focuses on the l0-norm optimization problem, departing completely from the iterative framework of traditional compressed sensing methods and overcoming the limitations that the l1-norm optimization problem imposes on signal coherence. At the same time, the concept of multi-label classification is introduced for the first time in TomoSAR inversion, enabling MLC-net to accurately invert scenarios with multiple scatterers within the same range-azimuth cell. Additionally, a novel evaluation system for TomoSAR inversion results is introduced, which transforms inversion results into a 3D point cloud and utilizes mature evaluation methods for 3D point clouds. Under the new evaluation system, the proposed method outperforms existing methods by more than 30%. Finally, by training solely on simulated data, we conducted extensive experimental testing on both simulated and real data, achieving excellent results that validate the effectiveness, efficiency, and robustness of the proposed method. Specifically, the VQA_PC score improved from 91.085 to 92.713. The code of our network is available at https://github.com/OscarYoungDepend/MLC-net.
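
    The multi-label view of TomoSAR inversion described above amounts to discretizing the elevation axis into bins and predicting, for each range-azimuth cell, which bins host a scatterer, so several bins can be active at once. The toy head below trains with a binary cross-entropy loss over bins; the layer sizes, names, and real/imaginary input encoding are assumptions, not MLC-net's architecture.

```python
import torch
import torch.nn as nn

class ElevationMultiLabelHead(nn.Module):
    """Illustrative multi-label formulation of TomoSAR inversion: one logit
    per discretized elevation bin, predicted per range-azimuth cell from the
    complex-valued image stack of that cell.
    """
    def __init__(self, n_images, n_bins, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2 * n_images, hidden), nn.ReLU(),   # real/imag parts of the stack
            nn.Linear(hidden, n_bins),                     # one logit per elevation bin
        )

    def forward(self, stack_complex):
        x = torch.cat([stack_complex.real, stack_complex.imag], dim=-1)
        return self.net(x)                                 # (B, n_bins) logits

head = ElevationMultiLabelHead(n_images=20, n_bins=128)
stack = torch.randn(8, 20, dtype=torch.complex64)          # toy stack: 8 cells, 20 acquisitions
labels = (torch.rand(8, 128) > 0.98).float()               # sparse scatterer occupancy per bin
loss = nn.BCEWithLogitsLoss()(head(stack), labels)
```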

    PylonModeler: A hybrid-driven 3D reconstruction method for power transmission pylons from LiDAR point clouds

    Wu, Shaolong; Chen, Chi; Yang, Bisheng; Yan, Zhengfei; ...
    Pages 100-124
    Abstract: As the power grid is an indispensable foundation of modern society, creating a digital twin of the grid is of great importance. Pylons are key components of transmission corridors, and their precise 3D reconstruction is essential for the safe operation of power grids. However, 3D pylon reconstruction from LiDAR point clouds presents numerous challenges due to data quality and the diversity and complexity of pylon structures. To address these challenges, we introduce PylonModeler, a hybrid-driven method for 3D pylon reconstruction from airborne LiDAR point clouds, enabling accurate, robust, and efficient real-time pylon reconstruction. Different strategies are employed to independently reconstruct and then assemble the various structures. We propose Pylon Former, a lightweight transformer network for real-time pylon recognition and decomposition. Subsequently, we apply a data-driven approach for pylon body reconstruction: considering structural characteristics, fitting and clustering algorithms are used to reconstruct both external and internal structures. The pylon head is reconstructed using a hybrid approach. A pre-built pylon head parameter model library defines different pylons by a series of parameters, and the coherent point drift (CPD) algorithm is adopted to establish the topological relationships between pylon head structures and set initial model parameters, which are then refined through optimization for accurate pylon head reconstruction. Finally, the pylon body and head models are combined to complete the reconstruction. We collected an airborne LiDAR dataset that includes a total of 3398 pylons across eight types; it covers transmission lines at various voltage levels, such as 110 kV, 220 kV, and 500 kV. PylonModeler is validated on this dataset. The average reconstruction time of a pylon is 1.10 s, with an average reconstruction accuracy of 0.216 m. In addition, we evaluate the performance of PylonModeler on public airborne LiDAR data from Luxembourg. Compared with previous state-of-the-art methods, reconstruction accuracy improves by approximately 26.28%. With this superior performance, PylonModeler is tens of times faster than current model-driven methods, enabling real-time pylon reconstruction.
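
    As a generic illustration of the fitting-and-clustering step mentioned for the pylon body, the sketch below groups points with DBSCAN and fits a principal axis to each cluster via SVD. The function name, parameters, and the stand-in point cloud are assumptions; this is not PylonModeler's reconstruction pipeline.

```python
import numpy as np
from sklearn.cluster import DBSCAN

def cluster_and_fit_segments(points, eps=0.5, min_samples=20):
    """Group 3D points into candidate structural elements and fit each
    element's axis: DBSCAN provides the clusters, and the dominant right
    singular vector of each centered cluster gives its principal direction.
    """
    labels = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(points)
    segments = []
    for k in set(labels) - {-1}:                      # label -1 marks noise points
        cluster = points[labels == k]
        centroid = cluster.mean(axis=0)
        # Principal direction of the cluster = first right singular vector.
        _, _, vt = np.linalg.svd(cluster - centroid, full_matrices=False)
        segments.append((centroid, vt[0]))            # (point on axis, direction)
    return labels, segments

pts = np.random.default_rng(2).normal(size=(5000, 3))  # stand-in for LiDAR points
labels, segments = cluster_and_fit_segments(pts, eps=0.3, min_samples=30)
```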

    Semantic guided large scale factor remote sensing image super-resolution with generative diffusion prior

    Wang, Ce; Sun, Wanjie
    Pages 125-138
    Abstract: In the realm of remote sensing, images captured by different platforms exhibit significant disparities in spatial resolution. Consequently, effective large scale factor super-resolution (SR) algorithms are vital for maximizing the utilization of low-resolution (LR) satellite data captured from orbit. However, existing methods confront challenges such as semantic inaccuracies and blurry textures in the reconstructed images. To tackle these issues, we introduce a novel framework, the Semantic Guided Diffusion Model (SGDM), designed for large scale factor remote sensing image super-resolution. The framework exploits a pre-trained generative model as a prior to generate perceptually plausible high-resolution (HR) images, thereby constraining the solution space and mitigating texture blurriness. We further enhance the reconstruction by incorporating vector maps, which carry structural and semantic cues that improve the reconstruction fidelity of ground objects. Moreover, pixel-level inconsistencies in paired remote sensing images, stemming from sensor-specific imaging characteristics, may hinder the convergence of the model and the diversity of the generated results. To address this problem, we develop a method to extract sensor-specific imaging characteristics and model their distribution. The proposed model can decouple imaging characteristics from image content, allowing it to generate diverse super-resolution images based on imaging characteristics provided by reference satellite images or sampled from the imaging-characteristic probability distribution. To validate and evaluate our approach, we create the Cross-Modal Super-Resolution Dataset (CMSRD). Qualitative and quantitative experiments on CMSRD showcase the superiority and broad applicability of our method. Experimental results on downstream vision tasks also demonstrate the utility of the generated SR images. The dataset and code will be publicly available at https://github.com/wwangcece/SGDM.
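
    One simple way to model a distribution of sensor imaging characteristics, in the spirit of the description above, is to fit a Gaussian to characteristic embeddings and sample new ones to drive diverse outputs. The sketch below shows only that idea, with stand-in embeddings and illustrative names; it is not the paper's characteristic-modeling method.

```python
import numpy as np

def fit_characteristic_gaussian(char_vectors):
    """Fit a multivariate Gaussian to sensor imaging-characteristic vectors.

    char_vectors: (N, D) array of characteristic embeddings extracted from
    reference images. The returned (mean, covariance) pair lets new
    characteristics be sampled to condition diverse super-resolution outputs.
    """
    mu = char_vectors.mean(axis=0)
    cov = np.cov(char_vectors, rowvar=False) + 1e-6 * np.eye(char_vectors.shape[1])
    return mu, cov

def sample_characteristics(mu, cov, n, seed=0):
    """Draw n new imaging-characteristic vectors from the fitted Gaussian."""
    return np.random.default_rng(seed).multivariate_normal(mu, cov, size=n)

reference_embeddings = np.random.default_rng(3).normal(size=(500, 16))  # stand-ins
mu, cov = fit_characteristic_gaussian(reference_embeddings)
styles = sample_characteristics(mu, cov, n=4)   # condition the generator on these
```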

    Refined change detection in heterogeneous low-resolution remote sensing images for disaster emergency response

    Wang, Di; Ma, Guorui; Zhang, Haiming; Wang, Xiao; ...
    Pages 139-155
    Abstract: Heterogeneous Remote Sensing Image Change Detection (HRSICD) is a significant challenge in remote sensing image processing, with substantial application value in rapid natural disaster response. However, significant differences in imaging modalities often result in poor comparability of image features, affecting recognition accuracy. To address this issue, we propose a novel HRSICD method based on image structure relationships and semantic information. First, we employ a Multi-scale Pyramid Convolution Encoder to efficiently extract multi-scale and detailed features. Next, the Cross-domain Feature Alignment Module aligns the structural relationships and semantic features of the heterogeneous images, enhancing the comparability between heterogeneous image features. Finally, the Multi-level Decoder fuses the structural and semantic features, achieving refined identification of change areas. We validated the proposed method on five publicly available HRSICD datasets. Additionally, zero-shot generalization experiments and real-world applications were conducted to assess its generalization capability. Our method achieved favorable results in all experiments, demonstrating its effectiveness. The code of the proposed method will be made available at https://github.com/Lucky-DW/HRSICD.
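
    A multi-scale pyramid convolution block of the kind the encoder above relies on can be sketched as parallel convolution branches with different kernel sizes whose outputs are fused; the block below is a generic example with assumed kernel sizes and channel counts, not the paper's exact encoder.

```python
import torch
import torch.nn as nn

class MultiScalePyramidBlock(nn.Module):
    """Illustrative multi-scale pyramid convolution block: parallel branches
    with different kernel sizes capture coarse context and fine detail, and
    a 1x1 convolution fuses their concatenated outputs.
    """
    def __init__(self, in_ch, out_ch, kernel_sizes=(3, 5, 7)):
        super().__init__()
        self.branches = nn.ModuleList([
            nn.Sequential(
                nn.Conv2d(in_ch, out_ch, k, padding=k // 2, bias=False),
                nn.BatchNorm2d(out_ch),
                nn.ReLU(inplace=True),
            )
            for k in kernel_sizes
        ])
        self.fuse = nn.Conv2d(out_ch * len(kernel_sizes), out_ch, kernel_size=1)

    def forward(self, x):
        return self.fuse(torch.cat([b(x) for b in self.branches], dim=1))

x = torch.randn(1, 3, 128, 128)                 # e.g. one input image
features = MultiScalePyramidBlock(3, 32)(x)     # shape: (1, 32, 128, 128)
```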