Change detection (CD) is a vital technique for identifying and analyzing changes over time in a specific area using optical signals from remote sensing images. This technique has been extensively used in various fields, including national defense security, environmental monitoring, and urban construction. However, accurate and reliable CD remains challenging because heterogeneous images differ inherently in imaging mechanisms, spectral ranges, and spatial resolutions. These disparities lead to inadequate accuracy, missed detections, and false detections. From the channel perspective, heterogeneous remote sensing images can be regarded as sequences of different optical signals. For example, RGB and infrared images can be regarded as sequences of spectral signals from different ranges. Transformers employ a multi-head attention mechanism that can effectively handle and analyze sequence information, which makes them well suited to heterogeneous CD. Thus, this paper proposes an optical-signal token guided CD network for heterogeneous remote sensing images. The network primarily comprises the optical-signal token transformer (OT-Former) and the cross-temporal transformer (CT-Former), and it can handle remote sensing images of distinct categories and attain precise CD results. Specifically, OT-Former encodes heterogeneous images channel-wise to adaptively generate optical-signal tokens. Meanwhile, CT-Former uses the optical-signal tokens as a guide to interact with the patch tokens and learn change rules. Moreover, a difference amplification module (DAM) is embedded into the network to enhance the extraction of difference information. This module uses a 1×2 convolutional kernel to fuse difference information. Finally, the differential token is passed through a multilayer perceptron to output the CD results. Experiments
were conducted on three heterogeneous datasets and one homogeneous dataset to evaluate the performance of the proposed method. The proposed method was compared with six typical CD methods, and performance was evaluated using overall accuracy (OA), the Kappa coefficient, and the F1-score, among other metrics. A limited number of samples were used for training. Under identical experimental conditions, the proposed method performed strongly in both homogeneous and heterogeneous CD. The results show that the proposed approach surpasses existing state-of-the-art methods in terms of quantitative and visual performance. Additionally, ablation experiments and parameter analyses were conducted to validate the effectiveness of the OT-Former, CT-Former, and DAM modules and to assess the impact of various parameters within the network. Overall, this study presents a novel heterogeneous CD network based on the transformer framework. Within this network, OT-Former adaptively generates optical-signal tokens from diverse remote sensing images, and CT-Former uses these tokens as a guide to interact with patch tokens and learn change rules. Additionally, DAMs are embedded into the network to extract difference information effectively. Although only an extremely limited number of samples were used for training, the proposed method outperformed existing state-of-the-art methods in heterogeneous CD.
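The three metrics named above (OA, Kappa coefficient, and F1-score) can be illustrated with a minimal sketch. It assumes a binary change map flattened to a list, with 1 for changed and 0 for unchanged pixels. The function names `confusion_counts` and `cd_metrics` are hypothetical and are not taken from the paper; the formulas are the standard definitions, not necessarily the exact evaluation code used in the experiments.

```python
def confusion_counts(pred, truth):
    """Count TP, FP, FN, TN for binary change maps (1 = changed, 0 = unchanged)."""
    tp = fp = fn = tn = 0
    for p, t in zip(pred, truth):
        if p == 1 and t == 1:
            tp += 1
        elif p == 1 and t == 0:
            fp += 1
        elif p == 0 and t == 1:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def cd_metrics(pred, truth):
    """Return OA, Cohen's Kappa, and F1 (for the 'changed' class)."""
    tp, fp, fn, tn = confusion_counts(pred, truth)
    n = tp + fp + fn + tn
    if n == 0:
        raise ValueError("empty input")

    # Overall accuracy: fraction of pixels labelled correctly.
    oa = (tp + tn) / n

    # Agreement expected by chance, from the row and column marginals.
    pe = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / (n * n)
    # Kappa is undefined when pe == 1; returning 0.0 is a convention assumed here.
    kappa = (oa - pe) / (1 - pe) if pe != 1 else 0.0

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0

    return {"OA": oa, "Kappa": kappa, "F1": f1}


# Toy example: 10 pixels, 3 truly changed; one miss and one false alarm.
truth = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
pred = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
metrics = cd_metrics(pred, truth)
print(metrics)
```

In change detection the unchanged class usually dominates, so OA alone can look high even when most changes are missed. Kappa and F1 are typically reported alongside it for that reason.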