Hyperspectral-Image Classification Combining Spatial-Spectral Self-Attention and Multigranularity Feature Extraction
For hyperspectral image (HSI) classification, feature extraction methods based on convolutional neural networks (CNNs) have been widely applied and have achieved notable results. However, they still have limitations, such as fixed receptive-field sizes and a tendency to overlook spatial-spectral correlations when extracting local features. To address these limitations, a Transformer network architecture that integrates a multigranularity CNN with spatial-spectral self-attention (SSSA) is proposed herein. The multigranularity CNN improves on the traditional CNN by employing three-dimensional and two-dimensional convolutions to extract spatial-spectral and deep spatial features. Heterogeneous convolution is additionally employed to extract fine multigranularity features, thereby overcoming the limitation of a fixed kernel size in traditional CNNs. In addition, to address the neglect of local features by the self-attention mechanism in traditional Transformers, the mechanism is improved so that the model simultaneously constructs global correlations for spatial and spectral information. Moreover, dual-channel depthwise separable convolution is introduced for spatial-spectral feature embedding, which provides an effective connection between the multigranularity CNN and SSSA. Experimental results show that, owing to its successful extraction of both local and global features, the proposed model outperforms other mainstream HSI classification models on various datasets.
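The idea of constructing global correlations for spatial and spectral information at the same time can be illustrated with a small NumPy sketch. It is not the paper's SSSA module: the abstract does not specify the projections, the number of heads, or how the two branches are fused. The sketch assumes plain scaled dot-product attention without learned weights, applied once across pixel tokens (spatial) and once across feature channels (spectral), with the two outputs summed. The function names, the 5x5 patch size, and the 8 feature channels are all hypothetical.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def attention(q, k, v):
    """Scaled dot-product attention for 2-D arrays of shape (tokens, dim)."""
    scores = q @ k.T / np.sqrt(q.shape[-1])
    return softmax(scores, axis=-1) @ v


def spatial_spectral_self_attention(tokens):
    """Sum of a spatial and a spectral attention branch (assumed fusion).

    tokens: array of shape (num_pixels, num_channels).
    """
    # Spatial branch: every pixel attends to every other pixel.
    spatial = attention(tokens, tokens, tokens)      # (N, C)
    # Spectral branch: every channel attends to every other channel,
    # obtained by treating channels as the tokens.
    bands = tokens.T                                 # (C, N)
    spectral = attention(bands, bands, bands).T      # back to (N, C)
    # Assumption: the branches are fused by simple addition.
    return spatial + spectral


rng = np.random.default_rng(0)
patch = rng.standard_normal((5, 5, 8))        # hypothetical 5x5 patch, 8 channels
tokens = patch.reshape(-1, patch.shape[-1])   # flatten pixels: (25, 8)
out = spatial_spectral_self_attention(tokens)
print(out.shape)
```

The output keeps the input shape, so a block of this kind could be stacked or followed by a classification head. A trained model would replace the identity query, key, and value mappings used here with learned projections.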