
S2Mamba: A Spatial-spectral State Space Model for Hyperspectral Image Classification

DOI: 10.48550/arXiv.2404.18213
CSTR: 10441.14.202412.017343
Document Information
Title:
S2Mamba: A Spatial-spectral State Space Model for Hyperspectral Image Classification
Other titles:
Language:
English
Funding:
Type:
Journal
Authors:
Guanchun Wang, Xiangrong Zhang, Zelin Peng, Tianyang Zhang, Licheng Jiao
Affiliations:
School of Artificial Intelligence, Xidian University, Xi’an 710071, China; MoE Key Lab of Artificial Intelligence, AI Institute, Shanghai Jiao Tong University, Shanghai, 200240, China.
Abstract:
Land cover analysis using hyperspectral images (HSI) remains an open problem due to their low spatial resolution and complex spectral information. Recent studies have primarily been dedicated to designing Transformer-based architectures for modeling spatial-spectral long-range dependencies, which is computationally expensive owing to quadratic complexity. The selective structured state space model (Mamba), which models long-range dependencies efficiently with linear complexity, has recently shown promising progress. However, its potential in hyperspectral image processing, which requires handling numerous spectral bands, has not yet been explored. In this paper, we propose S2Mamba, a spatial-spectral state space model for hyperspectral image classification, to excavate spatial-spectral contextual features for more efficient and accurate land cover analysis. In S2Mamba, two selective structured state space models operating along different dimensions are designed for feature extraction, one spatial and the other spectral, together with a spatial-spectral mixture gate for optimal fusion. More specifically, S2Mamba first captures spatial contextual relations by interacting each pixel with its adjacent pixels through a Patch Cross Scanning module, and then explores semantic information from continuous spectral bands through a Bi-directional Spectral Scanning module. Considering the distinct expertise of the two attributes in homogeneous and complicated texture scenes, we realize the Spatial-spectral Mixture Gate with a group of learnable matrices, allowing the adaptive incorporation of representations learned across different dimensions. Extensive experiments conducted on HSI classification benchmarks demonstrate the superiority and promise of S2Mamba. The code will be made available at: https://github.com/PURE-melo/S2Mamba
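The gated fusion described in the abstract can be illustrated with a minimal sketch. The sketch below is only an illustration of the idea, not the paper's implementation: it assumes the "group of learnable matrices" is a logit matrix `W` passed through a sigmoid, producing a per-feature gate that blends the spatial-branch and spectral-branch representations; the function name `mixture_gate` and the parameterization are hypothetical.

```python
import numpy as np

def mixture_gate(f_spatial, f_spectral, W):
    """Blend spatial and spectral features with a sigmoid gate.

    Hypothetical sketch of the Spatial-spectral Mixture Gate: the learnable
    logits W are squashed to (0, 1) and used as per-feature mixing weights.
    """
    gate = 1.0 / (1.0 + np.exp(-W))  # sigmoid: 0 -> all-spectral, 1 -> all-spatial
    return gate * f_spatial + (1.0 - gate) * f_spectral

# Toy example: N pixels with D-dimensional features from the two branches.
rng = np.random.default_rng(0)
N, D = 4, 8
f_spa = rng.normal(size=(N, D))  # e.g. output of Patch Cross Scanning
f_spe = rng.normal(size=(N, D))  # e.g. output of Bi-directional Spectral Scanning
W = np.zeros((N, D))             # zero logits give gate = 0.5, an equal mix
fused = mixture_gate(f_spa, f_spe, W)
```

With zero logits the gate is 0.5 everywhere, so `fused` is the simple average of the two branches; training would adjust `W` so that homogeneous regions lean on the spectral branch and textured regions on the spatial one.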
Journal Information
Journal:
arXiv
Publication date:
2024-08-13
Volume:
Issue:
Start page:
End page:
12
Indexing:
Others
Attachments
2404.18213.pdf (3.02 MB, uploaded 2024-12-18 00:09:07)
