Journal of Geomatics (测绘地理信息), 2024, Vol. 49, Issue 6: 111-118. DOI: 10.14188/j.2095-6045.20230805

3D Point Cloud Classification and Segmentation Network Based on Multi-scale and Multi-modal Self-Supervised Learning

Zhang Ziqing 1, Guan Xuefeng 1, Li Xu 1, Wu Huayi 1

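Author information

  • 1. State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing, Wuhan University, Wuhan 430079, Hubei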

Abstract

This paper proposes MultiSM-Net (multi-scale and multi-modal self-supervised learning network), a point cloud classification and segmentation network that combines the strengths of two self-supervised paradigms: contrastive learning and generative learning. On the contrastive side, the network uses sampling and projection to obtain multi-scale point cloud subsets and multi-view images, encodes them, and builds a multi-scale contrastive loss between cross-modal features based on the subset-view mapping relationship. On the generative side, it reconstructs the point cloud with an encoder-decoder model and computes a reconstruction loss. During training, the two losses are combined as a weighted sum, which minimizes the feature discrepancy between modalities while mining the deep semantic information of the point cloud. Experiments on ModelNet40, ShapeNetCore, and ScanObjectNN show that, compared with recent self-supervised methods such as Point-BERT, MultiSM-Net significantly improves pre-trained representation quality and accuracy on downstream classification and segmentation tasks.
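The weighted fusion of a cross-modal contrastive loss and a reconstruction loss can be illustrated with a minimal sketch. The abstract does not state the exact form of either loss, so the InfoNCE formulation, the Chamfer distance, the averaging over scales, the temperature, and all function names below are assumptions chosen for illustration, not the authors' implementation. Features and point clouds are plain Python lists to keep the example dependency-free.

```python
import math


def _cosine(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(point_feats, image_feats, temperature=0.1):
    """Cross-modal InfoNCE (assumed form): row i of point_feats is the
    positive for row i of image_feats; other rows act as negatives."""
    losses = []
    for i, p in enumerate(point_feats):
        logits = [_cosine(p, img) / temperature for img in image_feats]
        max_logit = max(logits)  # subtract the max for numerical stability
        log_sum = max_logit + math.log(
            sum(math.exp(l - max_logit) for l in logits)
        )
        losses.append(log_sum - logits[i])
    return sum(losses) / len(losses)


def chamfer_distance(cloud_a, cloud_b):
    """Symmetric Chamfer distance (assumed reconstruction loss):
    mean squared distance to the nearest neighbour, in both directions."""
    def one_way(src, dst):
        return sum(
            min(sum((x - y) ** 2 for x, y in zip(s, d)) for d in dst)
            for s in src
        ) / len(src)
    return one_way(cloud_a, cloud_b) + one_way(cloud_b, cloud_a)


def combined_loss(feats_per_scale, original, reconstructed,
                  w_contrast=1.0, w_recon=1.0):
    """Weighted sum of the scale-averaged contrastive loss and the
    reconstruction loss. feats_per_scale holds one
    (point_feats, image_feats) pair per scale."""
    contrast = sum(info_nce(p, v) for p, v in feats_per_scale)
    contrast /= len(feats_per_scale)
    recon = chamfer_distance(original, reconstructed)
    return w_contrast * contrast + w_recon * recon
```

In this sketch, matched point and image features yield a lower contrastive term than mismatched ones, and a perfect reconstruction contributes zero to the total. The weights `w_contrast` and `w_recon` are hypothetical hyperparameters; the abstract only says the two losses are weighted and fused.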

Key words

point cloud classification / point cloud segmentation / multi-modal data / self-supervised learning


Publication year

2024
Journal of Geomatics (测绘地理信息)
Wuhan University

Indexed in: CSTPCD, CSCD
Impact factor: 0.563
ISSN: 1007-3817