A residual MLP-based multi-modal point cloud classification network
Advanced point cloud algorithms such as PCT currently suffer from reliance on a single modality, complex feature extractors, high parameter counts, and low computational efficiency. To address these problems, this paper proposes a streamlined and fast multi-modal point cloud classification network called Res-CLIP. The network combines ResMLP-PC with CLIP to leverage multi-modal information and improve the performance and transfer learning capabilities of the backbone network. A residual MLP is employed to improve efficiency, and an affine transformation module is integrated into the backbone network to improve accuracy. Experimental results on a drainage pipeline defect dataset show that ResMLP-PC achieves higher precision and recall than PCT while reducing the parameter count by almost half, thereby improving detection speed by 23%. Zero-shot experiments on two publicly available datasets show that Res-CLIP achieves higher zero-shot accuracy than existing multi-modal point cloud networks, surpassing ULIP by 4.6% and 0.5%, respectively.
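
To make the zero-shot setting concrete, the following is a minimal sketch of how CLIP-style zero-shot classification is commonly performed: a point cloud embedding is compared with one text embedding per class by cosine similarity, and the most similar class is predicted. It is an illustration under stated assumptions, not the Res-CLIP implementation. The function names (`l2_normalize`, `cosine_similarity`, `zero_shot_classify`) and the three-dimensional toy embeddings are hypothetical; in practice the embeddings would come from the point cloud backbone and the CLIP text encoder.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    a, b = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(a, b))


def zero_shot_classify(point_feature, text_features):
    """Pick the class whose text embedding is closest to the point cloud embedding.

    point_feature: embedding of one point cloud (assumed to come from the backbone).
    text_features: dict mapping class name -> text embedding (assumed to come
                   from a text encoder such as CLIP's).
    """
    scores = {
        label: cosine_similarity(point_feature, feature)
        for label, feature in text_features.items()
    }
    best_label = max(scores, key=scores.get)
    return best_label, scores


# Toy, hand-written embeddings used only for illustration (not real model outputs).
text_features = {
    "chair": [1.0, 0.0, 0.0],
    "table": [0.0, 1.0, 0.0],
    "lamp": [0.0, 0.0, 1.0],
}
point_feature = [0.2, 0.9, 0.1]

label, scores = zero_shot_classify(point_feature, text_features)
print(label)
```

Because no class-specific classifier head is trained in this scheme, new categories can be added by supplying additional text embeddings, which is what makes evaluation on unseen datasets possible.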