Multiscale fusion dental model segmentation with fine-grained receptive fields

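Abstract (translated from Chinese): Objective Accurately segmenting teeth from intra-oral scanned point cloud models is an important task in computer-aided dental treatment, but performing it manually is time-consuming and tedious. In recent years, a number of end-to-end methods for 3D shape segmentation have emerged in computer vision. However, most of them overlook the fact that dental segmentation requires a network with a finer-grained receptive field, so segmentation accuracy remains limited. To address this problem, TRNet, an end-to-end and fully automatic tooth segmentation network with fine-grained receptive fields, is designed to segment teeth automatically on unprocessed intra-oral scanned point cloud models. Method First, TRNet uses an encoder with fine-grained receptive fields. The encoder extracts more comprehensive dental model features at different scales through multiscale fusion, and improves segmentation performance through a fine-grained grouping query radius better suited to dental model segmentation and a feature extraction layer with relative coordinate normalization. Second, TRNet adopts a feature embedding scheme based on hierarchical connections, so that the network learns key features ranging from individual local regions to larger spatial extents; feature extraction becomes more comprehensive and segmentation accuracy improves. In addition, TRNet uses a feature fusion scheme based on a soft attention mechanism, which helps the network attend to the key information of the dental model within the fused features. Result TRNet was evaluated on a dataset of patient intra-oral scanned point cloud models acquired with intra-oral scanners. In 5-fold cross-validation, TRNet achieved an overall accuracy (OA) of 97.015% ± 0.096% and a mean intersection over union (mIoU) of 92.691% ± 0.454%, significantly outperforming existing methods. Conclusion The experimental results show that the proposed multiscale fusion dental segmentation model with fine-grained receptive fields performs well on intra-oral scanned point cloud models, improves the network's ability to segment dental models, and yields more accurate point cloud segmentation results.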
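As a rough illustration of the fine-grained grouping query radius and the relative coordinate normalization mentioned in the abstract, the following minimal sketch groups the points that fall inside a small ball around a center and rescales their offsets. The abstract does not say how TRNet normalizes the coordinates; dividing by the query radius is an assumption made here, and the function names, radius, and sample points are hypothetical.

```python
import math

def ball_query(points, center, radius, max_neighbors):
    """Return up to max_neighbors points lying within `radius` of `center`."""
    neighbors = []
    for p in points:
        if math.dist(p, center) <= radius:
            neighbors.append(p)
            if len(neighbors) == max_neighbors:
                break
    return neighbors

def normalized_relative_coords(neighbors, center, radius):
    """Offsets of each neighbor from the center, divided by the radius.

    Dividing by the radius is an assumed normalization scheme; it keeps
    every offset component in [-1, 1] however small the radius is.
    """
    return [
        tuple((pc - cc) / radius for pc, cc in zip(p, center))
        for p in neighbors
    ]

# Hypothetical toy point cloud: three points near the origin, one far away.
points = [
    (0.00, 0.00, 0.00),
    (0.01, 0.00, 0.00),
    (0.00, 0.02, 0.00),
    (0.50, 0.50, 0.50),
]
center = (0.0, 0.0, 0.0)
radius = 0.05  # small query radius, i.e. a narrow receptive field

group = ball_query(points, center, radius, max_neighbors=8)
rel = normalized_relative_coords(group, center, radius)
print(len(group), rel)
```

Without the division, the raw offsets in this toy example would be at most 0.02 in magnitude, which is the situation the abstract describes as forcing the network to learn large weights.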

English title: Dental model segmentation network with fine-grained receptive fields and multiscale fusion

English abstract: Objective Dental computer-aided therapy relies on dental models to aid dentists in their practice. One of its most fundamental tasks is the automated segmentation of teeth from point cloud data obtained with intra-oral scanners (IOS). The precise segmentation of each individual tooth provides vital information for a variety of subsequent tasks. Segmented dental models facilitate customized treatment planning and modeling, thus providing extensive assistance in carrying out further treatments. However, the automated segmentation of individual teeth from dental models faces three significant challenges. First, the indistinct boundary between teeth and gums makes segmentation based solely on geometric features difficult. Second, factors such as occlusion during scanning can lead to suboptimal results, particularly in posterior dental regions, further complicating the segmentation process. Lastly, patients' teeth often exhibit complex anomalies, including crowding, missing teeth, and misalignment, which make accurate segmentation harder still. Two conventional families of methods have been proposed for segmenting teeth in images obtained from intra-oral scanners. The first is projection-based: a 3D dental scan image is projected into 2D space, segmentation is performed in 2D, and the result is remapped back into 3D space. The second is geometry-based and typically uses geometric attributes, such as surface curvature, geodesic information, harmonic fields, and other geometric properties, to distinguish tooth structures. However, these methods are not fully automated and rely on domain-specific knowledge and experience. Moreover, the predefined low-level attributes they use lack robustness when dealing with the complex appearance of patients' teeth. Given the success of convolutional neural networks (CNN) in computer vision and medical image processing, several deep learning methods rooted in CNN have been introduced. Some of these methods directly extract translation-invariant deep geometric features from 3D point cloud data but lack the receptive field needed for fine-grained tasks such as dental model segmentation. Moreover, their network structure exhibits redundancy and neglects crucial details of dental models. To address these issues, a fully automatic tooth segmentation network called TRNet is proposed in this paper, which can automatically segment teeth on unprocessed intra-oral scanned point cloud models.

Method In the proposed end-to-end, 3D point cloud-based, multiscale fusion dental model segmentation method, an encoder with a fine-grained receptive field is employed. Each tooth is relatively small in comparison with the entire dental model, and the boundaries between the teeth and gums lack distinct features. Consequently, a fine-grained receptive field is essential for extracting features from this model. The network adopts a small radius for querying the neighborhood, which narrows the receptive field and enables the network to focus on detailed features. Additionally, downsampling can make the density of the point cloud uneven, so that a network trained on sparse point clouds struggles to recognize fine-grained local structures. Multiscale feature fusion encoding is implemented to address these issues. Because the encoder uses a small query radius to create a fine-grained receptive field, the relative coordinates become small. Consequently, the network needs to learn large weights to operate on these relative coordinates, which makes network optimization harder. TRNet normalizes the relative coordinates in the feature extraction layer to facilitate network optimization and enhance segmentation performance. The network also employs a highly efficient decoder. Previous segmentation methods often use the U-Net structure, which incorporates skip connections for multi-level feature aggregation between the input features of each cascaded decoder stage and the outputs of the corresponding encoder layer. However, this top-down propagation is considered inefficient for feature aggregation. The decoding approach used by TRNet directly combines the features output by all cascaded encoder stages, allowing the network to learn the importance of each stage. Discrepancies in the scales or dimensions of the fused features may also introduce unwanted bias during fusion. To address this and ensure that the network focuses on crucial information within the fused features, a soft attention mechanism is incorporated into the fusion process. Specifically, a soft attention operation is performed on the newly combined features after they are concatenated, enabling the network to adaptively balance the discrepancies between different scales or levels in the propagated features.

Result A dataset was compiled of dental models taken from numerous patients with irregular teeth, including crowding, misalignment, and underdeveloped teeth. To establish the ground-truth labels, an experienced dentist meticulously segmented and annotated these models. The dataset was then randomly divided into two subsets, with 146 models allocated for training and 20 models reserved for testing. Data augmentation techniques, namely random translation and scaling, were employed to enhance the diversity of the training set. In each iteration, intra-oral scan images were shifted by a randomly selected value within the range [-0.1, 0.1] and scaled by a randomly chosen factor within the range [0.8, 1.25], thereby generating new training data. Experimental results from 5-fold cross-validation show that TRNet achieved an overall accuracy (OA) of 97.015% ± 0.096% and a mean intersection over union (mIoU) of 92.691% ± 0.454%, significantly outperforming existing methods.

Conclusion An end-to-end deep learning network called TRNet is introduced in this paper for the automatic segmentation of teeth in 3D dental images acquired from intra-oral scanners. An encoder with fine-grained receptive fields was implemented to enhance the local feature extraction capabilities essential for dental model segmentation. Additionally, a decoder based on hierarchical connections was employed to allow the network to decode efficiently by learning the significance of each level, which significantly improves the precision of dental model segmentation. A soft attention mechanism was also integrated into the feature fusion process to enable the network to focus on key information within dental model features. Experimental results indicate that TRNet performs very well on intra-oral scanned point cloud models and enhances the ability of the network to segment dental models, thereby improving the accuracy of point cloud segmentation results.
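The augmentation described in the Result section (a random shift in [-0.1, 0.1] and a random scale in [0.8, 1.25]) could look roughly like the sketch below. The abstract does not state whether the shift is drawn once or per axis, or in which order the two operations are applied; this sketch assumes a per-axis shift applied after a single uniform scale. The function name `augment` and the sample points are hypothetical.

```python
import random

def augment(points, rng, shift_range=(-0.1, 0.1), scale_range=(0.8, 1.25)):
    """Scale a point cloud uniformly, then shift it along each axis.

    Assumptions (not stated in the abstract): one scale factor per cloud,
    one independent shift per axis, and scaling applied before shifting.
    """
    scale = rng.uniform(*scale_range)
    shift = [rng.uniform(*shift_range) for _ in range(3)]
    moved = [
        tuple(c * scale + s for c, s in zip(p, shift))
        for p in points
    ]
    return moved, scale, shift

# Hypothetical toy cloud; a fixed seed makes the run reproducible.
rng = random.Random(0)
cloud = [(0.0, 0.0, 0.0), (1.0, 2.0, 3.0)]
augmented, scale, shift = augment(cloud, rng)
print(scale, shift, augmented)
```

Calling `augment` once per training iteration with the same `rng` produces a differently shifted and scaled copy of the scan each time, which is how the text describes new training data being generated.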

Keywords (English):

automatic dental model segmentation; point cloud; fine-grained receptive fields; multiscale feature fusion; coordinate normalization; soft attention mechanism

Authors:

周新文、朱洋、葛峻沂、潘钱家、魏然、顾敏

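Author affiliations:

School of Computer Science and Artificial Intelligence, Changzhou University, Changzhou 213164

The Third Affiliated Hospital of Soochow University / The First People's Hospital of Changzhou, Changzhou 213003

School of Medical and Health Engineering, Changzhou University, Changzhou 213164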

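Keywords (translated from Chinese):

automatic dental model segmentation; point cloud; fine-grained receptive field; multiscale feature fusion; coordinate normalization; soft attention mechanism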

Publication year:

2024

DOI:

10.11834/jig.230769

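Journal of Image and Graphics (中国图象图形学报)

Institute of Remote Sensing Applications, Chinese Academy of Sciences; China Society of Image and Graphics; Institute of Applied Physics and Computational Mathematics, Beijing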

CSTPCD; Peking University Core Journals (北大核心)

Impact factor: 1.111

ISSN: 1006-8961

Year, volume(issue): 2024, 29(12)