A Positional Structure-Oriented Multimodal Code Summarization Generation Approach

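Chinese abstract: For the automatic code summarization task in software maintenance, an innovative model is proposed to address the shortcomings of existing methods in preserving the semantic and structural information of source code. The model uses graph neural networks and the Transformer to capture the semantic and structural information of code more comprehensively. In addition, the byte pair encoding algorithm is used to handle the out-of-vocabulary word problem, and the structural information of the abstract syntax tree is preserved in the form of quadruples. This combination allows the model not only to capture the semantic features of source code comprehensively, but also to learn its syntactic structure accurately. Experimental results on a real-world Java dataset show that the model outperforms the baseline models on the BLEU, METEOR, and ROUGE metrics, which verifies its effectiveness in generating more accurate code summaries.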

English title: A positional structure-oriented multimodal code summarization generation approach

English abstract: For the task of automatic code summarization in software maintenance, an innovative model was proposed to address the limitations of existing methods in preserving semantic and structural information from source code. This model leveraged graph neural networks and Transformer technology to comprehensively capture both semantic and structural aspects of code. Additionally, the byte pair encoding algorithm was employed to handle out-of-vocabulary words, and abstract syntax tree structure information was preserved using quadruples. This combination enabled the model not only to comprehensively capture the semantic features of source code but also to accurately learn its syntactic structure. Experimental results on a real-world Java dataset demonstrate that this model outperforms baseline models in terms of the BLEU, METEOR, and ROUGE metrics, validating its effectiveness in generating more accurate code summaries.
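To make the byte pair encoding step mentioned in the abstract concrete, the following is a minimal sketch of generic BPE applied to code identifiers. It is not the authors' implementation: the function names, the `</w>` end-of-word marker, and the toy identifier frequencies are all assumptions chosen for illustration. The sketch learns merge rules from frequent identifiers and then segments an unseen identifier into known subword units, which is how BPE sidesteps out-of-vocabulary tokens.

```python
from collections import Counter


def split_identifier(word):
    """Represent a word as a tuple of characters plus an end-of-word marker."""
    return tuple(word) + ("</w>",)


def count_pairs(vocab):
    """Count adjacent symbol pairs, weighted by word frequency."""
    pairs = Counter()
    for symbols, freq in vocab.items():
        for left, right in zip(symbols, symbols[1:]):
            pairs[(left, right)] += freq
    return pairs


def merge_pair(symbols, pair):
    """Replace every occurrence of `pair` in `symbols` with the merged symbol."""
    merged, i = [], 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            merged.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            merged.append(symbols[i])
            i += 1
    return tuple(merged)


def learn_bpe(word_freqs, num_merges):
    """Learn up to `num_merges` merge rules from a word -> frequency mapping."""
    vocab = Counter()
    for word, freq in word_freqs.items():
        vocab[split_identifier(word)] += freq
    merges = []
    for _ in range(num_merges):
        pairs = count_pairs(vocab)
        if not pairs:
            break
        # Most frequent pair wins; ties are broken by the pair itself
        # so that the result is deterministic.
        best = max(pairs, key=lambda p: (pairs[p], p))
        merges.append(best)
        new_vocab = Counter()
        for symbols, freq in vocab.items():
            new_vocab[merge_pair(symbols, best)] += freq
        vocab = new_vocab
    return merges


def encode(word, merges):
    """Segment a (possibly unseen) word by replaying the merges in learned order."""
    symbols = split_identifier(word)
    for pair in merges:
        symbols = merge_pair(symbols, pair)
    return list(symbols)


if __name__ == "__main__":
    # Hypothetical identifier frequencies, as might be gathered from a Java corpus.
    toy_freqs = {"getName": 5, "setName": 3, "getValue": 4, "setValue": 2}
    rules = learn_bpe(toy_freqs, num_merges=10)
    # "getUserName" never appears in the toy corpus, yet it is still
    # segmented into subword units rather than mapped to an unknown token.
    print(encode("getUserName", rules))
```

Because every segmentation is built from single characters upward, concatenating the output symbols always reproduces the original identifier, so no token is ever lost to an unknown-word placeholder.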

English keywords:

automatic code summarization; byte pair encoding; abstract syntax tree; Transformer

Authors:

Zhang Xuejun (张学君), Hou Xia (侯霞)

Author affiliation:

Computer School, Beijing Information Science and Technology University (北京信息科技大学计算机学院), Beijing 100101

Keywords:

automatic code summarization; byte pair encoding; abstract syntax tree; Transformer

Funding:

Beijing Natural Science Foundation, Youth Program

Project number:

4224090

Publication year:

2024

DOI:

10.16508/j.cnki.11-5866/n.2024.02.007

Journal of Beijing Information Science and Technology University (Natural Science Edition) (北京信息科技大学学报(自然科学版))

Impact factor: 0.363

ISSN: 1674-6864

Year, volume (issue): 2024, 39(2)

Number of references: 15