Wuhan University Journal of Natural Sciences (English Edition), 2024, Vol. 29, Issue 4: 349-356. DOI: 10.1051/wujns/2024294349

Ask Me Any Type: Type Inference Plugin for Partial Code on the Web and in the Integrated Development Environment

Cheng Yu (程煜)¹, Huang Guanming (黄冠鸣)², Wu Yishun (吴贻顺)¹, Zhao Zijie (赵梓杰)¹, He Zhenhao (何祯豪)¹, Lu Jiaxing (卢家兴)¹

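Author information

  • 1. School of Computer and Information Engineering, Jiangxi Normal University, Nanchang 330022, Jiangxi, China
  • 2. The Affiliated High School of Jiangxi Normal University, Nanchang 330013, Jiangxi, China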

Abstract

Inferring the fully qualified names (FQNs) of undeclared receiving objects and non-fully-qualified type names (non-FQNs) in partial code is critical for effectively searching, understanding, and reusing partial code. Existing type inference tools, such as COSTER and SNR, rely on a symbolic knowledge base and adopt a dictionary-lookup strategy to map the simple names of undeclared receiving objects and non-FQNs to FQNs. However, building a symbolic knowledge base requires parsing compilable code files, which limits the collection of APIs and code contexts and leads to out-of-vocabulary (OOV) failures, where the FQN being sought is not in the knowledge base. To overcome these limitations of a symbolic knowledge base for FQN inference, we implemented Ask Me Any Type (AMAT), a type inference plugin embedded in web browsers and integrated development environments (IDEs). Instead of the dictionary-lookup strategy, AMAT uses a cloze-style fill-in-the-blank strategy for type inference. By treating code as text, AMAT leverages a fine-tuned large language model (LLM) as a neural knowledge base, thereby avoiding the need for code compilation. Experimental results show that AMAT outperforms state-of-the-art tools such as COSTER and SNR. In practice, developers can use AMAT to infer the FQNs of unresolved type names in real time and directly reuse partial code.
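The contrast between the two strategies can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the toy knowledge base `SYMBOLIC_KB`, the `<mask>` placeholder, the prompt wording, and the helper names `lookup_fqn` and `build_cloze_prompt` are all hypothetical. The sketch only builds the cloze prompt; filling the blank would be the job of a fine-tuned language model, which is not called here.

```python
from typing import Optional

# Hypothetical symbolic knowledge base: simple name -> FQN.
# In the tools described above, such a table is built by parsing compilable code.
SYMBOLIC_KB = {
    "List": "java.util.List",
    "File": "java.io.File",
}

# Placeholder for the blank; the actual mask token depends on the model (assumption).
MASK = "<mask>"


def lookup_fqn(simple_name: str) -> Optional[str]:
    """Dictionary-lookup strategy: returns None when the name is out of vocabulary."""
    return SYMBOLIC_KB.get(simple_name)


def build_cloze_prompt(code: str, simple_name: str) -> str:
    """Cloze-style strategy: treat the code as plain text and leave a blank
    where the package prefix of the simple name should go."""
    return f"{code}\n// The fully qualified name of {simple_name} is {MASK}.{simple_name}"


if __name__ == "__main__":
    snippet = "Gson gson = new Gson();"
    # The lookup fails because "Gson" was never collected into the table.
    print(lookup_fqn("Gson"))
    # The cloze prompt needs no table entry and no compilation of the snippet.
    print(build_cloze_prompt(snippet, "Gson"))
```

Under these assumptions, the lookup can only answer for names that were collected in advance, whereas the cloze prompt can be built for any snippet, compilable or not, and leaves the answer to the model.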

Key words

type inference / large language model / prompt learning / web and integrated development environment (IDE) plugin

Funding

Key Scientific and Technological Research Projects of the Jiangxi Provincial Department of Education (GJJ2200303)

National Social Science Foundation Major Bidding Project (20&ZD068)

Publication year

2024
Wuhan University Journal of Natural Sciences (English Edition)
Wuhan University

CSTPCD
Impact factor: 0.066
ISSN: 1007-1202