CodeQG: Automated Multiple Question Generation for Source Code Comprehension

During software maintenance and evolution, developers spend more than half of their time on code comprehension activities. To understand an unfamiliar code base, they naturally ask different types of questions about code snippets and try to find the answers. In this paper, we present an initial study exploring the possibility of automatic question generation for program comprehension. We construct a large-scale data set of source code and question pairs, in which the questions are automatically transformed from inline comments using dependency analysis and semantic role labeling. We also build a comprehensive taxonomy of question types so that questions can be generated about different aspects of a code snippet, such as its purpose and implementation details. We then propose CodeQG, a deep learning-based prototype that automatically generates multiple types of questions for code snippets. We evaluate CodeQG using both standard performance metrics and manual evaluation. The results show that (1) the generated questions achieve a BLEU4 score of 42.02 and a ROUGE-L score of 60.81; (2) overall, the questions are largely correct in grammar, semantics, and format; and (3) the questions are relevant to the corresponding code snippets and help developers in source code comprehension activities. Our work gives insights into automatically generating multiple types of questions for code comprehension. We expect this exploration to improve the applicability and generality of machine code comprehension.

Code question generation; code comprehension; data sets construction; neural networks

Xiaowei Zhang, Lin Chen, Kaiyuan Qi, Weiqin Zou, Liye Pang, Lianfa Zhang, Peng Zhang, Guanqun Xu, Dong Zhang

Inspur Group, Co., Ltd., Jinan 250101, P. R. China||Jinan Inspur Data Technology Co., Ltd., Jinan 250101, P. R. China||State Key Laboratory of High-End Server System, Jinan 250101, P. R. China

State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing 210023, P. R. China

Jinan Inspur Data Technology Co., Ltd., Jinan 250101, P. R. China||State Key Laboratory of High-End Server System, Jinan 250101, P. R. China

Nanjing University of Aeronautics and Astronautics, Nanjing 210016, P. R. China||State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing 210023, P. R. China

Jinan Inspur Data Technology Co., Ltd., Jinan 250101, P. R. China

IEIT SYSTEMS Co., Ltd., Jinan 250101, P. R. China||State Key Laboratory of High-End Server System, Jinan 250101, P. R. China

2025

International Journal of Software Engineering and Knowledge Engineering
  • 48