CodeAttention: translating source code to comments by exploiting the code constructs

作者:Wenhao Zheng, Hongyu Zhou, Ming Li, Jianxin Wu

摘要

Appropriate comments of code snippets provide insight for code functionality, which are helpful for program comprehension. However, due to the great cost of authoring with the comments, many code projects do not contain adequate comments. Automatic comment generation techniques have been proposed to generate comments from pieces of code in order to alleviate the human efforts in annotating the code. Most existing approaches attempt to exploit certain correlations (usually manually given) between code and generated comments, which could be easily violated if coding patterns change and hence the performance of comment generation declines. In addition, recent approaches ignore exploiting the code constructs and leveraging the code snippets like plain text. Furthermore, previous datasets are also too small to validate the methods and show their advantage. In this paper, we propose a new attention mechanism called CodeAttention to translate code to comments, which is able to utilize the code constructs, such as critical statements, symbols and keywords. By focusing on these specific points, CodeAttention could understand the semantic meanings of code better than previous methods. To verify our approach in wider coding patterns, we build a large dataset from open projects in GitHub. Experimental results in this large dataset demonstrate that the proposed method has better performance over existing approaches in both objective and subjective evaluation. We also perform ablation studies to determine effects of different parts in CodeAttention.

论文关键词:software mining, machine learning, code comment generation, recurrent neural network, attention mechanism

论文评审过程:

论文官网地址:https://doi.org/10.1007/s11704-018-7457-6