The flowing nature matters: feature learning from the control flow graph of source code for bug localization

作者:Yi-Fan Ma, Ming Li

摘要

Bug localization plays an important role in software maintenance. Traditional works treat the source code from the lexical perspective, while some recent researches indicate that exploiting the program structure is beneficial for improving bug localization. Control flow graph (CFG) is a widely used graph representation, which essentially represents the program structure. Although using graph neural network for feature learning is a straightforward way and has been proven effective in various software mining problems, this approach is inappropriate since adjacent nodes in the CFG could be totally unrelated in semantics. On the other hand, previous statements may affect the semantics of subsequent statements along the execution path, which we call the flowing nature of control flow graph. In this paper, we claim that the flowing nature should be explicitly considered and propose a novel model named cFlow for bug localization, which employs a particular designed flow-based GRU for feature learning from the CFG. The flow-based GRU exploits the program structure represented by the CFG to transmit the semantics of statements along the execution path, which reflects the flowing nature. Experimental results on widely-used real-world software projects show that cFlow significantly outperforms the state-of-the-art bug localization methods, indicating that exploiting the program structure from the CFG with respect to the flowing nature is beneficial for improving bug localization.

论文关键词:Control flow graph, Bug localization, Flow-based GRU, Software mining

论文评审过程:

论文官网地址:https://doi.org/10.1007/s10994-021-06078-4