Grouping related Stack Overflow comments for software developer recommendation
Authors: Viral Sheth, Kostadin Damevski
Abstract
Stack Overflow is a question and answer forum widely used by developers all over the world. Contributors share their knowledge on this platform not only in the form of answers, but also as comments on those answers. With millions of developer-contributed comments, the valuable knowledge they contain remains difficult for readers to locate. Moreover, Stack Overflow’s comment hiding mechanism, which shows only the five most highly voted comments and hides the rest, leads to wealth condensation, a rich-get-richer effect among comments. Recently, researchers have observed that Stack Overflow’s comment display mechanism hides important and relevant comments and makes it difficult for readers to understand the conversational context, because many visible comments are related to hidden ones. In this paper, we propose a set of features and a machine learning-based technique to identify the relatedness of pairs of comments. Further, we extend pairwise relatedness to comment clustering, since clusters give readers the entire context of a set of comments that form a single conversational thread. We evaluate our methods against several baselines and show that they provide strong improvements, although the problem in general is made difficult by the short text and narrow topic of discussion in the comments.
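The two steps named in the abstract, scoring pairs of comments for relatedness and then grouping related comments into clusters, can be illustrated with a minimal sketch. It is not the authors' method: the learned classifier and its features are replaced here by a plain token-overlap (Jaccard) score, and clustering is done by taking connected components over pairs whose score passes a threshold. The function names, the sample comments, and the threshold of 0.2 are all assumptions made for illustration.

```python
import re
from itertools import combinations


def tokens(text):
    """Lowercase a comment and split it into a set of word-like tokens."""
    return set(re.findall(r"[a-z0-9_@]+", text.lower()))


def jaccard(a, b):
    """Token-overlap score in [0, 1]; a stand-in for a learned relatedness model."""
    ta, tb = tokens(a), tokens(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def cluster_comments(comments, threshold=0.2):
    """Group comment indices into clusters of transitively related comments.

    Two comments are linked when their score reaches the (assumed) threshold;
    clusters are the connected components of that link graph (union-find).
    """
    parent = list(range(len(comments)))

    def find(i):
        # Follow parent links to the root, compressing the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(comments)), 2):
        if jaccard(comments[i], comments[j]) >= threshold:
            parent[find(j)] = find(i)

    groups = {}
    for i in range(len(comments)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


if __name__ == "__main__":
    # Hypothetical comments, not taken from the paper's dataset.
    sample = [
        "Use a HashMap instead of a list here",
        "HashMap lookup is faster than a list scan",
        "This breaks on Python 2",
        "Works fine on Python 3 though",
    ]
    for cluster in cluster_comments(sample):
        print([sample[i] for i in cluster])
```

Word overlap is a weak signal on short, narrowly focused comments, which is the difficulty the abstract points out and a reason a feature-based learned model may be preferable to a fixed similarity threshold.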
Keywords: Software developer discussions, Developer forums, Stack Overflow, Comment grouping, Comment ranking
Official paper URL: https://doi.org/10.1007/s10515-022-00339-9