Chen, Juntong2022-09-102022-09-102022-09-09vt_gsexam:35469http://hdl.handle.net/10919/111786StackOverflow (SO) is a widely used question-and-answer (QandA) website for software developers and computer scientists. GitHub is a code hosting platform for collaboration and version control. Popular software libraries are open-source and published in repositories on GitHub. Preliminary observation shows developers cite SO questions in their GitHub repository. This observation inspired us to explore the relationship between SO posts and GitHub repositories; to help software developers better understand the characterization of SO answers that are reused by GitHub projects. For this study, we conducted an empirical study to investigate the SO answers reused by Java code from public GitHub projects. We used a hybrid approach to ensure precise results: code clone detection, keyword-based search, and manual inspection. This approach helped us identify the leveraged answers from developers. Based on the identified answers, we further investigated the topics of the discussion threads; answer characteristics (e.g., scores, ages, code lengths, and text lengths) and developers' reuse practices. We observed both reused and unused answers. Compared with unused answers, We found that the reused answers mostly have higher scores, longer code, and longer plain text explanations. Most reused answers were related to implementing specific coding tasks. In one of our observations, 9% (40/430) of scenarios, developers entirely copied code from one or multiple answers of an SO discussion thread. Furthermore, we observed that in the other 91% (390/430) of scenarios, developers only partially reused code or created brand new code from scratch. We investigated 130 SO discussion threads referred to by Java developers in 356 GitHub projects. We then arranged those into five different categories. Our findings can help the SO community have a better distribution of programming knowledge and skills, as well as inspire future research related to SO and GitHub.ETDenIn CopyrightEmpiricalStackOverflowGitHubanswer reuseclone detectionHow Do Java Developers Reuse StackOverflow Answers in Their GitHub Projects?Thesis