
How Do Developers Reuse StackOverflow Answers in Their GitHub Projects?

Date

2024-10-27

Publisher

ACM

Abstract

StackOverflow (SO) is a widely used question-and-answer (Q&A) website for software developers and computer scientists. GitHub is an online development platform used for storing, tracking, and collaborating on software projects. Prior work relates the information mined from both platforms without carefully inspecting answer-reuse practices. In this paper, we conducted an empirical study by mining the SO answers reused by Java projects available on GitHub. We created a hybrid approach that combines clone detection, keyword-based search, and manual inspection to identify the answer(s) actually used by developers. Based on those answers, we studied the topics of the discussion threads, answer characteristics (e.g., scores, ages, code lengths, and text lengths), and developers’ reuse practices.

We observed that most reused answers offer programs to implement specific coding tasks. Among all analyzed SO discussion threads, the reused answers often have higher scores, older ages, longer code, and longer text than unused answers. In only 9% of scenarios (40/430), developers fully copied answer code for reuse. In the remaining scenarios, they reused partial code or created brand-new code from scratch. Our study characterized 130 SO discussion threads referred to by Java developers in 357 GitHub projects. Our observations can guide SO answerers to provide better answers, and shed light on future human-centric research that creates better tools to help with code reuse.
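
As a rough illustration of what the keyword-based search step mentioned in the abstract might look like, the sketch below scans Java source text for links to StackOverflow posts. It is not the authors' tool: the regular expression, the function name `find_so_references`, and the sample input are all assumptions made for illustration, and the abstract does not say which keywords or patterns were actually used.

```python
import re
from typing import List, Tuple

# Assumed pattern (not from the paper): matches links of the form
# stackoverflow.com/questions/<id>, /q/<id>, or /a/<id>.
SO_LINK = re.compile(r"https?://(?:www\.)?stackoverflow\.com/(questions|q|a)/(\d+)")


def find_so_references(java_source: str) -> List[Tuple[str, int]]:
    """Return (kind, post_id) pairs for every SO link found in the source text.

    kind is "answer" for /a/<id> links and "question" otherwise.
    """
    refs = []
    for kind, post_id in SO_LINK.findall(java_source):
        refs.append(("answer" if kind == "a" else "question", int(post_id)))
    return refs


# Hypothetical Java snippet with two SO references in comments.
sample = (
    "// Adapted from https://stackoverflow.com/a/123456\n"
    "class A {}\n"
    "/* see http://www.stackoverflow.com/questions/42/some-title */\n"
)
print(find_so_references(sample))
```

A search like this only finds explicit references; according to the abstract, the study also relied on clone detection and manual inspection to decide which answer in a thread was actually reused.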
