Identifying Product Defects from User Complaints: A Probabilistic Defect Model

Department of Computer Science, Virginia Polytechnic Institute & State University


The recent surge in social media use has created a massive volume of unstructured textual complaints about products and services. However, discovering and quantifying potential product defects from large amounts of unstructured text is a nontrivial task. In this paper, we develop a probabilistic defect model (PDM) that simultaneously identifies the most critical product issues and their corresponding product attributes. We leverage domain-oriented key attributes of a product (e.g., product model, year of production, defective components, and symptoms) to identify and acquire the essential information about a defect. We conduct comprehensive quantitative and qualitative evaluations to ensure the quality of the discovered information. Experimental results demonstrate that our proposed model outperforms an existing unsupervised method (K-Means clustering) and can find more valuable information. Our research has significant managerial implications for managers, manufacturers, and policy makers.



Data and text mining