Wanye, FrankGleyzer, VitaliyKao, EdwardFeng, Wu-chun2024-03-042024-03-042024-01-252334-329Xhttps://hdl.handle.net/10919/118262Community detection is a well-studied problem with applications in domains ranging from networking to bioinformatics. Due to the rapid growth in the volume of real-world data, there is growing interest in accelerating contemporary community detection algorithms. However, the more accurate and statistically robust methods tend to be hard to parallelize. One such method is stochastic block partitioning&#x00A0;(SBP) &#x2013; a community detection algorithm that works well on graphs with complex and heterogeneous community structure. In this paper, we present a <underline>sam</underline>pling-<underline>ba</underline>sed <underline>S</underline>BP&#x00A0;(<monospace>SamBaS</monospace>) for accelerating SBP on sparse graphs. We characterize how various graph parameters affect the speedup and result quality of community detection with <monospace>SamBaS</monospace> and quantify the trade-offs therein. To evaluate <monospace>SamBas</monospace> on real-world web graphs without known ground-truth communities, we introduce <underline>p</underline>artition <underline>q</underline>uality <underline>s</underline>core (PQS), an evaluation metric that outperforms modularity in terms of correlation with F1 score. Overall, <monospace>SamBaS</monospace> achieves speedups of up to 10&#x00D7; while maintaining result quality (and even improving result quality by over 150&#x0025; on certain graphs, relative to F1 score).Pages 1-13In CopyrightSamBaS: <b>Sam</b>pling-<b>Ba</b>sed <b>S</b>tochastic Block PartitioningArticle - RefereedIEEE Transactions on Network Science and Engineeringhttps://doi.org/10.1109/TNSE.2024.3358301PP992327-4697