Sparsity-aware Kernel Selection for Edge-Connected Jaccard Similarity in Graph Datasets

Loading...
Thumbnail Image

TR Number

Date

2025-09-08

Journal Title

Journal ISSN

Volume Title

Publisher

ACM

Abstract

Performance of graph algorithms often depends both on the underlying hardware architecture and on structural properties of the input graph. Optimizations that deliver high performance on one class of graphs, such as hypersparse graphs with low average degree, can degrade performance on other classes, for example denser graphs with high average degree. In this work, we investigate sparsityaware GPU kernel selection for computing the Jaccard similarity index, a measure of neighborhood overlap in graph datasets. In our kernel selection approach, we use the vertex-centric Jaccard similarity implementation from the cuGraph library as the baseline and include both vertex- and edge-centric variants of this kernel, with set-intersection algorithms varying between two-pointer linear search, binary search, and adaptive dynamic search.

We use 80 real-world graphs in our evaluation with variation in average degree, maximum degree, Gini index, and average intersection cost. A random forest classifier, trained on a subset of these graphs on an NVIDIA A100 GPU, achieves 88.8% inference accuracy in predicting the fastest kernel. Kernels selected by the classifier achieve a 4.37× mean speedup over the vertex-centric cuGraph baseline from NVIDIA.

Description

Keywords

Citation