Dr. Guang Wang, an Assistant Professor in the Computer Science Department, has had a paper accepted by the 50th International Conference on Very Large Data Bases (VLDB’24), which will be held in Guangzhou, China, in late August 2024. The paper is titled “Complex-Path: Effective and Efficient Node Ranking with Paths in Billion-Scale Heterogeneous Graphs”. Dr. Wang is the corresponding author.
Node ranking in heterogeneous graphs, which quantifies the relative importance of nodes, can often be improved by incorporating information from relevant paths. Graph databases and heterogeneous graph neural networks (HGNNs) are the two main approaches to this problem. Graph databases support efficient path queries for flexible path types but require manual design to combine the results for node ranking. Conversely, current HGNNs can automatically integrate semantic information from multiple linear path types for accurate node ranking. However, the authors' experiments show that current HGNNs fail to outperform a multi-layer perceptron model that uses features extracted from multiple nonlinear conditional paths, which graph databases can handle. The authors therefore aim to enable HGNNs to take advantage of these path types for better performance.
The difficulty is that HGNNs require a generalized path schema to define the structure of input paths, and each additional path type significantly increases the system memory and sampling time they need. To address these limitations, the paper introduces CompNode, a novel framework based on a new unified path schema definition called Complex-path, which describes all the required path types, including nonlinear conditional path types. The authors also design a pre-aggregation method that reduces the required system memory and sampling time by pre-aggregating complex-paths of the same type. Furthermore, they develop a model that combines semantic information from all aggregated complex-paths for accurate node ranking. Experiments with real-world datasets show that CompNode outperforms state-of-the-art HGNNs by 20% in average precision.
The paper will be presented at VLDB 2024 in Guangzhou, China, in August.