Dr. Shayok Chakraborty has had a paper accepted at the Neural Information Processing Systems (NeurIPS) 2024 conference, a flagship venue in machine learning and AI. The paper, titled “Empowering Active Learning for 3D Molecular Graphs with Geometric Graph Isomorphism,” is a collaboration with Prof. Yi Liu at Stony Brook University. Ronast Subedi, an MS student in the Computer Science department at FSU, is a joint primary author of the paper, together with Wenhan Gao, a PhD student at Stony Brook University.

Molecular learning is pivotal in many real-world applications, such as drug discovery. Supervised learning requires heavy human annotation, which is particularly challenging for molecular data. Active learning (AL) automatically queries labels for the most informative samples, thereby substantially easing the annotation burden. In this paper, the authors present a principled AL paradigm for molecular learning, in which molecules are treated as 3D molecular graphs. Specifically, the authors propose a new diversity sampling method, built on distributions of 3D geometries, to eliminate mutual redundancy among the selected samples. They first propose a set of new 3D graph isometries for 3D graph isomorphism analysis. The proposed method is provably at least as expressive as the Geometric Weisfeiler-Lehman (GWL) test. The moments of the distributions of the associated geometries are then extracted for efficient diversity computation. To ensure that the AL paradigm selects samples with maximal uncertainty, the authors design a Bayesian geometric graph neural network to compute uncertainties specifically for 3D molecular graphs. Active sampling is then formulated as a quadratic programming (QP) problem using the proposed components. Experimental results demonstrate the effectiveness of the AL paradigm, as well as of the proposed diversity and uncertainty methods.
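For readers who want a feel for how moments of geometric distributions and an uncertainty score can drive sample selection, the following is a minimal illustrative sketch, not the authors' implementation. It assumes the only geometry used is the set of pairwise atom distances, and it replaces the paper's QP formulation with a simple greedy heuristic. The function names `distance_moments` and `greedy_select`, the `trade_off` parameter, and the uncertainty values passed in are all hypothetical.

```python
import numpy as np


def distance_moments(coords, num_moments=4):
    """Summarize a molecule by moments of its pairwise-distance distribution.

    coords: array of shape (n_atoms, 3). Pairwise distances do not change
    under rotation or translation, so neither does this descriptor.
    Assumption: the paper uses richer geometries than distances alone.
    """
    diffs = coords[:, None, :] - coords[None, :, :]
    dists = np.linalg.norm(diffs, axis=-1)
    upper = np.triu_indices(len(coords), k=1)  # each atom pair once
    d = dists[upper]
    mean = d.mean()
    centered = d - mean
    # First entry is the mean; the rest are central moments of order 2..num_moments.
    moments = [mean] + [np.mean(centered ** k) for k in range(2, num_moments + 1)]
    return np.array(moments)


def greedy_select(descriptors, uncertainties, budget, trade_off=1.0):
    """Greedy stand-in for the QP: favor uncertain samples far from those already picked."""
    descriptors = np.asarray(descriptors, dtype=float)
    selected = []
    remaining = list(range(len(descriptors)))
    for _ in range(min(budget, len(remaining))):
        best, best_score = None, -np.inf
        for i in remaining:
            if selected:
                # Diversity term: distance to the nearest already-selected sample.
                min_dist = min(
                    np.linalg.norm(descriptors[i] - descriptors[j]) for j in selected
                )
            else:
                min_dist = 0.0
            score = uncertainties[i] + trade_off * min_dist
            if score > best_score:
                best, best_score = i, score
        selected.append(best)
        remaining.remove(best)
    return selected
```

In this sketch the uncertainty values would come from some predictive model (the paper uses a Bayesian geometric graph neural network); here they are simply supplied as numbers. The greedy loop only approximates the balance between uncertainty and diversity that a QP solver would optimize jointly.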

The paper will be presented at the NeurIPS conference in Vancouver, Canada, in December 2024.