Ram Sharan Chaulagain, a PhD candidate in the Computer Science department, under the guidance of Dr. Xin Yuan, has made a significant contribution to interconnection networks. His paper on improving the state-of-the-art interconnect routing schemes, titled “Enhanced UGAL Routing Schemes for Dragonfly Networks,” has recently been accepted by the ACM International Conference on Supercomputing (ICS).
Dragonfly networks have been deployed in current-generation supercomputers such as Frontier, the world’s first exascale supercomputer, and will be deployed in future supercomputers and data centers. Effective routing on Dragonfly is challenging. Universal Globally Adaptive Load-balanced routing (UGAL) is the state-of-the-art routing algorithm for Dragonfly. For each packet, UGAL selects either a minimal path or a non-minimal path based on their estimated latencies. Practical UGAL makes routing decisions with local information, deriving the estimated latency of each path from the local queue occupancy and the path hop count. In this work, we develop techniques that improve the accuracy of UGAL's latency estimation with local information, which results in more effective routing decisions. In particular, our schemes can proactively mitigate potential network congestion under imbalanced network traffic. Extensive simulation experiments using synthetic traffic patterns and application workloads demonstrate that our enhanced UGAL schemes significantly improve routing performance for many common traffic conditions.
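To make the per-packet decision described above concrete, here is a minimal sketch of a baseline UGAL choice using only local information. It assumes the commonly cited estimate of latency as local queue occupancy multiplied by hop count. The names `PathCandidate`, `estimated_latency`, `ugal_choose`, and the `bias` parameter are hypothetical and chosen for illustration. The enhanced estimation techniques from the paper are not reproduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PathCandidate:
    """One candidate path as seen from the source router (hypothetical type)."""
    name: str
    queue_occupancy: int  # packets queued at the local output port for this path
    hop_count: int        # number of hops along the path


def estimated_latency(path: PathCandidate) -> int:
    # Assumption: latency is approximated as queue occupancy times hop count,
    # a common formulation for UGAL with local information.
    return path.queue_occupancy * path.hop_count


def ugal_choose(minimal: PathCandidate,
                non_minimal: PathCandidate,
                bias: int = 0) -> PathCandidate:
    # Prefer the minimal path unless the non-minimal path looks strictly
    # cheaper, even after adding the (hypothetical) bias in favor of minimal.
    if estimated_latency(minimal) <= estimated_latency(non_minimal) + bias:
        return minimal
    return non_minimal


if __name__ == "__main__":
    congested_min = PathCandidate("minimal", queue_occupancy=10, hop_count=3)
    idle_non_min = PathCandidate("non-minimal", queue_occupancy=4, hop_count=6)
    print(ugal_choose(congested_min, idle_non_min).name)
```

In this example the minimal path has an estimate of 30 and the non-minimal path an estimate of 24, so the packet is routed non-minimally. Because the estimate relies only on local queues, it can misjudge congestion further along a path, which is the kind of inaccuracy the work described above aims to reduce.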