Postdoctoral Researcher - Software-Defined Networking for HPC Systems

Updated: over 1 year ago
Job Type: Full-time
Deadline: 14 Dec 2022

Investigate how software-defined networking can improve large-scale HPC systems

HPC systems often use complex low-diameter network topologies to balance network bandwidth, latency and construction cost. The interconnection between the thousands of nodes in such a system is recognized as an important performance bottleneck. The static architecture of HPC systems limits their usefulness to a few selected applications: the ideal topology for applications requiring large bisection bandwidth differs from the one needed for more localized workloads, which leads to mediocre performance when executing dynamic applications.

Furthermore, the network typically only extends between nodes and does not form an integral part of the compute nodes themselves; that is, the Network-on-Chip used within the compute nodes is not co-optimized with the global interconnect network. In this position, you will investigate how the principles of software-defined networking and time-sensitive networking can improve the performance of these large-scale systems, both from a theoretical perspective and from a practical implementation standpoint. Because the network is an integral part of a system's architecture, you are expected to work across disciplines with the people developing the software programming models, IC architecture and system performance models.

This project is an initiative of the Compute Systems Architecture Unit (CSA). The CSA unit researches emerging workloads and their performance on large-scale supercomputer architectures for next-generation artificial intelligence (AI) and high-performance computing (HPC) applications. The team is responsible for algorithm research, runtime management innovations, performance modeling, architecture simulation and prototyping for these future applications and the systems that will execute them, with the goal of improving performance, energy efficiency and total cost of ownership by multiple orders of magnitude.


