The U.S. Department of Energy’s (DOE) Early Career Award (ECA) gave me the opportunity to advance the understanding of memory performance in extreme-scale, or exascale, computing. This understanding is key to increasing the accuracy and scalability of many mission-critical scientific simulations run on modern supercomputers.
Exascale computing systems often integrate thousands of computing units into a single chip. Each of these computing units needs data to process. As the number of computing units keeps increasing, the total demand for data grows rapidly. But the speed at which data moves from memory to processors increases much more slowly. This widening gap fundamentally limits the performance that exascale computing can currently achieve.
With the ECA support, I pioneered several research directions for narrowing this gap. These directions mainly concern data reorganization and code optimization on Graphics Processing Units (GPUs), an important type of processor in exascale systems.
One of the techniques deals with irregular memory accesses during program execution. Irregular memory accesses read or write data without a predictable pattern. Memory systems cannot serve them efficiently, so they slow down memory access. The new technique transforms a program so that, at runtime, its irregular accesses become regular. The transformation incurs little overhead, but data access speed increases substantially. Other techniques enable computing systems to flexibly manage the many parallel contexts on exascale systems to further speed up memory accesses.
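The idea of turning irregular accesses into regular ones can be illustrated with a small sketch. It is only an illustration under stated assumptions, not the implementation developed in this research: it models a GPU in plain Python, assumes that 32 threads issue memory requests together and that one memory transaction serves 32 consecutive words, and uses hypothetical names (`transactions_per_warp`, `reorder_data`). Each thread `i` originally reads `data[index[i]]`; after the data are copied into the order in which threads use them, thread `i` reads position `i`.

```python
WARP_SIZE = 32      # threads that issue memory requests together (assumed)
SEGMENT_WORDS = 32  # words served by one memory transaction (assumed)


def transactions_per_warp(addresses):
    """Count the distinct memory segments touched by each group of threads."""
    counts = []
    for start in range(0, len(addresses), WARP_SIZE):
        warp = addresses[start:start + WARP_SIZE]
        counts.append(len({a // SEGMENT_WORDS for a in warp}))
    return counts


def reorder_data(data, index):
    """Copy data so that thread i finds its element at position i."""
    packed = [data[j] for j in index]
    new_index = list(range(len(index)))
    return packed, new_index


# Hypothetical irregular pattern: every thread lands in a different segment.
n = 64
data = [float(v) for v in range(n * SEGMENT_WORDS)]
index = [(i * SEGMENT_WORDS) % len(data) for i in range(n)]

before = transactions_per_warp(index)
packed, new_index = reorder_data(data, index)
after = transactions_per_warp(new_index)

# The transformed program reads the same values through regular addresses.
same_values = all(packed[new_index[i]] == data[index[i]] for i in range(n))

print("transactions per warp before:", before)
print("transactions per warp after: ", after)
```

In this toy model, the copy is the overhead of the transformation, and the drop in transactions per group of threads is the benefit. A real system would have to perform the reorganization while the program runs and keep its cost low.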
These techniques have laid important foundations for code optimization on exascale systems. They have influenced the development and improvement of numerous pieces of modern software, in high-performance computing and beyond. They have inspired many studies and received over 3,000 citations. By greatly shortening simulation times, these techniques have helped accelerate scientific research and discoveries.
Xipeng Shen is a professor of computer science at North Carolina State University.
The Early Career Research Program provides financial support that is foundational to early career investigators, enabling them to define and direct independent research in areas important to DOE missions. The development of outstanding scientists and research leaders is of paramount importance to the Department of Energy Office of Science. By investing in the next generation of researchers, the Office of Science champions lifelong careers in discovery science.
For more information, please go to the Early Career Research Program.
Data Locality Enhancement of Dynamic Simulations for Exascale Computing
Computer simulation is important for scientific research in many disciplines. Many simulation programs are complex and transfer large amounts of data in dynamically changing patterns. Memory performance is key to maximizing computing efficiency in the era of chip multiprocessors (CMPs) because of the growing disparity between slowly expanding memory bandwidth and the rapidly increasing demand for data by processors.
This importance is underscored by the trend toward exascale computing, in which each processor is expected to contain hundreds or thousands of (heterogeneous) cores. Unfortunately, today’s computer systems lack support for a high degree of memory transfer. This project proposes to improve the memory performance of dynamic applications by developing two new techniques tailored to the emerging features of CMPs.
The first technique is asynchronous streamlining, which analyzes the memory reference patterns of an application during runtime and regulates both control flows and memory references on the fly.
The second technique is neighborhood‐aware locality optimization, which concentrates on the non‐uniform relations among computing elements.
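As a rough illustration of what regulating control flows on the fly, as in the first technique, could look like, the sketch below reassigns tasks to threads so that threads running in lockstep take the same branch. It is a simplified model under stated assumptions, not the project's method: the group size of 32 threads, the function names (`divergent_warps`, `swap_jobs`), and the alternating workload are all hypothetical, and the asynchronous aspect of the technique is not modeled.

```python
WARP_SIZE = 32  # threads that execute in lockstep (assumed)


def divergent_warps(flags):
    """Count groups of threads whose members disagree on the branch condition."""
    count = 0
    for start in range(0, len(flags), WARP_SIZE):
        warp = flags[start:start + WARP_SIZE]
        if len(set(warp)) > 1:
            count += 1
    return count


def swap_jobs(flags):
    """Return a task order that places same-branch tasks next to each other."""
    # sorted() is stable, so tasks keep their relative order within each group.
    return sorted(range(len(flags)), key=lambda task: flags[task])


# Hypothetical workload: the branch outcome alternates between neighboring tasks.
flags = [task % 2 == 0 for task in range(128)]

order = swap_jobs(flags)  # order[t] is the task that thread t now runs
regulated = [flags[task] for task in order]

warps_before = divergent_warps(flags)
warps_after = divergent_warps(regulated)

print("divergent groups before:", warps_before)
print("divergent groups after: ", warps_after)
```

Every task still runs exactly once, because `order` is a permutation of the task numbers. Only the mapping from threads to tasks changes.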
This research will produce a robust tool that lets scientific users enhance program locality on multi‐ and many‐core systems to a degree not achievable with existing tools. Further, it will contribute to the advancement of computational sciences and promote academic research and education in the challenging field of scientific computing.
E.Z. Zhang, Y. Jiang, Z. Guo, K. Tian, and X. Shen, "On-the-fly elimination of dynamic irregularities for GPU computing." Proceedings of the Sixteenth International Conference on Architectural Support for Programming Languages and Operating Systems, pages 369-380 (March 2011). [DOI: 10.1145/1950365.1950408]
G. Chen, B. Wu, D. Li, and X. Shen, "PORPLE: An extensible optimizer for portable data placement on GPU." The 47th Annual IEEE/ACM International Symposium on Microarchitecture (December 2014). [DOI: 10.1109/MICRO.2014.20]
G. Chen, X. Shen, B. Wu, and D. Li, "Optimizing data placement on GPU memory: A portable approach." IEEE Transactions on Computers 66 (2017). [DOI: 10.1109/TC.2016.2604372]
DOE Explains… offers straightforward explanations of key words and concepts in fundamental science. It also describes how these concepts apply to the work that the Department of Energy’s Office of Science conducts as it helps the United States excel in research across the scientific spectrum. For more information on exascale computing and DOE’s research in this area, please go to “DOE Explains… Exascale Computing.”
Additional profiles of the Early Career Research Program award recipients can be found at /science/listings/early-career-program.
The Office of Science is the single largest supporter of basic research in the physical sciences in the United States and is working to address some of the most pressing challenges of our time. For more information, please visit www.energy.gov/science.