Solving global shallow water equations on heterogeneous supercomputers

PLoS One. 2017 Mar 10;12(3):e0172583. doi: 10.1371/journal.pone.0172583. eCollection 2017.

Abstract

The scientific demand for more accurate modeling of the climate system calls for more computing power to support higher resolutions, more component models, more complicated physics schemes, and larger ensembles. Because recent improvements in computing power come mostly from larger node counts and the integration of heterogeneous accelerators, scaling computations across more nodes and different kinds of accelerators has become a challenge for model development. This paper describes our efforts to develop a highly scalable framework for global atmospheric modeling on heterogeneous supercomputers equipped with various accelerators, such as GPU (Graphics Processing Unit), MIC (Many Integrated Core), and FPGA (Field-Programmable Gate Array) cards. We propose a generalized partition scheme for the problem domain that keeps the utilization of CPU and accelerator resources balanced. With optimizations of both computation and memory access patterns, we achieve speedups of around 8 to 20 times when comparing one hybrid GPU or MIC node with one 12-core CPU node. Using customized FPGA-based data-flow engines, we see the potential for a further 5 to 8 times improvement in performance. On heterogeneous supercomputers such as Tianhe-1A and Tianhe-2, our framework achieves near-ideal linear scaling efficiency and sustained double-precision performance of 581 Tflops on Tianhe-1A (using 3750 nodes) and 3.74 Pflops on Tianhe-2 (using 8644 nodes). Our study also evaluates the programming paradigms of the various accelerator architectures (GPU, MIC, FPGA) for global atmospheric simulation, giving a picture of both the potential performance benefits and the programming effort involved.
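The abstract does not spell out the partition scheme, so the following is only a minimal sketch of the general idea of balancing work between a CPU and an accelerator. It assumes a subdomain is divided by rows and that each device's throughput (rows per second) is known. The function names `split_rows` and `step_time` and the throughput figures are hypothetical and do not come from the paper.

```python
def split_rows(n_rows, cpu_rate, acc_rate):
    """Split n_rows between CPU and accelerator in proportion to throughput.

    cpu_rate and acc_rate are assumed throughputs (rows per second); the
    proportional rule is an illustrative assumption, not the paper's scheme.
    """
    if n_rows < 0 or cpu_rate <= 0 or acc_rate <= 0:
        raise ValueError("n_rows must be >= 0 and both rates must be > 0")
    # The accelerator receives its proportional share, rounded to whole rows.
    acc_rows = round(n_rows * acc_rate / (cpu_rate + acc_rate))
    # The CPU takes the remainder, so every row is assigned exactly once.
    cpu_rows = n_rows - acc_rows
    return cpu_rows, acc_rows


def step_time(cpu_rows, acc_rows, cpu_rate, acc_rate):
    """Time for one step when both devices run concurrently.

    The step finishes only when the slower device finishes.
    """
    if cpu_rate <= 0 or acc_rate <= 0:
        raise ValueError("both rates must be > 0")
    return max(cpu_rows / cpu_rate, acc_rows / acc_rate)


if __name__ == "__main__":
    # Hypothetical case: the accelerator is 9x faster than the CPU cores.
    cpu_rows, acc_rows = split_rows(1000, cpu_rate=1.0, acc_rate=9.0)
    print(cpu_rows, acc_rows)
    print(step_time(cpu_rows, acc_rows, 1.0, 9.0))
```

With a proportional split, both devices finish at about the same time, so neither sits idle. Giving all rows to the accelerator alone would take longer in this toy model (1000 / 9.0 seconds instead of 100).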

MeSH terms

  • Algorithms
  • Climate
  • Computer Simulation*
  • Computers
  • Water / chemistry

Substances

  • Water

Grants and funding

This work was supported in part by the National Key R&D Program of China (863 Grant No. 2016YFA0602200, 2013AA01A208), the National Natural Science Foundation of China (grant no. 61303003, 41374113, 91530323, 91530103 and 61361120098), the National High-Tech R&D Program of China (grant no. 2013AA01A208 and 2015AA01A302), China Postdoctoral Science Foundation (no. 2016M601031), Tsinghua University Initiative Scientific Research Program (no. 20131089356), China Special Fund for Meteorological Research in the Public Interest under Grant No. GYHY201306062, UK EPSRC, the European Union Seventh Framework Programme (grant agreement no. 257906, 287804, and 318521), HiPEAC NoE, and the Maxeler University Program. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.