Why U.S. science needs costly supercomputers; China could overtake US

Jack Dongarra spoke to the Friends of ORNL recently.

Thanks to Frontier, the Department of Energy’s first $600 million exascale supercomputer — located at Oak Ridge National Laboratory — the United States ranks first on the Top500 supercomputer list in the number of calculations that can be performed per second. But Jack Dongarra noted that China, which has two supercomputers ranked sixth and ninth on the list, could overtake the United States' computing capacity for solving scientific and technological problems.

Dongarra, who has appointments at the University of Tennessee, ORNL, and the University of Manchester in the United Kingdom, is this year’s winner of the prestigious A.M. Turing Award from the Association for Computing Machinery, often described as computing's equivalent of the Nobel Prize. It carries a $1 million prize.

Dongarra recently spoke to Friends of ORNL (FORNL).

Jack Dongarra is photographed inside his office on the University of Tennessee at Knoxville campus on Monday, May 2, 2022. Dongarra is the 2021 Turing Award recipient.

In 1993, he explained, he provided a benchmark, a software standard for comparing the performance of the world’s 500 fastest supercomputers by having each one solve a system of linear equations. The resulting Top500 list is published twice a year, and he and two others manage it.
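To make the idea of such a benchmark concrete, here is a minimal sketch in plain Python of what it means to time a linear-equation solve and convert the result into a FLOPS figure. It is an illustration only, not the code used for the Top500: the function names `solve_dense` and `benchmark` are invented for this example, the matrix is random, and the operation count of 2/3 n^3 + 2 n^2 is the figure commonly quoted for a dense solve.

```python
import random
import time


def solve_dense(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    a = [row[:] for row in a]  # work on copies so the caller's data is untouched
    b = b[:]
    for k in range(n):
        # Partial pivoting: move the largest remaining entry in column k to the diagonal.
        p = max(range(k, n), key=lambda i: abs(a[i][k]))
        a[k], a[p] = a[p], a[k]
        b[k], b[p] = b[p], b[k]
        for i in range(k + 1, n):
            m = a[i][k] / a[k][k]
            for j in range(k, n):
                a[i][j] -= m * a[k][j]
            b[i] -= m * b[k]
    # Back substitution on the resulting upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(a[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (b[i] - s) / a[i][i]
    return x


def benchmark(n=120, seed=0):
    """Time one solve of a random n-by-n system; return (FLOPS, max error)."""
    rng = random.Random(seed)
    a = [[rng.uniform(-1.0, 1.0) for _ in range(n)] for _ in range(n)]
    # Build the right-hand side so that the exact answer is all ones.
    b = [sum(row) for row in a]
    start = time.perf_counter()
    x = solve_dense(a, b)
    elapsed = time.perf_counter() - start
    flops = (2.0 / 3.0) * n**3 + 2.0 * n**2  # commonly quoted operation count
    max_error = max(abs(xi - 1.0) for xi in x)
    return flops / elapsed, max_error


if __name__ == "__main__":
    rate, err = benchmark()
    print(f"{rate / 1e6:.1f} megaflops, max error {err:.2e}")
```

An interpreted Python loop on a laptop will report a rate many orders of magnitude below an exaflop, which is the point of the comparison: the benchmark is the same kind of problem, only the machine changes.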

Jack Dongarra

Dongarra told FORNL that Frontier has executed calculations at a rate of 1.1 exaflops, or 1.1 quintillion calculations per second — that is, a billion times a billion floating point operations per second, or FLOPS (e.g., addition and multiplication of numbers with decimal points). Frontier is five times faster than the most powerful supercomputers in use today.

The latest TOP500 list shows that DOE’s Frontier supercomputer at ORNL is ranked No. 1 in the world.

To help his audience grasp the power of Frontier, which takes up the space of two tennis courts, Dongarra suggested that we imagine that UT has 60 Neyland Stadiums, each with 100,000 filled seats. To perform the number of calculations per second you can get on Frontier, you must give each person a laptop capable of 166 billion FLOPS and connect the laptops in all the stadiums together.
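The stadium comparison can be checked with a few lines of arithmetic. The sketch below uses only the numbers given in the analogy; the variable names are chosen for this example.

```python
stadiums = 60
seats_per_stadium = 100_000
laptop_flops = 166e9  # 166 billion FLOPS per laptop, as in the analogy

laptops = stadiums * seats_per_stadium   # 6,000,000 laptops in total
total_flops = laptops * laptop_flops     # combined calculations per second
exaflops = total_flops / 1e18            # one exaflop is 10**18 FLOPS

print(f"{laptops:,} laptops deliver about {exaflops:.3f} exaflops")
```

Six million laptops at that speed come to just under one exaflop, which is in the same range as the 1.1 exaflops quoted for Frontier.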

China

Dongarra has been out of touch with Chinese computer scientists for the past two years, but he has heard rumors that China may have built two exascale computers and has chosen to keep the achievement under wraps. He said that the Taiwan Semiconductor Manufacturing Co. supplies American and Chinese companies with state-of-the-art semiconductor fabrication for supercomputers. Taiwan is an independent democracy that has a strong partnership with the U.S. and resists becoming part of Communist China.

Dongarra noted that China leads the world with 173 supercomputers used for science. The U.S. comes in second with 126 supercomputers. One of the American supercomputers is Summit at ORNL, which once ranked first and now is fourth in the Top500. Dongarra said it will be disassembled when it is five years old because its maintenance cost will be too high.

Teams of dedicated people overcame numerous hurdles, including pandemic-related supply chain issues, to complete Frontier’s installation. Despite these challenges, delivery of the system took place from September to November 2021. Credit: Carlos Jones/ORNL, U.S. Dept. of Energy

The cost

The U.S. Department of Energy's Exascale Computing Program will cost taxpayers $3.6 billion over seven years. Next year, in addition to Frontier at ORNL, two more $600 million DOE exascale supercomputers — Aurora at Argonne National Laboratory and El Capitan at Lawrence Livermore National Laboratory — should be in operation, running 21 science applications.

Thanks to Frontier, a $600 million exascale supercomputer at Oak Ridge National Laboratory, the United States ranks first on the Top500 supercomputer list in the number of calculations that can be performed per second.

These exascale supercomputers will test mathematical models that simulate complex physical phenomena or designs, such as climate and weather, evolution of the cosmos, small modular reactors, new chemical compounds that might be used in vaccines or therapeutic drugs, power grids, wind turbines, and combustion engines.

Although rapid calculations can be made using cloud computing, Dongarra said that large, powerful supercomputers are needed to provide “better fidelity and more accuracy in our calculations” and to get “three-dimensional, fully realistic implementation of what scientists are trying to model, as well as better resolution as they find out at a deeper level what is going on at a finer scale.”

The equations being solved are based on the laws of physics, and most supercomputers are programmed using the Fortran and C++ languages.

Another purpose for bigger supercomputers, he added, is to “optimize a model” of, say, a combustion engine by “running thousands of different models with adjusted parameters” to identify an engine design that uses fuel with maximum efficiency and minimum emissions of pollutants.
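The pattern described here, running one model many times with adjusted parameters and keeping the best result, is often called a parameter sweep. The following is a toy sketch of that pattern, not a real engine simulation: `toy_engine_model`, its two parameters, its formulas, and the `emissions_weight` trade-off are all invented for illustration.

```python
from itertools import product


def toy_engine_model(air_fuel_ratio, spark_advance_deg):
    """Hypothetical stand-in for a real simulation; the formulas are invented."""
    efficiency = (0.40
                  - 0.002 * (air_fuel_ratio - 15.0) ** 2
                  - 0.0005 * (spark_advance_deg - 20.0) ** 2)
    emissions = (1.0
                 + 0.05 * (air_fuel_ratio - 16.0) ** 2
                 + 0.01 * (spark_advance_deg - 15.0) ** 2)
    return efficiency, emissions


def sweep(ratios, advances, emissions_weight=0.05):
    """Run the model for every parameter combination and keep the best score."""
    best = None
    for ratio, advance in product(ratios, advances):
        efficiency, emissions = toy_engine_model(ratio, advance)
        score = efficiency - emissions_weight * emissions  # reward efficiency, penalize emissions
        if best is None or score > best[0]:
            best = (score, ratio, advance)
    return best


ratios = [12.0 + 0.5 * i for i in range(13)]    # 12.0 to 18.0
advances = [10.0 + 0.5 * i for i in range(41)]  # 10.0 to 30.0
score, ratio, advance = sweep(ratios, advances)
print(f"best ratio {ratio}, best spark advance {advance}")
```

This toy sweep runs 533 cheap evaluations in a fraction of a second. When each evaluation is a full three-dimensional simulation, the same loop is what calls for a machine on the scale of Frontier.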

Dongarra said that computational simulations are the third pillar of science after theory and experimentation. He noted that researchers use computer models to get answers to some questions because, for example, it is too difficult to build large wind tunnels and too dangerous to try a new chemical on humans to see if it would be an effective drug.

We would have to wait too long to find out how much the climate would change if we doubled our combustion of fossil fuels, he indicated.

Frontier was built by Hewlett Packard Enterprise (HPE) using an interconnect from Cray (which HPE recently purchased) and almost eight million central processing units (CPUs) and graphics processing units (GPUs) made by Advanced Micro Devices (AMD), all of which must be coordinated. GPUs, which are used for videogames and video editing and have more transistors than CPUs, perform complex mathematical and geometric calculations needed for graphics rendering.

“The GPUs are providing 98% of the performance capability of Frontier,” Dongarra said. He added that the calculation speed could be quadrupled if the sizes of the numbers with decimal points were represented in a compressed way (e.g., in the case of pi, from 3.14159265 to 3.14). That could increase Frontier’s peak performance of 2 exaflops for modeling and simulation to 11.2 exaflops.
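The pi example can be reproduced on any computer. The sketch below assumes that the "compressed" representation means storing numbers in fewer bits, such as the standard 16-bit floating-point format; the article does not say which formats Frontier uses. It relies on Python's standard `struct` module, and `round_trip` is a helper name made up for this example.

```python
import math
import struct


def round_trip(value, fmt):
    """Store value in the given binary format and read it back."""
    return struct.unpack(fmt, struct.pack(fmt, value))[0]


# "<d", "<f" and "<e" are struct's codes for 64-, 32- and 16-bit floats.
for name, fmt in [("64-bit", "<d"), ("32-bit", "<f"), ("16-bit", "<e")]:
    stored = round_trip(math.pi, fmt)
    size = struct.calcsize(fmt)
    print(f"{name}: {size} bytes, pi stored as {stored!r}, "
          f"error {abs(stored - math.pi):.2e}")
```

In 16 bits, pi is stored as 3.140625, close to the "3.14" in the example above, and takes a quarter of the memory of the 64-bit value. Smaller numbers mean less data to move and more operations per second, at the cost of accuracy.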

To explain supercomputer performance, Dongarra used a car analogy.

“When driving your car, your speedometer might indicate you can go up to 160 miles per hour. That’s the theoretical potential of your car. But the highest speed you can achieve safely on the road is 120 mph. That’s what we are trying to measure — the achievable rate of execution in the supercomputer’s performance," he said.
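The analogy boils down to a ratio of achieved rate to theoretical peak. As a rough sketch, the function below applies that ratio to the car and to the two Frontier figures quoted in this article (1.1 exaflops achieved, 2 exaflops peak); `efficiency` is a name chosen for this example, and treating those two figures as directly comparable is an assumption.

```python
def efficiency(achieved, peak):
    """Fraction of the theoretical peak that is actually achieved."""
    return achieved / peak


car = efficiency(120, 160)        # mph on the road vs. mph on the speedometer
frontier = efficiency(1.1, 2.0)   # exaflops achieved vs. exaflops peak

print(f"car: {car:.0%} of peak, Frontier: {frontier:.0%} of peak")
```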

In his summary, Dongarra said that supercomputer hardware is constantly changing, so programmers must keep developing algorithms and designing software to match hardware capabilities. He added that a major revolution in high-performance computing will be the upcoming increase in the use of artificial intelligence (AI), the ability of a computer program or machine to learn and think like humans by finding patterns in a vast database.

An example is the supercomputers behind weather predictions, Dongarra said.

“Weather forecasting starts by constructing a three-dimensional model of today’s weather. AI learns by searching data on all the weather conditions over the past hundred years to get insights. Based on knowledge of those conditions and of physical laws, it can predict with accuracy what the weather will be in the next few days," he said.

At ORNL, Summit has already used AI for some of its simulations.

The future is here.

This article originally appeared on Oakridger: Why U.S. science needs costly supercomputers