What it took to build the world’s first exascale supercomputer
By Shannon Smith, Teknovation Assistant Editor, PYA
The fastest supercomputer in the world is right here in East Tennessee. We all know that. Oak Ridge National Laboratory (ORNL) has had several supercomputers nab that title over the past decade.
But did you ever stop to think about how it’s physically made? This isn’t your standard desktop computer we’re talking about here. Frontier, the current supercomputer, is a beast.
Matt Sieger, Frontier Deputy Project Director within the Oak Ridge Leadership Computing Facility (OLCF), explained the construction process to a crowd at Jewelry Television (JTV) Thursday as the keynote speaker at the Knoxville Technology Council’s (KTech) PULSE Technology Summit.
Why do we care about supercomputers? Because they can do a lot of cool things for a lot of people.
“Since the Leadership Computing Facility was set up in 2004, we’ve delivered nine of these system upgrades, all of them, and this is our mantra, are on scope, on schedule, and on budget,” said Sieger. “Our mission is to field the most powerful machines for open science in the world.”
In the past, the supercomputers housed at ORNL have included Jaguar, Titan (named after the Tennessee Titans football team), and Summit (named after former Lady Vols Basketball Head Coach Pat Summitt). All three were the fastest in the world in their time.
So let’s chat about why Frontier is a big deal, even though its name doesn’t have a specific tie to Tennessee. It’s the fastest supercomputer in the world, the world’s first exascale supercomputer, and the first to ever break the exaflop computing barrier at 1.1 exaflops. Yes, that does sound like a made-up word, but an exaflop is 10 to the 18th floating point operations per second. That’s a 1 followed by 18 zeroes, and reaching that rate is what makes this computer exascale capable.
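To put that number in perspective, here is a rough back-of-the-envelope sketch in Python. The 1.1 exaflops figure comes from the talk; the laptop speed of 100 gigaflops is purely an assumption for illustration, not a measured value.

```python
# One exaflop: 10 to the 18th floating point operations per second.
EXAFLOP = 10**18

# Frontier's reported speed of 1.1 exaflops, kept as an exact integer.
FRONTIER_FLOPS = 11 * EXAFLOP // 10

# ASSUMPTION: a hypothetical laptop sustaining 100 gigaflops.
ASSUMED_LAPTOP_FLOPS = 100 * 10**9

SECONDS_PER_DAY = 86_400

# Work Frontier finishes in a single second at its reported speed.
ops_in_one_frontier_second = FRONTIER_FLOPS * 1

# Time the assumed laptop would need for the same amount of work.
laptop_seconds = ops_in_one_frontier_second / ASSUMED_LAPTOP_FLOPS
laptop_days = laptop_seconds / SECONDS_PER_DAY

print(f"Frontier: {FRONTIER_FLOPS:,} operations per second")
print(f"Assumed laptop: about {laptop_days:.0f} days to match one Frontier second")
```

Under that assumption, one second of Frontier's work would keep the laptop busy for roughly four months.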
“It’s also the most energy-efficient supercomputer in the world,” said Sieger. “It currently holds both number one and number two spots on the Green500, which is a ranking of energy efficiency.”
The number one spot is held by a single cabinet of Frontier.
“It’s the most powerful machine available in the world for doing artificial intelligence,” said Sieger.
Frontier’s physical makeup is just as imposing. It is made up of 9,408 nodes spread across 74 racks, which weigh about 8,000 pounds each.
“They all work together,” said Sieger. “They’re designed to take a problem, break it up, distribute it across all those different machines, and then crunch like crazy on it until you get your answer.”
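The "break it up, distribute it, crunch" pattern Sieger describes can be illustrated with a toy Python sketch. This is not how Frontier's software is written: threads on a single machine stand in for nodes here, and the function names and the sum-of-squares problem are hypothetical, chosen only to show the idea.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(values, num_workers):
    """Break the problem into at most num_workers roughly equal pieces."""
    if not values:
        return []
    chunk_size = -(-len(values) // num_workers)  # ceiling division
    return [values[i:i + chunk_size] for i in range(0, len(values), chunk_size)]


def partial_sum_of_squares(chunk):
    """The work each worker does on its own piece."""
    return sum(x * x for x in chunk)


def distributed_sum_of_squares(values, num_workers=4):
    """Distribute the pieces, let each worker crunch, then combine the answers."""
    chunks = split_into_chunks(values, num_workers)
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        partials = list(pool.map(partial_sum_of_squares, chunks))
    return sum(partials)


print(distributed_sum_of_squares(list(range(10)), num_workers=3))
```

Each worker only ever sees its own chunk, and the final answer comes from combining the partial results, which is the same shape of problem, at a vastly smaller scale.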
The Frontier system as a whole is rated to draw 29 megawatts of power, and it has 9.2 petabytes of total memory. “It’s crazy how powerful this machine is,” said Sieger.
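Those figures invite some quick arithmetic. The Python sketch below uses only the numbers quoted in this article and treats them as simple averages. It assumes decimal petabytes, and it divides peak speed by rated power, which is not the same as the power measured during a benchmark run, so the last figure should not be read as Frontier's official Green500 score.

```python
# Figures quoted in the article.
NODES = 9_408
RACKS = 74
RACK_WEIGHT_LB = 8_000
RATED_POWER_WATTS = 29 * 10**6        # 29 megawatts
PEAK_FLOPS = 11 * 10**17              # 1.1 exaflops
TOTAL_MEMORY_BYTES = 92 * 10**14      # 9.2 petabytes (ASSUMPTION: decimal units)

# Average number of nodes housed in each rack.
nodes_per_rack = NODES / RACKS

# Combined weight of all the racks.
total_weight_lb = RACKS * RACK_WEIGHT_LB

# Average memory per node, in terabytes.
memory_per_node_tb = TOTAL_MEMORY_BYTES / NODES / 10**12

# Rough efficiency from rated (not measured) power, in gigaflops per watt.
gflops_per_watt = PEAK_FLOPS / RATED_POWER_WATTS / 10**9

print(f"About {nodes_per_rack:.0f} nodes per rack")
print(f"{total_weight_lb:,} pounds of racks in total")
print(f"About {memory_per_node_tb:.2f} TB of memory per node")
print(f"Roughly {gflops_per_watt:.0f} gigaflops per watt at rated power")
```

That works out to nearly 600,000 pounds of hardware, which helps explain why the building itself needed so much work.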
Frontier launched in 2021 and reached full capabilities in 2022, but construction on the facility that would house it began in 2018. And it’s big.
“We got kicked out of our offices, which were renovated into transformers, and we’ve all had to share offices for the past four years,” said Sieger.
Frontier is a water-cooled system, and ORNL had to deliver about 4,000 gallons per minute of water to the machines. Sieger said the only place available to build a water cooling plant was at the other end of the building, which was occupied by laboratory space.
“We had to rip all that out and then build a pipe bridge that went from one building up over the roof and down into the other to deliver all that water to the Frontier system,” he said.
A lot more construction had to be done to accommodate the water and the sheer weight of these computing racks.
“We had to reinforce the foundations of the entire building, import a whole new slab back, and then build a support structure for all of the piping that had to go in there to carry all that water. So we call that the jungle gym. It was a network of girders and beams, which helps support all that power. And then we installed all the machinery to deliver that 30 megawatts of cooling power to Frontier,” said Sieger.
All the physical construction was done by local East Tennessee workers. On the subject of water, chemists currently monitor the water cooling system to make sure no bacteria start to grow inside.
Laying more power lines to Frontier also brought in environmental scientists, once the project team learned that the trees it needed to remove for the lines held bat nests for most of the year. See, supercomputers need more than just computer scientists. And no bats were harmed in the process.
Now, Frontier is used to help scientists across the world with their research. Full access for science applications is expected at the beginning of 2023.
“We have more requests for use than hours available,” said Sieger.
Examples of projects Frontier can soon help with include modeling the Earth’s climate, simulating the inside of a nuclear reactor, and running tests to see what materials and methods can get humans to the surface of Mars safely.
Thousands of people made Frontier possible, and it will continue helping solve the world’s problems with science.
Sieger is now leading the effort to procure the successor to Frontier. We’re still waiting to hear what its name will be.