HPC, supercomputing share stage with enterprise AI at SC24

AI's effects on supercomputing helped shift the focus toward enterprise demands at SC24, as vendors -- including Dell and HPE -- showcased new offerings.

Adam Armstrong, News Writer

Published: 22 Nov 2024

The annual Supercomputing Conference is typically aimed at high-performance computing and academia, but AI is starting to change that.

This year at SC24, IT vendors including Dell, HPE, Weka, Pure Storage and DDN not only unveiled new supercomputing products but also focused on enterprise offerings that suggest HPC's shift from research and academia to the enterprise. Some vendors, such as HPE and Dell, also highlighted liquid cooling at the show, as the technique for cooling infrastructure gains interest due to the heat generation and energy demands of GPUs in AI workloads.

While the Supercomputing Conference has been around since 1988, the last 10 years have brought a noticeable shift from displays of distant-future enterprise technology to technology that is available today, according to Matt Kimball, an analyst at Moor Insights & Strategy.

"The rate and pace of innovation has drastically accelerated from the innovators to the deployers," he said.

AI and the enterprise

Enterprise and supercomputing might have different goals and processes, but they share several of the same vendors, according to Camberley Bates, an analyst at The Futurum Group.

HPE owns HPC maker Cray. It also built the AMD-powered Frontier supercomputer at Oak Ridge National Laboratory, a federally funded science and research lab in Oak Ridge, Tenn. And Dell powers the Texas Advanced Computing Center supercomputer in Austin. Enterprise infrastructure isn't becoming the dominant draw at SC, but AI is beginning to blur the distinction between supercomputing and HPC workloads on one side and enterprise workloads on the other.

This is causing a shift in who comes to SC conferences and how the show is presented, Bates said.

Another shift is the rapid adoption of AI workloads. Next year, AI will fully come into the enterprise as companies move beyond proofs of concept, according to Steve McDowell, founder and analyst at NAND Research.

"Large language models are going to bring the need for additional compute, even if I'm doing RAG [retrieval-augmented generation] and fine-tuning," McDowell said.

Supercomputing and HPC require high levels of compute, he said, but it is still unclear how much of these computing methods or advancements will filter into the enterprise.

Some of the SC24 highlights from infrastructure and storage vendors included the following:

Dell: Dell launched its new high-performance PowerEdge XE9685L and XE7740 servers for enterprise AI and HPC workloads. The new servers are also part of the vendor's Integrated Rack Scalable System (IRSS). IRSS is a preconfigured rack-scale system that includes Dell Smart Cooling, which monitors and manages various cooling technologies. Its plug-and-play design aims to ease AI deployment.

Dell also said it would support the upcoming Nvidia GB200 Grace Blackwell NVL4 superchip, which combines GPUs and CPUs, in the Dell IR7000. In addition, Dell expanded its Data Lakehouse to include Apache Spark for data processing at scale.

HPE: HPE is the vendor behind El Capitan, a direct liquid-cooled supercomputer. The vendor also highlighted its new, liquid-cooled Cray Supercomputing blades, the EX4252 Gen 2 and EX154n Accelerator; its new Cray storage system, the E2000; and two new ProLiant Servers for enterprise AI, the XD680 and liquid-cooled XD685.

DataDirect Networks: DDN launched its fourth-generation A³I, an AI storage system with higher performance and scalability. DDN also collaborated with Nvidia on xAI's Project Colossus, a supercomputer built by Elon Musk's AI company primarily aimed at training xAI's chatbot Grok.

Pure Storage: Pure Storage introduced its GenAI Pod, a set of validated designs for turnkey generative AI storage offerings. The vendor's FlashBlade//S500, its performance- and capacity-optimized storage for unstructured data, achieved Nvidia DGX SuperPOD certification. And last week, Pure invested in GPU cloud provider CoreWeave.

Weka: Weka previewed a high-performance storage offering that combines its parallel file system with Nvidia Grace CPUs, Supermicro servers, Nvidia ConnectX-7 network interface cards and Nvidia BlueField data processing units. The offering is focused on enterprise AI use cases.

Weka also introduced its reference architecture for retrieval-augmented generation, the Weka AI RAG Reference Platform, or WARRP. It gives users a blueprint for running RAG and inferencing on the vendor's data platform, according to Weka.

Storage's role in SC and AI

McDowell said he believes the theme of AI optimization seen at SC24 will find its way into storage offerings for enterprises. A year ago, only a few vendors, such as Weka, Vast and Hammerspace, talked about storage's potential for data management and data manipulation. Those discussions will only grow as compute is optimized for AI, and storage follows, he said.

"This change is happening in AI training first, but it is absolutely finding its way into enterprise," McDowell said.

HPC and supercomputing have been the domain of parallel file systems, Futurum Group's Bates said. This is also the case with AI, but the data requirements for AI are different. HPC does massive amounts of reads and then gives a conclusion, whereas AI involves more activity with smaller files, she said.

"We are changing in terms of what those requirements look like," Bates said.

Whether AI will cause drastic changes for storage remains to be seen, according to Bates. It will depend on what the data is being used for and what type of data is being analyzed. But, she added, it's unlikely to depend on the size of the data.

"If data blows up, we will figure out an invention to shrink it," she said.

Liquid cooling is cool

Liquid cooling was a significant topic at SC24, represented by 22 independent vendors as well as offerings from others, such as Dell, HPE and Lenovo. Moor Insights & Strategy's Kimball found the scale of vendors to be impressive. However, he said he still sees different methods trying to find their place in the market.

"I think we are still very early in this cooling game, and what we are seeing in today's market is kind of like the days of discovering fire and inventing the wheel," Kimball said.

The liquid cooling problems seen on the front lines of AI are problems supercomputing has been solving for the last three decades, according to Chirag Dekate, an analyst at Gartner. HPC and supercomputing have used the technique to control heat and energy for some time, he said, and vendors such as Dell and HPE are now making the concepts more mainstream in enterprise IT.

"Suddenly, the nerds are cool kids in town," he said.

Adam Armstrong is a TechTarget Editorial news writer covering file and block storage hardware and private clouds. He previously worked at StorageReview.
