To meet and even exceed today’s most demanding performance benchmarks, HPC systems need to incorporate a balanced portfolio of building blocks. Due to recent exponential growth in the size of data sets and the number of read/write operations that HPC applications need to perform, it's often storage and memory performance—rather than processor speed—that limits the overall performance of the system.
Today’s most advanced HPC storage and memory technologies exist as part of a continuum that extends from DRAM in the hot tier to inexpensive long-term storage media in the cold tier. System architects must identify the products and capabilities that best meet their individual HPC workload’s performance needs. Increasingly, these won’t be found in traditional NVM and storage solutions.
HPC Storage and Memory Challenges
HPC’s boundaries are constantly expanding as real-world computational problems require that large and ever-growing volumes of data be collected, stored, accessed, and processed. The sheer size of these data sets presents memory and storage challenges: simply put, DRAM’s capacities are too small and hard disk drives are too slow.
When relying on traditional memory and storage solutions, HPC system architects have had to make difficult trade-offs between storage capacity, performance, and cost. It was challenging to bridge the gaps between localized hot data near the CPU and greater nonvolatile storage capacity for the full range of diverse HPC workloads. In particular, two significant gaps remained:
- Between DRAM, with its high cost and low capacity, and NAND-based SSDs, which offer more-affordable capacity but introduce latency issues.
- Between NAND SSDs and HDDs, which can provide massive storage at low cost but have significant power, cooling, and physical space requirements, pose reliability challenges, and introduce even greater latency.
What’s Needed: Latency Reduction and Increased Storage Capacity
For many HPC workloads, the rate at which data can be brought to the processor presents the primary real-world performance bottleneck. HPC solution architects have attempted to overcome this limitation through the use of local caching and by deploying growing pools of DRAM to keep more data closer to the CPU. DRAM provides fast access to its contents but is expensive, subject to size constraints that make it impractical for use with large in-memory databases, and volatile.
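To see why keeping data close to the CPU matters so much, it can help to estimate the average access latency of a two-tier system. The sketch below is illustrative only: the `Tier` class, the `effective_latency_ns` helper, and every latency and hit-fraction value are assumptions chosen for round numbers, not measurements of any product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    name: str
    latency_ns: float    # placeholder latency, NOT a measured value
    hit_fraction: float  # share of all accesses served by this tier


def effective_latency_ns(tiers):
    """Average access latency: sum of hit_fraction * latency over all tiers."""
    total = sum(t.hit_fraction for t in tiers)
    if abs(total - 1.0) > 1e-9:
        raise ValueError("hit fractions must sum to 1")
    return sum(t.hit_fraction * t.latency_ns for t in tiers)


# Hypothetical order-of-magnitude values, chosen only for illustration.
mostly_dram = [Tier("DRAM", 100.0, 0.99), Tier("NAND SSD", 100_000.0, 0.01)]
less_dram = [Tier("DRAM", 100.0, 0.90), Tier("NAND SSD", 100_000.0, 0.10)]

for label, tiers in (("99% DRAM hits", mostly_dram), ("90% DRAM hits", less_dram)):
    print(f"{label}: {effective_latency_ns(tiers):.0f} ns average")
```

Under these assumed values, letting the share of accesses that miss DRAM grow from 1% to 10% raises the average latency by roughly an order of magnitude, which is why the slower tier tends to dominate once a working set no longer fits in DRAM.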
Volatile memory solutions are a poor fit for the extreme uptime demands that today’s HPC systems face. The consequences of data loss whenever there’s a system crash can be catastrophic, and long reboot times seriously erode productivity.
Storing greater volumes of data on nonvolatile media like NAND SSDs or HDDs introduces more-significant performance challenges, however. Storage systems designed for traditional HDD media and POSIX input/output (I/O) capabilities simply can’t keep pace with the complex random read and write patterns that analytics and simulation workloads generate, nor are they adequate for the read-intensive needs of AI workloads.
In fact, I/O demands per compute node are growing across the board—at exascale as well as for smaller systems—increasing the demands on HPC storage solutions as all workloads become more complex.
Choosing Optimal HPC Storage and Memory Solutions for Your Workload
Traditional HPC Clusters
For high-performance simulation and modeling applications, such as fluid dynamics prediction, climate modeling, and financial forecasting, computation is typically distributed across several machines configured to act as a single HPC cluster. Faster HPC storage and memory are needed to enable finer-grained modeling, faster generation of results, and greater productivity.
Artificial Intelligence (AI) Systems
AI workloads are coming into increasingly widespread use among HPC applications. These workloads require far more read operations than traditional HPC workloads, and those that interact with instrument clusters or other real-time streaming data services demand higher ongoing quality of service (QoS) to avoid critical data loss. Write intensity also increases, as do the demands of the AI ingest phase. These systems require low-latency, high-message-rate communications, ideally bypassing the operating system entirely, to ensure that machine learning and inference algorithms function with the necessary speed and accuracy.
High Performance Data Analytics (HPDA)
As data volumes grow exponentially, so too does the need to perform analytics at speed. Not only do HPDA workloads have far greater I/O demands than typical “big data” workloads, but they also require larger compute clusters and more-efficient networking. The HPC memory and storage demands of HPDA workloads are commensurately greater as well.
Supercomputers and Exascale Systems
The scalability and cost advantages of modern HPC storage and memory solutions are especially important for supercomputing clusters and exascale systems. As these HPC solutions come into ever more widespread use in enterprises and academia, cost is increasingly becoming a factor. Yet it remains vital that these solutions continue to push the boundaries of known computing capacity, and the only way to do so is with HPC memory and storage solutions whose performance is in line with advances in processors, fabric, and other HPC components.
HPC Storage and Memory Products
With its comprehensive portfolio of HPC storage and memory solutions, together with Distributed Asynchronous Object Storage (DAOS)—the foundation for the Intel® exascale software stack—Intel is revolutionizing HPC storage architecture. These technologies are closing the gaps between in-memory data and storage capacity for large data sets to support transformational projects that require world-class computing performance.
Intel® Optane™ Persistent Memory
Intel® Optane™ persistent memory is a new class of HPC memory solution that supports near-real-time analysis of even today’s largest data sets. It supplies high-capacity, high-performing persistent memory that can reside on the same bus/channels as DRAM and act as DRAM in storing volatile data. It can also operate in persistent mode, retaining data without power applied, and can provide greater capacity at a lower cost per gigabyte than DRAM. This allows HPC solution architects to make use of a large persistent memory tier between DRAM and SSDs, one that’s both fast and affordable.
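Applications generally reach byte-addressable persistent memory by mapping a region into their address space and then using ordinary loads and stores. The following is a minimal sketch of that access pattern using Python's standard `mmap` module. It assumes nothing about Intel's own tooling: an ordinary temporary file stands in for a persistent memory region, and `write_record`, `read_record`, `REGION_SIZE`, and the file name are hypothetical names introduced for illustration. Production deployments would more typically rely on dedicated persistent memory libraries and a suitably mounted file system.

```python
import mmap
import os
import tempfile

REGION_SIZE = 4096  # bytes; one page is enough for this sketch


def write_record(path, payload):
    """Store payload through a memory mapping and flush it to the backing file."""
    if len(payload) > REGION_SIZE:
        raise ValueError("payload larger than mapped region")
    with open(path, "w+b") as f:
        f.truncate(REGION_SIZE)  # size the backing file before mapping it
        with mmap.mmap(f.fileno(), REGION_SIZE) as region:
            region[:len(payload)] = payload  # plain store via slice assignment
            region.flush()  # ask the OS to write dirty pages back


def read_record(path, length):
    """Re-map the file, as a restarted process would, and load the bytes back."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), REGION_SIZE, access=mmap.ACCESS_READ) as region:
            return bytes(region[:length])


# Hypothetical location: a temp file stands in for a persistent memory region.
demo_path = os.path.join(tempfile.mkdtemp(), "pmem_region.bin")
record = b"checkpoint-0001"
write_record(demo_path, record)
restored = read_record(demo_path, len(record))
print(restored == record)
```

The point of the sketch is the programming model rather than the performance: the data is updated with memory-style stores instead of explicit write calls, and it is still there when the region is mapped again later.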
Intel® Optane™ Solid State Drives (SSDs)
Intel® Optane™ Solid State Drives (SSDs) provide an entirely new type of data storage tier between memory and 3D NAND SSDs. Intel® Optane™ DC SSDs offer high random read/write performance and consistent low latency, making them ideal to accelerate caching. Intel® Optane™ technology also offers the service quality and endurance that HPC workloads need to achieve breakthrough performance.
Intel® QLC 3D NAND SSDs
Intel® QLC 3D NAND SSD technology is transforming the economics of storage today by supplying cost-efficient, highly dense storage that offers a reliable mix of performance, capacity, and value. Based on proven vertical floating gate technology but with greater areal density and a unique support circuitry architecture, Intel® QLC 3D NAND SSDs are designed to deliver optimal performance for read-intensive HPC workloads, particularly when partnered with Intel® Optane™ technologies.
Distributed Asynchronous Object Storage (DAOS)
Designed for latency reduction in HPC workloads, Distributed Asynchronous Object Storage (DAOS) is an open source software ecosystem that’s fully optimized for Intel® Optane™ persistent memory and Intel® Optane™ DC SSDs as well as other Intel® HPC solutions and products. DAOS was architected to make full use of the benefits of nonvolatile memory (NVM) technologies, providing high-bandwidth, low-latency storage containers with high input/output operations per second (IOPS) for HPC applications.
Intel® Select Solutions for HPC
It’s challenging to ensure that all HPC cluster components are validated to interoperate and meet your particular workload’s performance requirements. Intel® Select Solutions for HPC provide easy, quick-to-deploy HPC infrastructures with the right combination of compute, fabric, memory, storage, and software for balanced systems that shorten the time it takes to achieve insights and breakthroughs for analytics clusters or particular HPC applications.