Building an Optimized HPC Architecture

Design your high performance computing (HPC) system to scale with future workloads.

Building Blocks of an HPC System:

  • Designing your HPC system may involve a combination of parallel computing, cluster computing, and grid/distributed computing strategies.

  • A hybrid cloud approach that combines your on-premise infrastructure with public cloud resources lets you scale up as needed, reducing the risk of lost opportunities.

  • Intel® technologies for HPC include processors, memory, and fabric, providing a foundation for high-performance, incredibly scalable systems.

In today’s accelerated business environment, successful HPC adoption begins with a well-defined HPC architecture. Depending on your organization’s workloads and computing goals, different HPC system designs and supporting resources are available to help you achieve productivity gains and scalable performance.

Designing HPC Systems

High performance computing (HPC) architecture takes many forms, depending on your needs. Organizations can choose among parallel computing, cluster computing, and grid or distributed computing, and often combine them.

Parallel Computing
HPC parallel computing allows HPC clusters to execute many calculations at the same time. Vital for addressing large, complex problems, parallel computing splits a big workload into separate computational tasks that run concurrently.

These systems can be designed to either scale up or scale out. Scale-up parallelism involves taking a job within a single system and breaking it up so that individual cores can perform the work, using as much of the server as possible. In contrast, scale-out parallelism involves taking that same job, splitting it into manageable parts, and distributing those parts to multiple servers or computers with all work performed in parallel.
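The scale-up case can be sketched in a few lines of Python. This is a minimal illustration, not a tuned HPC code: the workload (a sum of squares) and the function names split_range, partial_sum_of_squares, and parallel_sum_of_squares are assumptions chosen for the example. It splits one job into contiguous chunks and hands each chunk to a worker process on the same server.

```python
import os
from concurrent.futures import ProcessPoolExecutor


def split_range(n, parts):
    """Split range(n) into `parts` contiguous (start, stop) chunks of near-equal size."""
    step, extra = divmod(n, parts)
    chunks, start = [], 0
    for k in range(parts):
        # The first `extra` chunks take one additional item each.
        stop = start + step + (1 if k < extra else 0)
        chunks.append((start, stop))
        start = stop
    return chunks


def partial_sum_of_squares(bounds):
    """The work done by one core: sum i*i over its own chunk."""
    start, stop = bounds
    return sum(i * i for i in range(start, stop))


def parallel_sum_of_squares(n, workers):
    """Scale-up parallelism: one job, split across worker processes on one machine."""
    chunks = split_range(n, workers)
    with ProcessPoolExecutor(max_workers=workers) as pool:
        # Each chunk is computed in a separate process; partial results are then combined.
        return sum(pool.map(partial_sum_of_squares, chunks))


if __name__ == "__main__":
    workers = os.cpu_count() or 1
    print(parallel_sum_of_squares(1_000_000, workers))
```

Scale-out follows the same split, compute, and combine pattern, but the chunks are sent to other servers instead of local cores, usually through a job scheduler or a message-passing library rather than a local process pool.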

Cluster Computing
With a high performance computing cluster, multiple computers, or nodes, are linked together through a local-area network (LAN) to create an HPC cluster architecture. The cluster acts as a single computer, yet one with far greater computational power. An HPC cluster is typically configured to solve a single problem by spreading it across the nodes in the system. HPC clusters have a defined network topology and allow organizations to tackle advanced computations at high processing speeds.

Grid and Distributed Computing
HPC grid computing and HPC distributed computing are closely related computing architectures, and the terms are often used interchangeably. Both involve multiple computers connected through a network that share a common goal, such as solving a complex problem or performing a large computational task. This approach is ideal for jobs that can be split into separate chunks and distributed across the grid. Each node within the system can perform its tasks independently, without having to communicate with other nodes.
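The idea of independent chunks can be illustrated with a small local simulation. It is only a sketch under stated assumptions: the node names, the round-robin assignment, and the word-counting workload are hypothetical, and a real grid would dispatch the chunks over the network through middleware or a scheduler.

```python
from collections import defaultdict


def assign_round_robin(tasks, nodes):
    """Map each (hypothetical) node name to the list of chunks it will process."""
    plan = defaultdict(list)
    for i, task in enumerate(tasks):
        plan[nodes[i % len(nodes)]].append(task)
    return dict(plan)


def run_node(chunks):
    """Work one node does on its own: count the words in its chunks."""
    return sum(len(chunk.split()) for chunk in chunks)


def run_grid(tasks, nodes):
    """Distribute chunks, let each node work in isolation, then merge the results."""
    plan = assign_round_robin(tasks, nodes)
    # A node's result depends only on its own chunks, so nodes never exchange data.
    per_node = {node: run_node(chunks) for node, chunks in plan.items()}
    return per_node, sum(per_node.values())


if __name__ == "__main__":
    chunks = ["a b c", "d e", "f", "g h i j"]
    per_node, total = run_grid(chunks, ["node-a", "node-b"])
    print(per_node, total)
```

Because no node waits on another, adding nodes shortens the run without changing the merged result, which is what makes this kind of job a good fit for a grid.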

Common HPC Application Compatibility
Intel has collaborated with industry partners to define best practices for promoting HPC applications and cluster systems built on Intel® architecture. The Intel® HPC Platform Specification provides common software and hardware requirements that application developers can use to build foundations for cluster solutions. A system that complies with these requirements provides a defined set of characteristics to the application layer, including the Intel® software runtime components that provide the best performance paths. The platform specification includes configuration and compliance information across a wide domain of common community applications.

HPC Cloud Infrastructure

In the past, HPC systems were limited to the capacity your on-premise infrastructure could provide. Today, you can extend that local capacity with resources in the cloud.

The latest cloud management platforms make it possible to take a hybrid cloud approach, which blends your on-premise infrastructure with public cloud services so that workloads can flow seamlessly across all available resources. This gives you greater flexibility in how you deploy HPC systems and how quickly you can scale up, along with the opportunity to optimize your total cost of ownership (TCO).

Typically, an on-premise HPC system offers a lower TCO than the equivalent HPC system reserved 24/7 in the cloud. However, an on-premise solution sized for peak capacity will be fully utilized only when that peak capacity is reached. Much of the time, the on-premise solution will be underutilized, leading to idle resources. On the other hand, a workload that can’t be computed due to a lack of available capacity can result in a lost opportunity.

In short, using the cloud to augment your on-premise HPC infrastructure for time-sensitive jobs can mitigate the risk of missing big opportunities. 
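The trade-off described above can be made concrete with a toy cost model. Every number here is hypothetical (the hourly rates, the node counts, and the 24-hour demand profile), and the function names are invented for this sketch; the only assumption carried over from the text is that a cloud node-hour costs more than an on-premise node-hour.

```python
def on_prem_cost(capacity_nodes, rate_per_node_hour, hours):
    """On-premise nodes cost money every hour, whether busy or idle."""
    return capacity_nodes * rate_per_node_hour * hours


def utilization(demand, capacity_nodes):
    """Fraction of available node-hours actually used; `demand` is nodes needed per hour."""
    used = sum(min(d, capacity_nodes) for d in demand)
    return used / (capacity_nodes * len(demand))


def hybrid_cost(demand, base_nodes, on_prem_rate, cloud_rate):
    """Fixed on-premise base, plus cloud nodes rented only for hours above the base."""
    base = on_prem_cost(base_nodes, on_prem_rate, len(demand))
    burst_node_hours = sum(max(0, d - base_nodes) for d in demand)
    return base + burst_node_hours * cloud_rate


if __name__ == "__main__":
    # Hypothetical day: 40 nodes needed for 20 hours, 100 nodes for a 4-hour peak.
    demand = [40] * 20 + [100] * 4
    on_prem_rate, cloud_rate = 1.0, 3.0  # assumed cost units per node-hour

    print(on_prem_cost(100, on_prem_rate, 24))               # 2400.0 (sized for peak)
    print(utilization(demand, 100))                          # 0.5
    print(hybrid_cost(demand, 40, on_prem_rate, cloud_rate)) # 1680.0
```

In this made-up profile, the system sized for peak sits half idle, while the hybrid setup pays the higher cloud rate only during the 4-hour peak and still runs every job. With a flatter demand profile or a larger gap between the two rates, the comparison can reverse, so the model is worth rerunning with your own figures.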

Selecting HPC Processors for Scalability and Performance

With our breadth of expertise in HPC technologies, Intel delivers the performance needed to handle the most demanding future workloads. Intel® Xeon® Scalable processors provide a highly versatile platform that can scale to support the diverse performance requirements of critical HPC workloads.

Working with our partners, Intel has prioritized creating blueprints that inform optimized HPC system designs. To validate a deployment, Intel® Cluster Checker helps verify that your HPC cluster is healthy and correctly configured to run parallel applications, which supports portability when moving between on-premise and HPC cloud systems.

With Intel® CoFluent™ technology, you can speed up the deployment of complex systems and help determine optimal settings by modeling simulated hardware and software interactions.

A Breakthrough in HPC Memory

Memory is an integral component in HPC system design. Responsible for a system’s short-term data storage, memory can be a limiting factor for workload performance. Intel® Optane™ technology helps overcome these bottlenecks in the data center by bridging gaps in the storage and memory hierarchy, so you can keep compute fed.

Scaling Performance with HPC Fabric

To effectively scale HPC systems, you need a high-performance fabric that is designed to support HPC clusters. Intel® Omni-Path Architecture (Intel® OPA) overcomes the performance limitations of current fabric technologies by providing the capacity to scale to tens of thousands of nodes and more. This gives application developers an end-to-end solution that includes adaptive routing, dispersive routing, traffic flow optimization, packet integrity protection, and dynamic lane scaling. Intel high-performance fabrics are designed to meet the needs of tomorrow’s HPC workloads at a price competitive with fabrics available now.

An Easier Path to HPC Adoption

Intel provides the critical expertise to understand the applications you wish to run and how a specific HPC system—one that combines on-premise and cloud resources—will help you produce results and maximize the work you can accomplish. With HPC architecture based on a foundation of Intel® technologies, you can be ready to meet the HPC and exascale needs of the future.

Today, the cloud lets you scale up HPC systems by seamlessly extending your access to computing, storage, and networking resources.

Disclaimers and Notices

Intel® technologies' features and benefits depend on system configuration and may require enabled hardware, software or service activation. Performance varies depending on system configuration. No product or component can be absolutely secure. Check with your system manufacturer or retailer or learn more at thailand.intel.com.

Cost reduction scenarios described are intended as examples of how a given Intel®-based product, in the specified circumstances and configurations, may affect future costs and provide cost savings. Circumstances will vary. Intel does not guarantee any costs or cost reduction.

Intel disclaims all express and implied warranties, including without limitation, the implied warranties of merchantability, fitness for a particular purpose, and non-infringement, as well as any warranty arising from course of performance, course of dealing, or usage in trade.

Intel, the Intel logo, and other Intel marks are trademarks of Intel Corporation or its subsidiaries. © Intel Corporation.