As companies move their enterprise applications to the cloud, IT organizations are also looking at migrating or augmenting their high performance computing (HPC) workloads off-premises. The benefits are considerable, but the process can be complex and overwhelming for those not familiar with running technical computing on virtual instances. Appsbroker*, Intel, and Google formed the Extreme Cloud Centre of Excellence (ECCoE) to help customers successfully migrate the right applications to the cloud for the greatest benefit.
Many high performance computing (HPC) workloads are moving to the cloud. Organizations continue to run simulation and AI projects on on-premises clusters, but when demand for computational resources exceeds the compute cycles available, they turn to the cloud. Smaller enterprises without HPC resources are also turning to the cloud to complete projects. And the projects run off-premises are getting bigger.
Professor Andrew V. Sutherland at Massachusetts Institute of Technology (MIT) has shown that giant compute jobs can be run in the cloud. In April 2017, Professor Sutherland broke a record by standing up a 220,000-core Google Cloud Platform cluster for a highly compute-intensive workload. In June 2018, he beat that record with a 580,000-core cluster, proving that massive compute jobs are no longer the sole domain of the world’s largest machines. This shows the enormous opportunity for companies to run such large jobs in the cloud.
So the question for an enterprise considering this option is how to plan a migration strategy and move its HPC workloads to the cloud.
Migrating enterprise high performance computing (HPC) workloads from an on-premises cluster to the cloud can be daunting. For both IT and the users, key questions rise to the surface: Will my jobs run longer? Is my application available for the cloud platform I want to use? Is the software optimized for the cloud? Will it cost more? What about licensing the application? Can I benchmark before I decide—without paying for the service? Who is going to do all the groundwork so I can make a decision? Which Cloud Service Provider (CSP) should I use?
Running in the cloud has many benefits. You pay only for the time the cluster is in use. For most services, standing up a cluster is very quick. However, with some CSPs, you have to bring your own experts, bring your own software, do your own benchmarking, and pay for it all while you’re doing it.
The Extreme Cloud Centre of Excellence (ECCoE) has changed all that.
The Extreme Cloud Centre of Excellence (ECCoE) is a partnership between Google, Intel, and Appsbroker to show organizations how Google Cloud can be used to augment an HPC infrastructure. ECCoE is making it easier for HPC to run in the cloud.
“ECCoE is a testbed for High-Performance and High-Throughput Computing (HPC/HTC) workloads. ECCoE brings together a team of skilled people with a facility offering leadership, best practices, and workshops.” — Geoff Newell, technical director at Appsbroker Limited
ECCoE allows IT and users to understand the entire application migration requirements, processes, and benefits before they commit to a new HPC paradigm. It provides a path to a rapid Proof of Concept with complete benchmarking capabilities of the applications they use. And, it includes a Total Cost of Ownership (TCO) service, so customers know what HPC in the cloud can cost before they financially commit.
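To make the TCO question concrete, the sketch below compares the amortized cost of a used core-hour on an owned cluster with a pay-per-use cloud rate. It is a minimal illustration, not the ECCoE TCO service: the function names and every figure (capital cost, operating cost, lifetime, cloud rate) are hypothetical placeholders, and a real analysis would also account for licensing, storage, data transfer, and staff time.

```python
HOURS_PER_YEAR = 24 * 365


def on_prem_cost_per_core_hour(capex, annual_opex, years, cores, utilization):
    """Amortized cost of one *used* core-hour on an owned cluster.

    utilization is the fraction (0 < u <= 1) of available core-hours
    that actually run jobs; idle hours still have to be paid for.
    """
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    total_cost = capex + annual_opex * years
    used_core_hours = cores * HOURS_PER_YEAR * years * utilization
    return total_cost / used_core_hours


def break_even_utilization(capex, annual_opex, years, cores, cloud_rate):
    """Utilization at which owning and renting cost the same per used core-hour.

    Above this value the owned cluster is cheaper; below it, pay-per-use wins.
    A result greater than 1 means the cloud rate is cheaper at any utilization.
    """
    total_cost = capex + annual_opex * years
    available_core_hours = cores * HOURS_PER_YEAR * years
    return total_cost / (cloud_rate * available_core_hours)


if __name__ == "__main__":
    # All figures below are hypothetical, chosen only to exercise the functions.
    params = dict(capex=300_000, annual_opex=50_000, years=4, cores=384)
    cloud_rate = 0.05  # hypothetical price per core-hour

    for u in (0.25, 0.50, 0.90):
        cost = on_prem_cost_per_core_hour(utilization=u, **params)
        print(f"utilization {u:.0%}: on-prem {cost:.4f} vs cloud {cloud_rate:.4f}")

    u_star = break_even_utilization(cloud_rate=cloud_rate, **params)
    print(f"break-even utilization: {u_star:.1%}")
```

The design point is that a lightly used cluster spreads its fixed cost over few core-hours, which is why bursty or occasional workloads tend to favor paying only for the time the cluster is in use.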
Appsbroker is exclusively a Google Premier partner; it works only with the Google Cloud Platform (GCP). To provide the depth needed to benchmark workloads and enable effective migrations to the cloud, Appsbroker built its own on-premises cluster on which it runs customer applications. The team then compares on-premises performance with that of the same workload running on a similar Google Cloud cluster.
“For the ECCoE experience,” explained David Young, Head of Architecture at Appsbroker, “our team of experts first hosts a workshop with the customer to discover their needs, understand user expectations, and identify the applications they’re running locally that they would consider putting in the cloud.”
Not all HPC applications are good candidates for cloud deployments. Knowing which on-premises codes can migrate to an off-premises model requires more than simply moving the code to a standardized cluster template. For example, certain customizations might be required to optimize a memory-bound application for GCP. Appsbroker’s developers bring their many years of DevOps skills, best-of-breed technology know-how, and a library of cluster-building best practices for deploying customer workloads to GCP.
The Extreme Cloud Centre of Excellence (ECCoE)
The ECCoE is part of a three-way investment between Appsbroker, Intel, and Google Cloud designed to help users of Extreme Cloud systems benchmark their current workloads on the latest technology from Intel and Google. Based at Appsbroker’s UK office, ECCoE is a purpose-built customer facility where engineers from Google, Intel, and Appsbroker work with customers to assess which workflows are suitable to move onto the Google Cloud Platform, produce business cases to support the move, and develop roadmaps to accomplish the customers’ business outcomes using the latest technology from both Intel and Google.
“We work closely with ISVs, like ANSYS*, Altair, plus many open source projects, to profile and fully understand our customers’ particular application needs when deployed in the cloud,” added Young. “We optimize a GCP configuration for their particular problem running an application, like Siemens STAR-CCM+, and benchmark their jobs on both our internal eight-node cluster and the GCP instance.” The internal cluster used for ECCoE comparisons, built by IT provider XMA, is based on Intel® Xeon® Platinum 8160 processors with Cornelis Networks fabric and Intel® SSD Data Center Series (Intel® SSD DC Series) storage.
“There’s more to moving HPC to the cloud than just deploying applications,” explained Newell. “There’s the whole user experience, integrating cloud with internal IT operations, and the financial side. If any aspect complicates the business or operations side, the downside quickly offsets the benefits of cloud. We help the customer through the entire process to help ensure an optimal migration and experience over the long term.”
Appsbroker has helped customers move their technical computing workloads to GCP. The range of industries includes manufacturing, engineering, and finance. “Our customers are running computational fluid dynamics (CFD), Monte Carlo simulations, and other types of scientific workloads,” stated Young.
Through the methodology used in ECCoE, the Appsbroker team is building out a complete, best-practice hybrid implementation that addresses the challenges enterprises face when integrating their on-premises systems with Google Cloud. These concerns include, but are not limited to, Active Directory (AD), Domain Name System (DNS), Lightweight Directory Access Protocol (LDAP), intrusion detection, centralized logging, and wide area network (WAN) topologies.
Appsbroker, Intel, and Google launched ECCoE to help companies augment their HPC capacity with applications on GCP instances running on Intel Xeon Scalable processor-based servers. Appsbroker teams provide deep expertise in cloud deployments and in optimizing instances for a range of applications on GCP. Appsbroker provides consulting, benchmarking, and solution engineering to help customers move their workloads to the cloud. Benchmarking is done on its in-house cluster built on Intel Xeon Scalable processors, Cornelis Networks fabric, and Intel SSD DC Series storage.
- 384 cores of Intel® Xeon® Platinum 8160 processors
- Intel® Omni-Path Edge Switch 100 Series
- Eight Intel Omni-Path Host Fabric Adapters
- Eight 960 GB Intel® SSD DC S4600 drives
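
As a quick sanity check on the list above, the sketch below relates the 384-core total to the eight-node cluster mentioned earlier. It assumes 24 cores per Intel Xeon Platinum 8160 processor (the published core count for that part) and an even split of processors and SSDs across nodes; the per-node layout is an inference, not something the source states.

```python
NODES = 8                  # "eight-node cluster" in the text
TOTAL_CORES = 384          # from the hardware list
CORES_PER_PROCESSOR = 24   # assumption: Xeon Platinum 8160 core count
SSD_COUNT = 8              # from the hardware list
SSD_CAPACITY_GB = 960      # from the hardware list


def cluster_summary():
    """Derive per-node figures from the totals in the hardware list."""
    processors, leftover = divmod(TOTAL_CORES, CORES_PER_PROCESSOR)
    if leftover:
        raise ValueError("core total is not a whole number of processors")
    sockets_per_node, leftover = divmod(processors, NODES)
    if leftover:
        raise ValueError("processors do not divide evenly across nodes")
    return {
        "processors": processors,
        "sockets_per_node": sockets_per_node,
        "cores_per_node": sockets_per_node * CORES_PER_PROCESSOR,
        "raw_ssd_gb": SSD_COUNT * SSD_CAPACITY_GB,
    }


if __name__ == "__main__":
    for name, value in cluster_summary().items():
        print(f"{name}: {value}")
```

Under these assumptions the totals are consistent with eight dual-socket nodes, which is the kind of on-premises baseline the cloud instances are benchmarked against.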