What is HPC?

High-performance computing (HPC) harnesses the power of computer clusters to solve complex problems with massive data sets.

High Performance Computing, or HPC, refers to clustering computing resources to take on complex problems with massive amounts of data. By distributing computing tasks over many interconnected computers, HPC can solve problems that standard computers either could not handle or could not complete in a reasonable amount of time. Life Sciences, Scientific Research, Energy Exploration, Financial Services, Manufacturing, and Climate Modeling all rely on HPC to solve complex problems and deliver results in time to be of value. For example, in financial services HPC is used for real-time fraud detection, where standard computing could not deliver results fast enough to stop fraudulent transactions. If this sounds a lot like Artificial Intelligence/Machine Learning, that’s because HPC and AI/ML are in many ways converging.

Parallelism is critical
For HPC to leverage the power of many computers, the workload must be evenly distributed over all CPUs in the system, and these CPUs must have unfettered access to the data. This distribution of work is known as “parallelism.” For maximum efficiency, nodes should operate on their parallel tasks with minimal cross-talk with other nodes. A workload that can be easily divided and has little to no need for communication between nodes is referred to as “embarrassingly parallel” (as in “embarrassment of riches”) in the HPC community.
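
As a minimal sketch of an embarrassingly parallel workload, the Python example below splits independent tasks across worker processes with no communication between them. The `score_record` function and the generated records are hypothetical placeholders standing in for a real data set.

```python
from multiprocessing import Pool

def score_record(record):
    """Hypothetical per-record computation; each call is fully independent."""
    return sum(ord(c) for c in record) % 97

if __name__ == "__main__":
    # Stand-in for a massive data set split into independent chunks.
    records = [f"sample-{i}" for i in range(1_000_000)]

    # Each worker processes its share with no cross-talk: embarrassingly parallel.
    with Pool(processes=8) as pool:
        results = pool.map(score_record, records, chunksize=10_000)

    print(f"processed {len(results)} records")
```

The same pattern scales from the cores of a single machine to the nodes of a cluster precisely because no task depends on any other.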

There are four main components of HPC:

  • Compute Nodes: computers that process data
  • Storage: the repository of data
  • Network: connectivity for compute nodes and storage
  • Job Scheduler: orchestration of tasks across the HPC cluster

Compute Nodes are grouped into “clusters” that have shared access to a large data set via a high-speed network. Data is processed in parallel across all nodes in the cluster. To increase the processing power of the cluster, more nodes are added rather than resources being added to individual nodes. GPU-accelerated systems provide additional computing power for workloads such as simulation and modeling.
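
To illustrate how work spreads across a cluster, here is a hedged sketch using MPI via the mpi4py package (assumed to be installed, with the job launched by an MPI runtime such as mpirun): each rank works on its own slice of the problem, so adding nodes directly adds processing power.

```python
# Launched across the cluster with, e.g.: mpirun -np 4 python chunked_sum.py
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()   # this process's ID within the cluster-wide job
size = comm.Get_size()   # total number of processes across all nodes

N = 10_000_000
# Each rank takes a contiguous slice of the problem; more nodes -> smaller slices.
start = rank * N // size
stop = (rank + 1) * N // size
partial = sum(range(start, stop))

# Combine the partial results on rank 0.
total = comm.reduce(partial, op=MPI.SUM, root=0)
if rank == 0:
    print(f"total = {total} computed by {size} ranks")
```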

Storage maintains and protects data sets that can scale from petabytes to exabytes in capacity. Storage for HPC must be able to support massive parallelism: simultaneous and independent access by the compute nodes. Traditionally, these storage systems were highly specialized block storage devices presenting data via parallel file systems such as Lustre, GPFS, and BeeGFS.
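
A minimal sketch of what simultaneous and independent access looks like in practice, again assuming mpi4py and an MPI-IO-capable parallel file system: every rank reads its own byte range of one shared file. The file path is a hypothetical placeholder.

```python
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
size = comm.Get_size()

# Hypothetical shared data set on a parallel file system.
path = "/mnt/parallel-fs/dataset.bin"

fh = MPI.File.Open(comm, path, MPI.MODE_RDONLY)
file_size = fh.Get_size()

# Each rank reads a disjoint byte range, so all nodes hit storage concurrently.
chunk = file_size // size
offset = rank * chunk
length = file_size - offset if rank == size - 1 else chunk

buf = bytearray(length)
fh.Read_at(offset, buf)   # independent read at this rank's own offset
fh.Close()

print(f"rank {rank} read {len(buf)} bytes starting at offset {offset}")
```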

Networking needs to be fast enough to provide sufficient bandwidth for compute nodes to access data from storage nodes, as well as low enough latency that communication between compute nodes is handled efficiently. HPC networking must spread traffic evenly over all links and connections to support parallelism. Specialized interconnects such as InfiniBand, Slingshot, and Omni-Path are typically deployed; however, advances in high-speed Ethernet are driving its adoption.
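
As a rough illustration of why both bandwidth and latency matter, the toy micro-benchmark below (assuming mpi4py and at least two ranks on separate nodes) times a simple ping-pong exchange; it is only a sketch, not a substitute for purpose-built network benchmarks.

```python
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

msg = bytearray(1 << 20)   # 1 MiB message
iters = 100

comm.Barrier()
t0 = MPI.Wtime()
for _ in range(iters):
    if rank == 0:
        comm.Send(msg, dest=1, tag=0)    # rank 0 sends, then waits for the reply
        comm.Recv(msg, source=1, tag=1)
    elif rank == 1:
        comm.Recv(msg, source=0, tag=0)  # rank 1 echoes the message back
        comm.Send(msg, dest=0, tag=1)
elapsed = MPI.Wtime() - t0

if rank == 0:
    rtt = elapsed / iters
    gbps = 2 * len(msg) / rtt / 1e9
    print(f"avg round trip: {rtt * 1e6:.1f} us, effective bandwidth: {gbps:.2f} GB/s")
```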

The Job Scheduler is responsible for orchestrating work across the nodes to meet the needs of multiple applications and users. HPC jobs must be queued, distributed over the available resources, and monitored to ensure efficient and reliable operation.
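
Production clusters rely on schedulers such as Slurm or PBS for this. Purely as an illustration of the queue-and-dispatch idea (not any real scheduler’s API), the toy Python sketch below assigns queued jobs to whichever node frees up first; the job names and runtimes are hypothetical.

```python
import heapq

def schedule(jobs, num_nodes):
    """Toy FIFO scheduler: jobs are (name, runtime) pairs, nodes are interchangeable."""
    # Min-heap of (time the node becomes free, node id).
    free_at = [(0.0, n) for n in range(num_nodes)]
    heapq.heapify(free_at)

    for name, runtime in jobs:                 # jobs dispatched in queue order
        start, node = heapq.heappop(free_at)   # earliest available node
        finish = start + runtime
        print(f"{name}: node {node}, start t={start}, finish t={finish}")
        heapq.heappush(free_at, (finish, node))

# Hypothetical job queue: (job name, estimated runtime in hours).
schedule([("genomics-1", 4), ("render-7", 2), ("risk-calc", 1), ("cfd-sim", 6)],
         num_nodes=2)
```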

HPC use cases:

  • Health Care/Life Sciences
    • Genomic sequencing
    • Modeling and simulation of molecules and reactions
    • Protein Folding
    • Bioinformatics
  • Financial
    • Fraud detection
    • Market simulation
    • Risk analysis
    • Quantitative trading
  • Media and Entertainment
    • Rendering
    • 3D modeling
    • VFX
  • Manufacturing
    • Simulation
    • Electronic Design Automation (EDA)
    • Digital Twin
  • Energy exploration
    • Seismic processing
    • Petroleum engineering
    • Reservoir simulation

Democratization of HPC
HPC, like AI/ML before it, was once the domain of scientific computing alone but is now seeing widespread adoption across industries. Advances in technologies like high-speed Ethernet and all-flash Network Attached Storage (NAS), along with the emergence of highly parallel implementations of NFS, are making HPC more accessible. Without specialized gear and the need for highly trained staff, the complexity and expense of HPC are reduced.

What's the future of HPC?
The lines between HPC and Artificial Intelligence are blurring. With the cloud as a catalyst for innovation, new applications and workloads are now finding a footing in what had been strictly the domain of HPC. HPC clusters will need to be highly adaptable to support these workloads without the need for complicated tuning or re-architecting.

Real World Case Study:
Learn how DUG Technology, a leading provider of HPC-as-a-Service (HPCaaS), leverages VAST to provide exascale flash storage for their customers’ HPC data sets.

Have a question?

Get expert help from the VAST Data solution engineering team quickly.