When Simplicity Pairs With Scale: VAST-Powered NVIDIA DGX SuperPOD to Unite NAS With Advanced AI

Author

Jeff Denworth

We are proud to announce that VAST’s deep learning data platform is being certified by NVIDIA for DGX SuperPOD, and will be the first enterprise NAS solution to power this turnkey AI data center solution.

VAST will join a select group of technology providers that have certified high-performance solutions for NVIDIA AI at scale. While all the other DGX SuperPOD storage solutions are based on parallel file systems, the VAST offering takes an alternative approach to scaling AI-ready infrastructure: a simple yet massively scalable NAS.

The origins of VAST’s work with NVIDIA go back to our earliest days, starting in 2016, when we implemented a new form of RDMA-based storage networking, NVMe-over-Fabrics. This new data-center-scale transport made it possible for VAST to disaggregate data storage logic from a storage cluster’s SSDs and then scale both independently in a manner where each storage controller’s stateless CPU can share global access to all of a cluster’s data. This discovery ultimately led to the creation of VAST’s disaggregated, shared-everything (DASE) architecture, the third data storage architecture in modern history and the first to introduce true system-level parallelism.

“NAS is not designed for AI and HPC!”

This is a story we hear from so many customers and prospects who have encountered significant challenges using legacy approaches to enterprise scale-out NAS. When we compare VAST’s architectural approach to legacy architectures, the reasons VAST is the right choice for the scalable DGX SuperPOD design immediately become obvious:

Legacy NAS systems are built on shared-nothing principles. Each storage node owns a partition of the namespace, and updates, reads and rebuilds against that namespace must be coordinated across a backend fabric. As these systems scale, the amount of internal traffic, particularly for parallel operations, overwhelms the cluster. Customers then see diminishing returns as they add nodes.

VAST, on the other hand, has pioneered a new Disaggregated, Shared-Everything (DASE) architecture, which enables simple Docker containers to share a collection of NVMe devices over NVMe-oF. The NVMe storage enclosures, connected and powered by NVIDIA BlueField DPUs, are simple NVMe JBOFs that give all of the system’s stateless Docker containers access to the shared devices. With this approach, storage processors no longer need to talk with each other when serving data. There’s no controller-level caching, so there’s no need for cache coherence or metadata updates across servers. No two storage controller CPUs need to talk to each other in the read or write path, ever. Each CPU scales in an embarrassingly parallel fashion to achieve levels of scale that were previously unthinkable with NFS-based storage.
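
To make the contrast concrete, here is a deliberately simplified toy model, not a measurement of any real product. It assumes a shared-nothing cluster whose namespace is split evenly across nodes and whose clients connect to a node at random, so a request needs a backend hop whenever it lands on a node that does not own the data. The function name and the uniform-distribution assumption are ours, for illustration only.

```python
def backend_hop_fraction(nodes: int) -> float:
    """Toy model: fraction of requests that land on a non-owning node.

    Assumes data is partitioned evenly across `nodes` and each request
    arrives at a uniformly random node (both are illustrative assumptions).
    """
    if nodes < 1:
        raise ValueError("nodes must be >= 1")
    return (nodes - 1) / nodes


if __name__ == "__main__":
    for n in (1, 4, 16, 64):
        # In this model the share of requests needing a backend hop climbs toward 1.
        print(f"{n:>3} nodes: {backend_hop_fraction(n):.3f} of requests cross the backend")
```

Under these assumptions, nearly every request in a large shared-nothing cluster crosses the backend fabric. In a shared-everything design, where every controller can reach every device directly, the equivalent figure is zero by construction.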

NFS… on Steroids

Our scalable architecture enables our system to service thousands of GPUs in parallel… but scalability alone doesn’t satisfy the I/O requirements of data-hungry GPU servers. To break through the legacy limits of NAS, VAST has spent the last several years working with the kernel community to improve the standard NFS client. Today we support far more than the legacy TCP-based, single-path experience that customers have come to expect from NFS. With VAST, customers can achieve well over 100GB/s of storage throughput per NVIDIA DGX system thanks to innovations such as NFS over RDMA, multipathing and NVIDIA GPUDirect Storage. To learn more about accelerated NAS access, read Subramanian Kartik’s blog, where he details how to turn NFS performance and scalability up to 11.
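
A quick back-of-the-envelope calculation shows why a single network path cannot deliver that kind of throughput. The sketch below counts the minimum number of links needed to carry 100GB/s at 100, 200 and 400Gb/s line rates. It assumes ideal links with no protocol overhead, so real deployments would need some headroom on top; the function name is ours.

```python
import math


def links_needed(target_gbytes_per_s: float, link_gbits_per_s: float) -> int:
    """Minimum number of links whose combined raw line rate meets the target.

    Ignores protocol and encoding overhead (an idealizing assumption).
    """
    target_gbits_per_s = target_gbytes_per_s * 8  # convert bytes to bits
    return math.ceil(target_gbits_per_s / link_gbits_per_s)


if __name__ == "__main__":
    for rate in (100, 200, 400):
        print(f"{rate} Gb/s links needed for 100 GB/s: {links_needed(100, rate)}")
```

Even at 400Gb/s, 100GB/s requires at least two links working in parallel, and at 100Gb/s it requires eight. Spreading one mount across several network paths is what closes that gap.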

Extra Efficiency From the DPU

NVIDIA BlueField DPUs support both Ethernet and InfiniBand connectivity over low-latency 100, 200 and eventually 400Gb/s networking. They offload networking, RDMA (which underpins NFS over RDMA and GPUDirect Storage) and management functionality. In fact, the latest VAST NVMe scalable storage nodes no longer require dedicated x86 CPUs, since the DPUs run the storage software and handle all the networking. The result is a 33% reduction in power consumption and up to 50% less rack space for the same capacity, along with faster storage performance.

An Affordable, Simple and Rock-Solid Enterprise NAS Experience

Theory meets practice when applied to the systems we are supporting and the software we are deploying into the field. VAST’s value proposition goes way beyond scale and performance:

Our next-generation approach to flash efficiency means customers no longer need to tier their training and inference infrastructure; they can simply scale all-flash storage performance at HDD economics. VAST systems save customers up to 90% vs. competing all-flash solutions.
Our simple appliance model means customers don’t have to deal with the integration challenges that are common with software-defined storage. VAST’s legendary support via our global team of Co-Pilots stems from the fact that every system deployed is based on a platform that we QA test the living hell out of.
We’ve now achieved six 9s of availability across our fleet of customers, which is pretty magical when you consider that our average cluster size is over 10PB. Upgrades and expansions are always done online without downtime. Our system experience is best reflected by our recent Gartner Peer Insights surveys, where 100% of VAST customers said they would recommend VAST.
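
For a sense of what six 9s means in practice, the arithmetic below converts an availability figure into allowed downtime per year. It is only a unit conversion, assuming a 365-day year; the function name is ours.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600  # assumes a non-leap year


def downtime_seconds_per_year(nines: int) -> float:
    """Allowed downtime per year for an availability of `nines` nines.

    For example, nines=6 means 99.9999% availability.
    """
    unavailability = 10.0 ** -nines
    return unavailability * SECONDS_PER_YEAR


if __name__ == "__main__":
    for n in (3, 4, 5, 6):
        print(f"{n} nines: {downtime_seconds_per_year(n):,.1f} seconds of downtime per year")
```

Six 9s works out to roughly 31.5 seconds of downtime per year, compared with about 5 minutes at five 9s.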

Tune in, Turn on

Stay tuned for more as we publish our reference architectures in collaboration with NVIDIA and showcase how incorporating NVIDIA BlueField DPU technology further simplifies and accelerates access to VAST amounts of data.

Until then, we’ll leave you with some eye candy: below is a storage cluster that’s powering an 8,000+ server AI supercomputer pioneering new fields of machine learning and analytics. Scale out is finally… simple.