Formula One cars are impressive machines. They combine optimized weight, aerodynamics and engine power to get the driver around the track as fast as possible. The optimization itself is not one-dimensional: a single season covers racetracks with very different requirements, so a good balance has to be struck between top speed, acceleration and cornering. But anything done to the car – in the design itself or the setup for a specific track – is done towards one metric: winning the race.
Even the drivers are part of the optimization. If you ever have the chance to see a Formula One car up close, you will notice that it was not built for the average person: Formula One drivers are usually rather small and light so that they fit into the cockpit.
If you want to win a race or the whole championship, there is only one way: build a car, tune it carefully for each track and have a driver with enough talent and experience.
What does all this have to do with parallel file systems? More than one might think.
Parallel File Systems Background
Parallel file systems were developed to provide faster access than traditional data center storage for compute tasks with high storage capacity and performance requirements. Those tasks originated in the field of High Performance Computing (HPC).
Typical HPC compute jobs scale out across dozens or hundreds of compute nodes, utilizing thousands of CPU cores or GPUs. IO is often done in waves between the computational steps – reading the input at the beginning, writing checkpoints along the way and finally producing the result in the form of output files. The nature of a parallel job dictates that all individual processes or threads must finish before the whole job finishes. The optimization in this case is ensuring that IO is done as fast as possible and doesn't hold up completion.
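To make the pattern concrete, here is a minimal sketch of such a checkpointing wave – written in Python with mpi4py purely for illustration, since no particular framework is implied above, and with made-up file names. Each process writes its own piece of the checkpoint and then waits at a barrier, so the slowest write determines when the whole job can continue:

```python
# Minimal sketch of the "IO in waves" pattern of a parallel job
# (illustrative only; mpi4py and the file names are assumptions).
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

def compute_step(step: int) -> bytes:
    # Stand-in for the real computation done by this process.
    return bytes(1024)

for step in range(3):
    data = compute_step(step)
    # Checkpoint wave: every process writes its part of the state,
    # typically to a shared (parallel) file system.
    with open(f"ckpt_step{step}_rank{rank}.bin", "wb") as f:
        f.write(data)
    # No process continues until every process has finished its IO,
    # so a single slow write delays the entire job.
    comm.Barrier()
```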
You will have a hard time finding a file system or storage solution that drives the IO for a large (HPC) compute job faster than a well-tuned parallel file system. It provides exactly the features needed to ensure that the time to resolution is as short as possible. And in that sense a parallel file system is like a Formula One car. Without compromise, it is made to deliver performance.
So do parallel file systems represent the prime class of storage solutions? Hardly. Much like cars on the street are not Formula One cars, storage solutions are usually not parallel file systems.
But if nothing is faster than a well-tuned parallel file system, then why aren't they more commonly deployed?
Two reasons: the lack of enterprise features, and the complexity of deploying, operating and tuning them.
Parallel File Systems Limitations
Parallel file systems offer only the features needed to fulfill the IO requirements of large compute jobs. They typically lack the features enterprise users have come to expect from traditional file offerings, including snapshots, replication, advanced data reduction, access via standard protocols (NFS, SMB, S3), configuration via REST APIs or seamless integration with enterprise authentication methods.
For use cases outside large-scale compute jobs, this missing functionality either prevents the use of a parallel file system altogether or makes it necessary to bridge the gaps with add-on software or complex workarounds.
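To illustrate what that gap looks like in practice, here is a short, hypothetical example of the kind of standard-protocol access enterprise applications take for granted – S3 via Python's boto3, with made-up endpoint, credentials and bucket names. On a classic parallel file system, this kind of interface usually has to be bolted on through gateways or add-on software:

```python
# Hypothetical example of standard S3-protocol access to a storage
# system (endpoint, credentials and bucket name are placeholders).
import boto3

s3 = boto3.client(
    "s3",
    endpoint_url="https://storage.example.com",  # placeholder endpoint
    aws_access_key_id="EXAMPLE_ACCESS_KEY",
    aws_secret_access_key="EXAMPLE_SECRET_KEY",
)

# List objects in a bucket exactly as any S3-capable application would.
response = s3.list_objects_v2(Bucket="project-data")
for obj in response.get("Contents", []):
    print(obj["Key"], obj["Size"])
```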
“Ease of use” and “simplicity” have never been design objectives for parallel file systems, because they don't serve the main purpose of faster storage performance. In the same way, the need for maintenance windows or offline reconfigurations was not seen as a problem as long as higher performance could be obtained during operation.
And when it comes to tuning, the degrees of freedom are massive. There are usually dozens or hundreds of settings that change how the system deals with different types of IO patterns – some of them conflict with each other, and some change performance drastically rather than gradually.
While fine-tuning might seem an advantage at first, enabling users to optimize and get the best out of the system they bought, it quickly becomes a disadvantage in environments running more than a single type of job or application. This is especially true when many settings have no real “default” value that achieves good performance for the large majority of IO patterns. It's then that the “well-tuned parallel file system” quickly becomes an unachievable objective.
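To see why a single set of defaults is so elusive, consider two extreme IO patterns hitting the same mount point. The toy Python sketch below (the file path and sizes are arbitrary, and it is no substitute for a real benchmark) runs a large sequential write and then small random reads – settings such as stripe size or caching that favor one pattern will typically penalize the other:

```python
# Toy illustration of two very different IO patterns against the same
# file system (path and sizes are arbitrary; not a real benchmark).
import os
import random
import time

PATH = "testfile.bin"        # in practice: a file on the parallel file system
SIZE = 256 * 1024 * 1024     # 256 MiB test file

def sequential_write(block: int = 4 * 1024 * 1024) -> float:
    """Write the file in large sequential blocks; returns MB/s."""
    buf = os.urandom(block)
    start = time.perf_counter()
    with open(PATH, "wb") as f:
        for _ in range(SIZE // block):
            f.write(buf)
        f.flush()
        os.fsync(f.fileno())
    return SIZE / (time.perf_counter() - start) / 1e6

def random_read(block: int = 4096, reads: int = 10_000) -> float:
    """Read small blocks at random offsets; returns reads per second."""
    start = time.perf_counter()
    with open(PATH, "rb") as f:
        for _ in range(reads):
            f.seek(random.randrange(0, SIZE - block))
            f.read(block)
    return reads / (time.perf_counter() - start)

print(f"sequential write:  {sequential_write():.0f} MB/s")
print(f"random 4 KiB read: {random_read():.0f} IOPS")
```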
Therefore, the complexity of deploying, operating and tuning parallel file systems shouldn't be underestimated. Much like the Formula One car needs a pit crew, careful tuning and an experienced driver, the parallel file system requires bespoke configuration and tuning as well as very experienced system administrators (plural!) to deploy, manage, maintain and upgrade the system. And last but not least, you need users who know how to use the provided resources properly to achieve the best performance.
A New Scale-Out Architecture Offers a New Choice
But if you need fast storage access, then you have to use a parallel file system, right? If your workload depends on storage performance, then there is no way around it, correct?
Until a few years ago the answer would most likely have been “yes.” Parallel file systems were the only way to scale storage performance and capacity efficiently. Enterprise NAS systems were feature-rich but couldn't scale performance to acceptable levels, and even scale-out NAS systems (usually based on “shared nothing” architectures) couldn't scale far enough.
But if we look at where storage performance is needed these days, we see many use cases outside of traditional HPC workloads. They come from very different fields: life sciences, financial services, media and entertainment, and artificial intelligence (AI), to name just a few.
As a consequence, parallel file systems were deployed and used by organizations that didn't have an HPC background or the experienced administrators required for the job. Moreover, the effort and personnel needed to extract real business value from these solutions were often heavily underestimated. They were deployed because there was no other choice.
Today, if you need fast storage access you don't need to use a parallel file system. The VAST Data Platform has eliminated the tradeoffs between performance and complexity and between performance and enterprise functionality, as our customers have seen first-hand and as analyst firm IDC examined in a recent white paper.
By designing the first fundamentally new storage system architecture (DASE) since the early 2000s, VAST has solved the challenges of today's data-centric era. The VAST Data Platform was specifically engineered to avoid the management complexity and heavy lifting of parallel file system operation and optimization. It was made to deliver HPC-like performance with enterprise features, simplicity and availability.
This is why VAST’s certification for NVIDIA DGX SuperPOD is such a game-changer. Never before has an enterprise offering been able to deliver HPC and AI performance and scalability with the enterprise simplicity and resiliency of NAS.
VAST has been deployed successfully at scales of hundreds of petabytes in customer environments, delivering TB/s of bandwidth and millions of IOPS – all from a single namespace. The platform is designed to be resilient and always online. It can withstand multiple media and controller failures and rebuilds data extremely quickly from redundancy information, without manual intervention. The system is resilient against power loss, as there is never data in volatile storage that could be lost. And all system management and upgrades are performed through a unified management console, without downtime or maintenance windows.
Your Test Drive Awaits
If you need to extract real business value from data, there is now a better choice: a scalable storage solution with the highest performance.
Much as Formula One cars are fast but not the best solution for most people's mobility needs, parallel file system solutions are fast but not the best solution for most storage requirements – even when storage performance is a key aspect of those requirements!
In most cases a Porsche 911 is way more appealing: it doesn't require a pit crew or a service after every race, the average driver will be able to drive it, and it can still go super fast. And it comes with the sat-nav and the heated seats!
Take a spin with VAST’s Porsche today.