Our Biggest Software Release, Ever

Authored by

Jeff Denworth

This blog post was written in 2023 and reflects product capabilities at that time. Some information may be outdated.

v4.6 – v4.7.

Our new release train has arrived at the VAST product station.

What’s in a name? Software companies for years have tried to de-correlate numbers from releases because of their perceived significance or lack thereof based upon the numbering applied. The Android team at Google may have done the best job at this… but at the end of the day, Android Froyo was still Android OS 2.2 and while the customers were left hungry for frozen yogurt, the name doesn’t help you understand what the ingredients of the release actually are.

Which brings us to the 4.6 + 4.7 release announcement VAST Data is making this week. While the numbering isn’t awe-inspiring, the feature content within represents our largest software release in history.

~300

Sized at roughly 300 engineers, the VAST R&D team is a formidable organization that has now achieved a level of size, scale and sophistication that puts us shoulder-to-shoulder with some of the world’s leading infrastructure engineering teams in the enterprise software and cloud infrastructure space. As such, we’re inventing and developing VAST OS simultaneously in multiple directions and in multiple dimensions. This new release is the summation of the last 6 months of tireless work and genuine blank-sheet innovation.

OK, enough hyperbole… let’s get into the meat of it:

This release codifies VAST’s leadership in secure and intelligent data services delivery.

Today’s announcement unveils exciting new capabilities that take VAST and our customers into new frontiers and will be known as the first proper release that starts to unfold VAST’s larger data platform agenda. There’s a lot to unpack, and several of the features could be the centerpiece of a major feature release alone… so, to give them the right attention we will, over the next five weeks, introduce each of the major capabilities with longer-form introductory content that explains each major capability in detail. Today, it’s my honor to introduce the whole spectrum of innovation and preview all of what we’re bringing to the table.

Spotlight on the VAST Data Catalog

The Semantic Layer and Data Layer Are Now Synthesized

For decades, organizations have built independent database infrastructure alongside their content stores to make sense of unstructured and even semi-structured data. PACS medical record archives are always coupled with SQL datastores. Photo services marry object storage and NoSQL databases. Hadoop Data Lakes leverage Hive for their metastores.

The proliferation of systems of record in the modern IT stack has created a level of complexity that makes it impossible to get a complete spectrum of information from any one data store. The IT industry has grown accustomed to the idea that structured, unstructured and semi-structured data stores should all be distinct, only because no single system has been designed to achieve true data synthesis. Until now.

The VAST Catalog is an extension of the Element Store that makes it possible for VAST clusters to catalog each and every file and object written to them in an extensible tabular format. That format lets data users further enrich and tag data with additional user-defined context (in the form of data recorded in new columns) and query massive datasets at any level of scale. Users and administrators can now tap into a powerful tool that provides global insight into their vast reserves of data: a fully synthesized and synchronized data catalog that requires no integration and is never out of sync with the datastore.

The VAST Catalog will enable a number of new applications of VAST OS:

VAST administrators will leverage the Catalog for capacity management & chargeback
Backup and archive applications can use a new differential to traverse the namespace even faster, resulting in faster backups and rapid data migration
Applications can replace POSIX functions with SQL statements to see rapid accelerations for mundane POSIX operations… why find when you can select * 1,000 times faster? Have a need for speed??? The Catalog is your answer.
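To make the find-versus-select idea concrete, here is a minimal sketch that uses an in-memory SQLite table as a stand-in for a catalog. The table name, the columns (including the user-defined `project` tag) and the sample rows are all hypothetical; this is not the VAST Catalog schema or its query interface, only an illustration of answering namespace questions from a table instead of a tree walk.

```python
import sqlite3

# Hypothetical catalog: one row per file or object, plus a user-defined tag column.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE catalog (path TEXT PRIMARY KEY, owner TEXT, "
    "size_bytes INTEGER, mtime INTEGER, project TEXT)"
)
rows = [
    ("/data/genomics/run1.bam", "alice", 4_000_000_000, 1_700_000_000, "genomics"),
    ("/data/genomics/run2.bam", "alice", 6_000_000_000, 1_700_100_000, "genomics"),
    ("/data/render/shot42.exr", "bob", 250_000_000, 1_700_200_000, "render"),
    ("/data/render/notes.txt", "bob", 2_048, 1_700_300_000, "render"),
]
conn.executemany("INSERT INTO catalog VALUES (?, ?, ?, ?, ?)", rows)


def large_files(min_bytes):
    """Rough equivalent of `find -size +N`, answered by a query rather than a walk."""
    cur = conn.execute(
        "SELECT path FROM catalog WHERE size_bytes > ? ORDER BY path", (min_bytes,)
    )
    return [row[0] for row in cur]


def usage_by_owner():
    """Bytes consumed per owner, the kind of aggregate a chargeback report needs."""
    cur = conn.execute(
        "SELECT owner, SUM(size_bytes) FROM catalog GROUP BY owner ORDER BY owner"
    )
    return dict(cur.fetchall())


print(large_files(1_000_000_000))
print(usage_by_owner())
```

The same table serves both the "find" style lookup and the capacity report, which is the point of keeping metadata in a queryable form.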

Spotlight on Policy-Based Quality of Service

Service Classes, Logically-Provisioned All The Way To User-Level

For nearly half a decade, customers have invariably asked our sales teams for some tiering mechanism, at least until they understand the revolutionary all-flash economics we bring to the table. Having said that, we’ve also observed that many hyperscale service providers have developed service classes by thin provisioning vast pools of hardware, logically, into presentations that have different service class personalities.

With this new release, we’re extending our QoS offering beyond the Pools concept we’ve previously introduced. We’re now offering the additional ability to create different service plans that set min/max limits on bandwidth and IOPS for each VAST User or View (our term for a share, export or bucket) so that service providers (public and private cloud) can contain noisy neighbors.
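For readers who want a feel for how a per-user or per-View ceiling can contain a noisy neighbor, here is a minimal token-bucket sketch. A token bucket is a common rate-limiting technique; nothing here is taken from VAST's implementation, and the class name, the `max_iops` and `burst` parameters, and the injectable `clock` are all assumptions made for illustration. It covers only the "max" side; guaranteeing a minimum needs a scheduler, which this sketch does not attempt.

```python
import time


class TokenBucket:
    """Caps operations per second for one user or View (hypothetical policy object)."""

    def __init__(self, max_iops, burst, clock=time.monotonic):
        self.rate = float(max_iops)    # tokens added per second
        self.capacity = float(burst)   # largest burst allowed
        self.tokens = float(burst)     # start with a full bucket
        self.clock = clock             # injectable so tests can control time
        self.last = clock()

    def allow(self, ops=1):
        """Return True if `ops` operations fit within the limit right now."""
        now = self.clock()
        # Refill according to elapsed time, never beyond the burst capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= ops:
            self.tokens -= ops
            return True
        return False


# One bucket per tenant keeps a busy tenant from starving a quiet one.
limits = {"tenant-a": TokenBucket(max_iops=1000, burst=200),
          "tenant-b": TokenBucket(max_iops=100, burst=20)}
```

Because each tenant draws from its own bucket, a burst from `tenant-a` never consumes `tenant-b`'s allowance.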

User-Level views, in particular, are a big step forward for a number of our service provider partners who have been stuck with the more limited approaches offered by the likes of NetApp.

For those prospects who still want tiered solutions to offer their application owners, well… just give them different service classes that can be flexibly provisioned (and changed) from one simple-to-manage and scalable cluster before you rush out to implement some legacy, tiered datastore approach.

Spotlight on Enterprise Key Management

From Onboard to External, Encryption Is Now Even More Secure

Consistent with the above service provider focus, we’re now applying the same thinking to external key management by supporting KMIP 1.2. This standard Key Vault interface API allows you to connect the most popular EKM systems to your VAST Clusters so that:

tenants can bring and manage their own unique encryption keys
tenants can rotate their keys with any regularity they’d like to
customers can exert control over how they encrypt while still saving on infrastructure – thanks to our revolutionary Similarity-based approach to global data reduction

To start, we’re offering support for Thales CipherTrust and IBM Key Protect. In the future, we will work to expand this support matrix according to our customers’ priority.

Spotlight on Global Snapshots and Global Clones

Stitching Together A VAST Namespace From A Constellation of Clusters

VAST’s Element Store has always featured a fine-grained approach to implementing Snapshots. Snapshots are reserved without taking performance away from the application. They are fine-grained data reservation operations with byte-level granularity, and customers can flexibly take 100,000s of snapshots at any namespace depth with as little as 15-second granularity. With VAST, there are never partitions or volumes you need to worry about – each VAST server accesses the same global dataset thanks to our Disaggregated, Shared-Everything (DASE) Architecture.

Starting with this new release, VAST clusters now also support the ability to share and extend snapshots to multiple remote clusters. Each remote site can mount another site’s snapshots and even make clones of this data to turn a snapshot into a read/write View. This capability lays the foundation for other work we will unveil, relating to building a global namespace from edge to cloud.
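As a rough mental model of turning a snapshot into a read/write clone, the sketch below treats a namespace as a path-to-content mapping. It is a toy built on Python's standard library, not a description of how the Element Store does it: the function names are hypothetical, the snapshot is a frozen copy rather than a byte-granular reservation, and nothing here models remote clusters.

```python
from collections import ChainMap
from types import MappingProxyType


def take_snapshot(namespace):
    """Freeze the current path -> content mapping as a read-only view of a copy."""
    return MappingProxyType(dict(namespace))


def clone(snapshot):
    """Writable view: writes land in a private layer, reads fall through to the snapshot."""
    return ChainMap({}, snapshot)


live = {"/projects/report.txt": "v1"}
snap = take_snapshot(live)

# The live namespace keeps changing; the snapshot does not.
live["/projects/report.txt"] = "v2"

# A clone can be modified freely without touching the snapshot it came from.
work = clone(snap)
work["/projects/report.txt"] = "edited in clone"
work["/projects/new.txt"] = "only in clone"

print(snap["/projects/report.txt"])   # still the snapshot-time content
print(work["/projects/report.txt"])
```

One limitation of this toy: deleting a path that exists only in the snapshot layer raises `KeyError`, because `ChainMap` deletes from the first mapping only.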

Spotlight on Uplink with ML-Informed Capacity Prediction

A More Intelligent Approach to Global Fleet Capacity Management

As a reminder, Uplink is our global, cloud-based fleet management console designed to make it easy for customers to manage all of their VAST clusters from a single pane of glass.

With the power of elastic computing (which comes from the cloud) and some sprinkling of Python-based machine learning libraries, customers can now become much smarter about the data utilization trends within their clusters and take advantage of Capacity Predictions so that they can keep ahead of demand.

Uplink now provides a one-click (tunable) prediction basis and a prediction horizon that is built on a time-series database and made available as a capacity estimation tool. ML also helps the system understand how to normalize against data trend anomalies that would otherwise confuse a classic statistical model, making for more accurate capacity predictions.
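To illustrate why anomalies matter for a capacity forecast, here is a deliberately simple sketch. It is not Uplink's model and it is not machine learning: it uses the median of day-over-day changes as a robust growth estimate, so a one-day spike followed by a purge barely moves the forecast. The function names, the sample series and the 200 TB capacity are all made up for the example.

```python
from statistics import median


def daily_growth(used_tb):
    """Median day-over-day change; a single spike or purge barely moves it."""
    deltas = [b - a for a, b in zip(used_tb, used_tb[1:])]
    return median(deltas)


def days_until_full(used_tb, capacity_tb):
    """Days until capacity is reached at the robust growth rate, or None if not growing."""
    rate = daily_growth(used_tb)
    if rate <= 0:
        return None
    return (capacity_tb - used_tb[-1]) / rate


# Steady growth of 2 TB/day, with a one-day spike to 130 TB that is cleaned up the next day.
used = [100, 102, 104, 130, 108, 110, 112]
print(daily_growth(used))
print(days_until_full(used, capacity_tb=200))
```

A plain average of the same deltas would happen to agree here because the spike and the purge cancel out, but a spike that lands on the last sample, or one that is never cleaned up, would skew an average far more than a median.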

Spotlight on More Zero-Trust Security Work

Did we mention cloud service providers, yet? 🙂

While Zero-Trust is a critical capability for VAST as we become the foundation for leading cloud service providers, it’s equally important to our government and enterprise customers who are constantly swimming upstream to keep their infrastructure maximally secure.

In addition to supporting KMIP/EKM, we’re also very excited to extend the Zero-Trust agenda by announcing support for:

TLS-Based Access for NFS (arguably an easier method of providing in-flight encryption)
PseudoFS (now allowing NFS 4.1 customers to limit access to subsets of directories)
Rocky Linux for VAST OS (moving beyond CentOS to enable compliance, STIG, FIPS)
S3 Audit (adding to the NFS and SMB audit we already support)

Spotlight on VAST’s Kubernetes CSI Update

VAST’s CSI now supports different storage classes for persistent and ephemeral volumes, and Helm Charts are now provided for easier implementation of VAST’s CSI… all helping to make it easier and easier to power your on-prem and private computing clouds with VAST.

OK, so I love/hate to say that this is just a listing of the high-level features from this release.

Love, because I’m astounded by the effort of our R&D team and their pace of innovation
Hate, because I’m out of blog space to explain any more 🙂

Stay tuned in to the VAST blog over the coming weeks for additional detail behind each of these new feature releases. Coming up first: the VAST Catalog.

Customers, just ask for the release notes. Competitors, your move…

– Jeff