VAST Data Platform for Artificial Intelligence

The Data Platform for the AI Era

Trusted by world-leading AI organizations to deliver the scale, speed, and reliability needed to serve the entire AI data pipeline, VAST accelerates time-to-insight by delivering the only data platform capable of addressing needs across data capture, data preparation, model training, and model serving.

Designed from the ground up to make the entire AI data pipeline simple to deploy and manage, the VAST Data Platform is a next-generation architecture that delivers file, object, database, and edge-to-cloud services in one scalable, affordable all-flash system.

Overview

The new age of computing requires a new approach to data pipelines.

Every industry is at the dawn of a new AI-powered era thanks to rapid advances in artificial intelligence. Still, the barrier to entry is high for many IT organizations. New machine learning workloads, such as training generative AI models, exceed the performance and scale capabilities of traditional enterprise infrastructures. HPC systems based on parallel file systems provide adequate performance, but their complexity and lack of enterprise features make them difficult for many IT teams to support.

When organizations first explore their needs for AI infrastructure, the focus is often on only one aspect of the AI data pipeline: model training. This results in a narrow search for a solution that fits only that one stage. Taking a step back and considering the entire end-to-end AI pipeline reveals many different requirements, which typically force unnecessary movement of data between systems and regions. The result is complicated and tedious data pipelines where data must be constantly copied from tier to tier to give AI training access to the data it needs.

Enter VAST, a comprehensive AI Data Platform with the performance and scalability for the most demanding AI applications, combined with revolutionary data efficiency technologies that reduce the cost of flash to archive-tier economics. When all data throughout the pipeline is available at high performance, training and serving workflows are simplified and time-to-insight is reduced.

Key Benefits

Unleashing Exascale Innovation

The VAST Data Platform accelerates your entire AI data pipeline by offering a holistic approach to addressing the needs of today’s data-driven organizations. If desired, its unique features can address any individual stage of the pipeline. However, as the platform is extended to more stages, the time-to-insight benefits compound, while the amount of infrastructure and the number of data copies decrease.

Coming 2024: The DataEngine is an intelligent computing environment that customers deploy from edge to cloud. By embedding logic directly into the VAST Data Platform, the system can schedule processing events in real time, triggered by data activities.
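
To illustrate the general idea of processing triggered by data activities, here is a minimal, generic sketch in Python. It is not the DataEngine API, which this page does not describe. The names DataEvent, TriggerRegistry, on, and dispatch, the event kind "object_created", and the example path are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class DataEvent:
    """Hypothetical record describing one data activity."""
    kind: str  # e.g. "object_created" (illustrative name only)
    path: str  # location of the affected data


class TriggerRegistry:
    """Hypothetical in-memory registry that maps event kinds to handlers."""

    def __init__(self) -> None:
        self._handlers: Dict[str, List[Callable[[DataEvent], str]]] = {}

    def on(self, kind: str, handler: Callable[[DataEvent], str]) -> None:
        # Register a handler to run whenever an event of this kind arrives.
        self._handlers.setdefault(kind, []).append(handler)

    def dispatch(self, event: DataEvent) -> List[str]:
        # Run every handler registered for the event's kind, in order,
        # and collect their results. Unknown kinds yield an empty list.
        return [handler(event) for handler in self._handlers.get(event.kind, [])]


registry = TriggerRegistry()
registry.on("object_created", lambda e: f"prepare:{e.path}")

# A new object arriving triggers the preparation step immediately.
results = registry.dispatch(DataEvent("object_created", "/datasets/raw/batch-001"))
print(results)
```

In this sketch the data activity itself drives the processing step, so no separate polling or copy job is needed to notice that new data has arrived.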

A Unified, Multi-Protocol Platform

Serve unstructured data (NFS, SMB, and S3) and structured data (native SQL applications and query engines like Spark and Trino) from a single platform.

AI-Optimized Client Access

With support for NFS-over-RDMA and NVIDIA Magnum IO GPUDirect™ Storage access, VAST delivers the performance of a parallel file system without any of the parallel file system complexity.

Unified Support for Data Prep & Model Training

Accelerate feature engineering with VAST + Spark while simultaneously feeding large training and inference workloads.

Simplified Data Pipelines

Eliminate time-consuming data copy workflows with all data available in real time plus high-performance access via InfiniBand and Ethernet.

Edge to Cloud Data Access

Achieve instant data availability across locations using VAST’s global namespace with strict write consistency and the performance of local systems.

Multi-Tenant Security

Mitigate risks in multi-tenant environments with customer-managed encryption keys, audit tools, QoS, and data management isolation from the host operating system.

Reference Architecture

The Data Platform for the Entire AI Pipeline

A full-stack, end-to-end AI solution that simplifies the creation and expansion of AI deployments.
