VAST Data Platform for Life Sciences

Decode the Building Blocks of Life

The VAST Data Platform delivers the scale and performance for advanced life sciences workloads, from genomics to structural biology to the AI workflows that accelerate life-saving therapies.

Trusted by the world’s leading genomics & life science organizations

Extract deeper insights and accelerate breakthroughs with fully synthesized structured and unstructured data. The VAST Data Platform delivers the power of real-time data access to feed AI-driven research at the speed of an all-flash parallel file system without any of the complexity.

Overview

Scientific discovery has evolved beyond legacy storage approaches.

Data is the foundation for transformative discoveries in the life sciences. Each generation of life sciences technology creates more data than the previous one: genome sequencing produces 30 times more data than exome sequencing, and cryo-electron tomography (Cryo-ET) will replace cryo-electron microscopy (Cryo-EM) with data payloads 10 times larger. When these data sets are available in real time for machine learning and shared for open science collaboration, innovation accelerates. However, the storage systems that many institutions have relied on can no longer provide the performance at the scale needed for efficient bioinformatics pipelines, and their lack of cloud integration impedes collaboration.

The VAST Data Platform is the foundation for accelerating new discoveries. VAST delivers the performance to excel at workloads such as AlphaFold and provides the scale for the largest biobanks. Effective data management requires a new approach that fuses unstructured and structured data. To this end, VAST created the VAST Catalog, a built-in, fully automated metadata index that makes it easy to find and curate precise datasets. As a platform that can be deployed across edge, core and cloud, VAST creates a global namespace that simplifies cloud-based life science pipelines, including cloud bursting and collaboration.

Taking advantage of new advancements in AI will be pivotal to help us make sense of all of this data, and the VAST Data Platform allows us to collect massive amounts of data, so that we can ultimately map as many neural circuits as possible - and its mechanisms for collaboration enable us to rapidly share that data around the world.

David Feng
Director of Scientific Computing, Allen Institute

Key Benefits

The next chapter in AI-powered life sciences is being written.

The VAST Data Platform is built as an intelligent system, designed from the ground up to make AI-accelerated life science workflows simple to deploy and manage. VAST presents a unified view of both unstructured and structured data. The VAST DataBase is a high-speed transactional and analytical database with support for query engines such as Spark and Trino to accelerate analytics workloads. The VAST DataStore’s multi-protocol support for NFS, SMB, and Object simplifies workflows from instruments to GPUs to workstations.

All Flash, at Archive Economics

Deploy all-flash infrastructure for all your active data sets. VAST's advancements in commodity flash management, data protection, and data reduction deliver a lower TCO than hybrid storage alternatives.

AI-Optimized Client Access

With support for RDMA and GPUDirect Storage access, VAST’s NAS experience delivers the performance of a parallel file system without the complexity of one. A single NFS client can saturate a 100Gb connection (up to 11GB/s over EDR InfiniBand).
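
To put the 11GB/s figure in context, the arithmetic can be sketched as below. This is only an illustration: the two input numbers come from the text above, while the function name and the choice to ignore protocol and encoding overhead are assumptions made for the example, not a VAST benchmark method.

```python
def line_rate_gbytes_per_s(gbits_per_s: float) -> float:
    """Convert a link speed in gigabits per second to gigabytes per second."""
    return gbits_per_s / 8  # 8 bits per byte

link_gbits = 100.0       # 100Gb connection, as stated in the text
observed_gbytes = 11.0   # "up to 11GB/s", as stated in the text

# Raw ceiling of the link, ignoring protocol and encoding overhead (assumption).
raw_ceiling = line_rate_gbytes_per_s(link_gbits)

# Fraction of the raw ceiling that a single client would be using.
utilization = observed_gbytes / raw_ceiling

print(f"raw ceiling: {raw_ceiling:.1f} GB/s")
print(f"utilization: {utilization:.0%}")
```

Under these assumptions, a 100Gb link tops out at 12.5GB/s of raw bandwidth, so 11GB/s from one client corresponds to about 88% of the wire.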

Ultra-Resilient, Ultra-Reliable

VAST eliminates the problems of common node failures, delivers fail-in-place resilience, and is designed for 99.999+% uptime from petabyte to exabyte scale.
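
As a rough guide to what a 99.999% design target means in practice, the sketch below converts an availability percentage into a yearly downtime budget. The helper name and the 365-day year are assumptions chosen for the example; the figure is a design goal quoted in the text, not a measured result.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # non-leap year (assumption)

def max_downtime_minutes(availability_percent: float) -> float:
    """Yearly downtime budget, in minutes, for a given availability target."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)

# 99.999% is the uptime design target stated in the text.
budget = max_downtime_minutes(99.999)
print(f"{budget:.2f} minutes of downtime per year")
```

At five nines, the budget works out to a little over five minutes of downtime per year.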

Accelerate Every Step of the Pipeline

Scale performance linearly as your data volumes grow. The VAST Data Platform was engineered from the ground up to achieve optimal processor efficiency for genomics, proteomics, neuroscience, and AI pipelines.

Multi-Tenant Infrastructure

Dedicate front-end servers and the performance they provide to the most critical projects. VAST's server pooling capability provides dedicated Quality of Service for competing projects.

Global Namespace

Simplify data access from anywhere with a single global namespace across cloud, edge, and core.

Reference Architecture

Sample Genomics Pipeline with VAST
