Get the transactional consistency of a relational database and the query performance of an exabyte-scalable data warehouse at the cost of a data lake. The VAST Data Platform unifies structured and unstructured data on a next-generation distributed architecture that accelerates and simplifies analytical workflows across edge, core, and cloud.

Overview

All-flash data lakes at archive economics.

As modern query engines like Spark and Trino replaced Hadoop, organizations turned away from HDFS and DAS-based architectures in favor of object storage based on AWS S3. However, S3's lower performance required query engines to introduce workarounds, such as caching middleware, which greatly complicated architectures.

Introducing VAST DataBase, a high-speed transactional and analytical database that can handle millions of transactions per second and terabytes per second of query throughput at exabyte scale. Designed to run at the edge, core, and cloud, the VAST DataBase enables organizations to capture data in real time, archive it at data lake scale, and query real-time streaming data across exascale datasets. Without the need for separate databases, data warehouses, data lake platforms, or complex ETL pipelines, it's now possible to deliver insights faster and at lower TCO than ever before.

VAST solves the challenges of performance and scale for open platform analytics with scale-out NFS that provides parallel-file-system levels of performance. Cloud-native applications benefit from VAST’s high-performance S3 implementation combined with full multiprotocol interoperability. Managing data at scale is simple with the VAST Catalog, an always-in-sync automatic metadata index built on the VAST DataBase that lets you search and find data via an intuitive UI, with a SQL interface for advanced queries and workflow automation.

A new column format

Embracing flash all the way to the archive enables query filtration levels impossible on HDD and hybrid solutions. VAST created a new columnar object designed to exploit the random access performance of NVMe. At just 32 KB, it is 4000 times smaller than a standard data science row group. This means that it can deliver a much smaller data payload for a given query, resulting in significant performance improvements. The fine granularity of the object also makes database maintenance much simpler. Updates, deletions, and pruning never require complex vacuuming operations.
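The size comparison above can be checked with a little arithmetic. The sketch below assumes that a "standard data science row group" means roughly 128 MiB, a common default for columnar file formats. That figure is an assumption, not something stated here, and the helper `bytes_fetched` is a hypothetical name used only for illustration.

```python
KIB = 1024
MIB = 1024 * KIB

# Stated in the text: the VAST columnar object is 32 KB.
vast_object_bytes = 32 * KIB

# Assumption (not from the text): a typical row group is about 128 MiB.
row_group_bytes = 128 * MIB

# How many times smaller the VAST object is than the assumed row group.
size_ratio = row_group_bytes // vast_object_bytes


def bytes_fetched(units_touched: int, unit_bytes: int) -> int:
    """Hypothetical helper: minimum bytes read when a query touches
    `units_touched` storage units of `unit_bytes` each."""
    return units_touched * unit_bytes


# A highly selective query whose matches fall inside a single unit.
fine_grained = bytes_fetched(1, vast_object_bytes)
coarse_grained = bytes_fetched(1, row_group_bytes)

print(f"size ratio: {size_ratio}x")
print(f"fine-grained read: {fine_grained} bytes")
print(f"coarse-grained read: {coarse_grained} bytes")
```

Under this assumption the ratio comes out slightly above the rounded "4000 times" figure, which is consistent with the claim. The point of the comparison is that the smallest readable unit bounds how little data a selective query can fetch.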

Customer information and travel supplier data are our most vital assets and we need a data science platform that can easily and cost-effectively scale with our growth. To help provide our customers with the best value for their travel needs, we require a high-performance big data solution to run our machine learning algorithms, that’s also infinitely scalable to meet our future needs.

Idan Zalzberg
Chief Data Officer, Agoda (A Booking.com Subsidiary)

Key Benefits

A smarter way to power query engines.

The VAST Data Platform is a revolutionary architecture built for the era of deep learning. It provides predictable, real-time performance and the capacity to support thousands of simultaneous queries, at the scale query engines need to read randomly across massive data sets.

Scales Transactions Linearly

VAST’s DASE architecture allows the VAST DataBase to scale transactions linearly by simply adding CPUs, ending the trade-offs of shared-nothing architectures.

Put An End To Complex Data Engineering

VAST eliminates the need for the caches, separate metastores, and data partitioning imposed by legacy architectures.

Complex Queries Run 100x Faster*

Accelerate data science with support for Spark, Trino, and additional query engines plus native SQL and analytics applications. 

* Point of comparison: VAST DB + Spark vs. Spark + S3

Global Namespace

Simplify data access from anywhere with a single global namespace across cloud, edge, and core.

Consistent Snapshots Across Multiple Tables

Near-limitless, granular snapshots of one or many tables remove the complexity of time-travel operations.

Superior Data Reduction

VAST’s similarity-based data reduction combines the global approach of deduplication with the byte-granular approach of pattern matching for unparalleled efficiency without performance impact.
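To make the general idea concrete, here is a toy sketch of reducing a block by encoding it against a similar reference block. It is not VAST's algorithm: it swaps in Python's standard `zlib` preset-dictionary feature as a stand-in, and the function names, block size, and test data are all hypothetical choices made for illustration.

```python
import random
import zlib


def compress_plain(block: bytes) -> bytes:
    """Compress a block on its own, with no reference data."""
    return zlib.compress(block, 9)


def compress_against(block: bytes, reference: bytes) -> bytes:
    """Compress a block using a similar reference block as a preset
    dictionary, so only the byte-level differences cost space."""
    compressor = zlib.compressobj(level=9, zdict=reference)
    return compressor.compress(block) + compressor.flush()


def decompress_against(payload: bytes, reference: bytes) -> bytes:
    """Rebuild the original block from the payload and the same reference."""
    decompressor = zlib.decompressobj(zdict=reference)
    return decompressor.decompress(payload) + decompressor.flush()


# Illustrative data: a 4 KiB pseudo-random block that barely compresses
# on its own, plus a near-copy that differs in only eight bytes.
rng = random.Random(0)
reference_block = bytes(rng.getrandbits(8) for _ in range(4096))
edited = bytearray(reference_block)
edited[100:108] = b"CHANGED!"
similar_block = bytes(edited)

plain_size = len(compress_plain(similar_block))
delta_payload = compress_against(similar_block, reference_block)
delta_size = len(delta_payload)
restored = decompress_against(delta_payload, reference_block)

print(f"compressed alone: {plain_size} bytes")
print(f"compressed against similar block: {delta_size} bytes")
print(f"round trip ok: {restored == similar_block}")
```

The two blocks are not identical, so exact-match deduplication would store both in full. Encoding the second block relative to the first captures the shared content while still recording the changed bytes.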

Reference Architecture

Breaking trade-offs with The VAST Data Platform
