Today, organizations spend far too much time managing data across data lakes, data warehouses, and other siloed data sources. Together, VAST Data and Dremio, the open data lakehouse platform, deliver a unified data analytics platform that brings the performance and functionality typically found in a data warehouse to a flash-powered data lake that can store all your data, including structured (Parquet, JSON and CSV), unstructured, and semi-structured data.
Dremio’s SQL query engine, based on Apache Arrow, supports multiple analytic use cases, from mission-critical business intelligence and reporting to ad-hoc and exploratory analytics. Hard disks cannot adequately support the varying compute needs of these use cases, which makes all-flash infrastructure critical. Until now, many customers have been unable to afford an all-flash data center because of the significant cost difference between hard drives and flash storage systems.
With its unique Disaggregated and Shared-Everything (DASE) architecture, VAST Data has broken this cost and performance trade-off to deliver compounded storage savings that make flash affordable for all of your data. The result is a highly scalable and affordable all-flash file and object platform that allows you to run petabyte-scale analysis at less than half the cost of traditional all-flash solutions, while being many times faster.
Massively-parallel architecture delivers concurrency and real-time access to billions of files and objects.
Elastically scale storage and performance as you add more users, query engines, or simply have more data.
Get all-flash performance for all data operations and searches at half the cost of traditional flash solutions.
Get up to 1.5:1 additional data reduction for pre-compressed Parquet files and up to 3.5:1 for uncompressed CSV files.
Dedicated QoS for Dremio’s sub-engines ensures demanding analytics jobs don’t prevent queries or dashboards from loading.
Store data in various formats and query it over different protocols (NFS or S3) from within the same namespace.
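To put the data reduction figures above in concrete terms, here is a minimal Python sketch of the arithmetic. It assumes the quoted ratios (1.5:1 for pre-compressed Parquet, 3.5:1 for uncompressed CSV) are best-case "up to" values; actual ratios depend on the data. The function name and the sample dataset sizes are hypothetical and chosen only for illustration.

```python
def physical_footprint_tb(logical_tb: float, reduction_ratio: float) -> float:
    """Flash capacity (TB) consumed by logical_tb of data at an N:1 reduction ratio.

    reduction_ratio is the N in "N:1", for example 1.5 or 3.5.
    """
    if reduction_ratio <= 0:
        raise ValueError("reduction_ratio must be positive")
    return logical_tb / reduction_ratio


# Hypothetical dataset sizes, using the best-case ratios quoted in the text.
parquet_tb = physical_footprint_tb(300.0, 1.5)  # pre-compressed Parquet
csv_tb = physical_footprint_tb(350.0, 3.5)      # uncompressed CSV
total_tb = parquet_tb + csv_tb
```

Under these assumptions, 650 TB of logical data would occupy about 300 TB of flash. Treat the result as an upper bound on savings, not a sizing guarantee.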
Solution
Most storage technologies just repackage the same 20-year-old shared-nothing, scale-out architecture, which struggles to reliably scale performance and capacity beyond a few petabytes. With its DASE architecture, VAST Data has reimagined every aspect of what has become typical in storage system design, delivering superior scalability, resilience, and Quality of Service (QoS) at a radically lower TCO for your rapidly evolving data analytics applications.
— Run a variety of analytics workloads on the same namespace
— No noisy neighbor. Allocate containers per engine for QoS
— Resize container pools to tackle workloads of any concurrency
— Eliminate islands of infrastructure
VAST’s Universal Storage combines exabyte levels of scalability with multi-tenant quality of service tools, so you can consolidate all data and all applications onto one scale-out tier of affordable flash. Everything is simpler when you no longer need to move data across silos of infrastructure.
DASE lets you scale performance independently of capacity, so a single namespace can grow to hundreds of petabytes and TB/s of throughput.
Get real-time responsiveness for all your data. With no east-west cluster traffic, DASE enables virtually unlimited linear, predictable scale. VAST systems in production regularly exceed 1000 GB/s.
Enable access to the same data via S3 and NFS simultaneously, eliminating the need to keep multiple copies of the same data. Simply write via S3 and read that same data back via NFS, or vice versa.
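As a rough illustration of reading S3-written data back over NFS, the Python sketch below maps an object key to a file path on an NFS mount. It assumes, and this brief does not specify, that a bucket is exposed as a directory in the shared namespace and that the object key becomes the relative path beneath it. The mount point, directory name, and function name are all hypothetical.

```python
import posixpath


def nfs_path_for_object(mount_root: str, bucket_dir: str, key: str) -> str:
    """Return the NFS path where an S3 object would appear.

    Assumption (not confirmed by this document): the bucket is exported as
    the directory bucket_dir under mount_root, and each "/"-separated part
    of the key becomes a path component below it.
    """
    parts = [p for p in key.split("/") if p]
    # Reject keys that would escape the bucket directory.
    if any(p in (".", "..") for p in parts):
        raise ValueError("key must not contain '.' or '..' components")
    return posixpath.join(mount_root, bucket_dir, *parts)


# Hypothetical example: an object written via S3, located on an NFS client.
path = nfs_path_for_object("/mnt/vast", "analytics", "sales/2024/part-0.parquet")
```

A query engine or script on an NFS client could then open `path` directly, with no second copy of the object.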
Pool compute servers to provide dedicated QoS for competing applications on the same namespace. Now you can run batch and interactive analytics side by side without one workload slowing the other.
VAST’s unique similarity-based data reduction provides the industry’s highest level of data reduction, giving you better storage efficiency.
Add enclosures and compute servers of multiple generations to a single cluster and namespace, eliminating forklift upgrades and data migrations from old clusters to new ones.