Real-Time Data Engineering & Real-Time Global Data Analytics
Shipping in 2024, the VAST DataEngine will redefine the data computing paradigm by introducing serverless functions and real-time triggers into the VAST Data Platform. Once logic and state are merged... files, objects and tables come to life from edge to cloud.
Data Platforms Need To Evolve
For decades, datastores have been unaware of applications, and applications have been equally unaware of data events. This division between applications and data has led to fragmented approaches to building data pipelines and to a batch-processing mentality that separates data streams from deep data analysis.
The VAST Data Platform aims to break the tradeoff between data streaming and global insight by engineering data processing and event notifications natively into the system.
With the VAST DataEngine, data and changes to data trigger actions, those actions are performed on the data, and the resulting changes can trigger further actions, so the system keeps processing recursively. The DataEngine is the basis for perpetual AI training and inference and, we hope, for the AI-powered discoveries of the future.
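To make the trigger-then-act loop concrete, here is a minimal sketch in plain Python. It is not the DataEngine API: `TinyStore`, `on_write`, and the two handler functions are hypothetical names invented for illustration, and the store is just an in-memory dictionary. It shows only the idea that a write can fire a function whose own write fires the next one.

```python
from collections import deque
from typing import Callable, Deque, Dict, List, Tuple


class TinyStore:
    """Hypothetical in-memory store; every write becomes an event."""

    def __init__(self) -> None:
        self.data: Dict[str, str] = {}
        self.triggers: List[Tuple[str, Callable]] = []
        self.pending: Deque[str] = deque()

    def on_write(self, prefix: str, func: Callable) -> None:
        # Register a function to run whenever a key under `prefix` is written.
        self.triggers.append((prefix, func))

    def write(self, key: str, value: str) -> None:
        self.data[key] = value
        self.pending.append(key)  # the change itself is queued as an event

    def run(self, max_events: int = 100) -> int:
        # Drain events; handlers may write again, which queues more events.
        handled = 0
        while self.pending and handled < max_events:
            key = self.pending.popleft()
            for prefix, func in self.triggers:
                if key.startswith(prefix):
                    func(self, key)
            handled += 1
        return handled


def to_upper(store: TinyStore, key: str) -> None:
    # Action on raw data: write a cleaned copy, which is a new data change.
    store.write("clean/" + key.split("/", 1)[1], store.data[key].upper())


def count_chars(store: TinyStore, key: str) -> None:
    # Action on cleaned data: derive a statistic from it.
    store.write("stats/" + key.split("/", 1)[1], str(len(store.data[key])))


store = TinyStore()
store.on_write("raw/", to_upper)
store.on_write("clean/", count_chars)
store.write("raw/a.txt", "hello")
events_handled = store.run()
print(events_handled, store.data)
```

The `max_events` cap is a design choice of this sketch: a chain of triggers can in principle run forever, so a toy loop needs an explicit bound.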
A Programmable Computing Engine in Software
The DataEngine is a containerized computing environment that customers deploy on their choice of CPUs, GPUs and DPUs – from edge to cloud. By embedding logic directly into the VAST Data Platform, the system can schedule processing events in real time, triggered by data activities.
DataEngine Programmable Environment
The VAST DataEngine provides a programmable Python environment where developers can bring their own code. A number of built-in functions are also provided out of the box to get value from the VAST Data Platform:
File header indexing
PII data detection
Streaming between tables/topics/files
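As an illustration of what a function such as PII data detection might do, the sketch below scans text with two regular expressions. This is an assumption-laden toy, not VAST's built-in function: the name `detect_pii`, the pattern set, and the report shape are all hypothetical, and real PII detection needs far broader rules than these two patterns.

```python
import re
from typing import Dict, List

# Hypothetical, deliberately small pattern set for illustration only.
PII_PATTERNS = {
    "email": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def detect_pii(text: str) -> Dict[str, List[str]]:
    """Return every match per PII label; labels with no match are omitted."""
    findings: Dict[str, List[str]] = {}
    for label, pattern in PII_PATTERNS.items():
        matches = pattern.findall(text)
        if matches:
            findings[label] = matches
    return findings


sample = "Contact jane.doe@example.com, SSN 123-45-6789."
report = detect_pii(sample)
print(report)
```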
Next-Generation Event Streaming Infrastructure
The VAST DataEngine features a new data streaming interface designed to write events natively into the VAST DataBase.
For the first time, it is possible to analyze all data by ingesting streaming data in real time into VAST's exabyte-scale transactional and analytical database.
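The pattern of writing events straight into a queryable table can be sketched with Python's built-in `sqlite3` module standing in for the database. Nothing here is the VAST streaming interface: the `events` table, the `ingest` helper, and the sample readings are hypothetical. The point is only that each event is available to analytical queries as soon as it is written, with no separate batch load.

```python
import sqlite3

# In-memory SQLite is a stand-in for the database, purely for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (ts INTEGER, sensor TEXT, reading REAL)")


def ingest(event: dict) -> None:
    # Each streamed event is committed as one small transaction.
    with conn:
        conn.execute(
            "INSERT INTO events VALUES (:ts, :sensor, :reading)", event
        )


def average_reading(sensor: str) -> float:
    # An analytical query over everything ingested so far.
    row = conn.execute(
        "SELECT AVG(reading) FROM events WHERE sensor = ?", (sensor,)
    ).fetchone()
    return row[0]


stream = [
    {"ts": 1, "sensor": "a", "reading": 10.0},
    {"ts": 2, "sensor": "b", "reading": 4.0},
    {"ts": 3, "sensor": "a", "reading": 20.0},
]

running = []
for event in stream:
    ingest(event)
    running.append(average_reading("a"))  # query immediately after each write

print(running)
```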
A Real-Time Event Router
The VAST Event Router unifies unstructured and structured data event management into a common platform, providing event consumers simple tools to trigger action.
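A rough sketch of the routing idea follows: events from file and table sources share one shape and are matched against consumer subscriptions. The `Event` and `Router` classes, the `subscribe` and `publish` methods, and the glob-based matching are assumptions made for this example, not the VAST Event Router's actual interface.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Event:
    source: str  # hypothetical values: "file", "object" or "table"
    kind: str    # e.g. "created", "row_inserted"
    path: str    # file path or table name


class Router:
    """Hypothetical router: one subscription list for all event sources."""

    def __init__(self) -> None:
        self.routes: List[Tuple[str, str, Callable[[Event], None]]] = []

    def subscribe(self, source: str, path_glob: str,
                  handler: Callable[[Event], None]) -> None:
        self.routes.append((source, path_glob, handler))

    def publish(self, event: Event) -> int:
        # Deliver to every subscription whose source and path glob match.
        delivered = 0
        for source, path_glob, handler in self.routes:
            if source in ("*", event.source) and fnmatchcase(event.path, path_glob):
                handler(event)
                delivered += 1
        return delivered


seen = []
router = Router()
router.subscribe("file", "/ingest/*.csv", lambda e: seen.append(("csv", e.path)))
router.subscribe("*", "*", lambda e: seen.append(("audit", e.path)))

n1 = router.publish(Event("file", "created", "/ingest/jan.csv"))
n2 = router.publish(Event("table", "row_inserted", "sales.orders"))
print(n1, n2, seen)
```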
The VAST Data Platform is designed to create structure and insight from unstructured data.
By storing triggers and functions as state in the VAST Data Platform, your code becomes dynamically managed by a global data store that supports global code versioning, global code distribution and global code security policies.
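One way to picture functions stored as state is a store that keeps function source as data and derives a version identifier from its content. The sketch below assumes exactly that and nothing more: `FunctionStore`, `publish`, and `load` are hypothetical names, and content hashing is a technique chosen for this example rather than a description of how the platform versions code.

```python
import hashlib
from typing import Callable, Dict, List, Optional


class FunctionStore:
    """Hypothetical store that keeps function source code as versioned data."""

    def __init__(self) -> None:
        self.sources: Dict[str, str] = {}        # version id -> source text
        self.history: Dict[str, List[str]] = {}  # name -> version ids, oldest first

    def publish(self, name: str, source: str) -> str:
        # The version id is derived from the content, so identical code
        # always maps to the same version.
        version = hashlib.sha256(source.encode("utf-8")).hexdigest()[:12]
        self.sources[version] = source
        versions = self.history.setdefault(name, [])
        if not versions or versions[-1] != version:
            versions.append(version)
        return version

    def load(self, name: str, version: Optional[str] = None) -> Callable:
        # Defaults to the newest version; exec is acceptable only because
        # this toy store holds trusted code.
        version = version or self.history[name][-1]
        namespace: dict = {}
        exec(self.sources[version], namespace)
        return namespace[name]


store = FunctionStore()
v1 = store.publish("score", "def score(x):\n    return x * 2\n")
v2 = store.publish("score", "def score(x):\n    return x * 3\n")

latest = store.load("score")(10)      # newest version
pinned = store.load("score", v1)(10)  # explicitly pinned older version
print(v1, v2, latest, pinned)
```

Because the code lives in the same store as the data, pinning a version or rolling back is a lookup rather than a redeployment.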
A Simple Python SDK
The VAST DataEngine is a serverless platform, programmed in Python, that integrates stateful functions into an exabyte-scale datastore.
By integrating streaming and data processing with an exabyte-scale datastore and database, the Data Platform lets developers call functions on data with minimal code.
Introducing the VAST DataSet
Deep learning data engineering is tough. Data engineers write large dataset files down to archive storage for training… creating a number of problems associated with rigid data management:
If model training requires data variation, new datasets are written to storage, often creating redundant data because the datasets share overlapping training examples
Because conventional datasets are not stored with their training code, it can be difficult to reproduce trained models as data and code continue to evolve independently
With the DataEngine, VAST is introducing a new concept called the VAST DataSet. This new approach to data management leverages the VAST DataBase to create materialized views of example data without copying and re-copying data into monolithic dataset files. DataSets can scale to exabytes. Each DataSet includes an indexed set of examples and the code used for training, so that it is easy to reproduce models on the fly.
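The difference between copying data into dataset files and referencing it through a view can be sketched in a few lines. Everything named here is hypothetical (`EXAMPLES`, `DataSetView`, `fingerprint`) and the example pool is a small dictionary. The sketch assumes a dataset is just a tuple of example indexes plus the training code, so overlapping datasets share the same stored examples.

```python
import hashlib
from dataclasses import dataclass
from typing import Dict, Iterator, Tuple

# Hypothetical shared pool: each training example is stored exactly once.
EXAMPLES: Dict[int, str] = {1: "cat", 2: "dog", 3: "bird", 4: "fish"}


@dataclass(frozen=True)
class DataSetView:
    """A dataset as an index over shared examples plus its training code."""

    example_ids: Tuple[int, ...]
    training_code: str

    def __iter__(self) -> Iterator[str]:
        # Examples are looked up on demand; nothing is copied into the view.
        return (EXAMPLES[i] for i in self.example_ids)

    def fingerprint(self) -> str:
        # Identifies the exact examples and code behind a training run.
        payload = repr(self.example_ids) + "\n" + self.training_code
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()


code = "def train(examples):\n    return sorted(examples)\n"
mammals = DataSetView((1, 2), code)
pets = DataSetView((1, 2, 4), code)  # overlaps with mammals, no duplication

print(list(mammals), list(pets))
print(mammals.fingerprint() != pets.fingerprint())
```

Keeping the code next to the index is what makes a run reproducible in this sketch: the same indexes and the same code always produce the same fingerprint.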
A Global Execution Environment
The VAST DataEngine is built on a container framework that allows for services to be globally executed across the VAST DataSpace.