Overcoming the Technology Triple Challenge

Man Group Uses VAST Data as part of DataFrame Database

Industry: Financial Services
Use Case: HPC & AI

Overview

An important part of Man Group’s internal organization is its quant research and technology team. The team comprises hundreds of professionals around the globe, who build the complex systems and data models that the firm uses to support its various products and services.

Much of this work is based on the analysis of “historical tick data”, a type of time-series data that logs every transaction, moment by moment, for a wide variety of instruments and asset classes (stocks, bonds, commodities, futures and so forth). It represents everything that happens in a particular market for all items traded. These datasets can cover periods from years to decades and are enormous. Individual data elements can span hundreds of millions of rows or hundreds of thousands of columns, powering use cases such as deep tick history analysis or modelling of large corporate bond universes.

Background

The first challenge follows directly from this scale: because the datasets are so large, simply loading and accessing them can be time-consuming. VAST Data helps here by making it possible to load and access these datasets in a reasonable amount of time. The time it takes to get to work directly affects how much work gets done, so the VAST Data Platform helps improve both productivity and capability. Speed is key to keeping up with fast-changing, always-moving markets.

The second challenge concerns how datasets are handled once they are loaded and ready to process, since only then can they be ingested for analysis and used to drive models or simulations. Man Group’s previous database solution limited the size of the datasets it could ingest. With many important datasets expected to break the 10PB barrier soon, this limitation was a problem, and flexible capacity that can accommodate datasets of 5PB or even 10PB is increasingly essential.

The VAST Data Platform supports massive datasets of this kind. Man Group has also developed ArcticDB, a high-performance, Python-native database built to cope with the ever-increasing amount of data and the growing complexity of front-office research at the firm, a challenge faced by many large buy-side and sell-side institutions. Using ArcticDB, Man Group’s investment professionals and technologists can better power robust, near-real-time automated trading. ArcticDB also enables point-in-time analysis of research datasets and provides functionality for signal backtesting.
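To make “point-in-time analysis” concrete, here is a minimal sketch of the idea: every write of a dataset is kept as a separate version, and a read can ask for the data as it was known at an earlier moment. This is a toy in-memory illustration using only the Python standard library. It is not ArcticDB’s API, and the class `VersionedStore`, the symbol name `EXAMPLE_TICKS` and the prices are all hypothetical.

```python
from bisect import bisect_right
from datetime import datetime


class VersionedStore:
    """Toy in-memory store that keeps every written version of a symbol.

    Hypothetical illustration only; this is not how ArcticDB is implemented.
    """

    def __init__(self):
        # symbol -> list of (written_at, table), kept in write order
        self._versions = {}

    def write(self, symbol, table, written_at):
        history = self._versions.setdefault(symbol, [])
        if history and written_at <= history[-1][0]:
            raise ValueError("versions must be written in increasing time order")
        # Copy the columns so later mutation by the caller cannot alter history.
        snapshot = {col: list(values) for col, values in table.items()}
        history.append((written_at, snapshot))

    def read(self, symbol, as_of=None):
        history = self._versions[symbol]
        if as_of is None:
            # No timestamp given: return the most recent version.
            return history[-1][1]
        # Find the last version written at or before `as_of`.
        times = [written_at for written_at, _ in history]
        idx = bisect_right(times, as_of)
        if idx == 0:
            raise KeyError(f"no version of {symbol!r} existed at {as_of}")
        return history[idx - 1][1]


store = VersionedStore()
# Made-up data: an initial load, then a later correction to the second price.
store.write("EXAMPLE_TICKS", {"price": [100.0, 100.5]}, datetime(2024, 1, 2))
store.write("EXAMPLE_TICKS", {"price": [100.0, 100.4]}, datetime(2024, 1, 5))

as_known_jan_3 = store.read("EXAMPLE_TICKS", as_of=datetime(2024, 1, 3))
latest = store.read("EXAMPLE_TICKS")
```

Reading with `as_of` returns the uncorrected version, which is what a backtest should see if it is to avoid using information that was not available at the time. A read without `as_of` returns the corrected data.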

VAST and ArcticDB complement each other, and the combination addresses the third challenge. ArcticDB provides a set of data abstraction and definition tools that work across multiple file systems and object stores, while VAST provides the underlying scale, speed and efficiency needed to sustain performance on the huge datasets involved.

Outcome

Man Group’s internal experience in using VAST helped to inform and enable its design and implementation of the ArcticDB environment. In-house, the company’s biggest challenge has always been to produce as much actionable insight and intelligence as possible to drive investment decisions. 

“VAST’s size and speed helped boost what our quants and technologists could do with ArcticDB,” said James Munro, head of ArcticDB. “Ultimately it’s about productivity, which for us translates into value over time. VAST helps us produce more value in less time.”

Datasets are massive, complex and time-varying. Whatever their original form, though, they quickly end up as DataFrames, the unit of analysis in modern data science workflows, and ArcticDB treats the DataFrame as a first-class concern. As a DataFrame database, ArcticDB enhances the ability to generate new trading strategies, optimise portfolios, and manage investment risk.

Companies in any industry with large amounts of complex data, especially time-series or transaction data, will find the combination of ArcticDB and VAST compelling. The first iteration of ArcticDB was made available on an open source basis via GitHub and has seen over one million downloads since 2015. The latest version continues an open source approach and adds a commercial proposition through an enterprise version for production use.

