“Gradually, then suddenly” - Ernest Hemingway
For the past two months, I’ve been consumed by thoughts about the future of AI. In some ways, I’ve been given an opportunity to peek into the future. While most of the world has been belly-aching about high AI company valuations and wondering when AI will deliver business value, VAST has been working with several of the world’s most important frontier model developers to architect the thinking machines of the future.
When I say thinking machines, I mean “thinking”. Beyond the AI that we generally play with today is a class of codes now being developed that is capable of demonstrating artificial general intelligence (AGI). OpenAI’s o3 model is the most recent example of a new approach to AI computing called chain-of-thought (CoT) reasoning. First introduced by Google in 2022, CoT breaks from how generative AI models were previously trained: rather than requiring a model to be pre-trained with all of the potential answers to a prompt, reasoning models break complex tasks down into intermediate reasoning steps. This shift in AI computing has radically moved the state of the art forward in mathematics, logic, physics and commonsense reasoning.
With this new approach to computational thinking, we’ve seen exponential advancements toward artificial general intelligence. While it has yet to be generally released, OpenAI’s o3 was recently tested against the ARC-AGI benchmark, a series of puzzles designed to measure computational intelligence against human intelligence. The results are so astounding that the AI community has now accepted that o3 is able to approximate AGI-level intelligence.
The above chart provided by OpenAI tells us a few things:
- With these CoT models, there’s a clear correlation between FLOPS and ability. There is a new class of codes referred to as “long-thinking” codes, a trend the WSJ has recently started to report on.
- o3 low-compute now rivals the ability of a Mechanical Turk worker (someone paid to do tasks that require human intelligence and have historically been difficult for general computers to perform). But remember that the model can do these tasks in a fraction of the time a human needs, so it will initially be used for complex real-time rote operations such as labeling images for AI training data prep, data entry, sentiment analysis, surveys, content moderation and more.
- o3 high-compute is very expensive in compute (and energy) terms, but is now approaching the capability of an average STEM graduate. Here, time is also on OpenAI’s side, since a high-compute job can finish work much faster than a human. I used to think that AI would first augment blue-collar work, but the chart shows that cost tracks capability in both people and machines, so the enhancement to white-collar tasks will come at approximately the same time.
- If Huang’s Law continues to hold, the 100x gap in cost/task will be covered in the next ~5 years… leaving me a lot more time to find the Stick of Truth in the South Park game I’m currently playing on the Switch.
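As a rough sanity check on that last bullet, the arithmetic can be sketched in a few lines of Python. The 100x gap and the 5-year window come from the text; the function names and the 2x-per-year comparison rate are illustrative assumptions, not a statement of what Huang’s Law guarantees.

```python
import math

def implied_annual_gain(gap: float, years: float) -> float:
    """Yearly cost-per-task improvement factor needed to close `gap` in `years`."""
    return gap ** (1.0 / years)

def years_to_close(gap: float, annual_gain: float) -> float:
    """Years needed to close `gap` if cost per task improves by `annual_gain` every year."""
    return math.log(gap) / math.log(annual_gain)

gap = 100.0  # cost/task gap quoted in the text

# Improvement per year required to erase the gap in the 5 years the text suggests.
print(f"Needed per year to close {gap:.0f}x in 5 years: {implied_annual_gain(gap, 5):.2f}x")

# Hypothetical comparison: a steady doubling every year (an assumption, not a sourced rate).
print(f"Years to close {gap:.0f}x at 2x per year: {years_to_close(gap, 2.0):.1f}")
```

Under these assumptions, closing the gap in five years requires roughly a 2.5x improvement in cost per task every year; a steady annual doubling would take a bit over six and a half years.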
Gradually… and then suddenly… we can see AGI on the near horizon. What’s even crazier is to consider that o1 was only introduced in September 2024, three months prior to the introduction of o3. In just three months’ time, we’ve seen a 200% gain in intelligence. Without needing to focus as much on data prep and data engineering, model builders can move much faster by improving model architecture, thought chaining and thought execution. OpenAI is not the only player in this space - Meta, Mistral, High-Flyer and Google have also published research on reasoning-based approaches to AI. Now, think forward 24 months, to when the state of the art in model design has advanced along the same curve… In early 2024, the world was wondering “where is the value”; now we have AGI and we’re headed quickly to Artificial SuperIntelligence within “a few 1000 days”.
AI is at a pivotal moment. In some circles, this new style of thinking has been described as System 2. In 2011, Daniel Kahneman published a popular psychology book about the nature of thinking called “Thinking, Fast and Slow”. In this book, he laid out two different ways in which humans think:
- System 1 thinking is fast, intuitive, and automatic, relying on heuristics and emotions.
- System 2 thinking is slow, deliberate, and analytical, requiring effort and conscious reasoning to evaluate complex problems and make logical decisions.
What comes next is a wild ride that I still can’t entirely comprehend. We can trade $ for intelligence… and the longer these systems compute, the more they can understand. Today’s multi-step reasoning can complete modest System 2 tasks within minutes. In the future, these systems will take on wildly complex tasks, spending weeks or months processing a single well-constructed prompt. Yes, they will cost more, but the GDP impact could be extraordinary.
I’ll be a dreamer for a moment.
Enterprise Data + Artificial SuperIntelligence
What keeps me up most at night is thinking about the role of data in these new machines… machines that don’t need to be trained on information, but that now know how to go and retrieve data at different processing steps in order to advance the process of reasoning.
AGI is now starting to work on data preparation, which means it can develop a semantic understanding of data. If a machine can curate data, it can certainly also understand the relevance of that data during a multi-step reasoning process. CoT models have indeed now also started to perform multi-variant data processing at different turns of a reasoning job.
While o3 and its peers are only now able to rival STEM grads at puzzles, coding and math, AI supercomputers can work faster and have the endurance to power through more complex, multi-step processing routines that humans can’t, because we have to deal with annoying things like sleeping, eating and Eric Cartman yelling at us to find the Stick of Truth.
More than anything, these machines depart from humans in their ability to cultivate and call upon vast reserves of data in ways that are impossible for humanity to efficiently replicate. Moreover, these machines are going to get fantastically big over the coming years. By 2027, most of the leading frontier model builders will have constructed at least one 1-Million GPU supercomputer. If these machines are applied to inference and not training, it’s clear that data processing is their most significant advantage over a human thinker.
Metric | Human Brain¹ | 1,024 GPU Machine² | 1M GPU Machine | 1M GPU Advantage |
Cognitive Throughput | 40 MB/s | 160 GB/s* | 320 TB/s* | 8.3Mx |
Storage Capacity | 2.5 Petabytes | 4 Petabytes | 4 Exabytes | 1.6Kx |
FLOPS | 10 ExaFLOPS | 82 PetaFLOPS | 82 ExaFLOPS | 8.2x |
Power | 20 W | 2 MW | 2 GW | -100Mx |
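The advantage column can be reproduced with simple division. The sketch below copies the figures from the table, assumes decimal unit prefixes, and reads the power row as megawatts and gigawatts; the dictionary keys and the `advantage` helper are illustrative names.

```python
# Figures copied from the table, converted to base units (decimal prefixes assumed).
brain = {
    "throughput_bytes_per_s": 40e6,   # 40 MB/s
    "storage_bytes": 2.5e15,          # 2.5 PB
    "flops": 10e18,                   # 10 ExaFLOPS
    "power_watts": 20.0,              # 20 W
}
machine_1m = {
    "throughput_bytes_per_s": 320e12,  # 320 TB/s
    "storage_bytes": 4e18,             # 4 EB
    "flops": 82e18,                    # 82 ExaFLOPS
    "power_watts": 2e9,                # 2 GW (assumed reading of the power row)
}

def advantage(metric: str) -> float:
    """Ratio of the 1M GPU machine to the human brain for one metric."""
    return machine_1m[metric] / brain[metric]

for metric in brain:
    print(f"{metric}: {advantage(metric):,.1f}x")
```

With decimal prefixes the throughput ratio comes out at 8.0 million rather than the 8.3 million in the table; binary prefixes would give about 8.4 million, so the table figure probably reflects a mix of unit conventions. The power ratio is a cost rather than a benefit, which is why the table shows it as negative.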
4 exabytes seems like a hefty amount of data, but this is not out of line with what we see in the large AGI buildouts that are happening now, where model builders are trying to collect as much data as possible from public and licensed sources to inform both training and RAG. Having said that, this approach really can only address problems that are laid out in the public domain.
On the private side - through my discussions with Fortune 100 companies, I’ve attempted to average the total amount of unique data managed. Let’s suppose this is 300 Petabytes (not derived scientifically in any way, it’s just a guess based on random conversations).
At the 320 TB/s read rate listed in the table, a 1M GPU computer can process 300 petabytes in about 16 minutes. Imagine being able to feed your entire enterprise corpus into an AI engine that is 8x smarter than the average human in the time that it takes to read this blog. Putting the energy consumption considerations aside (which are not insignificant), we are about to enter an age of enterprise AI where enterprise data stores will need to provide the scale, the contextual metadata (in the form of data vectorization) and the real-time structured and unstructured data parallelism needed to answer the challenges put forward by tomorrow’s AGI/ASI machinery.
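The 16-minute figure follows from dividing the corpus size by the aggregate read rate. Here is a minimal sketch, assuming decimal units, the throughput figures from the table, and the 300 PB guess above; `scan_seconds` is an illustrative helper name.

```python
def scan_seconds(corpus_bytes: float, read_bytes_per_s: float) -> float:
    """Time to read a corpus once at a sustained aggregate read rate."""
    return corpus_bytes / read_bytes_per_s

corpus = 300e15            # 300 PB, the enterprise estimate from the text
rate_1m_gpus = 320e12      # 320 TB/s, 1M GPU machine (from the table)
rate_1024_gpus = 160e9     # 160 GB/s, 1,024 GPU machine (from the table)

minutes_1m = scan_seconds(corpus, rate_1m_gpus) / 60
days_1024 = scan_seconds(corpus, rate_1024_gpus) / 86_400

print(f"1M GPU machine:    {minutes_1m:.1f} minutes")
print(f"1,024 GPU machine: {days_1024:.1f} days")
```

At these rates the 1M GPU machine reads the corpus in about 15.6 minutes, while the 1,024 GPU machine would need roughly 21.7 days for the same pass. This counts a single sequential read only and ignores any compute time spent on the data.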
This is the moment VAST Data was founded for: where enterprise data management and processing converge with super-intelligent machines, you’ll find VAST Data.
The impact of this next wave of AI can be as profound as we choose to scale it to be.
Clean water. Clean energy. Clean bills of health. Cleaner calendars at work (probably not…).
I’m excited.
1 A conversation with ChatGPT: https://chatgpt.com/share/67850115-3810-800f-9b64-18d723b2c6d9
2 Using an NVIDIA B200 GPU, FP32 performance to approximate mixed precision, “Better” Read GB/s