o3's AI Revolution

In today's video, I'm diving...
Real quick, five things you may have missed in the huge announcement from OpenAI about their brand-new model, o3.

Number one: there is a prize for the first model to reach artificial general intelligence. It's called the ARC-AGI prize, and it has a special exam the model has to take that's supposed to simulate human general intelligence. o3 scored 87%, and the human baseline on that particular test is not 100%. I don't know why, but it's not. It's 85%. So o3 mimicked human performance to the point where they had to do two things. First, the prize organizers said, we're not awarding it because it's too expensive to run. And second, they said, we're making the test harder. We think the test should be harder. So they moved the goalposts on o3. That's how good it is.

Number two: o3-mini is coming early in the year. One of the things people don't realize is that o3 is a model that uses test-time compute. You type in your query, if you have thousands of dollars to spend, and o3 takes a long time to think about it. I'm going to explain why later in this TikTok. And then it comes back. What's interesting is that once you have that base model, you can distill it into something much faster after you've trained the original. That's called distillation; it's how we got GPT-4o mini. And one of the things the OpenAI team called out is that o3-mini is coming: faster, cheaper, probably better than o1, probably in January or February. And that's really exciting. It obviously won't be as good as full o3, but for the purposes of most people's tasks most of the time, incredible.

Number three: how does o3 work? It is an LLM, but it's not just an LLM; I'm going to explain. It works by running a Monte Carlo search across thousands of calls to large language models. In a Monte Carlo search, very simply, you don't know which path to choose toward a solution.
You imagine all the paths forward, and you pick the one you think wins the most. That's probably the correct path. Very, very simply, that is what is happening when o3 is in full-compute mode. That is why it takes a long time: it is making thousands of those calls, running those simulations forward. And that approach is incredibly effective; it is how Go was beaten. Go was a game that was notoriously difficult for machines to play until this exact approach showed up in AlphaGo, and then all of a sudden it started beating people. o3 uses the same idea, but obviously applies it to language, to mathematics, to science, to other things.

Number four, which you may not know about o3: it is now the 175th-best programmer in the world on the competitive-programming leaderboards. So if you worry about whether it codes, it codes better than all but 175 other people on the planet.

And number five, right on the heels of that: the reason we're not all doomed is that even though the world has obviously changed, 98% of us have no idea. I was in an airport yesterday, and everyone around me had no clue that any of this had happened. They weren't paying any attention at all. That is the way most of the world works right now. Momentum and change in society take a long time. The gap between the invention of the steam engine and when steam was fully industrialized is something like 150 to 200 years. It may not take that long for artificial intelligence; it will probably be much faster. But even much faster is still a long time. And it's still not clear that you would actually give o3 a person's job, for a very interesting reason. As you give models more reasoning capability, and o3 is more of a reasoning model than o1, while both reason far more than GPT-4o, they become less directly agentic. They don't just do what they're told in the same way; they think it through, they're thoughtful, they respond.
Now, that may make them more intelligent, but it doesn't necessarily make it easier to have them do a job. And there are a lot of other human soft skills that we haven't figured out how to replace, not even close. Circling all the way back to the top of our five points, that is exactly what the ARC-AGI prize called out: there are things they are not measuring that seem really relevant for human performance, and these models just don't know how to do them. So that's why I say we're approaching AGI, entering our AGI era for the Taylor Swift fans, but it's going to be a blurry line. So there you go: five things you may not know about the o3 model. This is a huge deal, we're going to keep talking about it, and we're all going to have a wild new year together.
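The Monte Carlo search idea from point three can be sketched in code. This is a toy illustration only, not anything from OpenAI's actual system: the game (walk a number line and land exactly on a target), the step sizes, and the rollout counts are all made up. The shape of the loop is the real point: for each candidate next move, run many random simulations to the end, score how often each move "wins," and commit to the best one.

```python
import random

random.seed(0)  # make the sketch reproducible

TARGET = 7        # the "solution" we are searching for
STEPS = (1, 2, 3) # moves available at each turn
MAX_DEPTH = 6     # how far a rollout may look ahead

def rollout(position, depth):
    """Play random moves to the end; report whether we hit the target."""
    for _ in range(depth):
        if position == TARGET:
            return True
        position += random.choice(STEPS)
    return position == TARGET

def best_move(position, simulations=2000):
    """Estimate each candidate move's win rate via Monte Carlo rollouts."""
    scores = {}
    for move in STEPS:
        wins = sum(rollout(position + move, MAX_DEPTH - 1)
                   for _ in range(simulations))
        scores[move] = wins / simulations
    return max(scores, key=scores.get)

# Commit to one move at a time, each chosen by simulating the future.
position = 0
while position < TARGET:
    position += best_move(position)

print(position)  # lands exactly on 7
```

In an LLM-based version of this idea, "moves" would be candidate reasoning steps sampled from the model and a rollout would be a full generated solution, which is why point three's full-compute mode burns thousands of model calls per answer.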