Cost of AI is decreasing exponentially - clip f...
So we're talking about O1 Pro at $200 a month, and they're losing money on it. So the question about this fascinating exploration of the test-time compute space is: is it actually possible? Do we have enough compute for that? Do the financials make sense? The fantastic thing is, and it's in the chart I pulled up earlier, if you scroll up a few images, the cost for GPT-3 has plummeted. On the question of whether cost is a limiting factor here, my view is that we'll have really awesome intelligence, AGI even, before it permeates throughout the economy, and this is the reason why. GPT-3 was trained in, what, 2020, 2021? And the cost of running inference on it was $60, $70 per million tokens, so the cost per unit of intelligence was ridiculous. As we scaled forward two years, we've had a 1200X reduction in the cost to achieve the same level of intelligence as GPT-3. So on the X-axis is time, over just a couple of years, and on the Y-axis is log-scale dollars to run inference on a million tokens. You see a linear decline on the log scale from GPT-3 through GPT-3.5 to Llama. It's like 5 cents or something now, versus $60. Those aren't the exact numbers, but 1200X is the number I remember, and that's the humongous drop in cost per intelligence. Now, the freak-out over DeepSeek is, "Oh my God, they made it so cheap." Actually, if you look at this trend line, they're not below the trend line, at least for GPT-3-level quality. They are the first to hit it, which is a big deal, but they're not below it. Now we have GPT-4, and the question is what's going to happen with these reasoning capabilities.
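The decline described above can be sanity-checked with a little arithmetic. A minimal sketch, using the speaker's rough figures ($60 per million tokens at GPT-3's launch, roughly 5 cents now, over about two years); these are approximate transcript numbers, not exact market data:

```python
import math

start_price = 60.0   # $/M tokens, GPT-3 at launch (speaker's rough figure)
end_price = 0.05     # $/M tokens, GPT-3-class model today (speaker's rough figure)
years = 2.0          # elapsed window per the transcript

reduction = start_price / end_price                     # total cost reduction
annual = reduction ** (1 / years)                       # implied yearly factor
halving_months = 12 * years * math.log(2) / math.log(reduction)

print(f"total reduction: {reduction:.0f}x")             # 1200x
print(f"implied ~{annual:.0f}x cheaper per year")
print(f"cost halves every ~{halving_months:.1f} months")
```

The 1200X figure over two years works out to roughly 35X per year, i.e. the cost of GPT-3-level inference halving every couple of months, which is why a straight line on a log-scale plot is the natural way to draw it.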
It's a mix of architectural innovations, better data, better training techniques, better inference systems, better hardware, going from each generation of GPU to new generations or to ASICs. Everything is going to take this cost curve down and down and down. And then, can I just spawn 1,000 different LLMs on a task and pick the best output, or use whatever search technique I want, a tree, Monte Carlo tree search? Maybe it gets that complicated, maybe it doesn't, because it's too complicated to actually scale. Who knows. Bitter lesson, right? The question, I think, is when, not if, because the rate of progress is so fast. Nine months ago, Dario said the cost to train and run inference was at a certain level, and now we're much better than that, and DeepSeek is much better than that. The cost curve for GPT-4, which was also roughly $60 per million tokens when it launched, has already fallen to $2 or so, and we'll probably get it down to cents for GPT-4 quality. That's the base for the reasoning models like O1 that we have today, and O1 Pro is spawning multiple samples, and O3, and so on and so forth. These search techniques are too expensive today, but they will get cheaper, and that's what's going to unlock the intelligence.
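The "spawn many LLMs and pick one" idea above is essentially best-of-N sampling. A minimal sketch of that pattern; note that `generate` and `score` here are hypothetical stand-ins (a real system would sample from an actual LLM and score with a verifier or reward model), and nothing public confirms this is exactly what O1 Pro does:

```python
import random

def generate(prompt: str, seed: int) -> str:
    # Hypothetical stand-in for sampling one candidate answer from an LLM.
    rng = random.Random(seed)
    return f"candidate-{rng.randint(0, 9)} for {prompt!r}"

def score(candidate: str) -> float:
    # Hypothetical stand-in for a verifier / reward-model score.
    return sum(ord(c) for c in candidate) % 100

def best_of_n(prompt: str, n: int = 8) -> str:
    # Spawn n candidates and keep the highest-scoring one. Inference cost
    # grows linearly with n, which is why these techniques are expensive
    # today and why falling per-token cost matters so much for them.
    candidates = [generate(prompt, seed=i) for i in range(n)]
    return max(candidates, key=score)

print(best_of_n("What is 2 + 2?", n=8))
```

Tree search variants (e.g. Monte Carlo tree search over reasoning steps) replace the flat list of candidates with a search tree, but the economics are the same: more samples per query buys more quality at a linear-or-worse cost multiple.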