Agreed, but I want to see how it plays out. Historically a good Windows computer cost $1000 and it was all it took to start programming. How much does a computer cost with enough resources to run a good enough AI model for agentic workflows, with a reasonable time to first token? Can "most of the world" afford to buy one?
Qwen 3.6 27B is quite good for agentic coding, and practical to run on consumer hardware. You need a system with either 32+ GB VRAM, or a unified memory system with 48+ GB of memory and a decent integrated GPU. While not cheap, such a setup is still attainable for much of the world, and will get cheaper over time. Open models hosted on non-American clouds also remain an option with a much lower barrier to entry, for cases where privacy is less critical.
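For a rough sense of where memory figures like these come from, here is a back-of-the-envelope sketch. It only counts the weights (parameters times bits per weight) and then adds a flat overhead for KV cache and activations. The 20% overhead and the function names are assumptions made for illustration, not measurements of any particular model or runtime.

```python
def weight_memory_gb(n_params_billion: float, bits_per_weight: float) -> float:
    """Memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    # n billion parameters * bits per parameter / 8 bits per byte = GB
    return n_params_billion * bits_per_weight / 8


def fits(n_params_billion: float, bits_per_weight: float,
         memory_gb: float, overhead_fraction: float = 0.2) -> bool:
    """True if weights plus an ASSUMED flat overhead fit in memory_gb.

    overhead_fraction stands in for KV cache, activations and runtime
    buffers; the real figure depends on context length and the runtime.
    """
    needed = weight_memory_gb(n_params_billion, bits_per_weight)
    return needed * (1 + overhead_fraction) <= memory_gb


if __name__ == "__main__":
    for bits in (16, 8, 4):
        gb = weight_memory_gb(27, bits)
        print(f"27B at {bits}-bit: {gb:.1f} GB of weights, "
              f"fits in 32 GB: {fits(27, bits, 32)}")
```

By this estimate a 27B model needs 13.5 GB for 4-bit weights and 54 GB for 16-bit weights, which is why quantization decides whether a 32 GB setup is enough.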
There was an article on HN a few weeks ago where someone detailed how they managed to get an old datacenter GPU to run in their consumer PC, getting decent performance with qwen. He spent something like $200 on the GPU (second hand of course).
So yeah, I think models on local hardware will be quite common soon among the tech savvy (such as people creating software).
Especially considering the millions of 2026-class data center GPUs that massively overinvested companies are currently buying, which will be obsolete in a few years.
I think those are going to be run until they die. The capex vs opex is too high to obsolete them in a few years. They'll keep serving current gen LLMs for as long as they keep running.
> You need a system with either 32+ GB VRAM
I do hope you're right that it will get cheaper over time (it should), but right now 32GB of VRAM is not affordable to a lot of people. You're talking ~$4500 just for the GPU, or $800 ish used if you can find one.
For inference you can split the 32GB between two 16GB cards. Two new 5060tis for ~€1000 in total is more than fine.
It's a tad less efficient and a bit more of a hassle, but still a good experience for only a fraction of the price.
Indeed, and with some tinkering around the harness it can even punch way above its weight.
I don't understand the justification for local hardware with cost as the motivation. The same (or bigger/better) open weights models can be served by third parties at much higher resource utilisation, and will therefore be much cheaper!?
Especially because the world is likely to persist, at least for a while, in a state where computing hardware demand drastically exceeds supply, resulting in high prices for hardware. So why wouldn't you want to max out utilisation and amortize costs, at least for typical (non-sensitive) use cases?
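The utilisation argument can be made concrete with a small amortisation sketch. The $3500 price is the prebuilt figure mentioned elsewhere in this thread; the three-year lifetime and both utilisation rates are hypothetical numbers picked only to show the shape of the calculation, and electricity is left at zero unless you pass it in.

```python
def cost_per_active_hour(hardware_price: float, lifetime_years: float,
                         utilisation: float, power_kw: float = 0.0,
                         price_per_kwh: float = 0.0) -> float:
    """Amortised hardware cost plus electricity per hour of actual work.

    utilisation is the fraction of the lifetime the hardware is busy (0-1].
    """
    if not 0 < utilisation <= 1:
        raise ValueError("utilisation must be in (0, 1]")
    active_hours = lifetime_years * 365 * 24 * utilisation
    return hardware_price / active_hours + power_kw * price_per_kwh


if __name__ == "__main__":
    # Hypothetical: the same box used 5% of the time at home
    # versus 60% of the time by a hosting provider.
    home = cost_per_active_hour(3500, lifetime_years=3, utilisation=0.05)
    hosted = cost_per_active_hour(3500, lifetime_years=3, utilisation=0.60)
    print(f"home:   ${home:.2f} per active hour")
    print(f"hosted: ${hosted:.2f} per active hour")
    print(f"ratio:  {home / hosted:.1f}x")
```

With hardware cost alone, the cost per active hour scales inversely with utilisation, so the gap between the two cases is just the ratio of the utilisation rates. It says nothing about privacy or offline use, which are separate reasons to run locally.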
Open weights/source doesn't necessarily mean running on local hardware, though.
I imagine having multiple providers competing will drive down the price of hosted versions of open weight models drastically.
Moore's law or one of its generalizations still holds, so it will only be a short matter of time before a $1k computer will be able to train and run a powerful enough model.
I thought Moore's Law came to an end in the last decade?
Certainly the transistors/chip or transistors/$ or flops/$ have not been progressing at the same exponential rate as during 1970-2010. There is still progress, but it's rather slower.
> Historically a good Windows computer cost $1000 and it was all it took to start programming.
Gotta remember inflation here.
$1K in 1995 was roughly equivalent to $2K now and wouldn't have been a particularly "good" machine then.
In 1982 the Commodore 64 started at about $600, also roughly $2K today.
If you outgrew that, beefier machines back then were A LOT. It was easy to find $2k+ towers and (especially) laptops even into the 2000s, and a lot of those would be $5K+ equivalent today.
And a unix workstation in those days could be high 4 or even 5 figures, depending on configuration.
Yes, between Moore's Law and more efficient model architectures, we just have to let time do its work.
Software models and hardware are getting better all the time—and that’s where some big companies spending billions might stumble! In fact, Microsoft recently announced that they’re scaling back a bit on their AI investments.
Historically the cost of compute has also gone down. Like just look at it as compared to a year ago. We have amazing open source models that can run on consumer hardware and if we go away from our obsession of using opus 4.8 or mythos for everything then it actually is super amazing to see what these open source models could do. I use qwen3.6:27b as a daily driver and I am heavily impressed with it.
Roughly about Eur 3-4K right this minute I think? The graphics card, ram and storage are punishing. Under more normal circumstances (hopefully late 2027) it'd be 1500-2500 depending on what you think is realistically useful.
Possibly it's the same price range, allowing for inflation.
Before the AI "crisis" it used to take about $3500 to get a prebuilt with a 5090 which can run good enough LLMs. I run reasonable LLMs on just 16GB of VRAM on my Mac, and the 5090 has double that.
Isn’t this just a bet that I’ll have an AI data center in my iPhone within 10 years? Why is that a bad bet?
About $2k in 2026 dollars and falling.
... or rising, at least as long as there's a RAM shortage.
I’d bet that there won’t be a RAM shortage for very long.
The best article I've seen about that is this one by David Oks (ignore the headline, the content is much better): https://davidoks.blog/p/ai-is-killing-the-cheap-smartphone
> It was only in 2025, as memory prices began an unprecedented surge, that the memory makers started to build new fabs targeted at HBM, all slated to start producing chips in 2027 or 2028.
It still won’t help unless the AI bubble pops. Even old fabs will continue pumping out HBM instead of DRAM as long as hyperscalers gobble it up.
This seems wildly optimistic, do you have anything to support it?
The RAM shortage is predicated on both the huge datacenter buildout (many of which are already mired in delays, with a few even cancelled outright), and the massive memory purchase commitments various hyperscalers have made - hyperscalers who seem to be running short on cash lately...
History? This isn't the first RAM shortage. When one happens, producers build more fabs. The fabs come online, the availability of memory shoots up, and the shortage goes away, usually replaced by a glut.
If you want to argue that this is different from all previous RAM shortages, you can, but the burden of proof is on you to show the difference.
Hence why brute force needs to be replaced with approaches such as neuromorphic methods. It could realistically be combined with mesh networking as well to utilise the capabilities of all computers locally.
Over the long term, it seems like open models must win out. This feels like it rhymes with the story of operating systems. Despite the enormous financial contributions of Microsoft and Apple, linux still won because control matters over the long term.
I predict that mech interp and things like Neuronpedia will matter more and more over time, and the frontier providers are disincentivized from providing those tools
> linux still won because control matters over the long term.
what has Linux won? Servers? sure
There's a video of the entire session here:
https://webtv.un.org/en/asset/k14/k14ej1ucqu?kalturaStartTim...
(if that link doesn't work, it starts about 12 minutes in)
had to click the play button, but it keyed to the 12m mark
Edge models will get much better after the current insane capex and the organic data for pre-training dry up. But it's hard to see how the best open source models will ever come close to the best closed ones.
It's already happening. GLM-5.2 ranks quite close to SOTA models. Some might argue that benchmarks can't measure the real effectiveness for day-to-day usage but that's another discussion.
There is no reason we should accept the enclosure of the digital commons represented by AI. The data these models are trained on amounts to the total intellectual and artistic output of human kind through recorded history. It belongs to all of us, and accordingly, so should the models and weights produced by it.
It belongs to not us, but the copyright holders of that work
ok, but government is how you do that. and as should be evident, it's easy to tear down and corrupt
Companies aren't really that much better. Actually, sometimes it's even worse, because they are not even formally required to serve the public.
The effort of humans who had to toil through training models belongs to everyone? Do they no longer have any ownership over their hard work?
A drop in the bucket compared to the value of the collective human work that was stolen to train it.
edit: come to think of it, I think the ratio of one drop to one bucket vastly overestimates the trainers' share of the effort.
No human work was stolen, it was read.
Calling what LLMs do just "reading" is, at best, naïve anthropomorphization.
They got paid. That’s what the money was for. It’s the investors who backed these foundational model companies who will hold the bag as more open source models come along and consume more market share.
I agree, but
> the investors who backed these foundational model companies who will hold the bag
It's awfully bold to assume that private credit is who will be holding the bag here. The IPOs are coming to shift the risk to the index funds & retail. Once insider lock-up periods expire, I suspect a massive sell-off.
Are they going to pay for the societal harms they cause?
Some consider copyright to be unethical. "Information wants to be free".
Information also wants to be expensive, and that tension will never go away. "Information wants to be free" is only one side of the context of that quote.
> On the one hand you have—the point you’re making Woz—is that information sort of wants to be expensive because it is so valuable—the right information in the right place just changes your life. On the other hand, information almost wants to be free because the costs of getting it out is getting lower and lower all of the time. So you have these two things fighting against each other.
Information may want to be free, but the humans creating it still need to eat and pay rent. Copyright isn't unethical so much as it's a flawed tool, and it lasts far too long in the law's current state. It needs to last only as long as it takes for the original creator to profit from the work, for a specific duration of time, and then that's it.
No, greedy people want information to be expensive.
It is only because of rampant greed and capitalism that information is not free. There is nothing inherent about the collective knowledge of mankind that lends itself to being proprietary and expensive. Otherwise human society literally could not have evolved.
Great framing for your case, but I think it is less that it is unethical and more that ideas/copyright shouldn't be perpetual, nor should it be fully transferable to a corporation (a non-person entity).
I'd struggle to find an idea, art, technique etc... that wasn't an extension of something that came before it.
I would propose that copyrights not be eternal
Not to mention the unbelievable cost of actually doing all that training.
People used to make similar arguments about programming languages and compilers. Now you'd need extraordinary requirements to justify paying licensing or usage fees for a language runtime or compiler.
Paying them may now be impossible. There might be some legal settlements still.
Preventing a handful of massive companies from continuing to be the only ones able to make money off that, not only unimpeded but with overt or covert state assistance (regulatory capture, ownership, whatever), at least puts an end to the worst of the abuse.
If we have broken the idea of copyright, and we do indeed appear to have broken the idea of copyright, why should trillion dollar companies owned and controlled by strange or psychopathic weirdos and their circle of investors be the only ones benefiting? Why do Sam and Dario or the US government get to decide when and for whom the tap is turned on?
Open-source models democratize access to foundational technology, reducing vendor lock-in risk for organizations. The community iteration model can also accelerate improvements in edge cases that proprietary teams might deprioritize.
Yann is on the mark. Almost amusing to see the EU along with its many former “subjects” realize they are at great risk of joint Chinese-American hegemony in AI. We should all be more terrified of a few nation states defining the agendas and policies of AI use than of current AI variants that are inherently without purpose or autonomy.
Great analogy to the fear of the printing press being really bad news in that it enabled the rabble to get aroused.
AI is the canary in the coal mine. They don't have an AI problem, they have an everything problem: an inability to maintain energy security, declines in manufacturing, social programs that are no longer sustainable (pension age rises and reforms), a German car industry in decline, increased spending demands for defense, and so on.
All that's needed is another sovereign debt crisis to spark what is essentially dry tinder and I think the EU is a lot closer to collapsing than anyone even remotely realizes.
We aren’t going to have Open Source AI without Open Source hardware specs and Open Source manufacturing. Software has been solo driving open computing for far too long, and with AI now the bottlenecks are finally moving up the stack.
I don't see any evidence of this. Open source software thrived on proprietary hardware for decades.
What..? We have open weights AI already, and even if we didn't, I don't see how Open Source hardware is a prerequisite.
Weights are meaningless if you can’t run the model from a computer under your desk.
I don't follow. If weights are open, can't competing providers pop up? Including, e.g., coalitions of anarchists who collectively share compute and collaborate on modifications to the weights.
Even if it's too expensive to run the models on your own personal hardware, open weights may still make it possible to take power back from the big private corporations.
How does that follow? Plenty of open source software runs in a commodity datacenter. This is about the API bottleneck, not the physical location of the GPU.
Fine, but I have a computer under my desk. I didn't have to wait for open hardware in order to get it.
We don't need rinky-dink RTX models that budget VRAM.
We need large scale open weights models just as capable as what's at the frontier.
And we need the ability to rent compute and spin up the weights easily. One-click, easy enough for anyone. Easier than nerd tools like ComfyUI, Claw, and node graph garbage.
Freedom is owning very large scale weights. Anything less is subsistence.
We need to improve the water and energy usage, and this method doesn't. Most people are not reinventing the wheel; a shared AI repository, communicated between online local computers, would remove much of the need for these large models.
I'd love to see credible numbers on the energy usage of thousands of people running models on their own devices compared to sharing data center resources to run big models that serve many different people at the same time.
My hunch is that the energy/water usage of the data centers is a whole lot more efficient than everyone running at home, but I'd be interested in seeing real data on that.
Water usage goes up with data centers because more cooling is needed when you run the hardware harder.
So: if you're running the models on your own machine, presumably you're not running them as often, and air cooling is sufficient. But, at the same time, this is less efficient in terms of hardware use; the data centers need water cooling specifically because they're getting more bang for their buck from their hardware, by running it harder.
So that's the tradeoff: more hardware-use efficiency means more water usage.
With hardware like the Spark and Strix, the water usage is known to be zero, yea?
On the energy front, I assume less efficient, but I also think there is a tradeoff in efficiency versus freedom, that's why I have my own hardware.
All consumer hardware (not counting XOC) uses either air cooling or closed-loop liquid cooling, so the water usage is zero, always. Power is a little trickier. I'd assume it's less efficient, but also the total usage is less, because the user sometimes turns the machine off, and the hardware idles to a deeper sleep state than server hardware.
the comparison misses that local LLM usage covers tasks you'd never send to an API — private code, offline work, medical notes. the baseline is 'local vs not-doing-it', not 'local vs cloud'
NO!
This is the wrong approach that will turn us into serfs. We need big honking models that do what the leading foundation hyperscaler models do to within a few percentage points of measured performance.
The small-scale models are not productive, and the duct tape solutions built on top of them are hobbyist-tier "year of Linux on desktop" toys.
I imagine fedora-wearing, crypto-shilling, coupon-cutting boffins every time I see a small-weights thing lauded as the future. This is the Pine Phone F-Droid of AI.
"SMS works most of the time on my phone, I swear! I don't really need my banking app!"
That is not big model energy.
Nothing outside of the top ten is worth spending any time on, and we need to focus on models that bridge the gap.
You're talking about impractical toys for highly technical people wasting their own time. That doesn't move the needle or have any economic impact on the competitive landscape.
We need sharp teeth that bite at the legs of the top-tier foundation labs and hold them back from running away with the prize.
We've been through this time and time again over the last thirty years. It's the same shaped problem as before. We don't need toys - we need real infra for real people paying money to do work. Not freeware for freeloaders who don't spend and invest in the problem space.
Large models fit that precisely, because they force investment into a wide variety of open infra, routers, inference engines, etc. Not to mention the weights ecosystem itself.