Granted I’m not a techie guy and can’t code, but like, isn’t there some way to do AI shit but smarter, more efficient, and less fucking wasteful? Is it a coding problem or is it just the nature of the beast that these things drink up entire lakes to spit out the wrong answers to math equations?
It’s a capital problem. What you’re describing is what made DeepSeek so disruptive: they optimized their model so it uses far fewer resources. Except, in America, Sam Altman also has his hands in the nuclear power industry explicitly to power AI datacenters. So it’s all gravy. If the new model eats eight times more power, then the market for nuclear power has more demand, which means more business down the pipe. Then consider that this model probably requires even more powerful GPUs (DeepSeek could run on less powerful ones), which is good for the GPU and APU markets. We get to throw out all the year-and-a-half-old compute modules in favor of the next compute module.
Still worth noting that, while DeepSeek is a huge improvement over the American AI firms, they don’t really have a solution for scaling up how smart their models are either; they can just make an equivalent model much more cheaply. So it doesn’t solve the problem the AI firms are trying to skirt around, which is that they can’t deliver on the “today ChatGPT can count the r’s in strawberry, in 3 years it will be able to build a space station” type promises when the models don’t scale.
Yeah this is why Deepseek is the only GenAI I still use.
The bubble is built around scale, so more bigger = more better
They’re making their chatbots less efficient to please investors who just want biggatons.
The way these particular machines work means you have to make them guzzle 10x as much data, water, energy, or whatever other metric to get a meager improvement in how smart they are. Five years ago, finding ways to scale them up to be thousands of times larger was pretty easy, but now they’re running into the limits and trying to break through by just burning more resources instead of slowing down to find a better approach.
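To put rough numbers on the diminishing returns (the exponent here is illustrative, about the order of magnitude reported in the early scaling-law papers, not an exact figure): if loss only falls as a tiny power of compute, every further nudge in “smartness” costs a multiplicatively bigger pile of GPUs, power, and water.

```python
# Toy scaling-law arithmetic: assume loss ~ C^(-alpha) for compute C, with a small alpha.
# alpha is illustrative (the 2020 OpenAI scaling-law paper reported something on the
# order of 0.05 for compute); the exact value doesn't matter, only that it's small.
alpha = 0.05

def compute_multiplier(loss_ratio: float) -> float:
    """How many times more compute you need to multiply loss by `loss_ratio` (< 1)."""
    # If L = k * C**(-alpha), then L2/L1 = (C2/C1)**(-alpha),
    # so C2/C1 = (L2/L1)**(-1/alpha).
    return loss_ratio ** (-1.0 / alpha)

for drop in (0.95, 0.90, 0.80):
    print(f"{(1 - drop) * 100:.0f}% lower loss -> ~{compute_multiplier(drop):,.0f}x more compute")
```

Running it, a 5% loss reduction is a few times more compute, but a 20% reduction is already closer to a hundred times more, which is basically the wall they’re throwing lakes of water at.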
Yes, it’s called quantization; it’s like a zip file for an LLM. You can get a model small enough to run on a Raspberry Pi (drawing something like 5 amps), and although there is some loss in “intelligence”, it’s still usable for a lot of scenarios. Look up ollama or llama.cpp for details.
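Rough sketch of what quantization actually does, as a toy numpy example (this is the general idea, not how llama.cpp actually packs its GGUF files): squash the float weights into small integers plus a scale factor, and accept a bit of rounding error in exchange for a much smaller model.

```python
import numpy as np

# Toy 8-bit quantization of a weight matrix: store int8 values plus one
# float scale per row, instead of full float32 weights (~4x smaller).
rng = np.random.default_rng(0)
weights = rng.normal(size=(4, 8)).astype(np.float32)

scales = np.abs(weights).max(axis=1, keepdims=True) / 127.0   # one scale per row
quantized = np.round(weights / scales).astype(np.int8)        # what you'd store on disk
dequantized = quantized.astype(np.float32) * scales           # what the model computes with

error = np.abs(weights - dequantized).max()
print(f"max rounding error: {error:.4f}")  # small but nonzero -> the slight "intelligence" loss
```

Real quantized models do this block by block at 4 or even fewer bits per weight, which is why the downloads llama.cpp and ollama use are a fraction of the original size.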