• RedWizard [he/him, comrade/them]@hexbear.netOP
    2 days ago

    It’s a capital problem. What you’re talking about is what made DeepSeek so disruptive; they optimized it to consume fewer resources. Except, in America, Sam Altman also has his hands in the nuclear power industry explicitly for powering AI datacenters. So it’s all gravy. If the new model eats eight times more power, then the market for nuclear power has more demand, which means more business down the pipe. Then you have to consider that this model probably requires even more powerful GPUs (DeepSeek could run on less powerful ones), which is good for the GPU and APU markets. We get to throw out all the year-and-a-half-old compute modules in favor of the next compute module.

    • FunkyStuff [he/him]@hexbear.net
      2 days ago

      Still worth noting that, while DeepSeek is a huge improvement over the American AI firms, they still don’t really have a solution for scaling up how smart their models are compared to the others; they can just make an equivalent model much more cheaply. So it doesn’t really solve the problem the AI firms are trying to skirt around, which is that they can’t deliver on the “today ChatGPT can count the r’s in strawberry, in 3 years it will be able to build a space station” type promises when the models don’t scale.