I think I’m the type of person who gets into things after everyone else. In that regard AI is no different, and for a long time I considered LLMs a toy - this was truer of older models, such as the original ChatGPT models that came out in 2022-2023.
The discourse has understandably evolved over time, and it’s clear that AI is not going anywhere. It’s like quadcopters in warfare, or so many other new technologies before it: as much as we’d like them not to be used or not to exist, they will be anyway. Refusing to adopt new advancements means being left behind and handicapping yourself on purpose.
Ultimately the problems around AI stem from capitalism. Yes, there are excesses - but the same is true of humans.
AI - especially LLMs, which I have the most experience with - is great at some tasks and absolutely abysmal at others, just like some people are good at their job while others don’t know the first thing about it. I used to get an ad on Twitter for some guy’s weird messianic book, and in it he showed two pages. It was the most meaningless AI bullshit, just faffing on and on while saying nothing, written in the most eye-rolling way.
That’s because LLMs currently aren’t great at writing prose for you. Maybe if you prompt them just right they can manage it, but that’s a skill in itself. So there is bottom-of-the-barrel quality and there is better quality, and that was true with or without AI. I think the over-reliance on AI to do everything regardless of output quality will eventually be pushed out, and the people who work that way will stop finding success (if they ever found it in the first place - don’t readily believe people when they boast about their own success).
I use AI to code, for example. It’s mostly simpler stuff, but:
1- I would have to learn entire programming languages to do it myself, which takes years. AI can do it in 30 minutes, and better than I could after years, because it knows things I don’t. Take security, for example: would a hobbyist programmer know how to write secure web code? I don’t think so.
2- You don’t always have a coder friend available. In fact, I started using AI to code my solutions because, try as we might, we could never find coders to help. So it was either not implement cool features that people would like, or do it with AI.
And it works great! I’m not saying it’s the top-tier quality I mentioned, but it’s a task AI is very good at. Recently I even gave DeepSeek all the JS code it had previously written for me (plus some handwritten code) and asked it to refactor the entire file, and it did: we went from a 40 kB file to 20 kB after refactoring, and 10 kB after minifying. It’s not a huge file, of course, but it’s the kind of thing AI can do for you.
There is of course the environmental cost. To that I want to say that everything has an environmental cost. I don’t necessarily deny that AI is a water-hog; my point is that under capitalism, everything contributes to climate change and droughts. Moreover, to be honest, I’ve never seen actual numbers or studies - everyone just says “generating this image emptied a whole bottle of water”. It’s one of those things people repeat idly, like so many others; and without facts, we cannot find truth.
Therefore the problem is not so much with AI as with the mode of production, as expected.
Nowadays it’s possible to run models on consumer hardware that doesn’t cost $10,000 (though you might have seen that post about the $2,000 rig that can run the full DeepSeek model). DeepSeek itself is very efficient, and even more efficient models are being made, to the point that it may soon cost more (in money and resources) to meter API usage than to give it away for free.
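To make the “run it locally” point concrete, here’s a minimal sketch of querying a locally-running model through Ollama’s REST API. This assumes you’ve installed Ollama and pulled a model beforehand; the model tag and prompt are just illustrations, not anything specific to this thread:

```python
import requests

# Ask a locally-running model a question via Ollama's local REST API.
# Assumes Ollama is running and `ollama pull deepseek-r1:7b` (or any
# other model tag you prefer) has been done beforehand.
resp = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "deepseek-r1:7b",  # assumed model tag, swap in your own
        "prompt": "Explain what a reverse proxy does in two sentences.",
        "stream": False,  # return one JSON object instead of a token stream
    },
    timeout=120,
)
resp.raise_for_status()
print(resp.json()["response"])
```

Everything runs on your own machine; no API key, no metering.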
I think your place as a user is finding where AI can help you individually. People also like to say AI fries your brain, that it incentivizes you to shut your brain off and just accept the output. I think letting that happen is a mistake, and it’s up to you not to. I’ve learned a lot about how Linux works, how to manage a VPS, and how to work on MediaWiki with AI help. Just like you should eat your vegetables and not too many sweets, you should be able to say “this is bad for me” and stop yourself from doing it.
If you’re a professional coder and work better with handwritten code, then continue with that! As for students relying on AI for everything, schools need to find other methods. Right now they’re going backwards to pen-and-paper tests. Maybe we should rethink the entire testing method? When I was in school, years before AI, my schoolmates and I could already tell that rote memorization was torture and a 19th-century way of teaching. I think AI is just the nail in the coffin for a very, very outdated method of teaching. Why do kids use AI to do their homework for them? That is a much more important question than how they are using it.
As a designer I’ve used AI to help get me started on some projects, because that’s my weakness: once I get the ball rolling it becomes very easy for me, but getting it moving in the first place is the hard part. If you’re able to prompt it right (which is definitely something I lament - it feels like you have to say the right magic words or it doesn’t work), it can help with that, and then I can do my thing.
Personally, part of my initial unwillingness to get into AI came from the evangelists who say literally every new tech thing is the future. Segways were the future, crypto was the future, VR was the future, NFTs were the future, Google Glass was the future… They make money from saying these things, so of course they have an incentive to say them. It still bothers me that they exist, if you were wondering (in case they bother you too lol), but ultimately you have to ignore them and focus on your own thing.
Another part of it, I think, is how much mysticism there is around it, with companies and, let’s say, AI power users who are unwilling to share their methods or explain how LLMs actually work. They keep information to themselves, or lead people to believe this is magic that does everything.
Is AI coming for your job? Yes, probably. But burying our heads in the sand won’t help. I see a lot of translators talking about the soul of their art - everything has a soul and is art now (I even saw a programmer put it that way to explain why they don’t use AI in their work). We’ve gone full circle back to base idealism to “explain” how human work is different from AI work. AI already handles some translation work very well, and professionals are already losing work to it. Saying “refuse to use AI” is not materially sound; it is not going to save their client base. Under socialism, getting your job automated is desirable - under capitalism, of course, it isn’t. But this is not new either: machines have been replacing human workers for centuries, as far back as the printing press, to name just one. Yet nobody today is saying “return to scribing monks”.
I think it would be very useful to have an AI guide written for communists by communists - something everyone can understand, written from a proletarian perspective: not the philosophy of it, but how the tech works, how to use it, and so on. I can put it up in the ProleWiki essays space if someone wants to write it; we’ve put up guides before - there’s a nutrition and fitness guide written from a communist perspective, for example.
Nobody here is advocating for the service model that companies like OpenAI use. I think this tech should be built on open-source models that can be run locally. These companies also lack any clear business model. This is a great write-up on the whole thing: https://www.wheresyoured.at/the-haters-gui/
Creating a model is a one-time effort; the ongoing usage is what really counts. Also, most new models aren’t trained from scratch either: they use foundation models as the base and then tweak the weights. There are also techniques like LoRA that let you adapt an already-trained model cheaply, as in the sketch below.
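For illustration, here’s a minimal LoRA sketch using the Hugging Face peft library. The base model and hyperparameters are arbitrary choices for the example, not anything from this thread:

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

# Load an existing foundation model (gpt2 here purely as a small example).
base = AutoModelForCausalLM.from_pretrained("gpt2")

# LoRA freezes the base weights and trains small low-rank adapter matrices
# injected into chosen layers ("c_attn" is GPT-2's attention projection).
config = LoraConfig(
    r=8,                        # rank of the adapter matrices
    lora_alpha=16,              # scaling factor for the adapter output
    target_modules=["c_attn"],
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)

model = get_peft_model(base, config)
# Typically well under 1% of parameters end up trainable, which is the point:
# you adjust a trained model without redoing the original training run.
model.print_trainable_parameters()
```

You then train `model` like any other, but only the tiny adapters get updated, which is why fine-tuning is so much cheaper than training from scratch.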
However, even this is improving rapidly. Here’s one example:
Now compare that with DeepSeek.
These aren’t even the really interesting uses of AI. The reason there’s so much focus on chatbots in the West is because there’s no industry to speak of. Compare this with China:
Very similar sentiment from the founder of Alibaba Cloud as well: https://www.youtube.com/watch?v=X0PaVrpFD14
I really don’t see what code quality has to do with LLMs, to be honest. You have the final say on what the code looks like, and my experience is that you can sketch out the high-level structure of the code and have the LLM fill it in. Generally it’ll produce code that’s perfectly fine, especially for common scenarios like building a UI, an API endpoint, etc. This is precisely the kind of tedious code I have little interest in writing, and it lets me focus on the actually interesting parts of the project.
If you haven’t used them for even a couple of months, then yes, you’re missing out on very large strides. The quality of output is improving on a practically monthly basis right now, and how you use the models matters as well. If you’re just typing stuff into a chat, you’ll have a very different experience from using something like plandex or roocode, where the model has access to the whole project, can run tests, and can iterate on a solution.
It’s easy to dismiss this stuff when you already have a bias against it and don’t want it to work, but the reality is that it’s already a useful tool once you learn where and when to use it.
this is definitely fair. i think my big issue with it is the inordinate amount of resources (land, carbon emissions, water) that goes into it. maybe i’ve unfairly associated all ai with openai and gemini and meta.
my understanding of deepseek is that most of their models are trained by engaging in dialogue with existing models. in that case the cost of training and running those models should be taken into account. if it’s trained from scratch that might change things, if the carbon and water numbers are good.
i think that’s a problem with the definition of ai. it’s not clear to me what tim huawei defines ai as. i’m not arguing against the concept of machine learning, to be clear. i thought we were talking specifically about language models and photo and video generation and whatnot
yeah that’s fair enough. i didn’t mean to get into a huge discussion over llms because there’s definitely an element of that in my head. idk, i guess my point was that you can shit out a more-or-less working piece of code in any language pretty quickly if you don’t need it to be idiomatic or maintainable. my understanding was that ai was kind of the same in that regard.
i guess if training large language models can be done with negligible emissions and cooled with gray or black water, i can’t be against it. programming is definitely the main field where you can’t argue llms are useless. i’m still unconvinced that’s what’s happening, even with deepseek, but if they’re putting their datacenters on 3-mile island and using sewage to cool their processors, i guess that would assuage my concerns.
I very much expect the whole bubble to pop because these companies still haven’t found a viable business model. I agree that the way these big companies approach things is incredibly problematic. At the same time, the best thing to do is to promote development of this tech outside corporate control. We already saw the panic over DeepSeek being open-sourced, and the more development happens in the open, the less leverage these companies will have. There’s also far more incentive to make open solutions efficient, because people want to run them locally on commodity hardware.
Sure, but that also shows that you don’t need to train models from scratch going forward. The work has already been done and now it can be leveraged to make better models on top of it.
Text, image, and video generation is just one application for these models. Another application of multimodal AI is integrating information from different sensors - vision, sound, tactile feedback - which makes it useful for building world models that robots can leverage to interact with their environment. https://www.globaltimes.cn/page/202507/1339392.shtml
yeah but you gotta count the emissions from the datacenters running the old models. i don’t think that accounting is being done by openai, and i don’t think it’s possible for deepseek. actually, i don’t think openai is doing any accounting at all.
is this the same kind of ai as above? idk, the unqualified term “ai” is kind of ambiguous to me.
We already agree that companies like OpenAI are a problem. That said, even these companies have an incentive to use newer models that perform better, to reduce their own costs and stay competitive. If OpenAI needs a data centre to do what you can do on consumer-grade hardware with a model like Qwen or DeepSeek, they’re not gonna stay in business for very long.
And yeah, the Global Times article is specifically talking about multimodal LLMs, which is the same type of AI.
no i mean, is the ceo of alibaba referring to llms?
I mean that’s what his team is working on, and that’s the type of AI that’s seen the most focus in China.