• zongor [comrade/them, he/him]@hexbear.net
    edited 2 days ago

    Yes, it’s called quantization; it’s like a zip file for an LLM. You can get a model small enough to run on a Raspberry Pi (one drawing maybe 5 amps), and although there is some loss in “intelligence”, it’s still usable for a lot of scenarios. Look up ollama or llama.cpp for details.
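
    Roughly what that means under the hood (a toy numpy sketch of the general idea, not the exact block formats llama.cpp/GGUF use): the weights get stored as small integers plus a scale factor, which is where both the size savings and the small accuracy loss come from.

        # Toy int8 quantization: store weights as 1-byte ints plus one float scale
        # instead of 4-byte floats (~4x smaller).
        import numpy as np

        rng = np.random.default_rng(0)
        weights = rng.normal(size=1000).astype(np.float32)   # stand-in for model weights

        scale = np.abs(weights).max() / 127.0          # map the largest weight to +/-127
        q = np.round(weights / scale).astype(np.int8)  # quantized copy, 1 byte per weight
        dequant = q.astype(np.float32) * scale         # what the model computes with

        print("max round-trip error:", np.abs(weights - dequant).max())

    The quantized GGUF files you download through ollama or llama.cpp (the Q4_K_M-style variants) push this further, down to roughly 4 bits per weight with per-block scales, which is how a 7B model fits in a few GB of RAM.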