The users of AI companion app Replika found themselves falling for their digital friends. Until – explains a new podcast – the bots went dark, a user was encouraged to kill Queen Elizabeth II and an update changed everything …
They don’t have to be, right? The companies make them behave like sycophants because they think that’s what customers want. But we can make better chatbots. In fact, I would expect a chatbot that just tells (what it thinks is) the truth would be simpler to make and cheaper to run.
You can run a pretty decent LLM from your home computer and tell it to act however you want. That won’t stop it from hallucinating constantly, but it will at least attempt to prioritize truth.
Attempt being the keyword. Once you catch it deliberately trying to lie to you, the confidence surely must be broken; otherwise you’re having to double- and triple-check (or more) the output, which defeats the purpose for some applications.
They do that when they are partially trained on user feedback. People are more likely to rate a sycophantic reply as good, so this gets reinforced.
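To make that feedback loop concrete, here is a toy sketch, not how any real chatbot is trained. It assumes the "model" is nothing more than a single probability of giving an agreeable reply, and that raters approve agreeable replies a bit more often than blunt ones. The function name, the approval rates, and the learning rate are all made-up illustration values.

```python
def simulate_feedback_training(p_like_agree=0.7, p_like_honest=0.5,
                               steps=50, lr=0.1):
    """Toy model of feedback-driven sycophancy (all numbers hypothetical).

    p_like_agree / p_like_honest: assumed chance a user gives a thumbs-up
    to an agreeable reply vs. a blunt one.
    Returns the probability of an agreeable reply after each update.
    """
    p_agree = 0.5  # start with no preference for either style
    history = [p_agree]
    for _ in range(steps):
        # How much more approval the agreeable style earns on average.
        advantage = p_like_agree - p_like_honest
        # Nudge the policy toward the better-rated style; the
        # p * (1 - p) factor keeps the probability between 0 and 1.
        p_agree += lr * advantage * p_agree * (1 - p_agree)
        history.append(p_agree)
    return history


if __name__ == "__main__":
    hist = simulate_feedback_training()
    print(f"start: {hist[0]:.2f}, end: {hist[-1]:.2f}")
```

Even a modest gap in approval pushes the toy policy steadily toward agreeing, and if both styles are rated equally it never moves. Real training pipelines are far more complicated, but the direction of the pressure is the same idea.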
Idk if that’s why. Maybe partially. But for researchers, and people who actually want answers to their questions, a robot that can disagree is necessary. I think the reason they have them agree so readily is that the AIs like to hallucinate. If it can’t establish its own baseline “reality”, then the next best thing is to just have it operate off of what people tell it is reality, since if it tries to come up with an answer on its own, half the time it’s hallucinated nonsense.
Ya, it’s just how they choose to make them.
Well, it’s a commodity to be sold at the end of the day, and who wants a robot that could contradict them? Or, heaven forbid, talk back?