Google rolled out AI overviews across the United States this month, exposing its flagship product to the hallucinations of large language models.
TBH I hate the term “hallucination” in this context. It’s just more BS anthropomorphizing. More marketing for “AI” (also BS). Can’t we just call it like garbage or GIGO or something more accurate? This is nothing new. I know that scientific accuracy is anathema to AI marketing but just saying…
Even though I agree, in this context “hallucination” is actually the scientific term. It might be poorly chosen, but in LLM circles if you use the term hallucination, the vast majority of people will understand precisely what you mean: not an error in programming or a bad dataset, but rather that the language model worked well, generating sentences that are syntactically correct and roughly thematically coherent, and yet factually incorrect.
So I obviously don’t want to support marketing BS, in AI or elsewhere, but here sadly it matches the scientific naming.
PS: FWIW I believe I made a similar critique a few months, or maybe even years, ago. IMHO what’s more important is arguably questioning the value of LLMs themselves, but that might not be as evident to the many people who are benefiting from the current buzz.
It’s not, actually. Hallucinations are things that effectively “come out of nowhere”, information that was not in the training material or the provided context. In this case Google Overview is presenting information that is indeed in the provided context. These aren’t hallucinations, the AI is doing what it’s being told to do. The problem is that Google isn’t doing a good job of providing it with the right information to summarize.
My suspicion is that since Google is using this AI for all search results it’s had to cut back the resources it’s providing to each individual call, which means it’s only being given a small amount of context to work from. Bing Chat does a much better job, but it’s drawing from many more search results and is given the opportunity to say a lot more about them.
FWIW https://arxiv.org/abs/2401.06796
We don’t choose. It’s been decided that this is the term for it. Computer bugs aren’t bugs, etc. It’s just what the scientists called it.
Gibberish?
It’s actually confabulation. Making up false memories as a result of brain damage.