DATABASE REVIEW
AI Overviews: An AI Chatbot on the Internet’s Front Page
by Mick O'Leary
AI Overviews
SYNOPSIS
AI Overviews is a Google AI chatbot that, unlike its competitors, requires no opt-in and automatically posts as the top, highlighted response to a Google search query. Prominent gaffes blighted its introduction.
In March 2023, a tech giant, to great fanfare, released an AI product whose multiple errors went viral and cast doubt on the company and on the viability of AI itself. In May 2024, the company did the same thing all over again.
The company is Google, and the first debacle was Bard, its AI chatbot, which was its entrant in the Great Generative AI Chatbot Race of early 2023. A few high-profile gaffes drew more public attention than its overall performance. The second was AI Overviews, which was intended to revolutionize Google search. Previous chatbots were user-initiated: You sought out the chatbot and submitted a query. AI Overviews, however, is self-initiated: It automatically generates an AI response to selected queries and places it at the very top of the search results. The idea was to close the loop in search: instead of ferreting through a bunch of links, users would get concise answers composed from top websites. Again, a few high-profile gaffes drew more public attention than its overall performance.
GOOGLE’S ANSWER MACHINES
AI Overviews, however, was not completely new to the public. It had been available for more than a year as Search Generative Experience (SGE), an opt-in from Google Labs. SGE was not a complete novelty; it was instead the latest step in a series of Google “answer machines” that extracted relevant content from individual sites, thus bypassing the whole link analysis stage. (For more information, see Database Review in the November/December 2023 issue of Information Today.) The top of a Google search page may have up to four answer machines:
- Featured Snippet—a small text extract from an authoritative site
- Google Knowledge Graph—fact extracts from Google’s structured knowledge base
- People Also Ask—short answers to common user questions
- Things to Know—snippets on aspects of the user query
All of these are AI-generated and may be quite satisfactory when very short answers suffice. The new AI chatbots add two significant steps. First, they extract content from multiple sites. Second, they combine it into a single unified narrative. Their ability to do this, even almost 2 years after ChatGPT appeared, continues to amaze. And—oh—there’s one more aspect of chatbots: hallucinations.
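To make that two-step pattern concrete, here is a minimal sketch in Python of extraction-plus-synthesis in the abstract. The URLs, prompt wording, and function name are illustrative assumptions of mine, not Google’s actual pipeline:

```python
# Toy illustration of the two-step pattern described above:
# gather passages from several sites, then ask a language model
# to fuse them into one cited narrative. Everything here is
# illustrative; this is not Google's actual pipeline.

sources = {
    "https://site-a.example": "Passage extracted from site A ...",
    "https://site-b.example": "Passage extracted from site B ...",
}

def build_prompt(query, passages):
    """Assemble a synthesis prompt from per-site extracts."""
    cited = "\n".join(f"[{url}] {text}" for url, text in passages.items())
    return ("Answer the question using only the passages below, "
            "citing the source URL for each claim.\n\n"
            f"{cited}\n\nQuestion: {query}")

# Step 1: extraction has already produced `sources`.
# Step 2: a single unified answer would come from sending this
# prompt to a large language model (not shown).
print(build_prompt("What is AI alignment?", sources))
```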
I used SGE often over the past year. In its Google Labs stage, it regularly generated responses on its own. If it didn’t at first, you could ask it to run your query, and it would sometimes comply. I found that SGE responses—either spontaneous or requested—were produced for about 30%–40% of my queries. Overall, they were typically accurate, used generally good sources, and were very short.
FROM SGE TO AI OVERVIEWS
For Database Review, I wanted to conduct a small set of queries across several subjects about which I had some direct knowledge or could easily fact-check. It didn’t work. I tried dozens of potential queries, including many from subjects that I had used in the SGE era, with no results: no history, physical science, social science, humanities, business, or current events. Only a few topics regularly generated results: life science (especially medical, health, and fitness) and a handful on AI itself.
I finally managed to amass a set of six AI Overviews results from those areas and compared them on several points:
- Links compared to a Google search
- Response compared to Gemini (free version)
- Response compared to Perplexity Ask
Note that I reviewed Perplexity in the June 2024 issue of Information Today, and it is still my go-to choice among the chatbots that I use regularly, including Bing, ChatGPT Plus, Claude, Copilot, and Gemini. I used Perplexity Ask, the free basic model, for the comparisons.
AI OVERVIEWS AT WORK
Disclaimer: My test used a small number of queries in a narrow subject range. Other evaluations might have differing results.
Accuracy
Results were highly accurate, with only one small slip that I noticed: When I searched “AI alignment research,” I received the following result: “OpenAI has a research program called ‘superalignment’ that aims to solve the AI alignment problem by 2027. The program is co-led by Jan Leike, OpenAI’s head of alignment research, and Ilya Sutskever, OpenAI’s cofounder and chief scientist.” But this is out-of-date information, because in May 2024, OpenAI announced that Sutskever would be leaving the company.
Relevance
Results generally adhered closely to the query. AI Overviews results are also very short, with an average of 176 words in my sample. This allows the reply to address only the broadest aspects of the topic. Many users might find this treatment just a little too sparse to enable them to get even a modest grip on the topic (see Response Compared to Perplexity Ask later in this article). I couldn’t assess timeliness because AI Overviews doesn’t reply to current events queries.
Number of Links
An AI Overviews result typically has from four to six links. The number of individual sites, however, may be smaller because one site may be cited more than once. The links are connected to individual sentences or paragraphs in the response.
Quality of Links
Links are from generally reliable sources. Because of the medical/health/fitness tendency in my sample, these included medical journals and nonprofit medical reference sites. Results from a few commercial healthcare sites were accurate. The most frequently cited source, however, was Wikipedia.
Links Compared to a Google Search
The queries were searched in Google’s Web mode, which produces an unadorned list of ranked sites (like the original Google search). Most of the AI Overviews sources also appeared in the top 10 list from the Google Web search.
Response Compared to Gemini
Gemini is Google’s flagship AI chatbot, which “powers” AI Overviews. Its responses were very close to those of AI Overviews in their accuracy and relevance, which would be expected for two good answers on the same topic. I concluded, however, that the two responses were separately generated, because I saw no duplication of language or wording that would suggest that they were somehow co-generated. Gemini’s responses averaged 227 words in length, or about 29% longer than AI Overviews’, which gave them somewhat greater heft.
Response Compared to Perplexity Ask
The first noticeable difference in Perplexity Ask’s responses is their length: they average 285 words, more than 60% longer than AI Overviews’. This extra content enables the Perplexity Ask response to delve more deeply into aspects of the topic, thereby providing the user with that better “grip” on it. The Perplexity Ask answers are uniformly accurate and very well-composed.
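For readers who want to verify the length comparisons above, here is a quick calculation from the sample averages. Only the word counts come from my six-query test; the script itself and its names are my own illustration:

```python
# Percentage-length comparison from the averages in my sample.
avg_words = {
    "AI Overviews": 176,
    "Gemini (free)": 227,
    "Perplexity Ask": 285,
}

baseline = avg_words["AI Overviews"]
for name, words in avg_words.items():
    if name == "AI Overviews":
        continue
    pct = (words - baseline) / baseline * 100
    print(f"{name}: {words} words, {pct:.0f}% longer than AI Overviews")

# Prints:
# Gemini (free): 227 words, 29% longer than AI Overviews
# Perplexity Ask: 285 words, 62% longer than AI Overviews
```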
THE CONTENT WARS: CREATION VS. AGGREGATION
AI Overviews opened a fierce new front in the decades-long war between content creators and content aggregators. This struggle was inflamed almost 2 years ago when the chatbots began to attract widespread public attention and usage. Content creators are concerned that, if users are satisfied with the AI Overviews response alone, the creators’ sites will be deprived of traffic.
AI Overviews suddenly threw more fuel onto the fire. It was, literally, top-of-the-front-page news on the world’s dominant search engine. There was no opt-in, no need to set up an account with a separate chatbot and seek it out—AI Overviews was right there. There was no charge, no hassle, and no need to click through links to the creators’ websites.
Google claims that AI Overviews in fact increases traffic to the sites linked in its responses. But creators have claimed that their traffic has declined, with forecasts that AI Overviews will kill the goose that laid the golden egg, devastate the creators, and turn the internet into a content-starved desert.
WHITHER AI OVERVIEWS?
When I started writing this column, AI Overviews had mostly disappeared as it labored to recover from its laughingstock status. Now, as I finish the column, it’s back. A great irony is that, in an internet that’s swollen with bad content from conventional websites, a bold but perhaps hasty experiment (with a few bad apples in a barrel of thousands of good ones) was hounded into—at least temporary—exile. For AI Overviews to be more than just a niche novelty, however, it will have to answer most queries with levels of confidence and utility that are at least commensurate with those of its imperfect competitors.