NEW YORK (Reuters) - You may never have to read another news story in your life, if you have artificial intelligence that can digest all the web’s information and serve up a summary on demand.
That’s the stuff of nightmares for media barons as Google (GOOGL.O) and others experiment with what's called generative AI, which creates new content drawing from past data.
Since May, Google has been rolling out a new form of search powered by generative AI, after industry observers questioned the tech giant's future prominence in providing consumers with information following the rise of OpenAI's query-answering chatbot, ChatGPT.
The product, called Search Generative Experience (SGE), uses AI to create summaries in response to some search queries, triggered by whether Google’s system determines the format would be helpful. Those summaries appear at the top of the Google search homepage, with links to “dig deeper,” according to Google’s overview of SGE.
If publishers want to prevent their content from being used by Google’s AI to help generate those summaries, they must use the same tool that would also prevent them from appearing in Google search results, rendering them virtually invisible on the web.
Searching for “Who is Jon Fosse” – the recent Nobel Prize in Literature winner – for instance, generates three paragraphs on the writer and his work. Drop-down buttons provide links to Fosse content on Wikipedia, NPR, The New York Times and other websites; additional links appear to the right of the summary.
Google says that the AI-generated overviews are synthesized from multiple web pages and that the links are designed to be a jumping off point to learn more. It describes SGE as an opt-in experiment for users, to help it evolve and improve the product, while it incorporates feedback from news publishers and others.
To publishers, the new search tool is the latest red flag in a decades-long relationship in which they have both struggled to compete against Google for online advertising, and relied on the tech giant for search traffic.
The still-evolving product – now available in the United States, India and Japan – has raised concerns among publishers as they try to figure out their place in a world where AI could dominate how users find and pay for information, according to four major publishers who spoke to Reuters on the condition of anonymity to avoid complicating ongoing negotiations with Google.
Those concerns relate to web traffic, whether publishers will be credited as the source of information that appears in the SGE summaries, and the accuracy of those summaries, those publishers say. Most significantly, publishers want to be compensated for the content on which Google and other AI companies train their AI tools – a major sticking point around AI.
A Google spokesperson said in a statement: “As we bring generative AI into Search, we’re continuing to prioritize approaches that send valuable traffic to a wide range of creators, including news publishers, to support a healthy, open web.”
On compensation, Google says it is working to develop a better understanding of the business model of generative AI applications and get input from publishers and others.
In late September Google announced a new tool, called Google-Extended, that gives publishers the option to block their content from being used by Google to train its AI models.
Giving publishers the option to opt out of being crawled for AI is a “good faith gesture,” said Danielle Coffey, president and chief executive of the News Media Alliance, an industry trade group that has been lobbying Congress over these issues. “Whether payments will follow is a question mark, and to what extent there is openness to having a healthier value exchange.”
The new tool doesn’t allow publishers to block their content from being crawled for SGE, either the summaries or the links that appear with them, without disappearing from traditional Google search.
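For readers who manage a site, the distinction can be sketched with Python's standard-library robots.txt parser. This is a minimal illustration, not Google's or any publisher's actual configuration: the rules, the `news.example` URL and the helper name `allowed_agents` are hypothetical, and it assumes Google-Extended and OpenAI's GPTBot are addressed as user-agent tokens in robots.txt. It shows a site that opts out of AI training crawlers while staying open to Googlebot, the search crawler. It does not show a way to opt out of SGE alone, which the publishers say does not exist.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: opt out of AI-training tokens, stay open to search.
ROBOTS_TXT = """\
User-agent: Google-Extended
Disallow: /

User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""


def allowed_agents(robots_txt, agents, url):
    """Return {agent: True/False} for whether each agent may fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in agents}


# Illustrative URL; any path on the site gives the same answer under these rules.
RESULT = allowed_agents(
    ROBOTS_TXT,
    ["Googlebot", "Google-Extended", "GPTBot"],
    "https://news.example/story",
)

if __name__ == "__main__":
    for agent, allowed in RESULT.items():
        print(f"{agent}: {'allowed' if allowed else 'blocked'}")
```

Under these assumed rules, Googlebot is still allowed, so the site would remain eligible for traditional search, and by the publishers' account for SGE as well.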
Publishers want clicks to secure advertisers, and showing up in Google search is key to their business. The design for SGE has pushed the links that appear in traditional search further down the page, with potential to reduce traffic to those links by as much as 40%, according to an executive at one of the publishers.
More alarming is the possibility that web surfers will avoid clicking any of the links if the SGE passage fulfills the users’ need for information – satisfied, for example, to learn the best time of year to go to Paris, without having to click on a travel publication’s website.
SGE is “definitely going to decrease publishers’ organic traffic and they’re going to have to think about a different way to measure the value of that content, if not click through rate,” said Forrester Research Senior Analyst Nikhil Lai. Even so, he believes publishers’ reputations will remain strong as a result of their links appearing in SGE.
Google says that it designed SGE to highlight web content. “Any estimates about specific traffic impacts are speculative and not representative, as what you see today in SGE may look quite different from what ultimately launches more broadly in Search,” a company spokesperson said in a statement.
While publishers and other industries have spent decades adjusting their websites to show up prominently in traditional Google search, they don’t have enough information to do the same for the new SGE summaries, these publishers say.
“The new AI section is a black box for us,” said an executive at one publisher. “We don’t know how to make sure we’re a part of it or the algorithm behind it.”
Google said publishers do not need to do anything different from what they have been doing to appear in search.
Publishers have long allowed Google to “crawl” their content for the purposes of appearing in search results – using a bot, or piece of software, to automatically scan and index it. “Crawling” is how Google indexes the web to make content show up in search.
Publishers’ concerns with SGE boil down to a key point: They say that Google is crawling their content, for free, to create summaries that users may read instead of clicking on their links, and that Google hasn’t been clear about how they can block content from being crawled for SGE.
Google’s new search tool, said one publisher, “is even more threatening to us and our business than a crawler that is crawling our business illegally.”
Google did not comment on that assessment.
When given the option, websites are blocking their content from being used for AI if doing so doesn’t affect search, according to exclusive data from AI content detector Originality.ai. Since the Aug. 7 release of ChatGPT’s bot, 27.4% of top websites, including The New York Times and Washington Post, have moved to block it. That compares with 6% that have blocked Google-Extended since its Sept. 28 release.
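A share like the ones above could in principle be tallied by reading each site's robots.txt and counting how many disallow a given crawler. The sketch below is not Originality.ai's method, which the article does not describe. The sample sites, their rules and the function names are invented for illustration, and it assumes a site counts as "blocking" when the named user-agent token may not fetch the site root.

```python
from urllib.robotparser import RobotFileParser


def blocks_agent(robots_txt: str, agent: str) -> bool:
    """True if the robots.txt text forbids `agent` from fetching the site root."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(agent, "https://example.com/")


def share_blocking(robots_by_site: dict, agent: str) -> float:
    """Fraction of sites (0.0 to 1.0) whose robots.txt blocks `agent`."""
    if not robots_by_site:
        return 0.0
    blocked = sum(blocks_agent(text, agent) for text in robots_by_site.values())
    return blocked / len(robots_by_site)


# Invented sample data; real measurements would fetch each site's live file.
SAMPLES = {
    "site-a.example": "User-agent: GPTBot\nDisallow: /\n",
    "site-b.example": (
        "User-agent: GPTBot\nDisallow: /\n\n"
        "User-agent: Google-Extended\nDisallow: /\n"
    ),
    "site-c.example": "User-agent: *\nAllow: /\n",
    "site-d.example": "",  # an empty file blocks nothing
}

if __name__ == "__main__":
    for token in ("GPTBot", "Google-Extended"):
        print(f"{token}: {share_blocking(SAMPLES, token):.0%} of sample sites block it")
```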