How Googlebot Handles AI-Generated Content: What You Need to Know

With AI-generated content becoming increasingly prevalent, Google's Martin Splitt was asked whether Googlebot's crawling and rendering processes are adapting to the trend.

By Our Staff

Splitt’s response shed light on how Google manages AI-generated content, highlighting the central role of quality control.

How Googlebot Brings Webpages to Life

Rendering a webpage is like assembling a jigsaw puzzle: multiple pieces (HTML, images, CSS, and JavaScript) are downloaded and fitted together in the browser to form the complete page.

In the same way, Google’s crawler, Googlebot, fetches those same components, the HTML, image, CSS, and JavaScript files, to render the webpage in its entirety.
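That fetch-and-assemble step can be illustrated in miniature: parse the HTML and collect the URLs of the sub-resources (CSS, JavaScript, images) that a renderer would then have to download. A small sketch using only Python's standard library; the sample page and class name are invented for illustration:

```python
from html.parser import HTMLParser

class ResourceCollector(HTMLParser):
    """Collect the sub-resource URLs a renderer would need to fetch."""
    def __init__(self):
        super().__init__()
        self.resources = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "img" and "src" in attrs:
            self.resources.append(attrs["src"])       # images
        elif tag == "script" and "src" in attrs:
            self.resources.append(attrs["src"])       # JavaScript
        elif tag == "link" and attrs.get("rel") == "stylesheet":
            self.resources.append(attrs.get("href"))  # CSS

page = """<html><head>
<link rel="stylesheet" href="style.css">
<script src="app.js"></script>
</head><body><img src="logo.png"></body></html>"""

collector = ResourceCollector()
collector.feed(page)
print(collector.resources)  # ['style.css', 'app.js', 'logo.png']
```

A real crawler would then fetch each of these URLs and execute the JavaScript before the page is fully rendered, which is why rendering is the expensive part of crawling.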


How Googlebot Handles AI-Generated Content

Googlebot handles AI-generated content the same way it handles any other content: it crawls and indexes it based on relevance and quality. However, Googlebot may struggle to understand the context and intent of AI-generated content, which can affect its search rankings.

In a webinar engagingly titled “Exploring the Art of Rendering,” Google’s Martin Splitt offered his expert insights. This educational event was organized by Duda, a renowned name in the industry. 

A pertinent question arose from the audience, probing into whether the surge of AI-generated content could potentially impact Google’s rendering capabilities during the crawling process. 

Martin Splitt not only addressed this concern but also delved into the intricate process of Google’s evaluation of webpage quality during crawl time. He further discussed the subsequent actions taken by Google once a determination regarding the quality is made. 

The question was posed by Ammon Johns and read aloud during the webinar by Ulrika Viberg.

Here is the question:

“So, we have one from Ammon as well, and this is something that is talked about a lot.

I see it a lot.

They said, content production increases due to AI, putting increasing loads on crawling and rendering.

Is it likely that rendering processes might have to be simplified?”

It seems that Ammon is curious to understand whether any unique procedures are being implemented to handle the amplified crawling and rendering demands associated with AI-generated content.

Martin Splitt replied:

“No, I don’t think so, because my best guess is…”

Martin discusses the challenge SEOs have with detecting AI content.

Martin continued:

“So we are doing quality detection or quality control at multiple stages, and most s****y content doesn’t necessarily need JavaScript to show us how s****y it is.

So, if we catch that it is s****y content before, then we skip rendering, what’s the point?

If we see, okay, this looks like absolute.. we can be very certain that this is crap, and the JavaScript might just add more crap, then bye.

If it’s an empty page, then we might be like, we don’t know.

People usually don’t put empty pages here, so let’s at least try to render.

And then, when rendering comes back with crap, we’re like, yeah okay, fair enough, this has been crap.

So, this is already happening. This is not something new.

AI might increase the scale, but doesn’t change that much. Rendering is not the culprit here.”

Quality Detection Applies To AI

Contrary to what might have been expected, Martin Splitt clarified that Google is not employing AI detection for content analysis. 

His statement indicated that Google is leveraging Quality Detection at various stages of content evaluation. 

This is particularly intriguing in light of an article published by Search Engine Journal, which discussed a quality detection algorithm capable of identifying low-quality AI-generated content.

Interestingly, the algorithm was not originally designed to target substandard machine-generated content. However, it was observed that the algorithm autonomously detected such content. 

The functioning of this algorithm aligns closely with Google’s publicized Helpful Content system. This system is purposefully engineered to recognize content that is authentically penned by humans.

The commentary on Google’s Helpful Content algorithm comes directly from Danny Sullivan:

“…we’re rolling out a series of improvements to Search to make it easier for people to find helpful content made by, and for, people.”

Notably, this was not his only mention of content authored by humans; his announcement of the Helpful Content system made that reference three times.

The algorithm not only identifies machine-generated content but also serves as a detector for low-quality content in general.

The research paper is titled, Generative Models are Unsupervised Predictors of Page Quality: A Colossal-Scale Study.

Researchers observe:

“This study suggests that detectors trained to differentiate between human and machine-written text can accurately predict the language quality of webpages, performing better than a standard supervised spam classifier.”

Circling back to what Martin Splitt said:

“…we are doing quality detection or quality control at multiple stages…

So, this is already happening. This is not something new.

AI might increase the scale, but doesn’t change that much.”

Interpreting Martin’s perspective, it appears that:

  1. Google uses quality detection for both human and AI content
  2. There’s nothing new being applied for AI content
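Read as a pipeline, Martin's description amounts to a staged quality gate: score the raw HTML first, skip rendering when the page is confidently low quality, and spend rendering effort only when the verdict is uncertain. A minimal sketch in Python; the scorer, renderer, and thresholds below are toy stand-ins, not anything Google has published:

```python
LOW, HIGH = 0.3, 0.7  # illustrative thresholds, not Google's

def quality_score(text: str) -> float:
    """Toy stand-in for a quality classifier: more words scores higher."""
    return min(len(text.split()) / 100, 1.0)

def render(html: str) -> str:
    """Toy stand-in for JavaScript rendering (a real renderer runs scripts)."""
    return html

def should_index(html: str) -> bool:
    """Staged quality gate, per Martin Splitt's description:
    judge the raw HTML first, and only pay the cost of rendering
    when the first verdict is not confidently low."""
    score = quality_score(html)
    if score < LOW:
        return False   # confidently low quality: skip rendering entirely
    if score > HIGH:
        return True    # confidently fine without running JavaScript
    # Uncertain (e.g. a near-empty page): render, then judge again.
    return quality_score(render(html)) >= LOW
```

The design choice Martin highlights is the early exit: because most low-quality pages reveal themselves before any JavaScript runs, the expensive rendering step can often be skipped.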

Watch the Duda webinar with Martin Splitt at 35:50:

Exploring the Art of Rendering with Google’s Martin Splitt

