According to Gavin King, founder of Dark Visitors, most major AI agents still adhere to robots.txt. “That’s been pretty consistent,” he says. But not all website owners have the time or know-how to constantly update their robots.txt files. And even when they do, some bots circumvent the file’s guidelines: “They’re trying to disguise the traffic.”
Prince says Cloudflare’s bot blocking won’t be a command that these types of bad actors can ignore. “Robots.txt is like putting up a ‘no trespassing’ sign,” he says. Cloudflare’s blocking, he adds, is different: “It’s like having a physical wall with armed guards.” Just as it flags other types of suspicious web behavior, such as price-scraping bots used for illegal price monitoring, the company has created processes to spot even the most carefully hidden AI crawlers.
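To make the “no trespassing sign” comparison concrete, here is a minimal Python sketch of how a well-behaved crawler consults robots.txt before fetching a page. It uses the standard library's `urllib.robotparser`; the crawler name `ExampleAIBot`, the file contents, and the URL are invented for illustration and do not describe any real bot or Cloudflare's system. The point is that the check runs inside the crawler itself, so a bot that skips it, or announces a different name, is not stopped by the file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents; "ExampleAIBot" is an invented crawler name.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(user_agent: str, url: str) -> bool:
    """Return True if the rules in ROBOTS_TXT permit user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


# A crawler that identifies itself honestly is told to stay out.
polite = is_allowed("ExampleAIBot", "https://example.com/article")

# The same crawler presenting a browser-like name matches only the "*" rule.
disguised = is_allowed("Mozilla/5.0", "https://example.com/article")

print(polite, disguised)
```

Nothing in this flow is enforced by the website: the file only states a preference, which is why network-level blocking is a different kind of control.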
Cloudflare is also announcing an upcoming marketplace for customers to negotiate scraping terms of use with AI companies, whether it’s paying for content usage or bartering for credits to use AI services in exchange for scraping. “We don’t really care what the transaction is, but we do think there should be a way to provide value back to original content creators,” Prince says. “The compensation doesn’t have to be in dollars. The compensation could be credits or recognition. It could be a lot of different things.”
There’s no set date yet for the marketplace’s launch, but even if it rolls out this year, it will join an increasingly crowded field of projects aimed at facilitating licensing deals and other permissions arrangements between AI companies, publishers, platforms and other websites.
What do the AI companies think about this? “We’ve talked to most of them and their responses ranged from, ‘This makes sense and we’re open to it,’ to ‘Go to hell,’” Prince says. (He declined to name names, though.)
The project wrapped up fairly quickly. Prince cites a conversation with Atlantic CEO (and former WIRED editor in chief) Nick Thompson as the inspiration; Thompson had described how many publishers had encountered surreptitious web scrapers. “I think it’s great that he’s doing it,” Thompson says. If even large media companies were having trouble dealing with the influx of scrapers, Prince reasoned, independent bloggers and website owners would have an even harder time.
Cloudflare has been a leading web security company for years, providing much of the infrastructure that supports the web. It has historically remained as neutral as possible about the content of the sites it serves; in the rare cases where it has made exceptions to that rule, Prince has emphasized that he does not want Cloudflare to be the arbiter of what is permissible online.
He sees Cloudflare as uniquely positioned to take a stand. “The path we’re on is not sustainable,” Prince says. “Hopefully we can be a part of making sure people get paid for their work.”