30 Tbps Against a Single Website: AI Bots Bombard Servers into Complete Paralysis

An epitaph for the open web: killed by neural networks at the age of 32.

The internet is changing rapidly under the pressure of artificial intelligence. Where websites once had to contend only with classic search engine crawlers, today a growing share of traffic comes from new, aggressive scanners working on behalf of large language models. According to Cloudflare, almost a third of all global web traffic comes from bots, and AI crawlers are the fastest-growing group among them. Fastly's analysis adds that 80 percent of AI bot traffic is generated by programs built to mass-collect the data needed for AI training.
Formally, the history of automatic scanners began back in 1993 with the World Wide Web Wanderer, which recorded new web pages. But experts emphasize that the difference between those early tools and today's systems is vast. Modern crawlers don't just index pages; they overload infrastructure and impose significant costs on website owners. Fastly records numerous cases where sudden spikes in requests from AI bots increased server load tenfold, and sometimes twentyfold, in a matter of minutes, inevitably leading to performance degradation and service outages.
Hosting providers note that such crawlers almost never respect crawl-rate limits or bandwidth-saving rules. They download the full text of pages and follow dynamic links and executable scripts, completely ignoring the site owners' settings. As a result, even sites that are not directly targeted suffer: if several projects share one server and one network link, a flood of requests aimed at a neighbor instantly slows everyone down.
For small sites, this results in complete unavailability. Owners of personal sites note that the familiar DDoS protection mechanisms offered by Cloudflare and other network companies handle waves of distributed attacks effectively but are useless against the onslaught of AI bots. In essence, the consequences are just as destructive, even though the traffic is not formally classified as malicious.
The situation is also difficult for major players. To withstand such flows, they have to add RAM, CPU capacity, and network bandwidth. Otherwise, page load speed drops and the bounce rate rises. Research by hosting companies shows that if a site takes longer than three seconds to load, more than half of the visitors close the tab. Each additional second only exacerbates the problem, and businesses lose their audience.
The statistics also name the companies behind most of this traffic. Meta accounts for the largest share of AI crawler traffic, about 52 percent. Google accounts for 23 percent, and OpenAI for another 20. Their systems can create peaks of up to 30 terabits per second (Tbps), causing outages even for organizations with powerful infrastructure. Meanwhile, website owners earn nothing from this "interest": while a visit from the Googlebot crawler once offered a chance to reach the first page of search results and attract readers or clients, these new crawlers do not send users back to the original sources. The content is used to train models, and the traffic brings no profit.
Attempts to defend against the bots with classical methods (passwords, paid access, CAPTCHAs, and specialized filters) rarely yield results, because AI is good enough at getting past such barriers. The old robots.txt mechanism, which for decades served as the standard way to define indexing rules, is also losing its meaning: many bots simply ignore it. For instance, Cloudflare accused the company Perplexity of bypassing these settings, an accusation Perplexity denied. But website owners see regular waves of automated requests from various services, which underscores how ineffective the existing tools are.
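To make the robots.txt mechanism concrete, here is a minimal Python sketch of how a well-behaved crawler would consult the file before fetching a page, using the standard library's urllib.robotparser. The robots.txt contents, the example.com URLs, and the "SomeOtherBot" name are assumptions made up for illustration; GPTBot is used only as an example user-agent token. Note that the check is purely voluntary: nothing in the protocol stops a bot from skipping it, which is exactly the weakness described above.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt a site owner might publish (illustrative only).
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Crawl-delay: 10
Disallow: /private/
"""


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt text without making a network request."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


def may_fetch(parser: RobotFileParser, user_agent: str, url: str) -> bool:
    """Return True if the rules allow this user agent to fetch the URL."""
    return parser.can_fetch(user_agent, url)


parser = build_parser(ROBOTS_TXT)

# A polite crawler asks first and obeys the answer.
print(may_fetch(parser, "GPTBot", "https://example.com/articles/1"))
print(may_fetch(parser, "SomeOtherBot", "https://example.com/articles/1"))
print(may_fetch(parser, "SomeOtherBot", "https://example.com/private/x"))

# Requested pause between requests, in seconds (None if not specified).
print(parser.crawl_delay("SomeOtherBot"))
```

In this sketch the first and third checks are refused and the second is allowed, and the requested delay for other bots is ten seconds. A crawler that never runs such a check sees none of these rules.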
Initiatives have emerged to supplement robots.txt with a new llms.txt format. It is meant to let websites offer language models specially prepared content without harming the site's operation. However, the idea has met with mixed reactions, and it is still unclear whether it will become a standard. In parallel, infrastructure companies like Cloudflare are launching their own services to block AI bots. Independent solutions also exist, such as the Anubis AI crawler blocker, an open and free project that doesn't prohibit scanning but slows it down so much that it ceases to be destructive.
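One generic way to slow a client down without blocking it is a proof-of-work challenge: the server hands out a random string, and the client must burn CPU time finding a number whose hash meets a difficulty target before the page is served. The Python sketch below illustrates that general idea only; it is not Anubis's actual code, and the function names, the challenge format, and the difficulty values are all assumptions chosen for the example.

```python
import hashlib
import secrets


def make_challenge() -> str:
    """Server side: issue a random, unpredictable challenge string."""
    return secrets.token_hex(16)


def is_valid(challenge: str, nonce: int, difficulty_bits: int) -> bool:
    """Check that SHA-256(challenge:nonce) starts with enough zero bits."""
    digest = hashlib.sha256(f"{challenge}:{nonce}".encode()).digest()
    value = int.from_bytes(digest, "big")
    return value >> (256 - difficulty_bits) == 0


def solve(challenge: str, difficulty_bits: int) -> int:
    """Client side: brute-force a nonce; cost doubles with each extra bit."""
    nonce = 0
    while not is_valid(challenge, nonce, difficulty_bits):
        nonce += 1
    return nonce


challenge = make_challenge()
nonce = solve(challenge, difficulty_bits=12)  # about 4,096 hashes on average
print(is_valid(challenge, nonce, difficulty_bits=12))
```

Verifying a solution costs the server a single hash, while finding one costs the client thousands. For one human visitor the delay is barely noticeable, but a crawler requesting millions of pages has to pay it every time, which is what makes the approach a brake and not a wall.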
Thus, a new arms race is unfolding on the internet. On one side are website owners striving to maintain the availability and profitability of their resources. On the other are AI developers, for whom an endless stream of data is fuel. A balance will likely be found over time, but the cost will be high: the network will become more closed, information more fragmented, and many materials will move behind the walls of paid services or disappear from open access altogether. Memories of a free internet are gradually turning into history, and the prospect of a splintered network is becoming ever more real.