Ian Mulvany

March 25, 2025

AI Bot traffic - a real problem, right now.

We have been experiencing some disruption at BMJ due to actions needed to protect against AI bot traffic. After reading this post by Eric Hellman, https://go-to-hellman.blogspot.com/2025/03/ai-bots-are-destroying-open-access.html, I wanted to share some of our experiences.

The issue is a real one. To quote one of my team:

Unfortunately, bot traffic on our journal websites has now surpassed real user traffic. These aggressive bots are attempting to crawl entire websites within a short period, overloading our web servers and negatively impacting the experience of legitimate users. … over 100 million bot requests have originated from data centers in Hong Kong and Singapore in just the past three weeks.

Our hosting provider has been using Cloudflare to block AI bot traffic. In general our experiences with Cloudflare, and the many tools that Cloudflare provides, have been very positive.

However, when bot traffic blocking has been enabled, it looks like Cloudflare is taking a broad approach and blocking almost anything that looks like machine-to-machine communication.

That stopped some internal APIs from working, broke a lot of IP range authentication methods for our customers, and affected OpenURL and proxy services.

It’s been difficult for our teams and for our customers.

We either had to turn off bot protection, or start to whitelist large tranches of IPs, which at the moment is a fiddly and inaccurate process.
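To give a feel for what the IP allowlisting check involves, here is a minimal sketch in Python using the standard library's ipaddress module. It is not how our hosting provider or Cloudflare implements this; the ranges are reserved documentation addresses standing in for customer ranges, and the names ALLOWED_RANGES, parse_ranges and is_allowlisted are made up for illustration.

```python
import ipaddress

# Hypothetical customer ranges. These are reserved documentation
# addresses, not real customer networks.
ALLOWED_RANGES = [
    "192.0.2.0/24",
    "198.51.100.0/25",
    "2001:db8::/32",
]


def parse_ranges(cidrs):
    """Parse CIDR strings into network objects, failing loudly on typos."""
    return [ipaddress.ip_network(cidr, strict=True) for cidr in cidrs]


def is_allowlisted(client_ip, networks):
    """Return True if client_ip falls inside any of the given networks."""
    try:
        addr = ipaddress.ip_address(client_ip)
    except ValueError:
        # Anything that is not a valid IP address is never allowlisted.
        return False
    # Only compare addresses and networks of the same IP version.
    return any(
        addr.version == net.version and addr in net for net in networks
    )


NETWORKS = parse_ranges(ALLOWED_RANGES)
```

The lookup itself is the easy part. The fiddly part is keeping the list of ranges complete and current for every customer, proxy and internal service, which is where the inaccuracy creeps in.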

I think initially Cloudflare was not aware of the context of scholarly websites. Our hosting provider is in conversation with them, but I don’t know what the outcome of that will be.

On the one hand, a provider like Cloudflare offers amazing web-scale infrastructure. On the other hand, in scholarly publishing we have many long-tail access methods that our customers require.

At the moment the things that Eric complains about are a real issue. I think as we move through the year we will see better mitigations become available. I’m not convinced we will see better behaviours from the LLM bots. Right now we are trying a few things to mitigate this, but I would love to hear from anyone else out there who has found a successful approach.


About Ian Mulvany

Hi, I'm Ian - I work on academic publishing systems. You can find out more about me at mulvany.net. I'm always interested in engaging with folk on these topics. If you have made your way here, don't hesitate to reach out if there is anything you want to share, discuss, or ask for help with!