Blocking GPTBot

I have decided to block OpenAI from training its models on the content of my website.

Put that way, it sounds a bit pompous. I mean, I don’t believe what I write here is of real value to humanity, or has any monetary value. As I wrote at the start, this blog is essentially just an exercise: a place to write down some of the stuff I encounter.

Still, it feels wrong that my words can be used to train a model. Maybe, exactly because this is just a personal exercise, I don’t want it to be part of something else.

That said, in my case the implementation was really straightforward. I just added a robots.txt file to my Hugo static directory, with the disallow rule described in OpenAI’s documentation; I also added a couple more, following some useful instructions found online:

User-agent: GPTBot
Disallow: /

User-agent: Google-Extended
Disallow: /

User-agent: CCBot
Disallow: /

That should do it; I still need to find out how to block other bots, though, because the ones listed above are not the only players in this space…

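For anyone who wants to sanity-check rules like the ones above, here is a minimal sketch using Python's standard urllib.robotparser module. The blocked_agents helper and the example.com URL are hypothetical names chosen for illustration, not part of my site. It only shows how a parser reads the rules; it says nothing about whether a given crawler actually honours them.

```python
from urllib.robotparser import RobotFileParser

# Same rules as in the robots.txt shown above.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: Google-Extended
Disallow: /

User-agent: CCBot
Disallow: /
"""


def blocked_agents(robots_txt, agents, url="https://example.com/posts/"):
    """Map each user agent to True if the rules forbid it from fetching url.

    Hypothetical helper: the default URL is a placeholder, not a real page.
    """
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return {agent: not parser.can_fetch(agent, url) for agent in agents}


if __name__ == "__main__":
    # Googlebot is included as an agent that none of the rules mention.
    agents = ["GPTBot", "Google-Extended", "CCBot", "Googlebot"]
    for agent, blocked in blocked_agents(ROBOTS_TXT, agents).items():
        print(f"{agent}: {'blocked' if blocked else 'allowed'}")
```

Since there is no catch-all `User-agent: *` entry, any agent not named in the file stays allowed, which is why every additional bot needs its own entry.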

2023-11-22 21:40 +0100