Blocking GPTBot
I have decided to block OpenAI from training its models on the content of my website.
Written in this way, it seems a bit pompous. I mean, I don’t believe what I am writing here is of real value to humanity, or has some monetary value. As I initially wrote, this is now essentially just an exercise, to write about some of the stuff I encounter.
Still, it feels wrong that my words can be used to train a model. Maybe, exactly because this is just a personal exercise, I don’t want it to be part of something else.
That said, in my case the implementation was really straightforward.
I just added a robots.txt file to my Hugo static directory, with the Disallow directive described in the OpenAI documentation; I have also added a couple more entries, following some useful instructions found online:
User-agent: GPTBot
Disallow: /
User-agent: Google-Extended
Disallow: /
User-agent: CCBot
Disallow: /
That should do; I still need to find out how to block other bots, though – because the ones listed above are not the only players in this space…
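A quick way to sanity-check rules like the ones above is to feed them to Python's standard-library robots.txt parser and ask what each crawler may fetch. This is only a sketch: the `is_allowed` helper, the `example.com` URL and the `SomeOtherBot` agent are made up for illustration, and it only shows how a well-behaved parser reads the file, not what any particular crawler actually does.

```python
from urllib.robotparser import RobotFileParser

# The same rules as in the robots.txt above, inlined so the check runs offline.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: Google-Extended
Disallow: /

User-agent: CCBot
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given rules let `user_agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # example.com is a placeholder; replace it with the real site URL.
    page = "https://example.com/posts/"
    for agent in ("GPTBot", "Google-Extended", "CCBot", "SomeOtherBot"):
        verdict = "allowed" if is_allowed(ROBOTS_TXT, agent, page) else "blocked"
        print(f"{agent}: {verdict}")
```

Since the file has no catch-all `User-agent: *` group, any agent that is not listed stays allowed, which is exactly the gap mentioned above. Keep in mind that robots.txt is only a request: it works as long as the crawler chooses to honor it.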