How to Attempt to Block the AI Crawlers CCBot and ChatGPT

Stated Copyrights on WordPress

https://smallbusiness.chron.com/intellectual-property-rights-wordpress-blog-57660.html

That means WordPress should have provided an AI-blocking system for its writers. It didn’t, so I can try the approach below.

https://www.searchenginejournal.com/how-to-block-chatgpt-from-using-your-website-content/478384/

If your site has already been crawled (it has already been visited by ChatGPT) then it’s likely already included in multiple datasets.

Nevertheless, by blocking Common Crawl it’s possible to opt out your website content from being included in new datasets sourced from newer Common Crawl datasets.

This is what I meant at the very beginning of the article when I wrote that the process is “neither straightforward nor guaranteed to work.”

The CCBot User-Agent string is:

CCBot/2.0

Add the following to your robots.txt file to block the Common Crawl bot:

User-agent: CCBot
Disallow: /
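
Before relying on the rule above, it may help to check how a standards-following parser reads it. The sketch below uses Python's built-in urllib.robotparser; the helper name is_allowed and the example.com URL are placeholders I made up for illustration, not part of the original article. It only shows what a well-behaved crawler should conclude from the file. It does not prove that any particular bot actually obeys it.

```python
from urllib.robotparser import RobotFileParser

# The same two lines recommended above, held as a string for testing.
ROBOTS_TXT = """\
User-agent: CCBot
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url.

    Hypothetical helper for illustration only.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # example.com is a placeholder; substitute a URL from your own site.
    page = "https://example.com/any-post/"
    print("CCBot allowed:", is_allowed(ROBOTS_TXT, "CCBot/2.0", page))
    print("Other bot allowed:", is_allowed(ROBOTS_TXT, "SomeOtherBot", page))
```

With these two lines, CCBot should be refused everywhere on the site, while bots that are not named in the file remain unaffected.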

An additional way to confirm that a CCBot user agent is legitimate is to check that it crawls from Amazon AWS IP addresses.
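
One rough way to run that check is to compare the visitor's IP address from your server logs against a list of AWS address ranges. The sketch below uses Python's built-in ipaddress module. The ranges in SAMPLE_RANGES are reserved documentation networks used purely as stand-ins, not real AWS ranges, and the helper name ip_in_ranges is hypothetical. AWS publishes its current ranges as a JSON file (ip-ranges.json), which you would need to load in place of the sample list. Keep in mind that an AWS address alone does not prove a visitor is CCBot, since anyone can rent AWS servers; a non-AWS address claiming to be CCBot is the more telling result.

```python
import ipaddress

# Placeholder CIDR ranges (reserved documentation networks), NOT real AWS ranges.
# Replace with the prefixes from AWS's published ip-ranges.json.
SAMPLE_RANGES = ["203.0.113.0/24", "198.51.100.0/24"]


def ip_in_ranges(ip, cidr_ranges):
    """Return True if ip falls inside any of the given CIDR ranges.

    Hypothetical helper for illustration only.
    """
    address = ipaddress.ip_address(ip)
    return any(address in ipaddress.ip_network(cidr) for cidr in cidr_ranges)


if __name__ == "__main__":
    # Example addresses as they might appear in a server access log.
    for visitor_ip in ["203.0.113.45", "192.0.2.10"]:
        inside = ip_in_ranges(visitor_ip, SAMPLE_RANGES)
        print(visitor_ip, "is in the listed ranges:", inside)
```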
