Your business may be a mundane B2B business like ours, and you may think that you do not have to protect your website from attacks like cloning. You would be wrong. Eventually, your site may get cloned; it happened to us. This could be done by:
- attackers trying to steal your traffic
- your competitors
- 3rd parties who may not like what you publish
We explain our experience, how cloning works, LLMs' impact on cloning and how to protect yourself once it happens.
We learnt about cloning by getting cloned
Someone bought a misleading domain name that looks similar to ours and mirrored our entire website.
Now the mirror website is down but you can see a screenshot below.


The domain was bought about a month after we published an article about a crypto project. It could be due to that article or to another reason.

The attack was unethical and also incompetently executed. All we needed to do to stop it was:
- Look up the new website’s IP
- Block it via Cloudflare
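The lookup step can also be scripted. Below is a minimal sketch using Python's standard `socket` module; the function name `resolve_ips` is our own, chosen for illustration. Keep in mind that if the clone sits behind a CDN, the addresses returned belong to the CDN rather than to the server that fetches your pages, so treat the result as a starting point.

```python
import socket


def resolve_ips(hostname):
    """Return the sorted, de-duplicated IP addresses a hostname resolves to."""
    # getaddrinfo returns (family, type, proto, canonname, sockaddr) tuples;
    # the address string is the first element of sockaddr.
    results = socket.getaddrinfo(hostname, None, proto=socket.IPPROTO_TCP)
    return sorted({info[4][0] for info in results})
```

Calling `resolve_ips` with the clone's domain name returns the addresses you can then enter into an IP block rule. A lookup that fails raises `socket.gaierror`, which usually means the domain no longer resolves.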
Why do attackers clone websites?
The aim is to steal traffic. Search engines may send your traffic to the clone, which its operators can monetize via ad networks like Google AdSense. They could also make changes to the clone to confuse your readers by putting words in your mouth.
How does cloning work?
Attackers can use a variety of tools (e.g. HTTrack) to create a copy of your website. This copy may be dynamically updated, which was the case in our attack.
What can you do against it?
As usual, we suggest defense in depth so that cloners have to put in as much effort as possible for limited benefit. As a preventative measure, you should improve your capability to identify clones and make it easy for your users to identify your website:
- Link extensively between your own articles. The cloner may not change these links, which would mean that even visitors who arrive at the clone can click a link to arrive at your website. This helps you:
  - become aware of the clone via your website analytics solution, since you will be seeing traffic from the clone. The cloner will probably buy a domain name that is similar to yours, which will make it easy to notice
  - win back your readers quickly after they arrive on the clone
- Link between your different domains. It is trivial to read your links and replace the internal ones with URLs in the clone. However, if you own a website on a different domain, the links to that domain will probably not be replaced by the cloner and will warn you about the clone.
- Invest in your branding. The more memorable your logo, font choices and visual layout, the better your readers will remember your website. They may therefore be less likely to be confused by cloners when they arrive at a domain that is different from yours but includes your material.
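The analytics check described in the first two points can be partly automated. The sketch below assumes you can export referrer URLs from your analytics tool as a list of strings. The function name and the 0.8 similarity threshold are our own illustrative choices, not values from any particular product. It flags referrer hostnames that resemble your domain without being your domain or one of its subdomains.

```python
from difflib import SequenceMatcher
from urllib.parse import urlparse


def find_lookalike_referrers(referrer_urls, own_domain, threshold=0.8):
    """Map each suspicious referrer hostname to its similarity score."""
    own = own_domain.lower()
    suspects = {}
    for url in referrer_urls:
        host = (urlparse(url).hostname or "").lower()
        if host.startswith("www."):
            host = host[4:]
        # Skip unparseable entries, our own domain and our own subdomains.
        if not host or host == own or host.endswith("." + own):
            continue
        # ratio() is 1.0 for identical strings and lower for dissimilar ones.
        score = SequenceMatcher(None, host, own).ratio()
        if score >= threshold:
            suspects[host] = round(score, 2)
    return suspects
```

With `example.com` as your domain, a referrer such as `https://examp1e.com/post` would be flagged while `https://www.google.com/` would not. String similarity will miss clones on unrelated domain names, so it complements a manual look at new referrers rather than replacing it.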
These measures help you identify a clone as soon as it appears. Here is how to deal with a clone when you discover one:
- Try to block it from crawling your website. These steps are worth trying but will not stop sophisticated cloners, so do not spend too much time on them:
  - Make a trivial change on your website and check the clone to see how frequently it refreshes. If it is a static clone, move to the second step, as there is not much you can do to take it down via technical measures. If it is a dynamic clone:
    - Discover the IP of the clone website; there are numerous free online tools for that.
    - Block its IP if it is regularly crawling your website. However, if the crawler runs on a different server, this may not work. In that case, it may be worth looking at the IPs that regularly crawl your website and blocking them. Of course, you do not want to block legitimate users or search engine bots, so tread with caution when blocking.
- Contact every provider the clone website relies on, including its domain registrar, host and CDN. Send them a takedown request and clearly explain the attack. Sharing proof of your copyright and trademarks along with the takedown request can expedite the process.
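For the step of looking at IPs that regularly crawl your website, a rough first pass is to count requests per client IP in your server's access log. The sketch below assumes a log format in which the client IP is the first whitespace-separated field, as in the common and combined log formats used by many web servers; adjust the parsing if your format differs. The function name `top_client_ips` is hypothetical.

```python
from collections import Counter


def top_client_ips(log_lines, limit=10):
    """Return the most frequent client IPs as (ip, request_count) pairs."""
    counts = Counter()
    for line in log_lines:
        fields = line.split()
        if fields:  # ignore blank lines
            counts[fields[0]] += 1
    return counts.most_common(limit)
```

You can pass an open log file directly, since iterating over a file yields its lines. A high request count alone does not prove an IP belongs to a cloner; check it against known search engine crawlers and your own monitoring tools before blocking anything.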
How will LLMs change website cloning?
Our website was copied verbatim, but bad actors can now easily change the following as they copy your website:
- The wording and images on your website using generative AI
- The internal links using simple rules
Such clones would be extremely difficult to detect, and their operators could easily claim that what they created is a novel website, not a copy. Probably the only remaining defense is to ensure that your brand is recognized in its domain.
Hope that was useful! We normally write about AI, so feel free to explore AI use cases in business.