<p>There is a solution, but as with other solutions, Google may interpret it as cloaking and ban you at will. This is a long one and will probably need further tinkering to work for your case. (Sorry in advance for the length.)</p>
<p><strong>Setup</strong></p>
<p>For the sake of the example, let's just say that:</p>
<ul> <li>site: <code>www.thesite.com</code> and</li> <li>ImageURL base: <code>images.thesite.com</code></li> </ul>
<p>(but ImageURL base could easily be <code>www.thesite.com/wp-content/uploads</code>)</p>
<p><strong>Target</strong></p>
<p>Our target is to (1) show the full-size image only with a watermark/overlay when it's requested from Google image search and (2) not break anything that previously worked.</p>
<p><strong>Solution</strong></p>
<p>So the theoretical solution is the following.</p>
<p><strong>1)</strong> Check the User-Agent and, if it contains <code>Googlebot</code>, serve the "trap" URL. The trap URL is your current image URL, slightly changed so that you can treat it differently. So instead of the current normal:</p>
<p><code>http://images.thesite.com/wallpapers/awesome.jpg</code></p>
<p>you should print for Googlebot:</p>
<p><code>http://cacheimages.thesite.com/wallpapers/awesome.jpg</code></p>
<p>(where <code>cacheimages</code> is anything you want)</p>
<p><strong>2)</strong> Now the main dish: you should be able to target the requests to <code>http://cacheimages.thesite.com/</code> and have a script that acts as follows:</p>
<pre><code>If the request comes from a bot (check user-agent headers)
    Then serve the normal image without watermark
Else (if the request seems to be from a normal user)
    Then check the referer:
    If it's from google (but NOT http://www.google.com/blank.html)
        Redirect to the Post of the image (Note 1.)
    Else if the referer is your site
        Show the raw normal image
    Else (any other referer, including http://www.google.com/blank.html)
        Show watermarked image (Note 2.)
</code></pre>
<p><em>Note 1</em>: This will happen when people click "View original image" or the image itself.</p>
<p><em>Note 2</em>: This will happen when people try to see the full-size image from the Google image search results (and if they somehow arrive at the trap URL of an image).</p>
<p><strong>3)</strong> You could HTTP-redirect the old images to the new ImageURL base when the user-agent is Googlebot, so that the overlay/watermark trick starts working on old images sooner and you are sure to preserve the SEO juice. (If you use subdomains for images, you could also use Google Webmaster Tools for this.)</p>
<p><strong>Further actions</strong></p>
<p>You could make more changes if you want to be serious.</p>
<ol> <li>Instead of showing the watermarked image directly, redirect to a more dynamic URL such as <code>http://cacheimages.thesite.com/preview?p=/wallpapers/awesome.jpg&amp;r=23535</code>, or use the more modern approach of an HTTP header to prevent indexing: <code>X-Robots-Tag: noindex</code></li> <li>Of course, cache the watermarked images.</li> <li>Check the <code>Accept</code> HTTP header for cases that I haven't thought of, and serve the image or redirect to the image's post accordingly.</li> </ol>
<p><strong>Note</strong></p>
<p>You may also have to think about international traffic, so instead of <code>google.com</code> you want to check for <code>google\.[a-z.-]+/</code></p>
<p><strong>Conclusion</strong></p>
<p>This could be adapted to any system. I made it for one that has images on a subdomain, so it probably won't be exactly the same for other systems like WordPress etc. Also, I am sure Google will change their image search in the next couple of months to fix this issue.</p>
<p>An untested sample implementation of the idea can be found on <a href="https://github.com/sevastos/google-image-trap" rel="noreferrer">GitHub</a>.</p>
<p><strong>Disclaimers</strong></p>
<p>This hasn't been tested thoroughly and you could get banned; it's provided merely for research and educational purposes. I cannot be held responsible for any damages etc.</p>
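<p>To make the decision logic of step 2 more concrete, here is a minimal Python sketch of just the branching. It is not taken from the linked repository. The function name <code>choose_action</code>, the returned action strings and the exact regular expressions are assumptions made for illustration, and the Google host pattern is only a rough guess at covering international domains.</p>

```python
import re
from urllib.parse import urlsplit

# Assumption: any User-Agent containing "Googlebot" is treated as the bot.
BOT_UA = re.compile(r"googlebot", re.IGNORECASE)

# Assumption: a rough pattern for google.com, google.de, google.co.uk, etc.,
# with optional subdomains such as "www." or "images.".
GOOGLE_HOST = re.compile(r"^(?:[a-z0-9-]+\.)*google\.[a-z]{2,3}(?:\.[a-z]{2})?$")


def choose_action(user_agent, referer, own_host="www.thesite.com"):
    """Return "raw", "redirect_to_post" or "watermark" for a trap-URL request."""
    # Bots get the normal image without a watermark.
    if BOT_UA.search(user_agent or ""):
        return "raw"

    parts = urlsplit(referer or "")
    host = (parts.hostname or "").lower()

    # A Google referer other than /blank.html: send the visitor to the post.
    if GOOGLE_HOST.match(host) and parts.path != "/blank.html":
        return "redirect_to_post"

    # Requests coming from our own pages see the raw image.
    if host == own_host:
        return "raw"

    # Everything else, including google's /blank.html and empty referers.
    return "watermark"


if __name__ == "__main__":
    print(choose_action("Mozilla/5.0", "http://www.google.co.uk/imgres?q=x"))
    print(choose_action("Mozilla/5.0", "http://www.google.com/blank.html"))
```

<p>Both headers are trivial to spoof, so treat this only as a starting point. A real handler would still need to map each action to an actual response (serving the file, issuing the redirect, or generating and caching the watermarked copy).</p>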
 
