Thursday, February 18, 2010

Stop search engines from indexing your website

1 Theoretical possibilities


1.1 /robots.txt - robots exclusion file


Create a file /robots.txt in the root of your site:

User-agent: *
Disallow: /



Search engine support: Google, Yahoo, Bing, ... Compliance is voluntary: the file asks crawlers to stay away but does not prevent access.
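
To confirm that the two rules above really exclude every crawler, they can be fed to a parser before uploading. The following is a minimal sketch using Python's standard urllib.robotparser module; the helper name is_allowed, the example.com URL and the "ExampleBot" agent are placeholders chosen for illustration, not part of the original setup.

```python
from urllib.robotparser import RobotFileParser

# The two rules from the robots.txt file shown above.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # "ExampleBot" is a made-up crawler name used only for this check.
    for agent in ("Googlebot", "ExampleBot"):
        allowed = is_allowed(ROBOTS_TXT, agent, "http://example.com/any/page.html")
        print(agent, "allowed" if allowed else "blocked")
```

This only checks how a well-behaved crawler would read the file; a crawler that ignores robots.txt is not affected by it.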

1.2 Use the robots meta tag on each page



Add this tag to the <head> of every page that should stay out of the index:

<meta name="robots" content="noindex,nofollow" />
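
A quick way to check that a page actually carries the tag is to parse its HTML and collect the robots directives. This is a sketch built on Python's standard html.parser module; the class RobotsMetaParser and the function robots_directives are hypothetical names introduced here, and the sample page is made up.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives of every <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        # Self-closing tags such as <meta ... /> also end up here.
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            self.directives.update(
                part.strip().lower() for part in content.split(",") if part.strip()
            )


def robots_directives(html: str) -> set:
    """Return the set of robots directives found in an HTML document."""
    parser = RobotsMetaParser()
    parser.feed(html)
    parser.close()
    return parser.directives


if __name__ == "__main__":
    page = '<html><head><meta name="robots" content="noindex,nofollow" /></head></html>'
    found = robots_directives(page)
    print("noindex set:", "noindex" in found)
    print("nofollow set:", "nofollow" in found)
```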



2 The only sure method


2.1 Password-protected website - web server solution


Example for the Apache server (htpasswd documentation): httpd.apache.org/docs/2.0/programs/htpasswd.html

Create a file called .htpasswd:

user1:nbGwQ3aJkq3qE (format: username:encrypted password, e.g. user1 followed by the encrypted form of password1)



KxS Password Encrypter: www.kxs.net/support/htaccess_pw.html
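
If you would rather not type a password into an online tool, an entry can also be generated locally. The sketch below assumes Apache's {SHA} entry format (the Base64-encoded SHA-1 digest of the password, as produced by htpasswd -s), which is a different format from the crypt() hash in the sample line above. Unsalted SHA-1 is weak, so treat this as an illustration of the file format rather than a recommendation; the function name htpasswd_sha_entry is hypothetical.

```python
import base64
import hashlib


def htpasswd_sha_entry(user: str, password: str) -> str:
    """Build one .htpasswd line in the {SHA} format (assumed to be enabled)."""
    if ":" in user:
        raise ValueError("user name must not contain ':'")
    digest = hashlib.sha1(password.encode("utf-8")).digest()
    encoded = base64.b64encode(digest).decode("ascii")
    return f"{user}:{{SHA}}{encoded}"


if __name__ == "__main__":
    # Prints a line that can be pasted into .htpasswd.
    print(htpasswd_sha_entry("user1", "password1"))
```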

Create a file called .htaccess:

AuthName "Restricted Area"
AuthType Basic
AuthUserFile /mysite/.htpasswd
AuthGroupFile /dev/null
require valid-user



Upload .htpasswd to the root or, preferably, to a directory on your server that is not publicly accessible. The AuthUserFile line in .htaccess must contain the path to this file.
Upload .htaccess to the directory you wish to protect.
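
Once both files are uploaded, the protection can be tested from a script. With AuthType Basic, the client sends the user name and password as a Base64-encoded "user:password" pair in the Authorization header. The sketch below uses only the Python standard library; the function names and the example.com URL in the usage note are placeholders, not part of the original setup.

```python
import base64
import urllib.error
import urllib.request


def basic_auth_header(user: str, password: str) -> str:
    """Return the value of the Authorization header for HTTP Basic auth."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return f"Basic {token}"


def fetch_status(url: str, user: str = None, password: str = None) -> int:
    """Request url, optionally with Basic credentials, and return the HTTP status."""
    request = urllib.request.Request(url)
    if user is not None:
        request.add_header("Authorization", basic_auth_header(user, password or ""))
    try:
        with urllib.request.urlopen(request, timeout=10) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # Error responses such as 401 arrive as exceptions; report their code.
        return err.code
```

For a correctly protected directory, fetch_status("http://example.com/protected/") should return 401, and the same call with a valid user and password should return 200. Note that Basic authentication only encodes the credentials, it does not encrypt them, so they are readable on the wire unless the site is served over HTTPS.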

2.2 Password-protected website - CMS solution


A protected area for registered users; how to set it up depends on the specific CMS.

2.3 Telling computers and humans apart


Captcha: en.wikipedia.org/wiki/CAPTCHA

reCaptcha: recaptcha.net

Hidden tag: 15daysofjquery.com/safer-contact-forms-without-captchas/11/

Captcha alternatives: www.arraystudio.com/as-workshop/the-captcha-alternatives.html

Sources:


A detailed description of six methods to control which of your content appears in search engines, and how: "6 ways to stop Google and other search engines from indexing your site", Antezeta Web Marketing.