A Robots.txt File Instructs Search Engines And Bots

December 8, 2009 |  by Tom K  |  Bots

Robots.txt File

There are times when you may want to block search engines (or other robots) from including certain directories or files in search engine results. One way to do this is to create a plain text file named robots.txt. You can use Notepad or any other text editor to create the file.

If you want to block all robots (such as the crawlers for Google, Yahoo, and Bing) from everything on your website, you can use:

User-agent: *
Disallow: /

Is that it? Yes. Simply copy and paste that into a blank text file, save it as robots.txt, and upload the file to your top-level directory. You can test it by accessing your robots.txt file at www.yourdomain.com/robots.txt

When well-behaved bots visit your site, they read this file first to get their instructions.
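
If you would like to check a set of rules before uploading them, Python's standard-library urllib.robotparser module can read them the way a compliant bot would. This is only a minimal sketch: the bot name "ExampleBot", the page URL, and the helper is_allowed are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# The block-everything rules from above.
BLOCK_ALL = """\
User-agent: *
Disallow: /
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# "ExampleBot" and the URL are hypothetical, for illustration only.
print(is_allowed(BLOCK_ALL, "ExampleBot", "http://www.yourdomain.com/index.html"))
# prints False
```

Once the file is uploaded, the same parser can load the live copy with set_url() followed by read() instead of parse().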

Below, I’ve included some other examples of usage:

If you wanted to give all robots complete access you can use:

User-agent: *
Disallow:

What if you wanted to block all robots from a certain directory or from multiple directories? Use the following in your robots.txt file:

User-agent: *
Disallow: /directory/
Disallow: /directory2/
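
To see which paths these directory rules actually cover, here is a small sketch using Python's urllib.robotparser. The bot name and the sample paths are assumptions chosen for illustration; only the rules themselves come from the example above.

```python
from urllib.robotparser import RobotFileParser

# The two-directory rules from above.
RULES = """\
User-agent: *
Disallow: /directory/
Disallow: /directory2/
"""

parser = RobotFileParser()
parser.parse(RULES.splitlines())

# Hypothetical bot name and paths, for illustration only.
paths = ["/directory/page.html", "/directory2/", "/other/page.html"]
results = {
    path: parser.can_fetch("ExampleBot", "http://www.yourdomain.com" + path)
    for path in paths
}

for path, allowed in results.items():
    print(path, "allowed" if allowed else "blocked")
```

Only the paths that start with one of the listed directories come back as blocked; everything else stays open to the bot.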

What if you wanted to block a certain bot? Use the following as your robots.txt file, replacing Bot with that bot's user-agent name:

User-agent: Bot
Disallow: /

Alternatively, you can allow only a certain bot by giving that bot an empty Disallow and blocking everyone else:

User-agent: Googlebot
Disallow:

User-agent: *
Disallow: /
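
Keep in mind that a named group with an empty Disallow does not shut out other bots on its own; a catch-all group that disallows everything is needed as well. The sketch below compares the two cases with Python's urllib.robotparser. "OtherBot", the page URL, and the helper allowed are hypothetical names used only for illustration.

```python
from urllib.robotparser import RobotFileParser

# Named group only: nothing here tells other bots to stay out.
NAMED_GROUP_ONLY = """\
User-agent: Googlebot
Disallow:
"""

# Named group plus a catch-all group that blocks everyone else.
ONLY_GOOGLEBOT = NAMED_GROUP_ONLY + """
User-agent: *
Disallow: /
"""


def allowed(robots_txt, user_agent, url="http://www.yourdomain.com/page.html"):
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# "OtherBot" is a made-up bot name.
print(allowed(NAMED_GROUP_ONLY, "OtherBot"))   # True: still allowed
print(allowed(ONLY_GOOGLEBOT, "OtherBot"))     # False: blocked by the catch-all
print(allowed(ONLY_GOOGLEBOT, "Googlebot"))    # True: matches its own group
```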

What if you don’t know the “user-agent” name for the bot? I typically rely on a list like the one found at User-Agents.org.

Although this method is definitely not foolproof (robots.txt is only a request, and badly behaved bots can ignore it), using the robots.txt file is a good way to communicate with bots. If there is an extremely sensitive directory, you should password-protect it.
