Use robots.txt to Control Search Engine Spiders

by Sathishkumar on June 26, 2010

Robots.txt is a plain-text file placed in the root of a site that tells search engine crawlers which parts of your blog they may crawl. You can allow some bots and disallow others from crawling part or all of your site, since all major search engines follow the Robots Exclusion Protocol. Compliance is voluntary, so the file guides well-behaved crawlers rather than enforcing access.
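To make the idea of allowing some bots and disallowing others concrete, here is a minimal sketch in Python using the standard library's urllib.robotparser. The bot name "BadBot" and the helper can_crawl are made up for illustration; they are assumptions, not names from any real crawler or from this blog.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: shut out one crawler ("BadBot" is an invented name)
# and let every other crawler fetch everything.
PER_BOT_RULES = """\
User-agent: BadBot
Disallow: /

User-agent: *
Disallow:
"""


def can_crawl(rules: str, user_agent: str, path: str = "/") -> bool:
    """Return True if `user_agent` may fetch `path` under `rules`."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())  # parse the rules from memory, no network
    return parser.can_fetch(user_agent, path)


if __name__ == "__main__":
    for bot in ("BadBot", "Googlebot"):
        print(bot, can_crawl(PER_BOT_RULES, bot))
```

An empty Disallow line means nothing is blocked for that group, so only the bot named in the first group is turned away.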


Is the robots.txt file that important?

Many people ask why they should use robots.txt and what its purpose is. Here are the answers.

  1. Using robots.txt avoids wasting server resources by disallowing scripts that are meant only for human visitors, not for bots.
  2. If you don’t have a robots.txt file on your server, search engine spiders that request it receive your site’s 404 error page, which uses bandwidth. So, having a robots.txt file can save bandwidth.
  3. Sometimes you don’t want a particular page, article, or other content to be indexed in search engines. With robots.txt you can control which parts of your blog search engines crawl.

Now you know what robots.txt is for, and yes, robots.txt is important for every site.

How do you set up a robots.txt file to control search engine spiders?

Writing a robots.txt file is pretty simple. All you need to do is create a text document named robots.txt, fill it with the rules you need, and upload it to the root of the server. Here is a robots.txt that keeps search engine spiders out of my blog’s images directory and cgi-bin directory.

User-agent: *
Disallow: /images/
Disallow: /cgi-bin/

You can use the same method to disallow whatever content you want. A well-written robots.txt gives you more control over how search engines crawl and index your site.
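Before uploading the file, it can help to check that the rules do what you expect. The following is a rough sketch using Python's standard urllib.robotparser module against the same three lines shown above. The helper is_allowed, the bot name "ExampleBot", and the sample paths are assumptions for illustration only.

```python
from urllib.robotparser import RobotFileParser

# The same rules as the robots.txt shown in this post.
RULES = """\
User-agent: *
Disallow: /images/
Disallow: /cgi-bin/
"""


def is_allowed(rules: str, user_agent: str, path: str) -> bool:
    """Return True if `user_agent` may fetch `path` under `rules`."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())  # parse from a string, no HTTP request
    return parser.can_fetch(user_agent, path)


if __name__ == "__main__":
    # Sample paths are hypothetical; substitute paths from your own site.
    for path in ("/images/logo.png", "/cgi-bin/form.pl", "/2010/06/some-post/"):
        status = "allowed" if is_allowed(RULES, "ExampleBot", path) else "blocked"
        print(path, status)
```

With these rules, paths under /images/ and /cgi-bin/ are reported as blocked and everything else as allowed. To test a different set of rules, change the RULES string and the sample paths.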

If you find this tutorial useful, give us a thumbs up.


Article by Sathishkumar Varatharajan

Sathishkumar is an online marketer. He is the mastermind behind this Make Money Online blog. He is also the founder of Worthy Clique, an Internet marketing company.

Sathishkumar Has Written 429 Articles For TechieMania.com


Pankaj Gupta April 14, 2011 at 5:25 am

Yes, robots.txt files are very powerful, but I would not suggest blocking images, because we can get traffic via Google Images too.
