The "Robots.txt" File and How to Construct it

How To Create the Robots Text Page for your Web Site and Why you NEED One!

Webmaster/SEO Specialist

Tami Robertson

Search Engine Optimization SEO Specialist

Webmaster Searchenginefirst.com

The importance of robots.txt

Although the robots.txt file is important if you want search engines to crawl and rank your site well, many Web sites don't provide one.

If your Web site doesn't have a robots.txt file yet, read on to learn how to create one. If you already have a robots.txt file, read our tips to make sure that it doesn't contain errors.

What is robots.txt?

When a search engine crawler comes to your site, it looks for a special file called robots.txt. That file tells the search engine spider which Web pages of your site should be indexed and which should be ignored.

The robots.txt file is a simple text file (no HTML) that must be placed in the root directory of your site, for example:

    http://www.yourwebsite.com/robots.txt
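
Because the file always lives at the root, its address can be derived from any page on the same site. The following is a minimal Python sketch of that idea; the function name `robots_txt_url` is made up for this illustration.

```python
from urllib.parse import urljoin


def robots_txt_url(page_url: str) -> str:
    """Return the root-level robots.txt URL for the site that serves page_url."""
    # A leading slash makes urljoin discard the page's own path and query string.
    return urljoin(page_url, "/robots.txt")


print(robots_txt_url("http://www.yourwebsite.com/support/index.html"))
# prints http://www.yourwebsite.com/robots.txt
```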

How do I create a robots.txt file?

As mentioned above, the robots.txt file is a simple text file. Open a simple text editor to create it. The content of a robots.txt file consists of so-called "records".

A record contains the instructions for a specific search engine spider. Each record consists of two parts: a User-agent line and one or more Disallow lines. Here's an example:

    User-agent: googlebot
    Disallow: /cgi-bin/

This robots.txt file would allow the "googlebot", which is the search engine spider of Google, to retrieve every page from your site except for files from the "cgi-bin" directory. All files in the "cgi-bin" directory will be ignored by googlebot.

The Disallow command matches by prefix, so it works like a trailing wildcard. If you enter

    User-agent: googlebot
    Disallow: /support

both "/support-desk/index.html" and "/support/index.html", as well as all other files in the "support" directory, would not be retrieved by googlebot.
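
One way to check this behavior before uploading a file is Python's standard `urllib.robotparser` module. This is only a sketch: the module is one implementation of the rules, and a real search engine spider may interpret edge cases differently. The path "/products/index.html" is a made-up example.

```python
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """\
User-agent: googlebot
Disallow: /support
"""

parser = RobotFileParser()
# parse() accepts the lines of the file, so no network request is needed.
parser.parse(ROBOTS_TXT.splitlines())

for path in ("/support/index.html", "/support-desk/index.html", "/products/index.html"):
    # can_fetch() returns False when a Disallow value is a prefix of the path.
    print(path, parser.can_fetch("googlebot", path))
```

With this parser, the first two paths are reported as blocked and the third as allowed, because only the first two start with "/support".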

If you leave the Disallow line blank, you're telling the search engine that all files may be indexed. In any case, you must enter a Disallow line for every User-agent record.

If you want to give all search engine spiders the same rights, use the following robots.txt content:

    User-agent: *
    Disallow: /cgi-bin/
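
If you would rather generate the file than type it by hand, the record structure described above can be sketched in a few lines of Python. The function name `build_robots_txt` and its input format are assumptions made for this illustration, not part of any standard tool.

```python
def build_robots_txt(records):
    """Render (user_agent, disallowed_paths) pairs as robots.txt text."""
    blocks = []
    for user_agent, paths in records:
        lines = [f"User-agent: {user_agent}"]
        # Every record needs at least one Disallow line; an empty value blocks nothing.
        lines += [f"Disallow: {path}" for path in paths] or ["Disallow:"]
        blocks.append("\n".join(lines))
    # Records are separated from each other by a blank line.
    return "\n\n".join(blocks) + "\n"


print(build_robots_txt([("*", ["/cgi-bin/"])]), end="")
```

Called with the single record shown, this prints the same two lines as the example above.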

Where can I find user agent names?

You can find user agent names in your log files by checking for requests to robots.txt. Most often, all search engine spiders should be given the same rights. In that case, use "User-agent: *" as mentioned above.
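
The log check can be automated. The sketch below assumes your server writes the common Apache/Nginx "combined" log format; other formats need a different pattern. The sample log lines, the IP addresses, and the "ExampleBot" name are made up for illustration.

```python
import re
from collections import Counter

# Assumption: "combined" log format, in which the request line is quoted
# and the user agent is the last quoted field on the line.
LOG_LINE = re.compile(
    r'"(?:GET|HEAD) (?P<path>\S+)[^"]*" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def robots_txt_agents(log_lines):
    """Count the user agents that requested /robots.txt."""
    counts = Counter()
    for line in log_lines:
        match = LOG_LINE.search(line)
        if match and match.group("path").split("?")[0] == "/robots.txt":
            counts[match.group("agent")] += 1
    return counts


# Made-up sample lines; in practice, pass an open log file instead.
SAMPLE_LOG = [
    '192.0.2.1 - - [26/Aug/2011:10:00:00 +0000] "GET /robots.txt HTTP/1.1" 200 68 "-" "ExampleBot/1.0"',
    '192.0.2.2 - - [26/Aug/2011:10:00:05 +0000] "GET /index.html HTTP/1.1" 200 5120 "-" "Mozilla/5.0"',
    '192.0.2.1 - - [26/Aug/2011:11:00:00 +0000] "GET /robots.txt HTTP/1.1" 200 68 "-" "ExampleBot/1.0"',
]

for agent, hits in robots_txt_agents(SAMPLE_LOG).most_common():
    print(hits, agent)
```

The name to put in a User-agent line is usually a short token taken from the full user agent string, so check the spider's own documentation before relying on it.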

Things you should avoid

If you don't format your robots.txt file properly, some or all files of your Web site might not get indexed by search engines. To avoid this, do the following:

  1. Don't use comments in the robots.txt file

    Although comments are allowed in a robots.txt file, they might confuse some search engine spiders.

    "Disallow: /support # Don't index the support directory" might be misinterpreted as "Disallow: /support#Don't index the support directory".


  2. Don't use white space at the beginning of a line. For example, don't write the lines with leading spaces or tabs:

        User-agent: *
        Disallow: /support

    but

    User-agent: *
    Disallow: /support


  3. Don't change the order of the commands. For a record to work, the User-agent line must come first, followed by its Disallow lines. Don't write

    Disallow: /support
    User-agent: *

    but

    User-agent: *
    Disallow: /support


  4. Don't use more than one directory in a Disallow line. Do not use the following

    User-agent: *
    Disallow: /support /cgi-bin/ /images/

    Search engine spiders cannot understand that format. The correct syntax for this is

    User-agent: *
    Disallow: /support
    Disallow: /cgi-bin/
    Disallow: /images/


  5. Be sure to use the right case. The file names on your server are case sensitive. If the name of your directory is "Support", don't write "support" in the robots.txt file.


  6. Don't list all files. If you want a search engine spider to ignore all files in a particular directory, you don't have to list each file separately. For example:

    User-agent: *
    Disallow: /support/orders.html
    Disallow: /support/technical.html
    Disallow: /support/helpdesk.html
    Disallow: /support/index.html

    You can replace this with

    User-agent: *
    Disallow: /support


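  7. Don't rely on an "Allow" command

    The original robots.txt standard has no "Allow" command. Google and other major search engines understand "Allow" lines, but some spiders may ignore them. The safest approach is to mention only the files and directories that you don't want to be indexed. All other files may be indexed automatically if they are linked on your site.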
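
Several of the points above can be checked mechanically. The following Python sketch flags comments, leading white space, a Disallow line that comes before any User-agent line, several paths on one Disallow line, and Allow lines. It does not check upper and lower case or redundant file lists, since those depend on what is on your server. The function name `lint_robots_txt` and its messages are made up for this illustration.

```python
def lint_robots_txt(text):
    """Return (line_number, message) pairs for the formatting mistakes listed above."""
    problems = []
    seen_user_agent = False
    for number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            # Assumption: a blank line ends the current record.
            seen_user_agent = False
            continue
        if line != line.lstrip():
            problems.append((number, "white space at the beginning of the line"))
        if "#" in line:
            problems.append((number, "comment found"))
        field, _, value = line.strip().partition(":")
        field = field.strip().lower()
        value = value.split("#")[0].strip()
        if field == "user-agent":
            seen_user_agent = True
        elif field == "disallow":
            if not seen_user_agent:
                problems.append((number, "Disallow appears before User-agent"))
            if len(value.split()) > 1:
                problems.append((number, "more than one path in a Disallow line"))
        elif field == "allow":
            problems.append((number, "Allow line found"))
    return problems


BAD = "Disallow: /support\nUser-agent: *\n  Disallow: /a /b # note\n"
for number, message in lint_robots_txt(BAD):
    print(f"line {number}: {message}")
```

A file that follows all of the advice above should produce an empty list.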

Tips and tricks:

1. How to allow all search engine spiders to index all files

    Use the following content for your robots.txt file if you want to allow all search engine spiders to index all files of your Web site:

    User-agent: *
    Disallow:

2. How to keep all spiders from indexing any file

    If you don't want search engines to index any file of your Web site, use the following:

    User-agent: *
    Disallow: /
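
The difference between these two files is a single character, so it is worth confirming which one you have. Here is a minimal check using Python's standard `urllib.robotparser` module; the helper name `allowed` and the spider name "examplebot" are made up for this illustration, and a real spider may behave differently from this parser.

```python
from urllib.robotparser import RobotFileParser


def allowed(robots_txt, path, agent="examplebot"):
    """Report whether the given robots.txt text lets agent retrieve path."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, path)


ALLOW_ALL = "User-agent: *\nDisallow:\n"
BLOCK_ALL = "User-agent: *\nDisallow: /\n"

print(allowed(ALLOW_ALL, "/support/index.html"))  # prints True
print(allowed(BLOCK_ALL, "/support/index.html"))  # prints False
```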

Your Web site should have a proper robots.txt file if you want to have good rankings on search engines. Only if search engines know what to do with your pages can they give you a good ranking.

 

 
