As you are probably aware, the robots.txt file is a set of instructions for the robots (spiders) that visit your web site and index its pages. For those spiders that obey the file, it provides a map of what they may and may not crawl.
On occasion, some of these search engine spiders may not be able to read the robots.txt file. When this happens, the robots meta tag comes into play. In effect, this is a last chance to keep content out of search engines.
If this meta tag is missing, or its content is empty, or no robot terms are specified, a search engine will assume that it may index the page and follow its links. The tag is therefore only worth adding when you don’t want a particular page indexed or its links followed.
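To make the default behaviour concrete, here is a minimal Python sketch of how a spider might read the tag. It is only an illustration, not how any particular search engine works: the class name RobotsMetaParser and the function robots_policy are hypothetical, and the sketch assumes a page is treated as indexable and followable unless "noindex" or "nofollow" appears.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the terms found in <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values can be None (e.g. <meta name>), so normalise them.
        attr = {key: (value or "") for key, value in attrs}
        if attr.get("name", "").strip().lower() != "robots":
            return
        for term in attr.get("content", "").split(","):
            term = term.strip().lower()
            if term:
                self.directives.add(term)


def robots_policy(html):
    """Return (may_index, may_follow) for a page.

    Assumption for this sketch: both default to True when the tag is
    missing, empty, or does not mention the relevant term.
    """
    parser = RobotsMetaParser()
    parser.feed(html)
    may_index = "noindex" not in parser.directives
    may_follow = "nofollow" not in parser.directives
    return may_index, may_follow


if __name__ == "__main__":
    page = '<html><head><meta name="robots" content="noindex,follow"></head></html>'
    print(robots_policy(page))
```

A page with no robots meta tag at all comes back as indexable and followable, which is the case described above.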
Writing a Robots Tag
The syntax for the robots meta tag can look like any of the following:
<meta name="robots" content="index,follow">
<meta name="robots" content="noindex,follow">
<meta name="robots" content="index,nofollow">
<meta name="robots" content="noindex,nofollow">
Robots Tag Myths
“If you have a robots.txt file, you don’t need a meta tag.”
False. Although this is true most of the time, occasionally a search engine may not be able to read the robots.txt file. The tag provides a second chance to keep content from being indexed.
“I can use a robots meta tag instead of a robots.txt file.”
False. Again, although it is possible to restrict content with the meta tag, the robots.txt file should be your first choice. It has far more options for keeping areas of your site from being crawled.
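As a rough illustration of how robots.txt can close off whole areas of a site in a few lines, the sketch below checks some URLs against a small rule set using Python's standard urllib.robotparser module. The rules, the example.com URLs, and the helper name build_parser are assumptions made up for this example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules for illustration only; the paths are assumptions.
RULES = """\
User-agent: *
Disallow: /private/
Disallow: /tmp/
"""


def build_parser(rules_text):
    """Build a parser from robots.txt text without fetching anything."""
    parser = RobotFileParser()
    parser.parse(rules_text.splitlines())
    return parser


parser = build_parser(RULES)

# Ask whether any spider ("*") may fetch each URL under these rules.
for url in (
    "https://example.com/index.html",
    "https://example.com/private/report.html",
    "https://example.com/tmp/cache.html",
):
    print(url, parser.can_fetch("*", url))
```

Two Disallow lines cover every page under two directories, whereas the meta tag would have to be added to each of those pages individually.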
A later article will discuss the robots.txt file in detail. In the meantime, our meta tag discussions continue with some of the more obscure elements.