JOMS3.COM

All about robots.txt file

Article Id: 14 / 19/02/2018 Uploader name: Jomon Vayalil Oonnittan

  The robots.txt file is a plain text file with the file name robots.txt.

It tells robots, and any other non-human traffic generated by computers, not to read, look at or fetch the web pages mentioned in the file. The file must be placed in the root folder of the website.
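Because the file always sits in the root folder, its address can be worked out from any page address on the same site. Here is a minimal Python sketch of that idea. The function name robots_txt_url and the page path /articles/page.html are only assumptions for illustration.

```python
from urllib.parse import urljoin


def robots_txt_url(page_url: str) -> str:
    """Return the robots.txt address for the site that serves page_url."""
    # A leading "/" makes urljoin resolve against the site root,
    # so whatever path the page itself has is dropped.
    return urljoin(page_url, "/robots.txt")


# Hypothetical page address, used only for illustration.
print(robots_txt_url("http://joms3.com/articles/page.html"))
```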

Search engine robots such as Googlebot and Bingbot are the usual examples of robots. They crawl each of our web pages in order to index the website in their search engines.

We do not always need every web page to be indexed by the search engines. Examples are personal pages, pages used for back-end purposes, or pages that should stay in the deep web.

So in this article we will explain the general syntax of the robots.txt file.

Syntax for disallowing all robots from all web pages.

User-agent: *
Disallow: /

Explanation: "User-agent:" is followed by the name of the user agent. This line tells which robots the rules below it apply to. In place of the "*" we can write the name of the robot that we want to allow or disallow from reading the site.

Example: with "User-agent: Googlebot", the rules apply only to Google's robot. If we use "*" instead of Googlebot, the rules apply to all robots.
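To see the difference between naming one robot and using "*", we can test both forms with Python's standard urllib.robotparser module. This is only a sketch: the helper parse_rules, the robot name Bingbot in the checks and the page /index.html are assumptions for illustration, and real crawlers may match names a little differently from this parser.

```python
from urllib.robotparser import RobotFileParser


def parse_rules(text: str) -> RobotFileParser:
    """Build a parser from robots.txt text held in a string."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


# Rules that name one robot apply to that robot only.
only_google = parse_rules("User-agent: Googlebot\nDisallow: /\n")
# Rules under "*" apply to every robot.
every_robot = parse_rules("User-agent: *\nDisallow: /\n")

page = "http://joms3.com/index.html"  # hypothetical page
for name in ("Googlebot", "Bingbot"):
    print(name, only_google.can_fetch(name, page), every_robot.can_fetch(name, page))
```

With the first rule set only Googlebot is blocked. With the second, both robots are blocked.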

So the first line of the syntax says which robots the rules apply to. The second line of the syntax, that is

Disallow: / tells those robots which files they are not allowed to read from this site.

"/" indicates that no file under the root is allowed to be read. If we use

Disallow: with a blank value, that means none of the files under the root is disallowed (robots can read all files)
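The two values can be compared with a small Python sketch that uses the standard urllib.robotparser module. The helper name is_allowed, the robot name AnyBot and the page /index.html are assumptions for illustration.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(rules: str, robot: str, url: str) -> bool:
    """Parse robots.txt text and report whether robot may fetch url."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(robot, url)


block_all = "User-agent: *\nDisallow: /\n"  # "/" disallows everything
allow_all = "User-agent: *\nDisallow:\n"    # a blank value disallows nothing

page = "http://joms3.com/index.html"  # hypothetical page
print(is_allowed(block_all, "AnyBot", page))  # False
print(is_allowed(allow_all, "AnyBot", page))  # True
```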

User-agent: *
Disallow: /foldername/
   disallows only the files in this folder, for all robots

User-agent: *
Disallow: /filename.html
   disallows only this particular file, for all robots
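A folder rule and a file rule can be checked in the same way. The sketch below again uses Python's standard urllib.robotparser. The helper is_allowed, the robot name AnyBot and the page names page.html and other.html are assumptions for illustration.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(rules: str, path: str, robot: str = "AnyBot") -> bool:
    """Parse robots.txt text and report whether robot may fetch path."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(robot, "http://joms3.com" + path)


folder_rule = "User-agent: *\nDisallow: /foldername/\n"
file_rule = "User-agent: *\nDisallow: /filename.html\n"

# The folder rule blocks everything inside the folder and nothing else.
print(is_allowed(folder_rule, "/foldername/page.html"))  # False
print(is_allowed(folder_rule, "/other.html"))            # True

# The file rule blocks that one file and nothing else.
print(is_allowed(file_rule, "/filename.html"))           # False
print(is_allowed(file_rule, "/other.html"))              # True
```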

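We can also use multiple statements like this. It is safest to keep them under a single User-agent line, because some parsers honor only the first group written for a given robot.

User-agent: *
Disallow: /foldername/
Disallow: /filename.html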

Please also note the syntax examples below.


Protect Specific Directories From Robots

    User-agent: *
    Disallow: /my_home/
    Disallow: /joms3/


Protect Specific Pages From Robots

    User-agent: *
    Disallow: /joms3.com/contatme.php
    Disallow: /private.php


Prevent a Specific Robot from Accessing Your Site

    User-agent: Lycos/x.x
    Disallow: /

Allow Only One Specific Robot Access

    User-agent: *
    Disallow: /
    User-agent: Googlebot
    Disallow:
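The rules above can be tested with Python's standard urllib.robotparser module to confirm that only Googlebot gets in. This is a sketch only: the robot name OtherBot and the page /index.html are assumptions for illustration, and other parsers may resolve the groups slightly differently.

```python
from urllib.robotparser import RobotFileParser

# The same four lines as in the example above.
rules = """\
User-agent: *
Disallow: /
User-agent: Googlebot
Disallow:
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

page = "http://joms3.com/index.html"  # hypothetical page
googlebot_ok = parser.can_fetch("Googlebot", page)
otherbot_ok = parser.can_fetch("OtherBot", page)

print(googlebot_ok)  # True: the Googlebot group has a blank Disallow
print(otherbot_ok)   # False: every other robot falls under "*"
```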


We can also add our sitemap file to the robots.txt file, as shown in the syntax below.

Sitemap: http://joms3.com/sitemap.txt

The above line can be added anywhere in the robots.txt file. Robots can then find the location of your sitemap file and access it.
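A robot written in Python could pick up the sitemap location with the site_maps() method of the standard urllib.robotparser module, assuming Python 3.8 or newer. In this sketch the Disallow rule for /private.php is included only to show that the Sitemap line is read independently of the other rules.

```python
from urllib.robotparser import RobotFileParser

rules = """\
User-agent: *
Disallow: /private.php
Sitemap: http://joms3.com/sitemap.txt
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

# site_maps() returns a list of sitemap addresses,
# or None when the file has no Sitemap line.
sitemaps = parser.site_maps()
print(sitemaps)
```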

Leave your comments at "it@joms3.com" so that we can improve our articles.
© Copyright joms3.com