If you have pages or directories that you do not want search engines to index, you can list them in a robots.txt file and place the file on the server. When a search engine spider visits your site, it reads the file and follows the instructions.
 
The robots.txt file need not exist, but if it does it must be called "robots.txt" and must be written in plain ASCII text.
 
It must be in the root directory of the web site, as spiders will not look for it anywhere else.
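
Because the file is only ever requested from the root directory, its address can be worked out from any page on the site. The following is a minimal Python sketch of that rule; the function name robots_url and the example.com address are illustrative assumptions, not part of any standard tool.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url: str) -> str:
    """Return the root-level robots.txt URL for the site hosting page_url."""
    parts = urlsplit(page_url)
    # Keep only the scheme and host; the path is always /robots.txt.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


# Hypothetical page address used only for illustration.
print(robots_url("http://www.example.com/products/index.html"))
# prints http://www.example.com/robots.txt
```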
 
 Note: 
If you do not have a robots.txt file in the root directory of your web site, you may find a large number of 404 errors in your web stats. This is because bots or spiders requested the file and it was not available.
 
To create a robots.txt file
• Create a plain text file in a text editor or HTML editor, using the directives shown in the examples below
• Save the file as robots.txt
• Upload the robots.txt file to the root directory using your FTP software in ASCII mode
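
The first two steps above can also be scripted. This is a rough Python sketch, assuming the rules are supplied as (user agent, disallowed paths) pairs; the helper names build_robots_txt and save_robots_txt are hypothetical, and the upload step is still done separately with your FTP software.

```python
from pathlib import Path


def build_robots_txt(records):
    """Build robots.txt text from (user_agent, [disallowed_paths]) pairs."""
    lines = []
    for user_agent, disallowed in records:
        lines.append(f"User-agent: {user_agent}")
        # An empty list becomes a bare "Disallow:" line (nothing disallowed).
        for path in disallowed or [""]:
            lines.append(f"Disallow: {path}".rstrip())
        lines.append("")  # a blank line ends each record
    return "\n".join(lines)


def save_robots_txt(path, text):
    """Write text as ASCII; raises UnicodeEncodeError on non-ASCII characters."""
    Path(path).write_text(text, encoding="ascii")


content = build_robots_txt([("*", ["/cgi-bin/", "/misc/sitestats/"])])
print(content)
```

Calling save_robots_txt("robots.txt", content) would then write the file locally, ready to upload.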
 
 Examples
 To exclude all robots from parts of the server
 User-agent: *
 Disallow: /cgi-bin/
 Disallow: /misc/sitestats/
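
One way to check how a well-behaved spider would read these rules is Python's standard urllib.robotparser module. The sketch below is only an illustration: the agent name ExampleBot and the example.com paths are made up, and real spiders may interpret rules slightly differently.

```python
from urllib.robotparser import RobotFileParser

RULES = """\
User-agent: *
Disallow: /cgi-bin/
Disallow: /misc/sitestats/
"""

parser = RobotFileParser()
parser.parse(RULES.splitlines())

# ExampleBot is a hypothetical spider name; it falls under the "*" record.
for path in ("/index.html", "/cgi-bin/form.pl", "/misc/sitestats/today.html"):
    allowed = parser.can_fetch("ExampleBot", "http://www.example.com" + path)
    print(path, "allowed" if allowed else "blocked")
```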
 
To exclude a specific spider from parts of the server
User-agent: Slurp
Disallow: /cgi-bin/
Disallow: /secure/
Disallow: /products/
Disallow: /misc/sitestats/
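
A record that names one spider leaves every other spider unrestricted. The Python sketch below illustrates this with the standard urllib.robotparser module, assuming the spider is matched by the name Slurp; the helper is_allowed and the example.com address are hypothetical.

```python
from urllib.robotparser import RobotFileParser

RULES = """\
User-agent: Slurp
Disallow: /cgi-bin/
Disallow: /secure/
Disallow: /products/
Disallow: /misc/sitestats/
"""


def is_allowed(agent: str, path: str) -> bool:
    """Return True if the named agent may fetch path under RULES."""
    parser = RobotFileParser()
    parser.parse(RULES.splitlines())
    return parser.can_fetch(agent, "http://www.example.com" + path)


print(is_allowed("Slurp", "/products/list.html"))      # False: named spider is excluded
print(is_allowed("Slurp", "/index.html"))              # True: path is not listed
print(is_allowed("Googlebot", "/products/list.html"))  # True: no record applies to it
```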
 
This indicates that nothing is disallowed and the spider can follow all links
 User-agent: *
 Disallow:
 
To allow a single robot complete access and exclude all others
User-agent: Googlebot
Disallow:

User-agent: *
Disallow: /
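
The effect of combining a named record with a catch-all record can be checked the same way. This Python sketch uses the standard urllib.robotparser module and assumes the robot is matched by its name without a version number; OtherBot and the example.com address are hypothetical.

```python
from urllib.robotparser import RobotFileParser

RULES = """\
User-agent: Googlebot
Disallow:

User-agent: *
Disallow: /
"""

parser = RobotFileParser()
parser.parse(RULES.splitlines())

page = "http://www.example.com/products/index.html"
print(parser.can_fetch("Googlebot", page))      # True: its own record disallows nothing
print(parser.can_fetch("Googlebot/2.1", page))  # True: this parser ignores the version suffix
print(parser.can_fetch("OtherBot", page))       # False: falls under the "*" record
```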
 
This would prevent your entire web site from being indexed
 User-agent: *
 Disallow: / 
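
An empty Disallow line and "Disallow: /" differ by a single character but have opposite effects, so the two are easy to confuse. The Python sketch below contrasts them using the standard urllib.robotparser module; the function site_is_open, the agent ExampleBot, and the example.com address are illustrative assumptions.

```python
from urllib.robotparser import RobotFileParser

ALLOW_ALL = "User-agent: *\nDisallow:\n"
BLOCK_ALL = "User-agent: *\nDisallow: /\n"


def site_is_open(rules: str, agent: str = "ExampleBot") -> bool:
    """Return True if agent may fetch the site's home page under rules."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(agent, "http://www.example.com/")


print(site_is_open(ALLOW_ALL))  # True
print(site_is_open(BLOCK_ALL))  # False
```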
 
 Spider User-agents
 Alta Vista : Scooter
Infoseek : InfoSeek Sidewinder Ultraseek Mozilla
 Lycos : Lycos_Spider_(T-Rex)
 Google : Googlebot/1.0
 Inktomi : Slurp Slurp.so
 
 
The reasons for excluding files from some or all spiders include privacy, log files, or pages optimised for a particular search engine that you would not want indexed by other search engines.
 
You can add the Robots meta tag to the head of your web page to tell spiders what to index and what not to index.
 
 <html>
 <head>
 <title>What Is A Robots text File</title>
<meta name="description" content="If you have pages or directories that you do not want search engines to index, you can list them in a robots.txt file and place the file on the server">
 <meta name="robots" content="index, follow">
 </head>
 <body>
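
To confirm which Robots meta tag a page actually carries, the head can be scanned with Python's standard html.parser module. This is a minimal sketch rather than a full crawler; the class name RobotsMetaFinder is hypothetical and the sample page is a shortened copy of the one above.

```python
from html.parser import HTMLParser


class RobotsMetaFinder(HTMLParser):
    """Record the content attribute of the last <meta name="robots"> tag seen."""

    def __init__(self):
        super().__init__()
        self.content = None

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        name = (attributes.get("name") or "").lower()
        if tag == "meta" and name == "robots":
            self.content = attributes.get("content")


PAGE = """<html>
<head>
<title>What Is A Robots text File</title>
<meta name="robots" content="index, follow">
</head>
<body></body>
</html>"""

finder = RobotsMetaFinder()
finder.feed(PAGE)
print(finder.content)  # prints: index, follow
```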
 
The Robots meta tag has the following options
 Indexes the page and follows links
 <meta name="robots" content="index, follow">
 
 Does not index the page, but follows links
 <meta name="robots" content="noindex, follow"> 
 
 Indexes the page, but does not follow links
 <meta name="robots" content="index, nofollow"> 
 
Neither indexes the page nor follows links
 <meta name="robots" content="noindex, nofollow"> 
 
You can use one of these tags on each page, according to your requirements for that page.
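
The four combinations above reduce to two yes/no decisions, which the following Python sketch makes explicit. It assumes that a missing directive means the permissive default (index, follow); the function name parse_robots_meta is hypothetical.

```python
from typing import Tuple


def parse_robots_meta(content: str) -> Tuple[bool, bool]:
    """Return (index, follow) for the content value of a Robots meta tag."""
    tokens = {token.strip().lower() for token in content.split(",")}
    # Assumption: anything not explicitly forbidden is permitted.
    index = "noindex" not in tokens
    follow = "nofollow" not in tokens
    return index, follow


for value in ("index, follow", "noindex, follow",
              "index, nofollow", "noindex, nofollow"):
    print(value, "->", parse_robots_meta(value))
```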
