When a search engine bot visits a website for the first time, it checks the domain's root directory for a robots.txt file. This file tells crawlers which paths they are allowed to crawl and which they are not.
Robots.txt example
https://www.google.com/robots.txt
Robots.txt structure
A robots.txt file has the following basic structure:

User-agent: useragent1
Disallow: ...
Allow: ...
Sitemap: ...
A line starting with User-agent tells search engines which crawler the rules below it apply to. For example:
User-agent: *
Disallow: /search
Allow: /search/about

This block indicates that the /search path is disallowed for all user agents, while the more specific /search/about path is explicitly allowed.
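The rules above can be checked with Python's standard urllib.robotparser; a minimal sketch, using the hypothetical host example.com. Note that Python's parser applies rules in file order (first match wins), so the more specific Allow line is listed first here; Google's crawler instead picks the most specific matching rule regardless of order.

```python
from urllib import robotparser

# Rules mirroring the example above. Allow is listed first because
# Python's parser applies rules in file order (first match wins).
rules = [
    "User-agent: *",
    "Allow: /search/about",
    "Disallow: /search",
]

rp = robotparser.RobotFileParser()
rp.parse(rules)

# example.com is a hypothetical host used only for illustration
blocked = rp.can_fetch("*", "https://example.com/search")        # False
allowed = rp.can_fetch("*", "https://example.com/search/about")  # True
```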
Allowing all pages for all user agents
User-agent: *
Disallow:

(an empty Disallow line permits everything)
Disallowing all pages for all user agents
User-agent: *
Disallow: /
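Both patterns can be verified with the same stdlib parser; a small sketch, with a hypothetical helper is_allowed and the hypothetical host example.com:

```python
from urllib import robotparser

def is_allowed(robots_txt, url, agent="*"):
    """Parse a robots.txt body and ask whether agent may fetch url."""
    rp = robotparser.RobotFileParser()
    rp.parse(robots_txt.splitlines())
    return rp.can_fetch(agent, url)

# An empty Disallow permits everything; "Disallow: /" blocks everything.
allow_all = "User-agent: *\nDisallow:"
block_all = "User-agent: *\nDisallow: /"

open_site = is_allowed(allow_all, "https://example.com/any/page")    # True
closed_site = is_allowed(block_all, "https://example.com/any/page")  # False
```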
Sitemap directive
The Sitemap directive tells crawlers where the sitemap file is located:
Sitemap: https://www.google.com/sitemap.xml
A robots.txt file may contain multiple Sitemap lines. If you have many sitemaps, however, it is cleaner to reference a single sitemap index file instead.
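Python's urllib.robotparser (3.8+) exposes any Sitemap lines it finds via site_maps(); a short sketch, assuming a hypothetical sitemap index URL on example.com:

```python
from urllib import robotparser

# Hypothetical robots.txt body referencing a single sitemap index file
robots_txt = """User-agent: *
Disallow:

Sitemap: https://example.com/sitemap_index.xml
"""

rp = robotparser.RobotFileParser()
rp.parse(robots_txt.splitlines())

# List of Sitemap URLs found in the file, or None if there were none
sitemaps = rp.site_maps()
```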