Site Search

Find sitemap.xml, robots.txt and ads.txt - Real-Time Search


reportpistoia.com robots.txt

Site: reportpistoia.com

The robots.txt file sits at the root of a domain and indicates which parts of the site should not be accessed by search engine crawlers.


# If the Joomla site is installed within a folder such as at
# e.g. www.example.com/joomla/ the robots.txt file MUST be
# moved to the site root at e.g. www.example.com/robots.txt
# AND the joomla folder name MUST be prefixed to the disallowed
# path, e.g. the Disallow rule for the /administrator/ folder
# MUST be changed to read Disallow: /joomla/administrator/
#
# For more information about the robots.txt standard, see:
# http://www.robotstxt.org/orig.html
#
# For syntax checking, see:
# http://www.sxw.org.uk/computing/robots/check.html

User-agent: 008
Disallow: /

User-agent: MJ12bot
Crawl-delay: 5

User-agent: Googlebot 
Crawl-delay: 5

User-agent: Googlebot-Image 
Crawl-delay: 5

user-agent: AhrefsBot
Crawl-delay: 5

User-agent: msnbot
Crawl-delay: 5

User-agent: bingbot
Crawl-delay: 5

User-agent: *
Disallow: /administrator/
Disallow: /cache/
Disallow: /cli/
Disallow: /components/
Disallow: /images/
Disallow: /includes/
Disallow: /installation/
Disallow: /language/
Disallow: /libraries/
Disallow: /logs/
Disallow: /media/
Disallow: /modules/
Disallow: /plugins/
Disallow: /templates/
Disallow: /tmp/
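
To see how a crawler might interpret rules like these, here is a minimal sketch using Python's standard-library urllib.robotparser. It parses a shortened excerpt of the file above rather than fetching it live. The agent name "SomeOtherBot" and the sample paths are hypothetical, chosen only for illustration, and other parsers may interpret the same file differently.

```python
from urllib.robotparser import RobotFileParser

# Shortened excerpt of the rules listed above (not the full file).
ROBOTS_TXT = """\
User-agent: 008
Disallow: /

User-agent: bingbot
Crawl-delay: 5

User-agent: *
Disallow: /administrator/
Disallow: /images/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# "SomeOtherBot" and the paths below are hypothetical examples.
checks = [
    ("008", "/index.php"),                        # blocked everywhere
    ("bingbot", "/images/logo.png"),              # own group has no Disallow
    ("SomeOtherBot", "/administrator/index.php"), # falls back to the * group
    ("SomeOtherBot", "/index.php"),
]

for agent, path in checks:
    allowed = parser.can_fetch(agent, path)
    delay = parser.crawl_delay(agent)  # None when no Crawl-delay applies
    print(f"{agent:13} {path:26} allowed={allowed} crawl_delay={delay}")
```

In this parser, an agent that matches its own User-agent group is evaluated against that group only, so bingbot is not bound by the Disallow lines in the * group. Note also that Crawl-delay is not part of the original robots.txt standard, and crawlers differ in whether they honor it.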


              
Thursday, 18 April 2019