lazzarionline.com robots.txt

Site: lazzarionline.com

The robots.txt file is a file at the root of a domain that indicates which parts of the site should not be accessed by search engine crawlers.


# Google Image Crawler Setup
#User-agent: Googlebot-Image
#Disallow:

# Crawlers Setup
User-agent: *
Allow: /

# Directories
Disallow: /404/
Disallow: /app/
Disallow: /cgi-bin/
Disallow: /downloader/
Disallow: /errors/
Disallow: /includes/
#Disallow: /js/
Disallow: /lib/
Disallow: /magento/
Disallow: /media/tmp
Disallow: /pkginfo/
Disallow: /report/
Disallow: /scripts/
Disallow: /shell/
#Disallow: /skin/
Disallow: /stats/
Disallow: /var/
Disallow: /measures/

# Paths (clean URLs)
Disallow: /index.php/
Disallow: /catalog/product_compare/
Disallow: /catalog/category/view/
Disallow: /catalog/product/view/
Disallow: /catalogsearch/
Disallow: /checkout/
Disallow: /control/
Disallow: /contacts/
Disallow: /customer/
Disallow: /customize/
Disallow: /newsletter/
Disallow: /poll/
Disallow: /review/
Disallow: /sendfriend/
Disallow: /tag/
Disallow: /wishlist/
Disallow: /catalog/product/gallery/
Disallow: /errors/

# Files
Disallow: /cron.php
Disallow: /cron.sh
Disallow: /error_log
Disallow: /install.php
Disallow: /LICENSE.html
Disallow: /LICENSE.txt
Disallow: /LICENSE_AFL.txt
Disallow: /STATUS.txt

# Paths (no clean URLs)
#Disallow: /*.js$
#Disallow: /*.css$
Disallow: /*.php$
Disallow: /*?SID=

#Sitemaps
sitemap: http://www.lazzarionline.com/media/sitemaps/sitemap.xml
sitemap: http://www.lazzarionline.net/media/sitemaps/english_sitemap.xml
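
As a rough illustration of how a crawler might evaluate rules like the ones listed above, here is a minimal Python sketch. It assumes the longest-match precedence described in RFC 9309 (the most specific matching pattern wins, and Allow wins a tie), with `*` treated as a wildcard and a trailing `$` as an end-of-path anchor. The helper names `rule_to_regex` and `is_allowed`, and the `RULES` subset, are hypothetical and chosen only for illustration; real crawlers may differ in the details.

```python
import re

def rule_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a compiled regex.

    '*' matches any run of characters; a trailing '$' anchors the end of the path.
    Without '$', the pattern is a prefix match.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + (r"\Z" if anchored else ""))

def is_allowed(path: str, rules: list[tuple[str, str]]) -> bool:
    """Return True if `path` may be crawled under `rules`.

    `rules` is a list of (directive, pattern) pairs, where directive is
    'allow' or 'disallow'. Assumption: longest matching pattern wins,
    and 'allow' wins when two matching patterns have equal length.
    """
    best_len = -1
    best_allow = True  # no matching rule means the path is allowed
    for directive, pattern in rules:
        if not pattern:
            continue  # an empty pattern matches nothing
        if rule_to_regex(pattern).match(path):
            allow = directive == "allow"
            if len(pattern) > best_len or (len(pattern) == best_len and allow):
                best_len = len(pattern)
                best_allow = allow
    return best_allow

# A small subset of the "User-agent: *" rules from the listing above.
RULES = [
    ("allow", "/"),
    ("disallow", "/checkout/"),
    ("disallow", "/media/tmp"),
    ("disallow", "/*.php$"),
    ("disallow", "/*?SID="),
]

if __name__ == "__main__":
    for path in ["/", "/checkout/cart/", "/index.php", "/shoes.html?SID=abc"]:
        print(path, is_allowed(path, RULES))
```

Under these assumptions, the `Allow: /` line at the top of the group does not override the later `Disallow` lines, because each of those patterns is longer and therefore more specific. A parser that instead applies the first matching rule would treat every path as allowed.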

              
Thursday, 25 April 2019