SEO Hardcore

Search Engine Optimization

May 27th, 2008

Preventing unwanted robots from crawling your site.

john in robots.txt

There are many good reasons to use a robots.txt file, and an increasingly important one is keeping unwanted robots away. If your logs show a lot of user-agents that you don’t recognize, you may be getting visits from crawlers that add no value to your site and simply consume bandwidth.
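To see which user-agents are actually hitting your site, you can tally them from the access log. The snippet below is a minimal sketch, assuming the common "combined" log format where the user-agent is the last double-quoted field on each line. The function name `count_user_agents`, the sample log lines, and the `ExampleScraper/1.0` agent are made up for illustration.

```python
import re
from collections import Counter

# Assumption: "combined" log format, where the user-agent is the
# last double-quoted field on the line.
USER_AGENT_RE = re.compile(r'"([^"]*)"\s*$')


def count_user_agents(lines):
    """Return a Counter mapping each user-agent string to its hit count."""
    counts = Counter()
    for line in lines:
        match = USER_AGENT_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Hypothetical sample lines; in practice, iterate over your own log file.
sample_log = [
    '192.0.2.10 - - [27/May/2008:10:00:00 +0000] "GET / HTTP/1.1" 200 512 "-" "Googlebot/2.1 (+http://www.google.com/bot.html)"',
    '192.0.2.20 - - [27/May/2008:10:00:05 +0000] "GET /a HTTP/1.1" 200 128 "-" "ExampleScraper/1.0"',
    '192.0.2.20 - - [27/May/2008:10:00:06 +0000] "GET /b HTTP/1.1" 200 128 "-" "ExampleScraper/1.0"',
]

counts = count_user_agents(sample_log)

# List the busiest user-agents first.
for agent, hits in counts.most_common():
    print(hits, agent)
```

Agents near the top of the list that you don’t recognize are the ones worth looking up before deciding whether to allow them.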

A comprehensive list of robots can help educate you about which crawlers are out there and which may bring you the most value.

You can also find a list of robots.txt commands and a robots.txt file generator.

As an example, the section below allows certain crawlers while shooing away others:

# For domain: http://www.domain.com

User-agent: Googlebot
Disallow:

User-agent: Googlebot-Image
Disallow:

User-agent: MSNBot
Disallow:

User-agent: Slurp
Disallow:

User-agent: Teoma
Disallow:

User-agent: Gigabot
Disallow:

User-agent: Scrubby
Disallow:

User-agent: Robozilla
Disallow:

User-agent: Nutch
Disallow:

User-agent: ia_archiver
Disallow:

User-agent: baiduspider
Disallow:

User-agent: yahoo-mmcrawler
Disallow:

User-agent: psbot
Disallow:

User-agent: asterias
Disallow:

User-agent: yahoo-blogs/v3.9
Disallow:

# Shoo
User-agent: *
Disallow: /
# "Disallow: /" already blocks every path for these crawlers, so the
# narrower rules below are redundant in this group
Disallow: /cgi-bin/
# Disallow: /images/ - uncomment line with correct path for images
# File exclusions
Disallow: /dir/Privacy-Policy
Disallow: /dir/Security

# Sitemap declaration
Sitemap: http://www.domain.com/sitemap.xml
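
Before deploying a file like this, it is worth checking that it does what you expect. The sketch below uses Python's standard `urllib.robotparser` module against a shortened version of the rules above. The helper `is_allowed` and the agent name `ExampleUnknownBot` are made up for illustration. Keep in mind that robots.txt is advisory: well-behaved crawlers honor it, but nothing forces a badly behaved one to.

```python
from urllib.robotparser import RobotFileParser

# Shortened version of the example rules: two named crawlers are
# allowed everywhere, every other crawler is blocked.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow:

User-agent: Slurp
Disallow:

User-agent: *
Disallow: /
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if the rules in robots_txt let user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# ExampleUnknownBot is a hypothetical crawler not named in the file.
for agent in ("Googlebot", "Slurp", "ExampleUnknownBot"):
    allowed = is_allowed(ROBOTS_TXT, agent, "http://www.domain.com/page.html")
    print(agent, "allowed" if allowed else "blocked")
```

Note that each `User-agent` line is immediately followed by its `Disallow` line. Blank lines separate records, and some parsers (including this one) drop a `User-agent` line that is followed by a blank line before any rule.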
