» Surpass Web Hosting Forums » Discussions » Shared Hosting » Robots.txt

July 2nd, 2004, 12:56 PM   #19
Some bots will ignore robots.txt files as they don't care if you want them on your web site or not.

These can be blocked by using a .htaccess file instead.

1. Block robots via .htaccess

We can't block by robot name here. Instead, we block robots by matching the beginning of their User-Agent string.


Code:
SetEnvIfNoCase User-Agent "^EmailSiphon" bad_bot
SetEnvIfNoCase User-Agent "^EmailWolf" bad_bot
SetEnvIfNoCase User-Agent "^ExtractorPro" bad_bot
SetEnvIfNoCase User-Agent "^CherryPicker" bad_bot
SetEnvIfNoCase User-Agent "^NICErsPRO" bad_bot
SetEnvIfNoCase User-Agent "^Teleport" bad_bot
SetEnvIfNoCase User-Agent "^EmailCollector" bad_bot

<Limit GET POST>
Order Allow,Deny
Allow from all
Deny from env=bad_bot
</Limit>
This example bans some spambots.

To block another robot, add a line for it near the top.
Code:
SetEnvIfNoCase User-Agent "^User-Agent" bad_bot
Replace User-Agent with the User-Agent string for this robot, as found in log files. Here's a sample log entry.

Code:
xyz.net - - [07/Mar/2003:11:28:35] "GET / HTTP/1.0" 403 - "-" "Teleport 1.28"
Here, the User-Agent is Teleport 1.28. The ^ character in the SetEnvIfNoCase lines tells our .htaccess file to block anything starting with the string we provide.

Any User-Agent starting directly with Teleport would be blocked, regardless of version number or added text.
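
To see why the ^ anchor matters, here is a rough Python sketch of the same matching rule. It only imitates what SetEnvIfNoCase does (a case-insensitive regular-expression match against the User-Agent header). The is_bad_bot function and the shortened pattern list are made up for illustration; Apache does not run any of this.

```python
import re

# A few of the patterns from the .htaccess example above.
BAD_BOT_PATTERNS = ["^EmailSiphon", "^EmailWolf", "^Teleport"]


def is_bad_bot(user_agent):
    """Return True if any pattern matches, ignoring case (like SetEnvIfNoCase)."""
    return any(re.search(p, user_agent, re.IGNORECASE) for p in BAD_BOT_PATTERNS)


print(is_bad_bot("Teleport 1.28"))           # True: starts with "Teleport"
print(is_bad_bot("teleport pro/1.29"))       # True: case is ignored
print(is_bad_bot("Mozilla/4.0 (Teleport)"))  # False: "Teleport" is not at the start
```

Without the ^, the third User-Agent would match too, which is why unanchored patterns block more than you might expect.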

I hope that helps BigJohn


John (Should that be LittleJohn?)
__________________
PASS10
http://www.fallcreektech.com
http://www.mediabuynet.com - My Lovely Wife's Site
Posted by johnaikin

July 3rd, 2004, 12:26 AM   #20
I have to agree on the MSNbot. :crash:
I have a question about the robots.txt file. Should it show up in the error logs in my site's stats?
__________________
Jeep Horizons - Pass51
California Jeeper - Pass51
Posted by Code3TJ

July 3rd, 2004, 11:11 AM   #21
Requests for your robots.txt should show up in your access log. They will only show up in your error log if you don't have a robots.txt file, because each request for it then fails with a 404.

Warning: More long-windedness to follow:

You can also use meta tags to manage robots to a certain extent.

However this method is much more limited.

You can't specify a robot to exclude this way, nor can you prevent access to directories in an easy manner.

Four directives can be used in the robots meta tag.


Directive   Meaning
INDEX       Index this page
NOINDEX     Do not index this page
FOLLOW      Follow/index links on this page
NOFOLLOW    Do not follow/index links


The example below tells the robot that it may index this page and follow the links on it.

<META NAME="ROBOTS" CONTENT="INDEX,FOLLOW">



To allow all robots to index a page but not follow links on it, use the html meta tag line shown below.

<META NAME="ROBOTS" CONTENT="INDEX,NOFOLLOW">



An optional extra is the revisit-after tag. It is not one of the robots directives above, and most popular search engines are not known to honor it.

<META NAME="revisit-after" CONTENT="15 days">



Replace 15 with the number of days after which you want the robot to visit your web site again.

These meta tags are not recognized by all search engines, so use a robots.txt file instead when possible.
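
For anyone curious how a robot might read these tags, here is a minimal Python sketch using only the standard library. It is an illustration, not how any particular search engine works. The RobotsMetaParser class and robots_policy function are hypothetical names, and the assumption that a missing tag means "index and follow" is the conventional default rather than something stated above.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect the directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names, so <META NAME=...> works.
        if tag != "meta":
            return
        attrs = dict(attrs)
        if (attrs.get("name") or "").lower() == "robots":
            content = attrs.get("content") or ""
            self.directives.update(
                part.strip().upper() for part in content.split(",") if part.strip()
            )


def robots_policy(html):
    """Return (may_index, may_follow); both are True unless a tag says otherwise."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return ("NOINDEX" not in parser.directives, "NOFOLLOW" not in parser.directives)


page = '<html><head><META NAME="ROBOTS" CONTENT="INDEX,NOFOLLOW"></head></html>'
print(robots_policy(page))  # (True, False)
```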


John
Posted by johnaikin

July 3rd, 2004, 11:38 AM   #22
Here is what I use to try to ward off spammers:

A combination of BigJohn's great tutorial on spamassassin at Make SPAM ASSASSIN work for you...

and robots.txt, Meta tags and the following .htaccess:
Options -Indexes

SetEnvIfNoCase User-Agent "^EmailSiphon" bad_bot
SetEnvIfNoCase User-Agent "^EmailWolf" bad_bot
SetEnvIfNoCase User-Agent "^ExtractorPro" bad_bot
SetEnvIfNoCase User-Agent "^CherryPicker" bad_bot
SetEnvIfNoCase User-Agent "^NICErsPRO" bad_bot
SetEnvIfNoCase User-Agent "^Teleport" bad_bot
SetEnvIfNoCase User-Agent "^EmailCollector" bad_bot
SetEnvIfNoCase User-Agent "^LinkWalker" bad_bot
SetEnvIfNoCase User-Agent "^Zeus" bad_bot
SetEnvIfNoCase User-Agent "^Alligator" bad_bot
SetEnvIfNoCase User-Agent "^autoemailspider" bad_bot
SetEnvIfNoCase User-Agent "^DA" bad_bot
SetEnvIfNoCase User-Agent "^Download Demon" bad_bot
SetEnvIfNoCase User-Agent "^Download Express" bad_bot
SetEnvIfNoCase User-Agent "^Download Wonder" bad_bot
SetEnvIfNoCase User-Agent "^DSurf" bad_bot
SetEnvIfNoCase User-Agent "^eCatch" bad_bot
SetEnvIfNoCase User-Agent "^EBrowse" bad_bot
SetEnvIfNoCase User-Agent "^ESurf" bad_bot
SetEnvIfNoCase User-Agent "^FileHound" bad_bot
SetEnvIfNoCase User-Agent "^Franklin Locator" bad_bot
SetEnvIfNoCase User-Agent "^FreshDownload" bad_bot
SetEnvIfNoCase User-Agent "^FSurf" bad_bot
SetEnvIfNoCase User-Agent "^Gamespy_Arcade" bad_bot
SetEnvIfNoCase User-Agent "^GetBot" bad_bot
SetEnvIfNoCase User-Agent "^GetRight" bad_bot
SetEnvIfNoCase User-Agent "^Go!Zilla" bad_bot
SetEnvIfNoCase User-Agent "^Go-Ahead-Got-It" bad_bot
SetEnvIfNoCase User-Agent "^HLoader" bad_bot
SetEnvIfNoCase User-Agent "^iGetter" bad_bot
SetEnvIfNoCase User-Agent "^Industry Program" bad_bot
SetEnvIfNoCase User-Agent "^InstallShield DigitalWizard" bad_bot
SetEnvIfNoCase User-Agent "^IUPUI Research Bot" bad_bot
SetEnvIfNoCase User-Agent "^JoBo" bad_bot
SetEnvIfNoCase User-Agent "^JOC Web Spider" bad_bot
SetEnvIfNoCase User-Agent "^Kapere" bad_bot
SetEnvIfNoCase User-Agent "^Larbin" bad_bot
SetEnvIfNoCase User-Agent "^LeechGet" bad_bot
SetEnvIfNoCase User-Agent "^LightningDownload" bad_bot
SetEnvIfNoCase User-Agent "^Mac Finder" bad_bot
SetEnvIfNoCase User-Agent "^Mail Sweeper" bad_bot
SetEnvIfNoCase User-Agent "^Mass Downloader" bad_bot
SetEnvIfNoCase User-Agent "^MetaProducts Download Express" bad_bot
SetEnvIfNoCase User-Agent "^Microsoft URL Control" bad_bot
SetEnvIfNoCase User-Agent "^Missauga Locate" bad_bot
SetEnvIfNoCase User-Agent "^Missauga Locator" bad_bot
SetEnvIfNoCase User-Agent "^Missouri College Browse" bad_bot
SetEnvIfNoCase User-Agent "^Mister PiX" bad_bot
SetEnvIfNoCase User-Agent "^MovableType" bad_bot
SetEnvIfNoCase User-Agent "^Mozi!" bad_bot
SetEnvIfNoCase User-Agent "^Mozilla/3.0 (compatible)" bad_bot
SetEnvIfNoCase User-Agent "^MSIE_6.0" bad_bot
SetEnvIfNoCase User-Agent "^FrontPage" bad_bot
SetEnvIfNoCase User-Agent "^NEWT ActiveX" bad_bot
SetEnvIfNoCase User-Agent "^Indy Library" bad_bot
SetEnvIfNoCase User-Agent "^WebCapture" bad_bot
SetEnvIfNoCase User-Agent "^DreamPassport" bad_bot
SetEnvIfNoCase User-Agent "^Email Extractor" bad_bot
SetEnvIfNoCase User-Agent "^DnloadMage" bad_bot
SetEnvIfNoCase User-Agent "^DTS Agent" bad_bot
SetEnvIfNoCase User-Agent "^HTTrack" bad_bot
SetEnvIfNoCase User-Agent "^MyGetRight" bad_bot
SetEnvIfNoCase User-Agent "^NetAnts" bad_bot
SetEnvIfNoCase User-Agent "^Nitro Downloader" bad_bot
SetEnvIfNoCase User-Agent "^Nutch" bad_bot
SetEnvIfNoCase User-Agent "^Offline Explorer" bad_bot
SetEnvIfNoCase User-Agent "^PagmIEDownload" bad_bot
SetEnvIfNoCase User-Agent "^pavuk" bad_bot
SetEnvIfNoCase User-Agent "^Program Shareware" bad_bot
SetEnvIfNoCase User-Agent "^Progressive Download" bad_bot
SetEnvIfNoCase User-Agent "^puf" bad_bot
SetEnvIfNoCase User-Agent "^PuxaRapido" bad_bot
SetEnvIfNoCase User-Agent "^Python-urllib" bad_bot
SetEnvIfNoCase User-Agent "^RealDownload" bad_bot
SetEnvIfNoCase User-Agent "^RepoMonkey Bait & Tackle" bad_bot
SetEnvIfNoCase User-Agent "^SiteSnagger" bad_bot
SetEnvIfNoCase User-Agent "^SmartDownload" bad_bot
SetEnvIfNoCase User-Agent "^SpeedDownload" bad_bot
SetEnvIfNoCase User-Agent "^SQ Webscanner" bad_bot
SetEnvIfNoCase User-Agent "^Stamina" bad_bot
SetEnvIfNoCase User-Agent "^Star Downloader" bad_bot
SetEnvIfNoCase User-Agent "^UdmSearch" bad_bot
SetEnvIfNoCase User-Agent "^URLGetFile" bad_bot
SetEnvIfNoCase User-Agent "^FileHeap! file downloader" bad_bot
SetEnvIfNoCase User-Agent "^UtilMind HTTPGet" bad_bot
SetEnvIfNoCase User-Agent "^WebAuto" bad_bot
SetEnvIfNoCase User-Agent "^webcollage" bad_bot
SetEnvIfNoCase User-Agent "^WebCopier" bad_bot
SetEnvIfNoCase User-Agent "^WebReaper" bad_bot
SetEnvIfNoCase User-Agent "^Website eXtractor" bad_bot
SetEnvIfNoCase User-Agent "^WebStripper" bad_bot
SetEnvIfNoCase User-Agent "^WebZIP" bad_bot
SetEnvIfNoCase User-Agent "^WEP Search 00" bad_bot
SetEnvIfNoCase User-Agent "^Wget" bad_bot
SetEnvIfNoCase User-Agent "^Wildsoft Surfer" bad_bot
SetEnvIfNoCase User-Agent "^WWWOFFLE" bad_bot
SetEnvIfNoCase User-Agent "^Xaldon WebSpider" bad_bot
SetEnvIfNoCase User-Agent "^ZBot" bad_bot


<Limit GET POST>
Order Allow,Deny
Allow from all
Deny from env=bad_bot
</Limit>

SetEnvIfNoCase Request_URI ban-ip\.txt ban

<Files ~ "^.*$">
order allow,deny
allow from all
deny from env=ban
</Files>

Now there is some other stuff going on in there. For example, notice the line about 6 lines from the bottom? That is part of a little trick to specifically ban the IP addresses of spammers and spambots, like this:

On my index page, very early, I have an invisible link that looks like this
Code:
<div style="display:none;"><a href="http://www.fallcreektech.com/sandtrap">sandtrap</a></div>
/sandtrap has an index file that looks like this:
Code:
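<?php
// Append the visitor's IP address to the ban list.
// $_SERVER['REMOTE_ADDR'] works without register_globals being on.
$ip = $_SERVER['REMOTE_ADDR'] . "\n";
$banip = '/home/fallcre/public_html/ban-ip/ban-ip.txt';
$fp = fopen($banip, 'a');
if ($fp) {
    fputs($fp, $ip);
    fclose($fp);
}
?>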
My robots.txt has an entry specifically relating to this (there are other entries, naturally):
User-agent: *
Disallow: /sandtrap
Disallow: /ban-ip

So, put all together, what I've done is use robots.txt to tell robots not to index /sandtrap. Any robot that isn't well behaved will go ahead and follow the first link it finds, which is the hidden link to /sandtrap.

Using php, I get the ip address of that robot and add it to my ban-ip file.

Then using .htaccess, any ip address in that ban-ip file won't be allowed access to my site.
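
As a quick check of the "well behaved" half of this, Python's standard urllib.robotparser can show what a polite robot would conclude from those robots.txt entries. This is only a sketch. "ExampleBot" is a made-up robot name, and the index.html URL is just an example page.

```python
from urllib.robotparser import RobotFileParser

# The robots.txt entries quoted above.
robots_txt = """\
User-agent: *
Disallow: /sandtrap
Disallow: /ban-ip
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

# A polite robot asks before fetching; the trap is off limits to it.
print(parser.can_fetch("ExampleBot", "http://www.fallcreektech.com/sandtrap"))    # False
print(parser.can_fetch("ExampleBot", "http://www.fallcreektech.com/index.html"))  # True
```

A robot that fetches /sandtrap anyway has ignored robots.txt, which is exactly what the trap is meant to detect.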

The line I referred to earlier
Code:
 SetEnvIfNoCase Request_URI ban-ip\.txt ban
and the entries under it are there to prevent people from seeing my banned IP list generated from this technique.
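
One gap worth noting: the .htaccess listing above never reads ban-ip.txt itself, and the post doesn't show how the recorded addresses become Deny rules. One possible bridge, purely as an assumption, is a small script that turns the file's contents into "Deny from" lines you can paste into .htaccess. The deny_lines function and the sample addresses (taken from the documentation-only IP ranges) are hypothetical.

```python
import ipaddress


def deny_lines(ban_file_text):
    """Turn one-IP-per-line text into sorted, de-duplicated 'Deny from' lines."""
    seen = set()
    for raw in ban_file_text.splitlines():
        raw = raw.strip()
        if not raw:
            continue
        try:
            seen.add(ipaddress.ip_address(raw))
        except ValueError:
            continue  # skip anything that is not a valid IP address
    ordered = sorted(seen, key=lambda ip: (ip.version, int(ip)))
    return ["Deny from %s" % ip for ip in ordered]


# Example ban-ip.txt contents: a repeat visitor and one junk line.
sample = "192.0.2.10\n192.0.2.10\nnot-an-ip\n198.51.100.7\n"
for line in deny_lines(sample):
    print(line)
```

The generated lines would sit next to the existing "Deny from env=bad_bot" line. Since the trap can also catch the odd real visitor or a search engine you care about, it's worth eyeballing the list before pasting it in.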

This .htaccess file also includes blocks for file downloaders, download managers, and other nastiness I've decided to keep away from my site.

Naturally, you have to keep an eye on your logs, because the various bots come and go quickly.

By using this combination of techniques, I've cut my spam from over 900 per day to under 20.

I hope this is helpful.

John

P.S.
I wish I could figure out why ISPs don't ban a list like this using httpd.conf. It would make everyone's life easier, and their email servers would sure be happy about it.
Posted by johnaikin

July 3rd, 2004, 3:25 PM   #23
Here's another great list...

I changed to SetEnvIfNoCase User-Agent ^bad nasty spambot getout as well...

Find an all inclusive list to ban harvesters 'n bad bots (and especially agents downloading your entire site) in a webmasterworld post by clicking here for Scooter's 2nd post down

Although I'm still having problems with order allow,deny stopping my php nuke ported 'hot or not' program from 'getimagesize' when trying to add images to nuke database by giving me a 403 or 404 warning..
__________________
Justin Sane Lost Vegas but not LOST in Vegas...
Posted by JustinSane

July 3rd, 2004, 4:02 PM   #24
JustinSane,

Thanks! That's a great shortcut to adding a lot of them.

John
Posted by johnaikin

July 3rd, 2004, 4:43 PM   #25
justinsane - I know it's a bit delayed, I'm still catching up on stuff from when I was out on vacation. Anyway, if you were hit by slash, you can forget about trying to get rid of all his stuff. He's also not outside the US. He is the one that taught me how to make my computer think that it's somewhere it isn't, or that it has something it doesn't. Thusly he can make it seem as though he's not in the US, though he is. He will most likely not hit your site again, it's not his style, though be prepared to be hit by someone leaving the name momapping (pronounced Moe-mapping). It's someone that likes to track slash and follow him around, though he only hits about half of the people that slash does.

I just wanted to warn you. Before you ask, no I do not know where he lives anymore. He moved a few months ago. I also don't have a full name or phone number.

- Dale?
__________________
Server: Pass
Website: MrCoolDale.com
Current Project: As Of Yet - Uniquely Me
Co-Winner: 2004 Surpassies - Ultimate Surpasser

Posted by MrCoolDale

July 3rd, 2004, 6:16 PM   #26
Quote:
Originally Posted by JustinSane
I changed to SetEnvIfNoCase User-Agent ^bad nasty spambot getout as well...

Find an all inclusive list to ban harvesters 'n bad bots (and especially agents downloading your entire site) in a webmasterworld post by clicking here for Scooter's 2nd post down

Although I'm still having problems with order allow,deny stopping my php nuke ported 'hot or not' program from 'getimagesize' when trying to add images to nuke database by giving me a 403 or 404 warning..
I can't view that. Can someone copy and paste it here? Thanks!
Posted by qwertykb

July 3rd, 2004, 6:51 PM   #27
Here it is:
Code:
<Files .htaccess> 
order allow,deny 
deny from all 
</Files> 

order allow,deny 
allow from all 
deny from 80.201.211.221 
deny from 193.165.185.50 
deny from 213.35.182.113 
deny from 232.80.35.168 

SetEnvIfNoCase User-Agent "Indy Library" getout 
SetEnvIfNoCase User-Agent "Full Web Bot" getout 
SetEnvIfNoCase User-Agent ^.*Demon getout 
SetEnvIfNoCase User-Agent ^About getout 
SetEnvIfNoCase User-Agent ^Active getout 
SetEnvIfNoCase User-Agent ^AnswerChase getout 
SetEnvIfNoCase User-Agent ^Ants getout 
SetEnvIfNoCase User-Agent ^Atom getout 
SetEnvIfNoCase User-Agent ^attach getout 
SetEnvIfNoCase User-Agent ^back getout 
SetEnvIfNoCase User-Agent ^BatchFTP getout 
SetEnvIfNoCase User-Agent ^BlitzBOT getout 
SetEnvIfNoCase User-Agent ^bloodhound getout 
SetEnvIfNoCase User-Agent ^brain getout 
SetEnvIfNoCase User-Agent ^Buddy getout 
SetEnvIfNoCase User-Agent ^Cartographer getout 
SetEnvIfNoCase User-Agent ^CherryPicker getout 
SetEnvIfNoCase User-Agent ^ChinaClaw getout 
SetEnvIfNoCase User-Agent ^clickgarden getout 
SetEnvIfNoCase User-Agent ^cosmos getout 
SetEnvIfNoCase User-Agent ^Crawl_Application getout 
SetEnvIfNoCase User-Agent ^Crawler getout 
SetEnvIfNoCase User-Agent ^Crescent getout 
SetEnvIfNoCase User-Agent HttpClient getout 
SetEnvIfNoCase User-Agent ^curl getout 
SetEnvIfNoCase User-Agent ^Custo getout 
SetEnvIf User-Agent ^DA getout 
SetEnvIfNoCase User-Agent ^DaviesBot getout 
SetEnvIfNoCase User-Agent ^DISCo getout 
SetEnvIfNoCase User-Agent ^DLExpert getout 
SetEnvIfNoCase User-Agent ^dnloadmage getout 
SetEnvIfNoCase User-Agent ^Drip getout 
SetEnvIfNoCase User-Agent ^eCatch getout 
SetEnvIfNoCase User-Agent ^Email getout 
SetEnvIfNoCase User-Agent "^Express WebPictures" getout 
SetEnvIfNoCase User-Agent ^Extractor getout 
SetEnvIfNoCase User-Agent ^EyeNetIE getout 
SetEnvIfNoCase User-Agent ^FileHound getout 
SetEnvIfNoCase User-Agent ^FlashGet getout 
SetEnvIfNoCase User-Agent ^flashsite getout 
SetEnvIfNoCase User-Agent ^flunky getout 
SetEnvIfNoCase User-Agent Frontpage getout 
SetEnvIfNoCase User-Agent ^gazz getout 
SetEnvIfNoCase User-Agent ^Genie getout 
SetEnvIfNoCase User-Agent ^Get getout 
SetEnvIfNoCase User-Agent ^Go!Zilla getout 
SetEnvIfNoCase User-Agent ^Go-Ahead-Got-It getout 
SetEnvIfNoCase User-Agent ^gotit getout 
SetEnvIfNoCase User-Agent ^Grafula getout 
SetEnvIfNoCase User-Agent ^gues getout 
SetEnvIfNoCase User-Agent ^HMVie getout 
SetEnvIfNoCase User-Agent ^htdig getout 
SetEnvIfNoCase User-Agent ^ia_archiver getout 
SetEnvIfNoCase User-Agent ^IBrowse getout 
SetEnvIfNoCase User-Agent ^IncyWincy getout 
SetEnvIfNoCase User-Agent ^ineta getout 
SetEnvIfNoCase User-Agent ^infoGIST getout 
SetEnvIfNoCase User-Agent ^InterGET getout 
SetEnvIfNoCase User-Agent "^Internet Ninja" getout 
SetEnvIfNoCase User-Agent ^IP?Works getout 
SetEnvIfNoCase User-Agent ^Iria getout 
SetEnvIfNoCase User-Agent ^iseeker getout 
SetEnvIfNoCase User-Agent ^Jack getout 
SetEnvIfNoCase User-Agent ^Java getout 
SetEnvIfNoCase User-Agent ^JetCar getout 
SetEnvIfNoCase User-Agent ^JoBo getout 
SetEnvIfNoCase User-Agent ^JOC getout 
SetEnvIfNoCase User-Agent ^JustView getout 
SetEnvIfNoCase User-Agent ^larbin getout 
SetEnvIfNoCase User-Agent ^leech getout 
SetEnvIfNoCase User-Agent ^LexiBot getout 
SetEnvIfNoCase User-Agent ^lftp getout 
SetEnvIfNoCase User-Agent ^libW getout 
SetEnvIfNoCase User-Agent ^Lifeboat getout 
SetEnvIfNoCase User-Agent ^likse getout 
SetEnvIfNoCase User-Agent ^Linkbot getout 
SetEnvIfNoCase User-Agent "^links sql" getout 
SetEnvIfNoCase User-Agent ^LncSoft* getout 
SetEnvIfNoCase User-Agent ^Lockstep getout 
SetEnvIfNoCase User-Agent ^lwp getout 
SetEnvIfNoCase User-Agent ^Magnet getout 
SetEnvIfNoCase User-Agent ^MARS getout 
SetEnvIfNoCase User-Agent ^Marvin getout 
SetEnvIfNoCase User-Agent ^Mass getout 
SetEnvIfNoCase User-Agent ^Mata.*Hari.* getout 
SetEnvIfNoCase User-Agent ^Memo getout 
SetEnvIfNoCase User-Agent ^Microsoft getout 
SetEnvIfNoCase User-Agent "^MFC Foundation" getout 
SetEnvIfNoCase User-Agent ^MIDown getout 
SetEnvIfNoCase User-Agent ^MIIxpc getout 
SetEnvIfNoCase User-Agent ^MindSpider getout 
SetEnvIfNoCase User-Agent ^Mirror getout 
SetEnvIfNoCase User-Agent ^Mister getout 
SetEnvIfNoCase User-Agent ^MOT-CF getout 
SetEnvIfNoCase User-Agent ^Mozzila/4* getout 
SetEnvIfNoCase User-Agent ^ms-catapult getout 
SetEnvIfNoCase User-Agent ^msproxy getout 
SetEnvIfNoCase User-Agent ^nabot getout 
SetEnvIfNoCase User-Agent ^Navman getout 
SetEnvIfNoCase User-Agent ^navroad getout 
SetEnvIfNoCase User-Agent ^NearSite getout 
SetEnvIfNoCase User-Agent ^Net getout 
SetEnvIfNoCase User-Agent ^NICErsPRO getout 
SetEnvIfNoCase User-Agent ^Nitro getout 
SetEnvIfNoCase User-Agent ^oBot getout 
SetEnvIfNoCase User-Agent ^Octopus getout 
SetEnvIfNoCase User-Agent ^Papa getout 
SetEnvIfNoCase User-Agent ^pc getout 
SetEnvIfNoCase User-Agent ^PingALink getout 
SetEnvIfNoCase User-Agent ^Pockey getout 
SetEnvIfNoCase User-Agent ^psbot getout 
SetEnvIfNoCase User-Agent ^Pump getout 
SetEnvIfNoCase User-Agent ^Recorder getout 
SetEnvIfNoCase User-Agent ^ReGet getout 
SetEnvIfNoCase User-Agent ^RepoMonke getout 
SetEnvIfNoCase User-Agent ^RMA getout 
SetEnvIfNoCase User-Agent ^Siphon getout 
SetEnvIfNoCase User-Agent ^site getout 
SetEnvIfNoCase User-Agent ^SlySearch getout 
SetEnvIfNoCase User-Agent ^Smart getout 
SetEnvIfNoCase User-Agent ^Snagger getout 
SetEnvIfNoCase User-Agent ^Snake getout 
SetEnvIfNoCase User-Agent ^SpaceBison getout 
SetEnvIfNoCase User-Agent ^Sqworm getout 
SetEnvIfNoCase User-Agent ^SuperBot getout 
SetEnvIfNoCase User-Agent ^SuperHTTP getout 
SetEnvIfNoCase User-Agent ^Surfairy getout 
SetEnvIfNoCase User-Agent ^Surfbot getout 
SetEnvIfNoCase User-Agent ^suzuran getout 
SetEnvIfNoCase User-Agent ^Szukacz getout 
SetEnvIfNoCase User-Agent ^tAkeOut getout 
SetEnvIfNoCase User-Agent ^Tateji getout 
SetEnvIfNoCase User-Agent ^Tcl getout 
SetEnvIfNoCase User-Agent ^Telesoft getout 
SetEnvIfNoCase User-Agent ^templeton getout 
SetEnvIfNoCase User-Agent ^test getout 
SetEnvIfNoCase User-Agent ^utopy getout 
SetEnvIfNoCase User-Agent ^Vacuum getout 
SetEnvIfNoCase User-Agent ^VoidEYE getout 
SetEnvIfNoCase User-Agent ^Web getout 
SetEnvIfNoCase User-Agent ^Wget getout 
SetEnvIfNoCase User-Agent ^Whacker getout 
SetEnvIfNoCase User-Agent ^WPF getout 
SetEnvIfNoCase User-Agent ^wwwhoosh getout 
SetEnvIfNoCase User-Agent ^Xaldon getout 
SetEnvIfNoCase User-Agent ^xget getout 
SetEnvIfNoCase User-Agent ^ZBot getout 
SetEnvIfNoCase User-Agent ^Zeus getout 
SetEnvIfNoCase User-Agent Alligator getout 
SetEnvIfNoCase User-Agent Bandit getout 
SetEnvIfNoCase User-Agent Collector getout 
SetEnvIfNoCase User-Agent Copier getout 
SetEnvIfNoCase User-Agent Download getout 
SetEnvIfNoCase User-Agent GetRight getout 
SetEnvIfNoCase User-Agent grab getout 
SetEnvIfNoCase User-Agent htmlgobble getout 
SetEnvIfNoCase User-Agent HTTrack getout 
SetEnvIf User-Agent iCab getout 
SetEnvIfNoCase User-Agent MSIECrawler getout 
SetEnvIfNoCase User-Agent naviscope getout 
SetEnvIfNoCase User-Agent Ninja getout 
SetEnvIfNoCase User-Agent Offline getout 
SetEnvIfNoCase User-Agent peakjet getout 
SetEnvIfNoCase User-Agent prozilla getout 
SetEnvIfNoCase User-Agent rapidcache getout 
SetEnvIfNoCase User-Agent realdownload getout 
SetEnvIfNoCase User-Agent Reaper getout 
SetEnvIfNoCase User-Agent robofox getout 
SetEnvIfNoCase User-Agent saver getout 
SetEnvIfNoCase User-Agent silentsurf getout 
SetEnvIfNoCase User-Agent ^spiderbot getout 
SetEnvIfNoCase User-Agent ^stamina getout 
SetEnvIfNoCase User-Agent Stripper getout 
SetEnvIfNoCase User-Agent Sucker getout 
SetEnvIfNoCase User-Agent tarspider getout 
SetEnvIfNoCase User-Agent Teleport getout 
SetEnvIfNoCase User-Agent thumbnavigator getout 
SetEnvIfNoCase User-Agent transsoft getout 
SetEnvIfNoCase User-Agent udmsearch getout 
SetEnvIfNoCase User-Agent utilmind getout 
SetEnvIfNoCase User-Agent w3mir getout 
SetEnvIfNoCase User-Agent weazel getout 
SetEnvIfNoCase User-Agent Widow getout 
SetEnvIfNoCase User-Agent www4mail getout 
SetEnvIfNoCase User-Agent WWWOFFLE getout 
SetEnvIfNoCase User-Agent hloader getout 
SetEnvIfNoCase User-Agent WebCapture getout 
SetEnvIfNoCase User-Agent EasyDL getout 
SetEnvIfNoCase User-Agent dloader getout 
SetEnvIfNoCase User-Agent "production bot" getout 
SetEnvIfNoCase User-Agent "full web bot" getout 
SetEnvIfNoCase User-Agent "demo bot" getout 
SetEnvIfNoCase User-Agent TECOMAC getout 
SetEnvIfNoCase User-Agent potbot getout 
SetEnvIfNoCase User-Agent npbot getout 
SetEnvIfNoCase User-Agent turnitinbot getout 
SetEnvIfNoCase User-Agent anarchie getout 
SetEnvIfNoCase User-Agent "Educate Search" getout 
SetEnvIfNoCase Referer iaea\.org getout 

<Limit GET POST> 
Order Allow,Deny 
Allow from all 
Deny from env=getout 
</Limit> 

Options -Indexes 

RewriteEngine on 
RewriteCond %{HTTP_REFERER} !^$
RewriteCond %{HTTP_REFERER} !^http://(www\.)?mydomain\.com.*$ [NC]
RewriteCond %{HTTP_REFERER} !^http://216\.239.*$ [NC]
RewriteCond %{HTTP_REFERER} !^http://images\.google.*$ [NC]
RewriteCond %{HTTP_REFERER} !^http://www\.google\..*$ [NC]
RewriteCond %{HTTP_REFERER} !^http://translate\.google\..*$ [NC]
RewriteCond %{HTTP_REFERER} !^http://babel\.altavista\..*$ [NC]
RewriteCond %{HTTP_REFERER} !^http://babelfish\.altavista\..*$ [NC]
RewriteCond %{HTTP_REFERER} !^http://world\.altavista\.com.*$ [NC]
RewriteCond %{HTTP_REFERER} !^http://www\.excite\.co.*$ [NC]
RewriteRule \.(jpg|JPG)$ http://www.mydomain.com/images/replace.gif [R,L]
Posted by Bigjohn