
by Franki

Robots.txt is hardly new; it is almost as old as the web itself. Having said that, it is very handy when it comes to making sure that the search engines only spider the parts of your site that you actually want to show up on search engines like Google, Yahoo, MSN, Altavista etc. You can make your own robots.txt file with a text editor, or you can use this handy online tool from Webtoolcentral. With reports coming in about security flaws and data mining happening via specially crafted search engine queries, it makes more sense than ever to limit what information people can dig out of search engine indexes. It can also be handy for limiting the bandwidth consumed by excessive spidering, as I found out yesterday.
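As a rough illustration of what such a file looks like, here is a minimal sketch that checks a small robots.txt with Python's standard urllib.robotparser module. The /private/ and /media/ paths and the example.com URLs are placeholders made up for this example, not paths from our servers.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: ask every compliant bot to stay out of two directories.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Disallow: /media/
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in ("http://example.com/index.html",
                "http://example.com/media/clip.mpg"):
        print(url, is_allowed(ROBOTS_TXT, "Googlebot", url))
```

The file itself is just the text inside ROBOTS_TXT, saved as robots.txt in the root of the site. The Python wrapper is only there to confirm that the rules do what you expect before you upload them.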

I’ve been trying to work out why so much of our normally sufficient bandwidth was suddenly getting used up for no immediately apparent reason. After much searching, tcpdumping and access_log watching, I discovered that one of our hosting clients had a huge directory of video and music files, some of which were 250MB in size. It turns out much of the traffic was actually search engine bots downloading them, presumably to add to one of the new video searching facilities the search engines have all jumped on. After I crafted a nice robots.txt and added code to my download manager program to block search engine bots from downloading the files, the bandwidth usage dropped dramatically. One of the worst offenders turned out to be ConveraMultiMediaCrawler, which showed up almost continuously in the access log. With my robots.txt and my modified downloader, none of the search engines can access those video and music files unless I configure things to allow it. Robots.txt may be old tech, but don’t let that make you think it isn’t a useful tool. It should be added that for it to work, the bot in question has to support the robots exclusion standard, but all the big ones do, and that lets you control where your information ends up.
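The download manager change described above could be sketched roughly as follows. This is not the actual code from my downloader, just a minimal Python illustration of the idea: look at the User-Agent header and refuse the download if it matches a known crawler. Only ConveraMultiMediaCrawler comes from the access log mentioned above; the other tokens, the function names and the status codes are assumptions for the example.

```python
# Hypothetical blocklist of lowercase User-Agent substrings.
# Only ConveraMultiMediaCrawler is named in the article; the rest are
# common crawler tokens added as illustrative assumptions.
BLOCKED_AGENT_TOKENS = (
    "converamultimediacrawler",
    "googlebot",
    "slurp",
    "msnbot",
)


def is_blocked_bot(user_agent):
    """Return True if the User-Agent string contains a blocked crawler token."""
    if not user_agent:
        return False  # No header sent: treat as an ordinary visitor here.
    lowered = user_agent.lower()
    return any(token in lowered for token in BLOCKED_AGENT_TOKENS)


def download_status(user_agent):
    """Return the HTTP status a download request would get: 403 or 200."""
    return 403 if is_blocked_bot(user_agent) else 200


if __name__ == "__main__":
    print(download_status("ConveraMultiMediaCrawler/0.1"))
    print(download_status("Mozilla/5.0 (Windows NT 6.1) Firefox/3.6"))
```

Unlike robots.txt, a check like this does not depend on the bot cooperating, although a bot that lies about its User-Agent will still get through.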





















