Friday, July 24, 2009

How robots.txt can sometimes be useful.

What's new from Google? No doubt, most internet users would be eager to click any article with such a heading.
Indeed, I am one of them. ;)

At this point, I recall a moment when I won a bet of good food. Six years back, I managed to gain access to a friend's Yahoo account. It was sheer luck: I knew his personal information, and the security question was very easy to answer. That was all it took. This friend of mine recollected the incident a few years back, and that is when I got the idea of a security check against such hacks.
My idea was to record the previous login time (and make it non-editable).
Finally, Google has come up with a similar feature in Gmail (showing the last login time, IP address, and session length). Storing the IP address is an especially good idea, since it gives you a rough idea of where the intruder is located. A feature useful enough to make sure no one else is reading your (personal) mail. And this is one reason why I like Gmail.

Well, coming to the actual point: I was searching for a robots.txt parameter to limit how often a bot can scan my site. Request-rate is the setting I can use (note that it, like Crawl-delay, is a non-standard extension that only some crawlers honor).
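To see how such a robots.txt would be read, here is a minimal sketch using Python's standard-library `urllib.robotparser`. The file contents below are hypothetical, and as noted, Request-rate and Crawl-delay are non-standard directives, so honoring them is up to each crawler.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for a site that wants at most
# one request every 5 seconds (Request-rate and Crawl-delay
# are non-standard extensions; not all bots respect them).
robots_txt = """\
User-agent: *
Crawl-delay: 10
Request-rate: 1/5
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

# A well-behaved crawler would check these before fetching:
print(parser.can_fetch("*", "/private/page.html"))  # False
print(parser.can_fetch("*", "/public/page.html"))   # True
print(parser.crawl_delay("*"))                      # 10
rate = parser.request_rate("*")                     # RequestRate(requests=1, seconds=5)
print(rate.requests, rate.seconds)
```

Of course, this only shows how a polite bot would interpret the file; a scraper that ignores robots.txt is unaffected, which is why rate limits on the server side are the real enforcement.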
While reading a few articles, Google's own robots.txt (http://www.google.com/robots.txt) caught my eye. It surprised me that Google disallows crawling of most of its own content. But the detour was no wasted effort: it is how I came to know of Google Ventures, which is useful for entrepreneurs.
So robots.txt gives us some useful information too (if not to robots).
Perhaps I will gain some useful information from another site's robots.txt one day.

Many lines in Google's robots.txt, but most prefixed with a Disallow directive.
No harm; I am no robot to be disallowed :P
