Is Grub out of control? Barely more than a week old, the distributed search engine is already causing headaches. It does not properly follow the Robot Exclusion Standard and thus spiders sites against their owners' wishes. Because it is a distributed client run by thousands of volunteers (and therefore connects from many different IP addresses), it is non-trivial to block. The Wikipedia project, for example, is experiencing slowdowns because of it. Let's hope they can solve these problems, as the idea seems to be quite cool.
posted by Eloquence on Apr 23, 2003 - 7 comments
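
For readers unfamiliar with the Robot Exclusion Standard mentioned in the post above: a well-behaved crawler reads a site's robots.txt and checks each URL against it before fetching. Below is a minimal sketch of that check using Python's standard `urllib.robotparser`. The robots.txt contents, the `grub-client` user-agent token, the `otherbot` name, and the example.org URLs are assumptions made up for illustration, not taken from Grub's actual client.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt a site owner might publish.
# The "grub-client" token is an assumed user-agent name, used only for illustration.
ROBOTS_TXT = """\
User-agent: grub-client
Disallow: /

User-agent: *
Disallow: /private/
"""


def make_checker(robots_txt: str) -> RobotFileParser:
    """Build a parser from robots.txt text without touching the network."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def allowed(parser: RobotFileParser, user_agent: str, url: str) -> bool:
    """Return True if the rules permit this user agent to fetch the URL."""
    return parser.can_fetch(user_agent, url)


checker = make_checker(ROBOTS_TXT)

if __name__ == "__main__":
    for agent in ("grub-client", "otherbot"):
        for url in ("http://example.org/index.html",
                    "http://example.org/private/page.html"):
            print(agent, url, allowed(checker, agent, url))
```

A crawler that skips this check, or applies it incorrectly, will fetch pages the owner asked it to leave alone. Because the rules match on the user-agent string rather than on IP address, they remain the practical way to turn away a client that connects from thousands of volunteer machines, provided the client honors them.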
Grub: The SETI@home of search engines? According to the New Scientist: "A distributed computing project called Grub, which harnesses individual users' spare computing power and internet bandwidth, began cataloguing millions of web pages this week." Grub has thus launched before HyperBee, a similar distributed search project. This link was previously posted on MeFi when it was still in the conceptual stage. The project is being run by LookSmart (along with its own open directory project called Zeal), but as the New Scientist article notes: "Website information collected by Grub is already being fed into one of LookSmart's search services, called WiseNut. But the collected data are also freely accessible to the public, so they can be incorporated into any web site or desktop application."
Possible Google competition or doomed from the start?
posted by talos on Apr 21, 2003 - 10 comments
Interesting idea, but will it work? "Grub provides a free for download, distributed crawling client, which is used to create an infrastructure (database + volunteers) that will eventually provide URL update status information for nearly every web page on the Internet. Grub's distributed crawler network will enable websites, content providers, and individuals to notify others that changes have occurred in their content, all in real time."
posted by sixdifferentways on May 18, 2001 - 0 comments