Want to buy the Web?
October 2, 2003 3:03 PM

Want to buy the Web? The whole thing? (scroll to bottom of page). Alexa now offers - for sale - the entire web, collected from their crawler, in a portable form: "For organizations capable of hosting or mining an entire crawl index that exceeds 60 Terabytes in size, Alexa can ship the contents of the crawl to your location. Current customers include the Internet Archive and the Library of Alexandria in Egypt. The web-wide crawl takes approximately 2 months to complete. It is over 60 Terabytes in size, spanning over 3.5 billion unique URLs." No price listed, but "If you have to ask..."
posted by kokogiak (16 comments total)
 
I wonder how long it's going to take for someone to start screaming about copyright infringement.
posted by mrbula at 3:25 PM on October 2, 2003


can i get this on floppies?
posted by efalk at 3:27 PM on October 2, 2003


not to mention that it's only what, 16% of the actual web?

"Color may vary as shown on box, not valid in Alaska, darkweb not included."

also, what mrbula said.
posted by dorian at 3:29 PM on October 2, 2003


People already bitch and moan about copyright infringement due to google groups (which was formerly dejanews). There have been many people who have posted many things to usenet news groups which they later regretted. I recall my first usenet post, the usenet reader I used asked me if I was sure I wanted to post this, that my message would be read far and wide etc. So, anyway, some whiners are upset that stuff they posted to the public is public knowledge. Did I say whiners yet?

Expect the same thing to happen at a point in the future when terabytes of data is something that fits on a low end consumer hard drive.
posted by substrate at 3:44 PM on October 2, 2003


Holy crap that's a lot of porn...
posted by dopamine at 3:47 PM on October 2, 2003


dnno, I think that as storage grows in magnitude so will bndwdth. I guess it could catch up one day, but I imagine that when we have cheap multi-terabyte consumer storage devices, internet service will be 100mbit or even 1000mbit, and 60 terabytes will no longer be nearly enough space to contain the entire web (vertical growth as well as the obvious horizontal...)

and yes, I do find it highly amusing (and somewhat painful) to search gooja for my earliest Usenet efforts.

hohhh. one. million. times.
posted by dorian at 3:51 PM on October 2, 2003


i want my cut. (but i'd settle for an elephant.)
posted by keswick at 4:09 PM on October 2, 2003


Anyone got a .torrent for this?
posted by inpHilltr8r at 4:37 PM on October 2, 2003


60 terabytes is smaller than I would have guessed. You can buy 300 GB drives these days. A while ago on Slashdot there was a guy who made a 1 TB RAID out of q-tips or something. Sixty of those and you've got the entire thing?
posted by tss at 5:07 PM on October 2, 2003


160 GB HDD ~$130

60,000 GB / 160 GB = 375 drives

375 drives x $130 = $48,750

Assuming a server can hold 25 HDDs, we also need 15 servers, plus network stuff etc. Assuming such a thing can be built for $1,000, that's an additional $15,000. Bulk-buy discounts probably apply as well, which makes the numbers even fuzzier.

So, my guess is it'll cost you change out of $70,000 to store it, at today's prices. They probably want (guessing) $250,000 to $500,000 for the data.

So, anyone got a spare half-million and a lot of basement space? :)

Now the irony is that 20 years ago, a new computer could be obtained, at some expense, with a 5MB hard drive. 10 years ago, a new computer came with a 60MB drive. Today, they come with 60GB drives. Is it unreasonable to think that in 10 years or so, they'll come with 60TB drives?
posted by aeschenkarnos at 5:57 PM on October 2, 2003
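The back-of-the-envelope estimate above can be reproduced with a few lines of Python. This is only a sketch of the commenter's arithmetic, using the comment's own figures (160 GB drives at about $130, 25 drives per server, $1,000 per server) and treating 60 TB as 60,000 GB; the function name and parameters are made up for illustration, and bulk discounts and networking gear are ignored.

```python
import math

def storage_cost(total_gb, drive_gb, drive_price, drives_per_server, server_price):
    """Rough hardware cost to hold total_gb of data (hypothetical helper).

    Returns (number of drives, number of servers, total dollars).
    """
    # Round up: a partially filled drive or server still has to be bought.
    drives = math.ceil(total_gb / drive_gb)
    servers = math.ceil(drives / drives_per_server)
    total = drives * drive_price + servers * server_price
    return drives, servers, total

# Figures taken from the comment above; 60 TB is treated as 60,000 GB.
drives, servers, total = storage_cost(60_000, 160, 130, 25, 1_000)
print(drives, servers, total)  # 375 15 63750
```

That comes to $63,750 for the hardware, which fits the "change out of $70,000" guess. Swapping in other drive sizes or prices only means changing the arguments.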


Sixty TB is much smaller than I had expected; my understanding was that the Internet Archive was coming up on a petabyte.

At the Library of Congress you can see a display of four monitors stacked vertically, with a huge RAID array going up the side, which flashes web sites at random -- the archive dates from around 1997, IIRC.
posted by IshmaelGraves at 8:55 PM on October 2, 2003


I don't think Jack Valenti is gonna like this one bit.
posted by soyjoy at 9:13 PM on October 2, 2003


Sheesh - no need to spend that amount of cash! I'm cheerfully offering the contents of my 20 gig portable firewire for a mere $10,000. Enjoy a bunch of work projects, pictures I still haven't printed from Easter, and every dumb mpeg I've downloaded in the last year and a half.

Act now and I'll throw in a cable and (un)leather carrying case....for FREE!
posted by jalexei at 5:09 AM on October 3, 2003


Once I buy the Web I'm kicking everybody else out...
posted by Shane at 5:54 AM on October 3, 2003


"dnno, I think that as storage grows in magnitude so will bndwdth. I guess it could catch up one day, but I imagine that when we have cheap multi-terabyte consumer storage devices, internet service will be 100mbit or even 1000mbit, and 60 terabytes will no longer be nearly enough space to contain the entire web (vertical growth as well as the obvious horizontal...)"

Storage is growing much much faster than the availability of bandwidth. I think in 5-10 years that an "internet subscription" will mean you get it delivered to your door every morning.
posted by straight at 6:52 AM on October 3, 2003


They should just ship new hard drives with the Internet already on them...
posted by kerplunk at 3:43 AM on October 4, 2003



