You can't count on the web, okay?
October 15, 2015 2:39 PM

The web, as it appears at any one moment, is a phantasmagoria. It’s not a place in any reliable sense of the word. It is not a repository.

The promise of the web is that Alexandria’s library might be resurrected for the modern world. But today’s great library is being destroyed even as it is being built. Until you lose something big on the Internet, something truly valuable, this paradox can be difficult to understand.
posted by pjern (32 comments total) 29 users marked this as a favorite
 
THE CROSSING: ECHOES, By Kevin Vaughan
posted by the man of twists and turns at 2:44 PM on October 15, 2015 [3 favorites]


All those websites will be lost, in time, like tears in rain. Time...to log off.
posted by The Card Cheat at 2:55 PM on October 15, 2015 [13 favorites]


Wonder what percentage of Metafilter article links still work.
posted by ymgve at 2:57 PM on October 15, 2015 [7 favorites]


In China in the Warring States period, there was a golden age of philosophy and learning called the era of the Hundred Schools of Thought. When the first emperor of the Qin took the throne and there was again one ruler for all of China, he burned the books of all competing schools so that only the one he followed should survive. Copies of the rest were preserved in the imperial library, so that high officials might access them.

The imperial library burned to the ground when the Qin dynasty fell, 15 years later.

Perhaps we are destined to be our own weevils.
posted by Diablevert at 2:59 PM on October 15, 2015 [10 favorites]


Weevils waffle, but they don't fall down.
posted by It's Raining Florence Henderson at 3:01 PM on October 15, 2015 [3 favorites]


You can't count on the web, okay?

Sure I can: one, two, three, four...
posted by Sangermaine at 3:06 PM on October 15, 2015


More reason to support the Internet Archive I guess.
posted by dilaudid at 3:21 PM on October 15, 2015 [4 favorites]


The initial promise of the web was that people could share information freely, but the desire to monetize that information means that making it difficult to archive on another site falls into the category of feature, not bug. That Atlantic article itself is full of dynamic loading, embedded ads, etc. that clutter up the page and require extra steps to save the content itself. I'm sure archive.org is always adding to its array of techniques for saving these dynamic pages, but just like the computer security playing field is tilted heavily in favor of the attackers, the battle over free content that can be archived vs. walled gardens with ephemeral, hard-to-extract content is tilted heavily in favor of the latter.
posted by tonycpsu at 3:23 PM on October 15, 2015 [6 favorites]


...And on the pedestal these words appear:
"You've made it: Jam Central Station,
the central depository for all things Space Jam."
Nothing beside remains. Round the decay
Of that colossal wreck, boundless and bare,
The lone and level sands stretch far away.
posted by prize bull octorok at 3:32 PM on October 15, 2015 [32 favorites]


This is one of the reasons I use Evernote -- I can suck down web pages for future reference.

Of course, Evernote is a company that's subject to the vagaries and panics of the U.S. market. Which is why I'm also looking into OneNote as a duplicate of that. And wondering how crazy it would be to print out this stuff too ...
posted by sobell at 3:33 PM on October 15, 2015


I think people too often fail to distinguish between saving knowledge or information and saving data. Not that saving data is of no importance at all, but the more noise you save, the harder it is to actually find signal. It can be useful, certainly, to have access to such a thing, but a tool that saved every version of every static page on the web forever would eventually just make it harder to find that important thing you meant to read five years ago. Keeping things that are important is saving. Keeping everything is hoarding. Occasionally losing something of some value as a price for curation is, to me, very reasonable. I mean, I'm on Metafilter for a reason.
posted by Sequence at 3:38 PM on October 15, 2015 [6 favorites]


One of the challenges with preserving "The Crossing" was its use of Flash for the multimedia parts. I wonder if multimedia articles created with HTML 5 will be more stable than Flash because it's a standard, like, would we be able to access "Snow Fall" in ten years if it wasn't from the New York Times?
posted by Small Dollar at 3:40 PM on October 15, 2015


I've been feeling kind of disillusioned with technological impermanence already... I guess I need to read this article to really bring myself down. :(

RE: Evernote, Evernote's pretty good at letting you get your data off it. It's not in a very usable form, but it's all at least theoretically there, and could be parsed out of the .enex files by any moderately competent programmer.

Maybe I'll start Evernote-clipping information I care about on web sites instead of just bookmarking it....
posted by edheil at 3:45 PM on October 15, 2015 [1 favorite]
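
For anyone curious what "parsed out of the .enex files" might look like, here is a minimal sketch. It assumes an .enex export is XML with a root `en-export` element holding `note` elements, each with a `title`, a `content` body, and a `source-url` under `note-attributes`. The sample data and the `extract_notes` helper are made up for illustration, and a real export may carry more fields than this handles.

```python
import xml.etree.ElementTree as ET

# Hypothetical, cut-down sample of an Evernote export (assumed layout).
SAMPLE_ENEX = """<en-export>
  <note>
    <title>Saved article</title>
    <content><![CDATA[<en-note><div>Body text of the clipped page.</div></en-note>]]></content>
    <note-attributes>
      <source-url>http://example.com/article</source-url>
    </note-attributes>
  </note>
</en-export>
"""


def extract_notes(enex_text):
    """Return a list of dicts with the title, source URL and raw body of each note."""
    root = ET.fromstring(enex_text)
    notes = []
    for note in root.iter("note"):
        notes.append({
            "title": note.findtext("title", default=""),
            "source_url": note.findtext("note-attributes/source-url", default=""),
            # The body is markup stored as text; it is kept as-is here.
            "content": note.findtext("content", default=""),
        })
    return notes


if __name__ == "__main__":
    for n in extract_notes(SAMPLE_ENEX):
        print(n["title"], "<-", n["source_url"])
```

The body comes back as a string of markup, so getting readable text out of it would take a second parsing pass.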


Sobell, I feel bad mentioning it but I've seen Evernote explicitly called out in several recent tech press articles on "dead unicorns" eg, companies which currently have funding but are no longer growing enough to justify their extremely high valuations and therefore might be headed the way of the dodo...
posted by Diablevert at 3:47 PM on October 15, 2015


And if you pay for an annual subscription, EN will let you search those stored pages! I'm pointing this out because I like and use Evernote, but how much is its longevity determined by its funding model? Is it even reasonable to expect the most fantastic service to exist past some unfathomable economic cycle if it's not a public utility?
posted by sneebler at 3:51 PM on October 15, 2015


I've seen Evernote explicitly called out in several recent tech press articles on "dead unicorns" eg, companies which currently have funding but are no longer growing enough to justify their extremely high valuations and therefore might be headed the way of the dodo...

Sentences like that make me want to pitch my laptop into the sea and take up farming or something. If the tech economy is structured so that useful infrastructure services like Evernote (or Twitter) can't exist without impossible perpetual-motion-machine growth, the tech economy is going to collapse and take everything else down with it.
posted by zjacreman at 3:57 PM on October 15, 2015 [9 favorites]


So I got bored, and decided to take a look at the first 25 articles posted on Metafilter:

Cat-Scan.com is one of the strangest sites I've seen in some time. I have no idea how these people got their cats wedged into their scanners, or why.
❌ The original cat-scan.com is gone, site has fittingly enough been taken over by MetaFilter Memories

As if you couldn't get enough of the JenniCam, now there's the JenniShow. It's fairly boring, more like a video diary than a voyeur's dream.
❌ jennicam.org and thesync.com are totally gone now

The hype machine for Apple's new Consumer Portable appears to be in full swing.
❌ Link only redirects to front page of the San Jose Mercury News, which I assume is the paper that had the original article

The world's smallest Web server keeps getting smaller.
❌ The world's smallest webserver is now so small it is 404 not found

EXPN is ESPN/ABC/GoNetworks/DisneyEmpire's newest attempt to exploit the existence of extreme sports. I've spent most of my life skateboarding, but I still don't know how much of it I could watch on TV. I'm sure it would get old after a few weeks.
✔ EXPN got renamed to X Games, link still works

The Death Clock No one knows when they're going to die -- until now! Thanks to the web, you can visit The Death Clock, put in some data about yourself and see what day, assuming you survive the apocalypse later this year, you'll bite the big one on! Fun!
✔ Ironically, still alive and kicking!

Can't remember what that font was on your last project? Looking for a font that's grunge, or one that's high tech looking? Point Central has the best font database I've seen.
❌ point-central.com is gone and taken over by domain squatters

Hello Tarot is the world's cutest tarot card deck. Lord knows I want one.
❌ sizer.org is gone and taken over by domain squatters

Yahoo now has co-branding arrangements going on. I hate to see this trend continue, as I see it diluting the reputation of both companies. What's Pepto-Bismol got to do with teens having a fun summer? Anyone care to make the connection for me? What's next, Geritol sponsoring a site aimed at toddlers?
❌ Thankfully, pepto.yahoo.com is totally gone

The recent World Cup Soccer Tournament was one small step forward for women in sports, but this is one giant leap behind
❌ Amazon.com link, but doesn't work for some reason. Not because it's old, see the next entry

I was just thinking that it would be really funny if the Recording Industry Association of America decided to photocopy this book and give it away, just to get back at Justin Frankel.
✔ riaa.com still works, of course. Amazon link to "MP3 Power! With Winamp", a book written by the Winamp and Napster authors

Steve Jobs will be broadcast in realtime from MacWorld in New York next week. It will be available at numerous sites around the country, mostly at college campuses. It will be playing at a building on my campus, so I just might check it out.
❌ Apple is still around, Steve Jobs is not. Neither is this link which is now a 404

Misc. Media zine has a nice article about weblogs.
❌ Misc. Media is still here, but the link is a 404

I wonder if Microsoft's acquittal in the Bristol case is a bad omen for the DOJ.
❌ Another San Jose Mercury News link that doesn't work anymore

This Futurama fan site is better than anything Fox has to offer online about their show. Rather than unleash their lawyers on these fans, they should be sending them paychecks. Among the many gems on the site are these remixes of the show's sounds.
❌ futuramaoutlet.com is gone and taken over by domain squatters

Would you buy a house from someone called "Rambo?"
❌ rambogetsitdoneright.com is totally gone

I found this site linked from a mom-n-pop design shop's awards page. Not only has this company stolen the Point Survey's 5% graphic, they're also using the old C|net background. Very original.
✔ Fake it till you make it apparently works. They got bought out in 2004

Screw the $800 Aeron chair I've coveted for so long, I need one of these.
❌ hmstore.com and netsurfer.fi are gone

Yet another target for Apple's Lawyers is this new iMac ripoff. It looks pretty rough, like plexiglass glued around a monitor.
❌ I guess Apple's lawyers won. The E-One computer is no more and neither are its links

Of course, I want one of these, but more exciting than that is the new wireless LAN. At work, we have a 1Mbps wireless LAN that cost thousands, but this looks 10x faster at 1/5 the cost.
✔ The Apple iBook is now called MacBook, but Airport kept its name. Both links redirect to their respective modern counterparts.

Revenge is sweet. I just wish this company would do a partnership with SpamCop, so I could report spam and send the spammer some crap in the mail, in one convenient place.
✔ dogdoo.com is still around, ready to fill all your fake dog poo needs. spamcop.net is still around too.

Hello AOL? It's Microsoft, we've come to crush your Instant Messenger.
❌ MSN Messenger shut down last year. Link doesn't work, doesn't even redirect to Skype.

Hopefully the 1996 Cat Olympics organization will be holding their olympics in Sydney next year. If so, my cats will start training soon, however, they're already in top form for most all events.
❌ First comment is "Cool. Link still works three years later. What if they had a thread about Cat Olympics and nobody came?". Website is down now, though.

This is a great discussion of color on the web, with a focus on the limits of computers.
? Link might still work. colorcom.com seems to have problems now, but Google's cache of the main site is less than a month old

This is the worst web navigation I've ever seen at a university web site. It's so user-unfriendly that they have Search above all options. It also happens to be my alma mater, so it's doubly sad to see.
✔ www.ucr.edu is still going strong, and apart from a Flash slideshow of current events, the page looks acceptable now.

So, a whopping 7 (maybe 8) out of 25 links still point to the content intended by the original site owners; the others are either broken, dead, or point to a main page instead of a specific article. That's...bad.

(Of course, these are 15-year-old links. I assume the ratio gets better the closer one gets to the present.)
posted by ymgve at 4:00 PM on October 15, 2015 [39 favorites]
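
As a quick check on the arithmetic, here is a small sketch that turns the marks in the list above (7 ✔, 1 ?, 17 ❌) into a survival percentage. The `survival_range` helper and the dictionary keys are just illustrative names.

```python
def survival_range(results):
    """Return (total, low, high): the share of working links, without and with the uncertain ones."""
    total = sum(results.values())
    low = results["working"] / total
    high = (results["working"] + results["uncertain"]) / total
    return total, low, high


# Counts taken from the marks in the list of the first 25 links.
tally = {"working": 7, "uncertain": 1, "dead": 17}

if __name__ == "__main__":
    total, low, high = survival_range(tally)
    print(f"{total} links: {low:.0%} to {high:.0%} still work")
```

So somewhere between 28% and 32% of those links survived, depending on how the colorcom.com one is counted.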


It can be useful, certainly, to have access to such a thing, but a tool that saved every version of every static page on the web forever would eventually just make it harder to find that important thing you meant to read five years ago. Keeping things that are important is saving. Keeping everything is hoarding. Occasionally losing something of some value as a price for curation is, to me, very reasonable. I mean, I'm on Metafilter for a reason.

I disagree vehemently. Curation is done in the present. We do not know which knowledge or data will be valuable in the future.
posted by zsazsa at 4:10 PM on October 15, 2015 [2 favorites]


It is not a repository.

It's a suppository.
posted by Chitownfats at 4:29 PM on October 15, 2015


metafilter: a constantly changing patchwork of perpetual nowness.
posted by rude.boy at 4:40 PM on October 15, 2015 [2 favorites]


This happens to my old MeFi posts way more often than I'd like. So much care putting them together, then within a year or two the most important links break forever. I'd love a way to update posts to keep them relevant to future readers.

Also, I recently discovered Reddit has a hidden memory hole problem. Its interface seems straightforward enough, and threads remain readable for years. But the backend has a critical limitation -- it is incapable of displaying more than 1000 of any one piece of data. So you can only see your most recent 1000 comments/posts, subreddit listings stall out after 1000 entries, and third-party stats trackers and search engines can only consider the most recent 1000 items. Even a user's list of saved posts -- the database most necessary to preserve -- starts forgetting the oldest posts after you save the 1001st. You can theoretically still read all these memory-holed posts, but not without bookmarking them or getting lucky with a Google search.

It infuriated me when Digg blew up their archives in the redesign, and now its successor is doing it too (just more discreetly).
posted by Rhaomi at 5:08 PM on October 15, 2015 [5 favorites]


reminder that pinboard.in lets you archive a copy of each page you bookmark for a small additional fee ($25/yr)
posted by cfraenkel at 5:14 PM on October 15, 2015


A year or two? I was looking up a post from two or three months ago the other day, and when I found it, one of the links was already 404.
posted by indubitable at 5:17 PM on October 15, 2015 [2 favorites]


Here it is. Very first link is 404.
posted by indubitable at 5:20 PM on October 15, 2015 [1 favorite]


For what it's worth, Library of Congress works with Internet Archive to preserve their stuff on our system. Most of the stuff is old congressional pages, and we focus more on collections than just specific sites, like "September 11, 2001 Web Archive". Hopefully this will temper the inevitable disappearance of these collections, to say nothing of who gets to decide what's important enough to preserve.
posted by numaner at 5:38 PM on October 15, 2015 [1 favorite]


it is incapable of displaying more than 1000 of any one piece of data

This is a more common problem than you'd think. Search indexes get progressively slower at retrieving large page numbers because of how much more data they have to sort through to find a match, which eats up more and more cache/temp memory. More likely, their system recognizes when a single client is requesting too much data and cuts it off, so to the client it just looks like the server is hanging.
posted by numaner at 5:42 PM on October 15, 2015
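
A rough sketch of why deep pages get expensive, assuming a plain offset-based listing over items sorted newest first. This is not Reddit's actual backend; `offset_page`, `capped_page`, and the 1000-item cap are illustrative stand-ins for the behaviour described above.

```python
def offset_page(items, page, per_page=25):
    """Return one page of results plus how many rows had to be walked to reach it."""
    start = page * per_page
    # An offset-based listing has to step over every earlier row before
    # it can hand back the page, so the work grows with the page number.
    rows_scanned = min(len(items), start + per_page)
    return items[start:start + per_page], rows_scanned


def capped_page(items, page, per_page=25, cap=1000):
    """Same listing, but anything past the first `cap` items is simply not served."""
    return offset_page(items[:cap], page, per_page)


if __name__ == "__main__":
    posts = list(range(5000))  # stand-in for 5000 posts, newest first
    for page in (0, 100):
        _, scanned = offset_page(posts, page)
        print(f"page {page}: walked {scanned} rows")
    rows, _ = capped_page(posts, 40)
    print(f"capped listing, page 40: {len(rows)} rows returned")
```

With a cap in place, the older items still exist in storage but no page of the listing can ever reach them, which matches the "memory hole" effect: the data is there, just not browsable.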


"dead unicorns" are companies that are "only" valued at some number of hundreds of millions of dollars versus a billion plus. No one cares besides the Silicon Valley echo chamber.
posted by sideshow at 6:18 PM on October 15, 2015


More reason to support the Internet Archive I guess.

Not a great solution, because content creators can request anything they don't want in it removed. Can leave a lot of holes in the archive for people who really want to use it for historical purposes.
posted by Drinky Die at 7:45 PM on October 15, 2015


So, a whopping 7 (maybe 8) out of 25 links still point to the content intended by the original site owners; the others are either broken, dead, or point to a main page instead of a specific article. That's...bad.

I've done an annual Halloween blog post for, what, nearly a decade now...and this year, I decided it wasn't worth it to link any of the earlier posts, because so many of the links are now inhabiting some internet equivalent of a haunted house.

More practically, there's considerable push to get instructors to use online texts to save students $. Which I'm all for, and do whenever possible (despite frequent issues with quality control, lack of annotation, &c.). But online text repositories keep vanishing. Similarly, many university-hosted sites go kaflooey (important technical term) when the associated faculty or graduate students go elsewhere. And don't get me started on GoogleBooks' habit of randomly limiting access to texts that are decades and/or centuries out of copyright.
posted by thomas j wise at 7:50 PM on October 15, 2015 [2 favorites]


Sobell, I feel bad mentioning it but I've seen Evernote explicitly called out in several recent tech press articles on "dead unicorns" eg, companies which currently have funding but are no longer growing enough to justify their extremely high valuations and therefore might be headed the way of the dodo...

Hence my mentioning of using OneNote -- which is a Microsoft product -- as a simultaneous archive and backup. Trust me, plenty of people have already apprised me of Evernote's recent business press coverage. Microsoft products are far less vulnerable at this point, what with the company's standing plus its recent strategic repositioning as an aggressively cloud-based company.
posted by sobell at 10:11 PM on October 15, 2015


Neocities, a free web host and internet community that aims to resurrect the lost spirit of Geocities, recently announced that they're supporting the InterPlanetary File System. IPFS is a protocol that aims to make the web distributed in a way that files are never located on a single point of failure, and are simultaneously backed up on many servers. Sadly this article didn't mention it.
posted by Apocryphon at 10:00 AM on October 16, 2015 [4 favorites]


ymgve: colorcom.com seems to be working now and the link 404s.
posted by BiggerJ at 12:52 AM on October 17, 2015



