Delete, Baby, Delete.
May 6, 2002 2:50 PM

Delete, Baby, Delete. I really enjoyed this short article from this month's Atlantic Monthly about the misunderstandings of document/records destruction. Some of the events discussed are the Iranian reconstruction of documents shredded at the U.S. embassy and printed under the title Documents From the U.S. Espionage Den, the destruction of the Library at Alexandria and of course the Enron/Andersen document destruction.

It got me to thinking about cached web pages and the fact that you have to make sure Google doesn't cache your page if you don't want a permanent record there. It seems like no matter what you do on the web, odds are it's saved somewhere, whether it's Google, the Wayback Machine or any other projects that I don't know about. If you wanted to entirely erase something you did last year on the web, what would you do?
posted by jonah (20 comments total)
you can contact google and have them remove your cache, i believe. but otherwise, there's not a whole lot. you've just got to accept that whatever's on the web is public record. that doesn't mean anyone gives a flying fuck that you write, but someone always could.
posted by moz at 2:55 PM on May 6, 2002

I spent a couple hours this weekend trying to counter link rot in my archived blog entries with the Wayback Machine and Google's cache. I think I was about 50% successful, but I ended up 100% frustrated when the Wayback Machine started coughing up hairballs and random pages in response to queries.

Bah. Maybe I'll give it another go later.
posted by NortonDC at 3:03 PM on May 6, 2002

So much can be recovered from intentionally deleted files, yet so little when the deletion is accidental. Is there a Murphy's Law for this?
posted by HTuttle at 3:17 PM on May 6, 2002

Password protect your site. Put the password in plain view on the front page. Automated archives will be defeated. Sure, it will decrease your page views, but people really interested in your content will follow through.

Or, hey, don't sweat it and let the archives be.
posted by fleener at 3:49 PM on May 6, 2002

You can ask Google to remove stuff from its cache, including old Usenet posts (provided that you can demonstrate ownership).

Hmmm...if a user decided, for whatever reason, that they wanted their MeFi account deleted, and all of their posts/comments along with it, would mathowie agree?
posted by obiwanwasabi at 3:55 PM on May 6, 2002

Hmmm...if a user decided, for whatever reason, that they wanted their MeFi account deleted

I've wondered about that myself, considering the "All posts are © their original authors" blurb at the bottom of the page.
posted by gummi at 4:02 PM on May 6, 2002

Sorry I screwed up that link to the U.S. Documents, it should have been to here.

Besides and google, how many other places cache/archive sites? I was reading up on how robots work crawling sites after seeing that you can tell google not to cache your page in the robots.txt file. Pretty interesting stuff, it would be cool if there was a form somewhere to generate the file from some easy to understand questions for simpletons like myself.
posted by jonah at 4:14 PM on May 6, 2002
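
For anyone wondering what such a generator might boil down to, here is a minimal sketch in Python. The function name `build_robots_txt` and the shape of its input (a mapping from crawler name to the paths that crawler should stay out of) are assumptions made up for illustration, not something from any existing tool. It only covers the basic `User-agent` / `Disallow` lines.

```python
def build_robots_txt(rules: dict[str, list[str]]) -> str:
    """Render a robots.txt body from {user_agent: [disallowed paths]}.

    Hypothetical helper for illustration; it emits only User-agent and
    Disallow lines, one block per crawler.
    """
    blocks = []
    for agent, paths in rules.items():
        lines = [f"User-agent: {agent}"]
        # An empty "Disallow:" line tells that crawler nothing is off limits.
        lines += [f"Disallow: {path}" for path in paths] or ["Disallow:"]
        blocks.append("\n".join(lines))
    # Blocks for different crawlers are separated by a blank line.
    return "\n\n".join(blocks) + "\n"


if __name__ == "__main__":
    # Example answers to "which robots, which folders?" (made-up values).
    print(build_robots_txt({"ia_archiver": ["/"], "*": ["/private/"]}))
```

Save the output as robots.txt in the top-level directory of the site. Compliance is voluntary, so this only affects crawlers that choose to honor the file.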

These generators have limited options it seems.
posted by jonah at 4:16 PM on May 6, 2002

Does anyone know whether there's any kind of archive for whois records? Or does updating a whois record completely destroy its former contents?

Being able to look up old whois records might help people who've lost their domain names (see here and here, for example).
posted by timeistight at 4:46 PM on May 6, 2002

(Has anyone else noticed significantly less updating of the Wayback machine beginning in January?)
posted by dhartung at 4:49 PM on May 6, 2002

Just finished using the Wayback Machine to restore a site that I thought had been completely lost (thanks MeFi).

Re: The Atlantic Monthly:
posted by syzygy at 5:17 PM on May 6, 2002

(dhartung: i noticed that before mid-December it was hitting my site at least daily, but after/since then, nothing. sort of odd.)
posted by Sapphireblue at 6:17 PM on May 6, 2002

>>"If you wanted to entirely erase something you did last year on the web, what would you do?"

>>"how many other places cache/archive sites?"

If I were a suspicious person I might start wondering if you had something to hide, jonah ;)
posted by iconomy at 7:32 PM on May 6, 2002

I was young, I needed the money...
posted by jonah at 7:44 PM on May 6, 2002

i didn't bother requesting that google remove the cache to various pages i didn't want archived, but i added this to my meta tags:

<meta name="robots" content="nocache,noarchive">

after a few weeks they updated their cache and the archives disappeared. in the last year they haven't archived anything i don't want them to.

to keep the internet archive/wayback machine from storing your pages add this to a robots.txt, and then upload to your main directory:

User-agent: ia_archiver
Disallow: /

you can see how that works here
posted by t r a c y at 9:05 PM on May 6, 2002
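
If you want to sanity-check rules like the ones above before uploading them, Python's standard `urllib.robotparser` module can read a robots.txt and report whether a given crawler would be allowed to fetch a URL. A rough sketch follows. The helper name `is_allowed` and the example.com URL are placeholders assumed for illustration, and this only shows how a well-behaved parser reads the file, not what any particular crawler actually does.

```python
from urllib.robotparser import RobotFileParser

# The two-line rule from the comment above.
ROBOTS_TXT = """\
User-agent: ia_archiver
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt lets user_agent fetch url.

    Hypothetical helper; it parses the text in memory instead of
    fetching robots.txt from a live site.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "http://example.com/photos/index.html"  # placeholder URL
    for agent in ("ia_archiver", "Googlebot"):
        print(agent, is_allowed(ROBOTS_TXT, agent, page))
```

With only that one entry in the file, the archive crawler is shut out of the whole site while every other robot is still allowed in, which is why the meta tag is a separate step for Google's cache.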

Awesome tips, t r a c y. Thanks!
posted by JoyG_n Josh at 7:47 AM on May 7, 2002

Why are you people so keen on publishing publicly and then hiding it?
posted by NortonDC at 8:38 AM on May 7, 2002

I personally am not trying to hide any of my postings or delete my archives from Google, but it is odd to think that every quip you've made online will probably be easily accessed and searchable thirty years from now. I can just see kids finding parents' stuff online: "What do you mean I can't go camping with my friends, Mom? What about this picture of you at Burning Man with paper towel tubes duct taped to your head?"
posted by jonah at 9:10 AM on May 7, 2002

To me, the thing is that even putting blinders on Google won't prevent your material from biting you on the ass. You published it, it's out there. Someone will have saved it. Playing peekaboo with Google is an illusion at best.
posted by NortonDC at 9:35 AM on May 7, 2002

you're welcome JoyG_n Josh :-) lol jonah :-D

i don't publish anything on the net that's personal, in terms of writing. never have, never will. i've never kept a journal and i have no interest in reading other people's, it's just not my thing. but i do like to confine my personal photos and bits of crappy artwork to my sites only, so i block the image spiders and web archivers (that i know of) from the few pages of mine that have stuff like that.

here's what my entire robots.txt looks like for anyone that wants to copy it - the vscooter entry is altavista's image spider. and if you use other image formats besides gif and jpg, just add them in the same format.
posted by t r a c y at 2:18 PM on May 7, 2002


This thread has been archived and is closed to new comments