Cologne City Archive Disaster
March 5, 2009 6:37 AM

Cologne City Archive is a six-story building containing 26 kilometers of shelves, 65,000+ documents dating from 922 AD, 104,000 maps, 50,000 posters, 500,000 photographs and 780 estates and collections, including Irmgard Keun, Hans Mayer and Jacques Offenbach. Considered a state of the art institution when built in 1971 and copied around the world, the building simply collapsed on Tuesday, destroying most everything. [1],[2](via)
When the building was constructed, a small nuclear-bomb proof chamber was included in the cellar to protect the most precious pieces. But in recent years, the chamber has been used only to store cleaning material.
posted by stbalbach (94 comments total) 31 users marked this as a favorite

 
It's unclear exactly what and how much was lost, from link [2]
It won't be clear for some time exactly which items have been irrevocably lost, as some may be able to be restored. But it was the building containing the old inventory -- dating up to 1815 -- which collapsed. The newer material in the archives' other buildings is being relocated for safety reasons, with some of the archives' contents headed to Freitaeger's archive at the university.
In any case, an entire building.
posted by stbalbach at 6:39 AM on March 5, 2009


I'm not an archivist or anything but that sort of loss just makes me feel terrible. I can't get past the thought of literally 1000 years of work to preserve something obliterated in minutes.
posted by zennoshinjou at 6:53 AM on March 5, 2009


Yeah this is pretty bad.

I was staying at a hotel just a few blocks away from there last month. Directly in front of the hotel was a huge hole from the construction site where they are digging the tunnel. In the back of my mind I was thinking, "What if the hotel collapses into the tunnel? Shoot, they're German engineers, I'm sure they got it covered".

Yikes!

It's a damn shame.
posted by chillmost at 6:54 AM on March 5, 2009 [1 favorite]


Oh, man. I'm with zennoshinjou on this one. I really hope there are no lives lost, as they seem to think there might be, but even this
The archives included the minutes of all town council meetings held since 1376. Not a single session had been missed, making the collection a remarkable resource for legal historians.
brought tears to my eyes.
posted by Rock Steady at 7:06 AM on March 5, 2009


What a loss.

.
posted by pointless_incessant_barking at 7:12 AM on March 5, 2009


Actually, I don't get it... how is all of this stuff "irretrievably lost"? Isn't most of it text, on paper? Dig carefully, pull out all the books and folders, dust them off, uncrumple them if necessary, and you are good to go. I don't see how more than 10-20% of this stuff can't be recovered.
posted by Meatbomb at 7:15 AM on March 5, 2009 [1 favorite]


What a dreadful situation. I'll be interested to hear my archivist wife's take on this. The critical buzz in archival circles these days is on digitization of existing records, she tells me; see, e.g., this recent article in the Pennsylvania Gazette. She tells me that one of the issues with digitization is permanence; paper, while bulky, holds up remarkably well if well-cared for. What good is any tangible material when the building housing it is vulnerable, though?

Very sad. Thank you for sharing this news in a FPP, stbalbach.
posted by cheapskatebay at 7:20 AM on March 5, 2009


Despite a growing number of libraries digitizing the content of their shelves, Freitaeger said electronic copies aren't necessarily the best way to avoid losing documents since "software doesn't last forever."

Instead, he recommended old-fashioned microfilm...


Really?
posted by Phanx at 7:25 AM on March 5, 2009


how is all of this stuff "irretrievably lost"? Isn't most of it text, on paper? Dig carefully, pull out all the books and folders, dust them off, uncrumple them if necessary, and you are good to go.

Come on, man. Buildings have gas lines, water mains, electrical wiring and flammable materials, and destroying paper is really easy. Preservation is not a trivial effort.
posted by mhoye at 7:35 AM on March 5, 2009 [2 favorites]


Googlebot sheds an index-able tear.
posted by Burhanistan at 7:38 AM on March 5, 2009 [6 favorites]


Instead, he recommended old-fashioned microfilm...

Really?


Yes. Microfilm is a proven, stable preservation medium. It has been around long enough we can see what happens if it isn't cared for properly and get a better idea how long it can last. Silver nitrate film, properly processed and stored can last for 500+ years. It isn't technology dependent either, just really need a magnifying glass and a light source.

Yay for microfilm.
posted by marxchivist at 7:41 AM on March 5, 2009 [13 favorites]


What good is any tangible material when the building housing it is vulnerable, though?

It's a fascinating question. How do you ensure something is not lost? Even a "state of the art" building can be lost in an accident. The answer of course is redundancy.

As an example: The Apollo Lunar program had something like 10 million "systems". Even with 99.99% reliability, that would mean hundreds or thousands of failures during a mission, which did in fact happen. Yet, the astronauts were comfortable going. Why? Every system was double or triple redundant. Apollo 13 was a problem because they had double redundant systems fail at the same time due to the scale of the explosion from the oxygen tanks.
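
The arithmetic behind that can be sketched roughly as below. The inputs are the loose figures quoted above (10 million systems, 99.99% reliability), and the function name is made up for illustration. The big assumption is that redundant copies fail independently of each other; Apollo 13 is exactly the case where that assumption breaks, since one explosion took out several copies at once.

```python
def expected_failures(n_systems: int, reliability: float, copies: int = 1) -> float:
    """Expected number of systems lost outright.

    Assumes every system has `copies` redundant copies that fail
    independently, each with probability (1 - reliability).
    """
    p_copy_fails = 1.0 - reliability
    p_system_lost = p_copy_fails ** copies  # all copies must fail together
    return n_systems * p_system_lost


N_SYSTEMS = 10_000_000   # rough figure from the comment, not an official count
RELIABILITY = 0.9999     # 99.99%

for copies in (1, 2, 3):
    lost = expected_failures(N_SYSTEMS, RELIABILITY, copies)
    print(f"{copies} copy/copies: about {lost:.5f} systems lost")
```

With no redundancy this comes out to roughly a thousand failures; with double redundancy roughly a tenth of one; with triple redundancy it is negligible. The same reasoning is why one archive building, however well built, is a single copy.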

Paper documents can be made redundant with digitization, or facsimile. Then make the digital archives redundant. True, the media wears out, but not so long as it's maintained, and is itself made redundant.
posted by stbalbach at 7:41 AM on March 5, 2009


It's bad, but not the end of the world. I think Meatbomb is right: tons of bricks and steel falling on piles of paper will not shred and obliterate the paper as much as it will compress and crumple and dirty it. It's now an archeological problem, only you already know exactly what's in there and where it should be.

If they stored a lot of 3D artifacts, I would expect much of it to be a little more 2D now, but still mostly fixable. They might be uglier -- in the end, they'll be bent and cracked and glued -- but the information they embodied should not be lost.
posted by pracowity at 7:49 AM on March 5, 2009


Isn't most of it text, on paper? Dig carefully, pull out all the books and folders, dust them off, uncrumple them if necessary, and you are good to go. I don't see how more than 10-20% of this stuff can't be recovered.

When the building comes down the water pipes break. Ever get a book wet? Or seen a piece of paper sitting in a puddle of water? Paper, at its base, is wood or rag that you pulp up, get really wet, then spread thin to dry. Add the water back in and you just have ... pulp.
posted by anastasiav at 7:50 AM on March 5, 2009


This story is surprising because it's so premodern. Ancient history is replete with stories of the destruction of archives and warehouses of historical artifacts and books that we cry over when we think about what was lost (the Mongol sack of Baghdad, the Venetian sack of Constantinople, and the burning of the Library of Alexandria which was considered so unimportant at the time that no one ever really figured out when, exactly, it happened). In this day and age, you expect that historical documents are kept safe in modern, protected buildings that just don't catch on fire or suddenly collapse with no warning. But here we are, in the 21st century, dealing with these random events of destruction which wipe out our historical record as though we're still in the ancient world.
posted by deanc at 8:11 AM on March 5, 2009 [7 favorites]


I just don't understand how this could have been allowed to happen. An entire building collapsed. What were the engineers and building inspectors and the like on?
posted by orange swan at 8:20 AM on March 5, 2009 [3 favorites]


To the comments above- have you ever seen old paper? It's not like the stuff you pull out of a printer today. It's incredibly fragile. I'm sure they had those documents properly stored and preserved but even so they aren't exactly in the best condition to survive the trauma of a building collapsing.
posted by zennoshinjou at 8:25 AM on March 5, 2009 [1 favorite]


What were the engineers and building inspectors and the like on?

Someone's covert payroll.
posted by Hovercraft Eel at 8:25 AM on March 5, 2009 [1 favorite]


Silver nitrate film, properly processed and stored can last for 500+ years.

How do we know this? Is 'can last' the operative term? I seem to remember reading in Nicholson Baker's (admittedly overstated) polemic against microfilm that this doesn't always actually happen..

Yay for vellum - proven life of 1000+ years.
posted by GeorgeBickham at 8:33 AM on March 5, 2009


But in recent years, the chamber has been used only to store cleaning material.

FAIL

Also, this is why we need nanobots. Duplicate this stuff on the subatomic level so even stuff like hidden messages and whatnot remain.
posted by DU at 8:35 AM on March 5, 2009 [2 favorites]


have you ever seen old paper? It's not like the stuff you pull out of a printer today. It's incredibly fragile.

That depends - woodpulp newsprint circa 1900 can definitely fall apart as soon as you touch it, although its quality varies somewhat.

Early-modern rag paper though? Not quite as tough as old boots, but much more durable than anything you would ordinarily pull out of a printer today.

But it all depends on how it's kept - including not having a building fall on top of it, like you say.
posted by GeorgeBickham at 8:36 AM on March 5, 2009 [1 favorite]


even so they aren't exactly in the best condition to survive the trauma of a building collapsing

Exactly. I'm thinking an archivist or conservator will come in here at some point, but some old books are so fragile that the act of merely turning the pages will fragment the pages into many tiny pieces. A box falling off a shelf could mean destruction - being in a collapsing building would mean literal pulverization.
posted by Rock Steady at 8:37 AM on March 5, 2009


oops
posted by wires at 8:40 AM on March 5, 2009


It's bad, but not the end of the world. I think Meatbomb is right: tons of bricks and steel falling on piles of paper will not shred and obliterate the paper as much as it will compress and crumple and dirty it. It's now an archeological problem, only you already know exactly what's in there and where it should be.

I lived about 500 meters from the Sarajevo National Library, which was heavily bombed (on purpose, by the Serbs) during the war which resulted from the break-up of Yugoslavia. This library had one of the best collections of all sorts of books and documents from the early days of the Ottoman Empire, others dating back to early writings in Old Church Slavonic, incredible Sephardic Judaica from the days when Jews fled the inquisition in Spain. In other words, irreplaceable old printed material.

It was a big library too, so when it started seeing damage from bombs, the most important pieces were taken and stored elsewhere for safety. But it was impossible to "evacuate" the entire library - there was nowhere in town that wasn't being bombed too, plus we're talking so many volumes and records it would have been hell to keep track of it all.

Occasionally, a big shell would hit a part of the library and it would rain paper. But not sheets - just little dust-sized pieces. Simply dropping a 200-year old (i.e. relatively "new") book on the floor on its corner often reduced the entire thing to dust. And I do mean dust, not crumpled pages. It wasn't unusual for a shelf of books to fall over from the force of a shell, and for every single thing sitting on the shelf to be reduced to dry pulp, with no possible hope of recovery. How I wish that these archives were simply "compressed" and "crumpled" and "dirty." In a building like the one in Köln, with its terrible problems with heat and dryness, and a total, immediate collapse . . . well I'd be surprised if 2% of it survived - and that would probably be the more recent stuff.

The Sarajevo National Library was pretty ruined, but they're slowly repairing it. A lot of stuff was saved; a lot of important stuff was truly pulverized (in the strict sense of turning to powder.) I used to see the devastation being done every few days, when I'd pass the library to get water - entire books that would just disintegrate by the simple action of opening the cover, for example.

The experts know what they're talking about when they say this stuff is lost.
posted by Dee Xtrovert at 8:51 AM on March 5, 2009 [20 favorites]


I just don't understand how this could have been allowed to happen. An entire building collapsed. What were the engineers and building inspectors and the like on?

It sounds like someone (the Metro?) may have been digging a tunnel nearby.
posted by carter at 8:56 AM on March 5, 2009


Also: the archives fell into a huge hole - the debris is too unstable to start searching through for documents - and it's now filling up with groundwater. So a lot of stuff is probably lost.
posted by carter at 8:58 AM on March 5, 2009


It gives me some consolation that there is a great deal of knowledge in Germany about how to reconstruct damaged documents.
posted by GeorgeBickham at 9:03 AM on March 5, 2009


Yay for microfilm.

You might be right, I'm no expert - but weren't the books that survived the Dark Ages the ones that got copied rather than the ones written on robust parchment?

I admit I harbour an uncharitable suspicion that librarians and archivists love microfilm because it's so tiresome to access and the images are often so fuzzy that they can keep it safely in their hoard without too many irritating users wanting to mess with it :)
posted by Phanx at 9:09 AM on March 5, 2009


This is a shocking story, and one can only hope the damage is not as bad as it seems. The Cologne city archives are a major source for the history of the Protestant Reformation (explaining why Cologne didn't go Protestant -- the subject of a famous article by Bob Scribner, 'Why was there no Reformation in Cologne?' -- helps to explain why other German cities did), and if these archives have been destroyed, it is a huge, huge loss for historical studies.
posted by verstegan at 9:12 AM on March 5, 2009


but weren't the books that survived the Dark Ages the ones that got copied rather than the ones written on robust parchment?

Yes. Most of the works we have from antiquity were copied during the so-called "Dark Ages", the "originals" long gone. So when you read Plato and Socrates, you can thank the Muslim and Christian scribes who copied them down. Some documents have been found in the deserts of Egypt and elsewhere, but few.
posted by stbalbach at 9:22 AM on March 5, 2009


Most of the works we have from antiquity were copied

that's why it should be digitized. you digitize, then put it on the web and send copies everywhere.

microfilm is a joke.
posted by bhnyc at 9:37 AM on March 5, 2009 [1 favorite]


Err, why not digitize AND put it on microfilm?
posted by showbiz_liz at 9:42 AM on March 5, 2009 [3 favorites]


Hate to be a Google fanboy but that's what they DO. Archive and distribute. And while archive.org does good work, they don't have gobs of money like Google does. I firmly believe that right now there are people in Google's HQ planning massive raids on similar archives around the world to digitize all the old works possible.
posted by seanmpuckett at 9:52 AM on March 5, 2009


microfilm is a joke.

No way. Everything's a tradeoff, sure, and microfilm or microfiche has downsides, but microfilm and microfiche have a couple of very real benefits, and the big ones are that it's a durable physical object with a well-understood and easily duplicable recovery mechanism, while also being reasonably high-density.

That recovery-process is important. Got any important data on an 8" floppy?

In the longer term, maybe the most important one is that a casual observer can understand that this is a high-density data artefact using commonly available tools like the human eye.

The Long Now Foundation does some long, really important work on this front, struggling to answer questions like "how will we be able to show people that this item contains valuable information at all, and not meaningless scratchings" when the entire course of recorded human history is less than six thousand years old.
posted by mhoye at 10:11 AM on March 5, 2009 [2 favorites]


As a microfilm guy with twenty-five-plus years in the biz (up till recently, when I drifted into the museum field), I gotta say this is exactly why people like me pushed and still push for anyone with large archives of irreplaceable material to microfilm their archives, then scan them, then distribute the collections to mitigate disasters like this. For what the digital whiz-bang crowd wants to say about the joy and wonder of digital media (a medium for which there is no archival storage medium or universal retrieval mechanism), you can't beat microfilm for long term storage and analogue fidelity, if it's done according to ALA and other standards for longevity.

Yes, the paper itself may last longer if kept properly, and they should be kept intact and in controlled storage, but a microfilm archive is compact and easily duplicated, so you can store complete copies of many millions of pages of archived documents in a pretty compact space. Microfilming documents is completely non-taxing on the documents (I used to film ten million dollar illuminated medieval books for local museums, and it was a painless, if meticulous, process where you manipulated the pages with polished ivory tongs), and once they're filmed, the film can be scanned, reprinted, duplicated, and accessed while the original material is safely stored.

Nicholson Baker, for what it's worth, just isn't qualified on any level, as a philosopher, to critique microfilm, and mistakes the idiocy of salesmen and expedient career librarians for inherent problems with microfilming documents, and his crusade actually served to damage archivists' efforts to create distributed archives that can survive this kind of nightmarish incident.

Let's just hope there's a lot of archives conservators in Germany.
posted by sonascope at 10:33 AM on March 5, 2009 [13 favorites]


Err, why not digitize AND put it on microfilm?
posted by showbiz_liz


That's pretty much what most places that have the equipment are doing.
microfilm is a joke

Bullshit. mhoye's answering that above about as well as I could. A bigger joke is the idea of reading a digital file created today 500 years from now.

Silver nitrate film, properly processed and stored can last for 500+ years.

How do we know this?


Chemists and scientists and people like that have done tests. They can try to simulate aging through high temperature and other stress tests. The chemical principle of the silver reacting with light etc. involved in photography is pretty well documented I believe. Some beautiful negs still exist from the time period of the American Civil War.

Is 'can last' the operative term?

I'm not sure what you mean by that. If by "can" you mean there could be fuckups in the manufacturing, processing or storage of the film that would affect the longevity, yes.

On preview: sonascope thanks for covering Nicholson Baker.

I can throw some paper or microfilm in a drawer and forget about it for 200 years, chances are, in 200 years it will be legible (if the building doesn't fall in on it!). Throw some computer media in a drawer and what will you have in 200 years.
posted by marxchivist at 10:39 AM on March 5, 2009


Well this thread has taught me that microfilm is not a joke and might be one of the best long term backup storage media there is. Other than inscribing the micro-text on stainless steel (or bronze or stone, something other than film). With laser engraving machines that may not be far-fetched cost-wise, if you're looking for biblical apocalyptic time storage.
posted by stbalbach at 11:08 AM on March 5, 2009 [1 favorite]


re: laser engraving: The Rosetta Project. (previously)
posted by stbalbach at 11:17 AM on March 5, 2009


keep in mind that germany is a country where any new construction requires hardcore planning and permissions. they do not let you build a shelf without a proper vetting process. consider then that there has been previous trouble in cologne with a church bell tower almost tipping over, also related to this subway construction that people now point fingers at. they are only speculating but chances are someone really screwed up. there has been talk of corruption in the area and a while back some serious arrests and convictions have happened in the Ruhrgebiet. this story could end up becoming much larger now that people have suffered direct harm.

also know that this is an election year.
posted by krautland at 11:32 AM on March 5, 2009 [1 favorite]


Silver nitrate film, properly processed and stored can last for 500+ years.

How do we know this?

Chemists and scientists and people like that have done tests. They can try to simulate aging through high temperature and other stress tests. The chemical principle of the silver reacting with light etc. involved in photography is pretty well documented I believe. Some beautiful negs still exist from the time period of the American Civil War.


That's not 500+ years, a figure which is suspiciously round-sounding, and makes me suspect it is a guess meant to sound arbitrarily long - which it isn't particularly, in the scheme of things. Can you cite papers?

To clarify my position: by all means, let's have many copies on many supports - I wish we had microfilm of these materials - but never imagine that one copy is a substitute for its parent. Because there has been loss from the human record in thinking so.
posted by GeorgeBickham at 11:40 AM on March 5, 2009


I used to film ten million dollar illuminated medieval books for local museums, and it was a painless, if meticulous, process where you manipulated the pages with polished ivory tongs), and once they're filmed, the film can be scanned, reprinted, duplicated, and accessed while the original material is safely stored.

Did you use colour microfilm, sonascope? I understand that's pretty cutting-edge as a preservation format - is that so?
posted by GeorgeBickham at 12:06 PM on March 5, 2009




Thanks, marxchivist.
posted by GeorgeBickham at 12:39 PM on March 5, 2009


Bullshit. mhoye's answering that above about as well as I could. A bigger joke is the idea of reading a digital file created today 500 years from now.

Well, obviously disks demagnetize, CD-R disks fade, and even flash drives lose their data. But, I think if you had the correct data, reading a file wouldn't be difficult at all. If you look at a Word doc in a text editor you can read the text. And it's not like these file formats are going to disappear, it's not like old computers can't be emulated, and it's not like programmers can't write software to interface with old files.

And by the way, it wouldn't surprise me if computers 500 years in the future are more like the computers today than the computers today are like the computers from just 20 years ago.
posted by delmoi at 1:15 PM on March 5, 2009


disks demagnetize, CD-R disks fade, and even flash drives lose their data

The key thing here isn't the media—it's the mechanical device to read the media, which is why there are warehouses filled to the brim with meticulously-archived digital media from NASA missions of the sixties that are stored in perfect temperature and humidity in acid- and lignin-free boxes that are inaccessible because there's virtually not one single surviving device left to retrieve that perfectly-intact data. CD-Rs didn't last long, in their original dye formulations, but there's all evidence that we've got some pretty stable dyes (after graduating from cyanines to phthalocyanines and onward), but where will you find a CD drive in 2076 that works? You're looking at the material longevity issues of a huge number of parts, maintaining those drives, from leaking electrolytic capacitors to oxidizing drive belts and a billion other things. There might be a hobby faction that builds these things, but I'm doubtful.

I'm curious to see if flash memory starts to solve some of these issues.
posted by sonascope at 1:38 PM on March 5, 2009


From my husband, a historian who works primarily with electronic sources (and thus has a bias in this matter):

I feel like there are two separate conversations going on, here, and I would like to weigh in on each of them in turn.

The first is about the survival, or otherwise, of the Cologne documents. Firstly, it obviously goes without saying that this is a terrible and irreparable loss, and the fact that lives may have been lost makes it all the more awful. It's only the latest in a string of recent mass-scale losses, the worst of recent times being the loss of the national archives in Iraq which burned with centuries of Ottoman and Iraqi history.

As to the possibility of going through the rubble and collecting surviving documents: well on the one hand there's the question of whether anything might have survived. As with others I note that the oldest material will probably survive the best. Nobody seems to have mentioned vellum yet, which is probably the toughest archival material and quite capable of withstanding great insult, notwithstanding the fact that we have to treat it very carefully just as we would any old document. Nineteenth and twentieth century stuff, on acid-rich paper, would probably have already been very brittle and degraded. That's the stuff that goes to powder when you give it a big bang, not the early modern and medieval stuff. Unfortunately, these are also the records that are the least likely to have been published in other places.

But there's another angle to this which nobody seems to have brought up yet. As a rule, archives run on an absolute shoe-string. Even if it were possible to treat this ruin as an archaeological site, and dig through the rubble looking for fragments of old paper, every euro spent on that effort would have to be taken out of preserving another document in another archive somewhere. Every day around the world, millions of pages of valuable archival records disintegrate into dust because there simply aren't enough people preserving them. So if you're an archivist in Germany, at the moment, which document will you choose to save: the shreds of paper under a pile of rubble in Cologne, or the document scrunched up in the bottom of a box in Konstanz?

Which brings us to the second debate: whither the backup archive?

I have had many a long and heated debate with archivists about this and I usually start out by making the following case for digital archives:

1. Ease of distribution. Digital documents can be almost effortlessly distributed all over the world. Archivists may think that they are running on a shoe-string but pity the poor historian who has even less funding and yet, somehow, has to schlep from city to city just to get access to their raw materials. And if you're an independent historian, or a historian from a poorly funded university, or a student, or a historian from a developing country... well you can pretty much forget it.

2. Ease of indexing. Digital documents are inherently easier to index than microfilm. That means better searching and better research.

3. Ease of creation. Every archival historian I know carries an archive-quality camera to work with them. You want to create a digital archive for practically free? Just require those historians to share their images with you. What is more, the pictures that we take are much superior to the pictures taken by most microfilming firms. For every sonascope-style craftsperson with their ivory tongs, there are a dozen, a hundred, mass production microfilm people who won't even stop to check that their frame is in focus. You want to talk about document losses? What about all those documents, from the last fifty years, that have been photographed unreadably onto blurry microfilm and then pulped to save shelf-space?

4. Greater fidelity for electronic documents. I know that archivists tend to hate this idea, but the vast majority of historical documents produced over the past twenty years have existed primarily and often only in electronic format. I know archivists who are seriously considering preserving spreadsheets by printing them out. That's a bit like 'preserving' a steam engine by taking a photograph of the outside and then chucking the original in the bin. Digital photographs, richly formatted word processor documents, spreadsheets, databases, web pages, sound files... all these media are best preserved in their original format as much as possible.

So here goes the counter-argument, which we have heard several times in this thread already: ah, but the storage media for digital files degrade, and we won't be able to read them in the future. I think that this counter-argument stems from the basic outlook and training of archivists, which is to avoid handling and processing the original documents as much as possible. That really makes a lot of sense for physical media, but it makes no sense at all for electronic documents. In order to preserve electronic files, they must be network-resident. They have to be simultaneously on the hard disks of several computers around the world. This, thanks to the internet, is pretty easy to do, but I think that it's really counter-intuitive for archivists who tend to reject the idea as soon as you say that the file is going to be preserved by constantly reading and rewriting it. But let me make myself doubly clear: nobody is seriously suggesting that we should archive digital files by storing floppy disks or CD's. A network-resident archival system is one that can survive anything but the total collapse of our civilisation. At which point, yes, we have to fall back to whatever paper and microfilm archives survive the armageddon.

Which brings me right back round to the question of preserving microfilm. Yes, microfilm is wonderful stuff. Yes, this will probably be the ultimate repository of our civilisation's knowledge and experience or what have you. But making digital archives in no way precludes preserving the most important stuff on microfilm. It's trivial to make a system that goes from digital to microfilm or paper. That is, in fact, what people are usually advocating doing for new archival sources anyway. But if you microfilm first and then digitise, then you're destroying information. I think it's much better to do it the other way round, for all the reasons above stated.
posted by jb at 2:41 PM on March 5, 2009 [30 favorites]


Back your shit up. Repeatedly.
posted by gwint at 5:14 PM on March 5, 2009 [1 favorite]


Digitizing things won't help - see the recent issue about 40-year-old NASA data tapes that couldn't be read until the appropriate tape drive was found in a museum.

I've already encountered this to a smaller extent myself. Right before I moved to Austin from Oklahoma City in 1995, I dumped my desktop system and its contents to tape. The tape was one of those "floppy tape" drives that connected to the floppy disk controller.

The tape is in a data format that I can parse with just about any Unix/Linux/OSX system (straight tar) but it's in a hardware format that I would have to hit up eBay and the classiccmp list in order to be able to read the data.
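
To illustrate the split between data format and hardware format: once the bytes are off the tape, plain tar is readable with stock tools. A minimal sketch using Python's standard `tarfile` module follows. It assumes the tape contents have already been copied to an ordinary file (the name `tape_image.tar` in the usage note is hypothetical); getting the bytes off the physical tape is the part that still needs the old drive.

```python
import tarfile


def list_archive(path: str) -> list[str]:
    """Return the member names stored in an uncompressed tar file."""
    # "r:" means: read, and expect no compression (a "straight tar").
    with tarfile.open(path, mode="r:") as archive:
        return archive.getnames()
```

For example, `list_archive("tape_image.tar")` would return the file names in the dump, provided such an image exists.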
posted by mrbill at 5:36 PM on March 5, 2009


how is all of this stuff "irretrievably lost"? Isn't most of it text, on paper? Dig carefully, pull out all the books and folders, dust them off, uncrumple them if necessary, and you are good to go.

You should see what happens to items dumped in the book drops, much less a crushed building. But thanks for trivializing my entire profession.

Signed,
An uncrumpler and duster-offer of books
posted by ikahime at 7:17 PM on March 5, 2009 [4 favorites]


From husband again:

Digitizing things won't help - see the recent issue about 40-year-old NASA data tapes that couldn't be read until the appropriate tape drive was found in a museum.

That is a situation that occurs with old stores of data tapes, just as it occurs with old audio recordings and the like. But digital archives people are, by no means, talking about archiving stuff on tapes and sticking them in an archive. Network-resident archives on the hard disks of operating computers don't have this problem. They do, however, have other problems like closed file formats, but that's a whole other discussion.
posted by jb at 7:19 PM on March 5, 2009 [1 favorite]


And anything that is currently digitized on things like tape should be moved to network-distributed storage ASAP. Properly replicated network storage isn't going to suffer the same readability problem as it's constantly migrated and duplicated if an existing node goes down.

And for text, this is easy --- text is very very small, and you can easily store a ton of it in a safe way. Google wants to do this with books, for example (although for books in copyright it has been a long fight and I'm not clear what the final settlement was). If the archive is online and public, anyone who is concerned can keep their own copy, as well.

Archiving for digital storage, as jb points out, actually means copying and moving it frequently, kind of the opposite of storage of physical objects.
posted by wildcrdj at 10:19 PM on March 5, 2009


The most credible theory for the collapse I have seen is that a temporary concrete construction wall in the newly bored subway collapsed, rendering the underground beneath the building unstable. Basically the building tipped over.
That this could happen in a country like Germany which has a very diligent and thorough building code is horrifying. If this turns out to be the cause I suspect the contractor responsible will face severe legal problems.
posted by Catfry at 3:03 AM on March 6, 2009


One of the big problems with digital systems for this kind of archival work, and what is the crux of why I left the field, is that whatever system you'll end up using for these projects will be invariably created by a salesman, pitching technology he or she doesn't properly understand, implemented by recent college comp sci grads with no experience in information archiving, and run by a staff of temps who come and go on a whim. There's a complete disconnect between people and firms with a long history in archival conversion and the firms who skid in just under the lowest bid and then half-ass their way through, monitored by tired government upper-middle-management types who just want to be able to check off the boxes on their Excel spreadsheet.

I spent more time fighting with former realtors who just couldn't understand why we couldn't just flop first edition Poe on cheap flatbed scanners and "mash them down" for each scan, or computer whiz kids who lived to game, who would just shrug at 5% data losses in cross-media conversions and ask "well, what can you do?" I'm probably a bit of a snob, coming from a family business where we lived and breathed microfilm from birth, but there was a degree of personal investment that doesn't exist in the field now.

A system of storage, distribution, and migration is very workable in theory, but in practice, you end up bleeding off little bits of your archive because it's so much easier to lose data than it is to lose physical media. Now and then, a document would get missed, or misfiled, but there was a way to find it—digital data, on the other hand, just evaporates into the miasma.

There's also a built-in, and very evolved, system of quality control for microfilm, which involves filming industry-standard "targets" (a prepared sheet of white plastic printed with a variety of quality-checking images) on each roll of film. When you get the film off the processor, you read off your resolution from the targets, hit it with a densitometer to get the high/low densities, and do a number of other checks that will stop you right then and there if you're doing something wrong. Digital quality control, on the other hand, is limited to bored temps watching images flick by on a screen at low resolution, shrugging off "well, this box looks okay" without any kind of framework for these assessments. I'm a little bitter about the changes in the industry, so take that all with a grain of salt, but if I wanted to maintain my information over a long span (longer than the career of some middle manager or bureaucrat), I'd use a hybrid digital/analog approach.
posted by sonascope at 3:28 AM on March 6, 2009 [10 favorites]


Here's what you say about Nicholson Baker if you don't like his take on microfilm and libraries and actually want to convince people of your point of view:

Nicholson Baker is a well-meaning amateur who has performed valuable service in rescuing old newspapers and bringing attention to the unfortunate discard policies of some libraries; microfilm, however, is a far better medium than he thinks, and here's why.

Here's what you say if you just feel like ranting:

Nicholson Baker, for what it's worth, just isn't qualified on any level, as a philosopher, to critique microfilm, and mistakes the idiocy of salesmen and expedient career librarians for inherent problems with microfilming documents, and his crusade actually served to damage archivists' efforts to create distributed archives that can survive this kind of nightmarish incident.

My reaction to the latter: a roll of the eyes and the thought "oh, dear, another person whose toes got stepped on and can't think straight." That may not be fair, but such is life. If you want to be listened to, learn to talk like someone who hasn't joined the Lord's Archivist Army (We Hate Nicholson Baker Brigade).

And for the Cologne City Archive:

.

...and I hope whoever's responsible is punished severely.
posted by languagehat at 6:17 AM on March 6, 2009 [2 favorites]


In response to:

Here's what you say about Nicholson Baker if you don't like his take on microfilm and libraries and actually want to convince people of your point of view:


[and]

Here's what you say if you just feel like ranting:


[and so on]

—I'd point out that (a) I don't really have the time or inclination to write a well-reasoned, annotated argument in favor of archival conservation on what is, essentially, a bulletin board for those of us who like to explore a diversity of subjects without having to have post-doctoral degrees in those subjects, and (b) my response to Baker has nothing to do with hate, a "Lord's Archivist Army," or even a critique of him beyond pointing out that his training and academic history give him no authority or credibility in a field that he so earnestly and ignorantly attacked based on the decisions of a few select management types. If you'd like for me to produce a more detailed argument, with supporting documentation, I'll be more than happy to. My rate for technical writing in that field is $95 an hour.

Honestly, though, I love getting a dismissive and patrician, "If you want to be listened to, learn to [whatever]" lecture from someone whose sole response to this post was to jump in for a smug little chunk of the ol' ad hominem. Thank goodness the National Archives, the Library of Congress, and the Walters Art Museum were all dumb enough to fall for my "rants" for all those years I spent in preservation microfilming/scanning. Thanks ever so much for schooling my sorry self on what I'm doing wrong, so that I can strive to be a perfect being like you, O Great Wise One.
posted by sonascope at 7:50 AM on March 6, 2009 [12 favorites]


Oh dear, someone's toes really did get stepped on. Hope your life improves to the point where you don't take everything so personally!
posted by languagehat at 8:11 AM on March 6, 2009 [1 favorite]


Hey Sonascope - what about my colour microfilm inquiry? I'm genuinely interested in what preservation professionals think of it these days.
posted by GeorgeBickham at 8:37 AM on March 6, 2009


Mr. Languagehat, it's not about "toes being stepped on" and your withering, dismissive need to demonstrate your presumed higher level of authority by behaving like a petulant holier-than-thou jerk instead of actually discussing an issue without vitriol. Do you honestly think you're not being an obnoxious troll when you use language like "a roll of the eyes and the thought 'oh, dear, another person whose toes got stepped on and can't think straight,'" and then act shocked, shocked that someone would respond negatively to your bait?

I'm not so much offended as amused that someone who'd toss off a snarky, snobbish line like "Same goes for those crappy Coleman Barks translations of Rumi. Enjoy them if you will, but they're not Rumi," back in our last encounter would then feel that my off-handed, but nowhere near as snarky, quip on Baker required a schoolmarm's lecture on the proper approach to convincing a skeptical observer of an idea's worth. Here's hoping your life improves to the point that you have better things to do than acting like a snotty contrarian with no interest in arguing the actual subject of a post beyond being an ad hominem sniper.

-

On the subject of color microfilm, GeorgeBickham, I'd have to say that the industry tends to be unusually conservative, and so color microfilm wasn't making a lot of headway for quite a while. It's one of those industry changes that's been tricky, because it essentially requires a completely different set of equipment and techniques to produce long-term archival copies, and the prevailing thought has been to preserve printed material on monochrome film and do 35mm and large-format films and transparency for documents where color is important. I suspect that archivists are warming to the color film, particularly since Ilfochrome Micrographic has been passing a lot of the long-term studies and tests, but it's slow going. There's more movement to the hybrid analogue/digital approach of B&W film/color digital images, though I have to admit that I'm about three years out of date on the subject now, having moved over into museum work in 2006. Interesting stuff, though.
posted by sonascope at 9:12 AM on March 6, 2009 [7 favorites]


your withering, dismissive need to demonstrate your presumed higher level of authority by behaving like a petulant holier-than-thou jerk instead of actually discussing an issue without vitriol.

I most certainly do not have a higher level of authority. However, I am capable of using my eyes and brain. I have personally seen libraries dump valuable books and periodicals on the sidewalk, I have read Baker's excellent essays and am quite confident he's not making shit up, and I respect the hell out of him and have no respect for defensive librarians and archivists who deal with his accusations by circling the wagons and accusing him of rank amateurism. Believe me, I have plenty of vitriol available for the misdeeds of libraries (and I have worked in several), but I'm not unleashing it here; I have better things to do. And I note that you have still not actually responded to my original point, so I take it that you agree that you have no interest in winning anyone over but simply enjoy venting about that nasty Mr. Baker, who's made the lives of hard-working librarians miserable by calling attention to their failings.
posted by languagehat at 9:48 AM on March 6, 2009 [1 favorite]


If you're capable of using your eyes and your brain, try re-reading what you wrote in regards to my comment on Baker and please instruct me on how I'm not responding to your point. The nice thing about not being able to edit Metafilter posts is this: it's all right there, just as you wrote it.

There's nothing, not one word in there about Baker, unless you count your invention of what you presume my personal feelings about Baker to be, for me to respond to.

I wrote:

Nicholson Baker, for what it's worth, just isn't qualified on any level, as a philosopher, to critique microfilm, and mistakes the idiocy of salesmen and expedient career librarians for inherent problems with microfilming documents, and his crusade actually served to damage archivists' efforts to create distributed archives that can survive this kind of nightmarish incident.

I don't say I hate Baker (and I certainly don't hate Baker, who's quite skilled in other areas), I don't mention or even allude to a "Lord's Archivist Army (We Hate Nicholson Baker Brigade)," and the sole point of my statement is that Baker was talking out of his hat and ended up undermining the cause that he championed. I don't suggest this out of personal dislike, but out of eight years of hearing his book quoted by smarmy bean counters looking for excuses to cut budgets. Yes, let's not spend money on microfilming our archives, because this book says…and it must be right, because it's a book. There are lots of compelling arguments in Double Fold, and he highlights issues that need to be illuminated, but his amateur approach and lack of genuine understanding of the technical side of the field make the whole worth far less than the sum of its better parts.

That said, I did in fact respond to your original post and point, which was to say that I'm not making my point properly, and to say that I can't "think straight." If you'd like some other response, you'll need to actually make the point you think you're making so I have something to discuss. If you mean that I'm somehow meant to respond to my own point, that's a whole other issue, and something I don't have the time for, either.
posted by sonascope at 10:20 AM on March 6, 2009 [3 favorites]


Just as a non sequitur, Rumi never stopped being a mufti of the Hanafi school of jurisprudence. Modern revisionists like to brand him as some kind of wine drinking hippie. Of his poetry, Rumi said something like "if you have a guest who likes tripe soup then that's what you serve them".
posted by Burhanistan at 10:56 AM on March 6, 2009 [1 favorite]


I don't suggest this out of personal dislike, but out of eight years of hearing his book quoted by smarmy bean counters looking for excuses to cut budgets.

Ah, a crucial bit of context. I didn't realize his book was put to such evil uses, and I withdraw my snark (though not my admiration for Baker)—you have a right to your (what seemed to me over-the-top) reaction. Thanks for explaining.
posted by languagehat at 11:28 AM on March 6, 2009 [3 favorites]


Comment now from jb, who makes her own digitized images of c1600-1700 documents.

I've used microfilm, and I've used digital. From both the end user experience and for the preservation of the information in the documents, my digital images are far superior to any microfilm I have ever used. Even the slightly blurry ones (things happen when you are doing 500 images a day) are easier to read.

There is data which isn't strictly in the words or images, but in the physical form - but even there digital images obtained in a non-destructive and non-invasive way (flashless long-exposure photography) still record more of the physical form than microfilm.
posted by jb at 5:17 PM on March 6, 2009


If we are going to spend any money, we should make digital images of our archives. Print them on microform if you want them in analogue, but as my husband pointed out, you lose a lot by moving to microform.

(sorry to pull historian's rank, but I work with these things - and it is clear which is a superior preservation of archival material. We just need to master preserving these images.)
posted by jb at 5:20 PM on March 6, 2009 [1 favorite]


Metatalk.
posted by Meatbomb at 6:00 PM on March 6, 2009


Aww man. This is shocking.
I mourn the loss of this big piece of European history.
posted by jouke at 7:39 PM on March 6, 2009


Just want to say thanks to the archival experts who've weighed in here, whatever your opinions. I've learned a lot.
posted by bookish at 7:56 PM on March 6, 2009


Apparently this is likely to have been caused by the digging of a new metro tunnel.
The same German company that was digging in Cologne was slated to dig the tunnels in Amsterdam as well. And Amsterdam houses on the Vijzelgracht canal have already been sliding away!
source in Dutch
posted by jouke at 7:58 PM on March 6, 2009




Einstürzende Neubauten!
posted by dydecker at 8:34 PM on March 6, 2009 [5 favorites]


languagehat writes : "I most certainly do not have a higher level of authority."

Truer words yadda yadda.
posted by bardic at 11:42 PM on March 6, 2009


I'm feeling a bit thick, but could someone explain the "software doesn't last forever" aspect of this?

Are we saying that: a) in 500 years people will be unable to decode and read, say, TIFF* files; b) in 500 years, all the TIFF files in the world somehow will have significantly degraded in quality; c) something else I'm not understanding?

Knowing nothing about this, I would have thought that digital media would be superior to physical media because the microfiche itself physically degrades over time, doesn't it?

*Loss-less file format example chosen at very nearly random.
posted by DarlingBri at 7:35 AM on March 7, 2009


Darling Bri, a lot of a), some of b) if the media in which the files are stored degrade over time. The BBC Domesday Project is often trotted out as the poster child for a). (more than you wanted to know about the BBC Domesday Project and eventual recovery and preservation of data here)
posted by needled at 8:00 AM on March 7, 2009 [1 favorite]


And for anybody wanting to find out even more about the Nicholson Baker vs. the librarians throwdown, here's the Association of Research Libraries' page on the matter, including specific responses to issues brought forth by Nicholson Baker.

(The Association of Research Libraries is essentially the trade group for research libraries in the U.S. and Canada, which includes most major university libraries, national libraries such as the National Library of Medicine, as well as some larger public systems such as the New York Public Library. ARL member libraries would have ostensibly been the target for Baker, not your local underfunded public library scrounging for funds and space to keep children's programs going.)
posted by needled at 8:18 AM on March 7, 2009


Husband weighs in again:

Most opponents of digital preservation cite point b: you've seen it a number of times in this thread. The fact is that digital storage media are inherently unstable for a number of reasons. Burnable CDs rely on dyes that degrade with time. Plastics in magnetic tape can break down and turn to goo, or just gum together to the point that you can't unwind them any more (for this reason, the archives of video tapes that TV companies preserve have to be periodically wound and unwound just to stop the tape sticking together), and even if your storage media is 100% the same in a thousand years, cosmic rays from outer space will randomly flip your magnetic digits and degrade the stored document.

But while this is a big problem with point b, it's not the biggest problem. The biggest problem is that all digital media relies on specialised machines to read it. This may, in the medium term, turn out to be no problem. Perhaps our technologically sophisticated descendants will just have some Star Trek robot pick up our 5¼" floppy disks in its biomechanically-adaptive info-jaws and suck off all the data. But we can't know this, and so we have to plan for it not being the case. And besides, and this is the much bigger problem, when planning for the long term we have to plan for a future in which our descendants know less about computers than we do today. As sonascope points out, microfilm can be read by anybody who has a lens, a light source and a lot of patience. Bury microfilm in a box and you can guarantee that in a thousand years somebody can dig it up and read it. Not so with a computer.

Now these two causes of problem b are ones which I argue have been solved, at least medium-term, by this idea of the network-resident archive. This is an idea that we, all of us here, have absolutely no difficulty understanding because we are, I suppose you could say, 'internet people'. I'm sure you've heard people say 'be careful what you say online, because anything on the internet lasts forever'. This sucks for people whose youthful indiscretions are preserved on Facebook, but it's very useful for archivists. Now that computers are cheap, and digital storage is cheap, it's almost possible to have every archive have a backup of every other archive. What's more, the cosmic-bit-flipping problem becomes easy to solve, because you can continually poll the archives to find errors. If archive A has a flipped bit, it will be different from B, C and D, and so you know that there's a solvable error there.
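
To make the polling idea concrete, here is a rough sketch of how archives A, B, C and D could be compared. It is only an illustration under assumptions of mine: the function names (`fingerprint`, `find_and_repair`) are made up, the "archives" are just byte strings in a dictionary, and the repair rule is a simple majority vote over SHA-256 checksums. A real system would have to deal with networks, partial copies and ties.

```python
import hashlib
from collections import Counter


def fingerprint(data: bytes) -> str:
    """Return a SHA-256 checksum that changes if even one bit changes."""
    return hashlib.sha256(data).hexdigest()


def find_and_repair(replicas: dict[str, bytes]) -> tuple[list[str], dict[str, bytes]]:
    """Compare all copies, report the odd ones out, and overwrite them.

    Hypothetical helper: assumes a strict majority of copies is still intact.
    """
    digests = {name: fingerprint(blob) for name, blob in replicas.items()}
    majority_digest, votes = Counter(digests.values()).most_common(1)[0]
    if votes <= len(replicas) / 2:
        raise ValueError("no majority copy; cannot tell which replica is right")
    good_copy = next(
        blob for name, blob in replicas.items() if digests[name] == majority_digest
    )
    damaged = sorted(name for name, d in digests.items() if d != majority_digest)
    repaired = {name: good_copy for name in replicas}
    return damaged, repaired


original = b"Cologne MS 106: The Book of Hildebald"
flipped = bytearray(original)
flipped[0] ^= 0b00000001  # simulate a cosmic ray flipping a single bit

replicas = {"A": bytes(flipped), "B": original, "C": original, "D": original}
damaged, repaired = find_and_repair(replicas)
print(damaged)  # ['A']
```

The point of the vote is that no copy is trusted on its own. With only two copies you can see that they disagree but not which one is wrong, which is one reason to keep several.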

There are two problems with this, then, one easy to solve and one very hard.

The easy problem is the one that we've already seen above: the collapse of our civilisation and the loss of all our technological knowledge. Ok, so maybe this will be a difficult and traumatic thing for our descendants, but thinking like archivists this is not so much a thing that needs preventing as a historical inevitability that must be planned around. So how is it easy to solve? Well, you make sure that your digital preservation is in addition to (ie. redundant to) regular, plain old-fashioned preservation. You always want to keep the originals and you'll probably want to take your computers, every decade or so, and print out a copy of your digital stuff on microfilm as well. You always need to be planning for the eventuality that the lights could go out tomorrow. And this is pretty easy because digital preservation is so cheap (in comparison to other forms of preservation) that it really doesn't eat into your budget all that much. In fact, with a bit of imagination, you can get historians to do a lot of the heavy lifting for you. And even if it increases your yearly budget for preservation by 10%, you're going to save a bundle in other ways, over the long term, because your documents are going to get pulled out of storage a lot less frequently and thus they'll be damaged less and require less expensive restoration work and because your reading room won't need to be as big, which means that ultimately you can delay building that new storage wing, and so on and so on and so on.


Which leads us to the second, and much harder problem... Humans are dumb. You tell Homo Bureaucratus that this little humming box that costs $300 can store as much data as that huge building that costs $3 million, then you can calculate your next budget cut by subtracting the smaller number from the bigger. What is worse, your subsequent horrified reaction is likely to convince The Man that you are the wrong person for this, the 21st century, and that you need to be replaced by a kid with a computer science degree and no interest in archives.

This second contingency is very dangerous because your average computer scientist is likely to make the following three disastrous assumptions: 1. Old things are bad! Most computer scientists think that punch cards, for pete's sake, are ancient relics, let alone paper files from the pre-digital era. They're going to be pretty unsympathetic to keeping those highly-durable but expensive to store hard-copy originals, nor are they going to weep for the loss of files which become more expedient to delete than keep, nor are they going to push for strong contingency-planning microfilming work. 2. Information is meant to be useful! So the most useful and efficient way to store information is to keep it in the most useful file format. So that means that we should continually update our files to the most up-to-date file format and... well... if we lose 15% of our data every three years because of lossy conversions, then so be it. This, of course, is exactly the opposite of what an archivist would think, which is to say 'always keep a backup of the original and re-convert from that for reading', but archivists are 19th century throw-backs who love old books, remember? 3. The number-one threat to data is a hard-disk crash. Computer programmers know all about data loss. They also know that the best way to prevent it is to take regular tape backups and stick them in the basement. This works right up until you get a Sarajevo or a Baghdad or a Cologne.
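
To put rough numbers on assumption 2, here is a back-of-the-envelope sketch. It takes the 15%-every-three-years figure from above at face value and assumes (my simplification, not a measured result) that each lossy conversion throws away the same fraction of whatever is left. The function names are invented for the example.

```python
def retained_after_chain(loss_per_conversion: float, conversions: int) -> float:
    """Fraction of the data left when each conversion starts from the previous copy."""
    return (1.0 - loss_per_conversion) ** conversions


def retained_from_original(loss_per_conversion: float) -> float:
    """Fraction left when you always re-convert from the untouched original."""
    return 1.0 - loss_per_conversion


years = 30
conversions = years // 3  # one format migration every three years

chained = retained_after_chain(0.15, conversions)
direct = retained_from_original(0.15)
print(f"{chained:.1%} vs {direct:.1%}")  # 19.7% vs 85.0%
```

Under those assumptions, thirty years of chained migrations leave about a fifth of the data, while the archivist's habit of re-converting from the original never does worse than a single conversion.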

The sad part is that these cultural and psychological impediments to digital archives are fairly easy to get around, but archivists themselves are often guilty of working to prevent their circumvention. I'm sorry if I'm going to incur the wrath of sonascope, here, but I have to call this as I see it. Part of the problem is that most archivists I talk to, about this issue, simply don't understand computers, and this is very often a generational thing. I know plenty of senior archivists in their sixties, seventies and even eighties. They didn't grow up with the internet and they think of computers as stand-alone boxes that store things on magnetic disks that you take out and put on your shelf. My impression, though, is that even younger archivists are insufficiently trained in the principles of computer science – not just how to use computers, but the theory which underlies their operation. Many of them come into the field through the humanities, and they very often don't understand why things like the interoperability of databases are important. I know librarians who think that MARC is the last word in technological sophistication.

Part of the problem, though, is the way that archivists are taught to think. Most archivists see their primary role as that of preserving the papers. Everybody knows that the best way to preserve papers is to put them in a very stable environment and just leave them alone. The less they are touched or moved or subject to chemical or biological interference the better. Researchers, then, are sort of like parasites who shift things around and break things and use up the records by reading them.

But when you go to computer archives, almost the exact opposite is true. The best way to preserve digital archives is by constantly reading and copying and comparing and repairing. And those researchers? Instead of parasitic bookworms, they're suddenly busy little worker bees who catalogue and create and improve. It sounds funny, from the outside, but this is exactly what I've heard from a significant number of the archival professionals I talk to. They think I don't know what I'm talking about because I seem to think that 'handling' the documents is a good thing.

Ok, so those are my thoughts on problem b. I'm going to do some actual work, for a bit, then I'll come back to the far more difficult (and to my mind interesting) problem of problem a.
posted by jb at 10:04 AM on March 7, 2009 [14 favorites]


People might be interested to know that Cologne State Archives has launched an initiative (English here) to get people with photographs, copies and microfilm of the documents caught in the collapse to send them in. They can even be uploaded on the archive website itself. There's also contact details for money donations.

The 'send in your copies' plea raised an additional point for me - I know some archives have moved with the times and now allow researchers to take digital snaps of the documents, but I'm pretty sure that quite a lot haven't. Unless things have changed, the last time I was in National Archives of Scotland and National Library of Scotland, you couldn't whip out your digital camera and snap originals. One of the reasons behind this, I'd imagine, is concerns about copyright. Yet it strikes me that this sort of user-generated source of digital images could be hugely helpful for preservation of copies in the event of disaster. Better still would be a massive formal microfilm/digital preservation programme, but until that's in place then surely users should be encouraged to use non-invasive methods of making digital copies which they could share with the archive and with others to help generate more back-ups.
posted by Flitcraft at 2:51 PM on March 7, 2009


By the way, this is an interesting blog post, leading to others, about just one of the manuscripts presumed lost - Cologne MS 106: The Book of Hildebald, which held important material by Bede.
posted by Flitcraft at 2:56 PM on March 7, 2009


Concerns about copyright make my blood boil! Archives are trying to demand copyright for documents that are already in the public domain. What is it in the UK again? I think 100 years after author's death, for manuscript and printed material. I work with 17th century material - and this is certainly all public. But instead I have to sneak around to share historically important but financially worthless images with fellow researchers to save them trouble and money they don't have.

Grrrrr.
posted by jb at 5:53 PM on March 7, 2009 [1 favorite]


Here is the latest batch of text from my husband:

Hi, I'm back with part B of my rambling essay on digital archives which, confusingly, addresses problem a, the problem of file type preservation.

So I think we can all agree that network-resident digital archives circumvent many of the problems that are associated with 'disk on a shelf' digital storage. So let's then say that our children's children's children receive, when they look at our archival documents, an organised list of files, where each file consists of a non-random string of ones and zeros. The challenge, then, is to give them ways to translate these strings back into whatever human-readable format is appropriate.

I'm sure I don't need to explain this on a forum such as this, but just in case, here is the crux of the problem: every single file on a computer is, essentially, encoded. A computer programme turns the stuff you see on a screen, and play out your speakers, into binary mush, and then decodes it again when you want to read or play the file. This is why a set of instructions for reading and writing a file type is called a 'codec': coder/decoder. So if you want to read and write that file again, in the future, you need to store the 'codecs' with the file itself.
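
For the historians in the room, here is a toy illustration of the coder/decoder pairing. It is not any real codec: it is a deliberately tiny run-length scheme I made up for the example, where each run of identical bytes is stored as a (count, byte) pair. The stored bytes only turn back into text if you still have the matching `decode` instructions.

```python
def encode(text: str) -> bytes:
    """Toy coder: store each run of identical bytes as (count, byte)."""
    raw = text.encode("utf-8")
    out = bytearray()
    i = 0
    while i < len(raw):
        run = 1
        # Extend the run while the next byte matches (a count must fit in one byte).
        while i + run < len(raw) and raw[i + run] == raw[i] and run < 255:
            run += 1
        out += bytes([run, raw[i]])
        i += run
    return bytes(out)


def decode(blob: bytes) -> str:
    """Toy decoder: expand each (count, byte) pair back into a run."""
    out = bytearray()
    for k in range(0, len(blob), 2):
        count, value = blob[k], blob[k + 1]
        out += bytes([value]) * count
    return out.decode("utf-8")


stored = encode("aaaabbbcc")
print(stored)          # b'\x04a\x03b\x02c'
print(decode(stored))  # aaaabbbcc
```

Somebody who finds only the six stored bytes, without a description of the scheme, has to guess what they mean. That is the small-scale version of the problem with any file whose codec has been lost.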

This sounds like it should be simple: you've got the instructions, you've got the file, you're good to go. But there are a couple of complications. The first is that many of those codecs belong to people. For example, the codec for 'wmv' digital movie files is owned by Microsoft. They're not just going to let me distribute it in my internet archive unless I pay them a fee, and archives can't afford to pay fees to every single company that might own the rights to a codec they need. Eventually, these legal restrictions will run out as all the patent and copyright restrictions expire. But that's going to take a while, and in the meantime it's hard to run an archive that you can't legally access. And that's assuming that the legal restrictions do expire. Many countries, with the US taking the lead, seem to be marching toward ever lengthening copyright restrictions.

The second problem is that codecs are almost never available in their raw, human readable form, the form known as 'source code'. Instead, we get 'compiled binaries', a kind of file that will be useless to future generations. Again, I apologise to all the readers more knowledgeable in the ways of computation than I am – you can just skip the next paragraph – but I'm addressing my fellow historians here. : ) Codecs, like all computer programmes, have to be written by human beings using an ultra-precise but human-readable computer programming language like C. This is what future researchers will need in order to reconstruct the codec for their own computers. But computers can't read C. In order for the codec to actually work, it has to be translated, using a programme called a 'compiler' into a machine language that works on only one kind of computer (actually operating system, but same dif in the long run, and it's the long run we care about here). So a codec for windows won't work for a mac or a linux box (unless you have a second, special, translator programme called a 'wrapper') and it certainly won't work on the computer of the future.

Before we get to possible solutions to this problem, there is one particular kind of file that deserves to get singled out for special attention: the 'office' document. A huge percentage of the important historical papers being produced at the moment are primarily being produced in the form of 'office' documents: word processor files, spreadsheets, interactive presentations, etc. These files tend to have no formal 'codec' associated with them because they're meant to be read and written directly by a computer programme which, in the future, will long since have stopped existing. There has been a movement, in the world of computer programming, to produce a new set of file types for these document formats which will have the fascinating property of being 'self describing'. That is to say that a person from the future will be able to just look at a file and, using a series of very simple mathematical techniques, reconstruct exactly how it was meant to be read. These document types are called OpenDocument (aka ODF, or 'OASIS' after the consortium that standardised it) and, if you want to, you can go on the internet right now and download a plugin for whatever word processor you use that will allow you to read and write these kinds of files.
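
To give a rough feel for what 'self describing' means in practice, here is a small Python sketch. It assumes the general layout that ODF uses (an ordinary ZIP container holding plain XML files, one of them called `content.xml`), but the XML below is a made-up toy, not a valid ODF document. The idea is that a reader needs no word processor at all, only a ZIP reader and an XML parser, both of which are openly documented.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# A stand-in for a document: a ZIP container holding plain XML.
# This tiny 'content.xml' is an invented illustration, not real ODF markup.
CONTENT = "<document><p>Order book, 1663</p><p>Tithe case, 1691</p></document>"

# 'Archive' side: write the container into memory.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("content.xml", CONTENT)

# 'Future reader' side: open the container, list what is inside,
# and pull the text out of the XML without any office software.
with zipfile.ZipFile(io.BytesIO(buffer.getvalue())) as archive:
    names = archive.namelist()
    root = ET.fromstring(archive.read("content.xml"))

paragraphs = [p.text for p in root.iter("p")]
print(names)
print(paragraphs)
```

A real document carries far more structure (styles, metadata, embedded images), so treat this as a sketch of the principle rather than a recipe for reading actual files.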

Now here's where the dirty politics comes in. A globally recognised arbiter of computer-based standards is the ISO which, some time ago, recognised ODF as an official standard. Once this happened, several governments decided to pass laws saying that all government documents had to be written in a self-documenting format recognised by the ISO, and a big reason for this was that they wanted their documents to be readable in the future. But this pissed off Microsoft, because even though Microsoft Word can read and write ODF documents, so can their competitors. So Microsoft produced their own 'open' standard called OOXML and pretty soon that was recognised by the ISO too, and thus legal for these governments to use. The thing is that opponents of Microsoft allege that a) OOXML is broken and won't be readable in the future (that's the whole point, they say, to stop other companies making programmes that can read and write them and thus compete with MS Office) and b) that Microsoft essentially corrupted the ISO to get the standard passed. Some even say that MS paid for small countries to set up new standards bodies purely for the purpose of passing OOXML. It looks like it could be a nasty case of international intrigue played for high financial stakes. The credibility of the international system for setting the telecommunications standards that bind the world together is under threat. And what is being fought over? That obscure, nerdy intellectual backwater that is digital archive preservation theory.

So do I have solutions? Well... not exactly, but there are a number of things that we could try:

1. Store data, exclusively, using open standards. This has the advantage that you pretty much know you aren't saving data that will be meaningless junk in fifty years' time. But in order to do this you need to convert a lot of stuff to the open standards, and that means you're going to be degrading the information in it. You could store the original alongside the 'readable' copy, but that's an expensive option and difficult to justify to the penny-pinching higher-ups.

2. Continually develop 'wrappers' for closed standards. This is sort of an extension of the network-resident archive. Basically, we continually update our code so that each new computer system will know how to work with old standards by translating old machine instructions into new ones. This is, of course, very difficult and expensive and may, in some cases, be illegal. It also supposes a continuous chain of development. Miss a few update cycles and the knowledge of how to 'wrap' a codec into a new operating system could be lost forever.

3. Preserve whole computer systems, digitally. We have no clear idea of how MS Windows, for example, works, but we do know how the hardware it talks to works. If we could develop an open standard for simulating that hardware, we could just preserve an entire copy of Windows, which could play old Windows files. This is expensive, of course, and it also makes working with the archives much harder (imagine having to emulate a centuries-old computer just to watch a movie file), but it would allow us to archive important cultural phenomena, such as video games, which are currently going to disappear without a trace very soon.
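
The third option is easier to picture with a toy. The Python sketch below invents a three-instruction 'machine' (the instruction names and the `emulate` function are entirely hypothetical) and shows that once the behaviour of the hardware is written down openly, an old 'binary' can be run without anyone understanding the programme itself. Emulating real hardware well enough to boot Windows is, of course, vastly more work than this.

```python
# Toy 'hardware' with three openly documented instructions:
#   ("PUSH", n)  put n on the stack
#   ("ADD",)     replace the top two stack values with their sum
#   ("PRINT",)   record the value on top of the stack as output
def emulate(program):
    """Run a list of instructions on the toy machine and return its output."""
    stack, output = [], []
    for instruction in program:
        op = instruction[0]
        if op == "PUSH":
            stack.append(instruction[1])
        elif op == "ADD":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "PRINT":
            output.append(stack[-1])
        else:
            raise ValueError(f"unknown instruction: {op}")
    return output

# An 'old binary': we need not know what it was for in order to run it.
old_binary = [("PUSH", 2), ("PUSH", 3), ("ADD",), ("PRINT",)]
result = emulate(old_binary)
print(result)
```

The design choice worth noticing is that everything the archive has to keep alive is the description of the machine; the preserved programmes themselves never need to be touched.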

None of those 'solutions' are very attractive, but they're the best I've got for now. I'd be really interested to hear if anybody else had any thoughts about how we could solve these problems.
posted by jb at 6:32 PM on March 7, 2009 [7 favorites]


And much harder than video games, how do you preserve cultural artifacts that span several computers? For instance, think about preserving World of Warcraft, or Second Life. They have a client-server architecture, and continually evolve over the lifetime of the game, both by the players and the creators. The landscape of Eve Online changed overnight from what it once was, all due to player effort. What sort of system could archive that? This hasn't happened much yet, since MMOs are quite long-lived, but Tabula Rasa just shut down. Not terribly culturally relevant, but I'd say something like Ultima Online is.

Maybe we just have to consider MMOs more of a place than a cultural artifact: No one discusses archiving 1950s Chicago.
posted by zabuni at 10:51 PM on March 7, 2009


Unless things have changed, the last time I was in National Archives of Scotland and National Library of Scotland, you couldn't whip out your digital camera and snap originals.

I don't know what the NLS' current policy is, but there seems to have been a sea-change in attitudes over the last year or so. Many libraries now actively encourage readers to take pictures. Not sure if this is for preservation reasons.
posted by GeorgeBickham at 1:00 AM on March 8, 2009


Two quick comments to jb and spouse:

Concerns about copyright make my blood boil! Archives are trying to demand copyright for documents that are already in the public domain. What is it in the UK again? I think 100 years after author's death, for manuscript and printed material. I work with 17th century material, and this is certainly all public.

I'm afraid you are completely mistaken. In the UK, unpublished manuscripts created before 1989 remain in copyright until at least 2039. This applies to manuscripts of any date, which is why (to take a recent cause célèbre) Eric Robinson can claim copyright in the unpublished poems of John Clare, despite the fact that Clare died in 1864.

Part of the problem is that most archivists I talk to, about this issue, simply don't understand computers, and this is very often a generational thing .. My impression, though, is that even younger archivists are insufficiently trained in the principles of computer science

I daresay this is true. However, many archivists are now getting to grips with the problems of collecting and preserving digital archives. The programme for the recent Digital Lives conference at the British Library will give you an idea of some of the work that's going on in this area. See also this article by two curators at the Wellcome Library, on 'Collecting Born Digital Archives'.
posted by verstegan at 10:13 AM on March 8, 2009


I'm afraid you are completely mistaken. In the UK, unpublished manuscripts created before 1989 remain in copyright until at least 2039.

That's what was ringing bells at the back of my mind, Verstegan, but not having been a curator for seven years now, I couldn't quite remember chapter and verse. MSS copyright is a nightmare, and perhaps in the light of tragedies like this it needs to be addressed so as not to prevent important archival material being made secure through multiple copies. Like jb I was once a 17th century historian too and it blows my mind to think of all the manuscript works vital to scholars still having pennies (and often pounds) squeezed out of them and research hindered by copyright.
posted by Flitcraft at 10:31 AM on March 8, 2009


My husband submits these thoughts:

I'm afraid you are completely mistaken. In the UK, unpublished manuscripts created before 1989 remain in copyright until at least 2039.

Yes, it seems that jb was mistaken. I just looked it up, and you're right, and when I messaged jb with this she asked me to express her embarrassment and contrition. However there are an additional couple of wrinkles, here, which deserve consideration. Firstly, the 2039 date only applies to documents whose authors are known. The vast majority of jb's sources are anonymously authored, meaning that copyright expired 70 years after the death of an unnamed clerk 400 years ago. Secondly, even if the source is in copyright, that copyright certainly doesn't belong to the owner of the paper itself, but to the heirs of the authors. The claim that many archives make – that they own the copyright and that you have to pay them to publish – is clearly rather suspect.

many archivists are now getting to grips with the problems of collecting and preserving digital archives.

I am heartened to hear about this, but please don't think me too much of a hardened cynic if I maintain that it's going to take many years for people in the humanities to get to where scientists are right now. I just submitted my dissertation to a major British university in which the history faculty told me, point blank, that they would not provide any computer support to historians. A few months ago I was having dinner with the head of a major manuscript collection and he was kvetching, to me, about the sheer impertinence of this computer person who wanted him to put his catalogue in some pesky standard format so it could be read by other databases for some unaccountable reason. Even my friends who actively care about and support digital archives think of it as mostly a theoretical 'one day maybe' pipe dream. My fear is that several decades from now, after a lengthy two-steps-forward-one-step-back process, we'll end up with a typically clunky and poorly rationalised patchwork of digital archives which will have none of the ease and elegance it should have.
posted by jb at 11:46 AM on March 8, 2009




A few months ago I was having dinner with the head of a major manuscript collection and he was kvetching, to me, about the sheer impertinence of this computer person who wanted him to put his catalogue in some pesky standard format so it could be read by other databases for some unaccountable reason.

What kinds of standards are we talking about? I do find that MSS. catalogues are often rather idiosyncratic as well. But it's hard to assess the reasonableness of this computer person's request in the terms that you have stated it.

There are also more fundamental issues about how one institution interprets a standard than file formats alone. One library's MARC is not another's, and this is not down to the nature of MARC itself, but rather how it is used and interpreted at a local level. Or so I'm told.
posted by GeorgeBickham at 3:22 PM on March 8, 2009


Husband says:

it's hard to assess the reasonableness of this computer person's request in the terms that you have stated it.

That wasn't really the point. The guy didn't have a problem with the technical aspects of the request; he just thought that it was intrinsically obvious that the idea of searching multiple catalogues at once was silly. I really don't want to say too much more about it because it was, obviously, a private conversation with somebody for whom I have enormous professional and personal respect. But for me it was emblematic of the suspicion with which many people in the humanities approach computers as a research tool.
posted by jb at 5:02 PM on March 8, 2009


More news this time about what's been found:

More than 100 books from the medieval chronicles collection have been recovered undamaged so far. Furthermore, more than 200 folders with manuscript fragments (they were fragments before the disaster) were recovered also almost undamaged, while other manuscripts were found wet and needed to be shock frozen so they could be treated and preserved at another location.
posted by Flitcraft at 9:00 PM on March 8, 2009


And much harder than video games, how do you preserve cultural artifacts that span several computers? For instance, think about preserving World of Warcraft, or Second Life. They have a client-server architecture, and continually evolve over the lifetime of the game, both by the players and the creators. The landscape of Eve Online changed overnight from what it once was, all due to player effort. What sort of system could archive that? This hasn't happened much yet, since MMOs are quite long-lived, but Tabula Rasa just shut down. Not terribly culturally relevant, but I'd say something like Ultima Online is.

Maybe we just have to consider MMOs more of a place than a cultural artifact: No one discusses archiving 1950's Chicago.


Here is one of the only remaining collections of Neverwinter Nights "classic" ephemera. Meridian 59 has apparently been recreated, but I haven't looked to see if the old data was preserved.

Have any old MUD/MUSH/MOOs tried to 'archive' their worlds when preparing to shut down operation?

Another set of 'lost virtual settlements' (fumbling for a term) might be the early online services. CompuServe. Delphi. Prodigy. Even AOL (ewwwwww.)

So far as WoW is concerned, it's interesting to note that despite rumored database/storage problems that designers deal with when adding features (every bit they have to store for each character gets multiplied by some stupidly large number) Blizzard doesn't delete inactive characters.

Currently, we have no plans to delete account or character information for World of Warcraft accounts regardless of their activity history. Provided that the characters do not get deleted by the account holder, we will retain all character information on our servers indefinitely. An account may be re-activated by adding new payment information on the Account Management screen (www.worldofwarcraft.com/account), or by contacting us via telephone.


One interesting past scenario for evaluating "what if it dries up and blows away" anxiety about digital culture and archival is the BBS culture. The physical storage devices associated with that era of computing are already pretty hard to read at this point, due to hardware obsolescence. As a result BBS culture is only sparsely archived--I'm talking about message bases, email, etc., not t-files. In fact, the survival of the t-files in many redundant collections is an elegant argument for the efficacy of plain digital text for network-centric redundant archival that exists in the wild.

So far as apocalyptic microfilm recovery fantasies are concerned -- are you supposed to bury a high quality polished lens and laser-engraved-on-stainless instructional comic along with your Autobiography of Ozymandias? Positing a complete collapse of civilization, how do we know whoever finds this stuff will have a clue what it is?
posted by snuffleupagus at 7:20 AM on March 9, 2009


Positing a complete collapse of civilization, how do we know whoever finds this stuff will have a clue what it is?

Well, outside of spy novels, most formats of microfilm that I've used are clearly something to the naked eye. You may not be able to read it, but it's pretty obvious that if you could just get it bigger you could see it fine.
posted by Rock Steady at 8:57 AM on March 9, 2009



Last victim found as Cologne Archives recovery switches to history.
Photo gallery.
posted by Rumple at 6:31 PM on March 16, 2009


I was thinking about Cologne today as I photographed parchment depositions from the 17th century at the British PRO - and I was thinking about how the PRO should encourage me, not forbid me, to share these photographs on the web and with other scholars, thus creating many copies. Or at least take copies of my photos to store themselves. (And also about how they need to redesign the photography stands to take photographs of many of the larger, older documents with very good cameras). I've heard that the Beinecke Library at Yale has been allowing people to photograph - and then using copies of those photos to add to their digital collections. (Is this true, or just a rumour?)

As it is, you can all sleep easy knowing that if the PRO collapsed today, the world would not lose the depositions from late seventeenth century north Cambridgeshire tithe cases. Also, the Bedford Level Corporation order books from 1663 to 1750 are safe ... except for that one page I missed.
posted by jb at 4:56 PM on March 17, 2009


Interesting analysis of possible causes of the collapse.
posted by klausness at 6:42 AM on March 21, 2009


Construction managers and city officials saw measurements that clearly showed Cologne’s Historic Archive was sinking into the excavation work below it weeks before the building collapsed, it has been revealed.
posted by Rumple at 10:33 AM on March 21, 2009




This thread has been archived and is closed to new comments


