Will your grandkids be able to view a .jpg file?
January 9, 2015 5:11 AM

We all know printed photos, properly stored, have an extended shelf life; many of us likely have at least a handful of family photos that are 75+ years old. Will our grandkids be able to read the DVDs they find in the attic, or the thumb drive full of jpg files that had been sitting in a box for 50 years? Will the media even be readable that far in the future? Maybe we should all be printing to paper the photos we really care about.
posted by COD (99 comments total) 36 users marked this as a favorite
 
This is a hobbyhorse of mine! People seem to think that, when they have digitized something, they have given it real permanence. In practice, that is not the case. The physical media decays (CD/RW is especially bad for this), the file formats become obsolete, and so do the peripherals and I/O devices that the media needs. I also don't see any reason to believe that cloud storage will be reliable for a timespan of decades.
posted by thelonius at 5:32 AM on January 9, 2015 [10 favorites]


Basically the only old family photos that I have are in a single photo album that my mother put together when I was a toddler.

As a comparison, we take all of our best digital photos for the year and compile them into commercially printed photo books every year and send them to the kids' grandparents as gifts. We have a whole shelf of these now, and the kids really enjoy going through them. There are also then 'backup copies' in the hands of close relatives in case something should happen to ours. Just on this basis, my kids actually have much more thorough documentation of events in their life than I ever even really had the possibility of having.

The digital photos are kept on the original SD card (cheaper than buying and processing a 36-exposure roll of film back in the day, isn't it?), backed up on a local hard drive, backed up on a NAS in the house, and pushed to cloud storage. Will any of these last forever? No, of course not, but as long as someone is paying a tiny bit of attention to them, they will last a very long time indeed.

There's nothing more or less permanent about digital files vs. paper copies of things - proper curation and care is required for both. Sticking things on a single hard drive and expecting them to last forever is no better or worse than putting them in a box in a leaky attic or a basement that floods periodically. Yes, my grandchildren will be able to read JPG files. Re-engineering the file structure from scratch really isn't even all that difficult. TIF is even less difficult. If the files are taken care of, they will last. If paper is not taken care of, it does not.
posted by grajohnt at 5:38 AM on January 9, 2015 [45 favorites]


The current push is to have larger storage and regularly back it up, though, which would fix this without any issue. Leaving things indefinitely on USB drives or hard drives is not so smart, but storing *anything* in a single format for long periods is not a good idea and it can degrade - paper included.

So if your long term storage is digital, it can be kept current in the same place as your current/everyday storage and the problem.... evaporates, doesn't it?

Also, the idea that something as pervasive as a jpg would fade out of use without huge availability of conversion/emulators to view or convert it is a bit laughable. It's unlikely we'd forget how to open an old file, or not be able to work it out as a tech society. Old file types are a well-known problem, and the only time I haven't been able to open old obsolete files is when they were just so rare that nobody bothered to make a converter. Even my old Psion 5MX backup can be accessed and converted.
posted by Brockles at 5:41 AM on January 9, 2015 [16 favorites]


I think the best use of cloud storage is probably just as a backup. So you keep your photos archived locally, with the cloud as a backup. You move your photos as you upgrade/replace your home IT setup. If the cloud service goes out of business, you can just upload your local stuff to a different service. And so on. The odds of simultaneous local hardware failure and your cloud backup service going out of business in the same week are pretty low.

You can add extra redundancy to this by adding a NAS (maybe with RAID1) on your local network.

And while I religiously shoot my photos in RAW these days, like grajohnt I'm careful to keep a JPEG copy, on the assumption that JPEG isn't likely to be forgotten unless we're all left hunting rabbits with sticks in the dark.
posted by pipeski at 5:42 AM on January 9, 2015 [1 favorite]


I've been making this point to my friends for many years now, not just in regard to photographs but all manner of digital text documents too. Internet zealots will tell you that files can always be converted to whatever new format the future demands and remind you that paper copies degrade too. Both those things are true as far as they go, but the real question is what happens to your files in a situation of benign neglect.

Left to its own devices, a paper document will survive for many decades, perhaps for well over a century, requiring no more than a pair of eyes and a shared language to access it. You have to go out of your way to destroy a document like this by burning or shredding it, which is why neglect tends to operate in paper documents' favour.

Digital documents, on the other hand, require constant updating to make them accessible through each new format that comes along. That relies on someone believing it's commercially worthwhile to do so, and who's going to make that kind of investment in your family photographs? You have to go out of your way to preserve a document like this by repeatedly converting it, which is why neglect tends to operate to digital documents' detriment.

Digitisation is certainly a fantastic way of distributing information, but it's also a very poor way of preserving it. We really ought to be able to hold both those ideas in our heads at once.
posted by Paul Slade at 5:44 AM on January 9, 2015 [13 favorites]


You can't beat baked clay tablets for really long term storage. Getting jpegs onto them is a bitch, though.
posted by Segundus at 5:45 AM on January 9, 2015 [35 favorites]


Digital photos can disappear in the flash of a malfunctioning hard drive. Paper photos can be damaged or destroyed in floods and fires. Don't let your precious moments be subject to the whims of time and entropy! For a reasonable fee my company's artisanal stone carvers will preserve your family's memories for future epochs.
posted by The Card Cheat at 5:47 AM on January 9, 2015 [22 favorites]


I'm confident I could write a jpeg decompresser from scratch using only the publicly available documentation/standards. Imagining that the computer industry might somehow "forget" how to decode jpegs is like imagining we'll forget how to do floating point.

Now those RAW image files particular to your specific model of digital camera that was sold for a few years, then discontinued? Those device-specific formats are far more likely to be lost.
posted by ryanrs at 5:51 AM on January 9, 2015 [14 favorites]
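
To illustrate the point above that JPEG's structure is publicly documented and straightforward to walk, here is a minimal sketch in Python. It is not a decoder: it only steps through the marker segments at the start of a file and reads the image size from the start-of-frame header. The function name `jpeg_dimensions` is made up for this example, and the sketch assumes a well-formed file with no standalone markers before the frame header.

```python
import struct

# Start-of-frame markers carry the image dimensions. 0xC4 (Huffman tables),
# 0xC8 (reserved) and 0xCC (arithmetic conditioning) share the range but are not frames.
SOF_MARKERS = set(range(0xC0, 0xD0)) - {0xC4, 0xC8, 0xCC}


def jpeg_dimensions(data):
    """Return (width, height) by walking the marker segments of a JPEG byte string."""
    if data[:2] != b"\xff\xd8":  # every JPEG begins with the SOI marker
        raise ValueError("not a JPEG: missing SOI marker")
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError(f"expected a marker at offset {pos}")
        marker = data[pos + 1]
        if marker == 0xFF:  # fill byte before a marker, skip it
            pos += 1
            continue
        # Segment length is big-endian and counts its own two bytes.
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        if marker in SOF_MARKERS:
            # Frame header: sample precision (1 byte), height (2), width (2).
            _precision, height, width = struct.unpack(">BHH", data[pos + 4:pos + 9])
            return width, height
        pos += 2 + length  # jump over the marker and the whole segment
    raise ValueError("no start-of-frame segment found")
```

Calling `jpeg_dimensions(open("photo.jpg", "rb").read())` on an ordinary camera file (the filename is a placeholder) should return its pixel dimensions. A full decoder adds Huffman decoding, dequantization and the inverse DCT on top of this, all of which are described in the published standard.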


Left to its own devices, a paper document will survive for many decades, perhaps for well over a century

Have you seen a 5 year old printed-out piece of paper recently? I don't know if it is just non-laser printed stuff, but the quality of hard copies of documents now is nowhere near as good as those created 50 years ago. So what lasted 75 years (when printed with 75 year old print processes) may not work with modern print processes as we have made that process much cheaper and 'just good enough'. So it is not necessarily valid that the same longevity applies just because it did 75 years (or more) ago. Paper quality, ink quality and process have changed significantly.

You have to go out of your way to preserve a document like this by repeatedly converting it
Er. Jpeg is currently over 20 years old and looks solid enough that it will exist for the near future, so 'constantly converting' is likely to be once every 25-30 years. So three times in most people's lifetimes, no? Hardly constant. In addition, file formats are becoming more stable, rather than less, so it's likely to be twice a lifetime at most before long.
posted by Brockles at 5:58 AM on January 9, 2015 [5 favorites]


I'm pretty sure if my brother and sister-in-law suffered a complete technology failure, we could still recreate my niece's entire childhood just based on images we've had printed on mugs, t-shirts and other tchotchkes for her Opa.

In most cases, I'd guess that losing family photographs after many years wouldn't actually be such a horrible loss. Nobody needs to see the 900 selfies your teenager took last Tuesday. A generation from now, one picture of your kid as a baby is going to be plenty for people to look at.

The problem, of course, is that in losing most, most people would lose all, since we encourage people to consolidate their storage in a single place in the cloud or on an external drive or whatever. Storing all your photos in multiple ways is cost prohibitive, and storing some of them in multiple ways requires curating and that's got time and emotional costs.
posted by jacquilynne at 5:58 AM on January 9, 2015 [2 favorites]


I've been bouncing around an idea for a subscription app/service that would be

a) Camera app for phone, every pic you take is saved in the cloud.
b) Simple interface to pick your favorites / best images.
c) Once a year you get an email saying "these are your best pics from this period, we will send you an album".
d) You can optionally go online and change the selected pics.
e) You get a printed album via snail mail with your best pics of the year.

I know that there are a million album printing services, the difference is that this is automatic: sign up, take pics, bam! you get an album every year + cloud backup. Does this sound like something somebody would pay for?
posted by signal at 6:02 AM on January 9, 2015 [16 favorites]


It's true that traditional photographs are vulnerable to water damage, decay from light or heat, and other problems, of course.
posted by thelonius at 6:04 AM on January 9, 2015


Paper is no panacea. Ask anyone who grew up in or near the Australian bush. When I finish a major scanning project I instinctively feel so relieved, because now the words and images aren't at risk of going up in smoke all at once; I can back them up. (I still keep the originals.)

And I haven't even been at risk of bushfire for the past thirteen years - the feeling is so ingrained that I can't shake it. The houses in my city are pretty old and inflammable themselves, though, so it's not totally illogical.

My parents spent a tense week two years ago cut off from home when bushfires swept into the area while they were out of town. Dad thought he was about to lose his life's work. Fortunately the fires spared their town, and now he has offsite backups galore.

I like your distributed family album approach, grajohnt.
posted by rory at 6:11 AM on January 9, 2015 [4 favorites]


The paper-vs-digital discussion has generated a lot of serious-sounding editorials which seem foolish once you realize it's just a question of whether you should expect good results by pretending really hard that something is actually something else.

You lose quality any time you copy analog media. The correct strategy is to use durable materials and avoid generational copies, which we have considerable experience with.

Digital formats tend to use relatively-fragile physical media but perfect copies are cheap and easy to make. The correct strategy is to make multiple copies to avoid depending on any single physical container.

The problem tends to be generational: people who grew up in the pre-digital era tend to focus on the physical object and push for things like allegedly “archival grade” tapes, CDs, etc. which would allow them to treat digital collections in essentially the same way as they've treated the physical collections even though it tends to minimize both the cost savings and benefits.

The specific area of file formats is more complicated because there's a network effect. It's extremely unlikely that you'll ever need to worry about reading a JPEG because there's a published standard and many high-quality implementations. The same is true for an HTML page, PDF, Word doc, etc. What we should be worried about are niche formats (e.g. the Navy is concerned about CAD files because ships have 50+ year design lifetimes) or anything where the manufacturer makes it hard to copy. I would worry about video games, files created on single-vendor devices or the RAW files which ryanrs just mentioned, particularly when the vendor uses something like DRM which would prevent hobbyists from preserving abandoned software.
posted by adamsc at 6:17 AM on January 9, 2015 [10 favorites]
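
The "make multiple copies" strategy described above can be sketched in a few lines of Python. This is only an illustration under assumptions: the function names `sha256_of` and `replicate` are invented here, and the destination directories stand in for whatever external drive, NAS mount or synced cloud folder you actually use. The hash comparison is what shows each digital copy is bit-identical to the original.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large photo or video files need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def replicate(source, destinations):
    """Copy source into every destination directory and verify each copy by hash."""
    source = Path(source)
    expected = sha256_of(source)
    copies = []
    for dest_dir in destinations:
        dest_dir = Path(dest_dir)
        dest_dir.mkdir(parents=True, exist_ok=True)
        target = dest_dir / source.name
        shutil.copy2(source, target)  # copy2 also preserves timestamps
        if sha256_of(target) != expected:
            raise IOError(f"copy at {target} does not match the original")
        copies.append(target)
    return copies
```

For example, `replicate("IMG_0001.jpg", ["/mnt/nas/photos", "/media/usb/photos"])` (all three paths are placeholders) would leave two verified copies, so no single container is the only home of the file.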


We were discussing this article on Twitter yesterday and a professional photographer mentioned that every month she "stars" the photos from the previous month that she wants to print out. Then in January she sends the year's worth of photos worth printing off to a photo printer. So she has printed copies of the best shots from the year, then the usual collection of digital files stored locally and backed up to the cloud.
posted by COD at 6:19 AM on January 9, 2015 [3 favorites]


I'm glad they also mentioned VHS tapes, because we're struggling with that issue right now. It's easy to throw away or "donate" VHS copies of actual movies we either don't want to see again or have newer versions of on DVD or Blu-ray, but what about the VHS tapes of family picnics and weddings, or the compilation tapes of Monty Python or Beatles specials I lovingly put together decades ago? Are the originals worth saving? Are they worth converting to a different storage media? And what happens when that new storage media becomes obsolete?
posted by Curious Artificer at 6:21 AM on January 9, 2015


Doesn't the fact that there are literally billions, trillions, perhaps even quadrillions of JPEG photos out there more or less ensure the survival of the format? I see this article every few years, and it doesn't make logical sense and it's rather lazy writing. Per ryanrs, above, yes, your weirdly specific RAW format for your camera will disappear, possibly with the camera maker, more likely when the format is abandoned by that manufacturer and not open-sourced. But some semblance of JPEG reader will be around for as long as we have computers.

(Same goes for *.zip, *.txt, and *.mpg, by the way. Too much stuff in those formats. Maybe PDFs and Word docs, too.)

As for media? Well, that will certainly change. But as long as you are willing to consolidate to a new location every decade or so, you have a local copy and a remote copy (call it the cloud, if you like, or use your remote server), your grandkids will be able to look at your photos. Though you should certainly pull your mid 90s to mid 2000s files off of those CDs or DVDs. That stuff will rot away in the next 10-15 years.
posted by aureliobuendia at 6:25 AM on January 9, 2015 [2 favorites]


I'm confident I could write a jpeg decompresser from scratch using only the publicly available documentation/standards. Imagining that the computer industry might somehow "forget" how to decode jpegs is like imagining we'll forget how to do floating point.

Yeah I'm generally skeptical of the skeptics on this issue, you might say. If I was going for maximum archival value I might choose an uncompressed raster format, as for audio I'd choose some kind of uncompressed PCM. There are underlying representation techniques in the digital domain which I don't think are much less obvious than the idea of scratching sound waves onto a surface. And certain more complex formats like JPEG are exceedingly well documented over the course of many years.

It's minor or specialized formats that might be a problem, particularly if they are hard to reverse-engineer.
posted by atoxyl at 6:27 AM on January 9, 2015
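
To give a sense of how little there is to an uncompressed raster format, here is a rough Python sketch that writes and reads binary PPM (the "P6" variant of the Netpbm family). PPM is chosen here only as a convenient example of such a format; it is not one the comment above names. The helper names `write_ppm` and `read_ppm` are invented, and the reader handles only the simple header layout that the writer produces (no comment lines, 8 bits per channel).

```python
def write_ppm(path, width, height, pixels):
    """Write a flat, row-by-row list of (r, g, b) tuples as a binary PPM file."""
    if len(pixels) != width * height:
        raise ValueError("pixel count does not match the dimensions")
    with open(path, "wb") as f:
        # The whole header: magic number, dimensions, maximum channel value.
        f.write(f"P6\n{width} {height}\n255\n".encode("ascii"))
        for r, g, b in pixels:
            f.write(bytes((r, g, b)))  # three raw bytes per pixel, nothing else


def read_ppm(path):
    """Read a file produced by write_ppm and return (width, height, pixels)."""
    with open(path, "rb") as f:
        data = f.read()
    # Split off exactly three header lines; the rest is raw pixel data,
    # which may itself contain newline bytes.
    magic, dims, maxval, body = data.split(b"\n", 3)
    if magic != b"P6" or maxval != b"255":
        raise ValueError("unsupported PPM variant")
    width, height = (int(n) for n in dims.split())
    pixels = [tuple(body[i:i + 3]) for i in range(0, width * height * 3, 3)]
    return width, height, pixels
```

The trade-off is size: with no compression, every pixel costs three bytes, which is why a format like this suits archival masters better than everyday sharing.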


All of two FPPs earlier, when writer Tobias Wolff is talking about his personal archives:
And letters. Stacks of them. Mostly typed, some even handwritten, these come to a pretty cold stop in the late nineties, with the advent of e-mail.
posted by ricochet biscuit at 6:29 AM on January 9, 2015


Jpeg is currently over 20 years old and looks solid enough that it will exist for the near future, so 'constantly converting' is likely to be once every 25-30 years.

I don't think people will have to convert their jpegs, like ever. Much like the Roman alphabet, the jpeg will be interpretable on a "longer than civilizations last" timescale. Even if interstellar ramscoops of 5,000 or 10,000 or more years from now don't actually run a descendant of unix like in Deepness in the Sky, I'm confident that jpeg-reading software will be buried somewhere in the bowels of their computer systems. I would not be shocked if they actually use jpeg as a common file format unless they've become posthuman or met enough aliens that a storage scheme keyed on human vision doesn't make sense any more.
posted by ROU_Xenophobe at 6:34 AM on January 9, 2015 [14 favorites]


My dad had a Nikon SLR in the 70's-80's and took all the family photos on slides. Theoretically, preserved forever. Practically, a pain-in-the-ass to actually access. You can't look at them without a slide projector, and converting them to digital even cheaply still would cost hundreds of dollars.
posted by smackfu at 6:36 AM on January 9, 2015 [3 favorites]


You can't beat baked clay tablets for really long term storage. Getting jpegs onto them is a bitch, though.

Note that TFA links to a service which etches text, line drawings, and photos onto metal plates.

I think that one of the things which gets left out of these debates is the proportionality: if you print out a photo you're probably spending more than a hundred times as much money as storing that photo in most digital storage mediums would cost. A single printed photo should be compared to many, many digital copies secreted all over the place in different media... that's one of the primary advantages of storing something as machine-processed data, that it can be copied nearly instantaneously.
posted by XMLicious at 6:40 AM on January 9, 2015 [2 favorites]



Even if interstellar ramscoops of 5,000 or 10,000 or more years from now don't actually run a descendant of unix like in Deepness in the Sky, I'm confident that jpeg-reading software will be buried somewhere in the bowels of their computer systems.

The pornography of today will accompany our scions out among the stars.
posted by XMLicious at 6:41 AM on January 9, 2015 [8 favorites]


I'm writing all my best memories in the form of sonnets, hoping that they catch on in the popular consciousness and perhaps they'll live forever that way.
posted by resurrexit at 6:41 AM on January 9, 2015 [6 favorites]


ROU_Xenophobe: I'm with you on that timescale. The only scenarios I can come up with where we'd lose JPEG are the apocalyptic side where civilization collapses in a highly destructive manner. I guess at that point the analog argument would be right in a way as the last remnants of humanity would no doubt find paper records far more effective for starting fires.
posted by adamsc at 6:43 AM on January 9, 2015 [3 favorites]


JPEG might also fall out of favor if it turns out that computer vision systems can get useful information out of the noisy, high frequency data that JPEG discards. Otherwise yeah, we might keep using JPEG forever.
posted by ryanrs at 6:48 AM on January 9, 2015 [1 favorite]


I think as long as we all make incremental upgrades, we'll be ok. If I put my iMac away today and didn't take it out for 30 years I might have trouble reading the files off it, just like I have no idea how to read that 9-track tape with all my high school COBOL programs, or the 3 1/2 floppy from an original Mac with some Mac Paint drawings on it, or the 5 1/4 floppy with the TRS-80 program I submitted to Rainbow magazine.

But probably the last three times I got a new computer, the new hard drive was large enough that I could copy all my files from the old one and read them one way or another, maybe with the exception of a few AmiPro files from the mid-1990s. Every time I upgrade iPhone or Aperture it converts the old files to a new format. Probably only a matter of time before everything comes with an emulator so you'll just be able to fire up a VM of all your previous computers.

There will never be a time in my life where I don't have a computer of some sort in my house. I'm pretty confident my files will be readable forever, as long as they get some sort of attention every couple of years. They'll certainly be in better shape than all those old, cracked Polaroids my parents stored in a shoe box.
posted by bondcliff at 6:50 AM on January 9, 2015 [1 favorite]


Upping the ante a bit, video is an even harder problem. Even without sound to contend with.

For a Christmas gift I digitized some 60 year old 8mm home movies. The film had been stored like anyone would, in a metal box in a garage. Most of it survived OK but old film shrinks so the film won't play any more on a projector with normal sprockets. The fancy shop I took the film to has some more gentle film advance system that can still drive the film and a software algorithm to find the frame boundaries. Fortunately most of the film was in good shape but some of it had "vinegar syndrome", warping the film so it didn't lay flat. That's very hard to actually fix, but at least the image is still there if just a bit warbly and out of focus.

Anyway, end of the day the shop turned a stack of smelly unwatchable 8mm film into some beautiful 720p MP4 video I can play on any modern device. For about $4 per minute of video. I'm grateful we had the film archive, but 8mm is an awkward medium.
posted by Nelson at 7:12 AM on January 9, 2015 [7 favorites]


As others are saying, the file format may not change, but the way one accesses media certainly will. I'm sure there are whole archives lost because nobody has a working Zip drive anymore. I have a bunch of college work on 256 MB magneto-optical discs, but the only person I know who might have one of those drives to read it would also need a working computer with a SCSI drive, and so on and so on...

There have been herculean efforts to recover old NASA data from tapes, but the average family may not have the same resources to resurrect the images off of grandma's old featurephone in the future.
posted by fifteen schnitzengruben is my limit at 7:13 AM on January 9, 2015 [2 favorites]


My family had a number of 78 RPM records that were already mostly unplayable by the time I was a kid in the 1970s, because every stereo in the house could only be set to 33 or 45 RPM. The one record player that played 78s was the flaky one built into its own suitcase that (by default, I guess) fell into the possession of the youngest kid (me).

At least I grew up with a healthy appreciation for polka because of it.
posted by fedward at 7:29 AM on January 9, 2015 [9 favorites]


Simple solution:

1) Always include nudity or something embarrassing in every single photo you take.
2) Post every photo online somewhere.

Your photos will never go away.
posted by orme at 7:30 AM on January 9, 2015 [12 favorites]


Needless to say, all you people with stuff on zip disks and mo cartridges really ought to have converted to a new format a decade ago. You can still do it now if you are willing to do a bit of googling and pay someone a small amount of money.
posted by ryanrs at 7:31 AM on January 9, 2015 [3 favorites]


We are in the process of converting a shoebox of my family's recorded phone conversations/audio letters, on 35-year-old cassette tapes, to digital. The husband thankfully has an audio studio setup so he can do what can be done with the degraded stuff.

But at the same time, I wonder; is it worth much in the timescale of history to preserve my adorable lisping 5-year-old Texas accent as I tell my dad about my lost tooth? I mean, it's cute, but once I'm gone, or once my son is gone, who is going to care?*

We're talking as if saving all of these images, videos, audio files is something we must do, but how much of it will be useful in 100, 200, 500 years? I don't personally care if it ends up in the human hive mind's vast data cloud in the year 3000, you understand, I'm just not sure I should freak out if some of it gets lost.

*We have already made the decision to not keep copies of the tapes where my parents were doing sexy talk to each other. They're both dead now, but it still feels wrong to listen to it.
posted by emjaybee at 7:43 AM on January 9, 2015 [3 favorites]


The problem with an article like that isn't that our generation's grandchildren won't be able to read arbitrary JPEGs (well documented format, blah blah), it's that they won't be able to read our JPEGs, because the physical media will be inaccessible and/or corrupt by then. Nobody has really figured out long term digital storage, and this isn't new. So you can save all the SD cards and hard disks, but who's going to have a reader or a computer with the right physical interface for the media? Who even has a SCSI or PATA interface for drives that were current just a few short years ago? And if the grandkids do manage to plug it in, what's the likelihood the media will be readable?

What's ironic about this article is the target audience is somebody like my mom who responds to fear. But mom had such distrust in computers that for a few years she printed nearly every photo of her grandchildren that she received. She thought by printing she was preserving those memories. She stopped printing them all after a few years, simply because by that point she could tell that her old printouts were fading so quickly, and so badly, that they were even less permanent than the files on her computer.

"Well, mom, that's the problem with inkjets. They're not generally meant for long term storage. Well, there are archival inkjet printers, but they're more expensive, and they're harder to use, and they can be weird about color …"
posted by fedward at 7:46 AM on January 9, 2015 [2 favorites]


I inherited a bunch of daguerreotype and tintype family pictures from the civil war era; they were the only copies that existed. So I am in the process of scanning them in, re-printing on larger media, and eventually I will send photo books to family members. Many of these family members have no idea the pictures exist.

Digital is not the only solution; print is not the only solution; one thing that does help a lot, though, is not having the single copy of something important on any one form of media. This doesn't just go for pictures, it also goes for your novel, your doctoral thesis, the live recording of that band you like, etc. Another key to preserving information is sharing the info, making people know that the information exists, so more people place value on keeping it around. Then future generations will take on the task of converting the media if they think it's important to preserve (most probably isn't).
posted by tempestuoso at 7:55 AM on January 9, 2015 [4 favorites]


emjaybee: the value of the media we're archiving is something I think about too. I mentioned it cost $4 / minute to convert my 8mm film to digital video. I didn't mention that there was 200+ minutes of film, and 90% of it is boring. Dad pointing his camera at the reservoir, holding it there for 15 seconds, like a really crappy jittering photograph of a boring lake. Fortunately Dad also turned the camera on his son for the last few feet of film he had left over before processing, and since the whole thing was a gift for the son, it was worth the trouble.

One problem with digital photography is it makes it very easy to take and keep many, many bad photos. That's where selection and editing is so valuable. Maybe we should spend less time worrying about how to preserve a whole stack of photos and more time worrying about how to make and preserve a well-edited scrapbook. My Flickr archive is a bit like that, although given the shenanigans over there I have some concern about the longevity of that. And it's remarkably difficult to preserve Flickr metadata, and nearly impossible to reuse it.
posted by Nelson at 7:56 AM on January 9, 2015 [2 favorites]


I take pictures as RAW, save them off as TIFF, then use JPG to distribute copies. A couple of times a year I print the important ones and store them in archive quality binders (acid free paper, etc).

I am concerned about the longevity of both laser toner and inkjet inks but I haven't come to a reasonable solution to that yet. It may be necessary to find a way to transfer them as negatives to film at some point, but the jury is out.
posted by disclaimer at 8:01 AM on January 9, 2015


fedward: the logical error is the assumption that both digital and analog media should have the same long-term storage solution. Long-term bit-level storage has been a solved problem for many years – make lots of copies rather than trying to make a single container which will last long enough that your grandkids can hold something which you touched.

At a normal person's scale, this isn't even particularly hard or expensive – for many people, simply using iCloud/Google Drive/OneDrive/Dropbox is sufficient. I personally like the approach which Crashplan has where you install the free backup client once and use any combination of external drives, other computers anywhere on the internet or their cloud servers, since that means that you can setup backups once and have reliable protection against hardware failures, natural disasters and even their business failing.
posted by adamsc at 8:13 AM on January 9, 2015


The point supra about RAW formats being more vulnerable than JPG (e.g.) is well taken, but I wonder if it'll prove out. I'm too new a shooter to know, but does Adobe keep support in Photoshop and Lightroom for ancient RAW formats? Like, if I had an original and un-upgraded 5D, would Lightroom be able to handle my RAW files?

Even so, outputting to a lingua franca for long-term archiving is probably the right call, and I think at this point we can assume that JPG will outlive us all. It's too common, and there are too many files people want in that format.

That doesn't absolve you from deciding what a long-term format is, or what a long-term storage medium is. For me, it's become all about keeping everything I ever might want spinning on my network all the time, forever. Space is cheap, and out-of-sight (i.e., offline and in a drawer on a flash-in-the-pan storage medium) is out-of-mind in the worst way.
posted by uberchet at 8:16 AM on January 9, 2015


Now those RAW image files particular to your specific model of digital camera that was sold for a few years, then discontinued? Those device-specific formats are far more likely to be lost.

This is why my camera RAW files all get converted to DNG RAW files.
posted by MrBobaFett at 8:17 AM on January 9, 2015 [2 favorites]


adamsc: I dispute the claim that long term digital storage is a solved problem. The current state of the art isn't so much "make a copy" or "make a number of copies" it's "keep making copies at regular intervals, endlessly." It works on a short scale, but I'm not even sure that Dropbox, Crashplan, or the like even work at 20 year scale, much less 200 year scale.

Also who's going to keep making those copies or paying for those services after we're dead? Who's even going to know that the stuff is there needing to be copied (or paid for) in order to stay fresh? At least when your grandparents kick it you find the box of photos when you clean out their house, but it's not like they leave you their 1password master password in the will along with a digital inventory.
posted by fedward at 8:26 AM on January 9, 2015 [3 favorites]


It's been a while since I worked in a library, but I seem to recall that this is also a huge challenge for them. Microfilm and microfiche held up pretty well, but databases on proprietary CDs that lived in a specialized hardware tower were a different thing. We have a long history of conserving books and magazines; digital formats and the hardware to interpret them are a newer field. It's also difficult to determine whether the new hotness will have enough longevity to warrant investment.
posted by fifteen schnitzengruben is my limit at 8:29 AM on January 9, 2015 [2 favorites]


I don't know that the future really needs our gazillions of jpegs. Whatever small fraction of them that survives will still be vastly greater than all of the combined archives of pre-20th century intellectual and artistic output.
posted by xigxag at 8:31 AM on January 9, 2015 [7 favorites]


The complexity of long term archival storage of digital media is yet another reason not to have kids added to my collection. Thanks!

Now how do I store that collection?
posted by srboisvert at 8:33 AM on January 9, 2015


Stored unused in a closet or attic, the mechanical parts in a hard drive can break down over the course of a year or two. At most, hard drives are built to last around five to seven years, Miller said.

What? I have hard drives that are over 30 years old stored in an outside garage (0 degree winters and 100 degree summers) and they worked when I turned them on for the first time in forever. Not to say this is the best way to do things, but in my experience hard drives stored offline have been extremely reliable (though no guarantees). Further, all of the data I created in the 1980s fits in a small corner of a single hard drive today.. and all the data I create this decade will fit easily in a single drive of the future.
posted by stbalbach at 8:38 AM on January 9, 2015


I want my family pictures to survive the heat death of the universe. Can the cloud make that happen?
posted by tempestuoso at 8:39 AM on January 9, 2015 [9 favorites]


uberchet:
but does Adobe keep support in Photoshop and Lightroom for ancient RAW formats? Like, if I had an original and un-upgraded 5D, would Lightroom be able to handle my RAW files?
Lightroom uses the Camera Raw plugin and does support the original 5D and quite a bit of stuff. Still I would convert that stuff into DNG, which is an open format Adobe created in 2004.
posted by thewalledcity at 9:09 AM on January 9, 2015 [1 favorite]


It's been a while since I worked in a library, but I seem to recall that this is also a huge challenge for them.

It's a huge challenge for anyone maintaining a collection, museums too. The standards in museums call for migrating data and creating redundancy on a regular basis, but most museums aren't staffed for that. There is already a lot of information locked in early digitization systems that is functionally unreadable in many museums. Most museums have retained their old card catalog registration materials as backup, and they are great, but at each place entry went born-digital at some point so card records tend to come to a complete stop in the 80s or 90s, meaning there tends to be a gap where early digitization came in, only to see the institution move on from it because it was shitty, or to see proprietary software companies stop maintaining the software that supported the digital records.
posted by Miko at 9:10 AM on January 9, 2015 [2 favorites]


fedward: calling bit-level preservation a solved problem is not the same as “free” or “you never need to think about it”, only that there are simple strategies which, if followed, are known to keep working. Yes, you might need to pay for something – just as you need to pay for space to store books, photos, tapes, etc. At a low scale, that could be as simple as sharing a family photo archive with multiple family members using one of a number of options so that multiple people have full copies. At an institutional scale, that could be joining a preservation system like LOCKSS or uploading to a shared resource (e.g. the Internet Archive or your local historical society, etc. depending on the collection).

All of this would involve at least some thought into risk management but, again, the main reason this seems like a noteworthy challenge is the addition of “on a computer” – sharing a backup password with a relative is very similar to ensuring that there's more than one copy of a safety-deposit box key, safe password, etc. If it isn't already, in a decade or two having a shared family dropbox is going to be as unremarkable as it is for e.g. a 70-year-old relative to make sure their kids know where they keep their photo albums and paperwork.
posted by adamsc at 9:11 AM on January 9, 2015


Isn't cloud storage going to make a lot of this worrying irrelevant? Or at least shift the burden of responsibility to hosting companies like Google?
posted by echocollate at 9:15 AM on January 9, 2015


adamsc: I don't think we're disagreeing on the facts, but we're disagreeing on whether the current state of things is good enough. Rolling the contents of every previous hard drive onto every new, larger hard drive isn't hard, but it's a pain in the ass, and I say that as a person with a banker's box of old hard drives in the basement and fifteen years of digital photos (not to mention some old PhotoCDs).

A photographic process print can be ignored for decades and still "work," in that when you pull it out of the shoebox or whatever it still presents some percentage* of the original data in a way you can make sense of just by looking at it. You literally cannot ignore digital data for decades with anywhere close to the same success rate.

Until digital data gets to the point I can largely ignore it for more than the lifespan of a single hard drive, archiving it isn't solved in the same way photographic process prints are.

* Discounted for color photos from the 60s and 70s that have all turned yellow, but which at least have useful luminance data, and not counting inkjet prints made with dye-based inks at all. B&W photo prints seem best of all, followed by color photo process prints and "archival" inkjet prints which haven't yet stood the test of time, but I digress.
posted by fedward at 9:26 AM on January 9, 2015


Cloud storage + SSD drives are going to make this issue less of a problem going forward. Now I need to go get some files off of my Jaz drive...
posted by jnnla at 9:28 AM on January 9, 2015


There's kind of an analogy here with recorded sound and movies. If all you have is the physical thing (a phonograph record, cassette tape or film reel), you can't hear the music or see the movie. You need the gadget that reproduces what's encoded on those physical things. We've been recording sound and moving pictures for over a century now, and we can still retrieve and reproduce the recorded sound and pictures from any of the original media, though usually not with the original equipment. And we are now re-storing and re-producing all that old stuff in digital formats. I think the same thing will happen over time with digitally-stored photos (and sound, and video): even though new storage methods and formats will develop, the old stuff will continue to be readable and transferable to new formats.
posted by beagle at 9:40 AM on January 9, 2015 [1 favorite]


fedward: I think what we disagree on is mostly the average experience. It's easy to focus on the things which worked but ignore how many people have photographs which aren't in good condition or were completely lost due to time (cheap prints), environmental damage (e.g. simply the humidity in many places takes decades off of the theoretical max lifespan), or accidents of varying levels. My grandfather has negatives which he took in the 70s which are still in good condition but he also has Fortran code from the same era which still runs, too, so I have to remind myself that he's more of an outlier than the other relatives who've lost the majority of the photographs they ever took.

I would argue that we're past the point where there's a significant difference for the average person. One of the interesting aspects is that digital data is easy to handle in aggregate without being aware of every file. Ignoring the cloud angle for the moment, while it's true that you can't ignore the lifespan of a drive the common path for most people is to use the built-in transfer feature when they buy a new computer, which quickly produces a full copy of everything on the old system including files which probably haven't been looked at in years and may even have been forgotten. There's not really a physical equivalent to that and it makes the lifespan of the drive less relevant for any system which is actively used.
posted by adamsc at 10:22 AM on January 9, 2015


Cue Roy Batty holding a fistful of floppy discs and exclaiming, "All those jpegs will be lost in time ... like tears in the rain." Except in Blade Runner they seem to have gone back to paper, or something like it.
posted by lagomorphius at 10:34 AM on January 9, 2015


This reminds me of a plotline in Robert Reed's Great Ship series. One character is able to confirm that she had been visited as a child, centuries earlier, by a particular person by paying a data historian to trawl through all available images from the time of the encounter. They found footage from someone's wearable camera (without audio) that had been uploaded and floated around for all that time.

I've no idea what our data recovery capabilities will be in the future (I tend to side with the hardware being a greater, or at least more immediate, problem than software). This thread does remind me to try to get in the habit of going through my images and identifying a handful of quality ones so I have a smaller amount of data to worry about shepherding through the coming decades (perhaps making archival hard copies as some have suggested).
posted by audi alteram partem at 10:48 AM on January 9, 2015


LOCKSS - Lots of Copies Keep Stuff Safe.
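The "lots of copies" idea can be sketched in a few lines. This is only an illustration of comparing copies and trusting the majority, not the actual LOCKSS protocol; the function name `majority_copy` is made up for the example, and it assumes the copies fit in memory.

```python
import hashlib
from collections import Counter
from typing import List, Optional


def majority_copy(copies: List[bytes]) -> Optional[bytes]:
    """Return the content a strict majority of copies agree on, else None."""
    if not copies:
        return None
    # Hash each copy so that comparison is cheap even for large files.
    digests = [hashlib.sha256(c).hexdigest() for c in copies]
    winner, count = Counter(digests).most_common(1)[0]
    # Require more than half of the copies to agree before trusting the result.
    if count * 2 <= len(copies):
        return None
    return copies[digests.index(winner)]
```

With three copies, one damaged copy can be outvoted and replaced; with only two, a mismatch tells you something is wrong but not which copy to trust.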
posted by ikahime at 10:48 AM on January 9, 2015 [3 favorites]


In addition, the learning curve for preserving paper and preserving digital media is radically different.
posted by ikahime at 10:49 AM on January 9, 2015


what about the VHS tapes of family picnics and weddings, or the compilation tapes of Monty Python or Beatles specials I lovingly put together decades ago? Are the originals worth saving? Are they worth converting to a different storage media?

If you've got stuff you care about on VHS, it's absolutely worth converting to a digital medium.
posted by flabdablet at 11:05 AM on January 9, 2015


I'm using the multiple digital copies strategy. I'm not worried about being able to open jpgs and tiffs. After all, MS Word still opens my 1989 WordPerfect files and jpg is way more prevalent than wp5 or wpd ever were. I can even download a converter from MS that will open my older WordStar files in Word. If there is enough demand, a solution will be available. (Of course if I hadn't transferred those docs off of floppy disks years ago, I'd have a different problem. Not an insurmountable one, but a problem.)
posted by JParker at 11:14 AM on January 9, 2015


Yeah, archival prints are expensive and difficult. Unless we're headed toward a civilization collapse, you're OK if you have a ton of JPEGs running around. There are companies who make good money selling little rackmount servers pretending to be 1950's era mainframes, running 1950's era mainframe software and talking to 1950's era control systems. There will be money in showing you your kid's baby pictures, so the software is always going to be available, believe it.

On the other hand, I'm going to be doing a backup of all our family photos on an archival blu-ray twice a year, and leave one copy at my parents' house. Trust not the cloud.
posted by Slap*Happy at 11:32 AM on January 9, 2015


I don't know what this noise about "obsolete formats" is. Except for purpose-built encoding schemes for one-off uses, there aren't any. Once the codec libraries are out there they're out there. It's trivial to include TIFF support in your latest image viewer even though the standard is over 20 years old.

When it comes to consumer-class media the electronic formats will be good for as long as our species is capable of understanding electronics. There are people like you and me, at this very moment, playing around with wire recording.
posted by clarknova at 12:04 PM on January 9, 2015 [1 favorite]


2014 was a miserable year for me, with the death of close family members. One of the most heartbreaking and stressful difficulties to overcome is the incredible abundance of pictures and films: 8mm movies, black and white, sepia, color, faded color, wallet size to 16x20, forgotten people oil portraits, fair grounds caricatures...You name it, I have it.

It is my New Year's resolution to tackle the unyielding mess, twenty minutes per day. So far I have shipped out the 8mm films and the slides to be rendered into digital format, and I have started to scan the dozens and dozens of boxes of pictures and rotted picture albums. I'm getting to be a very stern judge of artistic or historical value of a picture: the sheer volume has made me cold hearted. I'm keeping some in labeled archival boxes, I'm backing up as I do, and also storing all of it in cloud storage.

Hopefully my children will be able to access the pictures, but.. start chucking and shredding, Mefites.
posted by francesca too at 12:07 PM on January 9, 2015 [3 favorites]


Now I need to go get some files off of my Jaz drive...

I have a dozen 1GB Jaz carts, which cost me over $100 each. I have maybe ten 40MB Syquest carts (~$50-75) and maybe five 20MB Bernoulli carts. And then, there's that box of maybe 400 1.44MB floppies. I have all the hardware necessary to migrate this data, but I never have fired it all up and got it working all at once.

But that isn't my worst headache. I have the entire remaining family photo collection from my Mom's estate. I am supposed to distribute it fairly to my 5 siblings. I don't know how I became arbiter of this. I thought I'd scan them all and make copies for everyone, but there are hundreds of them, it would take me years. And some prints would require special handling since they are fragile, or poorly exposed and need careful scanning. This was not what I envisioned doing with my life, when I took classes in archival photography and conservation.

There is absolutely zero chance that future devices will be unable to read or display jpegs, or any other conventional format in use today. Of course people will have to migrate digital data to new platforms. Making prints of precious photos is not a viable solution. Modern inkjet and other conventional digital printing processes are absolutely not archival. If you want archival, you need to use a photographic silver process, or some other known-archival process like platinum or carbon printing. Archival color printing is still tricky, but antiquated dye-transfer and other processes like gum bichromate (my personal expertise) have well known archival properties. These prints will last longer than the paper they're printed on (that's a major archivality issue in itself).

There's an old quote from Linus Torvalds, "Only wimps use tape backup: real men just upload their important stuff on ftp, and let the rest of the world mirror it." If your data is important enough, people will expend considerable effort to archive it. Perhaps it is best if most peoples' personal photos are lost over time.
posted by charlie don't surf at 12:09 PM on January 9, 2015 [2 favorites]


clarknova: I don't know what this noise about "obsolete formats" is. Except for purpose-built encoding schemes for one-off uses, there aren't any. Once the codec libraries are out there they're out there.

There are definitely obsolete formats, but you've hit the nail on the head, I think. It's not terribly likely that anything well-supported by multiple widely-used products is going to disappear in the next 20 years. That being said, I definitely have files that I would be surprised to be able to read in 20 years: .doc/.docx files are actually a bit suspect, since minor compatibility problems seem to be common now, for instance, and I have a handful of odd software programs I use that have odd proprietary formats I wouldn't expect to last that long. That being said, JPEG/TIFF/PDF/h264/MP3/AAC and all of the like aren't going anywhere anytime soon.

Side note: for those preserving RAWs as DNGs, be aware that software likely needs more data about your camera (and particularly its sensor) than is contained in the DNG file itself to process the pictures, at least as accurately as current software. Keeping a JPEG copy wouldn't be a bad idea, just in case.
posted by thegears at 12:21 PM on January 9, 2015


Supposedly the gold discs will last hundreds of years. Tif is better than jpeg for storage, bigger but not compressed, so durable. I know from costly personal experience to use archival ink and paper for prints you want to last even five years. I have thirty, forty year old cibachromes, still good.

The truth is in the interest. Kids build the history they want to take with them, parents are left with the fifty pounds of family photos that lie like a time bomb for surprise catharsis every decade or so. The mirror lets you know personally that documenting your inevitable decline is of questionable value.
posted by Oyéah at 12:34 PM on January 9, 2015


Still, pictures like Jack Ruby assassinating Lee Harvey Oswald, need to stay sharp.

Archival printing at home is not that expensive anymore. You have to be firm with printers regarding archival ink and paper. I had a good shop swear to me dye ink was just as good, or at least good for a 25 year run. No, the prints didn't even last three years before the reds and yellows were gone.
posted by Oyéah at 12:41 PM on January 9, 2015


> I don't know what this noise about "obsolete formats" is. Except for purpose-built encoding
> schemes for one-off uses, there aren't any.

Digital isn't the only kind of electronic. I have an open reel of, ahem, media, labeled "Samuel J. Fuller, 1947-1948." If there's any detectable signal left on this stuff it's a recording of my grandfather's voice. The stuff on the reel is hair-thin wire. I expect there are still a few services able to transfer audio from wire to more recent media for $ but any money I had to devote to such transfer would go first to recovering my dad's reels of 16mm home movies from the 1950s and 8mm stuff from the army (North Africa and Italy, 1942-1944.) It would be neat to play back the wire if it were something I could do, but it doesn't seem to be. As it is, I might as well use it as dental floss.

> They'll certainly be in better shape than all those old, cracked Polaroids my parents stored in a shoe box.

Can't help thinking about those old cracked 140k single-sided single-density Apple ][ floppies I have in a shoebox. (Converse basketball shoes. High tops.)
posted by jfuller at 4:14 PM on January 9, 2015


N.b total derail, but goog search for "wire recording" turned up Tom Lehrer's very first recorded performance. 1951, on wire.
posted by jfuller at 4:23 PM on January 9, 2015


Left to its own devices, a paper document will survive for many decades, perhaps for well over a century
Have you seen a 5 year old printed-out piece of paper recently? I don't know if it is just non-laser printed stuff, but the quality of hard copies of documents now is nowhere near as good as those created 50 years ago...Paper quality, ink quality and process have changed significantly.

Modern paper, at least the common A4 size 'copy paper' you can buy at the supermarket for under $5 a ream, starts to yellow and deteriorate almost immediately. I see lots of people buying into the argument that paper will last forever and digital content will vanish (often after a failure to back something up ended in its loss) that are setting themselves up for massive disappointment in the future.

I'm sure there are whole archives lost because nobody has a working Zip drive anymore. I have a bunch of college work on 256 MB magneto-optical discs, but the only person I know who might have one of those drives to read it would also need a working computer with a SCSI drive, and so on and so on...
One of the things that librarians do (at a state or higher level at least, perhaps not your local library) that many people aren't aware of is managing and maintaining stocks of hardware that can read pretty much every digital storage media ever made. Tucked away in dark rooms, maintained in working order, are multiple devices for all the weird and wonderful storage mechanisms ever made. If the digital records are important enough, there will always be a librarian somewhere that can retrieve them. Assuming we always have librarians, of course, but who would want to live in a world without them?
posted by dg at 4:27 PM on January 9, 2015 [3 favorites]


If you want a current physical storage medium that is likely to be recoverable in 200 years, like the cassette tapes in The Handmaid's Tale, your best bet is probably SD cards because these all support a very simple SPI transfer protocol which can be implemented by bit-banging with very simple microcontrollers. The full speed transfer protocol and things like USB are unlikely to be duplicable without specialized hardware and documentation that's not likely to be available, but instructions for reading SD cards with Arduinos, Propellers, and similar hardware are everywhere, so it's likely the data archaeologists would run across something that would help build a reader.

SD cards also have no moving parts. Flash isn't a perfect storage medium but will probably hold up as well as photographic processes over time, especially in a cool location.

And of course you don't need a precision optical system as you do for CD's and DVD's, assuming those survive at all. You can theoretically read a CD by taking a really hi-rez picture of it, but that still assumes you know how the data are encoded, and there are a lot more sources for SD SPI implemented by so many hobbyists than for the multiple optical data formats implemented mostly by corporations.

As to the actual data, JPEGs will probably be readable again because everyone who's ever been curious has written a decompressor and it's an open standard. It's likely that code fragments will survive somewhere. BMP would be better (both much simpler and for surviving bit rot) but drastically reduces your capacity.
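To give a sense of how much simpler BMP is, here is a minimal sketch of a decoder for the most basic variant only: an uncompressed 24-bit file with the common 40-byte info header. The function name `read_bmp24` is made up for the example, and palettes, other bit depths and compressed variants are deliberately not handled.

```python
import struct
from typing import List, Tuple

Pixel = Tuple[int, int, int]


def read_bmp24(data: bytes) -> List[List[Pixel]]:
    """Decode an uncompressed 24-bit BMP into rows of (r, g, b), top row first."""
    # 14-byte file header: magic, file size, two reserved fields, pixel data offset.
    magic, _size, _r1, _r2, offset = struct.unpack_from("<2sIHHI", data, 0)
    if magic != b"BM":
        raise ValueError("not a BMP file")
    # First 20 bytes of the info header: size, width, height, planes, bpp, compression.
    _hdr, width, height, _planes, bpp, compression = struct.unpack_from(
        "<IiiHHI", data, 14
    )
    if bpp != 24 or compression != 0:
        raise ValueError("this sketch handles only uncompressed 24-bit BMPs")
    top_down = height < 0  # a negative height means rows are stored top first
    height = abs(height)
    stride = (width * 3 + 3) & ~3  # each row is padded to a multiple of 4 bytes
    rows = []
    for y in range(height):
        start = offset + y * stride
        row = []
        for x in range(width):
            # Pixels are stored in blue, green, red order.
            b, g, r = data[start + 3 * x : start + 3 * x + 3]
            row.append((r, g, b))
        rows.append(row)
    if not top_down:
        rows.reverse()  # the usual layout stores the bottom row first
    return rows
```

There is no entropy coding or transform step here, which is also why one flipped bit damages one pixel rather than the rest of the image.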

Movies are much harder. Audio and video formats are constantly evolving and most of them are "containers" capable of holding data in many subformats, a nightmare for a data archaeologist. AVI files consisting of JPEG frames aren't too bad, but I would be doubtful of any audio format other than WAV being decipherable in the distant future.

Otherwise, we have nothing which dependably approaches the longevity and accessibility of photographs, which can be pulled from a shelf and observed without effort after a century, or books which (if not made of acid-based paper) can survive for multiple centuries. Both the hardware and data are moving targets and multiple entities, such as NASA with their moon picture tapes and Pixar having to make the Toy Story DVD from a film print because they lost the files, have been caught with their pants down by obsolescence or media degradation.
posted by localroger at 4:44 PM on January 9, 2015 [1 favorite]


Have you seen a 5 year old printed-out piece of paper recently?

Modern print processes are to traditional ones as digital processes are to print. Inks are formulated for inkjet performance rather than durability, and the carbon particles that make India Ink so durable don't work too well in nozzles. Laser print is relatively permanent, but unlike ink sits on the surface of the paper instead of soaking in. In a bound book the thermal glue which fuses the toner particles to the page will gradually glue the pages together creating a mess of merged facing images when they are pried apart. I've had less trouble with this in POD books than in pages from regular copiers and laser printers, but I still wouldn't be sure of their archival longevity.

And all paper other than pricey high-rag archival paper is very highly acidic nowadays. It's very striking to pick up a reference book printed in the 1930's which was on high rag low acid onionskin and is still parchment white and the pages look like they did the day it was printed, versus a five year old novel whose pages have started to turn brittle and yellow.
posted by localroger at 5:17 PM on January 9, 2015 [1 favorite]


I expect there are still a few services able to transfer audio from wire to more recent media for $ but any money I had to devote to such transfer would go first to ...

If you live in a city large enough to have a hackerspace you might want to inquire as to whether anyone wants to take on the challenge of hacking a reader, because a lot of people would jump at it for the chance to brag on having done it. Technical info about wire recording isn't hard to come by, and the big technical trick had more to do with writing (AC bias!) than reading. Wire is also extremely durable; there's no danger of wearing the oxide off with a poorly executed attempt as there is with old tape.
posted by localroger at 5:43 PM on January 9, 2015 [1 favorite]


There are companies who make good money selling little rackmount servers pretending to be 1950's era mainframes, running 1950's era mainframe software and talking to 1950's era control systems.

This sounds fascinating. Got any links?

My first job, in the mid-90s, was with a credit card processing company. I worked in the tape library – a big room filled with tens of thousands of 3480 cartridges on sliding shelves, and a bank of enormous drives into which they were inserted. We also had a couple thousand reel-to-reel tapes, and the corresponding drives. Even in the 90s, it was surprisingly archaic.

In the center of the tape library, there was a rack of about a dozen dumb terminals. The terminals displayed a steady stream of messages like this: "A35428 E07". That meant we had to run back into the aisles, find cartridge number A35428 on the rack, carry it over to drive E07, insert it, and return the cartridge that had been in drive E07 to its slot on the shelves. We worked twelve-hour shifts in four-man crews, mounting a few thousand cartridges per shift.

Also, every day we got a dot-matrix printout of about two thousand cartridge numbers. These had to be picked from the shelves, loaded into big wheeled carts, and given to a dude with a bushy white mustache to be taken offsite for storage, in case our building burned down or something.

There were slow periods when we could put our feet up and shoot the shit, but it also got very busy at times. Sometimes the cartridge we needed wasn't where it should be, and then we had to search every shelf, rack, nook, and cranny until we found it. Reel-to-reel tapes sometimes required on-the-spot splice jobs. And the tape drives broke down constantly; the repair guy was there more often than not. Apparently the cost of constant repairs was still cheaper than buying new equipment.

The data center (a separate area) filled an entire floor of the building, and contained a bunch of Amdahl mainframes which had to date from the 70s or 80s.

I eventually got promoted to Tape Librarian, which meant that I sat at a desk doing slightly more administrative stuff, instead of running my legs ragged mounting tapes. When the tape monkeys pissed me off, I'd log onto the system and run a pointless job that would force them to mount a few hundred tapes (they had no way of knowing I was behind it). I eventually got fired (partly) because they found a game I'd written in QBasic on my work PC, titled "Escape From [My Company's Name]" and prominently featuring my bosses as antagonists. I regret nothing, and still count that firing as one of the best things that's ever happened to me.

Also I wore onions on my belt. The point is, once a data format has achieved a certain level of popularity – especially once it's become the standard in one domain or another – it can endure for a surprisingly long time. I wouldn't be surprised if there are still young punks running around in that stale, windowless room, slapping old reel-to-reel tapes onto drives and going mad with boredom.
posted by escape from the potato planet at 5:43 PM on January 9, 2015 [3 favorites]


It's very striking to pick up a reference book printed in the 1930's which was on high rag low acid onionskin and is still parchment white and the pages look like they did the day it was printed, versus a five year old novel whose pages have started to brittle and yellow.
And here's an example (self-link). On the left, you have the Manual of Seamanship Vol I, printed in 1922 (by authority of the Lords Commissioners of the Admiralty, no less) and on the right, a popular novel printed in 1991. Both have been well-read and travelled extensively, with the older book having made many sea voyages in its life.

Even though the earlier printing is very noticeably less sharp, the yellowing of the newer book is much more pronounced and, while it doesn't show in the photo, the pages feel very brittle around the edges.
posted by dg at 5:59 PM on January 9, 2015


The ability to decode a jpeg will absolutely endure longer than a printed photo would. Your Photoshop psd file will be useless in ten years, though.

The durability of digital files over long time spans is an interesting question. Any good backup scheme or cloud storage system these days can survive traditional disk failures, where the disk completely stops working, but I doubt any of them would notice or correct subtle file corruption. This can and will happen over long time spans for a variety of reasons. We have the tools to handle this (basically, a combination of redundancy and vigilance), but it's not a threat many people are worried about right now because digital archiving is still in its infancy.
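The "vigilance" half can be as simple as recording a checksum for every file once and re-checking it on a schedule. A minimal sketch, assuming SHA-256 and a plain dict as the manifest; the function names are made up for the example, and how the manifest itself gets stored and protected is left open.

```python
import hashlib
from pathlib import Path
from typing import Dict, List


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def build_manifest(root: Path) -> Dict[str, str]:
    """Map each file's path (relative to root) to its digest."""
    return {
        p.relative_to(root).as_posix(): file_digest(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify(root: Path, manifest: Dict[str, str]) -> List[str]:
    """Return relative paths that are missing or whose contents have changed."""
    problems = []
    for rel, expected in manifest.items():
        p = root / rel
        if not p.is_file() or file_digest(p) != expected:
            problems.append(rel)
    return problems
```

This only detects damage. Repair still depends on the redundancy half, that is, having another copy whose checksum still matches.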
posted by qxntpqbbbqxl at 6:50 PM on January 9, 2015


Ah, this is making me want to ship multi-disk NASes to family, with tape drives to copy data amongst them.

For looong-term data parsing, like jpegs, I want to archive both source code, down to the compilers, and some working VMs.
posted by Pronoiac at 7:40 PM on January 9, 2015


qxntpqbbbqxl, real end-to-end data integrity has become a lot more accessible with ZFS.
posted by ryanrs at 7:43 PM on January 9, 2015 [2 favorites]


Anyone catch the Browser-emulated MS-DOS games? We need open formats with open source implementations to even hope to preserve all this.

Anyone know if OpenZFS on OS X is really stable yet? If so, maybe it's the missing choice for sharing drives between OS X and Linux.
posted by jeffburdges at 9:07 PM on January 9, 2015


I work at the Washington State Digital Archives. Our photographs (and 150 million other digital objects) are going to be preserved so long as there is a Washington State. If you transfer stuff to standard formats, preserve it on multiple hard drives (and two tape backups), and most importantly have a regular schedule of "forward migration" to new formats as needed, it can be available indefinitely.

The biggest difficulty we have run into is not old file formats but old hardware. A state agency once brought us a box of hundreds of 8-inch floppies of a particularly obscure type and for the life of us we could not come up with the drive that was meant to read them.
posted by LarryC at 10:14 PM on January 9, 2015 [2 favorites]


LarryC: I'd check with the Internet Archive, and the Museum of Computer Preservation down in the South Bay. They might be able to help. (Though you've probably already checked.)
posted by Pronoiac at 12:14 AM on January 10, 2015


This sounds fascinating. Got any links?

I'm trying to remember where I ran into them - I think I was tracking down software to mount and read old HP-3000 tapes. I'll spend some quality time with google later. In the meantime, enjoy this PDP-8 hardware emulator kit.
posted by Slap*Happy at 4:34 AM on January 10, 2015


As long as facilities like LarryC's are continuously staffed and funded their curated collections will most likely endure. But the problem is that curation of these media is a dynamic process. A typical family will forget about that box of VHS videotapes until it is nearly -- or perhaps worse than nearly -- too late. And even the curators can trip, as with NASA's forgotten moon tapes and LarryC's own box of 8-inch floppies. If there is only one system capable of reading those media and it suffers a mechanical failure, you better have already converted them all.

The bigger problem is an upheaval that disrupts the curators. It's kind of silly to pretend that something which has happened with great regularity through all of recorded history will never happen again, and even a relatively minor economic collapse could cause a significant interruption and loss of hardware and documentation.

When the Roman empire collapsed we lost all of their engineering knowledge because they used a dynamic process of moving it forward instead of writing it down in permanent form, and it was more than a thousand years before anyone would have indoor plumbing again.

Much of the engineering knowledge to create modern electronics is proprietary and there are really only a few physical plants capable of executing the most advanced processes.

Consider what the Visual 6502 project had to go through to reconstruct the inner workings of a CPU which is still in use in billions of embedded devices. When MOS Technology folded up they tossed all the design documents, and after a while nobody was left who remembered how it really worked. Fortunately the masks were adaptable to newer processes but nobody actually remembered how the logic elements were hooked up. So nobody knew, for example, why certain illegal opcodes seemed to totally disable the chip, or had other odd effects. Only by taking the chips apart and creating a physical simulation were they able to reconstruct the design.

Now what happens when, instead of a primitive chip being emulated with advanced techniques, we have lost the advanced techniques and are trying to reconstruct them using more primitive ones we've rolled from scratch?

Don't think it can't happen that we might lose the advanced techniques. The world is already full of devices that can't be serviced any more because the wrong company went out of business and their IP or tooling was lost in the process. LarryC is probably nursing some of them in his very archive. One day the one I'm typing this into will certainly be among them.
posted by localroger at 7:17 AM on January 10, 2015 [1 favorite]


Consider what the Visual 6502 project had to go through to reconstruct the inner workings of a CPU which is still in use in billions of embedded devices. When MOS Technology folded up they tossed all the design documents, and after a while nobody was left who remembered how it really worked.

Well that is utterly ridiculous. What is the point of a transistor-level reconstruction of a chip? That's just an exercise in masochism, it has no functional purpose. The functional specification of the 6502 is well known, its behavior for every clock cycle under every possible condition is predetermined and predictable. It has been this way since it was released and detailed performance specs were issued in 1976. If you created a chip that behaved in exact accordance with this spec (and nothing more) it would be absolutely compatible with the original chip. You could even do this with emulation, it would make no difference, as long as it met the functional spec.

I found the original published specifications for the 6500 series; you should look at them. In particular, look at bus waveform diagrams like page 35. Now go look at the appendices, like page 141 which has photographs of oscilloscope readouts of those bus signals.

This is the way we used to program microprocessors: there were quite a few times when I had to put stuff up on the oscilloscope and check that the timings on the bus were what I expected. If they weren't, it was always my programming error.
posted by charlie don't surf at 9:44 AM on January 10, 2015


The functional specification of the 6502 is well known

You might want to read what Visual 6502 found before dismissing their work so cavalierly.

There's a reason they started with the 6502 (though they've since gone on to do other chips). There are a number of longstanding mysteries of the 6502's functionality which aren't in those specifications, such as exact instruction cycle timings and the operation of illegal opcodes, which by definition aren't part of the specification. However, in the heroic early years of 8-bit programming, and most particularly for the Atari 2600, developers went to great lengths to profile undocumented opcodes in the hopes that they could shave a cycle or two from the kernel inner loop of an A2600 video driver. Those games will not play on an emulator that does not exactly mimic even the undocumented behavior of the 6502.
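
To make the emulator point concrete, here is a minimal sketch of a toy CPU core that models two documented instructions plus two undocumented ones. The opcode values and the behaviours shown (SAX storing A AND X, KIL jamming the CPU) come from widely circulated lists of undocumented 6502 opcodes rather than from this thread, and the class and method names are hypothetical. Status flags and cycle timing are deliberately left out, so this is nowhere near accurate enough for a real Atari 2600 game.

```python
class Jammed(Exception):
    """Raised when the CPU has hit a KIL opcode and stopped fetching."""


class Mini6502:
    """Tiny subset of a 6502: just enough to show undocumented opcodes."""

    def __init__(self, program: bytes):
        self.mem = bytearray(65536)
        # Load the program at 0x0200 (an arbitrary choice for this sketch).
        self.mem[0x0200:0x0200 + len(program)] = program
        self.pc = 0x0200
        self.a = 0
        self.x = 0
        self.jammed = False

    def fetch(self) -> int:
        value = self.mem[self.pc]
        self.pc = (self.pc + 1) & 0xFFFF
        return value

    def step(self) -> None:
        if self.jammed:
            raise Jammed("CPU is jammed; only a reset or power cycle recovers it")
        opcode = self.fetch()
        if opcode == 0xA9:      # LDA #imm (documented); flags not modelled
            self.a = self.fetch()
        elif opcode == 0xA2:    # LDX #imm (documented); flags not modelled
            self.x = self.fetch()
        elif opcode == 0x87:    # SAX zp (undocumented): store A AND X
            self.mem[self.fetch()] = self.a & self.x
        elif opcode == 0x02:    # KIL (undocumented): CPU stops responding
            self.jammed = True
        else:
            raise NotImplementedError(f"opcode {opcode:#04x} not modelled")
```

An emulator that implemented only the published instruction set would fall into the final `else` branch for `0x87` and `0x02`, which is the failure mode described above for software that depends on undocumented behaviour.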

And while emulators do exist that match the profile, until V6502 nobody knew why those instructions worked the way they do, or how reliable they really were, or how the cycle timings arose when not forced, or why certain instructions kill the chip so thoroughly only a power cycle can resurrect it. V6502 has answered all of those mysteries.

My favorite revelation is that the 6502 sometimes uses "might makes right" bus arbitration. In some cases two signal sources will assert a bus at the same time. Rather than gate one source out, which adds transistors and propagation delays and needs a control line, the MOS guys gave the favored signal bigger transistors so it could stomp the unwanted signal. That is the kind of thing that is not apparent even if you have an accurate logic diagram of the original circuit.

You could say that programmers shouldn't use undocumented features but this is the real world we're talking about, and there was a whole decade during which if you owned any computer at all it was most likely 6502-based via Atari, Apple, or Commodore. It has been maddening since the era when MOS was still in business that nobody knew exactly how KIL killed the chip. Now we know.

And it's not just the 6502. When they did the Z80 another longstanding mystery was resolved, which is the reason the Z80 pin diagram seems to have been laid out by someone on an LSD trip. It turns out that the Zilog guys were almost out of room on the die and extended partial buses to several parts of the die, which made it convenient to site some pads in odd places.

This is all useful stuff to know, both for practical and historic reasons. I frankly think it is appalling that we have been schlepping 6502 cores into stuff for 40 years without having much of any idea how the damn thing works. That's a good way to end up with mysterious nonperformance that can't be troubleshot. Such as, for example, any Atari 2600 game you might try to play on your nifty FPGA 6502 core -- most of them won't go at all, because they only match the published specification.
posted by localroger at 11:57 AM on January 10, 2015 [1 favorite]


All proprietary formats should be considered ephemeral. After suffering for years because a lot of stuff I wanted to go back to was in formats that were no longer supported by modern software, I started converting anything I wanted to keep into non-proprietary formats.

Paradoxically, just after I started doing this, virtual PC technology became widely available and convenient to use, and made it easy to keep a working copy in the original format alongside a working version of the original software. I ripped my floppies to VFD, my CDs and DVDs to ISO and, of course, the "hard drives" are VHD. Although VFD and VHD are owned by Microsoft, all of these formats are widely documented and can be converted to other formats as required. I have virtual PCs running DOS 6.22, Windows 98, and Windows XP 32-bit, which pretty well cover my needs (although, irritatingly, making the latest versions of Virtual PC read floppies requires some scripting). I can mount the VHDs to read files directly, or I can run the virtual PC to use the original software to natively read and convert files. I also have older DOS versions (back to 3.2), but in practice I have nothing that needs the old versions - DOS 6.22 (with Windows 3.11) has enough backward compatibility for my purposes. Likewise, Windows 98 works for everything I have that won't run on modern versions of Windows. My past experiments with Linux never got to the point where I need to make a virtual machine for them, and I've never used a Mac, but I could make virtual machines for those too if required.

For other operating systems I have emulators that can run ancient software back as far as the Sinclair ZX-81 that was my first personal computer. If an emulator won't run on a modern version of Windows and there isn't an updated port available, I install the old version on the appropriate virtual machine.

As hardware and operating systems come and go, I now expect that there will be software provided to allow old stuff to be run in emulation. The biggest hassle is that some companies (I'm looking at you, Apple and Google) don't like emulators that run competing operating systems, and zap them from their online stores. This simply guarantees that I will keep buying PCs running Microsoft as long as Microsoft keeps putting out versions that allow me to run old Microsoft OSes in emulation. I have an Android tablet and an Android phone, but my main workbench is my Windows laptop. In the event that Microsoft goes belly-up, eventually someone will fill the ecological niche and when they do, my virtual hard drives will be waiting.

Emulating old hardware like the 6502 is always going to be a soluble problem if you have sufficient processing power available to quickly try many different options. For example, you don't need to know WHY certain 6502 signal sources are favoured over others, as it will become obvious that some sources need to be favoured for the software to run successfully. So you write that into your emulation.

As for reversion to a more primitive technology, a lot of advanced stuff simply won't be relevant until the appropriate level of technology is re-attained to run it. You don't care about 3D rendering if you're trying to catch food in the new wilderness. If the archivists did their job, the digital records will be readable (it doesn't come any simpler than patterns of zeros and ones) and will themselves point the way to rediscovering lost techniques. If all else fails the new civilisation will find its own way to do these things (as we did after the fall of the Roman Empire) and there will be an "aha" moment many years later when someone looks back and realises that, oh, that's what the ancients were getting at!

Provided the records survive, of course, which is what this thread is really all about. To make rabbit stew, here is the recipe: first, catch your rabbit ...

Printing most photos for archiving purposes is plain dumb even if you use archival quality materials and techniques. Even the best print loses vital details. The retrieval of stunning Lunar Orbital imagery was not made possible by scanning old printouts (which had already lost detail) but by reading the original files from the (fortunately) archived tapes and using modern software to reconstruct the images with a fidelity that far exceeds the original printouts. When it comes to digital, the focus needs to be in ensuring the bytes survive!

In 2000 and 2001 I traveled the US, Canada and Australia with a film camera and spent hundreds of dollars developing the photos and putting them in albums. Then I scanned the photos and never looked at the albums again. In 2002 I went digital and have never looked back.

For backups I used to do incremental backups to CD/DVD, until I got bitten when some of the older CDs died. Since I'd made two copies of most backups, I was able to salvage most of my data by merging the good files from each bad disk, and from then on I started a process of progressively merging old backups onto new media (and keeping the old backups too, just in case). But nowadays I have three external hard drives - two for dead storage and one for live backups using Windows Backup. I plug the dead storage disks into my USB 3.0 port and run a Robocopy script, with error checking. Live data from the HDD gets copied to one location with /MIR (mirror mode, keeping it synchronised), and one folder for stuff I want to move to dead storage gets copied to another location. The two dead storage disks are never stored together. When I need to transfer the files to a new external drive I just plug the old and new drives in and copy everything from one to the other.
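
The salvage step described above, merging the good files from two partly damaged copies, can be sketched roughly as follows. This is an illustration under assumptions: the function names are hypothetical, and "good" here only means "can be read end to end without an I/O error", which does not catch a file that reads cleanly but has been silently corrupted.

```python
import shutil
from pathlib import Path


def readable(path: Path, chunk_size: int = 1 << 20) -> bool:
    """True if the whole file can be read without an I/O error."""
    try:
        with path.open("rb") as f:
            while f.read(chunk_size):
                pass
        return True
    except OSError:
        return False


def merge_backups(copies: list, dest: Path) -> list:
    """Copy each file from the first backup copy that can read it in full.

    Returns the relative paths that no copy could supply.
    """
    names = sorted({
        p.relative_to(root).as_posix()
        for root in copies
        for p in root.rglob("*")
        if p.is_file()
    })
    lost = []
    for rel in names:
        for root in copies:
            src = root / rel
            if src.is_file() and readable(src):
                target = dest / rel
                target.parent.mkdir(parents=True, exist_ok=True)
                shutil.copy2(src, target)  # copy2 also keeps timestamps
                break
        else:
            lost.append(rel)  # every copy was missing or unreadable
    return lost
```

Calling `merge_backups([Path("copy_a"), Path("copy_b")], Path("merged"))` (placeholder directory names) would build one merged tree and return the list of files that neither copy could provide.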

I am unconcerned about preserving most of my stuff for the far future. Anything I think is worth preserving gets sent out into the world somehow. The stuff that is only important to me will survive so long as someone else cares about it, and once nobody cares about it, why bother?

An issue with JPEG is that it is lossy. Most people crank up the compression to reduce the file size. That's not necessarily fatal if your JPEG is the final generation (so the loss is a one-off), but for archival purposes you may want to keep a lossless version of the original. Lots of people here use TIFF, which is fine, but in fact you can also make a lossless JPEG (which is not the same thing as saving a regular JPEG at 100%). In practice, unless the image is already JPEG when I get it (in which case I archive the original JPEG file), I generally convert it to PNG.
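
The lossy versus lossless distinction can be shown with a toy example. This is only a sketch: the "lossy" function below is a crude stand-in for the quantisation stage of a lossy codec and is not real JPEG, while the lossless function uses DEFLATE, the compression algorithm PNG is built on. All function names are made up for the illustration.

```python
import zlib


def lossless_round_trip(pixels: bytes) -> bytes:
    """Compress and decompress with DEFLATE; the output is bit-identical."""
    return zlib.decompress(zlib.compress(pixels, 9))


def lossy_round_trip(pixels: bytes, step: int = 16) -> bytes:
    """Toy lossy codec: snap every 8-bit sample down to a multiple of `step`.

    The discarded low-order detail cannot be recovered afterwards.
    """
    return bytes((v // step) * step for v in pixels)


def max_error(original: bytes, decoded: bytes) -> int:
    """Largest per-sample difference between two equal-length buffers."""
    return max(abs(a - b) for a, b in zip(original, decoded))


gradient = bytes(range(256))  # one row of a smooth 8-bit gradient
print("lossless identical:", lossless_round_trip(gradient) == gradient)
print("lossy max error:", max_error(gradient, lossy_round_trip(gradient)))
```

Converting an already-lossy file to a lossless format preserves whatever is in it but does not bring back the discarded detail, which is one reason to archive the original JPEG as-is when that is what you were given.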

I used to have issues with old vector images from DrawPerfect and CorelDraw! but resolved those one way or another.

Databases and spreadsheets are soluble - all my old databases can be read and translated, with a little effort and jiggery-pokery. I have dBase 3 databases that I have migrated into SQL Server 2012 by way of MS Access.

The real issue is documents. A lot of my old stuff is in Pagemaker and WordPerfect and old versions of Word and I haven't had time to export it to an open format. So far I get by by running the old software in a virtual PC and exporting to a generic format as required, accepting a loss in formatting. After all, all I really need are the words. Any embedded images are usually available in better fidelity in my image backup.

The trickiest problem was some Pocket Word and Pocket Excel files from my PocketPC days. MS offered a translator for these in Office 2003 but dropped it with Office 2007. Fortunately I still have my last PocketPC so whenever I encounter one of these files, I have just used that to export to Word97/Excel97 format, which modern versions of Office can read. And there are now online converters for these formats, validating my confidence that where there is a need, the means will eventually appear!
posted by Autumn Leaf at 6:56 PM on January 10, 2015


(Minor correction - on rereading the article the Lunar Orbiter pictures were stored in analog rather than digital form. Mea culpa. Apparently the tapes were an analog recording of the original radio signals. It doesn't change my argument, which was that printing a photo is destructive - the print contains less data than the original. Retrieving the Lunar Orbiter images from the original tapes instead of scanning from photos allowed more data to be retrieved than before, not less.)
posted by Autumn Leaf at 7:48 PM on January 10, 2015


For example, you don't need to know WHY certain 6502 signal sources are favoured over others, as it will become obvious that some sources need to be favoured for the software to run successfully. So you write that into your emulation.

You really aren't getting this. Yes, people knew how the chip behaved, because they'd been misusing those instructions since 1977. But without knowing why, it is difficult to characterize, for example, whether they will still work that way if you use an automatic algorithm to shrink the die for a new fab process. Is the observed behavior robust or is it an edge effect dependent on a race condition working out a certain way? Nobody knew, which is potentially very dangerous. If we had the original notes from the MOS engineers who built the thing, Visual 6502 would not have had to decap a chip to answer that question. It is generally considered a bad idea to use components in your design whose workings are unknown, but we've been doing exactly that for decades because of lost documentation.
posted by localroger at 6:36 AM on January 11, 2015 [1 favorite]


I understood Autumn Leaf's argument to be that there's minimal need to die-shrink a 6502 because (almost) anything you wanted to do on the 6502 you can do by emulating whatever 6502 system you want on some other cheap chip.

Amusingly, with uneducated back-of-envelope math it looks like if you made a 6502 on a modern process the die would take up 0.01 mm².
posted by ROU_Xenophobe at 7:05 AM on January 11, 2015 [1 favorite]


Dropped zeroes there. I meant 0.0001 mm².
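
One way to reproduce this kind of back-of-envelope estimate is ideal area scaling, where die area shrinks with the square of the feature-size ratio. The inputs below are assumptions not stated in the thread: an original die of roughly 16.6 mm² on a process with about 8 micron features. Real processes do not scale this cleanly, and modern node names are not literal feature sizes, so treat the result as an order-of-magnitude figure only.

```python
def scaled_die_area_mm2(area_mm2: float, old_feature_nm: float,
                        new_feature_nm: float) -> float:
    """Scale die area by the square of the linear feature-size ratio."""
    return area_mm2 * (new_feature_nm / old_feature_nm) ** 2


# Assumed inputs (not from the thread): original die area and feature size.
ORIGINAL_AREA_MM2 = 16.6
ORIGINAL_FEATURE_NM = 8000.0

for node_nm in (90.0, 28.0):
    area = scaled_die_area_mm2(ORIGINAL_AREA_MM2, ORIGINAL_FEATURE_NM, node_nm)
    print(f"{node_nm:g} nm: about {area:.1e} mm^2")
```

With these assumed inputs a 28 nm target works out to a couple of ten-thousandths of a square millimetre, the same order of magnitude as the corrected figure above.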
posted by ROU_Xenophobe at 7:11 AM on January 11, 2015 [1 favorite]


ROU, die-shrunk 6502 cores are in current production by the billion. They have been a mainstay of embedded systems forever and nobody uses emulation because the poorly understood MOS masks are smaller and cheaper. You might think it's silly but this is reality. A lot of 6502 cores get slapped onto common silicon with things like LCD drivers and blob mounted to make things like photo key fobs. They do it because it's cheap and easy to build both hardware and dev tools. And yeah, the mini but recognizable 6502 cores are hilariously tiny and often parked in a corner of a much bigger die. I've seen micrographs.
posted by localroger at 9:38 AM on January 11, 2015 [1 favorite]


The 6502 remains my favourite of all the 8 bit designs I've encountered, and that's quite a few. It manages to do a great deal with very little, and it's not even horribly unpleasant to code for even though it's fairly shockingly compiler-unfriendly.

I'm quite glad it's got everywhere, and I look forward to it being among the first designs to be implemented on molecules once we get the whole desktop nanofactory thing worked out.
posted by flabdablet at 11:20 AM on January 11, 2015 [1 favorite]


nobody uses emulation

Hobbyists do.
posted by flabdablet at 11:30 AM on January 11, 2015


What if someone makes a virus that destroys jpegs and it gets on all the computers and there is a jpocalypse?
posted by snofoam at 1:08 PM on January 11, 2015


Hobbyists do.

They're not mass producing custom silicon.
posted by localroger at 1:45 PM on January 11, 2015


What if someone makes a virus that destroys jpegs

Tumblr will have to make do with animated GIFs.
posted by localroger at 1:46 PM on January 11, 2015


What if someone makes a virus that destroys jpegs and it gets on all the computers

That's why careful people use multiple backup drive sets only one of which is ever connected to a computer at any time, and then only for as long as necessary to do a backup session; and why online backup services allow retrieval of past versions.

Because there's already a virus that destroys jpegs: it's called Distracted Human Oops.
posted by flabdablet at 6:58 AM on January 12, 2015


I just went through this last year trying to salvage files off of an old desktop tower. It wasn't even that old -- a grey Win98 Compaq from circa 2002 -- but getting files off of it took a surprising amount of ingenuity:

- USB Flash drive? It recognizes the stick, but won't install the drivers to mount the thing.
- Floppy disc? It's got a floppy drive. Problem is, none of the modern computers I want to transfer shit to do. And I lost my external diskette drive years ago.
- Connect to the internet and upload it somewhere? No WiFi, and I couldn't figure out how to get it set up on the router or how to piggyback it on my laptop's connection. I doubt its obsolete browsers could even render something like Gmail or Dropbox.

I eventually cracked it by burning the files to CD, using OEM software I'm glad I never got rid of. Even then, the files were burned in an archaic format that I had to buy special software to decode.

And that machine was barely a decade old, and using formats that are etched in stone compared to today's more ephemeral web-based stuff. Ars Technica has some illuminating articles on working with older Mac computers and trying to archive Android history, if you want more.

snofoam: "What if someone makes a virus that destroys jpegs and it gets on all the computers and there is a jpocalypse?"

"Most video tapes from that era were damaged in 2443, during the second coming of Jesus."
posted by Rhaomi at 9:34 PM on January 13, 2015


I just went through this last year trying to salvage files off of an old desktop tower. It wasn't even that old -- a grey Win98 Compaq from circa 2002 -- but getting files off of it took a surprising amount of ingenuity:

If you have to do it again, buy an external enclosure for the hard drive(s). You can still buy them for IDE drives; Newegg had a bunch for USD 10-20.
posted by ROU_Xenophobe at 10:52 PM on January 13, 2015




This thread has been archived and is closed to new comments