JournalSpace: R.I.P.
January 3, 2009 6:13 PM

JournalSpace: R.I.P. [Sub-Titled: When is the last time you tested your backups?]
posted by GatorDavid (69 comments total) 9 users marked this as a favorite
 
Even though it's been proved to him beyond doubt ...
As data is written (such as saving an item to the database), it's automatically copied to both drives, as a backup mechanism.
... he's still making the mistake that killed him. Just as you don't point an unloaded gun at someone, you don't think of RAID as a backup.

It's a redundancy mechanism, not a backup. RAID is not a backup. Repeat till remembered.
posted by bonaldi at 6:23 PM on January 3, 2009 [2 favorites]


So what was JournalSpace, before it stopped being anything altogether?
posted by penduluum at 6:29 PM on January 3, 2009 [4 favorites]


Unforgivable.
posted by fourcheesemac at 6:30 PM on January 3, 2009


lol noob
posted by ryanrs at 6:31 PM on January 3, 2009 [3 favorites]


> The list of potential causes for this disaster is a short one. It includes a catastrophic failure by the operating system (OS X Server, in case you're interested), or a deliberate effort.
It's difficult to imagine a way in which the OS would cleanly overwrite an entire hard disk of data. If the OS did it, I would imagine the data left on the disks to look like a car wreck, with many fragments left over.
posted by fatbird at 6:38 PM on January 3, 2009


Dumb as a bag of hammers. Also, I second penduluum's question.
posted by brundlefly at 6:38 PM on January 3, 2009


I learned the RAID-is-not-a-backup lesson, too, many moons ago. I think part of the problem is that, to non-technical or mildly technical people, it seems like it should be a backup, when of course it's not. Many folks get as far as thinking "great, it's writing data to two disks" and stop there, without going to the next logical step, which is "when deleting data, it deletes it from both disks."
posted by maxwelton at 6:41 PM on January 3, 2009 [3 favorites]


Wow. I guess the lesson here is "don't trust your web/blog/whatever hosting service to have a backup strategy."

As bonaldi pointed out, RAID is not a backup strategy, and this is a perfect example of why not — it doesn't protect against a software issue that corrupts your data, a malicious user, or just the dreaded "sudo rm -rf /" issued by accident. But it wouldn't surprise me if a lot of "cloud" services weren't doing real backups, and instead were just relying on built-in redundancy and finger-crossing.

It's not a happy thought, because a lot of services don't give users an easy way to back up their content themselves. (Flickr is one of the only exceptions I'm aware of, and that's just a consequence of its well-exposed API.)
posted by Kadin2048 at 6:42 PM on January 3, 2009


Crap site wiped from face of web; handful of emos mourn.
posted by fire&wings at 6:46 PM on January 3, 2009 [2 favorites]


I recall one of the iTunes installers had a bug that caused it to erase the destination disk if it had a space in its name (e.g. 'Macintosh HD', the default name).
posted by ryanrs at 6:47 PM on January 3, 2009 [1 favorite]


How big was the userbase? Did they have any export tools available to users that might have allowed progressive individuals (read, not lazy-arse like me) to periodically back-up their journals?
posted by maxwelton at 6:51 PM on January 3, 2009


Is this something you would need a JournalSpace to understand?

Because I don't, and the odds of me obtaining one now seem slim.
posted by Alvy Ampersand at 6:54 PM on January 3, 2009 [1 favorite]


This looks identical to the Diary-X failure a few years back.

Lessons learned?
posted by blixco at 7:05 PM on January 3, 2009 [2 favorites]


Yes, but every idiot has to learn it for themselves.
posted by ryanrs at 7:09 PM on January 3, 2009 [1 favorite]


Snark all you want, this is heartbreaking. Proper versioned backups are difficult to do right.
posted by Nelson at 7:10 PM on January 3, 2009


Snark all you want, this is heartbreaking. Proper versioned backups are difficult to do right.

We can argue about that all night, but it's completely irrelevant, because the JournalSpace DB didn't have any backups.
posted by Jairus at 7:12 PM on January 3, 2009 [2 favorites]


Henceforth the IT version of the Darwin Awards will be simply known as the Journalspaces.
posted by netbros at 7:12 PM on January 3, 2009 [4 favorites]


JournalSpace is No More: "JournalSpace.com is intending to sell their domain name and trademarks, and an enterprising investor may find this a good time to grab a valuable piece of electronic real estate. But for a large part of the blogging community, it is the end of a six-year voyage, and one that could have been easily avoided by the correct procedures for data storage and retrieval."

JournalSpace's admins opted to block the Wayback Machine's access via a robots.txt exclusion.
posted by terranova at 7:17 PM on January 3, 2009 [5 favorites]


Try telling your parents that when they send out a 'daily devotion' they should edit it in Word or something and cut-and-paste it into Google... I gave up and let my little sister handle it. A lot of people over 50 or so just think they have a magic internets box.
posted by zengargoyle at 7:18 PM on January 3, 2009


Another common problem is file system corruption. It happens slowly and silently, infecting the entire RAID and even the tape backups if it happens slowly enough. This happened to our ISP once, and we were basically unable to recover from the NetApp or the tape backups - I think we had some really old tape archives that prevented a total loss, but that was just luck.
posted by stbalbach at 7:22 PM on January 3, 2009


zengargoyle: Why? Gmail autosaves drafts, Word doesn't.
posted by Jairus at 7:23 PM on January 3, 2009


Why? Gmail autosaves drafts, Word doesn't.

Word doesn't autosave? Pardon?
posted by Dark Messiah at 7:26 PM on January 3, 2009


Word autosaves documents that are already saved. Not draft documents that are unsaved.
posted by Jairus at 7:28 PM on January 3, 2009


Snark all you want, this is heartbreaking. Proper versioned backups are difficult to do right.

What's really hard to fathom here is that they were running on OS X, which has one of the better, built-in, no-added-cost, works-like-magic versioned backup systems included - Time Machine.

I have it configured on all my Macs, and while it's not perfect, it has saved my sorry ass more times than I care to remember, and it's brain-dead simple to set up and use.

WTF?
posted by kcds at 7:31 PM on January 3, 2009


Well, there are different levels of RAID. This was a mirror: two identical drives written nearly simultaneously. When one fails, the other takes over. If both fail... game over. Backup aside, the best RAID, IMHO, is RAID 5, where the data is written in parts across several disks, or 'striped', typically with parity information that allows the data to be rebuilt from the other stripes if one drive fails. Unfortunately, some sysadmins don't like RAID 5 because it is slower and requires several more drives.

But Geez what fool would think any type of RAID is a backup?
posted by Gungho at 7:32 PM on January 3, 2009


But Geez what fool would think any type of RAID is a backup?

Everyone. Everyone thinks RAID is backup.
posted by Jairus at 7:33 PM on January 3, 2009 [1 favorite]


Wow, love seeing all the folks jumping on the guy for making an honest mistake (see Jairus' comment above) instead of taking a moment to think of how soul crushing it must be to wake up one morning and find that the community you spent six years building has just disappeared. What a horrible, horrible thing.

Compete says they had 15,500 uniques in November, down from 35,000 a year before. So it's not just some dude and his six friends.
posted by bpm140 at 7:43 PM on January 3, 2009


Wow, love seeing all the folks jumping on the guy for making an honest mistake

a fool and his data soon are parted
posted by pyramid termite at 7:45 PM on January 3, 2009 [8 favorites]


When I back up, I make that 'beep-beep' noise like a truck. It doesn't get people out of my way necessarily, but it keeps me entertained.
posted by jonmc at 7:52 PM on January 3, 2009 [10 favorites]


This is interesting and all, but that site sucks. Best of web? I don't think so.
posted by cjorgensen at 7:53 PM on January 3, 2009


Coincidentally enough, I signed up for a Mozy account today, because I have a lot of backups but no offsite backups, and a musician I know just lost all of her material for her new album in a fire.

Also her studio and house.
posted by Jairus at 7:54 PM on January 3, 2009


Wow, love seeing all the folks jumping on the guy for making an honest mistake ... What a horrible, horrible thing.

If you are a stupid fuck who doesn't know how to back up goddamned data, you should not be inviting people to entrust their data to you.
posted by jayder at 7:56 PM on January 3, 2009 [2 favorites]


Ah, could it be an excuse to duck out clean, since their traffic's fallen by half in a year?
posted by fourcheesemac at 7:58 PM on January 3, 2009 [1 favorite]


This is a lesson that everyone learns at some point, and it's usually painful. What really surprises me is that this guy got to the point where he was running a fairly large site without ever learning it.
posted by brundlefly at 7:59 PM on January 3, 2009


I've got six years of entries in my LiveJournal and I would grieve their loss.

Those of you with LiveJournal accounts and a desire to back them up may want to look at LJArchive. Free open-source application for Windows that will grab all of your entries (and comments) into a local archive. The local archive is easily browsed through the application and you can output the contents into HTML or XML.

I feel better having backed my stuff up.
posted by DWRoelands at 8:13 PM on January 3, 2009 [3 favorites]


Ah, could it be an excuse to duck out clean, since their traffic's fallen by half in a year?
posted by fourcheesemac at 9:58 PM on January 3


This was my first thought as well.
posted by Ynoxas at 8:15 PM on January 3, 2009


These guys were running their data server on OS X Server with RAID 1.

I haven't been to JournalSpace before, but I am going to assume they were really just using the box because it was a "simple" Apache/PHP server. But from an admin standpoint, especially for a public-facing web service box, Apple does not keep Apache or PHP as up to date as some web developers would want, and at that point you end up managing your own installs anyway. Possibly this was their MySQL server, but Apple likes to break that between revisions also, since they don't actually *support* MySQL in any way beyond compiling it and making sure it runs.

In short: They did not have an IT team, or a production team, or a server admin. I would assume they had a web coder guy who had worked on hosted services before, and, deciding to save costs, they went and bought their own Xserve (store.apple.com has it listed cheap, UNLIMITED access licenses, shiny!), figured out what they needed to run VNC on it, and shipped it out to a colocation facility.

If they actually had someone with any inkling of server knowledge, enough to know that RAID 1 is not backup and that if someone deletes a file it is gone, they probably would have built a Red Hat box or something else that would have been cheaper. But then again, the cost to hire someone like that would outweigh the savings from buying a nice 1U HP box.

As for how the disk was nuked completely?

sudo diskutil secureErase 2 /Volumes/DATA

or

sudo rm -f /path/to/msql.db
sudo diskutil secureErase freespace 2 "/Volumes/Macintosh HD"

I've been called in to help other such "IT Guys" fix their Xserve-based web hosting setups, and in almost every case it was someone thinking that buying an Xserve would somehow excuse them from ever having to learn how to do backups, use the command line, or do more than follow poorly written "How to install LAMP under Leopard" guides online. And I know the appeal, as most of them are doing the web development on their own MacBooks and decide "how much harder would an Xserve be?"

(Not saying OS X is not a good development platform, or even a deployment platform, but I rarely meet coders who have an appreciation or aptitude for system management.)
posted by mrzarquon at 8:21 PM on January 3, 2009 [10 favorites]


Everyone thinks RAID is backup.

As stated above, no. In the everyday, pointy-haired-boss sense, yes, it's a backup in case of a single failed drive in that pair (or set, or whatever). It is not an archived copy (known as a backup in the magical world of server rooms), which offers a restore point in the event of data corruption or loss.
posted by crataegus at 8:28 PM on January 3, 2009


> What's really hard to fathom here is that they were running on OS X, which has one of the better, built-in, no-added-cost, works-like-magic versioned backup systems included - Time Machine.

Time Machine is not recommended for OS X Server; in fact, it handles things like MySQL and other database files horribly (i.e., it rarely copies them, because they are in use).

Usually, for database (and mail) backups, you need to put more thought into it than just "copy all the files from this folder to that folder." You have to script the service to stop, copy the relevant files, and start back up again. Or you can script something with mysqldump to pull out all records added since the last run, for incremental backups.

And while the state of backup software for OS X is abysmal right now, for professional-level services, getting something like "rsync the contents of the mysqldump folder to another box every night" set up is easily possible and would have saved their company. Hell, they could have gotten an answer here from ask.me on how to do it.
posted by mrzarquon at 8:29 PM on January 3, 2009 [2 favorites]
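
A minimal sketch of the nightly dump-and-ship routine described above, assuming SSH keys are already set up between the two boxes; the database name, paths, and remote host are placeholders, not anything JournalSpace actually ran:

#!/bin/sh
# Nightly: dump the database, compress it, ship it to a separate machine.
# DB_NAME, the paths, and the host below are illustrative placeholders;
# mysqldump credentials are assumed to come from ~/.my.cnf.
DB_NAME="appdb"
STAMP=$(date '+%Y%m%d')
DUMP="/var/backups/${DB_NAME}-${STAMP}.sql.gz"

mysqldump "${DB_NAME}" | gzip > "${DUMP}"

# Copy the dump off-box; a deleted or corrupted master no longer takes
# the only copy with it, which is the whole point RAID 1 misses.
scp "${DUMP}" backupuser@offsite.example.com:/srv/backups/${DB_NAME}/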


And RAID 5 is no panacea. RAID 6 is just a band-aid fix.

At the rate individual disk capacities are growing, the chance of one disk failing, followed by another one containing crucial parity information, increases dramatically as the disks get larger. The rebuild times are longer, too.

ZFS keeps looking better and better.
posted by mrzarquon at 8:34 PM on January 3, 2009 [1 favorite]


As stated above, no. In the everyday, pointy haired boss sense, yes, it's a back up in case of a single failed drive in that pair (or set, or whatever). It is not an archived copy (known as a back up in the magical world of server rooms) which offers a restore point in the event of data corruption or loss.

I know RAID isn't backup. My point is that pretty much every person I've met outside of a server room who's had an opinion on RAID was of the opinion that RAID 0 made their games run faster and RAID 1 backed up their data.
posted by Jairus at 8:41 PM on January 3, 2009


This really isn't going to look too great on the company CV.

I'm not so sure I buy the purchase of journalspace as a good investment:

"Check out the new journalspace.com!"

"Say, the only time I even heard of you guys was when you completely borked your site and had no backup."

"Wait, that was the prior guys! We're different!"

"Oh, I'm sure. Same software, you say? Well, have to be going now..."
posted by maxwelton at 8:48 PM on January 3, 2009 [1 favorite]


blixco: or, on the other hand, the XCalibre failure a few months back.

Jairus: good move. FWIW, I'm using rsync.net, a similar but more techie-oriented outfit, and I've been quite happy with them.
posted by hattifattener at 9:06 PM on January 3, 2009 [1 favorite]


hattifattener, I looked at rsync.net, but they're too expensive for me. I'd be paying thousands a year.
posted by Jairus at 9:15 PM on January 3, 2009


Tar anyone?

My point is that pretty much every person I've met outside of a server room who's had an opinion on RAID was of the opinion that RAID 0 made their games run faster and RAID 1 backed up their data.

Lots of people inside server rooms think this too. I had a boss once who was enamored of RAID 0+1 for some strange religious reason, even after I calculated the failure rates in front of him (even granting the optimistic assumption that drives fail randomly rather than in clusters). Several catastrophic data failures later he was turfed.
posted by benzenedream at 10:05 PM on January 3, 2009


OS X Server is a joke. Performance is terrible, decent admin tools are nonexistent. I wouldn't trust anyone who would use that as a deployment platform.
posted by signalnine at 10:08 PM on January 3, 2009 [1 favorite]


Proper versioned backups are difficult to do right.

Perhaps. But a daily, compressed dump of the database to another server with two big, cheap disks that are swapped off-site every week would be trivial to set up and at least provide something to go back to. A simple, dumb strategy, to be sure. But something is better than nothing.

Another common problem is file system corruption.

Ahhh, yes. Ask Joyent about the joys of ZFS on that one.

Carrier grade file system. Or not.

What a horrible, horrible thing.

Sure it's horrible. But really, there's no excuse for it. It's a 101 mistake that is forgivable on a home PC, but in a large provider that thousands of people depend on? It's negligence. I have a lot of sympathy for the people screwed by this, but for the guy who screwed them by not running his business properly? Not so much.

ZFS keeps looking better and better.

Sure, if you're a brainless zealot who ignores the well-documented problems with it, or thinks there's something magical about Sun finally shipping built-in volume management with their operating system (welcome to AIX circa 1995).
posted by rodgerd at 10:08 PM on January 3, 2009 [2 favorites]
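
The "dumb but better than nothing" version really is a couple of crontab lines; the script path and hostname here are placeholders for whatever does the dump and wherever the second box lives:

# illustrative crontab entries: dump at 02:30, push the dumps offsite at 03:00
30 2 * * *    /usr/local/bin/db-backup.sh
0 3 * * *     rsync -az -e ssh /var/backups/ backup@other-box.example.com:/srv/backups/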


When is the last time you tested your backups?

<smug>Tue, Dec 30th, 2008 at 14h25 MST</smug>

It was for something of trivial importance, but that's probably the best time to see if your backups and restore procedures work.
posted by furtive at 10:10 PM on January 3, 2009
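
A restore test can be as unglamorous as loading last night's dump into a scratch database and counting rows; the database, file, and table names here are hypothetical:

# Restore the latest dump into a throwaway database and sanity-check it.
mysql -e 'CREATE DATABASE restore_test'
gunzip -c /var/backups/appdb-20090102.sql.gz | mysql restore_test
mysql restore_test -e 'SELECT COUNT(*) FROM entries'
mysql -e 'DROP DATABASE restore_test'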


[Nelson Muntz]
HA HA!
[/Nelson Muntz]

Seriously, it's pretty hard to feel sorry for these yahoos. This isn't some tragic mistake that could happen to anyone, it's plain stupidity.
posted by DecemberBoy at 10:10 PM on January 3, 2009


(Also, for those of you complaining about backups being hard, just let Amanda do it. It's free, it's open source, and these days it can do its incremental backups to pools of disk.)
posted by rodgerd at 10:18 PM on January 3, 2009 [1 favorite]


The first thing I do when I set up a site is set up a script that zips the site contents and a DB dump each day and then sends them over SSH to a separate location, with each file date-stamped. We're talking a few lines of script here; hell, I'm willing to share it at the expense of it being picked apart.

Here's the DB one:
#!/bin/sh
# DB BACKUP SCRIPT

#############################################################
# ADJUST THE VALUES OF THE FOLLOWING VARIABLES AS NEEDED
#############################################################

#define name of database
SQL_DB_NAME="insert_db_name_here"
SQL_DB_USER="insert_db_user_here"
SQL_DB_PWRD="insert_db_password_here"
SQL_DB_SERV="insert_db_server_here"

#define local path (shouldn't use relative paths in crontab)
LOCAL_PATH="/home/username/htdocs/"

#define remote (sftp) variables
REMOTE_PATH="/home/username/backups/db/"
REMOTE_CONN="username@server.com"

#############################################################
# YOU SHOULDN'T HAVE TO TOUCH ANYTHING BELOW THIS LINE
#############################################################

#define variable with name of file to be used for backup
FILE_NAME="backup_${SQL_DB_NAME}_$(date '+%Y%m%d')"
SQL_FILE_NAME="${LOCAL_PATH}${FILE_NAME}.sql"
ZIP_FILE_NAME="${LOCAL_PATH}${FILE_NAME}.zip"

#get backup from database (server name, db name, password, username)
mysqldump -h${SQL_DB_SERV} -u${SQL_DB_USER} -p${SQL_DB_PWRD} ${SQL_DB_NAME} > ${SQL_FILE_NAME}

#compress backup
zip ${ZIP_FILE_NAME} ${SQL_FILE_NAME}

#send backup to remote location
echo "put ${ZIP_FILE_NAME} ${REMOTE_PATH}" | sftp ${REMOTE_CONN}

#clean up local copies of files
rm -f ${SQL_FILE_NAME}
rm -f ${ZIP_FILE_NAME}

#that's all.  
Similar for app except that I cron an rsync to a remote server daily and then have a separate script that compresses the rsync'd mirror daily.
posted by furtive at 10:25 PM on January 3, 2009 [7 favorites]


Of course, in real life the above script isn't double spaced.
posted by furtive at 10:26 PM on January 3, 2009


mrzarquon: At the rate individual disk capacities are growing, the chance of one disk failing, followed by another one containing crucial parity information, increases dramatically as the disks get larger. The rebuild times are longer, too.

This has happened to me personally several times with high-end hardware RAID 5: an error reading a single block causes the card to treat the whole disk as if it has died. It's a pain when a drive drops out for such a bullshit reason, but it's so much worse when the rebuild fails because of the same BS.

Parity RAID is for cheapskate assholes (like me).


ZFS keeps looking better and better.

I don't think so! It's immature and not widely tested, not just because it's new, but because of a cascading set of reasons:
  • It's monolithic: it pushes every possible feature related to block devices, volume management, snapshots, VFS, etc. into the filesystem implementation — this only flies on Solaris / FreeBSD / OSX because none of those layers were already present.
  • It'll never be in a Linux kernel: not just because the license was constructed to keep it out — even if relicensed, it invents all its own layers and is written in a style antithetical to Linux
  • It's not distributed: it's not useful once you get beyond what can be handled by one big fileserver (with failover, etc.) — it'll never be used for big applications
posted by blasdelf at 11:05 PM on January 3, 2009


furtive: LET THE PICKING APART COMMENCE!

Using a temporary file is somewhat understandable, but why are you creating multiple ones in htdocs, where they could presumably be snatched via the web by an evildoer? Using zip and sftp is a huge WTF, but at least you're assuming SSH keys are used!

How about this:

mysqldump … | gzip | ssh ${REMOTE_CONN} "cat > ${REMOTE_PATH}${FILE_NAME}.sql.gz"


Or if you want to use a temporary file (for example to keep some recent backups locally):

mysqldump … | gzip > ${LOCAL_PATH}${FILE_NAME} && scp ${LOCAL_PATH}${FILE_NAME} ${REMOTE_CONN}:${REMOTE_PATH}${FILE_NAME}
posted by blasdelf at 11:33 PM on January 3, 2009 [4 favorites]


blasdelf: Don't forget multiple known kernel panic cases (one of which is present in 10u5 and only has a t-patch fix), the Joyent data corruption problem (root cause: Sun failing to disclose a data corruption bug as a reason to upgrade), the inability to shrink ZFS volumes, the 100%-full stupidity, and a number of other issues.

Of those, the kernel panics (triggered by adding or unexpectedly losing LUNs!) in an 'enterprise filesystem' are the most ridiculous, since they beg for data corruption.
posted by rodgerd at 11:35 PM on January 3, 2009


> OS X Server is a joke. Performance is terrible, decent admin tools are nonexistent. I wouldn't trust anyone who would use that as a deployment platform.

Depends on what you need. Need to manage a bunch of Macs? It works pretty well.

A rapid-deployment Ruby on Rails web environment? Probably not. Also, you can't virtualize it in any way that's useful for real-time deployment.

blasdelf- ZFS is something that has caught my attention because of what they were attempting to accomplish.

As for big app deployments, there are trade-offs between distributed and centralized storage. You end up with dedicated storage or volume-management servers one way or the other (with the possible exception of GFS). I'd see the "Sun" solution as a Solaris head node on a stack of drives, resharing over NFSv4 over 1-10Gb Ethernet to the application nodes. (I've seen demos from Apple's performance team, which resulted from their doing all the science cluster stuff, that had 10.6's NFS/IP performance being pretty damn amazing, making the idea of an NFS sharing solution look competitive with Xsan in everything except extreme low-latency situations.)

ACFS, or Xsan, would have you using at least one, if not two, metadata controllers to actually manage the storage metadata across all the LUNs, which the nodes read from and write to directly over Fibre Channel (using a dedicated Ethernet network for metadata).

GFS might be a bunch of virtualized RH installs using iSCSI to talk to iSCSI LUNs, but I haven't had a chance to play with that as a full deployment. I like the idea of iSCSI / GFS / virtualized setups, because once you have Ethernet run to all your boxes, moving roles and functions around between machines is just changing things in software (in VMware, in switch VLANs, in the iSCSI configurations), making things extremely flexible.
posted by mrzarquon at 11:37 PM on January 3, 2009


I think it would be pretty crushing to realize that you spent six years building a community and to have it vanish. That's tragic.

However, when you spend six years building a community, but you don't know the first thing about electrical codes, current safety standards, and have not realized that the town you built is actually on top of a system of old, shallow mineshafts about to collapse any time, because you clearly were not interested in hiring anyone with a clue, and just figured that the do-it-yourself spirit was going to save the day, and then the community just plunges into sinkholes and gaping caverns, that's something else entirely.
posted by adipocere at 12:25 AM on January 4, 2009


b1tr0t: It's not too bad if you use something like rsync-over-ssh so you aren't burning too much of your traffic allowance, but yeah, it's not exactly a rigorous approach - merely a demonstration of how easy it is to implement a simple interim solution until you can engineer a real one.
posted by rodgerd at 1:14 AM on January 4, 2009
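
The rsync-over-ssh variant is a one-liner; after the first run only changed data crosses the wire, which is what keeps the traffic bill down (host and paths are placeholders):

# Incremental push of the local backup directory to an offsite box over SSH.
rsync -az -e ssh /var/backups/ backup@offsite.example.com:/srv/backups/myserver/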


I note that, before they scrapped the idea, they claimed the data recovery place was going to charge more than they made the entire previous year to get their disks back. Was their income, er, modest, or is data recovery a lot more expensive than I would have expected?
posted by maxwelton at 2:46 AM on January 4, 2009


or is data recovery a lot more expensive than I would have expected?

Back in 2004 I was doing some consulting work for a friend who shipped his "web guys" a 250G IDE disk in an external enclosure. An intern working for the web guys plugged the wrong power supply into the external enclosure. Poof, hard drive controller board was toast.

A small local-ish firm in Austin got the job. They bought an identical drive, swapped the controller PCB over, then mounted the disk up and copied the data to DVD-Rs. For this work (they also demanded to keep the "new" disk), they charged the "web guys" around $3K. Anything needing more work would have cost even more.

From what I've heard, DriveSavers (which seems to be a much better operation than the clowns in Austin) is way too expensive for most people; their pricing for some jobs can end up at "if you have to ask, you can't afford it" levels.
posted by mrbill at 3:08 AM on January 4, 2009 [1 favorite]


They bought an identical drive, swapped the controller PCB over, then mounted the disk up and copied the data to DVD-Rs. For this work (they also demanded to keep the "new" disk), they charged the "web guys" around $3K.

Did exactly that for my boss last week for his home computer. No backup, and all his company data sitting on a dead brick he'd tried for a week to bring back to life. I was pricing out clean-room services for retrieval ($$!) when I found the same drive/controller board in my box of junk. Since the drive wouldn't spin up when powered, I swapped out the controller board and it booted perfectly. Handed it back to him within an hour. Come review time, he will remember nothing else I did all last year.
posted by hal9k at 5:59 AM on January 4, 2009


I don't pretend to understand 80% of this thread, but surely the really stupid thing is entrusting your personal writings to a website like this without keeping copies of them yourself? The guy who ran this site may have done something idiotic, but the users should have realized that there's a whole range of ways - technical failures but also legal or corporate bullshit - that could have led them to lose control of their writing.
posted by game warden to the events rhino at 6:38 AM on January 4, 2009


FYI: in the post on the JournalSpace blog (whatever it was) was this explanation:
It was the guy handling the IT (and, yes, the same guy who I caught stealing from the company, and who did a slash-and-burn on some servers on his way out) who made the choice to rely on RAID as the only backup mechanism for the SQL server. He had set up automated backups for the HTTP server which contains the PHP code, but, inscrutably, had no backup system in place for the SQL data. The ironic thing here is that one of his hobbies was telling everybody how smart he was.
In conclusion, this guy should be blamed for allowing asshats to run his servers, not for misunderstanding RAID.

As an aside, I'm a Unix systems engineer, and some of the managers/directors where I work also don't understand that RAID != backup... and then I have to save their asses. Fools!
posted by Mach5 at 7:31 AM on January 4, 2009


Heh. A web 2.0 site with web 0.7 technical infrastructure and management.

> Proper versioned backups are difficult to do right.

Bull.
posted by Artful Codger at 8:14 AM on January 4, 2009


Bull

Yeah, all you need to do is convince all your database servers, etc to flush their files in a consistent state. Then make copies via rsync or something to a large pool of disks, preferably on another machine. All while not disrupting your running application. Easy! I know how to do these things. I even mostly do them for my own stuff. I totally understand how interesting web application projects could fail to do it right.

It's not clear to me how you could make backups as easy as, say, setting up a web server with some PHP scripts. I'd say the filesystem is in the best position to do the work, only filesystems have bugs too.
posted by Nelson at 11:11 AM on January 4, 2009


Bull.

Backing up a database -- one that is changing in real time -- without negatively impacting your service's uptime is a big deal, actually. Setting up a system to properly snapshot and back up the database is definitely not impossible, but we don't exactly have the entire breakdown of their financial / technical resources.

Backing up data seems to be like driving; when you're doing it, everyone else seems like an idiot. I know just as many "techs" who have made similar mistakes with everything from personal data to important work-related stuff.
posted by Dark Messiah at 11:32 AM on January 4, 2009


Backing up a database -- that is changing in real-time -- without negatively impacting your service's up-time is a big deal, actually.

Not for any decent database. In fact, I would go so far as to describe any DB that can't be easily backed up without a significant service interruption as junk.
posted by rodgerd at 1:05 PM on January 4, 2009


Backing up a MySQL database is simple. You run mysqldump, compress the dumped data, then make copies to remote servers. If you're using InnoDB or another transactional engine, you don't need to worry about stopping your DB to keep your transactional data safe. Even if you're using a non-transactional engine, like the default MyISAM, you might end up with some consistency problems, but they'd be things like a logged but missing blog post, or a tag ID with no data. No big deal.

And if running mysqldump eats up all your DB server's processing capacity, it's time for you to scale out.

And RAID is, like, so 2006. The thing to do is have multiple running copies of your services and data on big, cheap, single disks.
posted by mexican at 10:22 PM on January 4, 2009
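
For what it's worth, the engine distinction above maps onto a mysqldump flag; a sketch with a placeholder database name:

# InnoDB (transactional): consistent snapshot without locking the tables
mysqldump --single-transaction appdb | gzip > appdb.sql.gz

# MyISAM (non-transactional): lock the tables for the duration of the dump instead
mysqldump --lock-tables appdb | gzip > appdb.sql.gz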


1. Take an old PC offsite, put a 500G disk or two in it, run the Linux/UNIX of your choice.
2. Install rsnapshot.

This will give you "snapshot"-style backups that go back however long you have disk space for (or set your retention policy).

From that box, stage the data to removable media (tape, DVD-R, whatever).

This takes maybe an afternoon to set up, if that. It gives you offsite backups, instantly-accessible online backups, and backups to removable media. If your "sysadmin" doesn't have access to the machine, it provides a fallback should they "slash and burn" when leaving.

I've been backing up my colocated servers with variants of this for ~5 years now.
posted by mrbill at 11:33 AM on January 6, 2009
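
For reference, the rsnapshot half of that setup is a short config file. An illustrative fragment, with placeholder hostnames and retention counts (rsnapshot wants tabs, not spaces, between fields):

# /etc/rsnapshot.conf (fragment)
snapshot_root   /backup/snapshots/
cmd_ssh         /usr/bin/ssh

# keep 7 daily and 4 weekly snapshots (cron runs "rsnapshot daily", etc.)
interval        daily   7
interval        weekly  4

# pull the colo box's dump directory over SSH
backup          root@colo.example.com:/var/backups/    colo/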




This thread has been archived and is closed to new comments