Code Cowboys
March 21, 2011 12:19 PM

FuckYeahSourceCode.tumblr.com
Tumblr has had a major security breach causing the web server to spit out source code containing passwords, database schema, and secret API keys. How did this happen? Probably someone was editing the [live] file in vim, forgot it was in Replace mode, and tried to enter Insert mode by tapping i while at the beginning of the file.
posted by wcfields (125 comments total) 21 users marked this as a favorite
 
M-x smirk-at-vi-user
posted by sebastienbailard at 12:25 PM on March 21, 2011 [44 favorites]


I believe now they are tapping the letter F approximately 7 times, followed by the letter U 12 times.
posted by Mister Fabulous at 12:25 PM on March 21, 2011 [51 favorites]


Holy shit.

Ha, from their statement: "we’ll be seriously evaluating and adjusting our processes". Yeah, maybe consider something like, oh, source code management. Maybe even something so radical as separate dev/test/prod environments.

I have to admit, I have had some times when I've wondered about possible bad effects from the power of vi/vim. I'll forget that I'm not in insert mode and try to paste some code which then ends up doing all kinds of crazy shit. But it's easy to back out, and oh, I'M NOT EDITING LIVE PRODUCTION CODE.
posted by kmz at 12:26 PM on March 21, 2011 [22 favorites]


:facepalm:
posted by johnnybeggs at 12:28 PM on March 21, 2011


Holy fuck. I love Tumblr but I have been dying for them to monetize for the last year. If anyone needs an injection of stable cash and the capacity to actually make their site work... it's Tumblr. I have to add on third-party apps just to bring the functionality up to what it should be.
posted by PostIronyIsNotaMyth at 12:28 PM on March 21, 2011


M-x smirk-at-vi-user

Emacs is bloated enough I wouldn't be surprised if that's a real command.
posted by kmz at 12:31 PM on March 21, 2011 [30 favorites]


I'm trying to post a link to these articles at Tumblr right now. It took three minutes to go through. Oy.
posted by PostIronyIsNotaMyth at 12:32 PM on March 21, 2011


M-x smirk-at-vi-user

Eight Megs And Constantly Smirking?
posted by Kadin2048 at 12:33 PM on March 21, 2011 [10 favorites]


nano 4 life!

Anyway vi or not, they're idiots for editing live stuff. They deserve everything they get for not having separate dev/test/acceptance/production environments. Not having money is no excuse.
posted by Threeway Handshake at 12:34 PM on March 21, 2011 [10 favorites]


Tumblr is written in PHP? huh.
posted by device55 at 12:35 PM on March 21, 2011


Is this the modern 2.0 version of "Oops sorry boss i hit send on that email" ?
posted by infini at 12:36 PM on March 21, 2011 [1 favorite]


Hey, that's the same process we use at my current workplace! I guess if folks at Tumblr can text-edit the live version of the script on the server, it's okay that we've been doing it too.

Someone back me up here.
posted by The Lurkers Support Me in Email at 12:37 PM on March 21, 2011 [10 favorites]


Is this something I would have to have a television computer fixie degree in computer programming to understandf?
posted by tomswift at 12:37 PM on March 21, 2011 [2 favorites]


Someone back me up here.

::crickets::
posted by brennen at 12:37 PM on March 21, 2011 [7 favorites]


This is pretty funny. So either:

1) They have developers editing live code, or
2) They didn't even check a single page on the development environment before pushing it out. Since this appears to be a global include, viewing ANY page would have tipped them off to the problem.

That's competence, folks. When they say "we’ll be seriously evaluating and adjusting our processes", I presume they mean "we will be creating some processes."
posted by chundo at 12:38 PM on March 21, 2011 [8 favorites]


Places like Quora also do "continuous deployment" where code is reviewed afterwards. At the very least, the developer should have made sure it ran on his instance. I have seen a few other postmortems where people fire off an update and never know they screwed something up until the frantic phone calls.

Speaking as somebody who has edited live code with notepad and hosted production systems, yeah shit happens, but some shit is much much worse than others.
posted by Ad hominem at 12:39 PM on March 21, 2011 [1 favorite]


infini -- This is a web 0.1 version if it's web at all... considering we're talking vi this is something that could have happened on a gopher site.
posted by malphigian at 12:39 PM on March 21, 2011


If anyone needs an injection of stable cash and the capacity to actually make their site work... it's Tumblr.

Bullshit. You can piece together a good-enough dev environment for probably less than their office's daily coffee bill with old cheap hardware. If anything, this shows that they shouldn't be getting cash infusions.

Is this the modern 2.0 version of "Oops sorry boss i hit send on that email" ?

No, this would be more like, I don't know, a car mechanic doing a tire rotation while the car is driving down the highway.
posted by Threeway Handshake at 12:40 PM on March 21, 2011 [19 favorites]


And here I thought I was the only guy using m3MpH1C0Koh39AQD83TFhsBPlOM1Rx9eW55Z8YWStbgTmcgQWJvFt4 as a password. Jeez.
posted by bhance at 12:40 PM on March 21, 2011 [19 favorites]


Ten years ago I was trying to fix some minor bugs on a major California-based e-commerce site. I would tinker with the scripts, update the page to the test server, refresh my browser but my changes were having no apparent effect.

Frustrated, I decided to buzz off to the gym.

When I came back an hour later everyone was running around like their ass was on fire, as I had brought down the client's website and halted online sales. Turns out that, due to my extremely laggy remote desktop connection, I had accidentally dropped my files into the production server file tree instead of the test server, which were unwisely right next to each other on one filesystem.

They started putting the two servers on different machines after that.
posted by CynicalKnight at 12:40 PM on March 21, 2011 [2 favorites]


Is this something I would have to have a television computer fixie degree in computer programming to understandf?

Not sure if you're totally serious or not, but in the interest of non-techno-weenies out there...

They've inserted a character at the very beginning of a program file which effectively prevents the code from being run.

Normally the webserver would get a request for a resource, pass it off to the code interpreter, run the code, and return the response (usually a web page).

In this case, the code interpreter says "hey that's not PHP, so you just want me to serve this file up? Ok."

The double embarrassing part is this file is chock full of security stuff that people shouldn't be seeing.
posted by device55 at 12:41 PM on March 21, 2011 [8 favorites]


Definitely not the best advertisement for modality.
posted by tommasz at 12:41 PM on March 21, 2011 [5 favorites]


@tomswift: Someone at tumblr edited the version of the code that is used on the version of the website that is used by the public. They are using a notoriously insecure (but perversely popular) programming / web development language called PHP that if not set up carefully will serve your code to the website visitor if it doesn't understand what it is (caused by the first character being an i instead of '<').

The password to their databases is in line 9, so a cautious person would not trust that their further security has not been breached and would change their password if they used the same password somewhere else (though such a cautious person would not use the same password for different websites, of course...).

As for tumblr, they will have to perform a very careful audit to ensure that no one has gained further access to their databases.
posted by Morbuto at 12:42 PM on March 21, 2011


Midnight deploys are for idiots. Here is another tale of a production deploy that nobody bothered to check. How do these people have jobs?
posted by Ad hominem at 12:44 PM on March 21, 2011 [4 favorites]


Tumblr's way too big to allow cowboy crap like editing production code on the fly. I've never worked for a web company but I've worked for similar sized software shops and you don't edit stuff outside of source control and don't push changes without review.
posted by octothorpe at 12:44 PM on March 21, 2011 [1 favorite]


nano 4 life!

If there's anything emacs users and vi users can agree on, it's that nano sucks. ;) Come on, you have to move your hands off the home row just to navigate!
posted by kmz at 12:44 PM on March 21, 2011 [3 favorites]


So, does this mean that user passwords were compromised? My reading suggested that it was server password and data....
posted by tomswift at 12:45 PM on March 21, 2011


OK, so tumblr gets compromised. What's going to happen? Someone posts non-hedgehogs to "fuck yeah, hedgehogs", or what?
posted by Wolfdog at 12:46 PM on March 21, 2011 [40 favorites]


Speaking as a guy who pretty much lives in Vim and writes a ton of PHP, this strikes me as a possible explanation, but I wonder how many people are actually spending time in replace mode. Not to mention how many people aren't, you know, immediately checking the results of hitting :wq on a production box, should they be dumb (or desperate) enough to be editing code there.

In this case, the code interpreter says "hey that's not PHP, so you just want me to serve this file up? Ok."

Somewhat more exactly, PHP treats everything not sandwiched between <?php ?> as a literal string to be echoed to the browser, which itself is a sufficiently loathsome feature that I wish I could at least turn it off outside of template files...
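
A toy model of that behavior, in Python rather than PHP so it can be run anywhere. This is only a sketch: the `render` function, the placeholder text and the sample password are made up for illustration, and real PHP does far more (for one thing, it accepts an opening tag with no closing `?>`).

```python
import re

# Toy model only: anything between <?php and ?> counts as "code",
# everything else is echoed back verbatim, as described above.
PHP_BLOCK = re.compile(r"<\?php(.*?)\?>", re.DOTALL)


def render(source: str) -> str:
    """Replace each code block with a placeholder; pass all other text through."""
    return PHP_BLOCK.sub("[code ran here]", source)


intact = "<?php $db_password = 'hunter2'; ?>\n<html>hello</html>"
# Replace mode in vi: the leading '<' is overwritten by a stray 'i'.
broken = "i" + intact[1:]

print(render(intact))  # the password never reaches the visitor
print(render(broken))  # no opening tag is recognized, so the whole file is echoed
```

With the opening tag gone, nothing in the file counts as code any more, so the "literal string" rule applies to every byte of it, secrets included.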
posted by brennen at 12:46 PM on March 21, 2011 [1 favorite]


So, does this mean that user passwords were compromised? My reading suggested that it was server password and data....

Yeah, nobody got the CEO or IT department's door keys, just the keyring of the Janitor...
posted by jscott at 12:46 PM on March 21, 2011 [3 favorites]


Could someone give a brief explanation of vi/vim? Does it offer some advantage?
posted by zennie at 12:47 PM on March 21, 2011


tomswift, presumably user passwords in the database are encrypted.
posted by octothorpe at 12:47 PM on March 21, 2011


Somewhat more exactly, PHP treats everything not sandwiched between <?php ?> as a literal string to be echoed to the browser, which itself is a sufficiently loathsome feature that I wish I could at least turn it off outside of template files...

Of course you're correct - I didn't want to get too densely technical for non technical people.

The main idea is just that instead of running that code, the webserver was just displaying that code for all to see, including the sensitive bits.

Sort of hard to explain though without getting into how PHP works on a webserver.
posted by device55 at 12:49 PM on March 21, 2011


Could someone give a brief explanation of vi/vim? Does it offer some advantage?

vi/vim are both programs that allow you to edit files. You can use them for text, code, etc. They look very DOS-ish to the unacquainted.

Pros: It's been around forever. It will let you do anything you want to code.

Cons: It's been around forever. It will let you do anything you want to code.
posted by Mister Fabulous at 12:50 PM on March 21, 2011 [11 favorites]


Ok, can someone explain to me the benefit of having 4000+ lines of routing configuration in-code contained within one multidimensional array, that doesn't seem to have any human readable enumerated values and seems to have no rhyme or reason to its structure?

For example
            [366] => Array
                (
                    [0] => /pay/:amount/:key/:nipple_direction/:nipple_offset/address
                    [1] => Array
                        (
                            [controller] => secure
                            [action] => pay
                            [change_shipping_address] => 1
                        )

                )
I'm just dying to know what exactly a 'nipple_offset' is, and why they've got multiple entries in their routing table for it.

God, is that some code that needs a cleaning.
posted by SweetJesus at 12:50 PM on March 21, 2011 [4 favorites]


octothorpe: "tomswift, presumably user passwords in the database are encrypted."

Here's a tip for judging security of databases: if you use a "forgot my password" and they send you an email with it, RUN.
posted by wcfields at 12:51 PM on March 21, 2011 [10 favorites]


Could someone give a brief explanation of vi/vim? Does it offer some advantage?

Assuming you're not joking, vi is a text editor. It is special, in that you have to press "i" on the keyboard to go into "insert mode" in order to change anything. It offers zero advantages to anything in the history of the universe and is quite possibly the worst thing ever.

:q!
posted by Threeway Handshake at 12:51 PM on March 21, 2011 [11 favorites]


Could someone give a brief explanation of vi/vim? Does it offer some advantage?
posted by zennie at 12:47 PM on March 21

Assuming this is not one of the best troll questions ever, I have to add this to Mister Fabulous's answer:

Almost every single machine running anything resembling UNIX will have vi/vim installed. It is hard to learn, but you only have to learn it once and you can work on most machines out there.
posted by Dr. Curare at 12:52 PM on March 21, 2011 [7 favorites]


Tumblr's way too big to allow cowboy crap like editing production code on the fly

Yeah I honestly think this is a case of an automated deploy after a checkin.

One way to be sure: there's no way tumblr is running only one instance, so did the source pop up on only one server or all of them? An automated deploy would have deployed to the entire cluster.
posted by Ad hominem at 12:53 PM on March 21, 2011 [3 favorites]


Ok, can someone explain to me the benefit of having 4000+ lines of routing configuration in-code contained within one multidimensional array, that doesn't seem to have any human readable enumerated values and seems to have no rhyme or reason to its structure?

That looks like a variable dump as part of an error page and not config data. Most configs in PHP are done in INI files or XML
posted by device55 at 12:53 PM on March 21, 2011


tomswift, presumably user passwords in the database are encrypted.

If they're editing live production code, all bets are off. And hey, the Gawker passwords were supposedly encrypted too.

It is hard to learn, but you only have to learn it once and you can work on most machines out there.

For me it's never really been about that, but more the fact that (with CapsLock remapped to either Control or Esc) your hands never have to leave home row to do 99% of normal editor commands.
posted by kmz at 12:55 PM on March 21, 2011 [1 favorite]


SweetJesus: Ok, can someone explain to me the benefit of having 4000+ lines of routing configuration in-code contained within one multidimensional array, that doesn't seem to have any human readable enumerated values and seems to have no rhyme or reason to its structure?

I thought the same thing, but I sort of wonder if it's not generated. (Or, on preview, maybe pulled out of an INI/YAML/XML, like device55 said).
posted by brennen at 12:55 PM on March 21, 2011


Midnight deploys are for idiots. Here is another tale of a production deploy that nobody bothered to check. How do these people have jobs?

You know what's even more fun? Being that guy's boss. I have a lotta rant in me about cowboy types... bonus points when they tell you you just don't understand how coding works when you point out that they shouldn't be pushing to production without testing first.
posted by L'Estrange Fruit at 12:57 PM on March 21, 2011 [3 favorites]


This is the same bunch that can't figure out how to make the queue system actually work.
posted by Legomancer at 12:58 PM on March 21, 2011 [1 favorite]


Does it offer some advantage?

It's installed by default on pretty much every server running a Unix-like OS, so if you're SSHed into a machine it's pretty much guaranteed to be there if you need to edit a file.

Its syntax is so complex that learning and using it could also double as one of those 'train your brain' exercises for staving off Alzheimer's (but some would probably claim that using vi voluntarily is a sure sign of dementia).
posted by Kadin2048 at 12:58 PM on March 21, 2011 [7 favorites]


It's also been said that vi/vim is an excellent tool for generating random alphanumeric strings. All you have to do is put a novice user in front of it and use their keystrokes as they try and quit as output.
posted by The Lurkers Support Me in Email at 12:59 PM on March 21, 2011 [40 favorites]


Someone back me up here.

You're editing live and you don't have backups?
posted by justkevin at 12:59 PM on March 21, 2011 [7 favorites]


Ok, definitely the question was a masterful troll.

Even if you do not know it zennie, the troll is strong in you :)
posted by Dr. Curare at 1:00 PM on March 21, 2011


Device55 is correct, that is part of an error page. If you notice they are setting up a javascript array on an onclick event.
posted by Ad hominem at 1:01 PM on March 21, 2011


I thought the same thing, but I sort of wonder if it's not generated.

Yeah, that part isn't actually part of the code. That's from a print_r dump of a variable. Looks like maybe some kind of debug trace.
posted by kmz at 1:02 PM on March 21, 2011


installed by default on pretty much every server running a Unix-like OS ...

 ___
{o,o}
|)__)
-”-”-
O RLY?
server:/tmp/,/fast# vi 1.user
E558: Terminal entry not found in terminfo

posted by k5.user at 1:02 PM on March 21, 2011 [8 favorites]


That looks like a variable dump as part of an error page and not config data. Most configs in PHP are done in INI files or XML

Well, that would make a lot more sense. But that raises the question of why they are dumping their stack contents to the web rather than writing to a local log file. So were they using some sort of internal tool to do diagnostics, and due to an errant 'i' when someone hand-edited it in vim, their PHP script was treated as plain text?

I've been out of the PHP world for almost 10 years now, and I'm struggling a bit to see how exactly this could have happened so easily...
posted by SweetJesus at 1:07 PM on March 21, 2011


I've been out of the PHP world for almost 10 years now, and I'm struggling a bit to see how exactly this could have happened so easily...

Whatever framework they've cobbled together may do an error dump to the page based upon the environment - sort of like "if dev, then dump errors to page" - that's fairly common with newer PHP frameworks like Zend Framework - it makes debugging slightly easier over tailing the logs. (depending upon how things are set up, maybe you don't want developers in your log directory? I dunno)

I haven't gone through that entire file - and I ain't gonna - but I reckon that some of the PHP executed but not all of it, which is why it's part code, part error HTML.

The first part of the file contains:

if (__FILE__ == '/var/www/apps/tumblr/config/config.php' || __FILE__ == '/data/tumblr/config/config.php') {
define('ENVIRONMENT', 'production')


Which obviously didn't get run, because it's there in the page. So maybe the error dump assumes a development environment by default.
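
For what it's worth, the safer arrangement is the opposite default: treat anything unset or unrecognized as production, and only dump errors to the page when development is explicitly configured. A minimal sketch in Python, not Tumblr's code or any framework's API; the `APP_ENV` variable name and both function names are assumptions for illustration.

```python
import html
import logging
import os

logger = logging.getLogger("app")

# Hypothetical set of environment names; anything else falls back to production.
KNOWN_ENVIRONMENTS = {"development", "test", "production"}


def current_environment(env=None) -> str:
    """Return the configured environment, defaulting to production when unsure."""
    env = os.environ if env is None else env
    value = env.get("APP_ENV", "")
    return value if value in KNOWN_ENVIRONMENTS else "production"


def error_response(exc: Exception, env=None) -> str:
    """Always log the error; show details on the page only in development."""
    logger.error("unhandled error: %s", exc)
    if current_environment(env) == "development":
        return "<pre>" + html.escape(f"{type(exc).__name__}: {exc}") + "</pre>"
    return "<h1>Something went wrong</h1>"
```

With this shape, a config file that fails to run leaves the site quiet instead of chatty: `error_response(err, {})` returns the generic page, and only `{"APP_ENV": "development"}` gets the details.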
posted by device55 at 1:14 PM on March 21, 2011


The var dump is rather strange, now that I think about it. There's no way that should be in a production page at all, even if the page didn't have the PHP tags messed up. Now I'm thinking it was an errant deploy. The other possibility is that the mistagged portion caused a runtime error later on in the page, which triggered an automatic error handler that dumps vars among other things. Which is also a really stupid thing to have configured on a production system.
posted by kmz at 1:14 PM on March 21, 2011


I feel their pain. In any editor it's easy to get in a hurry and put a stray character at the top of a file. Woe betide anyone who does this at the top of a C++ header file -- the errors spewed forth look like the Congressional record.
posted by RobotVoodooPower at 1:15 PM on March 21, 2011 [6 favorites]


The thing that gets me is that all the Database definitions look to be written by hand, as some of the entries have associated comments, and some entries are commented out. If that section was generated via XML or an INI file it probably wouldn't have comments, and it almost certainly wouldn't have entries that are commented out 'c-style'. Of course, it could be a programmatic cut/paste of a stub file that someone wrote, but I think PHP has a native function call to do something like that.
posted by SweetJesus at 1:21 PM on March 21, 2011


I am reminded of two experiences early in my computing career, the first "utter horror" at realizing I had just hit !G<typoinshellcommand>:wq!

and the utter joy of doing something similar at the first place I worked that had version control.

That we never even covered the existence of version control systems in my college Comp-Sci curriculum...well one of our professors' degrees was in Music Theory.
posted by nomisxid at 1:23 PM on March 21, 2011


Let's list out a few ways that this wouldn't be such a big deal if the guys at Tumblr took the time to do things right:

- Store all connectivity, private key, etc., data in a file outside of the web root, and not "include"-d in your source. Read it once on application startup, save it to memory, and be done with it. (Java does this via web.xml, JNDI, properties files, etc; you have to go out of your way to do this incorrectly)

- Create unit tests for all of your functionality, pages, methods, etc., and only allow a production deployment after all unit tests pass. If you ever have a regression like this, you catch it before it goes live. This adds a lot of time to your development cycle, especially if you have a larger site, but it is well worth the time investment.

- Do not allow direct access to the production environments; only have automated jobs that move code from environment to environment. Team-lead or higher folks should be the only ones with access to production, to prevent the types of scenarios as rsync'ing to the wrong box, etc.

- Any environment-specific configuration should, in theory, be done once, when you set up the environment. "environment='production'" should be the only variable you ever need to set, besides any connectivity variables, then let your codebase switch based on the value of that variable. (Again, Java's JNDI allows you to do this extremely easily)

- Only use a mature IDE for your development, one that is syntax-aware. A developer looking at an entire file that just lit up as all purple, for example, will know something's up before they decide to save their file. Besides code-completion, automatic refactoring (which can be good or bad), and other various niceties, a syntax-aware IDE can go a long, long way for developer efficiency. Java developers who have hunted for trailing semicolons know what I'm talking about here.

- Use reverse proxy servers that include the ability to rewrite the http response, so you can set up rules designed to limit information leak. In Tumblr's example, if they had a rule that stopped any response with ?php within the first few bytes of the response, this page wouldn't have leaked data. If you have the money to spend, the F5 BigIP with its iRules is invaluable for this type of sanity checking.

- Last, but definitely not least, make sure that your developers know to take the time to do things right, even in the case of emergency breakfixes. This bug was the result of a developer rushing, no question about it. Whether they were editing a live file or doing an automated deployment, it doesn't matter, they didn't take the time to do things right. Dev team leads: sit down, right now, with your team, and tell them that they won't get fired for being five minutes late with a code deployment. Take the extra time to sanity check their work.
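
A rough sketch of the response check from the reverse-proxy bullet, written in plain Python rather than as an F5 iRule. The function names, the 64-byte window and the replacement 500 response are all made up for illustration; a real proxy rule would be written in that proxy's own configuration language.

```python
# Marker taken from the bullet above: "?php" near the start of a response.
# It also matches an intact "<?php", since that contains the same bytes.
LEAK_MARKER = b"?php"


def looks_like_leaked_source(body: bytes, window: int = 64) -> bool:
    """Return True if the first `window` bytes of the body contain the marker."""
    return LEAK_MARKER in body[:window].lower()


def filter_response(status: int, body: bytes):
    """Swap a suspicious response for a generic error instead of passing it on."""
    if looks_like_leaked_source(body):
        return 500, b"Internal Server Error"
    return status, body


print(filter_response(200, b"i?php define('ENVIRONMENT', 'production');"))
print(filter_response(200, b"<html><body>fine</body></html>"))
```

A check like this is a backstop, not a fix. It would also block a legitimate page that happens to start with that string, which is part of why the window is kept small.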
posted by mark242 at 1:24 PM on March 21, 2011 [35 favorites]


Anyway vi or not, they're idiots for editing live stuff.

Or at least *hitting refresh* and looking at the page after you've edited it. Doing this in production is bad; doing this in production then not even bothering to see if it worked is bad to the bad power.

The posit is someone changed this file in production and never noticed that it broke the page. Either the posit is wrong, or the fail is strong in whoever was editing that file.
posted by eriko at 1:24 PM on March 21, 2011


Only use a mature IDE for your development, one that is syntax-aware. A developer looking at an entire file that just lit up as all purple, for example, will know something's up before they decide to save their file. Besides code-completion, automatic refactoring (which can be good or bad), and other various niceties, a syntax-aware IDE can go a long, long way for developer efficiency. Java developers who have hunted for trailing semicolons know what I'm talking about here.

Vi is an incredibly mature development platform. It's got syntax highlighting built in, plenty of code completion & refactoring plugins (hello C-scope!). I'd rather use Vi, GCC and Valgrind to do my day to day work than Eclipse or Visual Studio.
posted by SweetJesus at 1:28 PM on March 21, 2011 [3 favorites]


The thing that gets me is that all the Database definitions look to be written by hand, as some of the entries have associated comments, and some entries are commented out. If that section was generated via XML or an INI file it probably wouldn't have comments, and it almost certainly wouldn't have entries that are commented out 'c-style'

The 'generated' stuff is towards the bottom - the stuff you describe looks to be hand edited with a number of quick and dirty edits in place.

Many PHP classes are designed to accept arrays for options, which is what they're doing - but the clean and tidy way to do that is to define the array via configuration and then just dump the array variable into the constructor.

e.g. Database::set_defaults($config->dbDefaults);
posted by device55 at 1:29 PM on March 21, 2011


pico > vi, amirite?
posted by andreaazure at 1:29 PM on March 21, 2011 [7 favorites]


Vi is an incredibly mature development platform.

Well, Vim is. Plain vi, not so much.

That we never even covered the existence of version control systems in my college Comp-Sci curriculum...well one of our professors' degrees was in Music Theory.

Well, now we're getting into the age old debate on what Computer Science education at the university level should be. Most CS departments are still focused (and rightly so, IMO) on the science and theory of computing. Things like IDEs, version control, etc are practical applications. There probably should be at least one or two basic prep courses in good coding practice, but they are to CS what lab techniques are to chemistry.
posted by kmz at 1:34 PM on March 21, 2011 [2 favorites]


The 'generated' stuff is towards the bottom - the stuff you describe looks to be hand edited with a number of quick and dirty edits in place.

Yeah, I see what you're saying. It still seems insane to me that this sort of stuff isn't data driven. Even if you think nothing is EVER going to go wrong, and you have the most bulletproof processes in existence, why would you store that sort of information in plain text on a webserver? Why wouldn't you want to read it out of an encrypted file, decrypt it in memory, and use that? Genuine question, I don't do web development anymore, and I'm wondering if this sort of thing is standard practice?

Well, Vim is. Plain vi, not so much.

You're correct. I use Vi and Vim interchangeably, but I meant vim in my head. GVim in particular.
posted by SweetJesus at 1:37 PM on March 21, 2011


There probably should be at least one or two basic prep courses in good coding practice, but they are to CS what lab techniques are to chemistry.

An entry-level requirement without which you shouldn't be permitted to enter advanced classes, never mind graduate?
posted by feckless at 1:39 PM on March 21, 2011 [9 favorites]


Well, now we're getting into the age old debate on what Computer Science education at the university level should be.

Want to have a fun discussion at a sufficiently nerdy party? Bring this up. Ask if it is better to be trained/educated as a "computer scientist" or a "programmer".
posted by maryr at 1:41 PM on March 21, 2011 [1 favorite]


Thanks everyone for answering. It was kmz's reference to the "power of vi/vim" that got me curious. I just googled "vi commands." ! Who developed this editor, and what did they have against humanity?
posted by zennie at 1:41 PM on March 21, 2011 [7 favorites]


Who developed this editor, and what did they have against humanity?

If you think that's scary, you should google 'Emacs commands'.
posted by SweetJesus at 1:43 PM on March 21, 2011 [3 favorites]


What they had wasn't something against humanity, but rather a desire to advance the state of the art. vi was as much better than the editors that came before it as the editors that came after it are than it. Look up edlin or ed, for example.

(also: emacs is a perfectly good operating system - too bad it doesn't come with a decent text editor)
posted by Fraxas at 1:44 PM on March 21, 2011 [1 favorite]


tomswift

This question has probably been answered to your satisfaction... but what they were apparently doing is the equivalent of a construction crew going to a school playground during recess and tightening bolts on the playground equipment while children are currently playing on it.

Ideally, on any non-hobby website, or any site that has commercial aspirations, the code that people are working on should be several steps removed from that used by the general public.
posted by The Confessor at 1:44 PM on March 21, 2011


Ask if it is better to be trained/educated as a "computer scientist" or a "programmer".

Meta-Holy War: Is the CS/programmer holy war better than the vi/emacs holy war? Discuss!
posted by kmz at 1:45 PM on March 21, 2011 [1 favorite]


 ___
{o,o}
|)__)
-”-”-

ASCII art ... or valid VI session?
posted by zippy at 1:45 PM on March 21, 2011 [40 favorites]


Places like Quora also do "continuous deployment" where code is reviewed afterwards.

Although it's true that a real person usually only looks at the code after deployment, places like Quora use continuous integration methodology. That means that the code being pushed out has to pass all of its unit tests before it's actually deployed.
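
The general shape of such a gate, as a hedged sketch in Python. This is not Quora's pipeline or any particular CI tool; `page_renders` and `deploy_if_green` are hypothetical names, and the single check here stands in for a real test suite.

```python
from typing import Callable, Iterable


def page_renders(body: str) -> bool:
    """Smoke check: a rendered page should not contain a raw PHP tag."""
    return "?php" not in body


def deploy_if_green(checks: Iterable[Callable[[], bool]],
                    deploy: Callable[[], None]) -> bool:
    """Run every check; call deploy only if all of them pass."""
    if all(check() for check in checks):
        deploy()
        return True
    return False


deployed = []
ok = deploy_if_green([lambda: page_renders("<html>fine</html>")],
                     lambda: deployed.append("build-1"))
blocked = deploy_if_green([lambda: page_renders("i?php $secret = 1;")],
                          lambda: deployed.append("build-2"))
print(ok, blocked, deployed)
```

The point is only the ordering: the deploy step is unreachable unless the checks have already run and passed, so a page that renders as source never gets pushed.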

Whatever tumblr is doing, editing live code, that's just stupid.

As for vim, here's why you use vim.
posted by Who_Am_I at 1:47 PM on March 21, 2011 [2 favorites]


Google crawled at least a handful of sites while the bug was present which makes for even more amusement as it reveals that the box that they're running on is a stock CentOS 5.5 server with a load average of only 1.48 (which is surprisingly low given how frequently I see the tumblebeasts). Also, as far as I can tell they're not proxying the requests, they're just going straight to Apache... I kinda feel like giving them a hug.
posted by togdon at 1:48 PM on March 21, 2011 [1 favorite]


When they say "we’ll be seriously evaluating and adjusting our processes", I presume they mean "we will be creating some processes."

I read it as "no longer will we allow drinking beer on site" or "We'll be adding some people to the unemployment rolls"
posted by rough ashlar at 1:50 PM on March 21, 2011 [1 favorite]


Even if you think nothing is EVER going to go wrong, and you have the most bulletproof processes in existence, why would you store that sort of information in plain text on a webserver? Why wouldn't you want to read it out of an encrypted file, decrypt it in memory, and use that? Genuine question, I don't do web development anymore, and I'm wondering if this sort of thing is standard practice?

There's a lot they could have done better here. It looks like code that was 'good enough for launch' and never got refactored.

PHP is all plain text - it's a scripting language interpreted by the PHP interpreter (which is itself written in C) - the interpreter is made available to apache webserver via an apache module or via CGI.

There are techniques for obfuscating PHP, but somewhere along the line you're going to have to have a decryption key or a password or something somewhere in your code.

Ideally, though, this bit of code is not accessible via the webserver directly. That is it would be impossible for one to type in a URL which loaded up your sensitive information - so if PHP fails like it does here, the sensitive stuff isn't served up.

Most modern PHP frameworks do this more or less by default these days. The application config, database connectivity, etc, are all outside of your web root and would require someone have SSH or FTP access to your machine to get at it.
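To make the "outside your web root" idea concrete, here is a rough sketch in Python rather than PHP. It assumes a JSON config file; `load_settings`, `is_inside`, and the directory layout are made up for illustration and are not taken from any particular framework.

```python
import json
from pathlib import Path


def is_inside(path, directory):
    """True if path resolves to directory itself or somewhere beneath it."""
    path = Path(path).resolve()
    directory = Path(directory).resolve()
    return path == directory or directory in path.parents


def load_settings(config_path, web_root):
    """Load JSON settings, refusing any file the web server could serve."""
    if is_inside(config_path, web_root):
        raise ValueError(f"{config_path} is inside the web root {web_root}")
    with open(config_path, encoding="utf-8") as fh:
        return json.load(fh)
```

With a layout like `/srv/app/config/app.json` next to `/srv/app/public/` (both paths are assumptions), the credentials file has no URL at all, so an interpreter failure can only leak whatever lives under `public/`.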
posted by device55 at 1:50 PM on March 21, 2011


> Who developed this editor, and what did they have against humanity?

If you think that's scary, you should google 'Emacs commands'.


Holy... I retract my comment about the good and wonderful developer of vi.
posted by zennie at 1:50 PM on March 21, 2011 [3 favorites]


you should google 'Emacs commands'

I think you mean M-x browse-url "www.google.com/search?query=emacs commands"
posted by wildcrdj at 1:51 PM on March 21, 2011 [3 favorites]


BTW, why does fuckyeahsourcecode.tumblr.com not exist yet? This seems like it should pretty well satisfy the end of the fuckyeahNOUN meme well and good.
posted by maryr at 1:52 PM on March 21, 2011 [2 favorites]


If there's anything emacs users and vi users can agree on, it's that nano sucks. ;)

Did you happen to miss the subject of this post? Was it nano sucks? Nope, it sure wasn't. In fact, I have yet to hear a good argument against nano, except that it's not L33T enough because anyone can use it. Which is an utterly asinine argument, and leads to the kind of crap described in the OP.
posted by Civil_Disobedient at 1:53 PM on March 21, 2011 [1 favorite]


Bullshit. You can piece together a good-enough dev environment for probably less than their office's daily coffee bill with old cheap hardware. If anything, this shows that they shouldn't be getting cash infusions.

Especially not when they're still spending the cash they do have on "fashion directors" and refer to the people who create their sole product as "technicians". Garret Murray nailed it, basically.
posted by bonaldi at 1:53 PM on March 21, 2011 [2 favorites]


the errors spewed forth look like the Congressional record.

There was grandstanding and attempts to compile comments like
/* We should use Java for this */
/* Ruby is a sane design */
?
posted by rough ashlar at 1:56 PM on March 21, 2011


I have yet to hear a good argument against nano,

You conveniently snipped out my personal reason to not want to use nano, which is dependence on arrow keys to navigate text. You might not mind moving your hands off the home row all day long, but I do.
posted by kmz at 1:57 PM on March 21, 2011 [1 favorite]


PHP is all plain text - it's a scripting language interpreted by the PHP interpreter (which is itself written in C) - the interpreter is made available to apache webserver via an apache module or via CGI.

There are techniques for obfuscating PHP, but somewhere along the line you're going to have to have a decryption key or a password or something somewhere in your code.


Yeah, but why couldn't you do something along the lines of:

1) Store your encrypted password in a plain text file that is below the root directory of the webserver.
2) In your php script, open the file with the password, somehow parse the contents and assign the encrypted password value to a variable.
3) Decrypt the password in memory (whether you have to have some private key in another file or whatever, assume the implementation details are not that important)
4) Use the variable with the decrypted password as you normally would use your plain text password in the function call to the database.

That's the way I would implement something like that. Granted that I probably don't know what the hell I'm talking about w/r/t web security, but it seems to me that would be the way to go.
posted by SweetJesus at 1:58 PM on March 21, 2011 [1 favorite]


Who developed this editor, and what did they have against humanity?

Bill Joy. Co-Founder of Sun Microsystems.
posted by cmdnc0 at 1:59 PM on March 21, 2011 [1 favorite]


leads to the kind of crap described in the OP.

And oh, you really think the choice of editor is the crux of what happened here? Are you serious?
posted by kmz at 1:59 PM on March 21, 2011


Yeah, but why couldn't you do something along the lines of…

You totally could - most well written off-the-shelf PHP stuff works more or less as you describe but skips the encryption/decryption step for convenience (if your credentials aren't web accessible they are reasonably secure [1]) but it would be trivial to add an encryption step for extra security.

If someone has access to your source code, then they have access to your decryption key, and then they have access to your password. So encrypting is definitely more security and a good thing - but once someone malicious has access to your source code, you're only inconveniencing them a little bit.

The tumblr code on display here is sloppy sloppy, but that's not the security flaw. The flaw is that the web server is serving secure data directly. If those database passwords were simply in another file which was not accessible by the webserver, encrypted or no, this would have only been an embarrassing bug.

It's like that scene in the Simpsons where Mr. Burns goes through several, increasingly ludicrous levels of security in the power plant, only to find the screen door was left open and a dog was hanging out in there.

1. "reasonably" assuming your web server is secured and you encrypt user passwords and and and and.
posted by device55 at 2:14 PM on March 21, 2011 [1 favorite]


Miraculously, she said, it was as though the decades had rolled back to the good old days of internet axe murderers and classic snark.
posted by infini at 2:18 PM on March 21, 2011


I JUST made a Tumblr page for my new band, will this debacle impact me at all?
posted by Cpt. The Mango at 2:32 PM on March 21, 2011


k5.user: "E558: Terminal entry not found in terminfo"

The program that generated that error was vi.

vi is installed.

It is technically not even a valid Unix machine if you don't have vi.

On a dumb terminal you can fall back to ex (which can do anything you can do in vi at the ":" prompt). Ex is usually the same executable as vi, but runs without the screen oriented display.

I am an emacs user. Hell I have written some decent sized programs in emacs lisp. But yeah I know vi (and I can run it in ex mode, or even use ed in a pinch) because vi is always there and often comes in handy.

(also if the only vi you have installed is vim it is not a Unix system because vim does not follow the specifications for being a valid vi implementation)
posted by idiopath at 2:36 PM on March 21, 2011 [2 favorites]


Is this something I'd need a test environment to understand?
posted by Spatch at 2:40 PM on March 21, 2011 [8 favorites]


It's also been said that vi/vim is an excellent tool for generating random alphanumeric strings. All you have to do is put a novice user in front of it and use their keystrokes as they try and quit as output.

WTF: that just happened to me! I read the post before it, realized I was logged into a friend's server via SSH, and went to try it out. I just got out.
posted by coolxcool=rad at 2:44 PM on March 21, 2011 [3 favorites]


echo < ed < vi
posted by ennui.bz at 2:47 PM on March 21, 2011


Bill Joy wrote vi in 1976. I bet most people working at tumblr weren't even born then.

This interview with him is interesting: I tried to write the thing in Pascal because Pascal had sets, which Ken Thompson had permitted to be of arbitrary length. The program worked, but it was almost 200 lines long - almost too big for the Pascal system.

Also, metafilter has this handy script.
posted by exogenous at 3:13 PM on March 21, 2011


Is this the modern 2.0 version of "Oops sorry boss i hit send on that email" ?
No, it's just the classic version of flat out incompetence.
Could someone give a brief explanation of vi/vim? Does it offer some advantage?
Its advantage is that it's somewhat less intimidating to learn than emacs.

Emacs and vi are the two de facto standard text editors found on every Linux and Unix installation not called Ubuntu. Both editors are important to Unix users because they are fully functional when used from a command-line text interface.
posted by i_have_a_computer at 3:18 PM on March 21, 2011


I JUST made a Tumblr page for my new band, will this debacle impact me at all?

If you're using the same password for the Tumblr page that you're using for other systems, say:
m3MpH1C0Koh39AQD83TFhsBPlOM1Rx9eW55Z8YWStbgTmcgQWJvFt4
then you should update those passwords.
posted by sebastienbailard at 3:19 PM on March 21, 2011


Kadin2048: "Eight Megs And Constantly Smirking?"

justin@information-density:~$ ps -eo cmd,rss | grep emacs
/usr/local/bin/emacs-23.1 48896
grep emacs 728


pretty soon we are going to have to upgrade that cliche to "eight gigs and constantly swapping"

ennui.bz: "echo < ed < vi"

Most systems keep ed in /usr/bin and vi in /usr/local/bin so there are very few circumstances where that command would do anything but print an error.

On the other hand "echo > ed > vi" should create one empty file and one containing a single newline.
posted by idiopath at 3:25 PM on March 21, 2011


vi is an excellent text editor. I use it every day: code, HTML, email. I even write my long Metafilter comments in vi.

This is not a problem with vi; this is a problem with a company that doesn't have competent developers.
posted by sonic meat machine at 3:34 PM on March 21, 2011


sonic meat machine, you can have all the source control in the world and still fail to test before moving code to production.

Now, if you linked tools that did code analysis on check-in...
posted by mikeh at 3:36 PM on March 21, 2011


This holy war between emacs and vi... you ARE just using it for an excuse to exhibit your dashing wit, right? Nobody's going to murder me because I find windowed IDEs convenient?
posted by LogicalDash at 3:44 PM on March 21, 2011


octothorpe: "tomswift, presumably user passwords in the database are encrypted."

wcfields: Here's a tip for judging security of databases: if you use a "forgot my password" and they send you an email with it, RUN.


As of just now, Tumblr does not send you your password, just a link to reset it. Which means that if your browser has stored your password and you don't remember what it is, you have no way to find out what your password at Tumblr was (to check whether you use it anywhere else).

Does anyone really know if their passwords were encrypted or not? And since their server was compromised, does that mean that even encrypted passwords are effectively compromised?
posted by msalt at 3:48 PM on March 21, 2011


This holy war between emacs and vi... you ARE just using it for an excuse to exhibit your dashing wit, right?

That's what *I* told the police, anyway.
posted by Dark Messiah at 3:48 PM on March 21, 2011


Also, metafilter has this handy script.

This is actually built-in now, isn't it? I'm not running any userscripts, and I can j/k down/up.
posted by limeonaire at 3:53 PM on March 21, 2011 [1 favorite]


Learning curves for editors
posted by BigCalm at 3:53 PM on March 21, 2011 [8 favorites]


It's easy to lol at this mistake, but part of me admires it. Yeah, in a proper development shop there should be about four stages of checks to prevent the mistake. Unit tests fail, integration tests fail, code review sees the mistake, or the release engineer catches it.

But "proper development" is slow. And tedious. The best and most productive times in my life have been when I really can just hack the live server code, update it and reload immediately, and life is good. I always regret when something I work on has become popular / big / serious enough that I have to put process in place, because the process slows me down. Still, Tumblr has enough users they really need to be more careful.

You know who gets this right? Etsy. They deploy 20+ changes live to the site a day and they've got enough infrastructure in place it seems to work without disaster. Here's notes from Etsy on continuous deployment, I think they also spoke about their process at SXSW. It must have been expensive to set up all the deployment support, but I sure envy what they can do with it.
posted by Nelson at 4:03 PM on March 21, 2011 [4 favorites]


Actually, it's

:0,$facepalm
posted by clvrmnky at 4:05 PM on March 21, 2011 [2 favorites]


Which means that if your browser has stored your password and you don't remember what it is

You can almost always find out what the browser has stored for the password. At some point or another the browser is going to have to know the password in plaintext form. Some browsers just let you see it, others might require an extension or external program.

The best and most productive times in my life have been when I really can just hack the live server code, update it and reload immediately, and life is good.

At the very least you can have a dev directory. It doesn't really add much process at all, but it's a lifesaver for very simple big mistakes.

You know who gets this right? Etsy.

With code, maybe, not so much with privacy policies.
posted by kmz at 4:08 PM on March 21, 2011


msalt: "Does anyone really know if their passwords were encrypted or not? And since their server was compromised, does that mean that even encrypted passwords are effectively compromised?"

I would be inclined to say they are using some sort of one-way hash function to secure their passwords, making it unlikely that a malicious user could extract your plaintext password from the database in a trivial fashion. That said, given the apparent lack of regard for best practices they've shown so far, I have to figure they've screwed something up along the way, like not salting their hashes, making some sort of rainbow table type attack possible.

Basically, while your password is probably safe, it's probably a good idea to change it anyways. That goes double if you use the same password in a lot of places, and triple if you use same/similar usernames in those places. Finally, consider swapping out your browser's password manager for a password safe like KeePass or PasswordSafe.
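For anyone wondering what "salting" buys you, a minimal sketch using only Python's standard library. This says nothing about what Tumblr actually does; the function names and the iteration count are assumptions picked for illustration.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # assumed work factor; tune for your own hardware


def hash_password(password, salt=None):
    """Return (salt, digest) using PBKDF2-HMAC-SHA256 with a per-user salt."""
    if salt is None:
        salt = os.urandom(16)  # a fresh random salt for every stored password
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return salt, digest


def verify_password(password, salt, expected_digest):
    """Re-derive the digest with the stored salt and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected_digest)
```

Because every user gets a different salt, two people with the same password end up with different stored digests, which is what defeats a precomputed rainbow table. The site stores only the salt and digest, so it has nothing to email back to you.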
posted by grandsham at 4:15 PM on March 21, 2011 [1 favorite]


Wow it's /. night at mefi. Where my mod points at?
posted by tempythethird at 4:28 PM on March 21, 2011 [2 favorites]


*changes password to ':nipple_offset'*
posted by quonsar II: smock fishpants and the temple of foon at 5:20 PM on March 21, 2011 [4 favorites]


Emacs vi mode.
posted by benzenedream at 5:42 PM on March 21, 2011


Finally, vengeance.

I've got a beef with Tumblr. I joined it soon after it started, and nabbed the brilliant URL plastic.tumblr.com (nothing there anymore). I joined Tumblr because they were promoting the feature that it could pull in content from a set of RSS feeds and aggregate them into a sort of meta-blog for you. So that's what I did - provided it with various feeds from my blog, Youtube, Flickr, whatever, and I had a beautiful "What's Jimbob been doing today" Tumblr.

Then, one day, it disappeared offline. I couldn't log in. I had been banned, with the message that I had broken their terms of service by...using the RSS aggregation function! It seems they decided that aggregating RSS feeds wasn't what they wanted Tumblr to be for anymore, and so banned me. They didn't disable my feeds, maybe shoot me an email telling me they weren't supporting that feature anymore. I was just banned for using a feature that a year earlier they had told me to use.

So screw them. The Fuck Yeah sites are funny, though...
posted by Jimbob at 5:52 PM on March 21, 2011 [6 favorites]


You conveniently snipped out my personal reason to not want to use nano, which is dependence on arrow keys to navigate text. You might not mind moving your hands off the home row all day long, but I do.

Arrow keys are ubiquitous. 99.9% of the keyboarding population can be presented with a keyboard and understand automatically what they do and what their expected behavior is. This is the reason Vi also supports arrow keys. So while I understand the reason why you don't use nano, I don't understand the reason the person who fucked this up didn't use it. From my own experience as a software architect, I would venture to guess the reason they didn't use nano is because they were brow-beaten by their peers to use a "Real" editor (that they didn't understand) instead of something specifically designed to cater to people that are used to modern editors. I see this all the time when a new hire expresses a preference for Windows over Linux.

And oh, you really think the choice of editor is the crux of what happened here? Are you serious?

Do I think their choice of editor is the reason they fucked up the code? Of course it was. There's very little doubt in anyone's mind what editor they were using, what command they thought they were keying, and how it ended up.

Now, do I think their choice of editor is the reason the fucked up code wound up in production? Don't be silly.

And to whomever suggested this was a version control problem: uh, no. Version control just means your fuck up gets preserved in a convenient snapshot for all of history to laugh and point fingers at. But it does nothing to stop bad code from going into production. Cries of continuous integration are similarly misguided. Continuous integration merely ensures the bad code moves off the dev's box in a controlled manner, but alone does nothing to ensure the quality of that code.

The real problems are:
  1. They didn't run this through any kind of QA, automated or otherwise
  2. Developers shouldn't be able to get within a hundred miles of a production box. They shouldn't even have logins, never mind root privileges.
posted by Civil_Disobedient at 5:58 PM on March 21, 2011 [3 favorites]


Could someone give a brief explanation of vi/vim? Does it offer some advantage?
posted by zennie at 12:47 PM on March 21


Before the telephone, there was Morse Code. vi is more arcane than Morse Code, but, strangely, it came after and not before.

One advantage is that if you leave a text editing session open, it's unlikely that a normal person can come by and save your session or exit the program.
posted by StickyCarpet at 6:09 PM on March 21, 2011 [1 favorite]


Which means that if your browser has stored your password and you don't remember what it is, you have no way to find out what your password at Tumblr was

Using Firefox? Tools -> Options -> Security -> Saved Passwords -> Show Passwords -> Yes
posted by inigo2 at 6:19 PM on March 21, 2011 [1 favorite]


Passwords do not belong in source code. Put that in your config file and keep that config file outside the webroot.
posted by o0o0o at 7:50 PM on March 21, 2011


So, in order to continue my track record of exposing myself to higher-than-average levels of pure mockery, I'm going to admit it:

I maintain a .asp ecommerce store on a windows box with notepad++. The last time I used nano was to fiddle with the hosts file on my personal PC. I used VI once, 7 years ago in high-school. I write all my personal sites/themes/views in Coda on a mac.

God, I feel dirty. You people are my idols; still though, I kind of feel bad for you.
posted by thsmchnekllsfascists at 8:41 PM on March 21, 2011


BTW, why does fuckyeahsourcecode.tumblr.com not exist yet? This seems like it should pretty well satisfy the end of the fuckyeahNOUN meme well and good.

maryr: It exists now!
posted by tomierna at 8:49 PM on March 21, 2011 [3 favorites]


Fraxas - but ed is the standard unix text editor, it says so in the manpage!
posted by russm at 8:52 PM on March 21, 2011


I maintain a .asp ecommerce store on a windows box with notepad++

We have all been there, I've maintained a ColdFusion ecommerce site with notepad, we used Access as a backend and used Excel to edit production data.

My process is far more manual than I would like right now but this is how it works. Keep in mind this isn't even publicly available yet.

1) code is checked in to TFS with gated checkins, the code is built and the tests are run. If the conditions aren't met, the code is examined and checked in again. For certain sections of the code we do reviews before checkin so we don't end up with "but it works!" code.
2) code is deployed to a test system that consists of 2 web servers behind a hw load balancer and 2 app servers behind a load balancer, and a SQL server cluster. These have the same setup as production, just fewer servers, so all the same firewall rules are in place.
3) I mark the feature as implemented, a QA guy goes and makes sure everything meets business requirements, and if so marks it as verified. PM sends screen shots to the business to get sign off on the look and feel. These guys like to meddle so they will tinker with wording or icons.
4) towards the end of the iteration we promote the current QA build to staging servers, these match production. There are 4 web servers, 6 application servers and a SQL cluster. This matches production exactly with same load balancer setup and firewall rules (I should add I am not allowed to tinker with the hwlb or firewalls). We let the business know that a release candidate is up and they go check it out.
5) we schedule downtime with the business after EOB west coast time, usually around 10pm eastern
6) I manually build with DEBUG undefined and config transformations handle all the configuration changes needed.
7) we isolate each production system, update the code, manually test each feature from the iteration and test a checklist of features for regression. We do this on a voice bridge so we can sound off as each checkbox gets checked.

Why all this nonsense? Because 95% of the production issues I have seen are caused by configuration issues, not code. Things like firewall rules being more strict in production than on a dev box, conditions you assume are true turning out to be false, and plain old communication screw ups with infrastructure guys, devs, and DBA.
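One cheap way to catch that class of problem before a release is to diff the settings of two environments. A rough sketch in Python, not the actual tooling described above: `config_drift`, the `MISSING` marker, and the example keys are all hypothetical.

```python
MISSING = "<missing>"  # marker for a key that one environment lacks


def config_drift(staging, production, ignore=()):
    """Return {key: (staging_value, production_value)} for every mismatch.

    Keys listed in `ignore` are expected to differ (hostnames, for example)
    and are skipped.
    """
    drift = {}
    for key in sorted(set(staging) | set(production)):
        if key in ignore:
            continue
        left = staging.get(key, MISSING)
        right = production.get(key, MISSING)
        if left != right:
            drift[key] = (left, right)
    return drift
```

Run against the two environments' settings as part of the release checklist, an empty result means the only differences are the ones you explicitly chose to ignore.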

Step 6 sucks, downtime is uncool nowadays and most of the issues come from me working tired and forgetting to turn a server back on or something dumb, but we catch an issue before the server even goes back into the pool. As we expand I hope to get this tightened up so all I have to do is press a button, sit back, and relax. If I just dumped the code on the servers and went home I would be out on the street.
posted by Ad hominem at 9:33 PM on March 21, 2011 [1 favorite]


It's also been said that vi/vim is an excellent tool for generating random alphanumeric strings. All you have to do is put a novice user in front of it and use their keystrokes as they try and quit as output.

And that vi/vim has two modes: one that beeps at you, and one that corrupts your text.

(I still use it over emacs. I already have one OS on my machine, thanks.)
posted by asterix at 9:50 PM on March 21, 2011 [2 favorites]


For a guy who has never really used source code management and uses Windows 7 and would like it to be as pain-free as possible, what should I use?
posted by maxwelton at 10:33 PM on March 21, 2011


For a guy who has never really used source code management and uses Windows 7 and would like it to be as pain-free as possible, what should I use?

Give GitHub and TortoiseGit (or Git Extensions, if you need VS integration) a shot. You won't have to maintain a server and you should be able to do almost everything from Windows Explorer.
posted by Ad hominem at 10:51 PM on March 21, 2011


Civil_Disobedient: Arrow keys are ubiquitous. 99.9% of the keyboarding population can be presented with a keyboard and understand automatically what they do and what their expected behavior is. This is the reason Vi also supports arrow keys. So while I understand the reason why you don't use nano, I don't understand the reason the person who fucked this up didn't use it. From my own experience as a software architect, I would venture to guess the reason they didn't use nano is because they were brow-beaten by their peers to use a "Real" editor (that they didn't understand) instead of something specifically designed to cater to people that are used to modern editors. I see this all the time when a new hire expresses a preference for Windows over Linux.

In fact, Nano, as a clone of Pico, supports a limited but useful variation on the basic Emacs movement keys. Ctrl-F moves the cursor forward, etc. It's certainly not the worst editor in the world. I continued to write e-mail in Pico (the default editor for Pine) for a good 5 or 6 years after I had moved to Vim for most everything else.

The rest of this argument is not particularly moving. Nano's design has little to do with "modern editor", and pretty much everything to do with a lightly tweaked and slightly modernized spin on an effort at a less-intimidating subset of the functionality of a "serious" editor a good 15 or 20 years ago. Working programmers tend to use tools with different interfaces and steeper learning curves because those tools are useful. The apparent lesson here isn't "somebody made a text editor goof"; it's "somebody made a text editor goof where it could matter, and that's something to avoid".

Text editors are by now an ancient holy war, of course, but while I only play the vi-vs.-Emacs game now for laughs, there's lately a thread of "you ridiculous troglodytes! Serious professionals have all abandoned those tools and modalities!" entering the discussion that I think is not doing anyone any favors.

Developers shouldn't be able to get within a hundred miles of a production box. They shouldn't even have logins, never mind root privileges.

I just realized I should have stopped when I hit the phrase "software architect".
posted by brennen at 1:25 PM on March 22, 2011 [1 favorite]


(Not, to perhaps reduce the flaminess of that last, that I'm going to argue there's anything objectively wrong with the general mindset that wholeheartedly embraces terms like "architect" and espouses an absolute separation of domains between people who hack and people who run servers, but thee and me might as well be living on different planets.)
posted by brennen at 1:36 PM on March 22, 2011




This thread has been archived and is closed to new comments