"...truly general principles are hard to come by, and ultimately every case is a different case."
November 30, 2008 9:19 PM

Tom ("Duff's Device") Duff examines improving computer source code to make it more concisely communicate its author's intent, in Reading from Top to Bottom.

Of particular interest, Duff suggests historical reasons many coders don't write code that can be read from top to bottom, taking a first cut at an intellectual history of influences on the styles and conventions of coding practice.

While aimed at an audience that codes, this essay can be read and appreciated by a motivated non-coder.
posted by orthogonality (52 comments total) 9 users marked this as a favorite
 
Hmm, personally I feel like putting in a superfluous 'else' in the code actually makes it easier to understand. It makes it obvious that the second block only runs if the first does not.
posted by delmoi at 9:32 PM on November 30, 2008 [1 favorite]


return i?a(i-1, j?a(i, j-1):1):j+1;

Maybe he could take a paragraph or two to discuss the importance of ever using whitespace.
posted by 0xFCAF at 9:41 PM on November 30, 2008 [1 favorite]


Didn't he lay out an early compositing arithmetic for using the alpha channel of a CG image to composite it over a background?
posted by jfrancis at 9:43 PM on November 30, 2008


delmoi writes "It makes it obvious that the second block only runs if the first does not."

In the first case, yes, I agree. In the next couple of examples, you start to see where the superfluous "else" would get in the way of readability.
posted by orthogonality at 9:49 PM on November 30, 2008


I love the combination of admiration of honed code & utter disgust its creator shows when writing about Duff's Device: looking for excuses to ignore it, but the compilers just refuse to reject it.
posted by Pronoiac at 10:14 PM on November 30, 2008


Didn't they have switch in 1999?
posted by twoleftfeet at 10:16 PM on November 30, 2008


Maybe he could take a paragraph or two to discuss the importance of ever using whitespace.

I think the ternary conditional is much easier to read with no spaces around the operators, especially when nested.


It makes it obvious that the second block only runs if the first does not.

Multiple return statements don't do that for you? Maybe if he indented the final return to line up with the first it would drive it home for you? The else clause isn't free either: barring compiler optimizations it costs a local variable, an assignment, and a jump.
posted by blasdelf at 10:22 PM on November 30, 2008 [2 favorites]
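
To make the two shapes in blasdelf's comment concrete, here is a minimal sketch in C. The function names and the absolute-value body are hypothetical, chosen only to keep the example small; they are not taken from Duff's essay. Whether the else version really costs an extra variable, assignment, and jump depends on the compiler, and an optimizing compiler will likely emit much the same code for both.

```c
/* Single-exit style: a result variable, an if/else, one return. */
static int abs_single_exit(int x) {
    int result;
    if (x < 0) {
        result = -x;
    } else {
        result = x;
    }
    return result;
}

/* Early-return style: no result variable and no else. */
static int abs_early_return(int x) {
    if (x < 0)
        return -x;
    return x;
}
```

Both functions return the same value for any input other than INT_MIN (where negation overflows in either version); the difference is only in how the reader's eye travels down the page.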


For me you can omit the else (and the extra variable declaration) in the first example, no problem.
I used to have a boss who was mad-keen on the compose(d) method, which looks like it has a similar aim. Here's a bit from Martin Fowler. Anyone who writes code while stopping to think about the people who (inevitably) will have to read it is doing something right, imho
posted by mjg123 at 10:24 PM on November 30, 2008


mjg123 writes "I used to have a boss who was mad-keen on the compose(d) method, which looks like it has a similar aim."

That looks more like functional decomposition, composing non-primitive operations out of primitives.

On another note, yeah, Duff could definitely use more whitespace. Especially for things like "i<n->a", which my eye wants to read as if the less than is an angle bracket paired with the pointer-to-member's ">" symbol. I always surround binary operators other than pointer-to-member with whitespace. I even prefer to put whitespace after an open parenthesis, "( ", and before a close parenthesis, " )".
posted by orthogonality at 10:39 PM on November 30, 2008


Really, a lot of C code is written quite terribly these days, or rather, it was terribly unreadable even back then, but the compiler was too stupid to not write that way. I'm reminded of suggestions by Carmack (or someone else at iD) that main() should be at the bottom of a file. I always did this to avoid prototyping functions, but it sure does make understanding things hard. C had several constraints attached to it when designed, such as single pass parsing, that totally suck now.

Another artifact of history is pointer arithmetic; it dramatically reduces the optimizations that can be made. I've read benchmarks where such arithmetic helps in unoptimized compiler passes but not optimizing compilers. I should probably try to duplicate that sometime.

A few curmudgeons will point out that commercial compilers don't include a lot of optimization techniques and the full employment of compiler programmers theorem means we'll never be perfect. Still, register allocation is almost never touched on in hand-optimization techniques and yet is a vital part of good code.
posted by pwnguin at 12:03 AM on December 1, 2008 [1 favorite]
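
For readers unsure what pwnguin means by pointer arithmetic versus the alternative, here is a small sketch of the same loop written both ways. The function names are made up for illustration, and no claim is made here about which one a given compiler optimizes better; that is exactly the benchmark the comment says it has not yet reproduced.

```c
#include <stddef.h>

/* Array-index form: the loop variable is a plain counter. */
static long sum_indexed(const int *v, size_t n) {
    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += v[i];
    return total;
}

/* Pointer-arithmetic form: the loop variable is the pointer itself. */
static long sum_pointer(const int *v, size_t n) {
    long total = 0;
    for (const int *p = v, *end = v + n; p != end; p++)
        total += *p;
    return total;
}
```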


> more like functional decomposition,

Yes, definitely, but it immediately sprang to mind. The topic of making code easily readable (ie concisely communicate its author's intent) and maintainable is one that I like, and that's what I thought Duff was getting at...
posted by mjg123 at 12:10 AM on December 1, 2008


Also, I've gotta wonder why an article from ten years ago is now on MeFi and reddit. Seems his personal server can't handle the traffic.
posted by pwnguin at 12:20 AM on December 1, 2008


I've always felt that for any computing problem there is always at least one best (fastest) solution, one best (human-parsable) solution that's approximately 5-10% slower, and 10,000 other solutions that are neither.
posted by Civil_Disobedient at 1:21 AM on December 1, 2008 [4 favorites]


Civil_Disobedient writes "I've always felt that for any computing problem there is always at least one best (fastest) solution"

"Best" depends on your hardware (and microcode).

"one best (human-parsable) solution that's approximately 5-10% slower"

Depends on your human.

Seriously: I find the ternary operator very easy to read, and prefer
  int n = q < 10 ? 0 : 1 ;
to
  int n ;
  if( q < 10 ) {
   n = 0 ;
  } else {
   n = 1 ;
  }

But I've worked with folks who find the ternary "hard to read".


Similarly, how many times have you seen those who don't understand booleans well do this:
if( b == false ) {
  return true ;
} else {
  return false ;
}

Clearly, to these people, this is "easier to parse" than what to me is the far more clear:
return !b;

And those without a strong C background will probably argue that something like this is clearer:
return b == false ;

There's no one best solution: what any one coder thinks is best depends on the languages, methodologies, and idioms he's been exposed to in his scholarship and career.

As that experience can't (and shouldn't) be standardi(s/z)ed, different people will work best generally with the idioms they know; that's why I'm wary of too rigid coding standards, as I think they tend to work best for those, like managers and coding standard authors, who end up working the least with the code.
posted by orthogonality at 2:01 AM on December 1, 2008


I think the ternary conditional is much easier to read

If you don't fucking use it.

Seriously, the ternary operator is a great example of creating denser code that is less easy to read. Yes, it's only slightly less easy to read when you're used to it, but it's completely unnecessary, and the deployment of it is ironic here.
posted by rodgerd at 2:20 AM on December 1, 2008 [1 favorite]


Feelings run pretty hot when it comes to the ternary conditional. But in the end, we all love each other. Right?

If so, we agree, else go fuck yourself.
posted by twoleftfeet at 3:16 AM on December 1, 2008 [7 favorites]


Why do people call it the ternary operator? Do you call *, /, +, or - the binary operator? Or * the unary operator? I realize it is the only ternary operator in most languages, but it still seems odd to identify it that way.

Another vote for the ternary operator being far more readable for simple conditionals than a written-out if-then-else block. At a certain point programmers should be expected to actually know the language they are working with instead of expecting others to dumb down their code by avoiding useful idioms that maybe not everyone knows about or is comfortable with.
posted by zixyer at 3:23 AM on December 1, 2008 [1 favorite]


I stopped reading when he rewrote to "avoid deep nesting" but then added 4 spaces to his tabbing.
posted by DU at 4:40 AM on December 1, 2008


If you ever want to get out of the maintenance of code, write it so others can maintain it. Ideally, others who require less training than you did so that you can hire from a larger pool.

If you want to maintain your own code forever, write it exactly the way you want to read it with maximum efficiency.

Consider that you(now) and you(future) may as well be different people, and if you do the latter you may have some disaffection for you(now) unless all you've done between now and future is maintain the same code, forgetting nothing ever.

Me, I can't think of a greater hell so I write code for other peoples' readability - always - even when the other people are me(future).
posted by abulafa at 4:58 AM on December 1, 2008 [3 favorites]


The best reason for using a= q? b:c; rather than the 'if' form is, when reading the code, you immediately know that a and only a is getting a new value.

The 'if' form has to be parsed more carefully--it may say if(q)a= b; else b= c; which is somewhat different.
posted by hexatron at 5:07 AM on December 1, 2008 [2 favorites]
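
A sketch of hexatron's point, with hypothetical function names. One side benefit of the conditional form in C is that the target can be declared const, which the if form does not allow because the variable has to be assigned after its declaration.

```c
/* Conditional form: a, and only a, receives a value, and it can be const. */
static int pick_ternary(int q, int b, int c) {
    const int a = q ? b : c;
    return a;
}

/* If form: each branch names its own target, so each must be checked. */
static int pick_if(int q, int b, int c) {
    int a;
    if (q)
        a = b;
    else
        a = c;   /* a typo such as "b = c" here would still compile */
    return a;
}
```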


> While aimed at an audience that codes, this essay can be read and appreciated by a motivated non-coder.

In a somewhat labored sense of 'appreciated', since that implies interest in the significance of differing styles of expression of a language one doesn't understand.

I'm working my way through it, but since I don't know C there are bits I have to give up on. Lines like return i?a(i-1, j?a(i, j-1):1):j+1; are indistinguishable from pseudorandom strings to me, but since I'm told they're optimal I want to have a sense as to why or lose the point of something I want to learn from. When coding goes beyond clear notation and into shorthand (for (a=0;a<c;a++) b[a]=foo;) or involves common functions with cryptic names (strchr(" \t\n", c);), what's obvious even to newbies is opaque to non-coders.
posted by ardgedee at 5:15 AM on December 1, 2008


And now, this thread's equivalent of "Hey, any of you guys seen the mouse to my Amiga?"

What are these Earth customs you call "Whitespace" and "Top to bottom"?


Signed,
a Perl hacker
posted by Spatch at 5:37 AM on December 1, 2008


ardgedee writes "Lines like return i?a(i-1, j?a(i, j-1):1):j+1; are indistinguishable from pseudorandom strings to me, "

Ok, the ternary operator is a simplified if, in the form test ? true : false

If the test is true, the whole expression is the true sub-expression, otherwise it's the false sub-expression.

So the expression q == 0 ? 1 : 2 means, "yield 1 if q equals 0, otherwise yield 2".

We can assign that to a variable, as in "n = q == 0 ? 1 : 2", or use that as the argument to a function, as in "f( q == 0 ? 1 : 2 )"

In C, any non-zero value is true, and any zero value is false, so a common idiom is to omit the explicit test, thus we can write either "q != 0 ? 2 : 1" or "q ? 2 : 1"; both forms are equivalent. In C, the symbol "!" means "not", and the symbol "!=" means "not equal".

Sometimes it's easier to see what the ternary is doing by putting the test, the true result, and the false result on separate lines, like this:

q
  ? 2
  : 1


So let's break out the code you quoted, "return i?a(i-1, j?a(i, j-1):1):j+1; "; for clarity I'll break it into several lines, and replace tests in the form "variable_name" with the equivalent form "variable_name != 0":

return i != 0
  ? a(i-1, j?a(i, j-1):1)
  : j + 1;

We see that if i is not zero, we return the second line, otherwise the third line. The third line is just "j + 1", so we're done with that line. The second line is a function call, which itself contains a ternary. Let's break out the second line only:

a(
  i - 1,
  j != 0
    ? a( i, j - 1)
    : 1
  )

Ok, function a takes two arguments. The first is always "i - 1", so we're done with that. The second argument depends on the value of "j". If "j" is true (that is, non-zero), the second argument to function a is the value resulting from calling function a with the arguments "(i, j - 1)". But if j is false, the second argument to a is 1.

This complexity, by the way, is why Duff wants to rewrite the function to be more "top to bottom".

So "return i ? a( i - 1 , j ? a( i, j - 1) : 1 ) : j + 1; " means "return (as the value of calling this function), j plus 1 if i is zero, otherwise, if i is not zero, return the result of calling function a, with the arguments i minus one and ( if j is zero, one, otherwise, if j is non-zero, the result of calling function a with the arguments i and j minus one )".
posted by orthogonality at 6:17 AM on December 1, 2008
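
The breakdown above can be checked by writing the function both ways. This is a sketch: a_dense is the one-liner quoted in the thread wrapped in a function, and a_flat is one possible "top to bottom" rewrite with one case per line. The names a_dense and a_flat are made up here, and a_flat is not claimed to be Duff's exact text.

```c
/* Dense form, as quoted in the thread. */
static int a_dense(int i, int j) {
    return i ? a_dense(i - 1, j ? a_dense(i, j - 1) : 1) : j + 1;
}

/* Same function, one case per line, readable from top to bottom. */
static int a_flat(int i, int j) {
    if (i == 0)
        return j + 1;                        /* i is zero */
    if (j == 0)
        return a_flat(i - 1, 1);             /* i non-zero, j zero */
    return a_flat(i - 1, a_flat(i, j - 1));  /* both non-zero */
}
```

Keep the arguments small when trying this out: the function grows so quickly that anything much beyond a first argument of 3 will exhaust the stack or overflow int.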


Ah, C programmers. They can write essays on how using their personal formatting preferences dramatically improves code clarity, and yet they still end up writing uncommented code filled with single-letter variables and ambiguously named functions.
posted by burnmp3s at 7:05 AM on December 1, 2008 [11 favorites]


He doesn't say much about the "single return" faction, but my understanding is that rule was frequently included in C style guides. I thought it is considered especially important in a language like C that doesn't include any automatic cleanup mechanisms (cf C++'s destructors). Otherwise each return statement has to be preceded by a series of calls to free (or whatever) and it's that much easier to forget one.
posted by Horselover Fat at 7:26 AM on December 1, 2008
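
One common way C code gets a single exit without deep nesting is the "goto cleanup" idiom, sketched below. The function copy_twice and its behavior are hypothetical, invented only to have two allocations to release; the point is that every failure path funnels through one block of free calls, so none can be forgotten.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: copies src into dst via two scratch buffers.
   Returns 0 on success, -1 on failure. */
static int copy_twice(const char *src, char *dst, size_t dst_size) {
    int status = -1;
    char *first = NULL;
    char *second = NULL;
    size_t len = strlen(src) + 1;

    if (len > dst_size) goto cleanup;   /* destination too small */

    first = malloc(len);
    if (first == NULL) goto cleanup;
    second = malloc(len);
    if (second == NULL) goto cleanup;

    memcpy(first, src, len);
    memcpy(second, first, len);
    memcpy(dst, second, len);
    status = 0;

cleanup:
    free(second);   /* free(NULL) is a no-op, so no extra checks needed */
    free(first);
    return status;
}
```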


I stopped reading when he rewrote to "avoid deep nesting" but then added 4 spaces to his tabbing.

If you have 8 space tabs, then deep nesting becomes so painful that you have to find ways to avoid it. From Torvalds's Linux kernel coding style document:

Now, some people will claim that having 8-character indentations makes the code move
too far to the right, and makes it hard to read on a 80-character terminal screen. The
answer to that is that if you need more than 3 levels of indentation, you're screwed
anyway, and should fix your program.


I have high hopes for this thread. I personally prefer Python's "indentation indicates control flow", prefer Vim to Emacs, think people should be allowed to have a sweater with blinking lights at the airport and think Sarah Palin is an idiot.
posted by bonecrusher at 7:29 AM on December 1, 2008 [5 favorites]


I very much want to be a person who doesn't care what this essay has to say. I want to view it as the amusing anachronism that it is. Good god I hope I don't program in C for the rest of my life.
posted by rlk at 7:36 AM on December 1, 2008 [1 favorite]


Do you call *, /, +, or - the binary operator? Or * the unary operator?

No, because there's more than one binary and unary operator. I would refer to the relevant operators as binary *, binary -, unary *, and unary -.
posted by grouse at 8:16 AM on December 1, 2008


My answers to the points in these comments:

yes, yes, yes, no, no, yes, no, yes, O DEAR GOD BURN IN HELL YOU TERTIARY-FUCKING DUMB-BUCKET, no, yes, sometimes, a strong no.
posted by sleslie at 8:50 AM on December 1, 2008


He mentioned its name but not its background, & he wrote this before Wikipedia. If you're wondering what that i & j function is, it's the Ackermann function, which is a standard non-trivial function to demonstrate recursivity. It's easy to define but it works out compilers, languages, & other algorithms (such as memoization), so it's somewhat usable as a quick benchmark.
posted by Pronoiac at 9:08 AM on December 1, 2008


You can pry the conditional operator from my cold, dead hands.

Did you know that in PHP, the ternary operator is left-associative? I'm not kidding.
posted by you at 9:27 AM on December 1, 2008
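
To see why associativity matters, here is a sketch in C, where the conditional operator groups to the right. The second function adds parentheses to force the left grouping by hand; it is only an illustration of what a left-associative reading of the same tokens would compute, not PHP code. Both function names are made up.

```c
/* C groups nested conditionals to the right:
   n < 0 ? -1 : (n == 0 ? 0 : 1) */
static int sign_c(int n) {
    return n < 0 ? -1 : n == 0 ? 0 : 1;
}

/* The same tokens with the left grouping spelled out:
   the first conditional's result becomes the second one's test. */
static int sign_left_grouped(int n) {
    return (n < 0 ? -1 : n == 0) ? 0 : 1;
}
```

With the left grouping, a negative n yields -1 from the inner expression, which is non-zero and therefore true, so the whole thing returns 0 instead of -1.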


> So "return i ? a( i - 1 , j ? a( i, j - 1) : 1 ) : j + 1; " means "return (as the value of calling this function)...".

Thanks for that explanation.

By working at the essay, I think I learned useful things from reading it, but I also had sufficient experience in C-derived languages that I could muddle through. I doubt anybody without that much would find it comprehendible at all. It's clearly worth learning from. Duff's obviously written it as a sketch or memo about a particular phenomenon rather than as a prepared tutorial, so I don't begrudge the essay for being what it is.

At the same time, your response emphasizes the point that if Duff's essay is to be (as advertised) interesting to noncoders it will have to either require more footnotes / sidebars / links to make clear what's going on, or be rewritten so that the examples use a pseudocode that can be followed by anybody with sufficient skill in math and logic, and not sufficient skill in C specifically.
posted by ardgedee at 9:33 AM on December 1, 2008


blasdelf: Multiple return statements don't do that for you? Maybe if he indented the final return to line up with the first it would drive it home for you? The else clause isn't free either: barring compiler optimizations it costs a local variable, an assignment, and a jump.

Firstly, no they don't. Return statements are not meant for conditional flow control. I understand this to be a controversial point, however I feel their use in this manner to be abuse.

Secondly, I would spare any mention of the "o" word in a discussion of readability and code maintenance.
posted by butterstick at 10:16 AM on December 1, 2008


Return statements are not meant for conditional flow control.[citation needed]
posted by you at 10:40 AM on December 1, 2008 [1 favorite]


In my experience, having worked together with both Tom Duff and hexatron, it's a tossup who's the most agile, ambitious, and arcane programmer. But since hexatron is a MeFi member, I vote for and heartily endorse him as the winner.
posted by StickyCarpet at 10:49 AM on December 1, 2008


DU: I stopped reading when he rewrote to "avoid deep nesting" but then added 4 spaces to his tabbing.

I don't think textual nesting is a very big deal, the problem is logical nesting.
posted by hupp at 10:56 AM on December 1, 2008


But deep textual nesting generally means that you're doing too much within a single function. In other words deep textual nesting points to deep logical nesting.

Multiple returns are just not a good idea. The logical flow of code should be as obvious as possible to the reader. One of the major cues a reader has is indentation and embedding a return throws off that cue. More generally multiple returns point to the same thing that deep textual nesting points to. You probably haven't broken up your code into small functional pieces which means that you don't really understand what your code is doing. Obviously there are situations where multiple returns and deep nesting are in fact the best way to do things but these situations are not very common.
posted by rdr at 1:17 PM on December 1, 2008


It makes the lifetime of the loop index be just the loop.
Modern C lets you just do for (int i=0; i<k; i++) {...}.
(Maybe I'm an old fart for this habit, which I cling to from my Fortran days [early 1970's], but limiting the lifetimes of variables makes things easier for register allocators and for people reading your code.)
Which is why I like Lisp, with explicit let blocks.

(defun example ()
  (let ((a 42))
    ; a is defined here
  )
  ; a is not defined anymore
)
Of course most of the control-flow stuff has implicit let built in one way or another.

I've done this in C-like languages but I don't see it a lot and the semantics aren't as obvious:

int example()
{
    /* ... */

    {
        int a = 42;

        /* a is defined here */
    }

    /* a is not defined anymore */
}
posted by vsync at 1:24 PM on December 1, 2008


Please ignore what MetaFilter did to my perfectly good HTML.
posted by vsync at 1:25 PM on December 1, 2008


I don't like side effects in conditionals (or embedded in most expressions, but let's just think about conditionals for now.) Code almost always reads better if you separate the side effects from the tests. For example, I write

do
    c=getchar();
while(strchr(" \t\n", c));

in preference to

while(strchr(" \t\n", c=getchar()));
Does anyone want to weigh in on this? Personally, I find the first version more obvious because it's such a common idiom. Is avoiding side effects in conditionals something I should be thinking about?
posted by fingo at 3:26 PM on December 1, 2008
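
For anyone who wants to compare the two styles fingo quotes side by side, here is a sketch that applies them to a string instead of standard input, so it can be run without typing anything. The function names are hypothetical. The extra test for '\0' is an addition needed for strings: strchr treats the terminating null as part of the string it searches, so asking it for '\0' would report a match.

```c
#include <string.h>

/* Side effect kept out of the test: read first, then decide. */
static const char *skip_ws_separate(const char *s) {
    int c;
    do
        c = *s++;
    while (c != '\0' && strchr(" \t\n", c));
    return s - 1;   /* step back to the first non-blank character */
}

/* Side effect folded into the test. */
static const char *skip_ws_combined(const char *s) {
    int c;
    while ((c = *s++) != '\0' && strchr(" \t\n", c))
        ;
    return s - 1;
}
```

Both return a pointer to the first character that is not a space, tab, or newline, or to the terminator if there is none.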


Avoiding side effects in conditionals something you should be thinking about.
posted by aspo at 3:43 PM on December 1, 2008 [1 favorite]


In fact, 99% of the time any side effect at all is a problem.
posted by aspo at 3:43 PM on December 1, 2008


but then we're talking about C anymore...
posted by rdr at 4:43 PM on December 1, 2008


s/talking/not talking/
posted by rdr at 5:17 PM on December 1, 2008


Avoiding side effects in conditionals something you should be thinking about.

Eliza has now weighed in.
posted by Crabby Appleton at 10:36 AM on December 2, 2008


How do you feel about Eliza has now weighed in?
posted by Pronoiac at 12:13 PM on December 2, 2008


Is it because do I feel about eliza has now weighed in that you came to me?
posted by Crabby Appleton at 6:31 PM on December 2, 2008 [1 favorite]


I'm kind of amazed at how low-level this is. I expected a bunch of C best practices about organizing functions and data structures in a non-OO language, and just got a few variations on "The maxim that functions should have a single return point is invalid" without the important "unless the function deals with memory allocation" caveat.
posted by Harkins_ at 10:41 AM on December 3, 2008


Why do you say that because do I feel about eliza has now weighed in that you came to me?
posted by Pronoiac at 11:03 AM on December 3, 2008


Would you like to play a game?
posted by Crabby Appleton at 12:21 PM on December 3, 2008


Oh, i like to play a game.
posted by Pronoiac at 6:01 PM on December 3, 2008


How does it feel to want?

(I swear that's what the GNU Emacs implementation of "doctor" said when I typed in your comment. Kind of snarky for a "psychotherapist", no?)

And I sure ain't gonna play Global Thermonuclear War with you, no way, no how!
posted by Crabby Appleton at 6:45 PM on December 3, 2008




This thread has been archived and is closed to new comments