The infernal semicolon
April 16, 2012 8:59 AM

This February, Twitter released Bootstrap 2, a rewrite of their earlier Bootstrap code. It's basically a framework that offers barebones styles and functionality. What's of interest, though, is that it uses almost no semicolons (just 15 in over 1k lines of code), which are normally used to terminate statements. Instead, the code relies on automatic semicolon insertion (ASI). Unfortunately, this code breaks when minified using JSMin. This was reported as an issue on Bootstrap's github page, which led to a heated discussion on the topic of ASI.

More analysis on this blog suggests this might be an aesthetic choice to make the code more familiar to Ruby developers. That preference is also what drove the creation of CoffeeScript, a language that compiles to JavaScript but whose syntax more closely matches Ruby's than JavaScript's (which more closely resembles C).

More discussion on ycombinator.

It's important to note that Douglas Crockford (previously, previously) is considered something of a JavaScript guru/curmudgeon. He's pretty anti-ASI:
One of the mistakes in the language is semicolon insertion. This is a technique for making semicolons optional as statement terminators. It is reasonable for IDEs and shell programs to do semicolon insertion. It is not reasonable for the language definition to require compilers to do it. Use semicolons.
Others are more ambivalent about ASI, and some embrace it.

Others have pointed out that the real problem is messed up coding (NSFW language), although some of that bad coding might actually come about from trying to avoid semicolons altogether.
posted by Deathalicious (128 comments total) 15 users marked this as a favorite
 
I saw this the other day; I decided to use more semicolons in my day to day life; I'm not sure how that's working out so far;
posted by Catblack at 9:04 AM on April 16, 2012 [15 favorites]


I thought the real problem was a couple of emotionally stunted programmers having a go at each other.

Even Microsoft will adapt to bad program design if the potential for fallout is large enough:
I first heard about this from one of the developers of the hit game SimCity, who told me that there was a critical bug in his application: it used memory right after freeing it, a major no-no that happened to work OK on DOS but would not work under Windows where memory that is freed is likely to be snatched up by another running application right away. The testers on the Windows team were going through various popular applications, testing them to make sure they worked OK, but SimCity kept crashing. They reported this to the Windows developers, who disassembled SimCity, stepped through it in a debugger, found the bug, and added special code that checked if SimCity was running, and if it did, ran the memory allocator in a special mode in which you could still use memory after freeing it.
posted by klanawa at 9:06 AM on April 16, 2012 [18 favorites]


Half my twitter stream has been having semicolon fights over the weekend and I was wondering what the heck was going on. This is the best simple summary of what transpired that I've seen, so thanks for the post.
posted by mathowie at 9:06 AM on April 16, 2012 [2 favorites]


Catblack: "I saw this the other day; I decided to use more semicolons in my day to day life; I'm not sure how that's working out so far;"

You don't need the last semicolon. Online snarking always includes automatic punctuation insertion at the end of the last statement
posted by Deathalicious at 9:07 AM on April 16, 2012 [9 favorites]


I'm a curmudgeonly programmer who's been through so many languages that I can't even perceive syntax changes most of the time. (OK, not true... it's annoying whenever things are C-like... really, Ruby, elsif? What, the standard else if with two extra characters annoyed you, or you just didn't want to interpret tokens separated by whitespace?)

Until python, that is. I work with a group who all love python for slicing and dicing text. So, being the helpful type, I sat down and taught myself the basics of python and then got mad and annoyed that whitespace was suddenly significant again. Poor co-workers had to endure a few rants on the fact that we left whitespace significance behind years ago when people realized it made Make a source of endless frustration. *grumble*

I think that pretty firmly describes what I feel about ASI.
posted by drewbage1847 at 9:12 AM on April 16, 2012 [3 favorites]


Even Microsoft will adapt to bad program design if the potential for fallout is large enough

"Even" Microsoft? Somebody's never seen Raymond Chen's blog. Keeping old, broken programs that blatantly ignored official specifications and good coding practices working is an entire infrastructure in Windows.
posted by kjh at 9:13 AM on April 16, 2012 [10 favorites]


W; T; F?

Who thought ASI was a good idea? Let Ruby be Ruby, let Python be Python, let C-style syntax languages be fucking C-style syntax languages.
posted by kmz at 9:14 AM on April 16, 2012 [4 favorites]


Here's the only proof you need that JavaScript is a horrible collection of mixed metaphors:

});
posted by modernserf at 9:15 AM on April 16, 2012 [15 favorites]


I adore Bootstrap, agree that minifiers should insert semicolons if they aren't there, but ASI is a steaming pile. Use semicolons already.

And Crockford is a God.
posted by zoo at 9:22 AM on April 16, 2012 [10 favorites]


Huh, no mention of Go.
posted by mkb at 9:23 AM on April 16, 2012


To be clear, semicolons aren't "optional". They are, in fact, required in many places in JavaScript. There are some cases where they are not required to terminate a statement, and the rules surrounding when and where the semicolon is optional are complex and require a deep understanding of the language specification.

I think it's fair to say that most people working with JS don't know or care about ASI. The danger, then, is that newcomers to the language will see some code written somewhere without a semicolon and think that they're not required in situations where they are.
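For instance, here's a minimal sketch (made-up code, not from Bootstrap) of the classic trap, where ASI terminates a bare return for you:

function getConfig() {
  return        // ASI inserts a semicolon right here
  {
    debug: true
  };
}
getConfig();    // undefined, not the object you meant to return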

JavaScript is perhaps more flexible than it should be. Depending on how you look at it, ASI is either a huge error in the language design, or a deliberate choice to allow for flexible syntax that saves one from useless tokens. This is true of a lot of aspects of the language, and it is Crockford's perspective that programming languages shouldn't be riddled with exceptions and edge cases, as they make code brittle and prone to bugs.

But it also can't be said that ASI is "wrong". Like it or not, it's part of the language, and programmers who understand how to avoid semicolons aren't breaking any rules by doing so. You might also say that, by pretending that C-like semicolon syntax is part of JavaScript, we're doing a disservice to newcomers to the language by not disclosing all of its idiosyncrasies. Isaac Schlueter, who currently maintains Node.js, sums up this perspective quite well.

There's a long history of developers trying to make JavaScript look like or behave like another programming language. A popular case is trying to cover up JavaScript's prototypal inheritance model with a classical inheritance model. CoffeeScript is the most recent popular example of this. I think this happens because JavaScript is a legitimately strange language, which is part of why I like it. So syntax is a bit of a sore point for the JS community. Do we hide JavaScript's weirder aspects to make it look more like Ruby or Java? Do we pretend it has a classical inheritance model when it doesn't?
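(A minimal sketch of the prototypal model, with purely illustrative names:

var animal = {
  speak: function () { return this.sound; }
};
var dog = Object.create(animal);  // dog delegates to animal; no class anywhere
dog.sound = 'woof';
dog.speak();                      // "woof"

No constructors, no classes -- just objects delegating to other objects.)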
posted by deathpanels at 9:29 AM on April 16, 2012 [7 favorites]


we left whitespace significance behind years ago when people realized it made Make a source of endless frustration. *grumble*

Make has served long and well, but at some point we shouldn't let the old tool dictate how new languages work - that's putting the compiler before the horse, or something.

I think that pretty firmly describes what I feel about ASI.

Here I really disagree - they are very different situations. Meaningful whitespace in python adds a unique and to my eye beautiful relationship between What Something Looks Like and What It Does. ASI doesn't really add anything, other than a debatable simplification and decluttering by removing a character from most lines.
posted by freebird at 9:32 AM on April 16, 2012 [7 favorites]


But it also can't be said that ASI is "wrong". Like it or not, it's part of the language, and programmers who understand how to avoid semicolons aren't breaking any rules by doing so.

Agreed -- though I do think it is a design mistake, it is part of the spec, and if you use it properly, then it's certainly not incorrect. It is a fairly serious syntax trap, though, and it means you have to spend time to understand where semicolons are actually explicitly required and where they'll be implicitly added.
posted by eriko at 9:36 AM on April 16, 2012


Opinions aside, you do really have to admire someone who figured out how to write 1k lines of working JavaScript code while using only 15 semicolons.

I'm so, so tempted to report this line as a bug. I'm almost certain that's an unnecessary semicolon.

Also, while people often can choose to use or not use a given compressor, sometimes a compressor is chosen for them.
posted by Deathalicious at 9:45 AM on April 16, 2012 [1 favorite]


Punctuation has no purpose
!LOL
posted by lubujackson at 9:46 AM on April 16, 2012 [2 favorites]


Agreed -- though I do think it is a design mistake, it is part of the spec, and if you use it properly, then it's certainly not incorrect. It is a fairly serious syntax trap, though, and it means you have to spend time to understand where semicolons are actually explicitly required and where they'll be implicitly added.
Agreed. And to isaac schlueter's point, experienced JS programmers in the spotlight with influence in the community should really represent what the language actually is, not a filtered syntax that picks and chooses language features.
posted by deathpanels at 9:47 AM on April 16, 2012


There's plenty of language features that are parts of specs which are rather explicitly shunned though. Has GOTO been officially deprecated by the latest C/C++ standards yet? Register_globals was only deprecated in PHP 5.3, even though everybody knew it was a disaster for ages before that.
posted by kmz at 9:53 AM on April 16, 2012 [1 favorite]


And to isaac schlueter's point, experienced JS programmers in the spotlight with influence in the community should really represent what the language actually is, not a filtered syntax that picks and chooses language features.
posted by deathpanels at 11:47 on April 16


Crockford is all about a filtered syntax that picks and chooses language features; he even wrote a book called "Javascript: The Good Parts".
posted by a snickering nuthatch at 9:59 AM on April 16, 2012 [5 favorites]


I liked raganwald's take. It's strange to me to come to grips with the fact I'm an old man now and not a Young Turk, but this seems like one of those things where a bunch of old guys who have made the same type of mistake in the past are trying to tell younger guys to avoid a mistake. Instead the Turks take it as a sign of old, stale thinking that they read so much about in the Echo Chamber they've created from a set of like-minded developer blogs.

And the whole thing is made worse by Crockford's insistence on being an ass and the wonderful spirit of "If you're right, make sure to yell and be dismissive, preferably while swearing" that DHH made so incredibly cool.
posted by yerfatma at 10:09 AM on April 16, 2012 [1 favorite]


Syntax will continue to suck as long as we're stuck with various forms of line noise and whitespace as critical components.

Modern IDEs are shockingly old-fashioned in terms of interface and usability compared to what's out there now for other realms of software. It's like opening the hood of an Aston Martin to find a coal-burning steam engine. A language with syntax and idiom designed expressly for an interactive, graphical IDE is the next logical step. Using parentheses and tabs and semicolons is utterly unnecessary with modern UX design.
posted by Slap*Happy at 10:09 AM on April 16, 2012 [2 favorites]


A language with syntax and idiom designed expressly for an interactive, graphical IDE is the next logical step.

Have we already had a post about Light Table?

Using parentheses and tabs and semicolons is utterly unnecessary with modern UX design.

I think you are giving programming languages and IDE writers a little too much credit. I'd rather have to insert punctuation than hack around my IDE's insistence that I meant to start a new block here. Smells like Clippy 2.0 to me.
posted by yerfatma at 10:12 AM on April 16, 2012 [9 favorites]


Crockford is all about a filtered syntax that picks and chooses language features; he even wrote a book called "Javascript: The Good Parts".
Who says we have to like Crockford's syntax? He's not the single authority on the language or how it should be written.
posted by deathpanels at 10:17 AM on April 16, 2012


Who says we have to like Crockford's syntax? He's not the single authority on the language or how it should be written.

Er, probably Crockford...? I wasn't saying that his particular vision is the right one; I was pointing out that of course Crockford is going to pick and choose Javascript syntax- it's what he's famous for.
posted by a snickering nuthatch at 10:21 AM on April 16, 2012


If anyone wants to make a LightTable post, they should do it quickly, because the author is working on getting a kickstarter up. Once the kickstarter is up, any new posts on it would probably get nixed.
posted by a snickering nuthatch at 10:23 AM on April 16, 2012 [2 favorites]


There are reasons for writing ; when you're trying to compactly express the relation amongst statements by placing them together, with C's for (i=0; i < n; i++) { .. } being the classic example.

I've found that languages that utilize whitespace significance usually offer enough expressivity that you'd never want to pack additional structure into one line; in fact, single expressions usually stretch over multiple lines in Haskell and ML.

In Haskell, you write simply  foldl1' lcm  for the least common multiple of a list. In C, you'd need around five assignments and two flow controls with conditionals.
posted by jeffburdges at 10:26 AM on April 16, 2012


So Lighttable is Eclipse, basically?
posted by deathpanels at 10:28 AM on April 16, 2012


drewbage1847: Until python, that is. I work with a group who all love python for slicing and dicing text. So, being the helpful type, I sat down and taught myself the basics of python and then got mad and annoyed that whitespace was suddenly significant again. Poor co-workers had to endure a few rants on the fact that we left whitespace significance behind years ago when people realized it made Make a source of endless frustration. *grumble*

It's all arbitrary. The semicolon itself is a convention that had to win people over. Back in the days of BASIC, statements were separated by colons. When C became Mr. Big Stuff, its conventions and idiosyncrasies became the default flavor. Well, I hate C syntax; I think it makes things unnecessarily cryptic and is a barrier to new coders, and the thinking that gave it to us found its utmost expression in Perl, which can get really line noisy. It is an artificial wall that tells non-coders "Hey, we're so cool we can understand and work with this, and you can't." I think Python and its required whitespace was the first step towards giving programming languages back to more ordinary human beings.

But still, if your language requires semicolons, there is no reason that I can see to get rid of them. I have no idea why ASI is a thing.

modernserf: Here's the only proof you need that JavaScript is a horrible collection of mixed metaphors:

});


What, it makes a mustache guy sad?

yerfatma: I think you are giving programming languages and IDE writers a little too much credit. I'd rather have to insert punctuation than hack around my IDE's insistence that I meant to start a new block here. Smells like Clippy 2.0 to me.

It doesn't necessarily have to be Clippy. I kind of see what Slap*Happy is saying. Programming has stuck to the old text file paradigm for surprisingly long. I thought by this time we'd be manipulating interactive flowcharts and stuff.
posted by JHarris at 10:28 AM on April 16, 2012 [4 favorites]


Slap*Happy: "A language with syntax and idiom designed expressly for an interactive, graphical IDE is the next logical step."

That's pretty much what Java is - a language almost explicitly designed for code generation and IDEs.

But if you're actually willing to fully embrace graphical programming, there's LabVIEW.
posted by vanar sena at 10:31 AM on April 16, 2012


So Lighttable is Eclipse, basically?

God I hope not. I recently abandoned Eclipse for working on my own project; it takes like a full minute to start up, and about as long to shut down, and it requires Java which is rapidly becoming the security hole of choice for budding young hackers to exploit.
posted by JHarris at 10:31 AM on April 16, 2012


Programming has stuck to the old text file paradigm for surprisingly long. I thought by this time we'd be manipulating interactive flowcharts and stuff.

Totally my bias, but if you're looking for this, just get a copy of Visual Studio and use it in drag & drop mode. The funny thing is you will most likely run into some corner case that forces you to go open up a text file and edit it because the visual IDE won't let you do what you want. And that assumes the item's controlled by a text file and not some compiled stuff hidden in a temporary directory somewhere.

I have nothing against VS and think it's a fantastic IDE.
posted by yerfatma at 10:49 AM on April 16, 2012 [2 favorites]


I never imagined I would participate in a long-running holy war, but I recognize it beginning. Here's what I said the last time graphical programming was mentioned. I'll try not to grind this axe too much.
posted by a snickering nuthatch at 10:53 AM on April 16, 2012


It's interesting to me that Python and Ruby get described as readable languages, when missing indentation kills scripts, and "Pythonic" or "Ruby-esque" code is a minification-like process that can do a lot to excise readability in exchange for making shorter scripts. Shorter got equated with more readable, and I'm not sure that's always true. At least with C-style blocks, the units of functional code are immediately clear.
posted by Blazecock Pileon at 10:57 AM on April 16, 2012


Anyway, regarding graphical programming, I liked the hybrid idea behind TermKit, even if the execution is still in its infancy.
posted by Blazecock Pileon at 10:59 AM on April 16, 2012


when missing indentation kills scripts

How is this any worse than missing semi-colons killing scripts? #fanningtheflames
posted by Popular Ethics at 11:02 AM on April 16, 2012 [1 favorite]


Programming has stuck to the old text file paradigm for surprisingly long. I thought by this time we'd be manipulating interactive flowcharts and stuff.

As Jpfed says in his linked post, there's more to coding than the flowcharting side of it, and drag-and-drop only takes you so far. We have drag-and-drop environments; we just don't have an IDE that has yet hit the right balance between typing it out and drawing it on-screen, and so we default to handling text for greater control.
posted by fatbird at 11:03 AM on April 16, 2012


Piet, a graphical but incomprehensible programming language.
posted by BungaDunga at 11:07 AM on April 16, 2012 [4 favorites]


#fanningtheflames

I think you meant
/**
 * fanningTheFlames
 */
posted by sodium lights the horizon at 11:09 AM on April 16, 2012 [2 favorites]


Well, I don't know what sort of semicolon related disease is the case here, but the kerfuffle brought to mind the old saw:

"Syntactic sugar causes cancer of the semicolon." — Alan Perlis

As for the "sad mustache guy" }); , I think you are going to meet him if you program in C with some specific features of the 99 standard (or the more recent one), and especially if you program in C++11, where he's all over the place thanks to all the new uses of curly braces in initialization lists and lambda functions. I've heard he also lurks about and delights C# programmers who use lambdas too.
posted by tykky at 11:12 AM on April 16, 2012 [1 favorite]


I played with drag and drop programming first with Lego Mindstorms.

The follow-a-sequence-of-written-instructions-from-top-to-bottom-of-the-screen style makes way more sense to me, probably because that's closest to what the computer is actually doing.

Let's not muddy the waters with multithreading.

I started with Python, and found it a challenge to remember semicolons and braces at first when I started Java, but now I find them kind of comforting as I know I don't have to deal with whitespace errors and everything's in a nice little container.
posted by mccarty.tim at 11:14 AM on April 16, 2012


Programming has stuck to the old text file paradigm for surprisingly long. I thought by this time we'd be manipulating interactive flowcharts and stuff.

This just doesn't make sense. It's like saying human literature has "stuck to the old written word paradigm for surprisingly long." We have evolved from stone tablets to fountain pens to movable type to Selectrics to iPads; and similarly from ed to [IDE of your preference]. The tools change but the medium stays the same, not in spite of its limitations, but in fact because it is fundamentally limitless.
posted by kjh at 11:14 AM on April 16, 2012 [6 favorites]


Light Table is not Eclipse, rather it is more like a Smalltalk environment from 20 years ago, for JS and Clojure. This is not a bad thing.
posted by thedaniel at 11:26 AM on April 16, 2012 [1 favorite]


drewbage1847: Until python, that is. I work with a group who all love python for slicing and dicing text. So, being the helpful type, I sat down and taught myself the basics of python and then got mad and annoyed that whitespace was suddenly significant again. Poor co-workers had to endure a few rants on the fact that we left whitespace significance behind years ago when people realized it made Make a source of endless frustration. *grumble*

Whitespace is not significant in Python.
posted by 3.2.3 at 11:28 AM on April 16, 2012 [5 favorites]


The written word has undergone immense modernization and modification since its inception. Though Unicode now allows it, I very much doubt you'd want to program in cuneiform or hieroglyphics.

Hey, here's a thought - construct a programming language using a modern character-based language like Mandarin. You'll find very, very, very quickly that programming languages aren't actually languages. They don't communicate with the machine - if they do communicate, it's with other programmers. You are ordering logical operations and setting conditions in a (mostly) human-comprehensible way. That semicolon is a convenient way of flipping a switch to light a diode - it's the enter key in a text file, nothing more. Why so much of this is dependent on the ASCII charset and not more obvious, concise and clear idiom isn't clear to me.

Hyperlinks are an example of technology improving on the human word, as are mouse-over tooltips and the right-click that brings up a dictionary entry (on a Mac.)
posted by Slap*Happy at 11:33 AM on April 16, 2012


So, while I understand this is totally irrelevant to most of us, and that it's probably more interesting to talk ABOUT the argument, rather than actually considering it, this is how I break it down:
  1. The various Javascript compilers need semicolons as statement separators.
  2. The various Javascript preprocessors can add the needed semicolons if they're missing, but this is guesswork.
  3. As a programmer, you can often, but not always, omit semicolons, if you guess correctly about what the compiler will guess.
  4. If you always use semicolons, things just work.
In my experience, depending on algorithms to guess what I meant is usually a fast route to broken systems. They might guess wrong. I might guess wrong about what THEY will guess. And new compilers may change their guesses, breaking old programs. That's a lot more guessing than I'm happy with, when trying to make an algorithm work.
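A sketch of the kind of wrong guess at issue here (hypothetical code): ASI does not fire when the next line could legally continue the previous statement.

var logger = console.log
(function () {
  // ...module setup...
})()

// Parsed as one statement: var logger = console.log(function () {...})()
// console.log returns undefined, and calling undefined throws a TypeError --
// one statement where the author clearly meant two.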

It seems to me that omitting semicolons has a very tiny immediate benefit, but the potential long-term cost for that benefit is disproportionately large. You may have to revisit otherwise perfectly-good code to fix up semicolon bugs in later generations of compilers, where if you just use semicolons now, that won't happen.

Reworking old code is a fact of programming life, but since it's one of the least interesting and most tedious parts of the profession, a tiny investment of syntax now, to potentially save boring troubleshooting later, seems like a smart choice.
posted by Malor at 11:51 AM on April 16, 2012 [6 favorites]


I wish I could fathom why anyone finds including semicolons problematic, to me as a relative hack they really help make code less ambiguous. Maybe that's why their detractors don't like them?
posted by maxwelton at 12:05 PM on April 16, 2012 [5 favorites]


God, for a language that has as many cool parts as it does, javascript is so, so terrible about so many things. There are way too many ways to check for null or undef. You can't iterate through a dictionary without guarding against monkey-patches up the prototype chain. It was only a few months ago that I realized it doesn't have block-local scope - you can declare a variable anywhere, but its scope is bound to the function it's in, not the block it's in. For a supposedly web-focused language it's ridiculously inconvenient to get a particular URL parameter from a string. The ASI stuff is really just the tip of the iceberg.
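(A quick sketch of the scoping point, nothing more:

function demo() {
  var x = 1;
  if (true) {
    var x = 2;  // not a new variable; this is the same x as above
  }
  return x;     // 2
}

The inner var is hoisted to the top of the function, so it silently clobbers the outer one.)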

The whole thing is in dire need of a cleanup, but instead it seems as though the JS design committee is intent on turning the language into Python. I love Python, don't get me wrong, but I'm not sure that adding array comprehensions to JavaScript is really as important as trying to clean some of the cruft out of it. Even stuff like "strict mode" doesn't go nearly far enough, as far as I'm concerned.

In summary, if I were Douglas Crockford, I'd be grumpy too. Running jslint is a definite eye-opener.
posted by whir at 12:13 PM on April 16, 2012 [1 favorite]


Modern IDEs are shockingly old-fashioned in terms of interface and usability compared to what's out there now for other realms of software.

Try programming a PLC one time. Their IDEs make Visual Studio look like a sleek jet fighter.
posted by lohmannn at 12:13 PM on April 16, 2012 [2 favorites]


God, for a language that has as many cool parts as it does, javascript is so, so terrible about so many things.

this
posted by a snickering nuthatch at 12:16 PM on April 16, 2012 [1 favorite]


You'll discover a much more sensible viewpoint on "drag & drop" programming in Haskell's Control.Arrow module and accompanying do notation, mccarty.tim. Arrows aren't drag & drop currently, but they make drag & drop make sense.
posted by jeffburdges at 12:21 PM on April 16, 2012


whir, the issue (as with so many things) is not that the committee doesn't want to improve the language by removing cruft, but rather that no sane implementor of the language will implement a specification that makes previously-legal code illegal. Doing so would break the web.

If the JS design committees want to remain relevant - that is, producing specifications that are actually implemented - they can't do much about what is already legal code.
posted by Fraxas at 12:26 PM on April 16, 2012 [1 favorite]


I wish I could fathom why anyone finds including semicolons problematic

I'm not a JS coder, but my understanding is that the fewer characters you include, the fewer need to be transmitted over the network, and the faster a page loads. So it's an optimization thing. As with many other design considerations, I could see this turning into a fetish for some people. The more I code, the more I see there are tradeoffs in many decisions I make that previously would have seemed straightforward. So I'm throwing in my lot with the old cranky people and saying this guy should have put the damn semicolon in, maybe with a comment to indicate that ASI didn't work here. (I assume JSMin or something similar automatically removes comments before code is transmitted over the network.)
posted by A dead Quaker at 12:28 PM on April 16, 2012


50 comments in, and no Vonnegut quote yet?
posted by schmod at 12:28 PM on April 16, 2012


There are way too many ways to check for null or undef.

Yeah, this:

if (typeof(foo) === "undefined") ...

is just ridiculous.
posted by Blazecock Pileon at 12:28 PM on April 16, 2012 [1 favorite]


It's been my experience that a lot of anxiety around coding is a result of programmers who don't touch-type.

They may be great logicians or mathematicians or both, but if you've ever watched someone try to code something by typing, looking at the screen, typing, looking at the screen, backspacing, looking at the screen... Well, you can see where something like a semicolon - which is nowhere near the middle of the keyboard - could become the subject of heated contention.
posted by mmrtnt at 12:29 PM on April 16, 2012 [3 favorites]


I'm not a JS coder, but my understanding is that the fewer characters you include, the fewer need to be transmitted over the network, and the faster a page loads.

Oh dear god no. If you care about what's being transmitted over the network, use a javascript compiler which will mash your javascript down to a minimal size, and in doing so probably add back all your semicolons, because you need SOME character to denote the end of a statement. The fact that that character can be a carriage return doesn't save you anything: carriage returns take up just as much space as semicolons.
posted by aspo at 12:35 PM on April 16, 2012 [2 favorites]


Shorter got equated with more readable, and I'm not sure that's always true. At least with C-style blocks, the units of functional code are immediately clear.

Shorter certainly isn't always more readable, but C is practically the poster child for that fact! I assure you it is not the case that C blocks are always clear.

Regardless of the language, it's possible to write code in a way that's difficult to read. One can write perfectly legible Perl, C or Python code, or one could write it in such a way that it's a disaster for someone who's not a compiler to understand. Depending on how long lines are and where you hide your closing and ending braces, it can become surprisingly difficult to tell where a C-style block begins, and if indentation is inconsistent then you can do a lot to mess up readers -- and that reader may be yourself a few months down the line.

Anyway, in my experience Python's indentation works as expected in all situations that don't involve tabs. You absolutely cannot mix tab characters and spaces when indenting in Python; you must always use one or the other. A necessary step to setting up an editor for Python is to make sure it uses spaces rather than tab characters, or else you will come to know true pain.
posted by JHarris at 12:38 PM on April 16, 2012


One can write perfectly legible Perl
[CITATION NEEDED]
posted by deathpanels at 1:03 PM on April 16, 2012 [15 favorites]


Arrows aren't drag & drop currently, but they make drag & drop make sense.

The problem is that arrows themselves are not so easy to grok.

The functions Control.Arrow exports can nicely shorten, or render point-free, higher-order functions, using the instance for (->), but I find them just really opaque in other cases—just not clear what they do. (Someone much better informed seems to agree!)
posted by kenko at 1:08 PM on April 16, 2012 [1 favorite]


As a total novice programmer, I'm sorry, but what's the advantage to ASI? Why not just add semicolons as you go? We're talking about a tiny, tiny effort here.
posted by GilloD at 1:37 PM on April 16, 2012 [1 favorite]


I'd agree that lenses look & feel like they should be arrows, kenko, interesting thanks! Lenses appear extremely important for exploiting Haskell for "real world stuff".
posted by jeffburdges at 1:39 PM on April 16, 2012


The same people who will omit semicolons (to save interpreter time?) seem to think nothing of throwing in several newlines between functions and blocks (in lieu of comments?)
posted by fredludd at 1:44 PM on April 16, 2012


I wish I could fathom why anyone finds including semicolons problematic, to me as a relative hack they really help make code less ambiguous. Maybe that's why their detractors don't like them?

I personally don't think they add much in terms of readability; if 99% of the time a line of code is going to end with a semicolon, then 99% of the time a newline is just as good as a semicolon in terms of readability. And to me at least a line continuation character is a more readable way to point out the lines that look like a single statement but actually continue than omitting a semicolon. There's always a trade-off between having simple syntax and making the code structure more obvious. VB.NET for instance uses completely unambiguous block terminators like End If or End Class instead of curly braces like C# uses, but I doubt many programmers would argue that VB's syntax is a sane way to design language.
posted by burnmp3s at 1:53 PM on April 16, 2012 [1 favorite]


[CITATION NEEDED]

Your joke has gone array.
posted by srboisvert at 2:05 PM on April 16, 2012 [15 favorites]


If you want to compress javascript, look at Google's Closure Compiler, which actually compiles JavaScript down into smaller JavaScript. And if your JS code is valid, the output will work - although there are some interesting hitches - Closure will remove any code that isn't associated with an external object (like window). So it may remove code you don't intend it to remove.
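A rough sketch of that behavior under ADVANCED_OPTIMIZATIONS, with made-up names:

function unusedHelper() { return 42; }    // reachable from no exported symbol: stripped as dead code
function publicApi()    { return 'hi'; }

window['publicApi'] = publicApi;          // the quoted property isn't renamed,
                                          // so publicApi survives in the output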


If minifying working code breaks it, that means the minifier is invalid.
posted by delmoi at 2:24 PM on April 16, 2012 [1 favorite]


and so cannot be minified except by parsing fully

Wait a minute. People are trying to write source code transformation tools that don't "parse fully"? What are they smoking? There is no other sane way to do it.

Douglas Crockford is trying to write a source code transformation tool that doesn't "parse fully"? His prettyprinter doesn't parse javascript according to the standard, and he thinks this is the input programs' fault???

I am seriously disappointed.
posted by Mars Saxman at 2:25 PM on April 16, 2012


VB.NET for instance uses completely unambiguous block terminators like End If or End Class instead of curly braces like C# uses, but I doubt many programmers would argue that VB's syntax is a sane way to design language.

People tend to be swayed by custom. BASIC was created at a time when clarity was favored over brevity. C-style languages go to the other extreme. Before Python, not a lot of programmers would have advocated its enforced indentation scheme, but I really like it. Who knows what is really best? It might not be what people will argue for at this moment.
posted by JHarris at 2:33 PM on April 16, 2012


if (typeof(foo) === "undefined")

I didn't say it was difficult to do, I said there are too many ways to do it, many of them with subtle dangers. Here's an example of the confusion that results from this. Looking at code in the wild you might find any number of variations on !var, var == null, var === null, var == undefined, typeof(var) === 'undefined', etc. And for the record, while I agree that the way you put it is the most correct, comparing the return value of a system function to a string constant defined in code strikes me as an extremely hacky way to go about performing a very basic and frequently-used operation in a dynamic language. I mean, javascript is dynamic and nobody expects it to be incredibly type-safe, but there's no easy way for a compiler / interpreter / IDE to detect a misspelling of "undefined" in the snippet above.
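A quick sketch of why those variations aren't interchangeable (assuming foo has been declared somewhere):

var foo;
!foo;                         // true for undefined, but also for null, 0, "", NaN, false
foo == null;                  // true for null AND undefined (loose equality)
foo === undefined;            // only undefined -- but the global undefined was writable
                              // in older engines, so even this can be fooled
typeof foo === "undefined";   // works even if foo was never declared at all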

no sane implementor of the language will implement a specification that makes previously-legal code illegal

Definitely, but you could make the improvements in a language version "Javascript 2.0" (though ideally with a less politically fraught name than that), and only apply them to javascript snippets that explicitly state they adhere to the new standard; all other code would be interpreted as JS 1.7 or whatever browsers are supporting these days. This is more or less the approach they've taken with strict mode, and that's why I'm disappointed that strict mode does so very little.

I suppose that since everybody wants to run JS through a compiler anyways these days, largely to make up for ways in which HTTP isn't ideal for the modern web, it won't be that difficult for everyone to move on to using CoffeeScript or Dart or something; maybe Javascript-the-language will eventually fade away to become Javascript-the-platform, much as I'm hopeful Java is about to do.
posted by whir at 2:41 PM on April 16, 2012


God I hope not. I recently abandoned Eclipse for working on my own project; it takes like a full minute to start up, and about as long to shut down, and it requires Java which is rapidly becoming the security hole of choice for budding young hackers to exploit.
Light Table is written in Clojure, which runs on java. It looks like he set it up as a webapp, but I don't even see a download link or anything. It's just an idea apparently. But if he were to release his code, you would need the JVM to run it.

Apple left its version of Java unpatched for a few months. But that has nothing to do with the overall security of Java if you're running an official release on a supported platform.
This just doesn't make sense. It's like saying human literature has "stuck to the old written word paradigm for surprisingly long." We have evolved from stone tablets to fountain pens to movable type to Selectrics to iPads; and similarly from ed to [IDE of your preference]. The tools change but the medium stays the same, not in spite of its limitations, but in fact because it is fundamentally limitless.
Yes, but we also have comic books, movies, audiobooks and other ways of creating and consuming ideas. The question is, are plain text ASCII files really the best way to store and edit source code? At the very least people could be using more interesting mathematical operators. Why limit yourself to a handful of symbols, just because they aren't on keyboards? There are some flowchart languages out there, but they're not widely used. There's the Piet language that takes input as an actual image, but that wasn't intended to be a practical language.

Or on the other hand, why not try to remove as much punctuation as possible, so that you could use voice to text to program and thus avoid all that typing? Voice to text works great on my phone for sending text messages. It's not flawless, but it's much nicer than typing on the tiny keyboard or using swipe -- but it's unrealistic for programming because of all the punctuation.

What I'd like to see is something where you can take a word like 'union' and have it translated by voice to text to ∪, which could then be stored as Unicode. Or you could type &union; or something like that - but you wouldn't need to store the information directly as the same bytes typed from your keyboard.
posted by delmoi at 2:49 PM on April 16, 2012


You will find sad mustache guy }); in Java, C#, C++11, and Objective C. It's fine, and far from the strangest thing in JavaScript.

At least this non-semicolon using thing is consistent about it. I've had to deal with code that inconsistently chooses not to use semicolons. That's true insanity.
posted by jeffamaphone at 3:02 PM on April 16, 2012


If minifying working code breaks it, that means the minifier is invalid.

This is one of those times you would have benefited from reading the article since that's the WHOLE GODDAMN ARGUMENT.

So it's an optimization thing.

If it is, it's the only one Twitter seems to care about. Take a look at the page size of the web interface. Rather than skipping a hundred semi-colons, how about this: when I visit someone's profile page, instead of loading all of my background image, etc. and then drawing over it with theirs, maybe just show me theirs to begin with?
posted by yerfatma at 3:05 PM on April 16, 2012 [2 favorites]


This is one of those times you would have benefited from reading the article since that's the WHOLE GODDAMN ARGUMENT.

Well, if that's the argument then one side is obviously wrong.
posted by delmoi at 3:07 PM on April 16, 2012 [1 favorite]


why not try to remove as much punctuation as possible, so that you could use voice to text to program and thus avoid all that typing

That sounds a little like HyperCard's HyperTalk, which aimed to read like English.

I remember it as being fun to begin with but ultimately frustrating: there was still a rigid syntax to the language which often put it subtly at odds with English's more flexible syntax. Non-trivial code would often end up reading as fairly fractured and repetitive English: flabbier and harder to understand than a more concise punctuated language.
posted by We had a deal, Kyle at 3:13 PM on April 16, 2012


FWIW, bootstrap has been patched so that it runs correctly in JSMin, although they are still puckishly avoiding a semicolon.
posted by whir at 3:25 PM on April 16, 2012


This entire argument is such a profound embarrassment for both parties that I hardly know where to begin. Next thing you know Bootstrap is going to use spaces for indentation and tabs for alignment. After that, dogs and cats getting married. You heard it here first.

But seriously, the Bootstrap guys come off so poorly on this topic that it's made me suspicious of the whole thing, which sucks, because I use it and find it incredibly convenient and have recommended many people check it out.
posted by feloniousmonk at 3:34 PM on April 16, 2012 [1 favorite]


Doug Crockford wrote JSMin. Doug Crockford most certainly knows how to write a fully featured JS parser, as that is what JSLint is. But JSMin does not use a parser, it is a set of simplistic replacement rules. It can and will change the meaning of code that he does not deem unambiguous enough -- it says so right there on the tin. This was not a decision made out of laziness or difficulty, it was a deliberate design decision. I've seen people say that this was so that JSMin can be fast enough that it can be used to minify JS assets in real-time and still reduce the total load time. I don't know if that is the official reason or not, but there are other options like Google closure compiler if you want a true offline minification tool.
posted by Rhomboid at 3:35 PM on April 16, 2012


The question is, are plain text ASCII files really the best way to store and edit source code? At the very least people could be using more interesting mathematical operators. Why limit yourself to a handful of symbols, just because they aren't on keyboards?

Most modern languages support Unicode text for string literals and whatnot, but as for the actual language using those characters, what would be the point? APL has a bucket-load of wacky non-standard symbols, and that sort of made sense back when having a weird non-standard keyboard was less of a big deal, but it doesn't really provide any real benefits in a modern language. The main reason why a language feature like the "equal to" operator uses the "=" symbol in some way is that the "=" symbol already exists and is recognized by people. Which is the same conflicting reason why assignment also tends to use the "=" symbol. Using "foo ¢ 52" to mean "assign the value of 52 to foo" or otherwise abandoning a standard symbol for an arbitrary new one would be a dumb idea, which only leaves operators that are both needed for programming and not generally represented in programming by a standard ASCII character. I don't think there are a lot of cases where it would make sense to bring in a lot of non-ASCII symbols to represent brand new operators in a programming language.

Or on the other hand, why not try to remove as much punctuation as possible, so that you could use voice to text to program and thus avoid all that typing? Voice to text works great on my phone for sending text messages. It's not flawless, but it's much nicer then typing on the tiny keyboard or using swipe -- but it's unrealistic for programming because of all the punctuation.

Dealing with unpronounceable punctuation is the least of your problems if you want to use voice as a way to write code. For one thing, unlike most other speech-to-text use cases, source code is modified in place quite a bit. The usual IDE these days is designed around using a keyboard and mouse, which gives you both an unambiguous way to enter hundreds of different symbols and an unambiguous way to point to any position on the screen. Take away the mouse and there are generally a decent number of hotkeys or other meta commands that get you to the same level of ease moving around and editing the text. If you instead have to use the human voice to control everything, you have what, a couple dozen fairly ambiguous phonemes to work with? For the English language, we already know an exceedingly complex and relatively well-structured way to represent a given sentence in sounds so that another human can understand them, and a computer can use a dictionary and AI to figure out what specific words someone is speaking. But to tell a computer to edit some random character in the middle of your text? There's no simple efficient system everyone already knows to do that, and the one you would have to invent to make it better than a pointing system would be too difficult for most people to learn. It would be like getting rid of the steering wheel and gas pedal from a car and replacing them with a voice control system: sure, you could do it, but it wouldn't be any easier or more efficient to drive a car that way.

What I'd like to see is something where you can take a word like 'union' and have it translated by voice to text to ∪, which could then be stored as Unicode. Or you could type &union; or something like that - but you wouldn't need to store the information directly as the same bytes typed from your keyboard.

That seems like an overly complicated system for no real benefit. Most programming languages don't have an operator for Union because it's not usually a supported operation for the native types (which are generally things like integers, fixed length arrays, linked lists, etc. rather than something like sets where union would be very relevant) and if you're writing your own code or library that involves Union the fact that you call superset = subset1.union(subset2) instead of superset = subset1 ∪ subset2 does not seem to be a big problem. In languages where you can mess with operators to do similar sorts of tricks, it's generally frowned upon because the new operators you make up are less readable and obvious than typing out a new method name.
posted by burnmp3s at 3:40 PM on April 16, 2012 [1 favorite]


Well, if that's the argument then one side is obviously wrong.

As Rhomboid said before I finished writing up pretty much the same thing: there is a place for tools that run slowly and correctly, and there is also a place for tools that run quickly. One of the problems with javascript is that the grammar makes the second much harder. I know nothing about jsmin, and I don't really care much about the issue, but I do know that if someone wrote code like

!foo && doIfNotFoo()

I'd beat them up until they changed that to

if (!foo) doIfNotFoo()
posted by aspo at 3:42 PM on April 16, 2012 [1 favorite]


I mean at the very least they could have written

foo || doIfNotFoo()

which still irks but at least gets rid of that !.
posted by aspo at 3:49 PM on April 16, 2012 [2 favorites]


And I should know better than to wade into a land war in Asia programming language debate, but I'll never understand people that have a problem with Python treating whitespace as significant. In C and all the languages that borrowed syntax from it (Java, Perl, etc.), you have two completely independent methods of specifying the lexical structure of the program: the { } tokens for the benefit of the compiler, and whitespace for the benefit of the human reading the source code. You had better keep those two in sync, otherwise there is a special place in hell waiting for you. Nobody likes to read code where the indentation doesn't match the braces, and if you're defending the ability to do that, well, then we should just end the conversation now and go our separate ways.
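A tiny sketch of that kind of mismatch, with made-up code:

var isReady = false;
function start()      { console.log('starting'); }
function logStartup() { console.log('logged startup'); }

if (isReady)
    start();
    logStartup();   // indented as if guarded by the if, but it always runs

The human reads the indentation, the compiler reads the (missing) braces, and they disagree.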

Why therefore have two different sets of syntax? Why not unify them into one, and have both the compiler and the human work off the same tokens to determine the lexical structure? There's nothing to get out of sync, and it eliminates a whole heap of problems such as when you accidentally leave off a closing } somewhere and it takes a while to figure out what happened from the wretched error messages, or endless debates about whether you can omit the braces for a statement body that's a single line.

Obviously you can't do this for C and C++ because they still depend on macros (ideally to a virtually nonexistent extent in C++, but still) and writing macros becomes a lot easier if you don't have to worry about whitespace. Please don't misinterpret this as me saying that there is something wrong with C or C++ for not treating whitespace as significant. Where I'm really pointing the finger is all those people that thought it would be a good idea to copy C but also eliminate that unhygienic preprocessor. I guess the argument goes that the reason for making your new language have a C-ish syntax is that it looks warm and comfortable to those already familiar with C, and doing C-but-without-braces-and-with-significant-whitespace would be an abomination, a fair enough point. But you're also inherently saying that you want to take on some of C's warts, so it's not like this is a decision without ramifications.

I think the Python decision to excise all of that baggage is a masterstroke. I love working in Python and I love how clean it looks. I can't understand people who react so viscerally to the concept. Aren't you already writing code in your C-syntax languages where the indentation voluntarily matches the lexical structure?
posted by Rhomboid at 3:56 PM on April 16, 2012 [1 favorite]


I'd beat them up until they changed that to if (!foo) doIfNotFoo()

I dunno, short-circuit evaluation is an extremely common idiom in C and most C-derived languages (sh, Perl, javascript, etc). Even Python has it, though I've seen it used much less frequently in Python than other languages. I tend to write the if (...) version myself, but I'm fine reading code that uses condition && consequences.

On the other hand, I'd prefer you put braces around it myself: if (!foo) { doIfNotFoo(); }. De gustibus non est disputandum.
posted by whir at 4:00 PM on April 16, 2012


If you want to compress javascript, look at Google's Closure Compiler, which actually compiles JavaScript down into smaller JavaScript.

When Closure library first came out, there was some significant criticism about it being full of problematic JavaScript idioms -- it certainly didn't inspire confidence in the idea that the compiler would emit optimized code. I've pretty much stayed away from anything with the name since. Is there any indication that Google got over those problems?
posted by weston at 4:05 PM on April 16, 2012


Rhomboid, I think the frequent argument from C people is that relying on whitespace is fragile, because we're conditioned by normal language usage to regard whitespace as relatively unimportant ("none" or "at least one"). We can use whitespace markers in our editors, but it's just very hard for some people to retrain themselves to "quantity of whitespace changes function."

(At my previous job I was bitten several times by green hires checking in code that had errors in whitespace: adding a command at the wrong indent level, thus grouping it with the conditional code above. It took far, far longer for people to notice that in the commit stream than checking in something inside the closing brace in C code.)
posted by introp at 4:10 PM on April 16, 2012


short-circuit evaluation is common, and I have no problem with it in the right cases. But that's not the right case.

foo || doIfNotFoo() has a place. Especially when inside another expression. For instance

if (foo != null && foo.getBar()) makes sense. It's short circuiting the getBar call in the null state so that it's not getting a NPE. In the same vein you can have while (doCheapTest() && doExpensiveTest()) or some such.

see also: doX() || die("Oh noes! X failed!")

!foo && doIfNotFoo() sitting there on its own is much much uglier. That ! is more likely to be missed, and parsing it takes a bit more mental overhead in an already slightly awkward piece of code. If statements exist for a reason. Plus it's not even doIfNotFoo(), but !foo && doSomething(). Even worse. That's code that people are going to read incorrectly. Why do that when if statements exist and make it much clearer what your intent is?
posted by aspo at 4:16 PM on April 16, 2012 [2 favorites]


Why therefore have two different sets of syntax? Why not unify them into one, and have both the compiler and the human work off the same tokens to determine the lexical structure?

(Upfront note: I'm not arguing against Python. I like working in it, even if I don't get the BEST LANGUAGE EVAR warm fuzzy feeling some people do, nor do I begrudge them their feelings as long as they're able to recognize them as such).

1) Whitespace has some inherent problems as a meaningful token, particularly when not everyone is using editors/modes that treat whitespace forms the same way (yay! now we can have the editor and tabs vs spaces wars too!).

2) There are some situations where it sometimes seems/feels better to be able to break the formatting conventions that grow up around a language. The JavaScript minification process that's part of the ostensible topic of the thread is one type of case.

There are also, as you say, some advantages to syncing up the formatting conventions and syntax. Some people find them compelling, some people find the disadvantages compelling. YMMV.
posted by weston at 4:17 PM on April 16, 2012


But the "editors can be configured differently" problem applies equally regardless of whether whitespace is significant or not. The only difference is that when whitespace is not significant, it's a mere nuisance and not an outright failure. But I guess that I'm coming from the point of view that a programmer being able to read and follow the code should be important above all else, and so in that light the fact that broken indentation is only a nuisance is itself a bug -- making it significant forces us to deal with the problem (e.g. PEP8) instead of sweeping it under the rug and letting it become an intractable holy war that festers for decades.
posted by Rhomboid at 4:28 PM on April 16, 2012


Most modern languages support Unicode text for string literals and whatnot, but as for the actual language using those characters, what would be the point?

So I guess you wouldn't approve of something like this (used in actual code by me):

(¢) = flip ($)

(in use: lookup c one ¢ return <?> (lookup c two ¢ (`fmap` action) <?> go))

I thought it was, if nothing else, typographically clever.
posted by kenko at 4:51 PM on April 16, 2012


I've seen people say that this was so that JSMin can be fast enough that it can be used to minify JS assets in real-time and still reduce the total load time. I don't know if that is the official reason or not, but there are other options like Google closure compiler if you want a true offline minification tool.
Well, looking at that -- I'm not really sure I get the point. Something like bootstrap wouldn't need to be 'compressed' in real time. I'm not really sure why any JS would need to be compressed in real time at all.
The main reason why a language feature like the "equal to" operator uses the "=" symbol in some way is that the "=" symbol already exists and is recognized by people. Which is the same conflicting reason why assignment also tends to use the "=" symbol. Using "foo ¢ 52" to mean "assign the value of 52 to foo"
Okay, but what about the union of a set? That was the example I gave. Or a summation? Most languages don't have symbols for those at all; you would need to do something like sum(function,a,b) rather than Σ(function,a,b), so you're not replacing a common symbol with something else.
I don't think there are a lot of those cases, where it would make sense to bring in a lot of non-ASCII symbols to represent a brand new operators in a programming language.
Well, you only gave an example of something we already have an ASCII symbol for. If there were no ASCII symbol for =, would you want to type equals(a,b) all the time? That way you could have an expression like if(a ∈ b) ... instead of something like if(b.contains(a)). If you only care whether a is in one of two sets you could do if(a ∈ b ∪ c) ...

And you could have neat things like saying z = ∪ ∀x ∈ s1 : ∀y ∈ s2 y ∈ x, which is to say the union of all sets in s1 that contain all the items in all the sets in s2.

How would you express that using normal ASCII characters? The obvious way would be to create infix operators like &union; or something - but that would look pretty awful.

I didn't say we needed to replace the equals sign with something else. It's certainly true that the equals sign is "recognized" by people, but ∀, ∃, ∪ and ∈ are also recognized by people and can't be used because they're not ASCII characters.

I mean come on - that's such an absurd counterexample. Of course you wouldn't replace the equals sign with something else; the whole point is to avoid replacing other mathematical notation with something else, the same way you don't want to write a.equals(b) or set(a,b) instead of a == b and a = b;
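
For what it's worth, that union example can already be written fairly compactly with plain ASCII Python set operations. A minimal sketch, assuming s1 and s2 are collections of sets (the sample data is made up for illustration):

s1 = [{1, 2, 3, 4}, {2, 3}, {5, 6, 7}]
s2 = [{2}, {3, 4}]

# Union of every set in s1 that contains all the items of all the sets in s2.
# y <= x is Python's subset test for sets.
z = set().union(*(x for x in s1 if all(y <= x for y in s2)))
print(z)  # {1, 2, 3, 4}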
When Closure library first came out, there was some significant criticism about it being full of problematic JavaScript idioms
The closure library and the closure compiler are two different things. I haven't even really looked at the Closure library. Personally I don't really care if something "breaks" JavaScript "conventions". IMO JavaScript blows. It can be fun to use JS's functional nature to write functions in strange ways, but overall debugging large programs in JS is a huge pain. I spend way more time testing and debugging in JS compared to Java.

What I like about Closure's use of JSDoc annotations to declare static-ish types is that it makes JavaScript a lot less like JavaScript. You can still do whatever you want, but when you want static type checking it's there, to a certain extent; it's not perfect and there are no generics, but it definitely cuts down on debugging time.
posted by delmoi at 4:54 PM on April 16, 2012


In many languages you end up typing a.equals(b) all the damn time. Because = and equals are two different concepts.

But really? Your union example? Complicated enough and rare enough that a few extra lines of code are not going to kill you, and will probably make your code easier to understand. Unions aren't that common. Source code serves two purposes: communication with the machine, and communication with the human reader. There are still people today who swear by APL, but there's a good reason that it never caught on.

(~R∊R∘.×R)/R←1↓⍳R

Is proof that terseness does not equate to ease of understanding.
posted by aspo at 5:10 PM on April 16, 2012 [4 favorites]


The closure library and the closure compiler are two different things.

Right. And I suppose it's possible they were created by entirely separate groups of people, but it doesn't seem very likely, and if not, warning flags in the library would be an indication that the compiler might be emitting problematic code.

Personally I don't really care if something "breaks" JavaScript "conventions".

This isn't "breaks" in quotes. The complaints were about choices where lower-performance expressions were chosen over higher ones (like repeated property lookups in loops), and expressions that could introduce bugs (like not checking hasOwnProperty on Object property traversal).

What I like about Closure's use of JSDoc annotations to declare static-ish types is that it makes JavaScript a lot less like JavaScript

An increasingly time-honored tradition, particularly for Java developers.
posted by weston at 5:20 PM on April 16, 2012


I'm not really sure why any JS would need to be compressed in real time at all.

In the name of decreasing page load time by reducing the number of roundtrip requests, it's common to have some kind of "JS combiner" API where you specify what you want as query params and the response is the minified, concatenated result, e.g.
http://example.com/js/packer?scripts=jquery-1.6.4,jqueryui-1.7.3,fooplugin-1.2,barplugin-3.0
Sure, you could cache the results, but then some people are going to want to use this technique to combine their own sources with their static third party external libraries, and they're going to complain when they make a change to their own code and a cached copy is served that doesn't contain that change, so then you have to add code that always stats each of the assets specified and works out whether the cached copy is still valid. But then you need metadata for each of your cache entries to spell out the timestamps of each file in the concatenation, and you need something to GC the cache regularly, and then you have to decide whether to share one cache among all your instances or make a local cache for each, and this is turning into a complicated distributed system with lots of moving parts. If you can instead efficiently minify on the fly you save a lot of work and avoid corner cases that tend to bite complicated distributed systems (e.g. a person gets served from cache A for one request and cache B for a second request, and those caches are out of sync, leading to strange unreproducible page behavior.)

It's arguable that this is just papering over poor workflow and change management policies. But people want to shoot themselves in the foot.
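
For concreteness, a rough Python sketch of the staleness check described above, using hypothetical names (not taken from any particular combiner implementation):

import os

def bundle_is_stale(cache_path, source_paths):
    """True if any source asset changed after the cached bundle was written."""
    if not os.path.exists(cache_path):
        return True
    cached_at = os.path.getmtime(cache_path)
    return any(os.path.getmtime(p) > cached_at for p in source_paths)

Even this simple version still leaves the cache metadata, GC, and cache-sharing questions above unanswered, which is rather the point.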
posted by Rhomboid at 5:38 PM on April 16, 2012


I'll never understand people that have a problem with Python treating whitespace as significant.

And I'll never understand people who don't get it. The Zen of Python says that explicit is better than implicit... but apparently not with block endings, where implicit is better.

Besides the compiler or a human reader, there's something else that needs to recognize block endings: your editor. In a language with explicit block endings, you tab once to get to the correct indent level; if you close the block, the editor knows to move the ending back left a level. With Python, the editor can't auto-indent and get it right the first time because any number of open blocks may have implicitly ended before your new line of code. Hit tab, and maybe some number of backspaces. Decide after the fact to insert another new line of code between the implicit block ending and what had been the next line? Tab and some number of backspaces again. Number of times you have to tell your editor a block is over in a language with explicit block endings? Once. Number of times you have to tell it with Python? Potentially unlimited.

Cutting and pasting a code snippet you found on the web? In a language with explicit block ends, mark a region and tell your editor to re-indent it. Done. In Python, encounter nuisance. Python's REPL becomes needlessly hard to use because a single screw-up in whitespace can break the code.

And for what? The upsides are: 1) you can fit a tiny bit more code in the same number of lines; 2) you can go on about how great significant whitespace is and how your code doesn't look like C or Perl.

I dislike Python's significant whitespace not for any philosophical reason, but because it's wasted my time.

Yes, a good indentation style makes for more readable code. But it can be and thus should be automated.
posted by Zed at 5:46 PM on April 16, 2012 [1 favorite]


once to get to the correct indent level; if you close the block, the editor knows to move the ending back left a level.

How is typing } to close a block any different than me typing shift-tab to deindent one level? They are both the same number of keystrokes.
posted by Rhomboid at 5:54 PM on April 16, 2012 [1 favorite]


Dealing with unpronounceable punctuation is the least of your problems if you want to use voice as a way to write code. For one thing, unlike most other speech-to-text use cases, source code is modified in place quite a bit. The usual IDE these days is designed around using a keyboard and mouse, which gives you both an unambiguous way to enter hundreds of different symbols and an unambiguous way to point to any position on the screen. Take away the mouse and there are generally a decent number of hotkeys or other meta commands that get you to the same level of ease moving around and editing the text. If you instead have to use the human voice to control everything, you have what, a couple dozen fairly ambiguous phonemes to work with? For the English language, we already know an exceedingly complex and relatively well-structured way to represent a given sentence in sounds so that another human can understand them, and a computer can use a dictionary and AI to figure out what specific words someone is speaking. But to tell a computer to edit some random character in the middle of your text?
Eh, you're not being imaginative enough. I'm not talking about throwing away the mouse and keyboard, rather adding voice as well. You could use the mouse to highlight one token, and speak its replacement. In the case of ambiguity, the IDE could show a menu, and you'd say the menu number corresponding to the one you want.
There's no simple efficient system everyone already knows to do that
Last I checked, people aren't born knowing C, or even Python. In fact, they may not even know how to type.
In many languages you end up typing a.equals(b) all the damn time. Because = and equals are two different concepts.
Yeah, and it's kind of annoying. Especially when either might be null and you have to do something like if((a == null && b == null) || (a != null && a.equals(b))).
Right. And I suppose it's possible they were created by entirely separate groups of people, but it doesn't seem very likely, and if not, warning flags in the library would be an indication that the compiler might be emitting problematic code.
Why does it matter if the compiler emits 'problematic' code, so long as the code still runs in a standards compliant browser? You're not supposed to be reading it.

Here's an example of some code it spat out:
elements.decodeTable = function(a) {
  var b;
  b = a.Sb;
  var c = [];
  for(i in b.Ga) {
    for(var d = c, e = i, f = new n(b.Ga[i]), h = new l(b.pb), o = f.B(), q = [], s = 0;s < o;s++) {
      q[s] = new p(f.next(), f.next(), h.get(f.B()), f.next())
    }
    d[e] = new r(q)
  }
  b = new aa(new l(b.pb), new m(c));
  c = new n(a.elements);
  d = [];
  for(e = 0;c.bb();) {
    d[e++] = y(c, b.P)
  }
  return new ba(d.length / a.L, a.L, d, a.$)
};
And that's with 'pretty-print' turned on so that it adds newlines and indents so you can get some idea of what it might be doing. (Also, no semicolons!) All the short 1-2 letter names are renamed functions/variables. Is that code idiomatically problematic? Why would it matter? Also, a lot of the functions are inlined; the source that compiles down to this is just six lines and calls a bunch of other functions. And it's written to run fast.

In this article complaining about Closure, they say that the code, as written, might be slow. But they are completely missing the point that all of it gets optimized into something else anyway.

BTW, it works perfectly well with code that uses jQuery; you can use jQuery and then run it through Closure.
Sure, you could cache the results, but then some people are going to want to use this technique to combine their own sources with their static third party external libraries, and they're going to complain when they make a change to their own code and a cached copy is served that doesn't contain that change, so then you have to add code that always stats each of the assets specified and works out whether the cached copy is still valid. But then you need metadata for each of your cache entries to spell out the timestamps of each file in the concatenation, and you need something to GC the cache regularly,
Well, if you can't cache it on the server, you can't cache it in the browser. And if you can't cache it in the browser then you're definitely Doing It Wrong, IMO. Especially if those external libraries are large (but then you couldn't compile them in real time anyway).

A simple way to avoid problems is to use version numbers with your JS files, so if you make a change, you update your HTML to point to compiled_12321.js or whatever. That's what MetaFilter does, I think.
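
One low-tech way to automate that versioning is to derive the name from a hash of the file's contents instead of a hand-maintained number. A small sketch with a hypothetical helper (not how MetaFilter actually does it):

import hashlib
import os

def versioned_filename(path):
    """Return a cache-busting name like compiled_9f2b4c1a.js based on the file contents."""
    with open(path, "rb") as f:
        digest = hashlib.md5(f.read()).hexdigest()[:8]
    base, ext = os.path.splitext(os.path.basename(path))
    return "{0}_{1}{2}".format(base, digest, ext)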
posted by delmoi at 5:55 PM on April 16, 2012


Hit tab, and maybe some number of backspaces. Decide after the fact to insert another new line of code between the implicit block ending and what had been the next line? Tab and some number of backspaces again.

What editor do you use? In Emacs' python mode, a new line begins on the same indentation level as the previous line (unless it couldn't be part of the same block—the previous line was a return, break, continue, or raise statement (dedented) or unless the previous line ended with a colon (indented)). If you want it to be dedented, just … hit tab, and successive tabs will cycle you backwards through possible indentation levels. Yeah, it's not automatic, but it's not the horrible burden you're making it out to be.

I'm with you on copying and pasting, but really, an intelligent editor ought to be able to cope with that as well—taking current indentation as the baseline relative to which pasted code should be indented, and working out the indent/dedent structure of the pasted code from its own characteristics. I'm not sure any editor does do that, though.
posted by kenko at 5:56 PM on April 16, 2012


(er, I should say the output is designed by the Closure compiler to run fast, not the original function from which it was compiled)
posted by delmoi at 5:57 PM on April 16, 2012


Eh, you're not being imaginative enough. I'm not talking about throwing away the mouse and keyboard, rather adding voice as well.

Hopefully, that'll work better than last time around.
posted by Blazecock Pileon at 6:16 PM on April 16, 2012 [7 favorites]


Ah, that's such a classic. I can only get through a couple minutes of that before I start laughing my ass off again. Still, voice recognition has come a long way in five years. Maybe we'll be able to speak LOGO by 2020.
posted by Blazecock Pileon at 6:26 PM on April 16, 2012


Ha, that really is amazing.
posted by whir at 7:02 PM on April 16, 2012


And I'll never understand people who don't get it. The Zen of Python says that explicit is better than implicit... but apparently not with block endings, where implicit is better.


Yes, and a dedent is explicit. It would be implicit if, for example, it assumed that break and return always ended a block and without any further syntax, treated following lines as such.


Besides the compiler or a human reader, there's something else that needs to recognize block endings: your editor. In a language with explicit block endings, you tab once to get to the correct indent level; if you close the block, the editor knows to move the ending back left a level. With Python, the editor can't auto-indent and get it right the first time because any number of open blocks may have implicitly ended before your new line of code. Hit tab, and maybe some number of backspaces. Decide after the fact to insert another new line of code between the implicit block ending and what had been the next line? Tab and some number of backspaces again. Number of times you have to tell your editor a block is over in a language with explicit block endings? Once. Number of times you have to tell it with Python? Potentially unlimited.


Why do you hit tab? Normally, any decent editor is going to dedent automatically when it sees a statement like break, continue, return, etc., and indent automatically when a line ends with a colon. If a block ends, you have to press enter and backspace. Quite rarely, two blocks will end, so you will have to hit backspace twice. Compare this with a language with braces, where you have to hit } then enter then } again.


Cutting and pasting a code snippet you found on the web? In a language with explicit block ends, mark a region and tell your editor to re-indent it. Done. In Python, encounter nuisance. Python's REPL becomes needlessly hard to use because a single screw-up in whitespace can break the code.


Any decent editor will have a way to paste relative to current indent. And there are many other good reasons to use a decent editor.

People make python's whitespace into a bigger issue than it really is. Over 10 years of programming, I've run into small problems with it perhaps 2 times; the first time it took me 20 minutes to fix things up, and the second time, 10 minutes.

The advantages of significant indentation are abundant: there are no competing bracing styles; there is no issue of 'code says one thing, indent says another'; there is less visual noise. Every time I switch between python and JS, I'm irritated by noisy semicolons and braces. I want to focus on the logic of my code, not on meaningless markup which duplicates indentation.

Saving vertical space is also a nice bonus: less need to scroll up and down.
posted by rainy at 7:34 PM on April 16, 2012


Every time I switch between python and JS, I'm irritated by noisy semicolons and braces

Ta-da! We've gone full circle.
posted by whir at 8:02 PM on April 16, 2012


really ruby elsif what the standard else if with two extra characters annoyed you

Yeah, everyone knows it should be elif.

fi
posted by furtive at 8:20 PM on April 16, 2012 [3 favorites]


Why does it matter if the compiler emits 'problematic' code

We're not talking "problematic" in the oh-noes-its-different-than-I'm-used-to sense, we're talking about code that has some careless assumptions -- for example, the assumption that nobody's touched Object.prototype.

But they are completely missing the point that all of it gets optimized into something else anyway.

At least at the time that article was written, things like that for-loop lookup were hardly certain to be optimized into something else, and there are often still real performance differences between various expressions. The question is whether those developing the Closure compiler are paying attention to that.

so long as the code still runs in a standards compliant browser? You're not supposed to be reading it.

Aren't you? The web isn't just another VM. A significant contributing factor to its success as a platform has been the fact that "view source" can take you a long way toward understanding how something's done. Minifiers bring some real benefits to the table that are particularly important with the way we're doing HTTP over wireless right now, but they're also working against that, and I'm not sure the long-term consequences are going to be negligible.
posted by weston at 8:21 PM on April 16, 2012


Every time I switch between python and JS, I'm irritated by noisy semicolons and braces


And I miss the clear guidance of their winks and hugs.
posted by weston at 8:27 PM on April 16, 2012 [1 favorite]


I think it might be how much time you spend in each language... for me it's probably 97% python, 3% JS, so I'm really quite used to significant indentation by default. But even before I started programming, when I was reviewing which languages were available and recommended, python's clean and neat layout immediately jumped out at me. It's been said that perfection is not when there is nothing left to add, but when there is nothing left to take away... I've always felt python was designed with more of an aesthetic eye than most other languages. I only wish it allowed var-names with dashes instead of var_names! (and enforced whitespace around mathematical operators).
posted by rainy at 9:11 PM on April 16, 2012


And you could have neat things like saying z = ∪ ∀x ∈ s1 : ∀y ∈ s2 y ∈ x, which is to say the union of all sets in s1 that contain all the items in all the sets in s2.

Again, this seems like a minor problem though. Sets are not a common data type that would be built into any language. We don't have the × or ÷ symbol on a standard keyboard so we use * and / instead. If set arithmetic was common in programming languages, there would probably be sensible operators for it. Since it's not, there aren't any operators for it, unicode or not. The vast majority of code you write and use in a language is going to be using methods rather than operators, and personally I think that's a good thing because once you get to more obscure functionality I would rather see a name than a symbol.

Last I checked, people aren't born knowing C, or even Python. In fact, they may not even know how to type.

My point is that if you're coming up with a communication system between a person and a computer, using speech is probably the worst way to do it. It's very difficult for even humans to differentiate between similar speech sounds; the only way language works in practice is that they can tell from context what the other speaker is saying. At best voice recognition is a workaround for situations where you can't use a better system and can fall back on speech. Personally I've never seen a system where adding speech to an input setup that includes a keyboard and mouse actually makes things better, only systems where for whatever reason the input system is not very good and speech fills in the gap.
posted by burnmp3s at 5:03 AM on April 17, 2012


To add to the flames, we now have Semicolon: a language of semicolons. Here's hello, world in Semicolon, from that page:
;;;;⁏;;⁏;;;
⁏ ;;;;;;⁏⁏;;⁏;⁏
⁏ ;;;;;;⁏⁏;⁏⁏;;
;;⁏⁏ ;;⁏ ;;;;;;⁏⁏;⁏⁏⁏⁏
⁏ ;;;;;;⁏;;;;;
⁏ ;;;;;;⁏⁏⁏;⁏⁏⁏
⁏ ;;;;;;⁏⁏;⁏⁏⁏⁏
;;⁏⁏ ;;;;;;⁏⁏
⁏;;⁏ ;;;;;;⁏⁏;⁏⁏;;
⁏ ;;;;;;⁏⁏;;⁏;;
⁏ ;;;;;;⁏;;;;⁏
⁏ ;;;;;;;⁏;⁏;
⁏ ;;  ;
Sets are not a common data type that would be built into any language.

Sets are a built-in data type in Python and have a number of operations defined.
posted by grouse at 11:55 AM on April 17, 2012 [3 favorites]


Sets are a built-in data type in Python and have a number of operations defined.

Yeah by built in I meant a primitive type that would generally have unique operators, you're right that many languages have some form of sets built into their libraries. Depending on the language there's a pretty hazy distinction between what's in the language itself and what's in the standard libraries of the language, but I'm talking more along the lines of Python's list type, which is a fundamental part of the language and thus might have unique operators rather than normal methods like every other class. Sets in Python don't have symbols for operators for the same sorts of reasons that most of the classes in NumPy don't.
posted by burnmp3s at 12:15 PM on April 17, 2012


Not to get pedantic about it, but sets have been built-in types (i.e., language features, not library modules) since Python 2.6 or so, and they support various common operations by operator overloading.

>>> a = set([1, 2, 3, 4, 5])
>>> b = set([1, 3, 5])
>>> a > b
True
>>> a & b
set([1, 3, 5])
>>> a - b
set([2, 4])


I agree that the line between libraries and built-in features is a hazy one; sets did start out as library modules (with the operators intact as above) before they got gobbled up by the language proper.

About unicode, there is a recent language I was reading about on Lambda the Ultimate that used a variety of fancy-looking unicode symbols for stuff, such as (if I recall correctly) using the "⇒" character to denote function production:

f a ⇒ a × a

The language could also use a sort of fallback 7-bit ASCII syntax, which seems like a sane approach to me, so the above could also be written as:

f a => a * a

Unfortunately I haven't been able to dig up the link to the language in question, but knowing Lambda the Ultimate, it was probably one of those theorem provers everyone there seems to be so keen on. One thing I like about the idea of having a dual syntax such as the above is that you could implement it quite easily in the lexer. On the other hand I'm not certain I'd want to actually write using the extended symbols, at least not without some editor / IDE support.
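
A minimal sketch of what that dual syntax might look like at the lexer level, using a made-up table of Unicode/ASCII equivalents (an illustration of the idea only, not Fortress's actual rules):

# Hypothetical Unicode spellings and their ASCII fallbacks.
EQUIVALENTS = {
    "⇒": "=>",
    "×": "*",
    "∪": "|",
}

def normalize(source):
    """Rewrite Unicode operator spellings to their ASCII fallbacks before tokenizing."""
    # A real lexer would do this per token so string literals are left alone.
    for fancy, plain in EQUIVALENTS.items():
        source = source.replace(fancy, plain)
    return source

print(normalize("f a ⇒ a × a"))  # f a => a * a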

Incidentally, I'm enjoying how Semicolon, the language, is written in JavaScript, though its author does not seem to rely on ASI himself.
posted by whir at 1:05 PM on April 17, 2012 [2 favorites]


Pascal also has built-in sets, I think.
posted by weston at 1:12 PM on April 17, 2012


there is a recent language I was reading about on Lambda the Ultimate that used a variety of fancy-looking unicode symbols for stuff

Are you thinking of Fortress?
posted by a snickering nuthatch at 2:01 PM on April 17, 2012


Ah, yes, Fortress was it, thanks. (Examples can be seen in this PDF if anyone is interested.)
posted by whir at 2:13 PM on April 17, 2012


Sets are not a common data type that would be built into any language.

CFSet and its Cocoa counterpart NSSet are foundation classes in Objective C.
posted by Blazecock Pileon at 2:53 PM on April 17, 2012


A set is basically a hash/dictionary consisting only of keys. Any language with a hash feature can easily offer sets as a subset of functionality.
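
To make that concrete, here's a toy Python sketch with a dict as the backing store and members as keys (illustration only, not how Python's real set type is implemented):

class DictSet(object):
    """A toy set built on a dict: members are keys, values are ignored."""

    def __init__(self, items=()):
        self._d = {item: True for item in items}

    def add(self, item):
        self._d[item] = True

    def __contains__(self, item):
        return item in self._d

    def union(self, other):
        merged = DictSet(self._d)      # iterating a dict yields its keys
        merged._d.update(other._d)
        return merged

s = DictSet([1, 2, 3]).union(DictSet([3, 4]))
print(2 in s)  # True
print(5 in s)  # False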
posted by JHarris at 4:07 PM on April 17, 2012 [1 favorite]


Having a core data structure as part of the language libraries and baking the data structure into the language definition are two different things.

For instance, if your language supports syntax like ["foo", "bar"] to make a list of two elements, "foo" and "bar", then lists are fundamentally part of the language. If your language's core library includes a List object, then they are not.

Many languages include some sort of shorthand for list or map generation, I can't think of any off the top of my head that have the same for sets, but I wouldn't be surprised if one exists. It's another step to include shorthand operations like union. There are languages where it's easy to add new operators, and in those it wouldn't be hard to make a library that adds a domain-specific language providing what delmoi wanted, for instance. Still, as a core language feature? Seems a bit overkill.
posted by aspo at 4:10 PM on April 17, 2012


Many languages include some sort of shorthand for list or map generation, I can't think of any off the top of my head that have the same for sets,

$ python
Python 2.7.1+ (r271:86832, Apr 11 2011, 18:05:24)
[GCC 4.5.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> {1,2,3}
set([1, 2, 3])


Anyway, I don't see why special syntax for a construct makes it more built-in to a language than merely being in the core library.
posted by kenko at 4:12 PM on April 17, 2012 [2 favorites]


Having a core data structure as part of the language libraries and baking the data structure into the language definition are two different things.

I'm not positive about other languages, but you're not really going to be able to do much in Objective C without the core library, so essentially that object container is "baked in" if you plan to do any real work. It's really a pedantic distinction, anyway. The set is more or less part of that language and has been for a couple decades, now.
posted by Blazecock Pileon at 4:31 PM on April 17, 2012 [1 favorite]


I'm not sure what kind of rigorous distinction between "language definition" and "language libraries" can be maintained, anyway.
posted by kenko at 4:34 PM on April 17, 2012


No, it's not. Because the issue isn't "here's a core set of libraries that do x"; the issue is "why isn't there special syntax as part of the language to handle x." The core library is built on top of the language spec. The core library is not baked into the language spec. Your Objective C compiler isn't doing anything special to handle core libraries; to it, they are just another library.
posted by aspo at 4:35 PM on April 17, 2012


Maybe I'm not explaining myself properly. But in any case, without the core library, you're not going to do a whole lot with Objective C, and whether something is baked in or not is mostly irrelevant to using the language to do something useful. A set is a set and a set operation is a set operation. As far as abstraction goes, Python doesn't deal with sets any differently than other languages that build in support for sets through a core library that you're almost certainly going to have to use anyway.
posted by Blazecock Pileon at 4:54 PM on April 17, 2012


We rarely name a core data type 'set' because the mathematical term 'set' covers an awesomely wide swath of data structures with vastly different implementations. Arrays, vectors, tuples, records, lists, skip lists, etc. are not sets because they distinguish members by position. An associative array, a hash, most trees, etc. are all conceptually a key set with a mapping, not just the mapping by itself. A union-find data structure is a set optimized for discovering disjoint partitions. A bitfield and a bloom filter are sets optimized for different membership problems on known ranges, etc. You must pick the right set-like data structure for your problem, not hide the details away behind polymorphism.
posted by jeffburdges at 5:27 PM on April 17, 2012 [2 favorites]


Ah, that's such a classic. I can only get through a couple minutes of that before I start laughing my ass off again. Still, voice recognition has come a long way in five years. Maybe we'll be able to speak LOGO by 2020.
So I assume you don't think Siri actually works, right?

Seriously, I have voice recognition on my Android phone. It works fine for sending text messages, and is faster than using the on-screen or physical keyboard.
We're not talking "problematic" in the oh-noes-its-different-than-I'm-used-to sense, we're talking about code that has some careless assumptions -- for example, the assumption that nobody's touched Object.prototype.
In generated code? I seriously doubt it. If you're not writing code that depends on Object.prototype being modified, I seriously doubt the compiler would generate code that does.

In any event, JSMin breaks code that Closure doesn't, so you can hardly say that JSMin works better than Closure.
My point is that if you're coming up with a communication system between a person and a computer, using speech is probably the worst way to do it.
It's better, for me personally, than using a tiny on-screen keyboard on a cellphone to compose text messages. Obviously a full-sized keyboard would be better, but it's hardly the "worst" way to do it so far. Lots of people actually use dictation rather than typing, even on full-sized keyboards. But again, there is a serious lack of imagination here. Dictation works in my experience; it's not 100% accurate, but you would have to design the editor and UI to quickly correct errors - or let you select from multiple possibilities in the event of an ambiguity.

Plus, lots of programmers have wrist problems. One of the best coders I knew in college was already having issues.
Yeah by built in I meant a primitive type that would generally have unique operators,
I think they were in Pascal. But why not? I use sets and maps all the time in my code, often enough that set operations would be useful. SQL is almost all about operations on sets, and people use it all the time.

As other people mentioned, lots of languages have support for maps, and maps are basically just sets of mappings. You can use a map just to store keys, and lots of languages have syntax for maps. But you typically can't do union or intersection on sets of mappings purely via syntax, and a union would be the most useful, essentially just merging two sets together.
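
In Python 3 terms, dict key views support set operators and a union of two mappings is effectively a merge; a small sketch (the sample data is made up):

a = {"host": "example.com", "port": 80}
b = {"port": 8080, "debug": True}

# Union of the key sets, via the set operator on dict key views.
print(a.keys() | b.keys())  # e.g. {'host', 'port', 'debug'} (order may vary)

# Merging the two mappings; values from b win on overlapping keys.
merged = dict(a, **b)
print(merged["port"])  # 8080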

But anyway "I personally don't use sets very much" is not at all the same thing as "No one uses sets very much, so no one needs set syntax".

The real issue, though, is parallelism. You can write a loop version of whatever set operation you need pretty easily, but writing code that can take advantage of multiple cores would be much more difficult.
You must pick the right set-like data structure for your problem, not hide the details away behind polymorphism.
It would be great if the compiler could figure out what you need by analyzing the code, either at compile time or even based on profiling data.
posted by delmoi at 6:50 PM on April 17, 2012


So I assume you don't think Siri actually works, right?

Siri works fine for basic stuff, like transcribing text messages or not finding abortion clinics in Manhattan. But I just wouldn't use it to do programming work, a situation so highly context-dependent that straightforward transcription is too literal to do the job properly. It wasn't a jab at you, personally. Just a witty observation that the technology is clearly still too young to do the job correctly.
posted by Blazecock Pileon at 9:58 PM on April 17, 2012


I watched a demo at a user group, given by a guy who'd moved entirely to voice driven coding because of his RSI. It worked quite smoothly for him, largely because he'd taken the time to develop macros that worked well with the voice recognition software. It was less dictating code than generating it with a lot of verbal keystrokes that were relatively unambiguous.
posted by fatbird at 10:12 PM on April 17, 2012 [2 favorites]


There isn't afaik anyone who believes in large-scale automated compile-time decisions about data structures.

You could obviously mask the data structure behind abstractions that changed implementations dynamically based upon run-time profiling. I suspect, however, that lazily determined data structures represent an enormous cost without much benefit in most situations, much like lazy evaluation generally.

You could otoh write profiling tools that recommend changes based upon usage patterns; presumably this exists for MySQL, etc.

GAP changes data types dynamically when you conclude that a group has a representation that permits more speedy operations.
posted by jeffburdges at 1:22 AM on April 18, 2012


You could obviously mask the data structure behind abstractions that changed implementations dynamically based upon run-time profiling. I suspect, however, that lazily determined data structures represent an enormous cost without much benefit in most situations, much like lazy evaluation generally.

I believe MATLAB does this with matrices, dynamically switching out sparse, banded, or dense representations of the data depending on what elements in the array are nonzero. Pretty cool.
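
A toy sketch of that heuristic in Python/NumPy terms, just to make the idea concrete (an illustration of the general concept, not MATLAB's actual machinery; the threshold is arbitrary):

import numpy as np
from scipy import sparse

def choose_representation(matrix, density_threshold=0.1):
    """Store the matrix sparsely if few entries are nonzero, densely otherwise."""
    density = np.count_nonzero(matrix) / float(matrix.size)
    if density < density_threshold:
        return sparse.csr_matrix(matrix)  # compressed sparse row storage
    return np.asarray(matrix)             # plain dense array

m = np.eye(1000)                          # identity matrix: only 0.1% of entries are nonzero
print(sparse.issparse(choose_representation(m)))  # True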
posted by a snickering nuthatch at 5:54 AM on April 18, 2012 [1 favorite]


An associative array, a hash, most trees, etc. are all conceptually a key set with a mapping, not just the mapping by itself. A union-find data structure is a set optimized for discovering disjoint partitions. A bitfield and a bloom filter are sets optimized for different membership problems on known ranges.

You're overstating the issue here.

If a language is going to include a set type, it just needs to cover most use cases, and it shouldn't surprise people too much by behaving unlike a mathematical set. It should assume as little as possible about the types of data it's going to contain. Just those requirements eliminate the bloom filter (weird and surprising because it reports false positives, which is not desirable in a default set type) and bitfields (they assume a known range, which won't even make sense for most types of values you want the set to contain).

So then it comes down to associative arrays (which usually use hashes internally; not sure why you listed them separately...?) and trees (which could then be divided into trees whose structure is determined by the keys vs. trees whose structure is determined by disjoint sets).

But I think we can also get rid of disjoint-sets as a possible default set type for a language, because iirc just testing for membership is O(n). This is not an issue for union-find algorithms because for those, we already know all of the members up front; we just want to keep track of their structure. But it's a big issue for most other uses of sets.

So of the possibilities you listed, we're down to associative arrays and trees based on keys. Which one would make most sense depends on the rest of the language (e.g. what's the approach to concurrency, if applicable?); it's not like the choice is some overwhelming profusion of possibilities. In fact, it might not even be a choice between mutually exclusive possibilities at all, considering the fact that associative arrays need a collision-resolution strategy (so they've got to have pointers off to some other structure that will help deal with that) and tree nodes could have more than two children (and those children could then be stored contiguously in an array) - that is to say, a structure associating keys to values can be by degrees array-like or tree-like.
posted by a snickering nuthatch at 6:54 AM on April 18, 2012 [1 favorite]


Ahh, interesting that MATLAB does that with matrices; like I said, GAP does that with groups, but both benefit from rigorously knowing their algorithms' performance characteristics, GAP hierarchically and MATLAB based upon the matrix's contents. There are presumably still situations where, depending upon whether you're doing additions or multiplications or Gaussian elimination, you actually want a different data structure than MATLAB selects.

I never intended to claim that bitfields, bloom filters, or union-find were serious contenders for a default set type, but students shouldn't imagine that every set they see in their algorithms class is an associative array either. You probably shouldn't place these weird data structures into a SetLike typeclass in Haskell or interface in Java, while you do want abstractions like Haskell's Data.Foldable and Data.Traversable.
posted by jeffburdges at 8:16 AM on April 18, 2012


Blazecock Pileon: "Maybe I'm not explaining myself properly. But in any case, without the core library, you're not going to do a whole lot with Objective C."
NSString *string = @" no kidding. ";
NSString *trimmedString = [string stringByTrimmingCharactersInSet:
                                  [NSCharacterSet whitespaceAndNewlineCharacterSet]];
posted by Deathalicious at 8:53 AM on April 18, 2012 [1 favorite]




This thread has been archived and is closed to new comments