Massively Parallel & Infinitely Tiny
January 31, 2012 2:15 PM

While Moore's Law continues to drive consumer and manufacturer expectations of technological advancement, frequency scaling has given way to parallel scaling, and our most visible indicator of ever-increasing transistor density is ever-multiplying cores. Welcome to the Parallel Jungle where heterogeneous cores and ultimately the cloud offer far faster growth rates in parallelism than even described by Moore's Law.

Meanwhile, progress in software algorithms (especially important for the utilization of parallel processing) has been shown to also outpace Moore's Law. One example:

". . . a benchmark production planning model solved using linear programming would have taken 82 years to solve in 1988, using the computers and the linear programming algorithms of the day. Fifteen years later – in 2003 – this same model could be solved in roughly 1 minute, an improvement by a factor of roughly 43 million."

It's not as if materials haven't been keeping up, either. IBM has announced that it has reduced the number of atoms required to store one bit of data from one million to twelve, or, at room temperature, closer to 150. On a similar scale, the University of New South Wales has created the world's thinnest silicon wire, only four atoms wide.

Finally, graphene has a new competitor in the form of molybdenite, a similar substance without the bandgap issues that plague graphene transistors. This means that even as the physical limits of silicon are approached, we already have the ability to make transistors only 1nm in size. Graphene, not to be outdone, has demonstrated many new properties recently, including piezoelectricity, permeability, photovoltaic behavior and the ability to drastically improve the performance of lithium-ion batteries.
posted by I've wasted my life (31 comments total) 22 users marked this as a favorite
 
Using a graph to describe such a vague concept as technological sophistication makes sense in games like Civilization and pretty much nowhere else.
posted by LogicalDash at 2:34 PM on January 31, 2012 [2 favorites]


Moore's Law specifically deals with the rate at which the "number of transistors that can be placed inexpensively on an integrated circuit doubles". It's mechanically quantifiable, not abstract or vague in the slightest.
posted by I've wasted my life at 2:39 PM on January 31, 2012 [4 favorites]


has anyone seen the 100-core processors that are coming out? they're sort of obscenely brilliant...basically, the architecture consists of a 4-core processor/gpu/system-on-a-chip, but with an additional area of the chip (about 1/3 to 1/2) divided into 60 or so micro-cores...each one dedicated to one extension (like .jpg or .mp3, etc.). mostly for mobile platforms, these chips use startlingly tiny amounts of power...say you're just listening to music...then the only part of the processor powered on is a tiny tiny bit of it...i've been reading about them on engadget...i'll find a link...brb
posted by sexyrobot at 2:55 PM on January 31, 2012 [2 favorites]


ah...here!
posted by sexyrobot at 2:57 PM on January 31, 2012 [2 favorites]


In the late 90s, Intel gambled (and lost) on the ability of compilers to automatically find instruction-level parallelism (ILP) in good old serial code; this is why the Itanium architecture never really caught on (well, one reason anyway). Luckily, they were too big to fail—instead, they took the instruction set designed by their direct competitor (amd64 / x86_64) and implemented it on their newer chips (under a different marketing name, naturally).

It's interesting to contemplate the alternate world, where Itanium won and by now is the instruction set used by most computers with more than 2GB of RAM (i.e., any new desktop or laptop). Would there be a second source for chips that run Itanium instructions? Would it be AMD, or another player? Unlike x86_64 (as far as I have been able to determine from a few cursory searches), would Intel have asserted patents on the Itanium instruction set if a second source moved to bring compatible chips to market?

Itanium and ILP are a bit of a derail, except for this: In the age of Itanium, there was little or no buy-in to the idea that average developers would be rewriting their applications to take advantage of Itanium; instead, the promise was that compilers would bridge the gap, turning your C, C++ or Fortran source code into well-performing object code for the Itanium architecture; actually providing this compiler is where Intel stumbled. I never read of an "itanium-specific" language. On the other hand, successive vector extensions to x86 and x86_64 do see people writing their performance-critical code in a way that the compiler can find the parallelism, and the same goes for thread-level parallelism through language extensions like OpenMP. Not only that, but we're also seeing new languages (like google's golang) that add new language-level features for parallelism.
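To make that last point concrete, here's roughly what explicitly-annotated parallelism looks like in Haskell (not one of the languages I mentioned; just a toy sketch with invented numbers):

import Control.Parallel (par, pseq)

-- The programmer, not the compiler, marks what may run in parallel:
-- 'par' sparks evaluation of its left argument on another core, and
-- 'pseq' forces the right-hand work first so the spark isn't wasted.
parSum :: [Int] -> Int
parSum xs = left `par` (right `pseq` (left + right))
  where
    (front, back) = splitAt (length xs `div` 2) xs
    left  = sum front
    right = sum back

main :: IO ()
main = print (parSum [1 .. 2000000])
-- compile with ghc -threaded and run with +RTS -N to use multiple cores

Either way, the burden shifts to the programmer to express the parallelism, rather than to an Itanium-style compiler to discover it in serial code.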

So, my question is this: why the change in attitude between then and now? Another decade for the implications of end of CPU frequency scaling to sink in? Or is there no change in attitude, but that no language would actually be fast on Itanium? Or something else altogether?
posted by jepler at 3:00 PM on January 31, 2012 [6 favorites]


Remember when the Transmeta Crusoe self-programming, learning-chip type things were all the rage?
posted by Chekhovian at 3:12 PM on January 31, 2012


Thanks for the great post! Here are some other interesting recent papers on graphene (not exactly related to transistors, but still nifty).

Anyone know when a commercial device utilizing graphene is expected on the market?
posted by beepbeepboopboop at 3:21 PM on January 31, 2012


Is it worth noting that progress in algorithms is much less predictable than progress in hardware?
posted by benito.strauss at 3:53 PM on January 31, 2012 [2 favorites]


This is one of the many reasons functional languages will increase in popularity. If you learn to write programs that aren't cavalier about state, parallel execution becomes much easier.
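For example, a toy sketch in Haskell (my own made-up example, nothing more): because the function below touches no shared mutable state, handing its evaluations out to multiple cores can't introduce races or change the answer.

import Control.Parallel.Strategies (parMap, rdeepseq)

-- A pure function: no shared state, so parallel evaluation is safe.
collatzLength :: Int -> Int
collatzLength 1 = 0
collatzLength n
  | even n    = 1 + collatzLength (n `div` 2)
  | otherwise = 1 + collatzLength (3 * n + 1)

-- parMap evaluates the list elements in parallel; purity guarantees the
-- result is identical to the ordinary sequential map.
main :: IO ()
main = print (maximum (parMap rdeepseq collatzLength [1 .. 100000]))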
posted by phrontist at 3:54 PM on January 31, 2012


Welcome to the Parallel Jungle where heterogeneous cores and ultimately the cloud offer far faster growth rates in parallelism than even described by Moore's Law.

I don't think Moore's law means what you think it means.
posted by ZenMasterThis at 3:59 PM on January 31, 2012


So, my question is this: why the change in attitude between then and now? Another decade for the implications of end of CPU frequency scaling to sink in? Or is there no change in attitude, but that no language would actually be fast on Itanium? Or something else altogether?

A crap-ton of work's gone into parallel programming since then... The OS and language tech wasn't there. Also, the tyranny of C/C++ has been broken... Developers got used to the idea of coding at the system level in alternative languages, thanks to Objective-C. Stuff like Haskell, and a rebirth of Lisp, have brought functional language mojo into the mainstream, and with it parallel programming techniques that have made their way to other languages.
posted by Slap*Happy at 4:09 PM on January 31, 2012


ZenMasterThis: Oops, that could have been phrased better! That particular sentence is a reference to Figure 13 in the linked article, but I did word it poorly, as Moore's Law pertains only to transistor density on a die, not to parallelism across cores or the cloud. Sorry.
posted by I've wasted my life at 4:15 PM on January 31, 2012


Stuff like Haskell, and a rebirth of Lisp, have brought functional language mojo into the mainstream

Explosion of "real programming" in Javascript == rebirth of Lisp
posted by thedaniel at 4:59 PM on January 31, 2012


heterogeneous cores and ultimately the cloud offer far faster growth rates in parallelism than even described by Moore's Law.
If by "faster" you mean "slower" in terms of the actual increase in computation per dollar. I pointed this out the other day. In 1990 you had, basically 25mhz 486s. By 2000 you had 2.8ghz Pentium 4s. A hundred fold increase in clock speed. A chip, costing the same dollar amount was 100 times faster not counting the speedups you got through more transistors (a 1ghz core i7 core would be much, much faster then a 1ghz 386)

On the other hand, what's the fastest consumer chip you can get today, vs 2000? You could get an 3.6ghz 8core AMD chip or 4 cores at 3.7ghz.

So realistically, you're looking at a maybe a 10-12 fold increase in speed, given perfect parallelism in 10 years.

Calling a 10 fold increase in speed more then a 100 fold increase is obviously incorrect.
posted by delmoi at 5:24 PM on January 31, 2012 [1 favorite]


Moore's Law specifically deals with the rate at which the "number of transistors that can be placed inexpensively on an integrated circuit doubles". It's mechanically quantifiable, not abstract or vague in the slightest.
Moore's law was just a quantification of laziness, IMO. There's no physical reason why 22nm chips couldn't have been created in the 1970s. It was really more about the trade-off between investing in research vs. the payoff. Had Intel spent more on research, they could have pushed things faster. Actually, Moore's law was violated in the 90s: chip speed increased faster than Moore's law predicted, when Intel and AMD were fiercely competing with each other.

People act like it's the speed of light or something.
posted by delmoi at 5:28 PM on January 31, 2012 [2 favorites]


Oh, and the improvement in speed from the 486 to the P4 was a lot more than 100x, because the 486 took multiple clock cycles to run individual instructions, while the P4 could crunch through more than one per clock. Those benefits are still accruing, though.
posted by delmoi at 5:30 PM on January 31, 2012


Venray announces a new spin on an old (electrical engineer's) dream - processor-in-memory.
posted by newdaddy at 5:51 PM on January 31, 2012 [2 favorites]


thedaniel: wat? How is JS lispy?
posted by phrontist at 5:51 PM on January 31, 2012


Also scala is neat.
posted by flaterik at 6:03 PM on January 31, 2012


Haskell is way neater though. :P
posted by jeffburdges at 6:13 PM on January 31, 2012 [1 favorite]


I find "usefulness" neat, so I tend to disagree ;)
posted by flaterik at 9:51 PM on January 31, 2012


How is JS lispy? How is closure formed?
posted by flabdablet at 3:44 AM on February 1, 2012 [1 favorite]


Closures are one of the most powerful features of ECMAScript (javascript) but they cannot be properly exploited without understanding them…

and it looks like this:
js> make_adder = function(x) { return function(y) { return y + x } }
(function (x) {return function (y) {return y + x;};})
js> plustwo = make_adder(2)
(function (y) {return y + x;})
js> plustwo(3)
5
posted by jepler at 5:50 AM on February 1, 2012


Closures are just functions. In JavaScript you can pass 'raw' functions around, except, weirdly, functions and objects are more or less the same thing. Functions have properties, like objects in C++/Java/etc., and they are also callable, so you end up with functions that carry properties that can be used inside themselves, like this:


//assigns the variable helloworld to the function we just defined
var helloworld = function(hi){
   this.hi = hi; //assigns this.hi to the parameter hi
   this.sayhi = function(planet){ return hi + planet; };
};



So now we have something like a class: it's a function that, when called, assigns its 'this.hi' and 'this.sayhi' properties to a string and another function.

Here's where it gets weird, though. In JavaScript you can both call a function directly and use it as a constructor, so
helloworld("earth"); and var himars = new helloworld("mars"); do slightly different things. The second one works like calling a constructor in Java: new creates a fresh object, and 'this' inside the function refers to it. The first one acts in a way that's very different from anything in C/C++/Java: 'this' ends up referring to the global object (at least in non-strict mode), so the properties get set there instead. (I think that's how it works.) Also, the line where you declare var helloworld is ordinary code that gets run, but it just assigns the function to a variable -- nothing inside the function body executes until you actually call it.

Very strange, IMO. I guess when you get used to it, it makes sense but it can be kind of confusing, especially since most javascript documentation is written for people who are just dabbling and they don't really go into it much. I've kind of had to just figure a lot of this stuff out by experimentation :P.

One cool thing is the Closure Compiler for JavaScript. It lets you add annotations to JavaScript code, and then compiles your code into a more compact form (like a minifier, but way more intense).
posted by delmoi at 8:03 AM on February 1, 2012


(btw the 'Closure Compiler' is a terrible name - the compiler itself has nothing to do with 'closures' in general, other than that JavaScript has them, so it's an example of overloading a term - you can't find it by googling 'javascript closure' because then you just get stuff about closures in JavaScript. Ironically, it was created by Google. You'd think they'd pick a more Google-friendly name)
posted by delmoi at 8:09 AM on February 1, 2012


@jepler: Would there be a second source for chips that run Itanium instructions?

HP would, in theory, be able to make them. No one else, however.

we're also seeing new languages (like google's golang) that add new language-level features for parallelism.

This is not new. Dijkstra proposed the foundational concepts in 1968, and they were expanded on by Hoare and Hansen. Practical languages supporting these concepts were around by the late 1970s/early 1980s (e.g. Mesa, Concurrent Pascal, Modula-2, Ada, Occam, Chill).

And if you want to cover functional languages here, Lisp has been around since 1958.

why the change in attitude between then and now?

My take on this (having lived through it) is that Intel was never able to show *enough* of a performance gain with Itanium & ILP over x86 and coarse parallelism to convince the industry to jump ship from four or five decades of experience with "traditional" architectures/languages/concepts. You'd think by now, Intel would have learned that a paradigm shift must be radically, orders-of-magnitude, objectively better than incremental improvement to the status quo for the industry to drop what it's doing and follow (see: iAPX 432, i960, i8089).
posted by kjs3 at 11:05 AM on February 1, 2012


I came in here for the parallelism. I stayed for the rebirth of Lisp.
posted by DU at 5:03 AM on February 2, 2012 [1 favorite]


Closures are just functions

Not quite; a closure is a function plus its execution context, i.e. the collection of non-local variables it had access to when it was defined.

So if I'm using a language that supports closures, and I have variables that are local to a function, and that function's return value is itself an inner function that makes use of those variables as well as its own, then the inner function is a closure, and it's closed over those variables.

A JavaScript example (I used spidermonkey to make and test this):
js> function counter() {
    var count=0;
    return function () { return ++count; };
}
js> foo=counter();
function () {
    return ++count;
}
js> bar=counter();
function () {
    return ++count;
}
js> foo()
1
js> foo()
2
js> foo()
3
js> foo()
4
js> bar()
1
js> bar()
2
js> foo()
5
js> 
In this example counter, foo and bar are all functions, but only foo and bar are closures, and they're closed over count.

Note in particular that foo and bar get access to different count variables, because count is not like the static locals you might have seen doing vaguely similar jobs in C; it's a proper local variable, created anew on each call to counter. In a language with first-class functions but no support for closures, count would be deallocated on exit from counter. With closure support, both instances of it stay accessible for as long as foo and bar exist.
posted by flabdablet at 9:47 PM on February 2, 2012


I'll elaborate slightly: closures are simply objects created by the absence of arguments, and this often doesn't even require mutable variables.

You must create an entirely new object in Java, complete with half a page of tedious boilerplate, but you simply omit an excess argument in a functional language.

In Haskell, you can implement a caching lookup table around an arbitrary function in a couple of lines of custom code, because you're creating the necessary objects implicitly.

You accept the benefits of placing indexes on appropriate SQL columns, yes? Imagine doing that for an arbitrary computation inside your compiled code. And employing any caching algorithm imaginable: Bloomier filters, whatever.
posted by jeffburdges at 6:28 PM on February 3, 2012


Jeff, could you give an example of the missing-argument construction in Javascript?
posted by flabdablet at 9:28 PM on February 3, 2012


Your counter function has type void -> void -> int, meaning it takes two voids and returns an int, but does so in curried form, meaning the call counter() fills in only the first argument.

All functions are curried by default in functional languages. You create a function that adds one to its argument by writing, say, (+1), which partially applies addition to create a function of type Num -> Num.
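For instance (a throwaway illustration, nothing more):

-- Partial application in action: 'add' takes its two arguments one at a
-- time, so supplying just the first yields a new one-argument function.
add :: Int -> Int -> Int
add x y = x + y

addOne :: Int -> Int
addOne = add 1            -- behaves the same as (+1)

main :: IO ()
main = print (map addOne [1, 2, 3])   -- prints [2,3,4]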

You'd achieve the same effect in C++ by creating an object implementing operator() that possesses the variable count. Your JavaScript example creates this object implicitly by employing a lambda expression on the third line of counter().

You can achieve considerably more object-oriented functionality than operator() by simply currying functions, of course, but you must understand how the machine interprets your code.

In practice, you must explicitly use lambda expressions to signal the compiler to optimize for currying, even in a functional language. Imagine I want a lookup-table-based cache object for my expensive_function of type Int -> Meow. I might implement this in Haskell by writing:

import Data.Vector (generate, (!))

cacher n f = \x -> if x >= n then f x else t ! x
  where t = generate n f

cached_expensive_function = cacher 100 expensive_function

You'll observe that cacher has the polymorphic type Int -> (Int -> a) -> Int -> a because it takes two arguments itself, the n of type Int and the f of type Int -> a, but also returns a lambda expression \x -> .. which takes another Int, i.e. \ is Haskell's analog of javascript's function keyword. And cached_expensive_function has exactly the same type Int -> Meow as expensive_function, but it caches the first 100 values in a lookup table.
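If you want to actually run that, here's a self-contained version; expensive_function (and Int standing in for Meow) are made up purely for illustration:

import Data.Vector (generate, (!))

-- Same cacher as above, restated so this compiles on its own.
cacher :: Int -> (Int -> a) -> Int -> a
cacher n f = \x -> if x >= n then f x else t ! x
  where t = generate n f

-- Hypothetical stand-in for the expensive function being cached.
expensive_function :: Int -> Int
expensive_function i = sum [1 .. i * 1000]

cached_expensive_function :: Int -> Int
cached_expensive_function = cacher 100 expensive_function

main :: IO ()
main = do
  print (cached_expensive_function 7)    -- served from the lookup table t
  print (cached_expensive_function 500)  -- outside the table, calls f directly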

If I'd written cacher f n x = if ... instead then I'd get a function which ultimately returned the same values, but perhaps the compiler won't optimize for calls that fill in only the first two arguments, thus creating the object with a lookup table t.

You might imagine a JIT compiler learning from experience to produce the version applied to two arguments, but that's a pretty deep optimization for runtime.
posted by jeffburdges at 12:44 AM on February 4, 2012




This thread has been archived and is closed to new comments