Comments on: Breaking the Coppersmith-Winograd barrier
http://www.metafilter.com/109901/Breaking-the-CoppersmithWinograd-barrier/
Comments on MetaFilter post "Breaking the Coppersmith-Winograd barrier" (Tue, 29 Nov 2011)

Breaking the Coppersmith-Winograd barrier
http://www.metafilter.com/109901/Breaking-the-CoppersmithWinograd-barrier
<em>For twenty years, the fastest known algorithm to multiply two n-by-n matrices, due to Coppersmith and Winograd, took a leisurely O(n^2.376) steps. Last year, though, buried deep in his <a href="http://www.maths.ed.ac.uk/pg/thesis/stothers.pdf">PhD thesis</a>, Andy Stothers discussed an improvement to O(n^2.374) steps. And today, <a href="http://www.cs.berkeley.edu/~virgi/">Virginia Vassilevska Williams</a> of Berkeley and Stanford released a <a href="http://www.cs.berkeley.edu/~virgi/matrixmult.pdf">breakthrough paper [pdf]</a> that improves the matrix-multiplication time to a lightning-fast O(n^2.37<strong>3</strong>) steps. [<a href="http://www.scottaaronson.com/blog/?p=839">via</a>]</em> <br /><br />First section of the Introduction copied from the paper:
<em>The product of two matrices is one of the most basic operations in mathematics and computer science. Many other essential matrix operations can be efficiently reduced to it, such as Gaussian elimination, LUP decomposition, the determinant or the inverse of a matrix [1]. Matrix multiplication is also used as a subroutine in many computational problems that, on the face of it, have nothing to do with matrices. As a small sample illustrating the variety of applications, there are faster algorithms relying on matrix multiplication for graph
transitive closure (see e.g. [1]), context free grammar parsing [19], and even learning juntas [12].
Until the late 1960s it was believed that computing the product C of two n x n matrices requires essentially a cubic number of operations, as the fastest algorithm known was the naive algorithm which indeed runs in O(n^3) time. In 1969, Strassen [18] excited the research community by giving the first subcubic time algorithm for matrix multiplication, running in O(n^2.808) time. This amazing discovery spawned a long line of research which gradually reduced the matrix multiplication exponent omega over time. In 1978, Pan [13] showed omega < 2.796. The following year, Bini et al. [4] introduced the notion of border rank and obtained omega < 2.78. Schonhage [16] generalized this notion in 1981, proved his tau-theorem (also called the asymptotic sum inequality), and showed that omega < 2.548. In the same paper, combining his work with ideas by Pan, he also showed omega < 2.522. The following year, Romani [14] found that omega < 2.517. The first result to break 2.5 was by Coppersmith and Winograd [9] who obtained omega < 2.496. In 1986, Strassen introduced his laser method which allowed for an entirely new attack on the matrix multiplication problem. He also decreased the bound to omega < 2.479. Three years later, Coppersmith and Winograd [10] combined Strassen's technique with a novel form of analysis based on large sets avoiding arithmetic progressions and obtained the famous bound of omega < 2.376 which has remained unchanged for more than twenty years.
In 2003, Cohn and Umans [8] introduced a new, group-theoretic framework for designing and analyzing matrix multiplication algorithms. In 2005, together with Kleinberg and Szegedy [7], they obtained several novel matrix multiplication algorithms using the new framework, however they were not able to beat 2.376.
Many researchers believe that the true value of omega is 2. In fact, both Coppersmith and Winograd [10] and Cohn et al. [7] presented conjectures which if true would imply omega = 2. Recently, Alon, Shpilka and Umans [2] showed that both the Coppersmith-Winograd conjecture and one of the Cohn et al. [7] conjectures contradict a variant of the widely believed sunflower conjecture of Erdos and Rado [11]. Nevertheless, it could be that at least the remaining Cohn et al. conjecture could lead to a proof that omega = 2.</em>
posted by albrecht at 13:48 on November 29
tags: matrixmultiplication, coppersmithwinograd, math, computerscience, cs
By: NerdcoreRising
http://www.metafilter.com/109901/Breaking-the-CoppersmithWinograd-barrier#4052928
Aren't the static coefficients on the algorithms huge?
What I'm getting at is, isn't calling them "fast" a bit of a mislabel for normal-sized matrices? They grow more slowly asymptotically, yes, but wouldn't the matrices have to be astronomically large before these algorithms beat the higher-Big-O ones in practice?
Probably a nitty comment, but I think calling them "fast" indicates a misunderstanding of the basic concepts discussed here.
posted by NerdcoreRising at 13:53 on November 29
By: the painkiller
http://www.metafilter.com/109901/Breaking-the-CoppersmithWinograd-barrier#4052931
I can honestly say with all sincerity that I never saw this coming, not in a million years.
posted by the painkiller at 13:54 on November 29
By: beagle
http://www.metafilter.com/109901/Breaking-the-CoppersmithWinograd-barrier#4052937
Question: other than the fact that the previous omega had remained in place for over 20 years, why is it a breakthrough to go from 2.376 to 2.373, which is an improvement of less than 0.13 percent? If the potential target is 2, wouldn't a breakthrough be something that moved the mark a lot closer to 2?
posted by beagle at 13:56 on November 29
By: 0xFCAF
http://www.metafilter.com/109901/Breaking-the-CoppersmithWinograd-barrier#4052938
I was going to ask about static coefficients as well. I don't have the exact quote, but I think Knuth said something like "O(k<sup>n</sup>) is generally worse than O(n<sup>k</sup>), but in this case, <i>k</i> is 64".
posted by 0xFCAF at 13:57 on November 29
By: theodolite
http://www.metafilter.com/109901/Breaking-the-CoppersmithWinograd-barrier#4052940
The epigraph on that Stothers thesis is pretty heavy duty. I was hoping there would be another one at the beginning of every section, and then at the end of the paper it turns out that matrix multiplication disproves the possibility of G > 1 wherein G represents the number of deities found in the set of real numbers.
posted by theodolite at 13:58 on November 29
By: GallonOfAlan
http://www.metafilter.com/109901/Breaking-the-CoppersmithWinograd-barrier#4052941
What direct implications will this have in computing?
posted by GallonOfAlan at 13:59 on November 29
By: yoink
http://www.metafilter.com/109901/Breaking-the-CoppersmithWinograd-barrier#4052943
So...um...why is this important, exactly? I'm not doubting its importance, I hasten to add, but if someone could point out some of the practical applications (or even theoretical applications) for those of us who haven't a clue it'd be nice.
posted by yoink at 14:00 on November 29
By: jedicus
http://www.metafilter.com/109901/Breaking-the-CoppersmithWinograd-barrier#4052947
<em>other than the fact that the previous omega had remained in place for over 20 years, why is it a breakthrough to go from 2.376 to 2.373, which is an improvement of less than 0.13 percent?</em>
<a href="http://www.wolframalpha.com/input/?i=x%5E2.374+-+x%5E2.373+from+0+to+1000">Check out this graph of the difference between x^2.374 and x^2.373</a>. It doesn't take a very large x (in computer science terms) before the efficiency gain becomes very significant.comment:www.metafilter.com,2011:site.109901-4052947Tue, 29 Nov 2011 14:02:32 -0800jedicusBy: aubilenon
http://www.metafilter.com/109901/Breaking-the-CoppersmithWinograd-barrier#4052949
beagle: Well, it's an exponent. So, disregarding all lower order coefficients, if n=1000, it's a full 2.1% faster. And for n=1000000, it's about 4.2%. And so on.
But yeah, you need a pretty enormous data set to actually make a big difference.
posted by aubilenon at 14:03 on November 29
By: qxntpqbbbqxl
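[A sketch of the arithmetic above: ignoring the enormous constant factors discussed in the thread, the step-count ratio between exponents 2.376 and 2.373 is just n^0.003.]

```python
# Best-case speedup from an exponent drop of 0.003, ignoring constants.
# For n-by-n matrices, the step-count ratio is n^2.376 / n^2.373 = n^0.003.
for n in [1_000, 1_000_000, 10**9]:
    ratio = n ** 0.003
    print(f"n = {n:>13,}: {100 * (ratio - 1):.1f}% fewer steps")
```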
http://www.metafilter.com/109901/Breaking-the-CoppersmithWinograd-barrier#4052950
<i>What direct implications will this have in computing?</i>
None. Like NerdcoreRising said, it's probably not practical. You know how it is when a new Olympic sprinting record is set? This is like that. It will not make a difference for ordinary joggers.
posted by qxntpqbbbqxl at 14:04 on November 29
By: jepler
http://www.metafilter.com/109901/Breaking-the-CoppersmithWinograd-barrier#4052957
<a href="http://rjlipton.wordpress.com/2011/11/29/a-breakthrough-on-matrix-product/">this blog on the "breakthrough"</a> posits that the new algorithm doesn't do better than the old algorithm for matrices smaller than n✕n where n=2<sup>1000</sup>, in other words not in any real problems where the new method runs faster than the old one: <blockquote>This value is far above the Bekenstein Bound for the number of particles that could be fit into a volume the size of the observable universe without collapsing it into a black hole. In this sense the algorithm itself is even beyond galactic. </blockquote>comment:www.metafilter.com,2011:site.109901-4052957Tue, 29 Nov 2011 14:08:03 -0800jeplerBy: aspo
http://www.metafilter.com/109901/Breaking-the-CoppersmithWinograd-barrier#4052959
jedicus: <a href="http://www.wolframalpha.com/input/?i=x%5E2.374+%2Fx%5E2.373+from+0+to+10000">This</a> the more interesting version of that graph. That's not even 1% faster at 10k. Even at 1m it's still only about 1.3% faster.comment:www.metafilter.com,2011:site.109901-4052959Tue, 29 Nov 2011 14:08:06 -0800aspoBy: jepler
http://www.metafilter.com/109901/Breaking-the-CoppersmithWinograd-barrier#4052962
errr, fail on my part: 2<sup>1000</sup> is the matrix size where the new method is twice as fast as the old best method.
posted by jepler at 14:09 on November 29
By: TheJoven
http://www.metafilter.com/109901/Breaking-the-CoppersmithWinograd-barrier#4052973
Almost all engineering computational analysis is based on matrix math. This will make a huge difference on computational fluid dynamics as well as finite element analysis. On my desktop computer I do FEA analysis with 100,000 node systems. This will speed all of this up significantly, especially the supercomputer based systems.
posted by TheJoven at 14:17 on November 29
By: RobotVoodooPower
http://www.metafilter.com/109901/Breaking-the-CoppersmithWinograd-barrier#4052975
I think jepler's right, and <a href="http://www.scottaaronson.com/blog/?p=839#comment-34668">this comment</a> seems to indicate that these recent improvements aren't breakthroughs but just ricing* the CW algorithm -- which no one uses in practice either. Mere mortals like you and I will keep on multiplying our matrixes with rusty tire irons, as always.
<small>*apologies for using 'rice' as a verb in a mathematical context, but it was funny</small>
posted by RobotVoodooPower at 14:18 on November 29
By: dephlogisticated
http://www.metafilter.com/109901/Breaking-the-CoppersmithWinograd-barrier#4052976
Meanwhile, the fastest known Al Gore rhythm clocks in at a tepid 60 BPM, and is frankly just embarrassing to listen to.
posted by dephlogisticated at 14:19 on November 29
By: Jon_Evil
http://www.metafilter.com/109901/Breaking-the-CoppersmithWinograd-barrier#4052977
Computer science types: how big do <em>n</em>s get for things that we do, like encoding video, or transmitting game data on the internet?
Moreover, maybe there's cool stuff that previously wasn't practical because it involved multiplying huge matrices, but now that it's easier to do, we can rely on it more for computing stuff.
posted by Jon_Evil at 14:19 on November 29
By: homotopy
http://www.metafilter.com/109901/Breaking-the-CoppersmithWinograd-barrier#4052988
<em>This will speed all of this up significantly, especially the supercomputer based systems.</em>
No it won't. Coppersmith-Winograd (and this improvement) is never used in practice, because the matrices on which it performs better are so large that they can't be stored on modern computer hardware. These algorithms are of theoretical interest, and that is it.
As NerdcoreRising mentioned right at the top of the thread, there are coefficients in front of the n^omega term, and they are HUGE.
posted by homotopy at 14:27 on November 29
By: sixohsix
http://www.metafilter.com/109901/Breaking-the-CoppersmithWinograd-barrier#4053004
Here's something embarrassing: my training as an engineer taught me that matrix multiplication was O(n^3) and I never had any reason to doubt it.
Anyway, these algorithms may be impractical for most systems, but advancing new mathematical arguments that lead to theoretically-better algorithms can help solve practical problems in other domains.
It's also good when work brings a problem closer to a theoretical bound. Mathematicians are "pretty sure" that the perfect algorithm is O(n^2) but they can't prove it. Similarly, mathematicians were "pretty sure" that x^n + y^n = z^n has no positive integer solutions for n > 2, but it took over three centuries of work to prove it. The theorems discovered along the way were tremendously valuable.
Math is not always about eureka moments. Incremental advances are worth noticing.
posted by sixohsix at 14:35 on November 29
By: louigi
http://www.metafilter.com/109901/Breaking-the-CoppersmithWinograd-barrier#4053044
I wasn't expecting this. Even had I been, I would not have expected to learn about it on Metafilter, but on the arXiv. Thanks.
posted by louigi at 15:01 on November 29
By: atrazine
http://www.metafilter.com/109901/Breaking-the-CoppersmithWinograd-barrier#4053051
<em>Here's something embarrassing: my training as an engineer taught me that matrix multiplication was O(n^3) and I never had any reason to doubt it.</em>
To be fair, the entire mathematical and CS establishment was quite sure of that until the 60's. I'm not sure if the actual algorithms used in practice are the naive O(n^3) ones, but they don't use these asymptotically fast ones.
For multiplication of numbers on the other hand, the naive 'long multiplication' method is not used. Instead, modern maths libraries use methods based on fast Fourier transforms.
posted by atrazine at 15:07 on November 29
By: ryanrs
http://www.metafilter.com/109901/Breaking-the-CoppersmithWinograd-barrier#4053057
RobotVoodooPower, you realize that's a racial slur, yes?
posted by ryanrs at 15:10 on November 29
By: delmoi
http://www.metafilter.com/109901/Breaking-the-CoppersmithWinograd-barrier#4053061
Does anyone use the better than O(n<sup>3</sup>) algorithms in practice?
posted by delmoi at 15:13 on November 29
By: Wolfdog
http://www.metafilter.com/109901/Breaking-the-CoppersmithWinograd-barrier#4053072
Many undergraduates use a well-known O(n^2) algorithm, regarding the increased speed (and simplicity) as a favorable tradeoff for getting the wrong answer.
posted by Wolfdog at 15:19 on November 29
By: scose
http://www.metafilter.com/109901/Breaking-the-CoppersmithWinograd-barrier#4053076
delmoi: Strassen's algorithm (the n^2.808 one) is faster IRL for large matrices, but it's less numerically stable.
posted by scose at 15:23 on November 29
By: fistynuts
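[Strassen's scheme is short enough to sketch. Below is a minimal, unoptimized version, assuming power-of-two sizes and plain lists of lists (real BLAS-style libraries are far more careful): it does seven recursive half-size multiplies instead of eight, which is where the exponent log2(7) ≈ 2.808 comes from.]

```python
def matmul_naive(A, B):
    """Naive O(n^3) product of square list-of-lists matrices."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def strassen(A, B, cutoff=2):
    """Strassen's O(n^2.808) multiply; n must be a power of two."""
    n = len(A)
    if n <= cutoff:
        return matmul_naive(A, B)
    h = n // 2
    def quad(M):  # split into four h-by-h quadrants
        return ([r[:h] for r in M[:h]], [r[h:] for r in M[:h]],
                [r[:h] for r in M[h:]], [r[h:] for r in M[h:]])
    add = lambda X, Y: [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]
    sub = lambda X, Y: [[x - y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]
    A11, A12, A21, A22 = quad(A)
    B11, B12, B21, B22 = quad(B)
    # Seven recursive products instead of eight:
    M1 = strassen(add(A11, A22), add(B11, B22), cutoff)
    M2 = strassen(add(A21, A22), B11, cutoff)
    M3 = strassen(A11, sub(B12, B22), cutoff)
    M4 = strassen(A22, sub(B21, B11), cutoff)
    M5 = strassen(add(A11, A12), B22, cutoff)
    M6 = strassen(sub(A21, A11), add(B11, B12), cutoff)
    M7 = strassen(sub(A12, A22), add(B21, B22), cutoff)
    C11 = add(sub(add(M1, M4), M5), M7)
    C12 = add(M3, M5)
    C21 = add(M2, M4)
    C22 = add(sub(add(M1, M3), M2), M6)
    # Reassemble the quadrants into one matrix (rows concatenate).
    return ([r1 + r2 for r1, r2 in zip(C11, C12)] +
            [r1 + r2 for r1, r2 in zip(C21, C22)])
```

[The extra additions and subtractions are what cost it numerical stability relative to the naive triple loop.]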
http://www.metafilter.com/109901/Breaking-the-CoppersmithWinograd-barrier#4053102
it'll be fine, just throw more sand at it
interesting post thanks .. having a fully geeked up day ... brilliant
posted by fistynuts at 15:40 on November 29
By: formless
http://www.metafilter.com/109901/Breaking-the-CoppersmithWinograd-barrier#4053124
<em>but if someone could point out some of the practical applications (or even theoretical applications) for those of us who haven't a clue it'd be nice.</em>
In addition to the fluid dynamics and other apps others have mentioned, matrix and vector multiplication is also very important in many modern social algorithms.
For example, apps like PageRank for computing relevancy in search. Recommender Systems like Amazon's and Netflix's use matrix operations to compute recommendations.
Anytime you're dealing with large graphs, chances are you're going to boil it down to a matrix problem. And the web and social relationships on social networking sites are really just several giant graphs / networks.
posted by formless at 15:57 on November 29
By: Pyry
http://www.metafilter.com/109901/Breaking-the-CoppersmithWinograd-barrier#4053146
Those tend to be very sparse matrices, though, which have their own specialized sets of algorithms.
posted by Pyry at 16:08 on November 29
By: monkeymadness
http://www.metafilter.com/109901/Breaking-the-CoppersmithWinograd-barrier#4053149
I definitely understand why folks are wondering why this matters in practice, but to a Discrete Mathematician this sort of news is awesome. Algorithms have been around far longer than computers, and algorithmic complexity/asymptotics are really cool to theoreticians.
There is some algorithm that does this in the fastest possible way for arbitrarily large matrices. We just haven't found it yet. This algorithm is closer than we've ever gotten, and that's cool.
posted by monkeymadness at 16:11 on November 29
By: Wolfdog
http://www.metafilter.com/109901/Breaking-the-CoppersmithWinograd-barrier#4053155
<i>There is some algorithm that does this in the fastest possible way for arbitrarily large matrices.</i>
Just in the interest of hair-splitting - in the spirit of the thread about a 0.001 improvement in an asymptotic exponent - that's not necessarily true. There might be O(n^(2+e)) algorithms for every e>0 but no O(n^2) algorithm.
posted by Wolfdog at 16:15 on November 29
By: monkeymadness
http://www.metafilter.com/109901/Breaking-the-CoppersmithWinograd-barrier#4053156
...and for a perspective on something similar, I was at a talk recently by Donald Knuth. Someone in attendance asked him if he thinks <a href="http://en.wikipedia.org/wiki/P_versus_NP_problem">P=NP</a>. He said he supposes the answer is `yes' but we'll probably never actually find a polynomial time algorithm for an NP-complete problem. We'll just be able to show one exists. It might not be practically useful, but it would be extremely awesome.
posted by monkeymadness at 16:17 on November 29
By: monkeymadness
http://www.metafilter.com/109901/Breaking-the-CoppersmithWinograd-barrier#4053169
<b>Wolfdog</b>: I'm not sure. I mean, we're not looking for a smallest number strictly greater than x here. We're looking for an algorithm that is `the best'. If there is a requirement that an algorithm be expressible in, say, at most a billion kazillion words, then among all of those possible there is a best one. But, yeah, who knows? Now you have me wondering if for every <i>e</i> there could be an algorithm in O(n^(2+<i>e</i>)).
posted by monkeymadness at 16:24 on November 29
By: feloniousmonk
http://www.metafilter.com/109901/Breaking-the-CoppersmithWinograd-barrier#4053179
I don't know about large matrices being transmitted, Jon_Evil, but in terms of their use in computation, where something like this might theoretically be pretty useful, <em>n</em> will vary widely in size, but generally speaking, to a computer, a million is not a big number.
I think something like this sort of improvement, which only provides benefit under certain circumstances, is less likely to have a direct impact on workloads on a home PC or gaming console. I am not a scientific computing guy, but this seems like something that would be useful in things like weather simulation.
posted by feloniousmonk at 16:43 on November 29
By: ryanrs
http://www.metafilter.com/109901/Breaking-the-CoppersmithWinograd-barrier#4053180
<i>you have me wondering if for every e there could be an algorithm in O(n^(2+e))</i>
Yeah, bubble sort with strategically placed nops.
posted by ryanrs at 16:44 on November 29
By: delmoi
http://www.metafilter.com/109901/Breaking-the-CoppersmithWinograd-barrier#4053181
<blockquote><i>Just in the interest of hair-splitting - in the spirit of the thread about a 0.001 improvement in an asymptotic constant - that's not necessarily true. There might be O(n^(2+e)) algorithms for every e>0 but no O(n^2) algorithm.</i></blockquote>
Well, first of all monkeymadness was talking about hypothetical unknown algorithms. (Actually, the algorithm in the paper is just a re-application of the same construction 8 times; previously it was proven that re-applying it twice made it faster, but it was unknown whether more re-applications helped. It would be impossible to measure the difference in practice.)
The second thing is that what we are talking about is 'arbitrary': the algorithm is faster on <i>arbitrary</i> matrices even if it's not faster on 'practical' matrices or 'matrices that could be encoded given all the matter in the universe'.
posted by delmoi at 16:44 on November 29
By: sixohsix
http://www.metafilter.com/109901/Breaking-the-CoppersmithWinograd-barrier#4053189
<em>> For multiplication of numbers on the other hand, the naive 'long multiplication' method is not used. Instead, modern maths libraries use methods based on fast fourier transforms.</em>
atrazine, I think you just blew my mind. I really should have taken some computational algorithm courses.
posted by sixohsix at 16:53 on November 29
By: Canageek
http://www.metafilter.com/109901/Breaking-the-CoppersmithWinograd-barrier#4053202
When I took linear algebra back in my first year of Uni my prof mentioned that he used to work for a bank on gigantic matrices that they had to do all sorts of diagonalizations and such to work with: Would this help that type of work? Also, isn't PageRank based on matrices? What about scientific software such as Gaussian or other quantum mechanical simulations?
posted by Canageek at 17:02 on November 29
By: qxntpqbbbqxl
http://www.metafilter.com/109901/Breaking-the-CoppersmithWinograd-barrier#4053316
<i>For multiplication of numbers on the other hand, the naive 'long multiplication' method is not used. Instead, modern maths libraries use methods based on fast fourier transforms.
atrazine, I think you just blew my mind. I really should have taken some computational algorithm courses.</i>
My BS degree has "computational mathematical sciences" in its title, and I didn't know that. <a href='http://numbers.computation.free.fr/Constants/Algorithms/fft.html'>Neat.</a>comment:www.metafilter.com,2011:site.109901-4053316Tue, 29 Nov 2011 18:41:07 -0800qxntpqbbbqxlBy: delmoi
http://www.metafilter.com/109901/Breaking-the-CoppersmithWinograd-barrier#4053318
<blockquote><i>My BS degree has "computational mathematical sciences" in its title, and I didn't know that. Neat.</i></blockquote>
Woah, me neither. Of course for most arithmetic you're going to use the hardware; this stuff would only be used for arbitrary-precision numbers. Still interesting.
posted by delmoi at 18:45 on November 29
By: Pyry
http://www.metafilter.com/109901/Breaking-the-CoppersmithWinograd-barrier#4053329
It's because multiplying two numbers is like a <a href="http://www-structmed.cimr.cam.ac.uk/Course/Convolution/convolution.html">convolution</a> in their digits, and the FFT allows large convolutions to be done <a href="https://ccrma.stanford.edu/~jos/st/Convolution_Theorem.html">efficiently</a>.
posted by Pyry at 19:00 on November 29
By: indubitable
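[A toy illustration of that point, in pure Python with base-10 digits and none of the real engineering found in bignum libraries: FFT the digit sequences, multiply pointwise, and inverse-FFT to recover the convolution of the digits.]

```python
import cmath

def fft(a, invert=False):
    """Recursive radix-2 FFT; len(a) must be a power of two."""
    n = len(a)
    if n == 1:
        return list(a)
    even = fft(a[0::2], invert)
    odd = fft(a[1::2], invert)
    sign = 1 if invert else -1
    out = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(sign * 2j * cmath.pi * k / n)
        out[k] = even[k] + w * odd[k]
        out[k + n // 2] = even[k] - w * odd[k]
    return out

def multiply(x, y):
    """Multiply nonnegative integers by convolving their digit lists via FFT."""
    a = [int(d) for d in str(x)][::-1]   # least-significant digit first
    b = [int(d) for d in str(y)][::-1]
    size = 1
    while size < len(a) + len(b):
        size *= 2
    fa = fft([complex(v) for v in a] + [0j] * (size - len(a)))
    fb = fft([complex(v) for v in b] + [0j] * (size - len(b)))
    # Pointwise product in the frequency domain = convolution of the digits.
    conv = fft([u * v for u, v in zip(fa, fb)], invert=True)
    digits = [round(c.real / size) for c in conv]  # un-normalize and round
    return sum(d * 10 ** i for i, d in enumerate(digits))  # carry via place value
```

[Production libraries use far more refined transforms in this family, and only above certain operand sizes; this is just the convolution idea made executable.]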
http://www.metafilter.com/109901/Breaking-the-CoppersmithWinograd-barrier#4053330
<i>Those tend to be very sparse matrices, though, which have their own specialized sets of algorithms.</i>
So are CFD and FEA systems, for that matter.
posted by indubitable at 19:03 on November 29
By: quanta and qualia
http://www.metafilter.com/109901/Breaking-the-CoppersmithWinograd-barrier#4053351
I'll wait for the quantum computers to arrive.
posted by quanta and qualia at 19:26 on November 29
By: pwnguin
http://www.metafilter.com/109901/Breaking-the-CoppersmithWinograd-barrier#4053460
<a href="http://www.metafilter.com/109901/Breaking-the-CoppersmithWinograd-barrier#4053329">Pyry</a>: "<i>It's because multiplying two numbers is like a <a href="http://www-structmed.cimr.cam.ac.uk/Course/Convolution/convolution.html">convolution</a> in their digits, and the FFT allows large convolutions to be done <a href="https://ccrma.stanford.edu/%7Ejos/st/Convolution_Theorem.html">efficiently</a></i>"
I never thought I'd have the opportunity to discuss this on mefi, but convolutions baffle me. Is there any intuitive notion of the concept, or have I simply reached my limit in mathematics?
posted by pwnguin at 21:22 on November 29
By: Harvey Kilobit
http://www.metafilter.com/109901/Breaking-the-CoppersmithWinograd-barrier#4053462
The best remark I can imagine on this result involves reusing a quote from <a href="http://www.informit.com/articles/article.aspx?p=1193856">an interview with Donald Knuth</a>:<blockquote>A lot of the recent literature is academic one-upmanship of limited interest to me; authors these days often introduce arcane methods that outperform the simpler techniques only when the problem size exceeds the number of protons in the universe. Such algorithms could never be important in a real computer application. I read hundreds of such papers to see if they might contain nuggets for programmers, but most of them wind up getting short shrift.</blockquote>
posted by Harvey Kilobit at 21:24 on November 29
By: twoleftfeet
http://www.metafilter.com/109901/Breaking-the-CoppersmithWinograd-barrier#4053475
For people joining the conversation late, I will attempt to summarize. I believe that "big O" refers to orgasms and that some nerds have figured out how to improve these.
<small>Let's get more people through the door! Free cocaine!</small>
posted by twoleftfeet at 21:45 on November 29
By: troublesome
http://www.metafilter.com/109901/Breaking-the-CoppersmithWinograd-barrier#4053499
<a href="http://www.youtube.com/watch?v=TH6Acpzurb0">My immediate mental reaction upon reading this headline. </a>comment:www.metafilter.com,2011:site.109901-4053499Tue, 29 Nov 2011 22:16:04 -0800troublesomeBy: jepler
http://www.metafilter.com/109901/Breaking-the-CoppersmithWinograd-barrier#4053945
<blockquote><i>Computer science types: how big do </i>n<i>s get for things that we do, like encoding video, or transmitting game data on the internet?</i></blockquote>
In 3D graphics, the most frequent matrix multiplication is with n=4, because points in 3d space are represented as (x,y,z,1) and the matrices that operate on them are 4x4. (Introducing a 4th value makes the 3d coordinates into <a href="http://en.wikipedia.org/wiki/Homogeneous_coordinates">homogeneous coordinates</a>, which "are ubiquitous in computer graphics because they allow common operations such as translation, rotation, scaling and perspective projection to be implemented as matrix operations")
posted by jepler at 07:43 on November 30
By: IAmBroom
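[The homogeneous-coordinates trick fits in a few lines (the function names here are made up for illustration). Translation is not a linear map on plain (x, y, z), but with the extra 1 it becomes a single 4x4 matrix multiply:]

```python
def mat4_mul_vec4(M, v):
    """Apply a 4x4 matrix to a homogeneous 4-vector."""
    return [sum(M[i][j] * v[j] for j in range(4)) for i in range(4)]

def translation(tx, ty, tz):
    """Translation by (tx, ty, tz) as a 4x4 matrix: impossible as a
    3x3 linear map, trivial in homogeneous coordinates."""
    return [[1, 0, 0, tx],
            [0, 1, 0, ty],
            [0, 0, 1, tz],
            [0, 0, 0, 1]]

point = [2.0, 3.0, 4.0, 1.0]               # (x, y, z, 1)
moved = mat4_mul_vec4(translation(10, 0, 0), point)
print(moved)  # [12.0, 3.0, 4.0, 1.0]
```

[Rotation, scaling, and perspective matrices compose the same way, which is why GPUs burn through n=4 multiplies by the billion.]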
http://www.metafilter.com/109901/Breaking-the-CoppersmithWinograd-barrier#4054434
<em>I never thought I'd have the opportunity to discuss this on mefi, but convolutions baffle me. Is there any intuitive notion of the concept, or have I simply reached my limit in mathematics?
</em>
pwnguin, I'm an optical engineer, so my metaphor to make it "intuitive" will come from that field. Feel free to propose your own analogy...
Imagine two pictures, printed on B&W slide film (that is, "white" areas are transparent and "black" areas are opaque). Place one picture on top of photographic paper, turn on the exposure light, and move the other picture across at a steady rate. Turn off the light, and develop the paper.
The resulting "picture" is the convolution of the two images, left-to-right. A different convolution image would occur if you moved the second picture top-to-bottom.
It should be obvious that right-to-left is the same as left-to-right, and fairly clear that holding picture 2 still while picture 1 moves is no different (convolution is commutative).
If picture two has a white line across its center, any white spots on the center of picture one will show up as white. Anything less (a broken white line, or only a white spot) will result in a less-than-perfectly-white spot on the convolution image (mathematically, only a full-width white line integrates 1x1 over the entire sweep, from left side to right side).
Analogously, a black line across either picture will result in all black spots on the other picture showing up as "true black", instead of dark gray: the integral of 0 x f(x) over all infinity = 0 (where f(x) is real).
The <a href="http://en.wikipedia.org/wiki/Convolution">animations atop this Wikipedia page</a> demonstrate this, for a 1-D case.comment:www.metafilter.com,2011:site.109901-4054434Wed, 30 Nov 2011 11:37:00 -0800IAmBroomBy: wjzeng
http://www.metafilter.com/109901/Breaking-the-CoppersmithWinograd-barrier#4055489
Let's not forget that behind every theorem (or algorithm improvement) lies a human story. A check back at the original blog post contains an <a href="http://www.scottaaronson.com/blog/?p=839"> update and cautionary warning </a> from Andrew Stothers's experience.
<em>On the other hand, I warned Andrew that his LinkedIn profile, which unselfconsciously mentions improvements to his Word and Excel skills as one of the benefits of his PhD research breaching the Coppersmith-Winograd barrier, might have earned him a place in scientific folklore forever!</em> - Scott Aaronson
posted by wjzeng at 02:56 on December 1
By: Jpfed
http://www.metafilter.com/109901/Breaking-the-CoppersmithWinograd-barrier#4063093
<em>I never thought I'd have the opportunity to discuss this on mefi, but convolutions baffle me. Is there any intuitive notion of the concept, or have I simply reached my limit in mathematics?
posted by pwnguin at 23:22 on November 29 [+] [!]</em>
There are a couple intuitive ways you can approach convolution. But they both have the same math behind them.
Let's say you're singing into a microphone in an anechoic chamber. There's the pressure waves coming out of your mouth; the pressure at the microphone as a result of those waves forms a function of time we'll call V(t) (for Voice).
Ok, now let's have you produce the same pressure waves with your mouth, but in St. Peter's Basilica, which is a huge cathedral with crazy acoustics. There are all sorts of echoes and stuff. The microphone is getting a new pressure signal now, which we'll call B(t) (for Basilica).
Now, we're getting a very fast stream of fluctuating pressure readings from that microphone, but there's a certain regularity to the signal. We can explain why the B(t) signal is the way it is. But we're going to need one more function to help us figure it out- a function that encodes the characteristics of the space.
Let's say that, for where we're standing relative to the microphone inside the cathedral, there's a function E(t) that encodes the strength of the echoes. If we made a very short impulse at our mouth, a little "pop" sound, E(t) would have a spike at the beginning and decaying spikes later on as progressively more- and more-damped echoes reached the microphone.
Ok, now we can explain the values of B(t). B(t) gets some direct contribution from V(t) - our voice going straight into the microphone. But it could also get some contribution from V(t-1) (some audio you emitted in the past) that took a millisecond to echo back. We know the strength of the echo at 1 ms; that's E(1). The total contribution of your echoed voice from 1ms ago is V(t-1)*E(1). But there will also be echoes from further in the past; an echo from 2ms ago would be V(t-2)*E(2), and 3ms ago would be V(t-3)*E(3), etc. To cover all the bases, we should sum up all of the echoes that could have occurred since the moment we started singing (t = 0).
To write this concisely, I'm going to say that B(T) = sum t from 0 to T of V(T-t)*E(t). Well, looky there, the formula for convolution! Yet more concisely, B = V * E
-----------------------------------
Another fun example of convolution would be finding the probability distribution of the sum of two independent random variables (e.g. die rolls).
Let's say we have two independent random variables X and Y; we'll say that X(v) is the probability that X takes on the value v, and Y(v) is the probability that Y takes on the value v.
What is the probability distribution S(v) where S = X + Y? In other words, for any given value v, what is the probability that X and Y sum to v?
Well, for any given value v that S could take on, there are a bunch of scenarios that could have resulted in that value. Maybe X = v and Y = 0. Maybe X = (v-1) and Y = 1. Or X = (v-2) and Y = 2. And so on. Each of these scenarios is mutually exclusive, so we can just add up their probabilities to find the probability that S = v.
Since X and Y are independent, the probability that X = v and Y = 0 is just X(v) * Y(0); the probabilities for the other scenarios can be computed analogously (just taking the product of their constituent probabilities).
So we would express this as S(v) = X(v) * Y(0) + X(v-1) * Y(1) + X(v-2) * Y(2) ... etc.
Equivalently, S(v) = sum of t from 0 to v of X(v-t) * Y(t)
BAM!
Convolution, baby.
posted by Jpfed at 21:33 on December 5
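[Jpfed's dice example runs as written: the sum over mutually exclusive scenarios is exactly a convolution of the two probability lists.]

```python
def convolve(p, q):
    """Distribution of X + Y for independent X, Y given as probability lists."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += pi * qj   # P(X = i) * P(Y = j) feeds total i + j
    return out

die = [1 / 6] * 6              # P(X = 1..6); index 0 means face 1
two_dice = convolve(die, die)  # P(X + Y = 2..12); index 0 means total 2
print(round(two_dice[5], 4))   # P(total = 7) = 6/36, about 0.1667
```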