The faster Fourier transform
January 18, 2012 10:33 PM

A quicker picker-upper. "[A] group of MIT researchers will present a new algorithm that, in a large range of practically important cases, improves on the fast Fourier transform."
posted by Ardiril (34 comments total) 22 users marked this as a favorite
 
/me looks forward to audiophiles arguing that MP3s sound better than files encoded with the faster Fourier transform.
posted by kaibutsu at 10:48 PM on January 18, 2012 [13 favorites]


2 Fast 2 Fourier
posted by chavenet at 10:50 PM on January 18, 2012 [44 favorites]


It's an interesting approach, a variation on some application-specific arithmetic optimizations I've seen in synthesis papers. One thing to note, though, is that any frequency domain decomposition (DCT, DFT, FFT, and probably even wavelets) is dominated by where the data comes from, especially in hardware MP3 players or cell phones. We're able to pack arithmetic into such small, low-power chunks that the actual computations are practically irrelevant.
posted by spiderskull at 10:54 PM on January 18, 2012


I read somewhere recently that improvements in algorithms are responsible for as much improvement in computing speed as improvements in hardware, but can't find the reference. Anybody know if that's truly the case?

In any case, a very cool development, and improvements to fundamental, old, and widely used algorithms are incredibly impressive and challenging.
posted by Llama-Lime at 10:59 PM on January 18, 2012


@Llama: I'd say no reference is needed. This is "background knowledge" across all fields of computer science, and any first-level course on algorithms and complexity teaches it.
posted by knz at 11:05 PM on January 18, 2012


With processor speed relatively tapped out, I'd think improved algorithms would be the only way to improve computing speed for non-parallelizable problems.
posted by BrotherCaine at 11:15 PM on January 18, 2012 [5 favorites]


Reference definitely needed. I could believe that argument in regard to bandwidth and storage, but not compute power.

That is to say, I believe advances in video conferencing and portable music players have as much (or more) to do with improved compression algorithms as with increasing bandwidth and storage capacity.
posted by ryanrs at 11:17 PM on January 18, 2012 [2 favorites]


A recording, on the other hand, of all possible instruments each playing all possible notes at once wouldn’t be sparse — but neither would it be a signal that anyone cares about.
they assume much

MP3 uses the DCT, but in practice the DCT is just a variant of the FFT (using cosines rather than sines; since cosine is an even function you don't get complex numbers in certain operations - or something like that). In the audio synthesis world many programs do DCT analysis and informally call it an FFT or even just a Fourier transform.
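If you want to see that relationship concretely, here is a small numpy sketch (just an illustration of the usual mirror-and-phase-shift construction, nothing from the paper itself): the unnormalized DCT-II falls out of a plain FFT of the mirrored signal.

    import numpy as np
    from scipy.fft import dct   # reference implementation, only used for comparison

    x = np.random.randn(8)
    N = len(x)

    # Mirror the signal so it becomes even-symmetric, then take an ordinary length-2N FFT.
    y = np.concatenate([x, x[::-1]])
    Y = np.fft.fft(y)

    # A per-bin phase shift turns the first N FFT bins into (unnormalized) DCT-II coefficients.
    k = np.arange(N)
    dct_via_fft = np.real(np.exp(-1j * np.pi * k / (2 * N)) * Y[:N])

    print(np.allclose(dct_via_fft, dct(x, type=2)))   # True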

A radically improved FFT would have the following applications (among others):
audio / image / video compression
voice recognition
amp modelling / room simulation
pitch correction / autotune

Regarding "sparse signals": practically speaking, that means it compresses smoother/thinner sounds (like a lone soprano or lead guitar) better than rougher/thicker ones (like a pan flute or Slayer).
posted by idiopath at 11:18 PM on January 18, 2012 [1 favorite]


Don't they improve on performance by throwing away data at random? Is the improvement in speed greater than the resulting loss in quality?
posted by Blazecock Pileon at 11:30 PM on January 18, 2012


Thts n xtrmly ntrstng rtcl.
posted by hypersloth at 11:38 PM on January 18, 2012 [5 favorites]


Well, a lot of the time, the next thing you do after the FFT (or DCT) is throw away (or quantize away) a lot of the less-strongly-represented frequencies. That's where the bulk of the lossiness in lossy image compression usually comes from, for example.
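A toy version of that step, as a rough numpy sketch (no particular codec in mind, and the 5% threshold is arbitrary):

    import numpy as np

    # A 1-second test signal: two tones plus a little noise, sampled at 8 kHz.
    fs = 8000
    t = np.arange(fs) / fs
    signal = (np.sin(2 * np.pi * 440 * t)
              + 0.3 * np.sin(2 * np.pi * 1000 * t)
              + 0.01 * np.random.randn(fs))

    # Transform, zero out the weak coefficients, transform back.
    spectrum = np.fft.rfft(signal)
    threshold = 0.05 * np.abs(spectrum).max()
    kept = np.where(np.abs(spectrum) >= threshold, spectrum, 0)
    lossy = np.fft.irfft(kept, n=len(signal))

    print("coefficients kept:", np.count_nonzero(kept), "of", len(spectrum))
    print("RMS reconstruction error:", np.sqrt(np.mean((signal - lossy) ** 2)))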
posted by hattifattener at 11:44 PM on January 18, 2012 [2 favorites]


I'd say it's improvements or refinements in heuristics specifically. We know what we cannot solve exactly (rather, what does not scale well), and as far as I know, there have only been incremental changes to the theoretical side of CS in the last two decades, rather than groundbreaking developments.
posted by spiderskull at 12:10 AM on January 19, 2012 [1 favorite]


That is to say, I believe advances in video conferencing and portable music players have as much (or more) to do with improved compression algorithms as with increasing bandwidth and storage capacity.
People have been using the same audio codecs since the 90s. In fact, audio compression ratios are actually going down, as people encode MP3s at higher quality.
posted by delmoi at 12:18 AM on January 19, 2012


(Oh, and video has gotten a lot better: H.264 over MPEG. But that also has to do with CPU speed; an old computer that would have no trouble playing a VCD probably couldn't handle H.264.)
posted by delmoi at 12:19 AM on January 19, 2012


People have been using the same audio codecs since the 90s. In fact, audio compression ratios are actually going down, as people encode MP3s at higher quality.

Not true. There have been significant codec advances in pretty much every application of compressed audio: online music, Blu-ray discs, digital broadcast radio, cell phones, VoIP, digital cable and satellite TV, etc. MP3 is inferior to modern audio codecs and there is no technical reason to use it today.
posted by ryanrs at 12:50 AM on January 19, 2012 [2 favorites]


Blazecock: it isn't about throwing away data at random. It's about making assumptions about the data and optimizing CPU use or quality for that case. It could mean that worst-case input is larger or more CPU-demanding, or it could be mangled by a lossy algorithm - there are various ways the technique could be applied. For example, there are existing techniques that look for the orderly and chaotic components of a signal and process each separately - this would probably be useful in that sort of application without quality loss.
posted by idiopath at 1:01 AM on January 19, 2012


@Llama-Lime

The quadratic sieve technique for factoring large numbers is one example of an algorithmic improvement that allows much less powerful hardware to convincingly beat a big computer running an older algorithm, at least according to this blog post.

For some classes of algorithms we know that improvements of this kind are no longer possible. For example, we've figured out how fast comparison-based sorting can be, and any future improvements for these specific algorithms cannot be as major as the example above.
posted by tykky at 1:13 AM on January 19, 2012 [1 favorite]


This is interesting, thanks for posting. I glanced at their preprint but I don't really speak computer science... does anyone understand what the actual performance differences might be? Like, say, for a signal with a single frequency component, a Gaussian background, and a signal-to-noise ratio of 10, how much faster would this be than the FFT? How does that scale with signal length?
posted by Pre-Taped Call In Show at 2:29 AM on January 19, 2012


Does this mean we'll be able to listen to Aphex Twin in colour now?
posted by davemee at 2:52 AM on January 19, 2012 [2 favorites]


Tangent regarding the concept of "a single frequency".

A single frequency component can be compressed to use O(1) space - all the information present is summed up by frequency, phase, and amplitude (each being a single scalar variable). But this is only a perfect model if your signal is of infinite duration. Mathematically, any signal with a duration other than infinity is guaranteed to be a mixture of frequencies, and the shorter the duration the larger the number of frequencies present.
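You can see this directly with a quick numpy experiment (the exact numbers depend on the window length, but the pattern holds):

    import numpy as np

    n = 1024
    t = np.arange(n)

    # Exactly 8 cycles in the window: all the energy lands in a single FFT bin.
    whole = np.sin(2 * np.pi * 8 * t / n)
    # 8.5 cycles: the "same" frequency, but truncation smears energy across many bins.
    partial = np.sin(2 * np.pi * 8.5 * t / n)

    for name, sig in [("8 cycles", whole), ("8.5 cycles", partial)]:
        mag = np.abs(np.fft.rfft(sig))
        bins = np.sum(mag > 1e-6 * mag.max())
        print(name, "-", bins, "bins above threshold, peak at bin", mag.argmax())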

Frequency is weird.
posted by idiopath at 2:55 AM on January 19, 2012 [5 favorites]


With processor speed relatively tapped out, I'd think improved algorithms would be the only way to improve computing speed for non-parallelizable problems.

It really isn't. Wait for 14nm.
posted by jaduncan at 4:14 AM on January 19, 2012 [1 favorite]


I'd think improved algorithms would be the only way to improve computing speed for non-parallelizable problems.

Hand coding in assembly. (assumption: The hand crafted code would be of good quality)
Profiling the code and optimizing the most common parts.
Code reduction to a size that fits on the CPU's local memory.

(was going to say better compilers - but that is "improved algorithms")
posted by rough ashlar at 6:16 AM on January 19, 2012


Whoa, dude. I'm at SODA right now; I just had lunch with Piotr.

Don't they improve on performance by throwing away data at random? Is the improvement in speed greater than the resulting loss in quality?

Yes, that's exactly what they did, and yes, the improvement is worthwhile, at least for the cases they consider. In fact, that's the whole point of the paper — most of the "signal" is actually noise.

With processor speed relatively tapped out, I'd think improved algorithms would be the only way to improve computing speed for non-parallelizable problems.

This has been true for a long time. To answer Llama-Lime, here's a quote from Report to the President and Congress: Designing a Digital Future: Federally Funded R&D in Networking and IT:
In the field of numerical algorithms, however, the improvement can be quantified. Here is just one example, provided by Professor Martin Grötschel of Konrad-Zuse-Zentrum für Informationstechnik Berlin. Grötschel, an expert in optimization, observes that a benchmark production planning model solved using linear programming would have taken 82 years to solve in 1988, using the computers and the linear programming algorithms of the day. Fifteen years later – in 2003 – this same model could be solved in roughly 1 minute, an improvement by a factor of roughly 43 million. Of this, a factor of roughly 1,000 was due to increased processor speed, whereas a factor of roughly 43,000 was due to improvements in algorithms! Grötschel also cites an algorithmic improvement of roughly 30,000 for mixed integer programming between 1991 and 2008.
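As a quick arithmetic check on those factors (assuming a 365.25-day year):

    minutes_in_82_years = 82 * 365.25 * 24 * 60   # about 43.1 million minutes
    print(minutes_in_82_years, 1000 * 43000)      # 43128720.0 and 43000000 -- same ballpark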
posted by erniepan at 7:55 AM on January 19, 2012 [9 favorites]


Yes, that was the report, thanks erniepan!
posted by Llama-Lime at 8:20 AM on January 19, 2012


Hand coding in assembly. (assumption: The hand crafted code would be of good quality)

Not really. These days a good optimizing compiler can usually beat anything a good assembly programmer would come up with by hand.
posted by Dr. Eigenvariable at 8:50 AM on January 19, 2012


An important direction for algorithms research is parallelization. CPUs aren't getting faster but we're getting more of them, cheaply. 4 or 8 threads are common now on desktop computers. Parallelizing whole programs is hard, but parallelizing algorithms can be easy.

My favourite example is compression: gzip, bzip2, and friends are all CPU-bound programs that are single threaded. You have to go to esoteric alternatives like pbzip2 and pigz to get parallel versions.
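For a sense of what those parallel versions do, here is a bare-bones sketch of the pigz-style idea (chunk the input and compress the chunks on separate cores; the real tools handle framing and shared dictionaries far more carefully):

    import zlib
    from concurrent.futures import ProcessPoolExecutor

    def compress_chunk(chunk: bytes) -> bytes:
        return zlib.compress(chunk, 9)

    def parallel_compress(data: bytes, chunk_size: int = 1 << 20) -> list:
        # Split into independent chunks and compress them concurrently.
        chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
        with ProcessPoolExecutor() as pool:
            return list(pool.map(compress_chunk, chunks))

    if __name__ == "__main__":
        blobs = parallel_compress(b"some highly repetitive data " * 500_000)
        print(len(blobs), "chunks,", sum(len(b) for b in blobs), "compressed bytes")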
posted by Nelson at 9:00 AM on January 19, 2012 [1 favorite]


> I read somewhere recently that improvements in algorithms are responsible for as much improvement in computing speed as improvements in hardware, but can't find the reference.

That's not in general correct.

The Apple II, a machine on which I wrote programs, ran at roughly 1 MHz. It was an 8-bit processor, single core. It was much faster than the machines from ten years before - at the time.

Today we have 64-bit, 8-core machines running at over 2 GHz. That's over 10,000 times as many cycles per second.

In a few cases, like the linear programming problem indicated above, there have been comparable increases in algorithm speed. For the work that most processors, and most programmers, are doing, there simply haven't been breakthroughs like that at all.

> Hand coding in assembly. (assumption: The hand crafted code would be of good quality)

While people still occasionally do this, it's really not worth it any more 99.9% of the time. You're better off writing in C++ and using the compiler's optimizer to its best advantage. Optimizing compilers got better than humans at writing fast code a decade or so ago.

At Google, the working number used was that optimized programs were roughly six times as fast as unoptimized builds. That's pretty impressive!

If you have the time to spare to optimize by hand, you're almost always better off doing it at the highest level.

> Code reduction to a size that fits on the CPU's local memory.

Much more important is local access and sequential access to data - making sure your data is loaded into a data cache on the processor.
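A blunt way to measure that effect with numpy (timings vary by machine, but summing the same number of elements through a strided view reliably loses to a contiguous walk):

    import numpy as np
    import timeit

    big = np.random.rand(32_000_000)   # roughly 256 MB of float64
    contiguous = big[:4_000_000]       # 4M elements, back to back in memory
    strided = big[::8]                 # also 4M elements, but one per 64-byte stride

    print(timeit.timeit(contiguous.sum, number=50))
    print(timeit.timeit(strided.sum, number=50))   # typically several times slower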
posted by lupus_yonderboy at 9:49 AM on January 19, 2012 [1 favorite]


With global optimizing compilers (e.g. GCC v4 and later), hand-coded assembly doesn't really yield improvements. The exception, of course, is in size -- I haven't seen optimizing compilers compete with the demoscene yet.
posted by spiderskull at 12:15 PM on January 19, 2012


I'd agree that it's true in most cases that compilers beat hand-optimized assembly, but there are always cases like what Mike Pall talked about here. He's the author of LuaJIT, a performant just-in-time compiler for the Lua language which includes a fast interpreter for the language. In the linked message he talks about the kind of control flow you get in an interpreter and the trouble compilers have with it. Sure, it's a special case, but a rather important one.
posted by tykky at 1:13 PM on January 19, 2012


Ah, my kinda post...

Processor speed isn't tapped out, not by any measure. The sort of processor speed where you've got one synchronous core with everything being clocked at the same frequency (or multiples/divisors thereof), sure - that's hit a plateau. For a while, anyway. We can make 10GHz CPUs (and faster; we have terahertz transistors), but we don't.

That's because the amount of power needed to keep pushing the clock up makes pushing it up further a bad idea. You can't get the heat out easily, the performance increase per extra watt is not interesting, and for all sorts of related reasons it makes more sense to get n cores at f(c) than one core at f(c*n). Dullsville, man.

But that's not what's happening. You get increasingly fine control of increasingly small sub-core components, and you run some things cool (or off) so you can run others at a higher frequency than the whole core will safely manage. Thermal management lets you tweak voltage and speed from moment to moment as a task wants, so where and when it matters you get more raw throughput.

At the same time, a lot (a LOT) of work is going into reducing the power per MIPS overall, in reconfigurable hardware, in hardware-assist and so on. If you have a hundred watts to play with and suddenly you only need ten to run at the same speed that the hundred got you yesterday, sure, you can boost the clock until you're filling your thermal envelope. (Although the tradeoffs still get exponential, so it's not as useful as you'd like.)

The limiting factor, as far as I can make out, is in the software (microcode, hardware design, what have you: at various interesting intersections, software and hardware stop being useful distinctions) that makes this stuff deliver on its promise. There are a whole lot more variables to consider, and they have to react to infinitely variable tasks in intelligent ways which defy simple algorithmic definition.

Perhaps the best example of this in use these days is FPGA supercomputing, where everyone *knows* that being able to utterly reconfigure your logic to an optimal configuration for a particular task is game-changingly luscious. Just nobody knows how to do this in any way remotely usable for the general case, and precious few know how to do it for even the few special case tasks where it gets properly useful. You do not compile your code with the optimise flag set and go home for your tea. Every useful FPGA computer comes with a rather strange person or two with their own peculiar logic in wetware.

In all this, yeah, processor speed on your PC hasn't moved much for a few years. That's not where the action's at. Annoys the hell out of Intel: it knows better, but its heart is still back in the days when the highlight at IDF was that big ol' frequency count on stage.

As for compression algorithms: the first time I saw someone work out a radio link budget while characterising the (various) digital coding stages by how many dB they added was a real a-ha moment. This is genuine enabling work with implications far beyond making your music smaller and nicer, which is why Viterbi should be more famous outside the fraternity.

(Caveat: all of the above may be subtly wrong. I haven't had the time recently to immerse myself in the field as much as I would like or as I used to. Corrections eagerly appreciated)
posted by Devonian at 2:15 PM on January 19, 2012 [2 favorites]


Regarding "sparse signals": practically speaking, that means it compresses smoother/thinner sounds (like a lone soprano or lead guitar) better than rougher/thicker ones (like a pan flute or Slayer).

A decent rough sketch, idiopath, although in practice (AFAICT from what I've read) a pan flute and Slayer* are still far more like a lone soprano than is street noise, or other "white noise".

* More true of the pan flute than Slayer, but ultimately, still valid.

The implication is that for many signals, the information that matters (intelligibility of speech, music quality, and so on) stays "good" if you assume that 99% of the energy is contained in only a few discrete bands, and process only the bands that are statistically likely to matter instead of every band from bass to treble.
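To put rough numbers on that intuition, here is a toy comparison (nothing like the paper's actual benchmarks): the fraction of spectral energy captured by the ten largest FFT bins, for a tone-like signal versus broadband noise.

    import numpy as np

    n = 1 << 14
    t = np.arange(n) / n

    # A "soprano-ish" signal: a few harmonically related tones.
    tone_like = sum(a * np.sin(2 * np.pi * f * t)
                    for a, f in [(1.0, 440), (0.5, 880), (0.25, 1320)])
    # "Street noise": energy spread across the whole spectrum.
    noise_like = np.random.randn(n)

    def energy_in_top_bins(x, k=10):
        mag2 = np.abs(np.fft.rfft(x)) ** 2
        return np.sort(mag2)[-k:].sum() / mag2.sum()

    print("tone-like: ", energy_in_top_bins(tone_like))    # very close to 1.0
    print("noise-like:", energy_in_top_bins(noise_like))   # a small fraction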
posted by IAmBroom at 2:30 PM on January 19, 2012


I think I may demonstrate this another time, but there is an essential noise component to a pan flute tone, and without that wideband noise the melody remains but the instrument becomes unrecognizable.

Also, why the scare quotes on "white noise"? It is a strictly defined and quite well understood signal.
posted by idiopath at 9:05 PM on January 19, 2012


Those weren't scare quotes. They were approximation quotes.

Street noise isn't white noise - few things are, in fact. White noise is fairly rare in the world; pink noise is much more common (or, to be precise, more often closely approximated).
posted by IAmBroom at 5:08 AM on January 20, 2012


More to the point at hand, idiopath: this algorithm isn't intended for preserving the entertainment value of a musical recording. It's for ultra-fast FFT analysis. I'm using the pan flute example simply because an analogy about optical recognition algorithms that are rotation- and distortion-insensitive wouldn't explain much to most readers.
posted by IAmBroom at 5:11 AM on January 20, 2012 [1 favorite]

