McCullough and Wilson point out that Excel appeared to use a "calculator formula" to compute VAR. Their observation was correct and can be extended to many (but curiously, not all) functions that calculate the sum of squared deviations about a mean. The calculator formula can be executed in a single pass through the data.
posted by benzenedream at 11:57 AM on May 23, 2011 [4 favorites]
<snip>
With infinitely precise arithmetic, both procedures yield the same results. However, because of the finite precision of Excel, the calculator formula is more prone to round off errors. In texts on statistical computing, the calculator formula is generally presented as an example of how not to compute variance.
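To make the quoted point concrete, here is a minimal sketch in Python comparing the two procedures. It uses ordinary double-precision floats and is not Excel's actual implementation; the function names and the sample data are made up for illustration.

```python
def var_calculator(xs):
    """Single-pass 'calculator formula': (sum(x^2) - (sum x)^2 / n) / (n - 1)."""
    n = len(xs)
    total = 0.0
    total_sq = 0.0
    for x in xs:
        total += x
        total_sq += x * x
    return (total_sq - total * total / n) / (n - 1)


def var_two_pass(xs):
    """Two-pass formula: compute the mean first, then sum squared deviations."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / (n - 1)


# Illustrative data: a small spread sitting on top of a large offset.
# The exact sample variance is 30.
data = [1e9 + 4, 1e9 + 7, 1e9 + 13, 1e9 + 16]

if __name__ == "__main__":
    print("calculator formula:", var_calculator(data))
    print("two-pass formula:  ", var_two_pass(data))
```

The squares in the single-pass version are around 1e18, well past the point where a double can represent every integer, so the small spread gets lost when two nearly equal huge numbers are subtracted. The two-pass version subtracts the mean first and only squares small numbers.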
And speaking of wolfram alpha, they have a ton of things like this available for iphone and ipad for things like calculus or chemistry.

Don't the WA apps just call back to WA itself to get the answers? With this you would get your answers quicker. Wolfram Alpha seems to take several seconds to generate an answer, which is especially annoying when the 'natural language' input only works like half the time.
M-x calc RET
'6x^2+12x RET
adx RET

(I only rarely use calc, so I'll cop to having to consult the online documentation to find that the key sequence for derivative was 'ad')

'(x^2-9)/(x-3) RET
an RET

results in x+3.
Short version: Python is ridiculously human-readable, and with Pyrex can be fast.

I started talking a lot with Thomas Barnet-Lamb about a crazy idea to create a new open source math software system with readable implementations of algorithms, and nothing hidden in some stupid proprietary layer. Thomas was then a first year Harvard grad student who had won some international computer programming competition, so I figured he would enjoy talking about software. I also talked a lot with Dylan Thurston about this crazy idea; Dylan had started grad school at the same time as me at Berkeley, graduated at the same time, had the same first two jobs as me, and was also an Assistant Professor. Both Thomas and Dylan gave me many ideas for programming languages to consider, including OCaml (which Thomas liked), Haskell (which Dylan was a huge fan of), etc. After having used Magma for years, with its highly optimized algorithms, I desperately needed a fast language. But I also wanted a language that was easy to read, and that mathematicians could pick up without too much trouble, since I wanted people like Manjul to someday use this system and not have their research cut off. And I knew from experience that unreadable source code is no better than closed source.
I'm not going to go into negatives of any languages. Though I used Python a lot, for a long time I didn't consider it seriously at all for this crazy project, since I tried implementing some basic arithmetic algorithms in Python and found that they were vastly too slow to compete with Magma (or C). I had also tried quite hard to use SWIG to make C++ available in Python, but SWIG is extremely frustrating, and has horrible performance (due to multiple layers of wrapping), at least compared to what Magma could do.
In October 2004, I was flying back from Europe (the Paris Magma conference) and started reading the Python/C API reference manual straight through. I realized that Python is far, far more than just an interpreter. It is a C library that implements everything you need, and has a well defined and well documented API. I did some sample benchmarks on the plane, and found not surprisingly that I could write code as extensions to Python that was just as fast as anything one could write for Magma by modifying the Magma kernel, since under the hood, both were written in C. Also, on the flight, I realized that because the Python/C interface uses reference counting, it would be vastly easier to write the C extensions I would need using some sort of language I would design. I got home and somehow stumbled onto Pyrex, which was exactly what I was planning to write. I tried it out, did benchmarks, and realized that I had a winner.
With Pyrex and Python, I could implement algorithms and make them as fast as anything in Magma, assuming I could figure out the right algorithm. Moreover, the dozens of issues I had with Magma, many of which were simply a function of them not having the resources to do language development, were already solved in Python. And Python would continue to move forward with no work from me. It was mid-2004 and because of Python, the overall software ecosystem was much better than in 1999, despite open source number theory software having not moved forward much.
I started going to (and sometimes hosting) the Boston Python user group meetings, which were quite large, and gave me much useful feedback. And I decided it was time to move past my test and prototype stage and get to work. My plan, as I had explained it to Thomas, was to create a complete new system from the ground up using Python + Pyrex. All the code would have an easy to read Python implementation that was well documented; in some cases there would be a much faster Pyrex implementation of the same code, etc. With my naive plan in hand, I sat down with the main elliptic curves file of the PARI source code, and started to translate.
I think I made it through one function. Where some might have doggedly persisted for years with such an approach, I quickly ran out of patience. In fact, when it comes to software and programming I can be extremely impatient. I realized that my entire plan was insane, and would take too long. I had discussions with Thomas, Dylan, and others, and everybody I knew who was seriously into number theory computation was using Magma, so I realized that I was going to have to do this entire project myself. So I realized translating was doomed.
Somehow, even with all my experience, I had massively underestimated the complexity of the algorithmic edifice that is any serious mathematical software system.
I read the PARI C API reference, and used Pyrex to write a wrapper so that I could call some basic PARI functions from Python. I implemented basic rational and integer types using Pyrex and GMP, and the performance was reasonable. One day, I was using Matplotlib (a Python library) to draw some plots for Barry Mazur that involved explicit computation with the incomplete Gamma function, and was frustrated because neither PARI nor Magma had an implementation of this special function at the time. Harvard had a Mathematica site license, so I had a copy of Mathematica, and I wrote code using the pexpect Python library to hold open a single Mathematica session and use it to compute the incomplete Gamma function. Problem solved. This was when the interfaces between Sage and other mathematics software systems were born.
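The "hold open a single session" idea in that last paragraph can be sketched roughly as follows. This is not the code described above: it swaps pexpect for the standard-library subprocess module, and a small child Python process stands in for Mathematica. The names Session, evaluate, and CHILD_SOURCE are all made up for illustration.

```python
import subprocess
import sys

# Hypothetical stand-in for an external math system: a child Python process
# that reads one expression per line and prints the result on one line.
CHILD_SOURCE = r"""
import math
import sys
for line in sys.stdin:
    try:
        result = eval(line, {"__builtins__": {}}, vars(math))
    except Exception as exc:
        result = "error: %s" % exc
    print(result, flush=True)
"""


class Session:
    """Keeps one child process open and reuses it for every request."""

    def __init__(self):
        self.proc = subprocess.Popen(
            [sys.executable, "-c", CHILD_SOURCE],
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            text=True,
        )

    def evaluate(self, expression):
        # Send one line, then block until the child answers with one line.
        self.proc.stdin.write(expression + "\n")
        self.proc.stdin.flush()
        return self.proc.stdout.readline().strip()

    def close(self):
        # Closing stdin gives the child end-of-file, so its loop ends.
        self.proc.stdin.close()
        self.proc.wait()
        self.proc.stdout.close()


if __name__ == "__main__":
    session = Session()
    print(session.evaluate("sqrt(16)"))
    print(session.evaluate("gamma(5)"))
    session.close()
```

The point of the design is that the startup cost of the external program is paid once, and every later call is just a line of text out and a line of text back. A real interface would also need to handle multi-line output, prompts, and timeouts, which is the sort of thing pexpect is for.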
posted by hellslinger at 11:39 AM on May 23, 2011