You're doing it wrong!
October 17, 2022 2:00 AM   Subscribe

Someone improved my code by 40,832,277,770% - YouTube. A month of Python runtime down to 500 milliseconds. And it's Wordle! Which isn't as new as you think. It's a Matt Parker thing.
posted by zengargoyle (29 comments total) 14 users marked this as a favorite
 
Gah I even checked and it's 500 microseconds not milliseconds.
posted by zengargoyle at 4:15 AM on October 17, 2022


If you're interested in figuring out how Matt managed to make his code so incredibly slow, it's here (via the original video).
posted by polytope subirb enby-of-piano-dice at 4:15 AM on October 17, 2022


I found some of the code stuff kind of interesting but the quality of his code is overshadowed by the quality of his wordlist. vibex is rubbish that I'd never heard of, but at least it's in SOWPODS (the union of the British and American Scrabble dictionaries). muntz is just hooey. I assume that the one he shows at the beginning (with muntz and vibex) is the wordliest of the 500-some solutions his program came up with, which really would mean there are no solutions that only use actual words. And without that constraint this just feels like a much less interesting exercise in pure computing.
posted by aubilenon at 5:14 AM on October 17, 2022


Definition of vibex

: a linear subcutaneous extravasation of blood
History and Etymology for vibex

Latin, mark of a blow, weal; probably akin to Latvian wībele weals, wīle seam, weal, scar, and perhaps to Latin vibrare to shake, vibrate
Muntz metal - Wikipedia.

Words are like that, they sneak up and morph around you. So many "words" and some of them shift. Just because you don't know them doesn't make them not exist.
posted by zengargoyle at 5:34 AM on October 17, 2022 [8 favorites]


He said that the biggest optimization was switching to bitwise operations, which surprised me a bit. I would've thought the biggest optimization would've come from one of the algorithms that restructured the problem (graph theory, algorithm X, etc.).

...and that makes me wonder whether anyone tried bitwise operations and the ordering optimizations in Python, and how they did.
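
Something like this is what I have in mind, as a rough sketch rather than anybody's actual submission:

# Encode each five-letter word as a 26-bit mask, one bit per letter;
# two words share no letters exactly when the bitwise AND is zero.
def mask(word):
    m = 0
    for c in word:
        m |= 1 << (ord(c) - ord('a'))
    return m

words = ["fjord", "gucks", "nymph", "vibex", "waltz"]
masks = [mask(w) for w in words]

# Pairwise disjointness is a single integer AND per pair. These five
# happen to cover 25 distinct letters, so this prints True.
print(all(masks[i] & masks[j] == 0
          for i in range(len(masks))
          for j in range(i + 1, len(masks))))

Whether that alone gets you anywhere near the compiled versions, I have no idea.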
posted by clawsoon at 5:37 AM on October 17, 2022 [1 favorite]


Digging through the spreadsheet, the current fastest code is a hand-tuned, parallelized take on the same basic ideas that seem to be expressed most clearly here, which includes both C++ and Python versions.
posted by clawsoon at 6:11 AM on October 17, 2022


Nah, Python sucks! It has a horrible threading model for multi-core work (the GIL means only one thread runs Python bytecode at a time), and I'm not sure the testing accounted for that bit in particular on multi-core processors. The fast solutions are a combination of pipelining the I/O and reducing the problem by mapping words onto 32-bit bitmasks, then letting the silicon chew through them. You just can't do that in high-level languages (especially interpreted ones, especially Python) the same way you can in C/C++/Rust or such. It's the ease of getting down to the hardware level, and of constraining the problem/solution domain (aka envelope) to native instructions.

You'll notice that more time was spent just reading in the data and writing out the results than on the actual calculations.

But yeah, the bitwise operations can scream, the trick is to cram the logic into those operations.
posted by zengargoyle at 6:11 AM on October 17, 2022


The impression I'm getting from the READMEs of the various attempts is that dramatically reducing the number of comparisons was the main target of optimization, and the video may have underplayed that and overplayed bitwise comparison. Now I'm tempted to convert the GuiltyBystander Python code back to using sets to see what difference it makes...
posted by clawsoon at 6:24 AM on October 17, 2022 [2 favorites]


He said that the biggest optimization was switching to bitwise operations, which surprised me a bit.

Yeah, bitwise operations, even Python bitwise operations, are indeed a lot faster than Python set operations, but I would expect from a glance at his code (I'm not currently at my desktop to run a profiler on it) that the real problem is how many times he's doing these operations. The original video gives his script's value number_of_doublewords as a bit over 3.2 million, with duplicate lettersets from doublewords not removed. He's then iterating over roughly five million million pairs of doublewords (including pairs that have a word in common) and running set union on two sets of ten one-letter strings. There's only so much speedup you could get from doing that with bitwise ops (though it would be significant).

It's an interesting premise for a pair of videos and deliberately bad code is a good way to get programmers to send improvements, but the comparison of set vs bitwise ops without the acknowledgement of how much further the search space can be reduced (James Grime on search space reduction for a different problem) doesn't justify his ragging on Python's performance in the followup video.
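
The letterset deduplication alone is cheap to do even in Python. A sketch (words.txt is a stand-in filename, not the list from the video):

# Keep only five-letter words with five distinct letters, then collapse
# anagrams, since words with the same letterset are interchangeable here.
with open("words.txt") as f:
    words = [line.strip().lower() for line in f]

five_distinct = [w for w in words if len(w) == 5 and len(set(w)) == 5]
lettersets = {frozenset(w) for w in five_distinct}
print(f"{len(five_distinct)} usable words collapse to "
      f"{len(lettersets)} distinct lettersets")

Pairing lettersets instead of words shrinks the quadratic step before a single union or AND ever runs.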
posted by polytope subirb enby-of-piano-dice at 6:37 AM on October 17, 2022


Nah, Python sucks!

That's a bit abrasive. Python is fine.
posted by Pendragon at 7:23 AM on October 17, 2022 [16 favorites]


This would actually make a good foundation for a coding interview. First have the programmer review the existing code and work out its big-O complexity. Then have them discuss optimization and refactoring. Finally, give them an hour to produce an optimized version.
posted by interogative mood at 8:21 AM on October 17, 2022 [2 favorites]


You'd probably want to do that after they get past the FizzBuzz screen...
posted by clawsoon at 8:46 AM on October 17, 2022 [1 favorite]


bitwise operations can scream, the trick is to cram the logic into those operations

This sounds like the kind of problem an academic at a UK university would spend 6-8 months on, producing an obsessively optimized solution in ARM assembly language. It would be utterly unintelligible to all but a very few.
posted by scruss at 9:12 AM on October 17, 2022 [1 favorite]


Recently did something like this for my wife and her addiction to Spelling Bee (NYT) and Free Bee. No real programming, just a tiny bash script piping stuff between awk, grep, nl and less. I search a Scrabble dictionary I found on GitHub. Never timed it, but it completes before I can get my finger off the enter key...
posted by jim in austin at 9:25 AM on October 17, 2022 [2 favorites]


Nah, Python sucks!

That's a bit abrasive. Python is fine.


"There are two kinds of programming languages: ones that people complain about and ones that nobody has heard of."
posted by AlSweigart at 11:01 AM on October 17, 2022 [31 favorites]


One of the nice things about SQL is that most RDBMSes have some serious black magic in their query planners that will optimize the hell out of your queries. Back in the late 90s when PostgreSQL was new its planner wasn't very smart, so modifying/reordering/optimizing your query could almost always result in significant gains in performance.

It has gotten progressively harder over the years to get significant gains from manual query optimizations. Nowadays if a query is slow you're more likely to have missing indexes or a configuration problem. When I ask the database what it's doing and tell it no, please do it the way that seems to me like it should be faster, it almost always turns out I'm wrong.
posted by wierdo at 11:32 AM on October 17, 2022 [3 favorites]


This reminds me of a programming task I undertook several times in the last 45 years.
A friend had given me a number substitution problem, and when I ran out of ideas doing it on paper, I decided I needed a computer to do it.
The first time I succeeded, it took ~15 minutes running machine language standalone on a 64-bit mainframe.
There was no way I could do it in BASIC on a microcomputer without overflowing the stack, but I eventually ported it to run on a C-64 in 6502 machine language, and it ran in around 26 hours.

I mentioned the problem on the blue years ago, and tykky came back with a C++ version that ran on Codepad in about a second. (Much thanks to the next_permutation() function in C++)
So about a factor of 10^5 improvement in time over the C-64.
In that post, I also learned about the Steinhaus–Johnson–Trotter algorithm, which allowed me to run it as a BASIC program on a C-128, with all solutions in a few minutes.
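
These days even the brute-force version fits in a dozen lines of Python (the original puzzle was different; SEND + MORE = MONEY is just a stand-in here):

from itertools import permutations

# Brute-force a digit-substitution puzzle by trying every assignment of
# digits to letters; skip assignments with leading zeros.
LETTERS = "SENDMORY"

def value(word, sub):
    return int("".join(str(sub[c]) for c in word))

for digits in permutations(range(10), len(LETTERS)):
    sub = dict(zip(LETTERS, digits))
    if sub["S"] == 0 or sub["M"] == 0:
        continue
    if value("SEND", sub) + value("MORE", sub) == value("MONEY", sub):
        print(value("SEND", sub), "+", value("MORE", sub), "=", value("MONEY", sub))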
posted by MtDewd at 1:30 PM on October 17, 2022 [4 favorites]


I had finished grad school before I properly understood that development time must be factored into the time it takes to compute the solution to a problem.

Suppose I have some problem that needs to be solved once, or intermittently. I can think of at least two ways to solve it: a stupid way that’ll be easy to write, and a clever way that’ll probably run faster but will require me to think more carefully.

My strategy is to write it the stupid way, and then set that to running. If it gives me a solution in a convenient amount of time, then I am done! The end! If I happen to find myself growing impatient, or if running repeatedly takes more time than I want, then I can start in on the cleverer way. But if my choices are to spend five minutes writing a stupid program that answers my question in ten minutes of running time, versus spending an hour writing a clever program that solves my problem in one minute of running time, then I am more than happy to take the stupid-but-answered-in-fifteen-minutes route. I’m even happy to have spent ten of those fifteen minutes starting in on a clever solution that I never end up using.

Optimizing code to speed up the solutions to a solved problem is useful only as a teaching exercise. Teaching is, of course, the point of Matt Parker's YouTube channel. And furthermore, as he said, he was in no hurry to actually find the solution to this particular problem: he had other things to do, and could afford to wait until his stupid solution was ready. Combined with the chance to trigger his audience of premature-optimizers to engage with each other, his choice to bang out some stupid code and wait for it was clearly an excellent decision.
posted by fantabulous timewaster at 3:37 PM on October 17, 2022 [11 favorites]


If I were trying to solve this problem in real life, I'd probably stop at the 15 minute solution as "good enough". A month just seems unacceptable, and speeding it up would take way less than a month of effort. Hard to know in advance though. Skimming through the code readmes linked from the video, it's interesting that the graph theory solution (the one that takes 15 minutes) was beaten by the bithack approach. The former seems like the "smarter", more theoretical approach, and it took me a couple of times reading it to properly understand the algorithm, whereas using bitwise operations to check if two words share letters is the kind of optimization I'd bumble towards if given this problem to solve. But it's faster! Take that, CS nerds.
posted by L.P. Hatecraft at 6:19 PM on October 17, 2022


As a Matt Parker Patreon supporter, videos like these make me happy to support his channel. We're still optimizing the algorithm!
posted by lock robster at 7:16 PM on October 17, 2022 [3 favorites]


This is fun to see! I saw this early on, when Fred Overflow solved it in Java in 30 seconds, and then in three seconds.

jim in austin: "Did recently something like this for my wife and her addiction to Spelling Bee (NYT) and Free Bee. No real programming. Just a tiny bash script piping stuff between awk, grep, nl and less."

Are you trying to nerdsnipe us?

...

I'd go with egrep, something like "^[$letters]+$"
posted by Pronoiac at 8:05 PM on October 17, 2022 [2 favorites]


The open pull request against the original Python code is also a great example of taking someone's garbage code and refactoring and optimizing it.

Note the comments, the use of meaningful variable names and functions to make the code more self-documenting, and the descriptive commit message on the pull request.
posted by interogative mood at 9:36 PM on October 17, 2022


See, I was proud of myself for taking a problem that took an estimated 9 days to process and getting it down to 4.5 hours (calculating an adjacency matrix of driving directions for several thousand points), but it needs a separate API, so I can feel reasonably justified that I couldn't speed it up much more.

I think there's an interesting point about optimization here, though (which Parker makes in the video). Matt's original code only needed to be run once, and he didn't need the answer any sooner. There tends to be a world of difference between code actually produced to do a job and ultra-optimised hobby code. I suspect most folks would have a go at optimising it a bit, but the 15-minute solution is easily good enough for something you only need to run once.
There is a sense of pragmatism around the kind of code you tend to produce for work (especially in something like data science) that is rarely spoken of within this kind of problem space.
posted by Just this guy, y'know at 2:28 AM on October 18, 2022


Well I guess I'm this thread's obligatory (also)
posted by rhizome at 2:48 AM on October 18, 2022 [2 favorites]


Pronoiac: I'd go with egrep, something like "^[$letters]+$"

The games have some special rules. All words must be 4 or more letters in length. All words must contain the center letter. All words may contain only the 7 letters for the day, but in any amount. Two grep passes was the lazy way to do this:

#!/bin/bash

# "?" gets replaced by hand with the puzzle's required center letter.
# The day's seven allowed letters get deleted from the bracket list below,
# so the second grep rejects any word that uses a disallowed letter.
grep ? wrdz |
grep -v '[abcdefghijklmnopqrstuvwxyz]' |
less -N

wrdz is the dictionary file. The ? is replaced with the center letter from the puzzle. The seven allowed letters are edited out of the [alphabet list]. I switched to less -N from nl after I saw no need for a temporary file. Also eliminated awk by filtering wrdz for length once I had settled on a dictionary...
posted by jim in austin at 9:33 AM on October 18, 2022


Something like "[$letters]{4,}" would enforce a minimum length.

I think "search for these letters" is easier to work with than "throw out any of this much longer list of letters."

For a shell script that allows for "list_words.sh (required letter) (allowed letters)", I'd worry about quoting, and I'd have to try it to have confidence in it. I'd probably reach for a language like Ruby or Python, so I could include, say, comments.
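
A sketch of what I mean in Python (wrdz is jim's dictionary file from above; the script name and argument order are made up):

import sys

# Spelling Bee filter: words of 4+ letters that contain the required
# center letter and use only the seven allowed letters (repeats fine).
# Usage: python bee.py <center letter> <all seven allowed letters>
center = sys.argv[1].lower()
allowed = set(sys.argv[2].lower())

with open("wrdz") as f:
    for word in (line.strip().lower() for line in f):
        if len(word) >= 4 and center in word and set(word) <= allowed:
            print(word)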
posted by Pronoiac at 2:16 PM on October 18, 2022 [1 favorite]


Every time my script is run it's a one-off. Run it once and never run it in that configuration again (probably). Adding and removing characters takes less than a minute of editing. And when she's finished, I cat a clean template back over the edited copy and am ready for her next game. This is essentially a discardable one-liner...
posted by jim in austin at 3:24 PM on October 18, 2022


Is there like an article or summary so I don't have to watch a fucking half-hour YouTube video?
posted by lkc at 10:55 PM on October 18, 2022


Matt Parker is a semi-famous (at least among watchers of maths videos) UK maths explainer. He often writes up some Python code to figure things out by brute force. This one was Wordle-like: "are there five five-letter words that between them use twenty-five of the twenty-six letters of the alphabet?" He wrote some code, and it took a month to actually finish. He puts his things up on GitHub or whatnot. Very quickly other nerds said "hold my beer" and got it down to fifteen minutes, then twelve, then to under a minute, and now it's down to five hundred microseconds. Different languages, most of them compiled (not interpreted like Python), using different sorts of optimizations of the problem. So it's just nerd programmer maths stuff.
posted by zengargoyle at 1:08 AM on October 19, 2022 [1 favorite]



