On your mark, get set.... GO!
March 22, 2015 9:56 AM

"Following 9 months of computation and 4 petabyte of disk IO on a Dell PowerEdge R820 server, generously provided by Piet Hut, and administered by Lee Colbert, at the IAS School of Natural Sciences in Princeton, we determined..." (the number of legal moves on an 18x18 Go board).

Ey, you cheeky monkeys, I'm not going to spoil the surprise!

But this leads them to an estimate of a full 19x19 board, and an estimate of the requirements to get the full calculation.
Allowing for some redundancy, we need from 10 to 13 servers, each with at least 32 cores, 512GB RAM, and ample disk space (10-15TB), running for about 3-5 months.

If you have such a server, there's an address on the site you can use if you wish to contribute.

Either way, some more info is on that site if you are curious.
posted by symbioid (34 comments total) 15 users marked this as a favorite
 
One of those "I don't know what to make of this" situations on several levels.

Had it been a few hundred times larger or smaller, what meaning would that carry?

Could this computing power have been better spent on discovering protein folding configurations to cure diseases? Or is this one of those situations where climbing the mountain because it's there leads to all sorts of unexpected benefits? Building better computing infrastructures, shortcuts to count the states, etc.
posted by Schmucko at 10:55 AM on March 22, 2015 [1 favorite]


Of course that computing power could have been used for something more "useful", but there's plenty of computing power in the world, that wasn't very much in the scheme of things, and anything's better than using that computing power to mine Bitcoin, so it's all good.
posted by Jimbob at 11:09 AM on March 22, 2015 [8 favorites]


Computing power isn't totally fungible yet. Yeah, I would be happy to have an extra R820 for my research. But it's not like my lab would just be able to use it easily if only they would dedicate it to us.
posted by grouse at 11:12 AM on March 22, 2015


Spoiling it, sorry not sorry. It's about 6.697231e+152

(Not sorry because copying and pasting the number on mobile into Google Calculator to get scientific notation was a pain. Not even sure it worked.)
posted by tss at 11:15 AM on March 22, 2015 [2 favorites]
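
A rough plausibility check on the figure quoted above, as a sketch only: every point on an 18x18 board is empty, black, or white, so 3^324 is an upper bound on the number of positions, legal or not. The rounded value 6.697231e+152 is taken from the comment above; the function name is made up for illustration.

```python
from fractions import Fraction

def position_upper_bound(size: int) -> int:
    """Count all colorings of a size x size board (empty/black/white per point)."""
    return 3 ** (size * size)

upper = position_upper_bound(18)
# Rounded figure quoted in the thread, written out as an exact integer.
quoted_legal = 6697231 * 10 ** 146

share = Fraction(quoted_legal, upper)
print(f"upper bound: {upper:.6e}")
print(f"legal share of all colorings: {float(share):.4f}")
```

The quoted number sits below the bound, with only a small fraction of all colorings being legal, so the scientific-notation conversion at least looks consistent.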


@Schmucko Using computing power for something like this eventually makes its way to the more important things like protein folding, etc.
The benefit would not be direct, but things like the process involved, or any software optimizations made to speed it up, could carry over. This is also a learning experience for the people involved, who will undoubtedly go on to work on less 'trivial' tasks in the future.
posted by abbiecodes at 11:17 AM on March 22, 2015 [1 favorite]


For perspective, the servers being requested to run these calculations are somewhere within a factor of 2-4 of the "OpenConnect" boxes that Netflix would like to locate at ISPs to improve video streaming. The "I/O optimized appliance" has about half the RAM, 3/8 the number of cores, and of course ample disk space.
posted by jepler at 11:22 AM on March 22, 2015


I've always been sure the day will come when all the possibilities will be sorted, the 'best' move for any given situation. When I hear stories like this I hope the game will hold out a little longer - a few more years before optimized AI can defeat even the highest ranked pros. The online bots are getting better and better. It's definitely an Everest type goal that a lot of very smart people all over the world want to reach, a labor of love. As a player I'm just happy to plink my moves out one by one on a board that has more than enough choices to make for one lifetime.
posted by damo at 11:24 AM on March 22, 2015


I totally suck at go. Love the idea and concept, but I just don't have the patience :( But whenever I hear about shit like this, it amazes me.

Interestingly, I note the authors say that doing a retrograde analysis like this for chess is unobtainable. Is that due to the complexity of the moves and possible variants? So while you can place any piece anywhere on the board in go, which makes it hard for a player to analyze, in chess the pieces are so constrained in the ways they may move that the problem space is effectively expanded by those limitations, and determining whether a move is legal is that much harder than under the willy-nilly go rule ("if a spot is open, plop it down")...

Is that a correct reading? Am I making sense?
posted by symbioid at 11:35 AM on March 22, 2015


And yeah - I think asking what certain resources could be better spent on is a fine question in some cases, but I have concerns about such lines of thinking. It's the sort of thinking that leads to reactionary statements from politicians that question the utility of something like "fruit fly research".

Or, for example, Lamar Smith's statement “Our efforts will continue until NSF agrees to only award grants that are in the national interest...”

The power of knowledge and information comes from unexpected consequences. The amazing ability of information to percolate beyond the original goals of the researchers into new and potentially radical solutions to problems, new products, new algorithms, etc... Limiting oneself to "research with a practical application for reason x" means we waste new avenues of research that may arise from interesting results gained from seemingly trivial tasks. So many forms of knowledge and research seem trivial or unusable in the day-to-day running of things, yet all of this stuff is what brings us new and exciting information about the world we inhabit. And it is the new information that eventually allows us a slightly better understanding of how the universe works, and in turn how we can use that to our species' advantage. (Or, depending on your paranoia - Government/National, or Corporate advantage).
posted by symbioid at 11:46 AM on March 22, 2015 [4 favorites]


@symbioid yes. Pretty much exactly that.
Eventually, I assume it can be figured out, but as the number of turns increases, the count gets exponentially bigger. https://chessprogramming.wikispaces.com/Perft+Results
posted by abbiecodes at 11:48 AM on March 22, 2015 [1 favorite]


Also - I mean, I've no doubt these machines are very powerful, but they still seem like your typical server platforms? These aren't necessarily "supercomputers" they've got, are they?

I wonder what it would take to get one of the biggest computers in the world to run something like this. Though I suppose, part of the issue is that those are reserved, probably for more of those high-end, difficult problems with possibly more immediate utility than a combinatorial study such as this, yeah?

How does one of these computers compare to, say, my i5, 8gig desktop?
posted by symbioid at 11:48 AM on March 22, 2015


looks like they are using 32 core Xeon processors with 512GB RAM, so it would be much, much faster.
Thanks to the Chinese Remainder Theorem, the work of computing L(19,19) can be split up into 9 jobs that each compute 64 bits of the 566-bit result. Allowing for some redundancy, we need from 10 to 13 servers, each with at least 32 cores, 512GB RAM, and ample disk space (10-15TB), running for about 3-5 months.
posted by abbiecodes at 12:04 PM on March 22, 2015
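
The Chinese Remainder Theorem split quoted above can be sketched on a toy scale: compute a large number's residues modulo several pairwise coprime moduli independently, then recombine them. The moduli and the stand-in value below are made-up illustrations, not the ones used in the actual computation, which the quote describes only as 9 jobs of 64 bits each.

```python
from math import prod

def crt_combine(residues, moduli):
    """Recover x mod prod(moduli) from x mod m, for pairwise coprime moduli."""
    total = prod(moduli)
    x = 0
    for r, m in zip(residues, moduli):
        partial = total // m
        # pow(a, -1, m) is the modular inverse of a mod m (Python 3.8+).
        x += r * partial * pow(partial, -1, m)
    return x % total

# Hypothetical coprime moduli (Mersenne primes) standing in for the nine 64-bit ones.
moduli = [2**61 - 1, 2**31 - 1, 2**19 - 1, 2**17 - 1]
secret = 3 ** 70  # stand-in for the big count; must be below prod(moduli)

# Each "job" only ever works with the value modulo its own modulus.
residues = [secret % m for m in moduli]
recovered = crt_combine(residues, moduli)
print(recovered == secret)
```

The same idea explains the job count: nine moduli of about 64 bits each have a product of roughly 576 bits, enough to pin down a 566-bit result.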


Tangent, but I just learned the other day that "petabyte" doesn't quite refer to what I thought it did. I had thought it was 2^50 bytes, but instead it is 1,000^5. A pebibyte is 2^50 bytes.

Further tangent: What are people going to think in 20-30 years when they start having to use terms like "yottabyte" and "zettabyte"? What kind of jokers were in charge of naming these things, anyway?
posted by A dead Quaker at 12:18 PM on March 22, 2015 [1 favorite]


thank god that's finally settled. I'll sleep better.
posted by Bonzai at 12:40 PM on March 22, 2015 [1 favorite]


this is a trivial amount of compute. The annual server market is $50B and the average server (dual socket) is around $10k; there are around 5,000,000 servers produced _per year_. Pretty sure we can spare 13 of them for this vs. idiocy like snapchat.
posted by rr at 12:50 PM on March 22, 2015 [6 favorites]


I want to go on record as resisting the idea that people have to demonstrate the utility of their research before they can do it.
posted by thelonius at 12:55 PM on March 22, 2015 [13 favorites]


Could this computing power have been better spent on discovering protein folding configurations to cure diseases? Or is this one of those situations where climbing the mountain because it's there leads to all sorts of unexpected benefits? Building better computing infrastructures, shortcuts to count the states, etc.

Well, seeing as this is the same board that Conway's Game of Life is played on, I would go out on a limb and say yes, yes this is important. Also, counting the states is easy. There's 50. Alabama, Alaska, Arkansas...
posted by sexyrobot at 1:04 PM on March 22, 2015 [2 favorites]


... except the number of legal moves on any size of Go board is always variable, because you're also not allowed to duplicate any previous board state.
posted by kafziel at 1:24 PM on March 22, 2015 [1 favorite]


"What kind of jokers were in charge of naming these things, anyway?"

Mathematicians, every one of them a clown.
posted by idiopath at 1:33 PM on March 22, 2015


Oooh, that's a big number. I would have guessed that the number of 19x19 positions was going to be more than the number of atoms estimated to exist in the observable universe, but I didn't expect it to be more than n times larger than that, where n is the number of atoms in the observable universe.

Could this computing power have been better spent?

You think that's a waste of computing power? It's nothing. Just imagine how much brainpower people have spent on actually playing the game, over the centuries.
posted by sfenders at 1:46 PM on March 22, 2015 [4 favorites]


Tangent, but I just learned the other day that "petabyte" doesn't quite refer to what I thought it did. I had thought it was 2^50 bytes, but instead it is 1,000^5. A pebibyte is 2^50 bytes.

Yeah, I worked in QA in the HPC storage industry for six years and the difference between the 1000 base and 1024 base nomenclatures could drive a person mad. Especially when the marketing folks didn't understand the difference and used them interchangeably.
posted by octothorpe at 2:23 PM on March 22, 2015 [2 favorites]
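
For a sense of how far apart the two nomenclatures are at this scale, here is a minimal sketch comparing the decimal and binary units discussed above. The constant names are just illustrative.

```python
PETABYTE = 1000 ** 5   # SI prefix: 10^15 bytes
PEBIBYTE = 2 ** 50     # binary prefix: 1024^5 bytes

gap = PEBIBYTE - PETABYTE
print(f"petabyte: {PETABYTE:,} bytes")
print(f"pebibyte: {PEBIBYTE:,} bytes")
print(f"a pebibyte is {gap / PETABYTE:.1%} larger")  # 12.6% larger
```

The gap grows with each prefix step (it is only 2.4% at the kilo level), which is part of why mixing the two gets more painful as storage gets bigger.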


Schmucko: "Could this computing power have been better spent on discovering protein folding configurations to cure diseases?"

It's probably important to point out here that it's mostly protein folding prediction. It's a long way off from curing cancer. The best case scenario at the moment is that folding@home or similar will tell you what your sequence probably folds to. For drug design, we really want the reverse. If we can identify an active binding site, we can typically model what the nearby parts should look like. But that local structure also needs to be able to enter the human body, and survive there. So we have some other constraints to apply. Often what we want to do is figure out what sequence will result in a protein that will activate / bind / catalyze, or inhibit interactions to cure disease.

Protein folding does the opposite: given a sequence it predicts a structure. More importantly, a hypothetical one. A sequence may not even result in the protein structure predicted. And that's a big part of what folding@home and fold.it are up to: improving predictions. There's some hope that maybe as we get better at predicting outcomes, we'll start being able to run the process in reverse, and actually design drugs, rather than randomly permute existing drugs until one meets our design criteria.

One way to think of it is that all the NSF money is being spent on big bioinformatics compute clusters, and mathematicians are left to beg for scraps. A single R820 is a drop in the bucket, and comes with some benefits: the problem domain is a lot less specialized. If you're trying to improve performance, the first thing you do is rule out your shitty custom software with complex calculations that non-researchers basically dare not change, and researchers cannot analyze or improve. Searching the Go gamestate space could be a reasonable middle ground for benchmarking.
posted by pwnguin at 3:40 PM on March 22, 2015 [2 favorites]


Forty-two
posted by brevator at 4:20 PM on March 22, 2015 [1 favorite]


You can rent compute nodes almost this big from Amazon or Google.
posted by ryanrs at 4:49 PM on March 22, 2015


I'm confused at 4 petabytes of disk io, is that just "we read and wrote 4pb of data total" or did they need 4pb of data all at once? Based on the specs I think the former in which case I'm confused why they bothered to note it.
posted by Skorgu at 5:00 PM on March 22, 2015


Having lived on the IAS campus, I'm going to guess this was probably not a totally legitimate use of computing resources, but they'll let folks get away with just about anything there as long as it keeps them working. To that end, they also have a ping pong table, movie nights, and a downright amazing cafeteria. It's like summer camp for scientists!
posted by Diagonalize at 5:14 PM on March 22, 2015 [1 favorite]


Of course that computing power could have been used for something more "useful"

This server retails for under $10k, which is about as cheap as "servers good enough for a real company" get. Back in my Intuit sysadmin days, these are the kind of things we'd buy by the pallet full. This isn't a supercomputer that could have been used to model nuclear weapons or hurricanes or something.

I'm confused at 4 petabytes of disk io, is that just "we read and wrote 4pb of data total" or did they need 4pb of data all at once? Based on the specs I think the former in which case I'm confused why they bothered to note it.

Calling out the machine model along with this makes me think the author isn't super technical, and therefore didn't know what was important. Or, they were technical, but they put those things in to "punch it up" for the layperson.
posted by sideshow at 6:18 PM on March 22, 2015 [2 favorites]


EC2 maxes out at 244 GB RAM per instance (the r3.8xlarge instance type). That costs about 25 cents per hour at the current spot market price. Assuming that 30 of those are as good as 13 512GB machines, it would cost $16-27k to run this computation.

Any random university supercomputing center probably has this many resources available for borrowing anyway, but of course they usually want proposals that involve finding life saving drugs or predicting global warming rather than solving a game.
posted by miyabo at 6:37 PM on March 22, 2015
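
The $16-27k range above can be reproduced with a few lines of arithmetic. This is a sketch using only the figures from the comment (30 instances, about 25 cents per instance-hour, 3-5 months); the 30-day month is an added assumption.

```python
INSTANCES = 30             # r3.8xlarge count assumed in the comment
SPOT_PRICE = 0.25          # dollars per instance-hour, as quoted
HOURS_PER_MONTH = 24 * 30  # assumption: 30-day months

def run_cost(months: int) -> float:
    """Total spot cost in dollars for running all instances for `months`."""
    return INSTANCES * SPOT_PRICE * HOURS_PER_MONTH * months

for months in (3, 5):
    print(f"{months} months: ${run_cost(months):,.0f}")
# 3 months: $16,200
# 5 months: $27,000
```

Spot prices fluctuate, so this reflects the quoted price only, not a general estimate.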


18x18? What a let down - the classic game is played on a 19x19 board. In particular, even numbers of squares result in less interesting games than odd numbered squares.

> ... except the number of legal moves on any size of Go board is always variable, because you're also not allowed to duplicate any previous board state.

Your reasoning is correct but your statement is technically not - or at least, it depends on the specific ko rule you pick. In many of them, the same board position would be allowed if the other player has the move the second time.

This is somewhat academic - I've played thousands of games and it's never come up - but it is somewhat more likely to come up in high-level play than with SDK (single-digit kyus) like me...
posted by lupus_yonderboy at 10:19 PM on March 22, 2015


I look down at my go board and in no way does it look like it could host many, many more combinations than there are atoms in the observable universe. Math is very strange sometimes. To an outsider it can feel like magic.
posted by no mind at 11:13 PM on March 22, 2015


They should do a Kickstarter for the hardware they need. The first crowdfunding campaign for a number.
posted by BiggerJ at 5:29 AM on March 23, 2015


I look down at my go board and in no way does it look like it could host many, many more combinations than there are atoms in the observable universe. Math is very strange sometimes. To an outsider it can feel like magic.


Try looking at an 80-amino acid chain, the permutations therein (20^80, excluding post translational "modifications"). This is a physical object, manufacturable by human technology, when folded up into a "ball", one could fit ~20,000 of them across a human hair. There are more permutations of this made from natural amino acids than the number of atoms in the universe.

If this boggles your mind, remember that the average length of a protein is ~450 amino acids, and there are a mere ~40,000 types of them in a human body. To suggest that this is a drop in the ocean is to vastly exaggerate ocean size.....
posted by lalochezia at 6:11 AM on March 23, 2015 [1 favorite]
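
The 20^80 comparison above is easy to check with exact integers. The sketch below assumes the commonly cited order-of-magnitude estimate of 10^80 atoms in the observable universe, which is not a figure from the thread.

```python
AMINO_ACIDS = 20
CHAIN_LENGTH = 80
ATOMS_IN_UNIVERSE = 10 ** 80  # assumption: common order-of-magnitude estimate

sequences = AMINO_ACIDS ** CHAIN_LENGTH
per_atom = sequences // ATOMS_IN_UNIVERSE

print(f"possible sequences: {sequences:.2e}")
print(f"sequences per atom: {per_atom:.2e}")
```

Since 20^80 = 2^80 x 10^80, the ratio works out to exactly 2^80 under that assumption, so there are vastly more sequences than atoms.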


Whhyyyyyyy
posted by Theta States at 7:21 AM on March 23, 2015


kafziel: "except the number of legal moves on any size of Go board is always variable"

The post is misquoting the article. It's actually the number of legal positions, not moves.
posted by mhum at 2:35 PM on March 23, 2015

