Predicting Hearthstone decks
October 23, 2016 3:08 AM

Google researcher Elie Bursztein leads the company's anti-abuse research team. He sometimes posts articles of great interest to game players and computer security people, such as a series on using machine learning to predict Hearthstone decks: Part 1 - Part 2 - Part 3. His list of publications leads to a wealth of interesting information for people of various technical inclinations!

While Blizzard's online card game Hearthstone allows for a huge variety of cards and decks, in practice a player's upcoming plays can often be predicted by observing the plays they have already made.

More detailed posts from Elie Bursztein on Hearthstone:

Appraising card values
Finding undervalued cards
Pricing special cards

He states that the techniques used should apply to any card collecting game.
posted by JHarris (18 comments total) 14 users marked this as a favorite
This is super fascinating stuff! At the same time, in a conversation about CCG stuff, I feel like I'd be remiss in not actively bringing up Codex, which I've found easiest to describe as "like a CCG except there are no booster packs, there are no outright bad cards, and your game plan is not set before you start playing the game." It is just shockingly good overall and I love it to bits. I hope that one day there's an online version with rules enforcement instead of just the physical card version.

I'm looking forward to reading all of these articles, though : 0
posted by DoctorFedora at 5:11 AM on October 23, 2016

This "appraising card values" writeup is especially cool. Great post!
posted by DoctorFedora at 5:20 AM on October 23, 2016

BRB putting Gatherer into R
posted by PMdixon at 5:55 AM on October 23, 2016 [2 favorites]

DoctorFedora, have you considered joining the Church of Dominion?
posted by JHarris at 6:40 AM on October 23, 2016

Elie Bursztein is also featured in this recent FPP about cheating at poker.
posted by chavenet at 6:47 AM on October 23, 2016 [1 favorite]

Yeah, a version of this post went up a bit earlier with his poker article as a second part, because I had missed the video post. (I like text articles better than video generally anyway.) There was still more than enough material for a complete post with that though, so here we are. (Thanks, taz!)
posted by JHarris at 6:52 AM on October 23, 2016 [1 favorite]

As Bursztein points out, this kind of statistical approach works a lot better if you have lots of data. In this case lots of Hearthstone game replays. Unfortunately Blizzard doesn't make it easy to get millions of game records. Most game companies don't, either because it's not their business plan or maybe to keep the data proprietary.

But Riot Games has done a great job making League of Legends data available. There's a bunch of sites out there that analyze millions of games a month. So far most of the statistical treatments at the big sites have been really simplistic: simple averages of win rates, etc. Some folks have experimented with more complex machine learning approaches, but so far no one seems to have found anything very interesting or surprising in the results. The big finding I trust is that champion pick doesn't matter nearly as much as the individual player's experience playing the champion they picked.
posted by Nelson at 7:11 AM on October 23, 2016 [1 favorite]

4. A card has an intrinsic value: Given that the number of cards drawn is constrained by the game, the simple fact of holding a card gives a player an advantage. This advantage is captured by adding an intrinsic value to each card that is constant.

Am I the only one bothered by the fact that he completely ignores this point in the example he presents, even though it's clearly a part of his model?
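For anyone who wants to see where that constant would sit, here's a minimal sketch. It assumes the model is linear (weighted attributes plus a fixed intrinsic term); the weights, the attribute names, and the example card are all made up for illustration and are not fitted values from the article.

```python
# Hypothetical per-unit weights; illustrative numbers only, not values
# taken from the article.
WEIGHTS = {"attack": 0.5, "health": 0.5, "charge": 1.0}
INTRINSIC = 0.5  # assumed constant added to every card, whatever its attributes


def model_cost(attributes, weights=WEIGHTS, intrinsic=INTRINSIC):
    """Mana cost implied by a linear model: weighted attributes plus a constant."""
    return intrinsic + sum(weights[name] * amount for name, amount in attributes.items())


def value_gap(attributes, mana_cost):
    """Positive when the model prices the card above its printed mana cost."""
    return model_cost(attributes) - mana_cost


# A made-up 3/2 minion with charge that costs 3 mana.
card = {"attack": 3, "health": 2, "charge": 1}
print(model_cost(card))
print(value_gap(card, 3))
```

Leaving the intrinsic term out of a worked example shifts every card's modeled cost by the same amount, so under this assumed form the ranking of cards by value gap wouldn't change, only the absolute numbers.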
posted by Zalzidrax at 9:50 AM on October 23, 2016


The data doesn't necessarily have to come from Blizzard. You can get Hearthstone data from other sources. There are deck trackers with enough data to get the information you need. A side effect is that serious players are the ones who take the time to use deck trackers, so you'd get better data.
posted by andryeevna at 11:10 AM on October 23, 2016

Right, and the article talks about using third party services to gather data. But relying on third party stuff makes it a lot harder. The amount of data is small compared to the global set of players. And last I checked none of the third party trackers could collect all data; the logs they use only record a subset of all game actions. And a third party introduces sampling bias. You frame this as positive ("serious players") but for many purposes it's a negative.
posted by Nelson at 11:54 AM on October 23, 2016


He states in the article that the data doesn't change much from 10,000-50,000 and up. So more data isn't necessary; the sample is already large enough.

I assume he did use deck tracker data because he specifically mentions getting data only for cards played, not for all cards drawn. That's how the data would be presented in a deck tracker, as you wouldn't get your opponent's unplayed cards.
posted by andryeevna at 12:15 PM on October 23, 2016

JHarris, there are definitely deckbuilding elements in Codex, except that it still plays like a CCG (with a giant private sideboard that you build your own deck from as you play) instead of like a puzzle to be solved effectively in parallel with the other people playing. (I do appreciate the innovation of Dominion, but don't particularly enjoy playing it compared to later deckbuilding games that feature meaningful player interaction). The Codex starter set is available for free as a print-and-play from the link above, and it comes just ever so recommended as a great combination of CCG, deckbuilding, and other clever new ideas (the patrol zone is a super brilliant innovation).
posted by DoctorFedora at 2:33 PM on October 23, 2016 [1 favorite]

Many of these tricks aren't something only a machine could do though. This is one step on the way to programming a good AI, and potentially a useful tool, but good CCG players will be using a very similar kind of logic as they play out their matches.

Start of high-level ranked game: Opponent is a warrior. Most common warrior is a control deck that uses enrage. I will be the aggressive player but try to limit my exposure to sweepers. Opponent passes turn 1. I do ~thing~. On turn 2, opponent plays Fiery War Axe and hits me in the face. This tells me that my first guess was wrong, and this is probably an aggressive 'Face Warrior' deck, and I should change my plan.
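That train of thought can be roughly sketched as a single Bayes update over deck archetypes. To be clear, this is not the method from the articles, and every number below (the prior and the play rates) is invented for illustration.

```python
def update_beliefs(prior, likelihoods, observed_play, floor=0.01):
    """One Bayes update: P(deck | play) is proportional to P(play | deck) * P(deck).

    `floor` is a small assumed probability for plays a deck has no entry for,
    so that a single unseen play never drives a deck's weight to exactly zero.
    """
    unnormalized = {
        deck: p * likelihoods[deck].get(observed_play, floor)
        for deck, p in prior.items()
    }
    total = sum(unnormalized.values())
    return {deck: weight / total for deck, weight in unnormalized.items()}


# Made-up prior: control is assumed to be the more common warrior deck.
prior = {"Control Warrior": 0.7, "Face Warrior": 0.3}

# Made-up rates for how often each deck makes this particular turn-2 play.
play = "Fiery War Axe to the face on turn 2"
play_rates = {
    "Control Warrior": {play: 0.1},
    "Face Warrior": {play: 0.8},
}

posterior = update_beliefs(prior, play_rates, play)
best_guess = max(posterior, key=posterior.get)
print(best_guess)
```

With these invented numbers the one observation is enough to flip the best guess from the control deck to the face deck, which is the same change of plan described above.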
posted by Urtylug at 2:46 PM on October 23, 2016

So from my skimming of the Hearthstone posts, I have an alternative problematization: the machine "learning" used is successful because it exploits a symmetry/observation in Hearthstone that's also well known in superficially diverse games, if you already play World of Warcraft, Starcraft, Dota2, etc. One property they have in common is that the combinatorial complexity of talent choices, build choices and build orderings is formally large, yet the herd (player base) gravitates toward a tiny subset of such builds. You see this kind of convergence in Heroes of the Storm, in WoW PvP talents, and so on; it's a pattern. It's even a colloquial concept; on blogs and forums you'll see discussion but also meta-controversy around the use of so-called "cookie-cutter" builds.

So what I'm saying is what the Google thing seems to have done is exploit through scaling something that game players already know and put into practice. It's a kind of automation as opposed to some very arcane technique involving deep machine learning, etc. That's my immediate intuition about this--and it moreover informs Blizzard's attitude about this kind of automation being gamebreaking, not objectively or absolutely so, but rather having consequences within a particular context of game paying practices.
posted by polymodus at 3:53 PM on October 23, 2016

Meant to say game-playing, but paying is kind of apt in a different sense
posted by polymodus at 4:00 PM on October 23, 2016 [1 favorite]

As someone who plays Hearthstone as much as possible during waking hours and falls asleep watching twitch streams of it at night, thank you for this post.
posted by hypersloth at 7:51 PM on October 23, 2016 [2 favorites]

It's a bit of a derail perhaps, but I want to second the praise of Codex. Sirlin may be a bit obnoxious at times, but he's an amazing game designer and Codex is probably the best thing he's done since Yomi. It's well worth checking out the free print-and-play starter set if it sounds like something you might like.
posted by Proofs and Refutations at 6:45 AM on October 24, 2016 [1 favorite]

Yeah, sorry about the Codex derail. It's just such an amazingly good game that I want to shout it from the rooftops at every opportunity.
posted by DoctorFedora at 3:04 PM on October 24, 2016
