Statcast produces several terabytes of uncompressed data per game
December 18, 2017 4:47 PM

Statcast is an amazing tool created by Major League Baseball that provides immense amounts of game data and statistics that had never been previously available. So why are sabermetricians so concerned about it?
posted by Chrysostom (20 comments total) 13 users marked this as a favorite
 
No computer can truly track TOOTBLANs.
posted by delfin at 4:55 PM on December 18, 2017 [6 favorites]


but can it measure grit tho
posted by halation at 5:05 PM on December 18, 2017 [9 favorites]


One problem with route efficiency was that almost every single route fell within a narrow range of 90 to 100 percent, making it difficult to contextualize and show meaningful differences.
Right - if half the folks have to be better than average (the definition of average), and you're cutting the slices that thin, it's hard to have faith in that boiled-down number.
posted by k5.user at 5:40 PM on December 18, 2017 [1 favorite]


No computer can truly track TOOTBLANs.

Or the TOOTBLAN's kissing cousin, the FARTSLAM
posted by Hey Dean Yeager! at 5:45 PM on December 18, 2017 [5 favorites]


... if half the folks have to be better than average (the definition of average) ...

Ummmmm.... well ...
posted by Bovine Love at 5:49 PM on December 18, 2017 [5 favorites]


Reading comprehension fail. This is not the Zerg Rush I was looking for.
posted by I'm always feeling, Blue at 5:57 PM on December 18, 2017 [1 favorite]


Think of how stupid the average person is, and realize half of them are stupider than that.
George Carlin

This was a fun nerd read. I didn’t realize that there is an open source community for baseball statistics. And fielding independent pitching stats make so much sense.
posted by ActingTheGoat at 6:12 PM on December 18, 2017


This is the biggest, hairiest and most lint-filled navel ever gazed upon by human eyes.
posted by grumpybear69 at 6:23 PM on December 18, 2017 [10 favorites]


Hopefully with all this data we can finally find the Higgs Baseball.
posted by clawsoon at 6:25 PM on December 18, 2017 [16 favorites]


...how efficient a route an outfielder took to get to a ball, with 0 percent being the least efficient and 100 percent being the most.

So a ball hit directly to the outfielder and caught is 100%?
posted by thelonius at 6:28 PM on December 18, 2017


> TOOTBLAN's kissing cousin, the FARTSLAM

TIL about FARTSLAM. Although really it sounds like FARTALARM is the more useful statistic since the critical aspect is the fielder's sustained lapse of attention. FARTSLAM is just a FARTALARM that scores.
posted by ardgedee at 7:02 PM on December 18, 2017


Does no one remember Stratego? My brother and I played the baseball and the hockey versions. We both wanted Babe Ruth.
posted by Splunge at 7:55 PM on December 18, 2017


Just play 'em one game at a time, hope I can help the team... make statistically-significant improvements to various metrics.
posted by fifteen schnitzengruben is my limit at 8:11 PM on December 18, 2017 [8 favorites]


why aren't they computing the lunchpail-ness of each player per game?
posted by indubitable at 8:25 PM on December 18, 2017 [1 favorite]


So a ball hit directly to the outfielder and caught is 100%?

I guess? Although if it's a grounder or line drive, presumably they should be chasing towards it. Just guessing, it's intended to be a measure of how fielders chase the ball -- do they walk back only to have to rush forward because they overshot?
posted by pwnguin at 8:46 PM on December 18, 2017


Ultimate inside baseball.
posted by blue shadows at 8:47 PM on December 18, 2017


I hope this is intended by that Carlin quote, but ... isn't that mean?
posted by zinful at 9:08 PM on December 18, 2017 [3 favorites]


For something like half of the 2017 season, people thought that guys like Victor Martinez and Miguel Cabrera were hitting the snot out of the ball, but it just turned out that the Statcast system at Comerica Park was running really hot and inflating everyone's exit velocity readings. Eventually people smelled a rat and took a look at the home/road splits, but who knows how many other, more subtle errors are lurking that we can't see because MLB won't release the underlying data?

OK, so it's a lot of data, but we have broadband connections and cloud computing clusters -- spend some of your gadzillion dollars and make it available in a Spark cluster or something, if for no other reason, to take advantage of the army of hobbyists who will vet it for you and help you make it more useful. I mean, I know it's just data about baseball, but it's DATA. ABOUT BASEBALL!
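
For anyone curious what the home/road check mentioned above might look like, here is a minimal sketch. The function name, the input layout, and the sample numbers are all made up for illustration; none of them are real Statcast readings or an MLB data format.

```python
from statistics import mean


def home_road_split(readings):
    """Return mean home exit velocity minus mean road exit velocity (mph).

    `readings` is an iterable of (venue, exit_velocity_mph) pairs, where
    venue is "home" or "road". This layout is an assumption for the sketch.
    """
    home = [mph for venue, mph in readings if venue == "home"]
    road = [mph for venue, mph in readings if venue == "road"]
    if not home or not road:
        raise ValueError("need at least one home and one road reading")
    return mean(home) - mean(road)


# Made-up numbers for illustration only, not real Statcast data.
sample = [("home", 93.0), ("home", 95.0), ("road", 90.0), ("road", 92.0)]
gap = home_road_split(sample)
print(f"home minus road: {gap:+.1f} mph")
```

A gap that shows up for every hitter on the same team would point at the park's equipment rather than at the hitters, which is roughly how the comment says the problem was spotted.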
posted by tonycpsu at 9:22 PM on December 18, 2017 [7 favorites]


So a ball hit directly to the outfielder and caught is 100%?

Yes, but that's an edge case.

It measures whether the fielder ran a perfectly straight line from where they were standing when the ball was hit to where they caught it (or didn't catch it). The idea being that a fielder who is good at reading the hit initially and running efficient routes to it can run farther distances to make more catches. Not because they're faster, but because they are better at reading the trajectory of the ball basically from the moment it's hit, and running directly to where the ball is going to land.

This seems like a pretty narrow skill, and it's also affected by things out of a fielder's control. If you play in a domed stadium, your route efficiency might be higher because the winds aren't unpredictably moving the ball in flight. You might have near perfect route efficiency, but you're slow, so a faster runner with worse route efficiency might be better overall than you. This also doesn't take into account a fielder's vertical leap, or ability to dive and make catches. Or your surehandedness. Like, sometimes even pros have the ball bounce off their glove.
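
A rough sketch of the straight-line idea described above, assuming route efficiency is the direct distance from start to finish divided by the distance actually run. That ratio is one reading of the description, not Statcast's published formula, and the function names and coordinates are hypothetical.

```python
import math


def path_length(points):
    """Total distance run along a sequence of (x, y) positions, in feet."""
    return sum(math.dist(a, b) for a, b in zip(points, points[1:]))


def route_efficiency(points):
    """Direct start-to-end distance as a percentage of distance actually run.

    Assumption: a fielder who never moves (ball hit right at them) scores 100.
    """
    traveled = path_length(points)
    if traveled == 0:
        return 100.0
    direct = math.dist(points[0], points[-1])
    return 100.0 * direct / traveled


# Hypothetical routes to the same spot, 50 feet from the starting position.
straight = [(0.0, 0.0), (30.0, 40.0)]
detour = [(0.0, 0.0), (0.0, 40.0), (30.0, 40.0)]

print(round(route_efficiency(straight), 1))
print(round(route_efficiency(detour), 1))
```

Note that nothing here involves time, so a slow fielder with a perfect line still scores 100, which is the limitation pointed out above.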
posted by thenormshow at 7:16 AM on December 19, 2017 [1 favorite]


Like, sometimes even pros have the ball bounce off their glove.
...or head.
posted by skyscraper at 2:48 PM on December 19, 2017

