05-18-2014, 01:56 PM #21
If projections and analysis are only useful when they're 100% accurate, someone should let Parker and every other baseball commentator know they're wasting their time. It's a rather unfair bar to hold anything to.
The funny thing is we hear these kinds of objections when there is pessimism about the favored player/club. But if someone were to post the same projections that say we can expect better from Nolasco, Mauer, and probably Perkins? Well, hell, regression to the mean is awesome then!
05-19-2014, 12:02 AM #22
The best explanation for why projection models can't reflect reality in anything but the aggregate comes from an analogy to financial modeling, and the best explanation for why financial models fail comes from Nassim Taleb's work, which I'd recommend even if you aren't interested in finance because this kind of thinking applies almost everywhere. I'm partial to "Fooled by Randomness" because the book blew my hair back when I first read it, but "The Black Swan" is quite good too.
Basically, one of his core arguments is that, even where a bell curve roughly describes the data, the events out in the "tails," the ones that look so statistically improbable as to be virtually impossible, are not impossible at all. If a model treats them as impossible, it won't just fail, it will fail catastrophically, with cascading effects everywhere else. When extreme events turn up more often than the bell curve says they should, the distribution is colloquially said to have "fat tails." We got an object lesson in fat tails when mortgage-backed securities, which were so "safe" as to be considered statistically virtually riskless, blew up and took down the market. Baseball reveals its fat tails again and again (remember the old saying "that's why they play the games"?), and as one poster previously mentioned, I would consider the Twins' 1991 season to be a resident of a pretty darn fat part of a fat tail.
TL;DR, the models may be able to accurately tell you what will happen in the aggregate, but there will be outliers each and every year that make you stand back, scratch your head, and ask "what the actual ****?"
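For anyone who wants rough numbers on the fat-tail point, here's a minimal sketch. It is only an illustration under my own assumptions: I'm using a Student-t distribution with 3 degrees of freedom (rescaled to a standard deviation of 1) as a stand-in for "fat-tailed," which is not anything Taleb or the projection systems specifically use, and the function names are made up.

```python
import math

def normal_tail(k):
    """P(Z > k) for a standard normal, i.e. the plain bell curve."""
    return 0.5 * math.erfc(k / math.sqrt(2))

def fat_tail(k):
    """P(X > k) for a Student-t with 3 degrees of freedom, rescaled so its
    standard deviation is 1 (assumed stand-in for a fat-tailed model).
    Uses the closed-form CDF that exists for the 3-degrees-of-freedom case."""
    return 0.5 - (k / (1 + k * k) + math.atan(k)) / math.pi

# Compare how likely a "k standard deviations above average" event is.
for k in (2, 3, 4, 5):
    ratio = fat_tail(k) / normal_tail(k)
    print(f"{k} SD: bell curve {normal_tail(k):.2e}, "
          f"fat-tailed {fat_tail(k):.2e}, ratio {ratio:,.1f}x")
```

Out around 2 SD the two are close, but by 5 SD the fat-tailed version treats the event as thousands of times more likely than the bell curve does. That's the gap a model falls into when it assumes the tails are thin.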
05-19-2014, 09:24 AM #23
Somewhat back on topic:
Don't know PECOTA, but the author of ZiPS has found that there is less in-season regression than season-to-season. Still small samples this year, but that bodes well for a few Twins (and not so well for a few others). Fun full article here:
Relevant excerpt (Lesson #8):
Simply put, there was significantly less regression toward the mean for in-season stats than you would expect from the sample size, relative to season-to-season stats.
One notable example was BABIP, in that the BABIP overperformance, in the context of in-season projections, tended to stick more than one would expect from the heavier regression from season-to-season. That .400 first-half BABIP may be doomed next year, but players retain a surprisingly large amount of that bounce within the same season.
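(End of excerpt.) To make "less regression" concrete, here's a rough sketch of the usual way to regress a stat toward the mean: blend the observed number with the league average, weighted by sample size. Everything specific here is my assumption for illustration, not ZiPS's actual method: the function name, the roughly .300 league BABIP, and both regression constants `k` are hypothetical.

```python
def regress_to_mean(observed, n, league_mean, k):
    """Shrink an observed rate toward the league mean.

    n is the sample size (balls in play here); k is a hypothetical
    regression constant. A bigger k pulls harder toward the mean.
    """
    return (n * observed + k * league_mean) / (n + k)

LEAGUE_BABIP = 0.300          # assumed round number for league average
observed, balls_in_play = 0.400, 250   # made-up hot first half

# Hypothetical constants: heavier regression for next season,
# lighter regression for the rest of this season.
next_season = regress_to_mean(observed, balls_in_play, LEAGUE_BABIP, k=800)
rest_of_season = regress_to_mean(observed, balls_in_play, LEAGUE_BABIP, k=400)

print(f"next-season estimate:    {next_season:.3f}")
print(f"rest-of-season estimate: {rest_of_season:.3f}")
```

With the smaller constant, more of the hot start survives in the estimate, which is the shape of the finding in the excerpt: the in-season estimate stays further from the league mean than the season-to-season one.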