Added Note (June 9, 2010): It has been brought to my attention that the disputed Wikipedia entry has been changed to a more accurate representation. I figure I’ll just leave this as part of my website as a safeguard against it happening again, and also because it has a certain educational value.
Wikipedia Twisting the Truth – Voros McCracken Entry
First, let me be clear that this is not a criticism of Voros McCracken, whom I do not know and have never corresponded with. I have no reason to believe he has anything to do with what is posted for his entry on Wikipedia.
McCracken became relatively famous in the sabermetric community in January of 2001, when he had an article published in the Baseball Prospectus about a study he had done that led him to conclude:
“There is little if any difference among major-league pitchers in their ability to prevent hits on balls hit in the field of play.”
And in the closing sentences of his article he added:
“The critical thing to understand is that major-league pitchers don’t appear to have the ability to prevent hits on balls in play.”
For reasons I’ve never fully understood, this was seen as earth-shattering news in the field, perhaps because it so overstated the truth of the matter. But for better or worse, McCracken became the guy who really got people to understand that most folks overestimate the pitcher’s ability to influence the number of hits that result from balls batted into the field of play. That’s quite fair and deserving of recognition.
Unfortunately the Wikipedia story is told in such a way that it neatly overlooks that McCracken’s study was not particularly well designed or interpreted, leading to his heavily overstated conclusion. The Wikipedia entry goes on to paint a picture that what he had found was radical news that “celebrity sabermetricians” found hard to accept, and then suggests it was eventually proven true and is accepted today.
One of those bull-headed “celebrity sabermetricians” is identified as me, and Wikipedia does a quick quote from an ESPN column by Rob Neyer that says I was “distinctly uncomfortable with McCracken’s conclusion.”
If you read the actual Neyer column, you find I was fully supportive of the important kernel of truth in McCracken’s work, which had already been borne out in my own research on the subject. What I was uncomfortable with was his overstated conclusions, which in fact were proven by subsequent independent research to be so overstated as to be wrong. The next year McCracken wrote again on the subject in the Baseball Primer, and this time he softened his conclusion in the direction of my findings, although frankly I still feel he resists admitting the actual degree of impact that pitchers have on hits per ball in play.
Before I cite the studies proving McCracken’s initial conclusion was wrong, and the studies that were able to duplicate the more reserved findings in my own study, let me provide my whole unedited letter which also appeared in its entirety in Rob Neyer’s column of
Like McCracken, I've studied hits allowed per ball in play (though with the small difference that I subtract sacrifice hits from the balls in play).
I agree that this type of hit rate is not as heavily influenced by the pitcher as is commonly believed, but at the same time I am distinctly uncomfortable with McCracken's conclusion: "There is little if any difference among major-league pitchers in their ability to prevent hits on balls hit in the field of play."
My research uses the pitcher data from 1987 to 2000, adjusting each line to the relevant league rates transposed into an average season. I then studied the pitchers with at least 2000 balls in play (249 pitchers met this standard).
Under McCracken's conclusion, there should be little to no difference in the hit rate between Greg and Mike Maddux, but in these huge samples you still find a 16-point gap (.283 vs .299). And I can add that the 2B+3B rate is 24 percent higher for Mike than Greg.
Guys like Pedro Martinez (.271) and Nolan Ryan (.276) are roughly 50 points better than guys like Dave Weathers and Sean Bergman.
Sid Fernandez was the toughest to hit with an adjusted hit rate of .253 versus the norm of .293. Is that chance? (For those who don't know, even though Fernandez had well above average strikeout rates, his velocity was actually slightly below average. Due to an unusual physiology, his natural delivery put his release point in front of his uniform top, making the ball tough to judge and batters hit an unusually high number of harmless flyballs and pop-ups.)
It is damn hard to hit a knuckleball on the nose, and all five knuckleballers in the study ranked in the top 25 percent of the toughest to hit on a ball in play. Is that chance? In fact, when you combine the data on Hough, Springer,
We know pitchers influence ground-ball rates, and hit rates are higher on ground balls. Break the 249 pitchers into three groups based on their adjusted ground-ball rates, and the fly-ball group is 7.6 points lower than the middle group, and 12 points lower than the ground-ball group.
(I also note that the vast majority of the extra hits against the groundball pitchers are singles. The ground-ball pitchers are allowing fewer 2B+3B than the fly-ball pitchers.)
I think pitchers are successful or not for a variety of reasons, and some reasons apply more to one group than another. Some have additional emphasis on control; some emphasize strikeouts; some groundballs, and yes, some emphasize pitches that are tough to hit sharply enough to get your share of hits on balls in play. I don't think the latter is a primary way for pitchers to distinguish themselves from others, but I do believe it is a more significant factor for some groups than others.
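The “Is that chance?” question the letter asks about Sid Fernandez can be given a rough back-of-the-envelope check. The sketch below is only an illustration under stated assumptions: it treats every ball in play as an independent trial with the .293 norm as its hit probability, and it assumes exactly 2000 balls in play, the minimum for inclusion in the study, since Fernandez’s actual total is not given here. The function name is hypothetical. It also ignores parks, defenses, and the fact that Fernandez is the most extreme of 249 pitchers, so it is not a substitute for the studies cited below.

```python
import math


def hit_rate_z_score(observed_rate, norm_rate, balls_in_play):
    """Standard errors between an observed hit rate and the norm,
    under a simple binomial model (an assumption, not the study's method)."""
    standard_error = math.sqrt(norm_rate * (1 - norm_rate) / balls_in_play)
    return (observed_rate - norm_rate) / standard_error


# .253 and .293 come from the letter; 2000 is the study's minimum
# sample, used here as an assumed (conservative) stand-in.
z = hit_rate_z_score(0.253, 0.293, 2000)
print(round(z, 2))
```

Under those assumptions the gap comes out a little under four standard errors below the norm, which is hard to write off as luck alone even at the minimum sample size.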
Several researchers duplicated McCracken’s work and found, as I did, that while the impact of pitchers on these batted balls was not great, and certainly less than commonly believed, pitchers did influence these hit rates to a point of statistical significance.
One of the best studies was Phil Birnbaum’s article “Factors Affecting Pitcher Ball-in-Play Average” for the SABR Statistical Analysis Committee. He demonstrated that a pitcher’s hits per ball in play (HPBP) was not random by showing that its difference relative to the norm was predictive of the pitcher’s HPBP in the next season. While the predicted element was not large, about 1/7th of the impact from the season before, the sample size (all pitchers 1979-90 with 300+ balls in play) was so immense the result was “statistically very significant: the probability that this would happen by chance is zero.”
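One way to read the “about 1/7th” figure is as a regression-to-the-mean rule: a pitcher is expected to keep roughly one seventh of the prior season’s gap from the norm. The sketch below assumes that simple linear reading. The function name and the sample figures are illustrative only and are not taken from Birnbaum’s article.

```python
def expected_next_hpbp(prior_hpbp, league_hpbp, carry_over=1 / 7):
    """Project next season's hits per ball in play (HPBP), assuming a
    fixed share of last season's deviation from the league norm persists."""
    return league_hpbp + carry_over * (prior_hpbp - league_hpbp)


# Hypothetical pitcher: 20 points better than a .293 norm last season.
projected = expected_next_hpbp(0.273, 0.293)
print(f"{projected:.3f}")
```

On this reading a pitcher who beat the norm by 20 points would be projected to beat it by only about 3 points the next year. That is a small edge, but a real one.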
In the same article Birnbaum chose to test my observation that, as a class, flyball pitchers tended to allow fewer hits per ball in the field of play. He found that my observation was accurate, and he also determined that pitchers with high strikeout rates formed a second class that did well in preventing hits on balls in play.
Incidentally, if you ever want to see a quote hacked to the reverse of its meaning, in that article Birnbaum appears to try to set me greatly at odds with McCracken by quoting from my letter to Neyer this way: “Wright suggested that batting average on balls in play ‘is a primary way for pitchers to distinguish themselves from others …’ ”
Now look at the actual last sentence in the original letter. By leaving off the “I don’t think” beginning of my sentence he has me saying exactly the opposite of what I originally wrote!
Moving on, Tom Tippett of Diamond Mind and Clifford Blau of the SABR Statistical Analysis Committee did separate expanded studies of knuckleball pitchers, and both confirmed my finding that knuckleballers were unusually effective in preventing hits per ball in play, something I am told that McCracken also acknowledged to be true in his own later studies.
A few other studies on this topic that I have not read have been brought to my attention: Keith Woolner, who now works with the Cleveland Indians, did a similar study to Birnbaum’s with similar results. Eric Van, working as an analyst for the Red Sox, did a study of pitchers switching teams and found the pitchers accounted for 9% of the variance in their HPBP. In another study Van showed that the level of a pitcher’s control also had a predictive value in regard to the pitcher’s HPBP.
The letter I wrote about my own research on this topic, and how it related to the conclusions in McCracken’s original article, has been proven by the independent research of others to be right on the nose. My letter gave a much more accurate view of the actual relationships involved than did McCracken’s original article. Yet today that letter is portrayed in an incredibly misleading light in the Wikipedia entry on McCracken, to enhance his listing. Again, this is not a criticism of McCracken; I don’t know who is behind it. I am setting the record straight, and also saying it is wrong to twist things away from the truth to build up someone’s credentials, and it is all the more distasteful when it is done unfairly at the expense of another.
In closing, let me also correct any notion that I claim that my study was the first research on this topic. I’ve never said that, and I have no idea whether that is true or not. I don’t know when McCracken, or someone else, first studied and realized this. I do know that McCracken is the guy who brought it into the public domain, and that’s what counts and what should count.
One writer has suggested my study was done under a “cloak of confidentiality” while working with the big league teams. That is not quite the case. It is true that after doing The Diamond Appraised, I refrained during my career from introducing into the public domain any of my work that could put me and my clients at a competitive disadvantage. But that was my call. This study was not done at the direction of any team, but simply for my own general understanding to help me in my evaluation of players. There was no obstacle to my publishing it other than my own choices. But frankly, even if I had felt no personal reservations about publishing it, I can’t imagine I would have. I didn’t think it was that important as a general principle because I already had a pretty good notion of the truth of the matter. I was very much surprised to find that was not the common understanding. It turned out there was a real need for this point to be made in the public domain, and McCracken did it. That was a real service to the sabermetric community, and he fully deserves the credit for it.