The other day, thanks to a baseball group on Facebook, I became acquainted with the article at this link, titled "Analysis of 4 Million Pitches Reveals Umps Really Do Suck at Calling Pitches".
From the headline alone, it appears that this is about as comprehensive as it can get -- 4 million pitches! The data covered the eleven major league seasons from 2008-2018, and even a skeptic like me can see the theoretical likelihood that looking at that many pitches from that many seasons ought to yield something resembling the truth.
I'll give the folks at Statcast and Pitch f/x credit for establishing themselves as the most ambitious in trying to quantify some of the more elusive aspects of baseball. That they analyzed 4 million pitches is an achievement in itself. Let's give them further credit for sharing this data with the public and with groups like the Boston University students who put together this article, written by Mark T. Williams. And I'll grant one more assumption: that the analysis of the data showed something.
Where I part from this article is in the conclusions it presents. The headline is certainly stark in condemning umpires as a group, and the main point is baldly stated: "MLB home plate umpires make incorrect calls at least 20% of the time--one in every five calls. In the 2018 season, MLB umpires made 34,246 incorrect ball and strike calls for an average of 14 per game, or 1.6 per inning."
Some poor schmuck posted a comment complaining that if 14 missed calls represented 20% of pitches, that would mean just 70 pitches in a game, which is absurd. His misconception was corrected when it was pointed out that the issue was not the total number of pitches, but rather the number for which the umpire made a ball or strike call. "Dope slap to me," noted the poster, feeling chastened for neglecting this basic of baseball.
But wait a second. Is 70 the correct number for how many pitches are called by the plate umpire? The article included a graph showing the data analysis--for a game from 2010, when strikeouts and walks were less common than today. No matter. Although the data points are difficult to count exactly, I tallied 143 pitches called by the umpire, twice what the 14=20% formula suggests. Of those 143 pitches, the umpire, Dale Scott, missed 21, or 50% above the average. Yet 21 of 143 equals 14.7%.
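For anyone who wants to check that arithmetic, here is a minimal sketch in Python. The numbers are the ones quoted above (the article's 14 misses and 20% claim, plus my approximate tally from its graph); the variable names are just illustrative.

```python
# Figures quoted in the article, plus my own tally from its graph.
missed_per_game = 14      # article's 2018 average of missed calls
claimed_error_pct = 20    # article's "at least 20%" claim

# How many called pitches per game would make 14 misses equal 20%?
implied_called_pitches = missed_per_game * 100 / claimed_error_pct
print(implied_called_pitches)  # 70.0

# The Dale Scott game shown in the article's graph.
scott_called = 143   # my tally of called pitches (approximate count)
scott_missed = 21    # my tally of missed calls

scott_rate = scott_missed / scott_called
print(f"{scott_rate:.1%}")             # 14.7%
print(scott_missed / missed_per_game)  # 1.5, i.e. 50% above the average
```

So even in a game where the umpire missed half again as many calls as the claimed average, the miss rate over called pitches stays well short of 20%.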
Clearly, something is very wrong in the calculations of Mark T. Williams and his cohorts at Boston University. I don't know whether this reflects an inherent error in the Statcast and Pitch f/x approach or merely its misapplication, but I decided to play with some more numbers to see which version of the Dale Scott calculation resembles the norm more closely.
I looked at two sets of games--the 15 games played this past Sunday, April 21, along with the 33 games contested in the 2018 post-season. One of those, the 18-inning World Series Game 3, counted as two games for me, and with other extra-inning games included, I looked at the equivalent of 50 nine-inning games.
For each game, I counted pitches, strikes, balls, strikeouts, walks, and runs. The data was extremely consistent throughout, and almost any group of ten games chosen at random produces the same conclusions. I'm not going to post the data from each game, but here are the overall per-game figures:
The first thing that jumps out at me is that there are, on average, 104 called balls per game. If umpires average 14 wrong calls, that would be 13.5%, not "at least 20% of the time." I'll add that in the 49 games I tracked, only three times were there fewer than 80 called balls. The lowest total was 72, in a game where Clayton Kershaw allowed two hits in eight innings.
In other words, not once in this span of 49 (or 48, if you prefer) games would 14 mistakes have constituted as much as 20% of the called balls alone.
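The same check for called balls alone, again as a rough sketch using only the figures above (the variable names are mine):

```python
missed_per_game = 14       # the article's 2018 average of missed calls
avg_called_balls = 104     # my per-game average of called balls
fewest_called_balls = 72   # the lowest total, in the Kershaw game

# Miss rate if every one of the 14 misses came on a called ball.
rate_vs_average = missed_per_game / avg_called_balls
rate_vs_fewest = missed_per_game / fewest_called_balls

print(f"{rate_vs_average:.1%}")  # 13.5%
print(f"{rate_vs_fewest:.1%}")   # 19.4%
```

Even the most favorable game for the article's claim falls just short of 20%, and that is before a single called strike is added to the denominator.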
Of course, my readers already know the catch with that statement. What about called strikes? With an average of more than 18 strikeouts a game, surely there are quite a few called strikes. In the graph of the Dale Scott game from 2010, I counted 56 called strikes. MLB doesn't provide data on called strikes vs. swinging strikes and swings that put the ball into play. So the strike totals provided in today's boxscore essentially mean "any pitch that wasn't a called ball," which leaves just bits and pieces of data to play with.
I have seen another article which says that a little more than half of all pitches result in a called ball or strike, with just under half resulting in swings. With an average of 289 pitches, that would mean 145 pitches requiring the judgment of the umpire. That figure might even be a little low, but suffice it to say an umpire missing 14 pitches out of 145 is in error "less than 10%" of the time, not "at least 20%".
Note that if I count the actual number of games I looked at, 48, the averages look a little better for the umpires. For 48 games, the pitches averaged 301, with 192 strikes and 109 balls. If it's correct that a little more than half of all pitches are called by the umpire, that would be 151 pitches, making 14 mistakes 9.3%. Add a few more calls, and we're under 9%. We can have an entirely separate discussion of whether missing one out of 11 calls is unacceptable, but compared with at least one in 5, it doesn't sound like such a crisis. Yet on the Facebook page where I saw the article posted, nobody seemed to question its gospel, I suppose because it reinforces what they want to believe.
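To see how much the result depends on the "a little more than half" figure, here is a small sketch that recomputes the miss rate for both pitch averages. The 50% share is a stand-in for "a little more than half," and the 55% share is purely my own what-if, not a number from either article; the function name is illustrative.

```python
MISSED_PER_GAME = 14  # the article's 2018 average of missed calls

def error_rate(total_pitches, called_share):
    """Missed calls as a fraction of the pitches the umpire actually called."""
    called_pitches = total_pitches * called_share
    return MISSED_PER_GAME / called_pitches

# 289 = my average per nine innings; 301 = my average per actual game.
# 0.50 approximates "a little more than half"; 0.55 is a hypothetical what-if.
for label, pitches in [("per nine innings", 289), ("per actual game", 301)]:
    for share in (0.50, 0.55):
        rate = error_rate(pitches, share)
        print(f"{label}, {share:.0%} of pitches called: {rate:.1%}")
```

Every combination comes out below 10%, so the conclusion does not hinge on the exact share of pitches that are called.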
My "study" was the kind that I could do in one evening while watching a ballgame. Possibly I'm missing something, missing the key factor that renders their arithmetic flawless and mine flawed. Measured against its own criteria and premise, however, the Williams article seems to prove the opposite of its contention.