These metrics may be nothing more than the product of human beings playing baseball, but each number has a story to tell. Between two players, I can predict that the player with the higher AVG will produce more runs and I’ll be right most of the time. OBP and SLG have more to say in different ways, and I’d listen to these guys over AVG. A player who has an OBP of .285 is an out machine. A player with a SLG greater than .500 has a lot of pop in his bat; and the larger the difference between SLG and AVG, the greater the power. A guy with an OPS of .975, well he’s a stud. Some other stats we hear about contain much less information. RISP AVG, runs, and RBI get a lot of attention, but they’re just fluff.
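For readers who want to see how these rate stats fit together, here is a minimal sketch using the standard definitions of AVG, OBP, SLG, and OPS. The function name `rate_stats` and the season line fed into it are invented for illustration; they do not describe any real player.

```python
def rate_stats(ab, h, doubles, triples, hr, bb, hbp, sf):
    """Return AVG, OBP, SLG, OPS, and the SLG-AVG gap from counting stats."""
    singles = h - doubles - triples - hr
    total_bases = singles + 2 * doubles + 3 * triples + 4 * hr
    avg = h / ab                                  # hits per at-bat
    obp = (h + bb + hbp) / (ab + bb + hbp + sf)   # times on base per plate chance
    slg = total_bases / ab                        # total bases per at-bat
    return {
        "AVG": avg,
        "OBP": obp,
        "SLG": slg,
        "OPS": obp + slg,       # on-base plus slugging
        "SLG-AVG": slg - avg,   # the power gap described in the text
    }


# Hypothetical season line, invented for illustration only.
line = rate_stats(ab=500, h=150, doubles=30, triples=2, hr=25, bb=60, hbp=5, sf=5)
for name, value in line.items():
    print(f"{name}: {value:.3f}")
```

The gap between SLG and AVG is zero for a hitter who only singles, so it grows only with extra-base hits, which is why it reads as a measure of power.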
The Pitcher Puzzle
Evaluating individual player contributions on defense in baseball is a tricky thing, compared to the offensive side of the game. Hitters face a defense alone, posting hitting statistics largely independent of teammates. Batters with good hitting statistics can be credited with producing runs for their teams, and players with better statistics obviously deserve more credit than those with worse statistics. A general manager can be reasonably confident that a batting order composed of batters with good statistics will produce more runs than one with poor statistics. Although some teammate spillovers may occur from batter to batter, the spillovers on defense are much more problematic.
Preventing runs is the joint responsibility of the pitcher and his fielders, and this shared production makes judging each party’s contribution complicated, though the difference from offense is not widely recognized. At the end of a season, game, or even an inning, we can view the jointly produced results, but we may be unsure of how much each party contributed. To evaluate pitchers, most analysts and fans use the earned run average (ERA) as a measure of run prevention. However, ERA differs from offensive statistics because it is jointly produced. A pitcher with a good defense may owe a substantial part of an excellent ERA to his fielders.
Adding to the confusion is the fact that the metrics we have available to measure defense are quite poor. Errors are the most commonly used statistic for judging fielders, and they make up the third line in the box score after runs and hits. But the difference between an error and a hit is a highly subjective decision, which an official game scorekeeper determines. Should a particular ball put in play by the batter have been converted into an out by the fielder? Scorekeepers have to believe their eyes, but their eyes can deceive them. A line drive to the gap that bounces off a fast outfielder’s glove may be scored as an error, while that same ball played by a slow outfielder, who doesn’t come within ten feet of the ball, may be scored a hit.
Additionally, a pitcher may bear some responsibility for his batting average on balls in play (BABIP). After all, he did allow the ball to be put in play, and the type of ball hit may affect the probability that a hit or an error occurs. High fly balls result in fewer hits and errors, since fielders have plenty of time to get under them. Ground balls and line drives travel at fast speeds, making them much harder to field.
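The text does not spell out how BABIP is calculated, so the sketch below assumes the commonly used definition: hits other than home runs, divided by at-bats that ended with a ball in play (plus sacrifice flies). The function name `babip` and the season totals are hypothetical.

```python
def babip(h, hr, ab, k, sf=0):
    """Batting average on balls in play allowed by a pitcher.

    Assumes the common definition (H - HR) / (AB - K - HR + SF):
    home runs and strikeouts are removed because no fielder can
    make a play on them.
    """
    balls_in_play = ab - k - hr + sf
    if balls_in_play <= 0:
        raise ValueError("no balls in play to evaluate")
    return (h - hr) / balls_in_play


# Hypothetical pitcher season, invented for illustration only.
example = babip(h=180, hr=20, ab=750, k=150, sf=5)
print(f"BABIP: {example:.3f}")
```

Notice that everything left in the numerator and denominator passed through a fielder's territory, which is exactly why the statistic mixes pitcher and defense.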
While the responsibility for balls in play is a bit murky, separating the player contributions for some events is easy. When a pitcher throws the ball to the catcher, several outcomes may follow. The most common outcomes are a strike, a ball, a ball in play, or a home run; I’ll exclude rare events such as hit batters, balks, and catcher interference. Of these events, three do not involve a fielder other than the catcher. From the defensive end, balls, strikes, and home runs are solely the responsibility of the pitcher. Whether you have Ozzie Smith or John Kruk at shortstop, it doesn’t matter one way or the other when one of these events occurs. Balls and strikes pass harmlessly to the catcher, and home runs (excepting the ultra-rare inside-the-park home run) pass over the helpless fielders below. Only the hit ball in play requires that the pitcher receive help from his fielders. His fielders may convert the balls in play into outs, or the balls may fall in for hits.
Wouldn’t it be smart to start an evaluation of pitchers where we can determine responsibility? Well, luckily someone had the idea to do this. Voros McCracken (a former paralegal who, while looking for new ways to win his rotisserie baseball league, ended up earning a World Series ring in the front office of the Boston Red Sox) is generally credited with being the first person to do this type of analysis. McCracken, who plays a minor hero in Moneyball, developed a new metric known as DIPS ERA. DIPS stands for defense-independent pitching statistics. McCracken noticed an interesting phenomenon among major-league pitchers: ERAs are not very predictable from year to year. Sure, you can expect Randy Johnson and Roger Clemens to have better ERAs than a team’s typical fifth starter, but even these models of consistency have had ERAs that fluctuate quite a bit. This observation was the key to assigning responsibility for the prevention of runs.
Table 26 shows that Johnson and Clemens have been very good over their careers, but they have had up and down seasons. The last column contains the standard deviation of their ERAs, which quantifies the average yearly difference from each player’s career average in terms of earned runs. This difference is just under a full earned run per game for each pitcher, which is quite a large margin. If these consistent superstars suffer from fluctuations in their performances, then how does the rest of the league fare? Well, for pitchers who averaged one hundred or more innings over twenty-five seasons from 1980 through 2004, the standard deviation was about 0.9 earned runs. With the average ERA for these pitchers at about 4.04, this means an average pitcher’s ERA is expected to fluctuate by about 22 percent, again a sizable fluctuation. There is a good chance that an average pitcher will post an ERA of somewhere between 3.00 and 5.00, which means it’s quite easy for an average pitcher to look very good or very bad.
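The arithmetic behind the 22 percent figure can be checked in a few lines. The league numbers (0.9 and 4.04) come from the paragraph above; the list of individual ERAs is made up purely to show the calculation, and the use of the sample standard deviation is an assumption, since the text does not say which version was used.

```python
from statistics import mean, stdev


def relative_fluctuation(sd, average):
    """Standard deviation expressed as a share of the average."""
    return sd / average


# League figures quoted in the text.
league_sd, league_avg = 0.9, 4.04
league = relative_fluctuation(league_sd, league_avg)
low, high = league_avg - league_sd, league_avg + league_sd
print(f"relative fluctuation: {league:.1%}")
print(f"one standard deviation band: {low:.2f} to {high:.2f}")

# Hypothetical ERA seasons for one pitcher, invented for illustration only.
eras = [3.20, 4.10, 2.85, 3.90, 4.60]
# Assumption: sample standard deviation (n - 1 in the denominator).
print(f"career mean {mean(eras):.2f}, yearly sd {stdev(eras):.2f}")
```

One standard deviation on either side of 4.04 runs from about 3.14 to 4.94, which is roughly the 3.00 to 5.00 range mentioned above.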
If these statistics fluctuate quite a bit, how are we to know how much of a pitcher’s performance is due to skill and how much is a product of luck? Skill should persist over time, while luck should not. If we observe a pitcher performing similarly from year to year in certain areas, then it is likely that he has some control over this area. By looking at different areas of performances over time, we should be able to find metrics that correlate from year to year. Those metrics that remain similar from year to year—past good (bad) performance begets future good (bad) performance—are metrics that capture pitcher skill. Those with little relationship over time are probably just capturing luck.
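A small simulation can illustrate this logic. It is only a sketch under stated assumptions: each pitcher has a fixed "skill" value, and each season's observed result is that skill plus random noise. The function names, the noise levels, and the normal distributions are all choices made for illustration, not anything taken from the data discussed here.

```python
import random


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def simulate(noise_sd, n_pitchers=5000, seed=0):
    """Correlate two simulated seasons of (fixed skill + fresh luck)."""
    rng = random.Random(seed)
    skill = [rng.gauss(0, 1) for _ in range(n_pitchers)]
    year1 = [s + rng.gauss(0, noise_sd) for s in skill]
    year2 = [s + rng.gauss(0, noise_sd) for s in skill]
    return pearson_r(year1, year2)


skill_driven = simulate(noise_sd=0.5)  # luck is small relative to skill
luck_driven = simulate(noise_sd=3.0)   # luck swamps skill
print(f"skill-driven metric: r = {skill_driven:.2f}")
print(f"luck-driven metric:  r = {luck_driven:.2f}")
```

The skill is identical in both runs; only the amount of luck changes. The metric dominated by luck shows almost no year-to-year relationship even though real skill is hiding underneath it.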
Figure 13 plots the relationship between pitcher ERAs in a current season and the previous season for every season in which the pitcher threw more than one hundred innings from 1980 to 2004. The ERAs discussed in this chapter are corrected for the typical influences of their home parks. Pitchers who play in hitter-friendly and pitcher-friendly parks have this factored into their ERAs. The current season is measured on the vertical axis and the previous season on the horizontal axis. The dots are quite dispersed, with only a slightly upward, or positive, trend, measured by the regression line. This means that the higher the ERA in the previous season, the higher we expect it to be the following year, and the lower the ERA, the lower we expect it to be the following year. But the fact that the points are so widespread indicates that the relationship between ERA of the past and present is not very strong. The R² is a statistical measure of how well the previous year’s ERA explains the present one, where 1 means the previous year’s ERA explains 100 percent of the variance in the present season’s ERA and 0 means it explains 0 percent. The R² of 0.13 means that only 13 percent of the variance in this year’s ERA can be explained by the previous year’s ERA. This is not a tight relationship.
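To make the R² concrete, here is a minimal sketch of how such a figure could be computed from pairs of previous-season and current-season ERAs, using an ordinary least-squares line. The helper names `fit_line` and `r_squared` and the ERA pairs are hypothetical; they are not the data behind Figure 13.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def r_squared(xs, ys):
    """Share of the variance in ys explained by the fitted line."""
    slope, intercept = fit_line(xs, ys)
    my = sum(ys) / len(ys)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Hypothetical (previous season, current season) ERA pairs, illustration only.
previous = [3.10, 4.50, 3.80, 5.20, 4.00, 3.40]
current = [3.90, 4.10, 4.60, 4.40, 3.30, 3.70]
slope, intercept = fit_line(previous, current)
print(f"regression line: current = {intercept:.2f} + {slope:.2f} * previous")
print(f"R^2 = {r_squared(previous, current):.2f}")
```

A value near 1 would mean last year's ERA nearly pins down this year's; a value near 0 means knowing last year's ERA tells you almost nothing.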
Well, so what? Baseball fans know that players have good years and bad years. The fact is that the fluctuation in ERAs may reflect more than just a deviation in a pitcher’s performance. It turns out there are factors beyond the pitcher’s control that have a huge impact on pitcher ERAs.
McCracken set out to remove the impact of fielders, to examine pitchers only in the areas of the game where fielders are not used: walks, strikeouts, hit batters, and home runs. He was actually not the first person to engage in this type of analysis. In his 1987 Baseball Abstract, Bill James developed a similar metric he named Indicated ERA. Indicated ERA looked only at walks and home runs allowed by pitchers. As James puts it, he was looking to develop “a meaningful indicator of the pitcher’s self-destructive tendencies.”⁷⁰ And this is where he left his analysis. Just over a decade later McCracken noticed that there was something more to these defense-independent metrics than just being defense independent. Not only could you tell a lot about how good a pitcher was by looking only at DIPS, but looking at non-DIPS stats (that is, statistics that include fielder involvement on balls in play) tells us very little about pitchers. “Heresy!” was the responsive cry of the baseball establishment. “Everyone knows that Greg Maddux is so good because his pitches produce easily fielded balls.” But when McCracken looked at Maddux’s numbers, he found that Maddux’s hits on balls in play, as measured by BABIP, fluctuated widely from year to year. And the same was true for all pitchers.
Figure 14 shows the relationship between the previous and current seasons’ performances on the three main DIPS metrics: strikeouts, walks, and home runs. If a pitcher has a skill in any area, he should perform similarly in that skill from year to year. If pitchers tend to repeat their performances in an area, reflected by a strong correlation from season to season, then we can reasonably assume that pitchers have a skill in that area. There is an extremely strong relationship for strikeouts and a moderately strong relationship for walk rates and hit batters, with R²s of 0.61 and 0.42. And while the relationship is weaker for home runs, the R² of 0.22 is much stronger than the season-to-season relationship for ERAs.