18 November 2009
The National League Cy Young Award will be awarded on Thursday, and the leading candidates are Tim Lincecum, Chris Carpenter, and Adam Wainwright. Over the last few years, the statistics used to evaluate pitchers have evolved, from the traditional stats like Wins and Losses and ERA, to more advanced stats such as VORP (Value over Replacement Player), FIP (Fielding Independent Pitching) and tRA (true Run Average). In this article, we’ll take a look at these stats, and see how they can help us determine the best pitcher of 2009.
Wins and Losses
Wins and Losses are still the stats most identified with starting pitchers, but most fans now appreciate their limitations. For a pitcher to get a win, in addition to preventing the other team from scoring, he needs his offense to score runs for him and his bullpen to preserve the lead if he leaves the game. These last two items are ones that the pitcher has virtually no control over. In 2009, Braden Looper had a 14-7 record for the Brewers despite a 5.22 ERA. How did that happen? Because the Brewers scored runs at an 8.97 rate while Looper pitched, allowing him to get “Wins” even when he pitched rather poorly. For example, in Looper’s final win of the season on 10/2, he gave up 2 runs in the 1st inning, 3 more in the 3rd, and another in the 4th, yet still was credited with a Win when the Brewers scored 12 runs over the final four innings. In his 13th Win, he gave up 5 runs over the first 5 innings, but received a Win since his offense scored 9 runs for him.
Which pitchers received the best run support in the NL in 2009? Here are the leaders, and it’s easy to see why De La Rosa, Looper, and Lowe were credited with so many Wins: their teams happened to score a ton of runs while they were on the mound.
|Pitcher||Run Support||ERA||W-L|
|Jorge De La Rosa, COL||9.00||4.38||16-9|
|Braden Looper, MIL||8.97||5.22||14-7|
|Max Scherzer, ARI||8.35||4.12||9-11|
|Derek Lowe, ATL||8.14||4.67||15-10|
Many pitchers were also on the unlucky side of the run support stat. Take Clayton Kershaw, for example, who finished 8-8 with a 2.79 ERA in 171 IP. The conventional wisdom is that Kershaw’s low Win total was because he ran up high pitch counts and couldn’t go deep into games. While that was true to some extent, Kershaw was also remarkably unlucky in his starts. For example, on 7/29 against St. Louis, Kershaw threw 8 scoreless IP, but left without a decision. Pitchers in that situation received Wins in 68 out of 77 such occurrences (88%) in 2009. On two other occasions, Kershaw threw 7 scoreless IP without a Win (the starting pitcher earned a Win 81% of the time last year), and two more times he went 6 shutout innings without a Win. For the season, Kershaw had 11 starts with at least 6 IP and 1 ER or fewer, yet was awarded Wins in only 5 of them.
Aside from Run Support, another big factor in Pitcher Wins is the Bullpen. Take another Dodger, Chad Billingsley, who finished 12-11 with a 4.03 ERA and was eventually dropped from the rotation in the playoffs. Yet there were 5 games in 2009 where Billingsley left with the lead and saw the bullpen give the game away. I am pretty sure that if Billingsley’s record had been 16-8 or 17-9 instead of 12-11, the fans and Joe Torre would think more highly of him.
ERA and ERA+
Since it’s clear that Wins can be deceptive, it may be better to look at a pitcher’s ERA, which eliminates the Run Support factor and reduces the bullpen effect. Here are the NL leaders in ERA in 2009:
2009 NL ERA Leaders
Chris Carpenter, 2.24
Tim Lincecum, 2.48
Jair Jurrjens, 2.60
Adam Wainwright, 2.63
Clayton Kershaw, 2.79
This is clearly a much better group of pitchers than the Win leaders, which included De La Rosa and Lowe in the Top 5. But there are several things that ERA doesn’t account for. One of these is the pitcher’s home park, since some stadiums are easier to score runs in than others. ERA+, or Adjusted ERA, tries to account for this by scaling a pitcher’s ERA with a Park Factor for each stadium. ERA+ is also normalized to league average (ERA+ = 100*(lgERA/ERA), adjusted for ballpark), so a score of 100 is average. This makes it useful for comparing players across different seasons and eras. Here are the leaders in ERA+ in 2009:
2009 NL ERA+ Leaders
Chris Carpenter, 183
Tim Lincecum, 176
Jair Jurrjens, 158
Adam Wainwright, 157
Matt Cain, 151
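As a rough illustration of the ERA+ formula quoted above, here is a minimal Python sketch. The function name `era_plus`, the choice to apply the park factor by multiplying the league ERA, and the league ERA and park factor values are all assumptions made for illustration; they are not the inputs behind the leaderboard figures.

```python
def era_plus(era, league_era, park_factor=1.0):
    """ERA+ = 100 * (lgERA / ERA), with lgERA scaled by a park factor.

    Assumed convention: park_factor > 1.0 means a hitter-friendly home park,
    so the same ERA earns a higher ERA+ there.
    """
    if era <= 0:
        raise ValueError("ERA must be positive")
    return 100.0 * (league_era * park_factor) / era


# Hypothetical inputs: a 4.00 league ERA, first in a neutral park.
print(round(era_plus(2.24, 4.00)))        # a pitcher with a 2.24 ERA
print(round(era_plus(4.00, 4.00)))        # a league-average pitcher scores 100
print(round(era_plus(4.00, 4.00, 1.05)))  # same ERA in a hitter-friendly park
```

Because the result is a ratio to league average, a 2.24 ERA in a low-scoring year and in a high-scoring year produce different ERA+ values, which is the point of the normalization.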
VORP
Another factor to consider with ERA is that it is a rate stat, and not a counting stat. A pitcher who gives up 1 ER in 6 IP has an ERA of 1.50, but is clearly not as valuable as one who pitches 250 IP with an ERA of 2.50. But would a pitcher with 200 IP and an ERA of 2.30 be worth more than the 250 IP, 2.50 ERA pitcher? VORP (Value over Replacement Player, developed by Keith Woolner at Baseball Prospectus) is a stat that combines pitcher quality (runs allowed) with the quantity of innings pitched. The idea is to calculate how many runs this pitcher saved compared to a “replacement-level” pitcher. So the formula for VORP is:
VORP = (Replacement_Level - RA)/9*IP
where Replacement_Level is generally defined as around 40% higher than league average for starting pitchers. Here are the 2009 VORP leaders in the NL:
2009 NL VORP Leaders
Tim Lincecum, 69.8
Chris Carpenter, 68.7
Adam Wainwright, 67.1
Matt Cain, 61.3
Jair Jurrjens, 60.5
Dan Haren, 60.2
Again, the Top 3 of Carpenter, Lincecum, and Wainwright are very close.
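To make the rate-versus-quantity question concrete, here is a small Python sketch of the VORP formula as stated above. The league-average run average of 4.50 is a made-up placeholder, the 1.40 multiplier follows the "around 40% higher" description, and the two ERAs from the question are treated as if they were run averages; all of that is assumption, so the outputs only illustrate the mechanics.

```python
def era(earned_runs, innings):
    """Earned runs allowed per nine innings."""
    return 9.0 * earned_runs / innings


def vorp(run_average, innings, replacement_level):
    """Runs saved versus a replacement-level pitcher over the same innings."""
    return (replacement_level - run_average) / 9.0 * innings


# Assumptions for illustration only (not actual 2009 league figures).
LEAGUE_RA = 4.50
REPLACEMENT_LEVEL = 1.40 * LEAGUE_RA  # "around 40% higher than league average"

print(era(1, 6))  # the 1 ER in 6 IP example: 1.5

fewer_innings = vorp(2.30, 200, REPLACEMENT_LEVEL)
more_innings = vorp(2.50, 250, REPLACEMENT_LEVEL)
print(f"200 IP at 2.30: {fewer_innings:.1f} runs over replacement")
print(f"250 IP at 2.50: {more_innings:.1f} runs over replacement")
```

Under these assumed numbers the 250 IP pitcher comes out ahead, because the extra 50 innings of well-above-replacement work outweigh the slightly worse rate.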
SNWL
SNWL (Support-Neutral Wins and Losses) looks at a pitcher’s performance on a game-by-game basis, rather than over the season total of ER and IP. For each game pitched, it calculates the probability that the team would win the game, assuming a league-average offense and bullpen. So given the IP and Runs Allowed by the pitcher in that game, we can find the probability that the team should win (this is similar to the discussion above with Clayton Kershaw). SNWL is reported either as a Win-Loss record or converted to a “Value over Replacement” scale, SNVAR. Here are the SNWL and SNVAR leaders in 2009 for the NL:
FIP (and xFIP)
A big split in the evaluation of pitchers came with stats like DIPS (Defense Independent Pitching Stats, by Voros McCracken) and FIP (Fielding Independent Pitching, by Tom Tango). The stats listed above (ERA, VORP, SNWL) are all based on the actual runs given up by a pitcher. However, it is clear that ERA involves many players besides the pitcher, namely the defense behind him. A good defense will obviously make more outs and give the pitcher a lower ERA. So how can we isolate the contributions of the pitcher from the defense behind him? One attempt is FIP. FIP removes the effects of the fielders, and only looks at the things that the pitcher has control over: strikeouts, walks, HBP, and Home Runs allowed. The formula for FIP is:
FIP = (HR*13+(BB+HBP-IBB)*3-K*2)/IP (plus a scaling factor to match the scale to that of ERA or RA)
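The FIP formula above translates almost directly into code. The following Python sketch is only an illustration: the function name `fip`, the scaling constant of 3.10, and both stat lines are hypothetical, chosen to show how strikeouts pull the number down while walks and home runs push it up.

```python
def fip(hr, bb, hbp, ibb, k, innings, constant=3.10):
    """FIP = (13*HR + 3*(BB + HBP - IBB) - 2*K) / IP + constant.

    The constant puts FIP on an ERA-like scale; 3.10 is an assumed
    placeholder, not an official value for any season.
    """
    if innings <= 0:
        raise ValueError("innings must be positive")
    return (13 * hr + 3 * (bb + hbp - ibb) - 2 * k) / innings + constant


# Two made-up stat lines (not real pitchers).
strikeout_pitcher = fip(hr=10, bb=60, hbp=5, ibb=3, k=250, innings=225)
contact_pitcher = fip(hr=25, bb=80, hbp=5, ibb=3, k=120, innings=200)

print(f"High-strikeout, low-HR line: {strikeout_pitcher:.2f}")
print(f"Low-strikeout, high-HR line: {contact_pitcher:.2f}")
```

Note that nothing about hits on balls in play or the fielders appears in the inputs, which is exactly what makes the stat fielding-independent.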
Why do analysts think this is more useful than looking at things like ERA? One, because it eliminates the huge variability of fielders from the equation. And two, because it turns out that FIP is a better predictor of future performance than ERA (or ERA+). That is, a pitcher who has managed a low ERA despite high BB and HRs (hence a high FIP) is much more likely to see that ERA rise in the future than one with the same ERA and a low FIP. So it may be a better evaluator of a pitcher’s true performance and skill than ERA.
Here are the leaders in NL FIP in 2009:
2009 NL FIP Leaders
Tim Lincecum, 2.48
Javier Vazquez, 2.77
Chris Carpenter, 2.78
Josh Johnson, 3.06
Clayton Kershaw, 3.08
Adam Wainwright, 3.11
xFIP adds one more level of correction. The rate at which pitchers give up Home Runs is primarily a function of their fly ball rate and the home park. So is a pitcher who gives up a lot of warning-track fly balls showing a skill or just getting lucky? The research indicates that it’s probably just luck, and isn’t likely to continue. So xFIP takes out the pitcher’s actual HR rate, and uses the fly ball rate instead, assuming that an average percentage of fly balls will result in Home Runs.
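Here is a minimal Python sketch of that substitution. The league home-run-per-fly-ball rate of 0.10, the scaling constant of 3.10, and the two stat lines are assumptions for illustration, and the function names are hypothetical. The two made-up pitchers are identical except for how many of their fly balls left the park.

```python
def fip(hr, bb, hbp, ibb, k, innings, constant=3.10):
    """FIP with the pitcher's actual home runs (constant is a placeholder)."""
    return (13 * hr + 3 * (bb + hbp - ibb) - 2 * k) / innings + constant


def xfip(fly_balls, bb, hbp, ibb, k, innings,
         league_hr_per_fb=0.10, constant=3.10):
    """Like FIP, but home runs are replaced by fly balls * league HR/FB rate.

    league_hr_per_fb=0.10 is an assumed value, not a measured league rate.
    """
    expected_hr = fly_balls * league_hr_per_fb
    return fip(expected_hr, bb, hbp, ibb, k, innings, constant)


# Same walks, strikeouts, innings, and fly balls; only the actual HR differ.
shared = dict(bb=50, hbp=5, ibb=2, k=180, innings=200)
print(f"FIP,  12 HR allowed: {fip(hr=12, **shared):.2f}")
print(f"FIP,  28 HR allowed: {fip(hr=28, **shared):.2f}")
print(f"xFIP, 200 fly balls: {xfip(fly_balls=200, **shared):.2f}")
```

The two FIP values differ, but both pitchers get the same xFIP, since xFIP treats the gap in home runs as luck rather than skill.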
Here are the leaders in xFIP for the NL in 2009 (from The Hardball Times):
2009 NL xFIP Leaders
Javier Vazquez, 2.89
Tim Lincecum, 2.94
Dan Haren, 3.16
Ricky Nolasco, 3.29
Josh Johnson, 3.42
Adam Wainwright, 3.45
Chris Carpenter, 3.45
By the way, Zack Greinke, who deservedly won the AL Cy Young award despite only 16 wins, is a big fan of FIP. After winning the award, Greinke was quoted in the New York Times as saying "That’s pretty much how I pitch, to try to keep my FIP as low as possible."
tRA
True Run Average (from Graham MacAree) is similar to FIP, in that it attempts to isolate pitching from the defense. The primary difference is that it divides batted balls into ground balls, fly balls, and line drives. By looking at the vast amounts of data available, the expected outcome for each type of batted ball has been determined:
|Strikeout||Line Drive||Ground Ball||Fly Ball (OF)||Fly Ball (IF)|
This table shows that (for 2008), 83% of fly balls to the outfield were converted to outs, while 98.5% of fly balls to the infield became outs. From this data, the expected run outcome of each type of batted ball can be calculated. Then, since we have the batted-ball breakdown that each pitcher allowed, we can calculate the expected, or true, Run Average, given an average defense behind the pitcher.
2009 NL tRA Leaders (from Fangraphs)
Tim Lincecum, 2.83
Chris Carpenter, 3.02
Clayton Kershaw, 3.36
Josh Johnson, 3.41
Adam Wainwright, 3.56
Javier Vazquez, 3.67
One interesting fact from the tRA data is that Javier Vazquez goes from the best in xFIP down to #6 in tRA. This is because his line drive rate of 23.6% was one of the worst in the league.
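The batted-ball idea behind tRA can be sketched in a few lines of Python. This is not the published tRA calculation: only the two fly-ball out rates (0.83 and 0.985) come from the text above, every other out rate and run value is a placeholder invented for illustration, and scaling to 27 expected outs is an assumption chosen to give a per-nine-innings figure.

```python
# event: (expected outs per event, expected runs per event)
# Only the two fly-ball out rates are from the article; the rest are
# hypothetical placeholders, not measured values.
EVENT_VALUES = {
    "strikeout":   (1.000, 0.00),
    "walk":        (0.000, 0.30),
    "home_run":    (0.000, 1.40),
    "line_drive":  (0.280, 0.35),
    "ground_ball": (0.750, 0.05),
    "fly_ball_of": (0.830, 0.05),
    "fly_ball_if": (0.985, 0.00),
}


def expected_run_average(counts):
    """Expected runs per 27 expected outs, given counts of each event type."""
    x_outs = sum(EVENT_VALUES[event][0] * n for event, n in counts.items())
    x_runs = sum(EVENT_VALUES[event][1] * n for event, n in counts.items())
    return 27.0 * x_runs / x_outs


# Two made-up pitchers; the second trades 40 ground balls for 40 line drives.
grounders = {"strikeout": 200, "walk": 50, "home_run": 15, "line_drive": 120,
             "ground_ball": 280, "fly_ball_of": 180, "fly_ball_if": 30}
liners = dict(grounders, line_drive=160, ground_ball=240)

print(f"More ground balls: {expected_run_average(grounders):.2f}")
print(f"More line drives:  {expected_run_average(liners):.2f}")
```

Even with placeholder values, the pattern matches the Vazquez example: with strikeouts, walks, and home runs held fixed, a higher line drive rate raises the expected run average.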
Conclusions
2009 has several pitchers who were very close in performance:
Carpenter has the lowest ERA of the group, but also the second fewest innings pitched. Wainwright has the most Wins and IP. Lincecum is 2nd in ERA, but led the league in strikeouts, and led the group in all of the fielding-independent stats. It's very close, but I would rank them:
1. Tim Lincecum, SF
2. Chris Carpenter, StL
3. Adam Wainwright, StL