23 November 2009
Unlike the 2009 NL Cy Young Award, the NL MVP award is expected to be a runaway victory for Albert Pujols. After that, there are a number of players battling for the #2 slot, including Chase Utley, Troy Tulowitzki, Prince Fielder, and Hanley Ramirez. Even though this year’s winner seems apparent, let’s examine the criteria involved in selecting the MVP. Does the player’s team have to make the playoffs? Or at least be a contender for the playoffs? Should pitchers be considered? How should we weight batting, baserunning, and defense? And how do the “advanced” stats like OPS, Win Shares, WARP, and WAR fit into this? In this article, I’ll try to address these questions.
Let’s address the team performance issue first. The MVP is an individual award, so I don’t see any good reason to eliminate a hitter from consideration just because he has lousy teammates. We often hear about players on bad teams racking up big stats in meaningless games, but in reality, most games are significant. The Nationals, for example, were out of the race early, but the games they played against the Braves down the stretch were extremely important to the Braves and the Rockies. If two players on different teams are very close in performance, I can understand an argument that performing during a pennant race could be a tie-breaker, but for now, let’s start by considering everyone.
One logical way to choose an MVP is to estimate how many Wins each player contributed to his team, including offense, pitching, and defense. There are three major stats that attempt this – Win Shares (Bill James Online), WARP (Baseball Prospectus), and WAR (FanGraphs).
Evaluating Offense - Traditional Stats
Let’s start by looking at the leaders in the traditional offensive categories:
|Batting AVG||HOME RUNS||RBI|
Only one player finished in the Top 10 of all three traditional Triple Crown categories, Albert Pujols. Pujols actually finished in the Top 3 in all of them – that’s one reason why he’s expected to easily win the MVP. But are these three stats an accurate way to estimate a player’s batting contribution? Batting Average measures the rate at which a player gets a hit per official at bat. But it ignores several key factors. First, not all hits are equal, but batting average treats a single the same as a Home Run. Second, it does not include walks, which means that it completely ignores the fact that players like Pujols reach base over 100 times that way.
What about Home Runs? They are certainly very important, but on their own, they don’t tell you nearly enough about a player’s offensive contribution. And what about RBIs? They are the stat that has historically correlated highest with MVP voting, but RBIs are a very team-dependent stat. To get an RBI, there usually has to be a batter or batters on base already (except for HRs). But there are very large disparities in the RBI opportunities for different players (see table below). Although Andre Ethier and Hanley Ramirez both finished with 106 RBI last year, Ethier actually had 56 more runners on base in his at bats than Ramirez. Hits that move runners along without actually scoring them are ignored too. Finally, neither Home Runs nor RBI take into account the number of opportunities that a player had. So looking at HR/RBI probably isn’t the best way to assess a player’s contribution to an offense.
|Batter||RBI||Runners on Base|
Evaluating Offense - Some History
Runs Created
In the late 1970s, Bill James introduced the Runs Created stat in his Baseball Abstract. The basic concept is that scoring runs involves two things – getting runners on base, and then advancing base runners. At the team level, these two concepts can be quantified by On-Base Percentage (how often players reach base safely) and Slugging Percentage (total bases per at bat). The first Runs Created formula was simply OBP*SLG*AB, and remarkably, this simple formula usually predicts a team’s run total to within 5%. Here are the results for 2009, where Runs Created is predicted by the simple formula RC = 0.98 * OBP * SLG * AB.
|Team||AB||OBP||SLG||RUNS||RC RUNS||% Error|
As the table shows, the simple formula comes remarkably close to the actual values. Runs Created and the so-called Pythagorean Theorem (Team Win Percentage = RS^2/(RS^2 + RA^2), where RS = Runs Scored and RA = Runs Allowed) were a revelation for most baseball fans. Together they allowed fans to make fairly accurate estimates of things like “How many more runs would the Braves have scored if Bob Horner had been healthy?” or “Suppose the Giants could replace Johnny LeMaster with Dave Concepcion?” Although this simple RC formula gets pretty close to the actual runs scored by a team, it has been refined over the years to use separate coefficients for singles, doubles, triples, and HR, and to include SB and CS. One variation is RC/27, which estimates the runs scored per 27 outs, that is, per game.
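As a rough illustration of the two formulas above, here is a minimal Python sketch. The function names and the sample team line are my own assumptions for illustration; the numbers are hypothetical and are not taken from the 2009 table. The 0.98 scale factor and the exponent of 2 are the values quoted in the text.

```python
def runs_created(obp, slg, ab, scale=0.98):
    """Basic Runs Created estimate: scale * OBP * SLG * AB."""
    return scale * obp * slg * ab


def pythagorean_win_pct(runs_scored, runs_allowed, exponent=2.0):
    """Pythagorean win percentage: RS^x / (RS^x + RA^x), with x = 2 by default."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


def rc_per_27(rc_total, outs_made):
    """Runs Created per 27 outs, i.e. per game's worth of outs."""
    return 27.0 * rc_total / outs_made


if __name__ == "__main__":
    # Hypothetical team line (assumption, not real 2009 data).
    team_rc = runs_created(obp=0.340, slg=0.420, ab=5500)
    win_pct = pythagorean_win_pct(runs_scored=team_rc, runs_allowed=700)
    print(f"Estimated runs scored: {team_rc:.0f}")
    print(f"Expected wins over 162 games: {162 * win_pct:.1f}")
```

Changing `runs_scored` while holding `runs_allowed` fixed is the kind of "what if" estimate described above, such as swapping one hitter's line for another's.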
Linear Weights
A second method for quantifying a player’s offensive contribution, based on linear weights, was developed by George Lindsey in 1963. Using play-by-play data, Lindsey quantified the run-scoring value of each event. This technique was later expanded by Pete Palmer in the book Total Baseball, using game data as well as simulations.
LWTS = 0.46*1B + 0.80*2B + 1.02*3B + 1.40*HR + 0.33*(BB+HBP) + 0.30*SB - 0.60*CS - 0.25*(AB-H)
Linear Weight models are the basis for many current batting evaluators, including Equivalent Average (EqA) and weighted On-Base Average (wOBA).
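The linear weights formula can be written as a short function. This is only a sketch using the coefficients quoted above, with every positive event added and CS and outs (AB minus H) subtracted. The function name, argument names, and the sample batting line are assumptions for illustration, not real player data.

```python
def linear_weights_runs(singles, doubles, triples, hr, bb, hbp, sb, cs, ab):
    """Batting runs relative to average, using the coefficients quoted in the text."""
    hits = singles + doubles + triples + hr
    outs = ab - hits  # the (AB - H) term in the formula
    return (
        0.46 * singles
        + 0.80 * doubles
        + 1.02 * triples
        + 1.40 * hr
        + 0.33 * (bb + hbp)
        + 0.30 * sb
        - 0.60 * cs
        - 0.25 * outs
    )


if __name__ == "__main__":
    # Hypothetical season line (assumption, for illustration only).
    runs = linear_weights_runs(
        singles=100, doubles=40, triples=2, hr=35,
        bb=90, hbp=5, sb=10, cs=4, ab=560,
    )
    print(f"Linear weights runs: {runs:.1f}")
```

Because outs carry a negative weight, the result is centered near zero for an ordinary hitter rather than counting up from zero the way Runs Created does.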
Evaluating Offense - Today
OPS
Since team runs can be estimated so well by only two parameters, OBP and SLG, naturally people looked at these two stats at the individual level. And, to simplify the math, adding the two numbers together became a common way to quantify a player’s offensive contribution – OPS=OBP+SLG. Here are the leaders in these three stats for the NL in 2009.
Although OPS correlates very well with Runs Scored, it’s not the best run estimator around; it is popular mainly because it is easy to calculate. More refined calculations have shown that OBP should be weighted more heavily than SLG, rather than simply adding the two equally. And the fact that SLG treats a HR as 4 times better than a single isn’t quite right; people have since found better coefficients.
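To make the OPS arithmetic concrete, here is a small sketch using the standard definitions of OBP and SLG. The function names and the sample line are assumptions for illustration. The `obp_weight` default of 1.8 in `weighted_ops` is a commonly quoted rule of thumb that I am assuming here; it is not a figure from this article.

```python
def on_base_pct(hits, bb, hbp, ab, sf):
    """OBP = (H + BB + HBP) / (AB + BB + HBP + SF)."""
    return (hits + bb + hbp) / (ab + bb + hbp + sf)


def slugging_pct(singles, doubles, triples, hr, ab):
    """SLG = total bases / AB, so a HR counts 4 times as much as a single."""
    total_bases = singles + 2 * doubles + 3 * triples + 4 * hr
    return total_bases / ab


def ops(obp, slg):
    """Plain OPS: the two rates added with equal weight."""
    return obp + slg


def weighted_ops(obp, slg, obp_weight=1.8):
    """OPS variant that weights OBP more heavily (the 1.8 default is an assumption)."""
    return obp_weight * obp + slg


if __name__ == "__main__":
    # Hypothetical batting line (assumption, not real 2009 data).
    singles, doubles, triples, hr = 100, 40, 2, 35
    hits = singles + doubles + triples + hr
    obp = on_base_pct(hits, bb=90, hbp=5, ab=560, sf=6)
    slg = slugging_pct(singles, doubles, triples, hr, ab=560)
    print(f"OBP {obp:.3f}  SLG {slg:.3f}  OPS {ops(obp, slg):.3f}")
    print(f"Weighted OPS: {weighted_ops(obp, slg):.3f}")
```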
Win Shares were presented by Bill James a few years ago in his book of the same name. Four notable things about Win Shares are: 1) The sum of the individual players’ Win Shares is tied directly to the team’s total Wins. 2) The stat incorporates “clutch” stats such as hitting with RISP and hitting HR with runners on base. 3) Play-by-play data is not used for the defensive evaluations; rather, the totals for assists, putouts, DPs, errors, etc. are used. 4) There are no “Loss Shares” or negative Win Shares for players who play below replacement level.
These points are significant because the other stats (WAR, WARP) are built up from the components (2B, HR, BB, etc.), not from actual team wins. So under Win Shares, teams and players who win more games than their components predict (presumably through better clutch performance) get extra credit for it. James assigned 3 Win Shares per team win. The first step is to divide the team’s win shares between offense and defense, based on the team’s relative strengths from a Marginal Runs calculation (including park effects). Then, within this, Win Shares are awarded to each player based on their contribution.
Offense – Once we know how many Win Shares the team has in total, this total is divided among the players based on the fraction of the team’s runs they created using the latest Runs Created formula. James also includes some “clutch” and RISP numbers in his formulas.
Defense - There is a different set of formulas for fielding Win Shares at each position. The stats used are things like Assists, Putouts, Errors, Double Plays, etc. Fielding Win Shares are based on cumulative stats, not play-by-play data.
Pitching - For pitchers, the Win Shares are allocated using a component ERA method – calculating a prediction of how many runs the pitcher gave up based on the number of singles, doubles, etc. that he allowed – sort of an inverse Runs Created formula. Again, there are correction factors for things like Saves tacked on at the end. More details on Win Shares can be found here.
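The offensive allocation described above can be sketched in a few lines. This is a deliberately simplified illustration, not James’s actual procedure: it takes the offense/defense split as a given input (`offense_fraction`, a hypothetical parameter) instead of deriving it from Marginal Runs, and it ignores the clutch adjustments and park effects. The player names and numbers are made up.

```python
def allocate_offensive_win_shares(team_wins, offense_fraction, runs_created_by_player):
    """Split a team's offensive Win Shares in proportion to each player's Runs Created.

    team_wins: actual team wins (3 Win Shares are assigned per win).
    offense_fraction: share of the team's Win Shares credited to offense
        (an input here; the real system derives it from Marginal Runs).
    runs_created_by_player: dict mapping player name to Runs Created.
    """
    total_win_shares = 3 * team_wins
    offense_win_shares = total_win_shares * offense_fraction
    total_rc = sum(runs_created_by_player.values())
    return {
        name: offense_win_shares * rc / total_rc
        for name, rc in runs_created_by_player.items()
    }


if __name__ == "__main__":
    # Hypothetical team and players (assumptions, for illustration only).
    shares = allocate_offensive_win_shares(
        team_wins=90,
        offense_fraction=0.5,
        runs_created_by_player={"Player A": 120, "Player B": 80, "Rest of team": 550},
    )
    for name, ws in shares.items():
        print(f"{name}: {ws:.1f} offensive Win Shares")
```

Because the pool being divided is fixed by actual team wins, a hitter on a team that outperforms its run totals ends up with more Win Shares than the same batting line would earn elsewhere, which is the property noted in the list above.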
Here are the 2009 NL Leaders in Win Shares (from Bill James Online):
2009 NL Win Shares Leaders
WARP (Wins Above Replacement Player)
WARP is the stat used by Baseball Prospectus to combine all of a player’s stats into Wins above a Replacement Player. WARP is not directly connected to the actual number of wins that a team had, but is based on summing up the individual performance of each player.
Offense - Offense is based on BP’s stat Equivalent Average. This is a park-adjusted linear weights-type formula that converts all of a player’s stats into a number on the same scale as batting average, so that an average player is at .260 EqA. This can then be converted to Equivalent Runs, and to Runs Above a Replacement player.
Defense – BP uses “Seasonal Totals” rather than Play-by-Play data. Adjustments are made for the nature of the pitching staff (LH/RH, GB/FB).
Pitching – BP bases its pitching wins on park-adjusted ERA and Innings Pitched, as discussed in the VORP section of LINK.
2009 NL WARP Leaders
Wins Above Replacement (WAR)
WAR is the stat used at FanGraphs to rank players. The key differences to note are: 1) Defense is evaluated based on play-by-play data, not seasonal data, using Ultimate Zone Rating (UZR). 2) Pitching is evaluated based on Fielding-Independent stats (see 2009 NL Cy Young article). 3) A positional adjustment bonus is given to difficult defensive positions. 4) Catcher’s Defense is not yet rated, so all catchers are rated equal defensively.
Offense – Offense is evaluated based on weighted on-base average (wOBA). This is also a park-adjusted, linear weights system, scaled to match on-base percentage.
Defense – Defense is quantified using Ultimate Zone Rating. UZR divides the field into 64 zones, and counts how often the player makes plays in his nearby zones. Park factors, the speed of the batted ball, the handedness of the pitcher, and the flyball/groundball nature of the pitcher are also considered.
Pitching – Pitchers are evaluated based on FIP, rather than actual ERA. The logic here is that FIP is a better assessment of the pitcher’s contribution than ERA, which involves the interaction of the defense and relief pitchers.
2009 NL WAR Leaders
Summary
Win Shares, WARP, and WAR all attempt to combine all of a player's stats into a single number, on the scale of Wins. Win Shares is the only one that uses actual Team Wins, while the other two assign Wins based on the individual component terms. However, the way that Win Shares divides shares between offense, fielding, and pitching is very complicated, and often quite arbitrary. All three methods evaluate batting in roughly equivalent ways. However, WAR uses fielding-independent pitching, rather than the real ERA or hits allowed. Is this a better way to isolate the pitcher's performance? The jury is still out on this one, as we saw in some controversial ballots in the Cy Young voting. WAR is also the only one to evaluate fielding based on actual play-by-play data, rather than seasonal fielding totals. However, since Win Shares and WARP use seasonal defensive totals, they can be calculated throughout baseball history, while WAR is restricted to modern data. The table below summarizes the NL leaders in the stats of Win Shares (from Bill James Online), WARP (from Baseball Prospectus), and WAR (from FanGraphs), with their rank in parentheses.
My NL MVP Ballot
|1||Albert Pujols||8.4 (1)||12.1 (1)||13.0 (1)|
|2||Chase Utley||7.6 (3)||8.6 (3)||10.5 (6)|
|3||Hanley Ramirez||7.3 (4)||7.3 (10)||11.4 (4)|
|4||Prince Fielder||6.8 (6)||7.9 (6)||11.9 (2)|
|5||Tim Lincecum||8.2 (2)||7.4 (9)||7.5 (18)|
|6||Matt Kemp||5.0 (15)||7.9 (5)||8.7 (9)|
|7||Adrian Gonzalez||6.3 (8)||9.2 (2)||11.3 (5)|
|8||Troy Tulowitzki||5.4 (12)||6.3 (18)||8.0 (11)|
|9||Pablo Sandoval||5.2 (14)||5.8 (24)||9.0 (7)|
|10||Ryan Howard||4.8 (18)||5.4 (28)||8.8 (8)|
|11||Ryan Braun||4.8 (19)||6.8 (15)||11.3 (3)|
Pujols is the clear #1 in every ranking system. I have the middle infielders, Utley and Ramirez, at #2 and #3, with Utley's defense just edging Ramirez's offensive advantage. Prince Fielder gets the #4 slot, while Lincecum is the highest-ranked pitcher at #5. I have slots 6 through 11 filled with Kemp, Adrian Gonzalez, Tulowitzki, Sandoval, Howard, and Braun, although I expect that Tulo will finish much higher in the actual voting.