Goatriders of the Apocalypse

A closer look at defense

When you watch a baseball game, most of the time what you are watching is the matchup between the pitcher and the hitter - that's the fundamental lens through which we tend to view baseball. One guy pitches the ball, and the other guy tries to hit it. What tends to get ignored is the defense playing behind the pitcher. There are generally two cases where the defense gets noticed:

  1. When a player makes a spectacular "web gem," generally by diving for the ball in dramatic fashion.
  2. When a player makes a particularly egregious fielding miscue.

Those are corner cases. When we're judging the quality of a defense, what we're mostly concerned about is how many times a ball hit into play is turned into an out. The most important factor here is range - the ability to get to more baseballs - and web gems and errors tell us little about that.

Most of the time, we don't bother to ask the simple question: "Did the defense get to as many baseballs as they should have?" No, generally we'll credit the pitcher with having a good game or a bad game, and unless there was a notable play, we'll ignore the role of the defense in that pitcher's performance.

Voros McCracken, who among other things has done some consulting for the Red Sox, looked at the divide between the responsibility of the pitcher and the defense, and came to one of the most startling and controversial conclusions in sabermetrics:

There is little if any difference among major-league pitchers in their ability to prevent hits on balls hit in the field of play.


You can better predict a pitcher's hits per balls in play from the rate of the rest of the pitcher's team than from the pitcher's own rate. This is pretty self-explanatory. The effects of having the same team defense and home park appear to be significant determinants in creating what little correlation there is in the stat.


The range of career rates of hits per balls in play for pitchers with a significant number of innings is about the same as the range you would expect from random chance. This is true even though we know that some pitchers may have had consistent advantages over others, as these rates are unadjusted for park or league. The vast majority of pitchers who have pitched significant innings have career rates between .280 and .290.

What Voros is talking about is a stat called Batting Average On Balls In Play, which is calculated like so:

(Hits - Home Runs)/(Plate Appearances - Home Runs - Walks - Hit By Pitch - Strikeout)

The main responsibility of the pitcher is to strike out batters while not walking batters and not allowing home runs. The rest falls mainly on the shoulders of the defense. (There are other things that pitchers can do - getting popups is very useful and it appears to be a real pitching skill.)
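For anyone who wants to run the numbers themselves, here's a minimal sketch of the BABIP formula above in Python. The function name and the stat line are made up for illustration; they aren't any real pitcher's numbers.

```python
def babip(hits, home_runs, plate_appearances, walks, hit_by_pitch, strikeouts):
    """Batting average on balls in play, using the formula given above."""
    # Balls in play: every plate appearance that didn't end in a
    # home run, walk, hit by pitch, or strikeout.
    balls_in_play = (plate_appearances - home_runs - walks
                     - hit_by_pitch - strikeouts)
    if balls_in_play <= 0:
        raise ValueError("no balls in play")
    # Home runs are hits, but the fielders never had a chance at them.
    return (hits - home_runs) / balls_in_play


# Hypothetical stat line, for illustration only.
example = babip(hits=150, home_runs=20, plate_appearances=700,
                walks=50, hit_by_pitch=5, strikeouts=125)
print(round(example, 3))  # 0.26
```

Note that everything subtracted in the denominator is exactly the stuff the pitcher controls on his own; what's left over is what the defense has to deal with.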

I'm sure that if I don't mention it myself, I'll hear Greg Maddux brought up as evidence that this isn't true. The problems with this argument:

  1. Maddux was one of the top 10 strikeout pitchers in the NL in nine seasons, and ranks second among active players in career strikeouts (eleventh all-time). Maddux was a very good strikeout pitcher in his prime.
  2. Maddux's amazing control allowed him to do this without issuing many walks; he has the third lowest career walk rate among active players.
  3. Maddux's career BABIP is .286, right where McCracken's research says it should be, if his theories are correct.

We could spend more time parsing DIPS (Defense Independent Pitching Statistics) research, if we wanted to. There are a lot of fascinating avenues to be explored. For now, what you need to take away from it is that defense is a vital part of preventing runs; a good defense can make your pitchers look much better, and a bad defense can make your pitchers look much worse. So when we talk about improving the Cubs' pitching staff, our actual goal is to improve the Cubs' overall run prevention ability, and that includes our defense.

A note, before we proceed any further - one of the inconveniences of the lack of attention paid to defense is the fact that good defensive statistics are harder to come by than good offensive statistics. I can wander onto almost any baseball website these days and at least find OBP and SLG, which are good enough when you can't find anything better. They're the McDonald's of offensive statistics - they're hot, cheap, available pretty much anywhere, and close enough to being food that they'll do the job.

There is no McDonald's of defensive statistics. This doesn't mean there aren't good defensive stats; it just means you have to go digging deeper to find them, and put more work into interpreting them. What it also means is that there isn't as much standardization between the main data providers - Retrosheet, Baseball Info Solutions, and STATS, Inc. I will be quoting from all three providers, because I don't have the money to buy a full set of play-by-play data from the commercial services (Retrosheet's records are available freely, but only in the offseason) and no one website has the data broken down exactly how I'm looking for it. Since there is slightly more subjectivity in the more detailed sort of recording used in fielding metrics (everyone knows what a hit is and isn't; it's a bit murkier differentiating between a fly ball and a line drive), there will be some discrepancies in the numbers between different data providers. That's okay, so long as we're aware of it and careful about it.

Let's start off simply by looking at the pitching splits for the Cubs and the league as a whole. (By which I mean the National League.) So far this season, the Cubs pitchers (really fielders) have a .282 BABIP, compared to .295 for the league as a whole. This means the Cubs are allowing fewer hits on balls in play than other teams. This is a good thing, because more hits means more runs. But let's go ahead and break it down by hit type:

Hit type      Cubs    NL
Ground Balls  .229  .231
Fly Balls     .131  .143
Line Drives   .718  .713
Bunts         .429  .341

Ground balls and bunts are the primary responsibility of the infield; line drives and fly balls the primary responsibility of the outfield. The Cubs are roughly average on the infield, and decently above average in the outfield. As a team our bunt defense is severely lacking, however.

Now, let's take a look at the Hardball Times team page - scroll all the way to the bottom and take a look at the fielding stats. Actually, first pause real quick and look at the pitching stats. They list DER, which is just 1 - BABIP, essentially looking at outs made instead of hits allowed. You can see the impact of our fielding on run prevention by looking at the difference between FIP, which projects ERA based on HR, BB and Ks, and actual team ERA. (There's also the above-average left on base rate for the Cubs to consider.)
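To put the .282 and .295 BABIP figures quoted earlier into DER terms, here's a quick back-of-the-envelope sketch. It uses the "1 - BABIP" shorthand from this post, so treat it as an approximation rather than THT's exact DER calculation; the function and variable names are my own.

```python
def der_from_babip(babip):
    """Approximate defensive efficiency as 1 - BABIP (shorthand only)."""
    return 1.0 - babip


cubs_babip, nl_babip = 0.282, 0.295  # figures quoted earlier in this post

cubs_der = der_from_babip(cubs_babip)
nl_der = der_from_babip(nl_babip)

# Extra outs the Cubs get, relative to a league-average defense,
# for every 1,000 balls put into play.
hits_saved_per_1000 = (nl_babip - cubs_babip) * 1000

print(f"{cubs_der:.3f} {nl_der:.3f} {hits_saved_per_1000:.0f}")  # 0.718 0.705 13
```

Thirteen points of BABIP doesn't sound like much, but it works out to roughly 13 extra outs for every thousand balls in play.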

And to the defense. THT splits it up between the infield and the outfield, listing RZR (which is the percentage of plays a team makes within their assigned zones) and OOZ, the number of plays made outside of the assigned zones. Again - most of the defensive value appears to be in the outfield.

But let's go ahead and break down responsibility for individual players. It's important to note that mainly what we are looking at is a player's ability to get to baseballs and turn them into outs. Things that fall outside the scope of this analysis include:

  • Catcher defense
  • Turning double plays
  • Preventing baserunners from taking extra bases on balls hit into the outfield
  • First basemen preventing throwing errors from other players

I don't want to downplay the importance of any of those things, but none of them (or even all of them together) are as important as the ability of a fielder to simply make an out on a ball in play. And they require different methods to look at them - without play-by-play data, it's especially hard to look at the last three. (Some other time, I'll try to get into evaluating catcher defense. Sneak peek: Soto is good at blocking pitches and throwing out baserunners.)

I briefly touched upon zones earlier; I'll go into more detail on what those zones are and what they mean now. First, here's a chart of the zones as defined by STATS, Inc.:


Certain zones are assigned to certain fielders - for example, a ground ball hit in zones H through L is considered to be the responsibility of the shortstop. Chris Dial does a great job of explaining the areas of responsibility for each fielder. (Just remember - there are different zones for different batted ball types; outfielders are responsible for fly balls and line drives hit into the outfield; infielders are responsible for ground balls and some line drives. Unfortunately, Dial doesn't specify the zones for line drives for infielders, but so far as I can gather they're smaller than the zones for ground balls.) All balls hit into a fielder's area of responsibility are considered "in zone opportunities." Zone rating is calculated as:

(Plays made in zone + plays made out of zone) / (In zone opportunities + plays made out of zone)

This tends to slightly underrate players with more range (among other concerns), but since I don't have the $5,000 or more to buy STATS's play-by-play data myself, I have to live with it. (BIS, as noted above, calculates zone rating by ignoring "out of zone" plays entirely, which isn't a better solution.)
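Here's a minimal sketch of the two flavors of zone rating side by side: the STATS-style formula above, and the BIS-style version that ignores out of zone plays. The function names and both fielders are hypothetical, made up purely to show how the arithmetic behaves.

```python
def stats_zone_rating(plays_in_zone, plays_out_of_zone, in_zone_opportunities):
    """STATS-style zone rating: out-of-zone plays count as both a
    play made and an opportunity."""
    return ((plays_in_zone + plays_out_of_zone)
            / (in_zone_opportunities + plays_out_of_zone))


def in_zone_only_rating(plays_in_zone, in_zone_opportunities):
    """BIS-style rating as described above: out-of-zone plays ignored."""
    return plays_in_zone / in_zone_opportunities


# Two made-up fielders with identical in-zone results but different range.
limited_range = stats_zone_rating(80, 10, 100)   # 90 / 110
good_range = stats_zone_rating(80, 30, 100)      # 110 / 130

print(round(limited_range, 3), round(good_range, 3))  # 0.818 0.846
print(in_zone_only_rating(80, 100))                   # 0.8 for both fielders
```

The rangier fielder made 20 more plays, but because each out of zone play gets added to the denominator as well, his rating only climbs from .818 to .846. And the in-zone-only version can't tell the two apart at all.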

Zone rating is the closest thing we have to a McDonald's stat for defense; there are countless refinements you can make if you have the data (which, again, I don't) - for those interested you can read about Mitchel Lichtman's UZR, which is amazingly awesome and, coincidentally, wholly unavailable since the All-Star Break last season. (mgl does work for MLB clubs, and thus doesn't publish as much of his data as one would like, because his data is awesome.)

We can roll our own plus-minus rating, though. (Again, much thanks to Chris Dial for explaining the process, as well as jinaz's post on the subject.) What we're looking at is how many plays - and while we're at it, runs - each player saved/allowed on defense compared to the league average.

STATS, Inc. Zone Rating is available from Rogers Sportsnet and ESPN; ESPN doesn't have as much detail as I would like and Sportsnet seems to have downright errors in their reporting - I know full well that Angel Pagan hasn't played outfield for the Cubs this season and Kosuke Fukudome has, and that's problematic. So I'm going to use BIS's data instead, as The Hardball Times publishes pretty much everything you could ask for. BIS uses more zones than STATS, and changes the zones of responsibility much more frequently, however. So just bear that in mind going forward.

I'm using jinaz's method for converting OOZ and IZ plays into +/-, and simply adding the two. Short version: I figure out how many in zone plays and out of zone plays an average player would have made given that many "ball in zone" opportunities, and then subtract that from the number of plays made. Runs are calculated based upon the average value of a play made at that position.
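For the curious, the arithmetic can be sketched in a few lines of Python. One loud caveat: the three right field constants below are my own back-calculation from the rounded right field rows in the Cubs table in this post, not the actual league averages or run values from the spreadsheet, so treat them as assumptions. The function name is hypothetical too.

```python
# ASSUMED right-field rates, back-calculated from the rounded RF rows
# of the Cubs table in this post; not the original league figures.
RF_PLAYS_PER_BIZ = 0.8947   # in-zone plays an average RF makes per ball in zone
RF_OOZ_PER_BIZ = 0.2552     # out-of-zone plays an average RF makes per ball in zone
RF_RUNS_PER_PLAY = 0.843    # run value of one play in right field


def plus_minus(biz, plays, ooz, plays_per_biz, ooz_per_biz, runs_per_play):
    """Return (plays +/-, OOZ +/-, total +/-, runs +/-) versus average."""
    plays_pm = plays - biz * plays_per_biz   # in-zone plays above average
    ooz_pm = ooz - biz * ooz_per_biz         # out-of-zone plays above average
    total_pm = plays_pm + ooz_pm
    return plays_pm, ooz_pm, total_pm, total_pm * runs_per_play


# Mark DeRosa in right field: 9 BIZ, 8 plays, 4 OOZ.
derosa_rf = plus_minus(9, 8, 4, RF_PLAYS_PER_BIZ, RF_OOZ_PER_BIZ, RF_RUNS_PER_PLAY)
print([round(x, 2) for x in derosa_rf])  # [-0.05, 1.7, 1.65, 1.39]

# Daryle Ward in right field: 1 BIZ, 1 play, 0 OOZ.
ward_rf = plus_minus(1, 1, 0, RF_PLAYS_PER_BIZ, RF_OOZ_PER_BIZ, RF_RUNS_PER_PLAY)
print([round(x, 2) for x in ward_rf])    # [0.11, -0.26, -0.15, -0.13]
```

Every position gets its own set of three rates; right field is shown here only because it has enough rows in the table to check the arithmetic against.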

I'm calculating my own average zone ratings for each position, because as I said, BIS changes zones periodically. (Full data for all NL players available for your perusal.)

Or, since this is what you've come for, here are all Cubs players, season to date (really season through the sixth):

Last       First    Pos  BIZ  Plays  OOZ  Plays +/-  OOZ +/-  Total +/-  Runs +/-
Fukudome   Kosuke   RF   104     97   32       3.95     5.46       9.41      7.93
Lee        Derrek   1B    86     69   22       3.30     1.83       5.13      4.05
Pie        Felix    CF    34     30   13      -1.18     4.48       3.29      2.77
Fontenot   Mike     2B    36     32    3       2.68    -0.15       2.53      1.91
DeRosa     Mark     3B    13      8    5      -1.02     2.77       1.75      1.40
DeRosa     Mark     RF     9      8    4      -0.05     1.70       1.65      1.39
Edmonds    Jim      CF    26     24    8       0.15     1.48       1.64      1.38
Cedeno     Ronny    2B    25     20    4      -0.36     1.81       1.45      1.09
Ramirez    Aramis   3B    91     62   18      -1.11     2.37       1.26      1.01
Johnson    Reed     LF     7      7    2       0.74     0.47       1.21      1.01
Hoffpauir  Micah    LF     1      1    1       0.11     0.78       0.89      0.74
Murton     Matt     LF     1      1    1       0.11     0.78       0.89      0.74
Soriano    Alfonso  2B     2      2    0       0.37    -0.17       0.20      0.15
DeRosa     Mark     1B     1      1    0       0.24    -0.23       0.00      0.00
Ward       Daryle   RF     1      1    0       0.11    -0.26      -0.15     -0.13
Fukudome   Kosuke   CF     1      1    0       0.08    -0.25      -0.17     -0.14
DeRosa     Mark     LF    15     13    3      -0.41    -0.28      -0.68     -0.57
Ward       Daryle   1B     4      3    0      -0.06    -0.94      -0.99     -0.78
Soriano    Alfonso  LF    73     63   17      -2.26     1.06      -1.20     -0.99
Johnson    Reed     CF    67     60   16      -1.45    -0.80      -2.25     -1.89
Cedeno     Ronny    SS    19     15    1      -1.07    -1.74      -2.81     -2.12
DeRosa     Mark     2B    85     67    3      -2.23    -4.44      -6.67     -5.03
Theriot    Ryan S   SS   135    111   13      -3.18    -6.50      -9.68     -7.29

Before we go any further, there are some clarifications I want to make:

  1. As with everything, sample size matters; numbers for players with 100 or more BIZ are more reliable than numbers for players with only one BIZ.
  2. These numbers are relative to the average at the position; some positions are more difficult than others. An average defensive shortstop is more valuable than the average defensive first baseman.

Remember when I said that our outfield defense was above-average? It turns out this is mostly attributable to Kosuke Fukudome, who is an outstanding right fielder. AAA warrior Felix Pie is another defensive standout in the outfield. And - credit where credit is due - so far Jim Edmonds seems to be playing a very good defensive center field, much better than I expected. Soriano and Johnson have both been subpar, but hardly defensive liabilities. (Earlier I went over Soriano's fielding in more detail.)

Our corner infield has been fine - Lee's defense at first is no surprise to Cubs fans, and Ramirez (while not challenging for the Gold Glove like last season) has done well for himself.

Our middle infield, though - yikes. Credit where credit is due, right off the bat - Mike Fontenot is playing a serviceable second base, and though his range might not be the greatest, he's fielding all the balls he should be getting to. Again, there is sample size to consider, but so far, Mike Fontenot has done better than I would have expected.

Other than that, though - wow. Our middle infield is not covering itself in glory. DeRosa is saved the indignity of being the worst defensive 2B in the NL mostly through the superhuman efforts of Damion Easley, who in just over half as many chances has cost his team twice as many runs in the field as DeRosa has. Theriot... has no such luck. Theriot, as best I can figure it, is the worst defensive shortstop in the National League so far.

Great comfort on the day of a Jason Marquis start, huh?


so..theriot's good? I'm confused.

I kid, of course. Thanks for clarifying the whole defense thing. I'm still going to like the guy because he came up through the Cubs' system and it's his fault he's not great. It has nothing to do with being scrappy and white, though. I promise.
