A Substitute for War

Basketball philosophy

The Nash Disequilibrium, or Why I Use +/- Statistics

with 9 comments

Image by OakleyOriginals via Flickr

I felt the need to write this as a result of the article I wrote on Kobe Bryant and his adjusted +/- statistics this season. That article showed my perspective as someone who uses these stats – this one gets into why one should use them.

I’m a math kind of guy. I’ve been making statistical rankings of basketball players and other such trivia forever. When the internet was first reaching prominence, many didn’t see how they would use it, though they obviously did end up using it. I was dying for it from the start though. To have access to the kind of data basketball-reference.com has is geek nirvana for me.

Now, I always knew that in basketball, the stats didn’t cover everything, but I always figured that what they missed was relatively small and not ridiculously biased. And then in ’04-05, I found myself utterly fascinated by the Phoenix Suns and Steve Nash. Every metric I’d ever come up with or ever seen said that Nash wasn’t the best player on that team, but my common sense just found this absurd. He was the one directing that offense, not the scorers. The team had launched forward far beyond what anyone expected because of an improvement in team offense that was completely unbelievable, and the team had made but one major change and one other major decision: Sign Nash, and put the ball & decision making in his hands.

And Now for Something Completely Different

Yes, it was clear that Amare Stoudemire was improving a lot and Shawn Marion was improving some, but understand the scope here: The Suns’ offensive efficiency went up by 13.1 points. In this season, ’10-11, the gap between the best and worst offenses in the league is only 11.3. Last season, when Kevin Durant had maybe the biggest season-to-season improvement I’ve ever seen, the Thunder’s offense only improved by 6.0. Bottom line is that you only make an improvement like that if you play a completely different way, and Nash was that way.

Tangent: Now some of you who know me may say, “Yeah, but you love Nash, isn’t that biasing you?”. The causation flowed the other way. Having my old way of thinking proven wrong by the guy is why I like him.

Getting back on narrative, as an analyst this left me rather despondent. Of course I briefly had the desperate hope that maybe we were all terribly underrating assists, but this seemed unlikely then, and further analysis shows no such correlation. No, it was clear that anything assembled solely from existing box score stats was missing a good deal. Worse, it wasn’t reasonable to say that this just resulted in a lack of precision. With Phoenix we were literally seeing that the guy with the ball in his hands most of the time was having huge net impact beyond the stats. That means a clear bias against a certain type of player, which is a big problem.

And that’s why I started following the internet basketball community more closely, quickly moving from lurker to contributor on sites like RealGM and APBRmetrics. I had my ear to the ground, as it were, to see what alternatives were out there, and thus I discovered +/- statistics. (Briefly: +/- credits the player based on how well or poorly his team did while he was on the court.) I was familiar with the stat from hockey, but the raw number they used never impressed. The basketball geeks however had done much more advanced work, and I knew I could use their stuff.

On Reliability and Validity

Let me introduce the concepts I’ll call reliability and validity (people use different terms, and use terms differently on this, don’t let the jargon throw you off):

Reliability measures repeatability, or what you could call consistency. A reliable measure means that if you apply it to the same thing multiple times, you’re going to get very similar results.

Validity is harder for me to define so precisely, but in rough language, it measures if you’re aiming in the proper direction for your target.

I’ll give the example of a scale. If you step on the scale 3 different times and get 3 wildly different measures, you’ve got an unreliable scale. If you have a scale with no weight on it, and it’s reading “30 lbs”, well then you’ve got a bias, and thus your measure is invalid.

Let me say it another way by referring to the pictures below:

Image from columbia.edu

If I’m throwing darts at a dartboard, and I’ve got a reliable but invalid aim, it’ll look like the image on the left. If I’ve got valid aim but poor reliability, it’ll look like the center image. It’s only when we’ve got both reliability and validity that we get the results we really want, shown on the right.
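To make the scale and dartboard examples concrete, here is a minimal Python sketch. Everything in it is an illustrative assumption of mine (the 180 lb true weight, the noise levels, the function names); only the 30 lb offset is borrowed from the scale example above. It simulates three hypothetical scales and reports how far off each is on average (a validity problem) and how scattered its readings are (a reliability problem).

```python
import random
import statistics

TRUE_WEIGHT = 180.0  # assumed true weight in lbs, purely illustrative


def simulate_scale(bias, noise_sd, n_readings=1000, seed=0):
    """Readings from a hypothetical scale: truth + fixed offset + random scatter."""
    rng = random.Random(seed)
    return [TRUE_WEIGHT + bias + rng.gauss(0, noise_sd) for _ in range(n_readings)]


def summarize(readings):
    """Return (average error, spread of readings)."""
    average_error = statistics.mean(readings) - TRUE_WEIGHT  # large => invalid
    spread = statistics.stdev(readings)                      # large => unreliable
    return average_error, spread


# Three hypothetical scales matching the three dartboards.
scales = {
    "reliable but invalid": simulate_scale(bias=30.0, noise_sd=0.5, seed=1),
    "valid but unreliable": simulate_scale(bias=0.0, noise_sd=15.0, seed=2),
    "valid and reliable": simulate_scale(bias=0.0, noise_sd=0.5, seed=3),
}

results = {name: summarize(readings) for name, readings in scales.items()}

for name, (average_error, spread) in results.items():
    print(f"{name}: average error {average_error:+.1f} lbs, spread {spread:.1f} lbs")
```

The point to notice is that taking more readings helps the second scale (the scatter averages out) but does nothing for the first one, whose error is baked in.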

What Nash and the Suns did convinced me that using box score based metrics would give us something like the image on the left. They are not valid, or at least not valid enough for me to use them to the extent that I’d like.

+/- Statistics give us a (more) valid stat

The problem with box score stats is that in the end, they only matter to the extent that they impact another stat: the scoreboard. So when we find that just looking at them only gives part of the picture, why not just go directly to the source of the meaning and measure a player’s correlation with scoreboard performance?

Now, as I’m guessing most of you either know or can guess, +/- stats ain’t perfect. When you can stand on the court doing nothing and get as much credit as the guy who scores the bucket, that is a flaw. Key point though: What kind of a flaw is it? I would argue that once you’ve made the adjustments the basketball stat community has made, and you get clear on what you use it to measure, its error is virtually entirely one of reliability. Its validity, in other words, is much superior to that of stats like PER, or Win Shares, or Wins Produced. Thus begins the debate about the flaws in the stat, so let me play your part and hammer this out without you having to break a sweat:

1. One player may play with better teammates than another.

2. One player may be more likely to play against better opponents than another.

Adjusted +/- uses regression analysis to account for both these issues. If you’re playing a disproportionate amount of time with a superstar, it’s going to credit the superstar for a lot of the team success. If you’re playing against the opposing team’s bench most of the time, it’s going to know that too.

3. A player may play a disproportionate amount of time in a good matchup.

Good one, you know your stuff. There’s a part of this I’m going to address in the next section, but for here let’s consider: What type of player would this be? Realistically, not a star. If you’re playing big minutes, then your coach is basically just playing you whenever possible. Yes, he might play you more against the other team’s best (which adjusted +/- accounts for), but the way he’s using good matchups is to decide how much you’re going to shoot, how the team’s going to attack, etc. You’re too good for him to actually pull you out regularly.

However, a player with fewer minutes could indeed have this issue, so I would never look at a guy who plays sub-minutes and say “That coach is an idiot, look at the +/-, he should be playing 35 MPG!”. I would simply conclude, if I concluded anything, that the player was being used well when he’s on the court.

4. A player may get to play in a role more conducive for his talents than another.

This is true of all stats, and why we should look at them directly as measurements of value or current impact. Use of that value measurement to extrapolate raw goodness is fine, but it should be understood to be less quantifiable. So this is a matter of clarification of the target. Make your target ambitious enough, and any claim on validity is absurd.

5. A player may just be lucky.

That’s a reliability issue.

6. If the team didn’t have a particular player, they may change style and thus do better than +/- would predict.

This I think is the best criticism of +/- I know of. In terms of measuring an individual player, I know of no way around this.

However, consider how this affects other stats. I don’t think any stat is immune to this issue, so this certainly isn’t an issue that closes the validity gap between +/- and other stats.
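The regression adjustment described under objections 1 and 2 can be sketched in a few lines of Python. This is a toy under loudly stated assumptions, not the method any particular site uses: the six players, their “true” ratings, the stints, and all function names are hypothetical; the data is noiseless and generated from those assumed ratings; and one player is pinned at zero as a reference so the equations have a unique solution. Each stint contributes one row with +1 for our players on the floor and -1 for theirs, weighted by possessions.

```python
# Hypothetical "true" impacts (points per 100 possessions), used only to
# generate noiseless toy data. None of these numbers come from real players.
TRUE_RATING = {"Star": 8.0, "Sidekick": -1.0, "Sub": 1.0,
               "OppStar": 4.0, "OppRole": 0.0, "OppRef": 0.0}

# Each stint: (our players on court, their players on court, possessions).
# Sidekick plays mostly next to Star; Sub mostly does not.
STINTS = [
    (("Star", "Sidekick"), ("OppStar", "OppRole"), 60),
    (("Star", "Sidekick"), ("OppRole", "OppRef"), 20),
    (("Star", "Sub"), ("OppStar", "OppRef"), 10),
    (("Sidekick", "Sub"), ("OppStar", "OppRole"), 10),
    (("Sidekick", "Sub"), ("OppRole", "OppRef"), 5),
    (("Star", "Sub"), ("OppRole", "OppRef"), 5),
]


def margin(ours, theirs):
    """Point margin per 100 possessions implied by TRUE_RATING (no noise)."""
    return sum(TRUE_RATING[p] for p in ours) - sum(TRUE_RATING[p] for p in theirs)


def raw_plus_minus(player):
    """Possession-weighted average margin while the player is on the court."""
    on_court = [(margin(o, t), n) for o, t, n in STINTS if player in o]
    return sum(m * n for m, n in on_court) / sum(n for _, n in on_court)


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def adjusted_plus_minus(reference="OppRef"):
    """Weighted least squares on stints, with the reference player fixed at 0."""
    players = [p for p in TRUE_RATING if p != reference]
    rows = [[(p in ours) - (p in theirs) for p in players] for ours, theirs, _ in STINTS]
    ys = [margin(ours, theirs) for ours, theirs, _ in STINTS]
    ws = [poss for _, _, poss in STINTS]
    k = len(players)
    # Normal equations: (X^T W X) beta = X^T W y
    xtwx = [[sum(w * r[i] * r[j] for r, w in zip(rows, ws)) for j in range(k)]
            for i in range(k)]
    xtwy = [sum(w * r[i] * y for r, y, w in zip(rows, ys, ws)) for i in range(k)]
    return dict(zip(players, solve(xtwx, xtwy)))


raw = {p: raw_plus_minus(p) for p in ("Star", "Sidekick", "Sub")}
adjusted = adjusted_plus_minus()

for p in raw:
    print(f"{p}: raw {raw[p]:+.2f}, adjusted {adjusted[p]:+.2f}")
```

In this toy, Sidekick’s raw number (about +2.9) beats Sub’s (about +1.8) only because he spends most of his time next to Star. The regression sees the lineups and hands the credit back, recovering the assumed ratings of -1 and +1. With real, noisy data the recovery is nowhere near this clean, which is exactly the reliability issue taken up next.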

Okay, say you’re right, how do you use that if there isn’t reliability?

Hopefully I’ve convinced you that +/-, when properly adjusted, is more valid than any other stat. If not, consider at least that any bias it would have would be unrelated to the bias of the PER-type stats. If reliability weren’t an issue, everyone would want to use it for that reason alone, presuming they didn’t somehow conclude +/- had inferior validity. It is however quite understandable to be unsure how you’d make use of it if there were no reliability in the stat.

Well, first of all, it’s not that there is no reliability, it’s that it is less reliable. That means that we just need more data to reduce randomness than we would with other stats. Note that it would be unreasonable to treat a player’s performance in one game, by any metric, as the gospel describing that player. I went through this in the Kobe article. People wanted to dismiss what the stat says about him this year because of small sample size, but the whole point is that we’ve played enough that we can’t ignore the glaring stuff.

Over the course of one year there’s still a lot of noise. Enough noise, certainly, that you would never want to base your MVP vote on one player finishing one slot ahead of another in the adjusted +/- standings. (Not that I’d do that with any stat mind you.) However, not so much that you can’t tell anything, and you need to be aware of exactly what it would take for you to toss the numbers out. A good gauge to use for that is the standard error (SE). At this point in the season, a one-year estimate typically yields SEs of 4-6 for star players. So clearly if the gap between two players like this were, say, 2, you’d toss it out completely. Even if it were the sum of the standard errors, around 10, you could argue that’s not enough sample size to make a reliable conclusion. (And I’ll also note that SE is nothing magical. You could argue for too little sample size even beyond it.) The gap between the league leaders and Kobe is around 20 though. That’s rough to ignore.
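The gap-versus-SE reasoning above is easy to put into a few lines of Python. This is a rough sketch, not a formal test: the function and key names are mine, the SE of 5 is just the midpoint of the 4-6 range quoted above, and the second ratio assumes the two players’ errors are independent, which may not hold for estimates coming out of the same regression (teammates especially).

```python
import math


def gap_summary(gap, se_a, se_b):
    """Express a gap between two adjusted +/- estimates in units of their error."""
    se_sum = se_a + se_b                          # the conservative yardstick used above
    se_diff = math.sqrt(se_a ** 2 + se_b ** 2)    # assumes independent errors
    return {
        "gap_over_se_sum": gap / se_sum,
        "z_if_independent": gap / se_diff,
    }


# Gaps discussed above, with an assumed SE of 5 for each player.
for gap in (2, 10, 20):
    summary = gap_summary(gap, se_a=5.0, se_b=5.0)
    print(f"gap {gap:>2}: "
          f"{summary['gap_over_se_sum']:.2f}x the summed SEs, "
          f"{summary['z_if_independent']:.2f} SEs of the difference")
```

On either yardstick, a gap of 2 is well inside the noise, a gap of 10 is borderline, and a gap of 20 is twice the summed SEs, which is why it’s hard to wave away.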

Then there are the multi-year estimates. The 2-year SE drops to around 2-4. A 6-year estimate we have drops it to below 1. If a player is consistently doing well year after year, to me that’s when +/- starts hitting paydirt and I start considering it to be more real than the other stats. In the 6-year estimate from ’03-04 to ’08-09, Kevin Garnett has by far the best rating despite having changed teams and roles midway through. I don’t know how that can not be thought to be extremely informative and important.

+/- as a Barometer

Finally let’s consider +/- not as a measure for a particular player, but as a way to gauge the general tendencies of the league. For example, in the 6-year estimate above, there are 8 point guards (within a particular standard error cutoff) whose offensive adjusted +/- rates above +3.0, and only 1 center (Shaq, at +3.38), but there are no point guards who rate that well on defensive adjusted +/- and 14 centers who rate at that level or better. The same trends have been analyzed in more detail at this site discussing the ’07-08 season. This seems to say quite strongly that perimeter players are better on offense and big men are better on defense, which would make sense with what a lot of people already believe.

And let me revisit the two issues I partially brushed off earlier:

3. A player may play a disproportionate amount of time in a good matchup.

6. If the team didn’t have a particular player, they may change style and thus do better than +/- would predict.

Notice how these issues don’t really concern the barometer use, at least as it’s used above? Would it really make sense to say that every player of a particular common type gets disproportionately good matchups, or is being overrated through use of a specialty style? I don’t see it. As an NBA player, you are your ability to help your team against other NBA teams. If you have abilities that correspond to success the league over, then it’s no fluke that you are succeeding. (In other words, those mad one-on-one skills you got? Nobody cares about them if they don’t see the results on the scoreboard, brother.)

So as a barometer, +/- becomes even closer to optimal validity, and considering that it’s easy to get a large sample size when using the whole league, it has great reliability too. You really can’t do any better than that. I don’t see any excuse for a thorough, knowledgeable analyst and/or statistician to not use it.

The end goal is valid and reliable conclusions

I’m appending this last section on to the end as it seems clear I didn’t properly circle back to this. By no means am I saying that you should only use +/- statistics in your analysis, or that you should not use box score based metrics. What I’m saying is that the goal is to achieve valid and reliable conclusions, and that traditional metrics are very much lacking on the validity side of things. Therefore having a stat that is strong on the validity side and weaker on the reliability side gives us a means to shore up a weakness. I would not advocate using +/- stats alone, but rather as part of an analytic cocktail with the more traditional stats and of course observations, to make our thinking as robust and well-rounded as possible.

Written by Matt Johnson

March 26, 2011 at 12:04 am

9 Responses

  1. I really like the reliability vs validity angle with associated dartboard for visualization. Cool concept with easy understanding.

    I also like the way you enunciate your reasons for wanting another way to look at the game besides just the box score stats. Anecdotes like your Nash one help make it clearer that something more was needed. Another angle you could have taken with that is just to point out that the box scores just don’t cover all of the game, especially on defense where blocks/steals/rebounds are at best secondary measures that completely miss entire swaths of an entire half of the game.

    One thing that I would get from this article, that I’m not sure that you mean, is that you are advocating using +/- stats INSTEAD OF box score based stats like PER or whatever. I’m not sure I would go that far, and your dartboard example gives good reasons why. If one HAD to pick just one, maybe I’d go with APM, but I think the best statistical picture combines the box scores with the +/- stats in some way. How you make that combination is where the artistry and skills of the analyst come in, but I definitely think one needs to consider both types of info (in conjunction, obviously, with everything else) when making their decisions.

    drza44

    March 26, 2011 at 4:32 am

    • drza, I really appreciate your feedback. I’ve added a section at the end to hammer in that I’m talking about adding a tool to an arsenal, not replacing an existing tool.

      Also you’re absolutely right that defense is actually the part least well covered by PER-type stats. If I were to think of one person most underrated by PER, I’d think of Bill Russell (who not coincidentally is another one of my favorites). Nash was the one that supplied my epiphany though, so he gets namechecked here.

      Matt Johnson

      March 26, 2011 at 1:56 pm

      • I guess a natural fall-back question is, how do you decide how much emphasis to put on APM? For example, as you point out in your last post, Kobe’s APM from this year is frankly horrid. It’s so bad that when I ran a stat analysis for wings in the league, I ended up using the 2-year APM because the 1-year values for Kobe (and Ray Allen) would have been so low that it significantly influenced their overall placement in my study. Enough so that I worried that my audience wouldn’t accept it and thus would miss the rest of the take-home I was presenting.

        But the thing is…I’m not positive that the APM story in this case is so wrong. I haven’t studied the Lakers this year in huge depth, but I watch a lot of the Celtics, and as I mentioned Ray’s APM is really low this year as well. This is because KG and Pierce are dominating the APM for the Cs, while Ray/Rondo and the others are lagging behind. But this sorta matches, as KG/Pierce seem to be the ones most often pulling the ship.

        Likewise, for the Lakers in this regular season, could it be that the Bigs really are just dominating the opponents to the point that the perimeter play is less important? And if so…if (for the sake of argument) Kobe really was putting up emptier (box score) numbers this year than he has in years past…OK, let me try again to frame this question, this time without anyone’s name in it:

        Can you BE a top-10 player in the league but be at the bottom of the league in APM? And if you can, does that say anything about the utility of the stat?

        drza44

        March 28, 2011 at 12:25 pm

      • I feel your pain man. I don’t intend to make a habit of talking blog shop, but when I really laid out my Kobe beliefs on here and RealGM, traffic really nosedived. They could not accept what I was saying, and so they just stopped paying attention. Not going to stop me from saying my piece, but it’s easy to see the pull talking heads feel not to deviate too strongly from the prescribed narrative.

        Deciding how much emphasis to give APM? Well I don’t think you can have one rule. I use it as a tool, fitting into the narrative that makes the most sense to me. That might seem wishy-washy to hard core stat guys, but my approach has always been a hybrid.

        I think it’s clear that a reading that Kobe is hurting the team is one that doesn’t think about how all of this is happening. To me it’s just such a key point to consider that the team is doing great with Kobe on the floor. The idea that Kobe’s a terrible player because he’s not pushing for even-greater in the regular season on a team only concerned with the post-season is silly.

        On the other hand, if you were to rank Kobe below Gasol because of the burden being carried, I would not call you crazy. I had that for part of the year before it just seemed clear that the truth of the matter is that the *team* is what’s so dang good. Kobe & Gasol are great, but as with the Celtics, it’s a mistake to think a keystone actually exists.

        Matt Johnson

        March 28, 2011 at 12:36 pm

      • What’s the term? Black box or something. The 2011 Lakers +/- is a black box. We know any combination of Kobe, Bynum, Gasol, Odom and Fisher is ridiculously successful. We don’t have the lineup diversity to tease out who gets the most credit.

        Pretty sure it’s not Fisher.

        Greyberger

        March 28, 2011 at 2:34 pm

      • @Greyberger,

        I liked your comedic timing there. Yeah, I think it’s safe to say Fisher is not the secret superstar of the group.

        The single player APM absolutely is a black box. More detailed lineup analysis will make it less so, but to some degree the black box aspect is an inherent part of something that’s born simply out of correlation. This is why APM really needs to be used in a cocktail with more concrete methods for optimal conclusions.

        Matt Johnson

        March 29, 2011 at 10:27 pm

  2. Very well written as always Doc! I often agree with your opinions, however I’m nowhere near as capable of writing my thoughts like you do.

    Hopefully, someday APM will be recognized and trusted by a lot more analysts and fans.

    Also, love the image for reliability/validity.

    Rapcity_11

    March 27, 2011 at 7:45 pm

    • Thanks so much Rapcity, appreciate the kind words.

      I’m also so glad others found the reliability/validity image as helpful as I’d hoped it would be. People often look at bringing in 50 cent words or seemingly out of context concepts as a bad thing, and it can be poorly done, but I find they can quickly convey an idea that people would otherwise spend indefinite periods of time not seeing.

      I think APM is always going to have some struggles with mainstream acceptance. On one hand, teams already use it, so that’s pretty dang good. On the other hand it can so easily be used poorly, and the best versions of PM stats are so complex, that many will dismiss the stat out of hand.

      Part of the battle I’m trying to wage is in just getting some people to stop dismissing things completely, and look to use them instead along a grayscale.

      Matt Johnson

      March 29, 2011 at 10:23 pm

  3. […] and this where these new fangled advanced stats become so revolutionary. You’ve seen me wax optimistic about the benefits of +/- statistics before, here is where I find them extremely powerful. Below are the […]

