The 3P% Inflection Point

Well, that’s embarrassing.

Modern NBA media and analytics put extreme emphasis on the 3-point shot and its efficiency over several 2-point shots. After all, 3 > 2 is indisputable. Out of all the metrics created, 3 point percentage is one of the remaining useful statistics from a traditional box score. But why is this? Why do media members harp on this statistic so much? Why are fans so quick to blame shooting?

It’d be easy to blame Steph Curry and the jump shot happy Warriors for afflicting this strategy upon the NBA. However, it started much earlier than that. Mike D’Antoni and the 7 seconds or less Nash led Phoenix Suns showed a revolutionary pace and space offense. The Spurs played them, beat them, and later appropriated their style with various adaptations. Then the Warriors put the strategy on overdrive with numerous off ball actions in addition to the best shooting, possibly ever. Now, you’ve got fans, analysts, and talking heads on sports shows saying the same things:

  • Can he shoot well enough to stay on the floor?
  • Can he play against the Warriors? Or the Rockets?
  • What’s his 3 point percentage?
  • “WHY CAN’T [MY PLAYER] SHOOT? TRADE HIM IMMEDIATELY. SIGN SOMEONE IN HIS PLACE.”
  • Why is that guy open?!

Math Required:

To begin seeing the importance of the 3-point shot, we have to start with expected value. The formula is simple.

Probability of an event * Point value = Expected Value

It’s not whether you’d get 3 points or 2 points from a shot. It’s the value you expect to return per shot taken. The values will look pretty funny, but I promise it will make more sense at the end of this article. We don’t need to be afraid of decimals anymore.

During Steph Curry’s insane MVP season, he shot 45.4% on 3-pointers. That translates to 1.362 points per 3-point attempt. To contrast this with a mid-range heavy player, Kobe Bryant scored 35.4 points per game on 48.2% 2-point FG% during his phenomenal 2005-06 season. This equates to 0.964 points per 2-point shot attempt. That’s a whopping 41% more expected value per shot for Steph! Granted, no other player can shoot like Steph Curry can, but this remains an impressive increase in EV.
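Here’s a minimal sketch of that arithmetic in Python. The function name `expected_points` is my own label for illustration; the percentages are the ones quoted above.

```python
def expected_points(make_probability: float, point_value: int) -> float:
    """Expected points per attempt: chance of a make times the shot's value."""
    return make_probability * point_value


curry_three = expected_points(0.454, 3)  # Curry's 3P% quoted above
kobe_two = expected_points(0.482, 2)     # Kobe's 2-point FG% quoted above

# How much more Curry's 3 is worth per attempt, relative to Kobe's 2.
relative_gain = curry_three / kobe_two - 1

print(f"Curry 3-pointer: {curry_three:.3f} points per attempt")
print(f"Kobe 2-pointer:  {kobe_two:.3f} points per attempt")
print(f"Relative gain:   {relative_gain:.1%}")
```

The same function works for any shot type: plug in the make probability and the point value.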

To further illustrate the value of a 3 point shot, we have to look at another statistic, team offensive efficiency:

Offensive Efficiency Formula = 100 * (Points Scored)/(Possessions)

You can usually find offensive efficiency abbreviated as Ortg.

This equates to how many points a team is expected to score over 100 possessions. To scale this to expected value, we can simply divide by 100. I’m going to use team offensive efficiency as the relative standard to illustrate why 3-point shots, and thus shooting ability, are so valuable now.

In the last three seasons, the league average offensive ratings were 108.6, 108.8, and 106.4 from newest to oldest. This averages out to 107.9, so the average NBA possession was worth 1.079 points. This alone already illustrates how terrifying a Steph Curry 3-point shot really is, and especially was during his MVP season.
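As a quick check, here is a small sketch that averages the three league ratings quoted above and rescales the result to points per possession. The helper name `points_per_possession` is just for illustration.

```python
from statistics import mean

# League-average offensive ratings quoted above, newest to oldest.
league_ortg = [108.6, 108.8, 106.4]


def points_per_possession(ortg: float) -> float:
    """Offensive rating is points per 100 possessions, so divide by 100."""
    return ortg / 100


avg_ortg = mean(league_ortg)
avg_ppp = points_per_possession(avg_ortg)

print(f"Average Ortg: {avg_ortg:.1f}")
print(f"Average PPP:  {avg_ppp:.3f}")
```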

The 3-Point Inflection Point

So how well do players need to shoot to remain a threat? We can solve this with some simple algebra.

1.079 PPP < 3P% * 3

Solving for 3P% gives about 36%. That is what a player has to shoot from 3 just to match the average NBA possession. At first glance, that looks really, really high. I thought it’d end up being about 33% or a little bit less because all 2-point shots look pretty difficult on TV.
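The algebra is just a division, sketched below. `break_even_pct` is a name I made up for illustration, and the 1.079 target is the three-season average from earlier. The same function also gives the equivalent threshold for a 2-point shot.

```python
def break_even_pct(target_ppp: float, point_value: int) -> float:
    """Make percentage at which a shot's expected value equals target_ppp."""
    return target_ppp / point_value


LEAGUE_PPP = 1.079  # three-season league average from the text

three_point_threshold = break_even_pct(LEAGUE_PPP, 3)
two_point_threshold = break_even_pct(LEAGUE_PPP, 2)

print(f"Break-even 3P%: {three_point_threshold:.2%}")
print(f"Break-even 2P%: {two_point_threshold:.2%}")
```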

According to data from basketball-reference.com, there are 319 unique players of 713 total unique players who have a 3P% of 36% or higher in the past three seasons. Out of those 319 players, 110 players played at least 8 minutes per game and shot at least 1.5 3-point attempts per game.

Let me repeat. Only 110 out of 713 players in the past 3 seasons were significant players who shot better than 36% from 3. That includes some odd bench players. That includes the rotation players. It’s everyone in the past 3 seasons.

Just for fun, we can look at the last season too. The 2017-18 season had a total of 540 players. 113 had a 3P% better than 36%, played 8 minutes or more, and attempted 3 3-pointers per game.

To generate the same 1.079 PPP with a 2-point shot, the shot must have at least a 53.95% chance of going in. Generally, that means only paint shots are acceptable, which is why midrange shots are shunned so harshly. Well, unless you’re Steph Curry who shoots 60% in midrange shots. Then you can just shoot whatever you want. Because who cares? You’re Steph Curry. Everything coming out of your hands may as well be on fire. 🔥

How Spacing is Created

36% is the magic number for whether or not you close out on a 3-point shooter. It means those are the players that defenders either must guard diligently or stay near enough to make a worthy close out effort. An extra step or two away matters in the NBA with players this big and fast. Extra spacing warps a defense further out of formation and makes the point of attack more dangerous. Without spacing, a team can send extra defenders to stifle pick and rolls, deter drivers, step into passing lanes, and pack the paint. This is the mechanic governing spacing on the basketball court. The extra space is created simply by a threat of making a 3-pointer.

The offensive players’ shooting abilities will also determine how a pick and roll is guarded. If the ball handler is a great shooter, it can make a mess of defenses as evidenced by Curry-Thompson pick and rolls. Switches are one method, but they’re prone to error. One miscommunication? Someone is suddenly wide open, and everyone on defense is looking around, pointing fingers, and asking “whose fault is this?” Fighting over good screens fatigues players mentally and physically throughout the game and season. Having the screen defender hedge out to give your primary defender time to recover can put your screen defender out of position. He needs to be fast enough to get back to the screener if the screener shoots well. If the ball handler can’t shoot well, the primary defender will always go under to recover for the drive. If the ball handler can’t shoot better than 36% from 3, you’d rather give up that shot than allow other, more valuable plays.

Just let 6’8″, 220 lbs Jimmy Butler tell you how hard screens can be.

Shooting is hard. Shooting is a skill. JJ Redick’s podcast with Kyle Korver shares just how much preparation goes into the mechanic that is shooting a basketball from 23 feet away. It shows how much a few little injuries can affect and throw off the entire, polished mechanic. It is further evidenced by how the percentages can change depending on how wide open a player is.

Caveats and Simplifications

  1. One major caveat to my article is the 1.079 PPP value includes 3-pointers in it. In the last decade or so, teams have used the 3-point line more and more. If we isolated just 2-point field goals and recalculated the minimum 3P% needed, I suspect it would be lower by at least 5%. Unfortunately, I don’t have the dataset to do this yet.
  2. 3-point percentage in this article is extremely simplified. It’s an all encompassing statistic that is accounting for every kind of shot. This includes catch and shoot or off the dribble. Off the dribble shots are known to be far more difficult given a defender is closer, and the mechanic seems more complicated than catching and shooting. The percentage is swayed by the difficulty of shots.
  3. Last second heaves and super late in the shot clock shots should be tossed out to get a more accurate sense of what a player’s true 3-point percentage is. I think this point is a good point of discussion. Several teams run actions and still get a decent shot at the ends of the shot clock. However, we could still find a way to throw out end of quarter heaves even if they have little effect.

A Reshaped, Modern Playoff Format

Why should we do this anyways? The old format works fine!

Now hear me out. I’m not saying the current playoff formatting is broken. I think it’s pretty optimal for scheduling purposes, but it is mostly predictable. The NBA is one of the more progressive sports leagues out there, readily doing away with things like regional titles and how winning subregions affect playoff seeding. But we can do even better. How about adding some variability for a more fun fan experience?  We can reward the top records even more. We can give them even more choice in the road towards the finals. Let’s fabricate some DRAMA!

I borrowed a lot of these ideas from the GSL StarCraft 2 league in South Korea, the most well respected tournament in the game. Esports tournaments are not limited by location, physical fatigue, or the extreme scheduling hurdles that physical sports have. The result is a lot of varied, modern formats. I understand there is a mountain of logistics that makes video games different from basketball teams, and I will be ignoring a lot of them for the sake of simplicity.

These are no fun to consider when planning a new format.

  • Arena scheduling
  • Travel and rest days
  • Scheduling all together from TV networks

My broad idea is maintaining fairness within the competitions, shaking up a rigid format, and rewarding the best records in the league with more choice in their playoff brackets.

A Brief Overview of GSL

The Rules:

  • The four top-ranked players from 2017 are seeded into separate groups (A-D-C-B order)
  • First four picks are made in order of seeding (#1 seed gets 1st pick, #2 seed gets 2nd pick, etc.)
  • Remaining picks in “snake draft” order starting from Group D
  • After all picks are made, #1 seed can swap any two non-seeded players

Seeded players are the top 4 from the previous season. They are split off into four separate groups and choose the first person to put into their respective groups. The chosen players then choose the third player, and the third player chooses the last one in each group. At the very end, the champion from the previous season gets to swap two players. You can find an example here.

Two players advance from each group and are seeded into a round of 8 playoffs stage where you cannot play a player from your own group.

The best part? The group selection is televised! With lots of smack talk!

Well, that format would take forever and be stupidly crazy for the NBA!

You are right. It is. I tried it. There’s no way to do it unless the playoffs took 6+ months, but we can still steal some ideas for my own interests. I’m tired of seeing Golden State play LeBron. I don’t want to sit through more Clippers-Grizzlies series. I don’t want to have this fake fabricated Philadelphia-Boston drama. I want teams across both East and West conferences to meet one another before the finals. I want a new format! So here’s my idea.

Here are the 16 teams that qualified for the playoffs in 2018, ordered by record.

  1. Houston
  2. Toronto
  3. Golden State
  4. Boston
  5. Philadelphia
  6. Cleveland
  7. Portland
  8. Oklahoma City
  9. Utah
  10. Indiana
  11. New Orleans
  12. San Antonio
  13. Minnesota
  14. Miami
  15. Milwaukee
  16. Washington

Now, we split the top two teams onto opposite ends of the bracket. Then, Houston gets the first pick of whether it wants Golden State or Boston on its side of the bracket.

nba-playoffs-bracket-initial-picks

This is just the start. Next, we have to fill out where the teams 5-8 are slotted. Toronto picks first, then GS, and finally Boston. The teams picked get put into their respective brackets. Then we start from 1-8 and start picking first round opponents.

Here’s what I picked out:

nba-playoffs-bracket-ro16.png

The sequence that led to this bracket:

  1. Houston and Toronto are put on opposite ends of the bracket.
  2. Houston chooses Boston for its side of the bracket. Nobody wants Golden State on their side.
  3. Toronto chooses OKC.
  4. GS chooses Philadelphia.
  5. Boston chooses Portland.
  6. Houston gets stuck with the LeBrons.
  7. Then teams 1-8 by record choose their respective opponents.
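The sequence above can be sketched as a small function. This is only an illustration under stated assumptions: the name `draft_bracket`, the "top"/"bottom" labels, and the argument layout are all mine. The first-round opponent picks are not listed in the text, so the sketch assumes each of teams 1-8 simply takes the worst remaining record.

```python
def draft_bracket(teams, top_seed_partner, middle_picks):
    """Sketch of the proposed draft. `teams` holds 16 names, best record first."""
    first, second, third, fourth = teams[:4]
    if top_seed_partner not in (third, fourth):
        raise ValueError("seed 1 may only choose seed 3 or seed 4")
    other = fourth if top_seed_partner == third else third

    # Seeds 1 and 2 anchor opposite halves; seed 1's choice joins its half.
    half = {first: "top", top_seed_partner: "top",
            second: "bottom", other: "bottom"}

    # Seeds 2, 3, and 4 pick from seeds 5-8; the leftover lands with seed 1.
    available = list(teams[4:8])
    for picker in (second, third, fourth):
        choice = middle_picks[picker]
        available.remove(choice)  # raises ValueError if already taken
        half[choice] = half[picker]
    half[available[0]] = half[first]

    # Seeds 1-8 pick first-round opponents from seeds 9-16.
    # ASSUMPTION: every picker takes the worst remaining record.
    pool = list(teams[8:])
    matchups = [(picker, pool.pop()) for picker in teams[:8]]
    return half, matchups


teams_2018 = [
    "Houston", "Toronto", "Golden State", "Boston",
    "Philadelphia", "Cleveland", "Portland", "Oklahoma City",
    "Utah", "Indiana", "New Orleans", "San Antonio",
    "Minnesota", "Miami", "Milwaukee", "Washington",
]

half, matchups = draft_bracket(
    teams_2018,
    top_seed_partner="Boston",
    middle_picks={
        "Toronto": "Oklahoma City",
        "Golden State": "Philadelphia",
        "Boston": "Portland",
    },
)

for team in teams_2018[:8]:
    print(f"{team}: {half[team]} half")
for picker, opponent in matchups:
    print(f"{picker} vs {opponent}")
```

Swapping in a different opponent-picking rule only changes the last step, so it’s easy to try other scenarios.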

So what about teams 9-16? Are they just at the mercy of the higher seeded teams? Yes. Yes, you are. Should have fought harder to get into the top 8 then!

Keep in mind this is just one scenario that I tried.

Why? Just why?

Nobody tanks for position.

You know what is probably done but nobody talks about? Tanking for position. Teams were jockeying to play Boston in the first round this year after it lost Kyrie Irving. Teams thought this Boston team would be a joke. This format completely eliminates lower seeds’ ability to determine playoff bracket positioning and rewards higher seeds for having a great regular season record.

Side note: Boston is amazing this year, and I’m very sad Ben Simmons has to suffer for their greatness.

Better records are rewarded with more power.

This format also gives the top 4 seeds much more meaning and choice in determining their respective roads to the finals. We avoid situations such as in 2007 where the winner of the Finals was clearly going to come from the West. As spectacular as it was for a young LeBron to lead his team to the finals, the Suns were a much more formidable opponent for the Spurs. Home court is great. Choosing your bracket is much better, and possibly more fun.

I think this format gives a better chance for the two best teams to actually meet in the Finals.

Isn’t that what we all want? We want the Finals to be the spectacular contest with the two best teams. The fans and the media more or less know who the best few teams are. So the top seeded team should exert its power and throw the biggest threat on the opposite end of the bracket. We want each round of the playoffs to be more competitive than the previous ones. It should not matter if 6 of the top 8 teams come from the West. We just want the best to advance.

DRAMA! DRAMA! DRAMA! And variety.

Fabricated drama, this generation’s greatest talent. Let’s televise the selection process! Have the disrespect flow into the lower seeded teams. The Wizards’ backcourt would have a field day with the media saying everyone disrespects them. How great would it be if a lower seeded team won too? The smack talk among fan bases will be AMAZING.

Plus, aren’t we tired of seeing a Golden State vs Cleveland finals? How about we add some variety to the format to avoid a constantly repeating finals? They still have opportunities to meet before the finals if the brackets shake up right, so the storyline is still possible.

THIS IS MADNESS! What are the problems with your format?

The top seed has too much power.

I think it’s fair, but I can see this being a problem in the future. Having the top records should mean something.

Regional rivalries may be gone. Some fans care about the nostalgia.

There was a lot of talk this year about the Philadelphia-Boston rivalries of the 80s. Personally, I think this is stupid. A lot of the fans and all of the players except Thon Maker weren’t even alive during that time. They don’t care. It’s a new team. It’s a different world. I am open to hearing other fan perspectives about this idea. Feel free to message me or leave a comment to elaborate. I will read it.

The first round may produce some real snoozers.

The top 4 seeds at least will choose the weakest opponents and probably stomp them to the ground. We may have a record number of sweeps and 4-1 series. This is good game theory, but it’s bad for the fan watching experience. It’s also really boring for the media to try and fabricate some kind of narrative.

Did I miss anything?

Tell me about it!

Old Championships, Modern Players

Introduction

A few weeks ago, I saw a post on the NBA subreddit putting the 2003 San Antonio Spurs’ championship into perspective by comparing its roster to current players. While the original author’s logic is far too simple and flawed, the concept is worth a look. I chose to expand this idea by finding similar players for the ’03 Spurs, ’11 Mavericks, and ’08 Celtics. These Spurs and Mavericks teams are commonly cited cases of a team having just one incredible superstar (Tim Duncan and Dirk Nowitzki) and a collection of subpar players. The ’08 Celtics are a more down-to-earth example of one of the first super teams, albeit an aging one.

So were Tim’s and Dirk’s supporting casts really that bad? Initially, I mostly think no. Basketball is and always will be a team sport. However, as detailed briefly by Malcolm Gladwell in this episode of Revisionist History, a transcendent basketball player is possibly the determining factor for how great his team is.

Method

Warning: Math ahead. Feel free to skip this section and go straight to the results.

To find the most similar players, I used basketball-reference.com‘s player statistics. The features include all per game, per 36 minutes, and other advanced statistics such as offensive rating. I did not include shooting statistics found in individual player pages because I didn’t know how to scrape them all yet. If you do know how or have access to it, I want to know! You can learn more at the site’s glossary.

Next, the player pool starts with the 1980-81 season and ends with the 2016-17 season. Where the player pool begins is mostly irrelevant because I only want to pull from the 2010-11 season and on. Current players are defined as players during or after the 2010-11 season. I figured this would provide more recognizable names. Anything older would be inaccurate because of how differently basketball is played now versus then. I also wanted to keep the idea from the Reddit post of using modern players for comparison. This is really just for fun.

Players are then subsetted to those who played more than 26 games and at least 450 minutes in the season total. 27 games is about a third of the season, and the 450-minute cutoff was chosen to include Speedy Claxton, a significant player in the Spurs playoff lineup.
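A minimal sketch of that subsetting step, using plain Python dictionaries. The field names (`games`, `minutes`) and the three sample rows are made up for illustration; they are not real stat lines or real basketball-reference column names.

```python
MIN_GAMES = 27      # "more than 26 games"
MIN_MINUTES = 450   # total minutes in the season


def qualifies(season: dict) -> bool:
    """True when a player-season clears both playing-time cutoffs."""
    return season["games"] >= MIN_GAMES and season["minutes"] >= MIN_MINUTES


# Hypothetical player-seasons, invented purely for this example.
seasons = [
    {"player": "Player A", "games": 70, "minutes": 2100},
    {"player": "Player B", "games": 26, "minutes": 600},   # too few games
    {"player": "Player C", "games": 40, "minutes": 430},   # too few minutes
]

kept = [s["player"] for s in seasons if qualifies(s)]
print(kept)
```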

In order to treat all the features equally, I used scikit-learn’s StandardScaler to scale each data point in the feature so that points and blocks carry equal weight in the comparison. Otherwise, points, which naturally have higher numbers, would carry more weight than steals or blocks in my comparison method.
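To show what that scaling does, here is a pure-Python stand-in for StandardScaler (not the scikit-learn call itself): each feature has its mean subtracted and is divided by its population standard deviation. The sample numbers are made up.

```python
from statistics import mean, pstdev


def standardize(columns: dict) -> dict:
    """Z-score every feature column: (value - mean) / population std dev."""
    scaled = {}
    for name, values in columns.items():
        mu = mean(values)
        sigma = pstdev(values)
        # A constant column has zero spread; map it to all zeros.
        scaled[name] = [(v - mu) / sigma if sigma else 0.0 for v in values]
    return scaled


# Made-up per-game numbers for three hypothetical players.
features = {
    "points": [10.0, 20.0, 30.0],
    "blocks": [0.5, 1.0, 1.5],
}

scaled = standardize(features)
for name, values in scaled.items():
    print(name, [round(v, 3) for v in values])
```

After scaling, both columns land on the same scale even though raw points are twenty times larger than raw blocks.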

Finally, the most fun part is the comparison method. I implemented KDTree to calculate Euclidean distance between data points. While we can Pythagorean theorem the hell out of 2D or even 3D features, 25 dimensions is a bit harder for me to comprehend. So we can employ linear algebra and more maths to see which points are closest to one another. The closer the point, the more similar one player is to another simply based on statistics. I chose to exclude the distance values in my results for display simplicity. If you are interested in how similar each player is, feel free to message me. I ignored player position when querying results because of how differently basketball is played now.
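The idea can be sketched without a KDTree at all. The brute-force version below swaps the tree for a plain sort on Euclidean distance, which returns the same neighbors on small data (a KDTree just finds them faster). The function name and the three-feature vectors are made up for illustration.

```python
import math


def most_similar(target, candidates, k=3):
    """Return the k candidate names closest to target by Euclidean distance."""
    ranked = sorted(
        candidates.items(),
        key=lambda item: math.dist(target, item[1]),
    )
    return [name for name, _ in ranked[:k]]


# Hypothetical standardized feature vectors (three features each).
target_player = (0.0, 0.0, 0.0)
pool = {
    "Player A": (0.1, 0.0, 0.0),
    "Player B": (2.0, 2.0, 2.0),
    "Player C": (-0.5, 0.5, 0.0),
}

print(most_similar(target_player, pool, k=2))
```

`math.dist` works for any number of dimensions, so the same call handles 25 features as easily as 3.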

Results

Remember that the players higher up on the list are more similar to the projected target player. I highlighted the two most similar players in their respective team colors. I also included the season and team of the player queried. I believe those are important things to remember when considering the exact player. Rookie Steph Curry and MVP Steph Curry are very different players.

2008 Boston Celtics

08_celts

Significant Lineup:

PG: Raymond Felton
SG: Rashard Lewis
SF: Manu Ginobili, the very possible, and some will argue rightful Finals MVP, full mane GINOBILIIIIII
PF: Tim Duncan
C: DeAndre Jordan

Bench:

Carl Landry
Patty Mills
George Hill (Or breakout playoff star Jonathon Simmons!)
Zaza

Yeah, I’d say that’s a pretty great collection of players.

2003 San Antonio Spurs

03_spurs

Results Surrounding Tim Duncan:

PG: Ty Lawson
SG: James Harden (OKC James Harden right before being traded)
SF: Jared Dudley
PF: Tim Duncan
C: Tim Duncan

Notes:

  • It’s pretty neat that David Robinson in his final year is so similar to Tim Duncan in his waning years.
  • Ty Lawson and young, speedy Tony Parker. Pretty solid comparison.
  • Manu and a young Beard! Also solid.
  • While this is not the current Warriors, Tim Duncan’s supporting cast is nowhere near as outrageously bad as the revisionist history says it is.

2011 Dallas Mavericks

11_mavs

Surrounding Dirk Nowitzki:

PG: Nicolas Batum
SG: Mickael Pietrus
SF: David West
PF: Dirk Nowitzki
C: NBA Champion Tristan Thompson

I’m just going to say it. That doesn’t look like a championship team to me.

Discussion

When looking at these player comparisons, Dirk’s run is easily the most impressive championship run in the modern era. The 3rd seeded Mavericks were a laughing stock at the time. Even the 6th seeded Blazers were jokingly favored over them. The Mavs then swept the defending champion Lakers in the semifinals, beat a young, up-and-coming Thunder team in the conference finals, and took down the Heatles in their first year together in the finals. We called Dirk soft. We laughed at his teams. We laughed when Cuban refused to pay a future 2 time MVP, but the lovable German and his glorious blonde locks proved he is one of the greatest.

Tim Duncan’s unsupported run in 2003, while incredible due to his individual feats, looks a bit overstated. Young Manu was good. Young Tony Parker, albeit with questionable decision making, was good. Bruce Bowen was good. Seeing how David Robinson compares to old Tim Duncan also means he was still really good at basketball. Perhaps we remember Tim’s run as running with a bunch of scrubs because of just how incredible he was that year, how amazing Coach Pop is, and because that was the beginning of the big 3 coming together.

Moving Forward

Next steps would be to gather shooting statistics such as volume of 2 pointers, shot distances, and other percentages to get a better idea of player similarity. While this only addresses how players are similar offensively, I believe these are especially important statistics because they show a lot about player positioning and where each player is effective. A high scoring Anthony Davis does not play in the same areas as a high scoring Kevin Love or Hakeem.

If I could, I would gather defensive statistics and create metrics for defense, which would make this query much, much better. Unfortunately, I do not have access to player tracking data. I will be writing a post soon about what we cannot measure that has significant effects on defense.

Caveats

  • Defensive statistics are incomplete. I am aware we cannot reasonably compare players this way. This was simply a fun exercise putting Tim’s and Dirk’s championship runs into perspective.
  • It’s difficult to find players to compare to great players. We define great players as great because they are outliers. Trying to find things to compare to outliers in itself is a bit silly. This is why I chose to leave out other championship teams, the two main stars, and think the Boston Celtics comparison table is probably garbage. I would not put any weight on it myself. It’s just there because it’s fun to see.
  • I am aware of the inconsistencies in table appearances, and I got really tired of them at this point. Excel is no fun to use!
  • Players listed with multiple teams were either bought out, traded, or re-signed somewhere else mid season.
  • I didn’t look at the degree to which players were similar to their outputs. I can get them, but I didn’t deem them necessary to see for this specific project.
  • Unsurprisingly, players are similar to themselves. I chose to remove those from the list.

A Look into the Top 50 Scoring Seasons

This post extracts the top 50 scoring seasons since 1981, shortly after the NBA introduced the 3-point line in the 1979-80 season. This new mechanic undoubtedly changed the way players score, but how do the top scorers actually differ? You can view an interactive set of these plots here. It has mouseover effects, filtering, and highlighting effects containing more information about each player and season.

 

Looking at the counts of player entries, position, and time periods, we can see a few things:

  • Michael Jordan is incredible.
  • Adrian Dantley is a force too.
  • The wing players score the most.
  • Most of the high scoring seasons came earlier.
Scatter plot - all points
This scatter plot is a measure of both volume and efficiency. Being towards the bottom right is bad, and being in the top left, which doesn’t exist yet, would be amazing. The tiny dot next to Wilkins’ name is Adrian Dantley, signifying a high scoring output, high minutes played, and low usage rate, something Westbrook would be confounded by.

Things I note in this scatter plot:

  • The giant Russell Westbrook entry, with nearly 42% usage rate.
  • Adrian Dantley’s multiple entries have extremely low usage rates.
  • Again, Michael Jordan is amazing.
  • Although Allen Iverson has multiple entries, we can see his efficiency is low for the volume.
FTr vs 3PAr scatter
This plot visualizes how some of our entries differentiated themselves in how each player scored. In the top right, you have players who shot a lot of 3-pointers and free throws. In the bottom left, you have players who utilized neither. In the top left are players who shot a lot of free throws but not 3-pointers, hence the centers. In the Stephen Curry corner are players who shot few free throws but plenty of 3-pointers.
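For reference, here is a sketch of the two axes, assuming the usual basketball-reference style definitions: free throw rate is free throw attempts per field goal attempt, and 3-point attempt rate is the share of field goal attempts taken from 3. The season totals below are made up.

```python
def free_throw_rate(fta: int, fga: int) -> float:
    """FTr: free throw attempts per field goal attempt."""
    return fta / fga


def three_point_attempt_rate(three_pa: int, fga: int) -> float:
    """3PAr: share of field goal attempts that were 3-pointers."""
    return three_pa / fga


# Hypothetical season totals, invented for illustration only.
fga, fta, three_pa = 1500, 600, 450

ftr = free_throw_rate(fta, fga)
three_par = three_point_attempt_rate(three_pa, fga)
print(f"FTr:  {ftr:.3f}")
print(f"3PAr: {three_par:.3f}")
```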

Shot type counts

These two plots show how scoring has changed among the eras. In the modern era, players utilize every kind of mechanic to score. However, what could be extra impressive is that Michael Jordan didn’t shoot the most free throws or 3-pointers. He just scored on two point field goals. This also visualizes the scoring balance of a lot of modern players, particularly James Harden.

Notes:

  • Eras are grouped as follows:
    • 80s: 1980 – 1989
    • Before Zone Defense: 1990 – 2002
    • Introduction of Zone Defense: 2003 – 2010
    • Modern Era: 2011 – 2017
  • A glossary of terms can be found here.
  • Here is a definition of free-throw rate.
  • Here is a definition of 3-point rate.
  • An asterisk on a player’s name denotes a Hall of Fame player.
  • Again, interactive forms of these plots can be viewed here.
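The era grouping in the notes above can be written as a simple lookup. The function name `era_for` is mine, and the sketch assumes each season is tagged with a single year; whether that is the year the season starts or ends is not stated, so treat it as an assumption.

```python
# (first year, last year, label), copied from the era list in the notes.
ERAS = [
    (1980, 1989, "80s"),
    (1990, 2002, "Before Zone Defense"),
    (2003, 2010, "Introduction of Zone Defense"),
    (2011, 2017, "Modern Era"),
]


def era_for(year: int) -> str:
    """Map a season year to its era label."""
    for first, last, label in ERAS:
        if first <= year <= last:
            return label
    raise ValueError(f"{year} is outside the eras covered here")


for year in (1988, 2002, 2003, 2017):
    print(year, era_for(year))
```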

The Evolution of the 3-Point Shot

In the past few years, we’ve seen various statistics showing the NBA audience all the new 3-point records and how differently the 3-point line is used now. For comparison, the leader of the ’87 – ’88 season was Danny Ainge, who made 148 3-pointers. The ’17 – ’18 season is only about 45 games in, and Klay Thompson already leads the NBA with 154 3-pointers made. This post takes a look at how players have improved to take advantage of the 3-point line.

Dashboard - FTA
Since the 1980-81 season, the year after the NBA introduced the 3-point line, 3-point attempts and 3-point percentages have steadily risen while 2-point shots per game have remained relatively the same.
Dashboard Scatter - all
When looking at the most prolific chuckers in the NBA, we can see that most of the entries, especially the good ones, come from the last 15 years. Era groups are shown in footnotes.
Dashboard Scatter - Modern
Even then, two names clearly stand out.
Dashboard Scatter - Top 50
Even among the most prolific 3-point makers in the NBA, the Splash Brothers are head and shoulders above everyone in attempts and efficiency. The one dot on the top right that doesn’t have a label corresponds to Ray Allen.

 

Dashboard - Splash
Just how special are they? They’ve maintained a high 3-point percentage while shooting at a high volume while being on the same team. The only dip came when this scrub Kevin Durant joined the team.

Few things to note:

  • Mark Jackson’s first year as head coach was the 2011-12 season. Steph and Klay were clearly given the green light to shoot right after that.
  • Eras are grouped as follows:
    • 80s: 1980 – 1989
    • Before Zone Defense: 1990 – 2002
    • Introduction of Zone Defense: 2003 – 2010
    • Modern Era: 2011 – 2017
  • The 3-point line was shortened from the 1994-95 season through the 1996-97 season, a span that covers the years Steve Kerr and Tim Legler achieved above 50% on 3-point attempts.
  • In the second scatter plot, the restrictions are a minimum of 1000 minutes played and at least 120 3 pointers made per season.

Intro

My name is Ben Xiao. I am a long time poker player, basketball enthusiast, data wizard to be, and German shepherd service human.

Today will be the start of many posts on basketball analytics. I’ll do my best to update this with what I think is interesting material. In the meantime, I’ll be doing a lot of reading and data mining to present material worth sharing. As the very old Ben Franklin once said, “Either write something worth reading or do something worth writing.”