It’s why they call it ‘March Madness.’
Every year, a team that nobody expects anything from (and that many fans have never heard of) knocks off a team with a household name.
For example, in 2012, 2-seed Duke was stunned by 15-seed Lehigh.
First of all, where is Lehigh? Second of all, how in the world did that team vanquish the vaunted Duke Blue Devils in Greensboro, North Carolina? (The Lehigh Mountain Hawks are located in Bethlehem, Pennsylvania, by the way.)
On the surface, a result like that is downright shocking. Below the surface? Could we have seen it coming?
Research, Research and More Research
I wanted to know if it’s possible to predict these colossal upsets, so I loaded up Excel, pulled up statsheet.com and got after it.
I looked back over the past 10 NCAA Tournaments and found the 30 most shocking upsets, including:
- 15’s over 2’s, 14’s over 3’s and 13’s over 4’s
- 12’s over 5’s and 11’s over 6’s that involved a non-BCS team defeating a BCS team
- 1-seeds losing to an 8/9 in the second round (and 10-seed Davidson over 2-seed Georgetown in 2008)
- 1-seeds losing in the Elite 8 to double-digit seeds
Then I decided on 13 different stats from the regular season with which to compare the teams involved in those upsets:
- Offensive efficiency (points scored per 100 possessions)
- Offensive turnover percentage (percentage of possessions that ended in turnovers)
- Field goal percentage
- Free throw percentage
- Rebounds per game
- Assist-to-turnover ratio
- Three-point field goal percentage
- Road winning percentage
- Strength of schedule
- Winning percentage in last 10 games
- Average age of starting five
- Defensive efficiency (points allowed per 100 possessions)
- Turnovers forced per game
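For anyone who wants to reproduce the per-possession stats in the list above, here is a minimal Python sketch. The function names and the sample numbers are my own illustration, not figures from the spreadsheet, and it assumes you already have a possession count for the team.

```python
def per_100_possessions(points, possessions):
    """Efficiency: points scored (or allowed) per 100 possessions."""
    return 100.0 * points / possessions


def turnover_percentage(turnovers, possessions):
    """Share of possessions that ended in a turnover, as a percentage."""
    return 100.0 * turnovers / possessions


# Hypothetical game line: 70 points and 13 turnovers on 65 possessions.
offensive_efficiency = per_100_possessions(70, 65)
offensive_turnover_pct = turnover_percentage(13, 65)
print(round(offensive_efficiency, 1), round(offensive_turnover_pct, 1))
```

Feed the same first function the points a team allowed and you get defensive efficiency.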
Calling In The Expert
It took me the better part of a week, but I gathered all the data. Then I brought in my good friend Andy, a high-level engineer at a large global corporation (and a huge college basketball fan), to analyze the numbers.
Andy glanced over the massive spreadsheet and immediately began highlighting cells and selecting functions from drop-down menus that I never knew existed. I asked him what he was doing and he replied, “I’m just using your statistics to compare the winners and losers, assuming a ‘Gaussian distribution’ across national team rankings.”
In layman’s terms: We’ll find out which stats matter the most.
The analysis revealed The Top 6 Stats (in descending order of significance) that factor into your average shocking upset:
- Turnovers forced
- Average age of starting five
- Winning percentage in last 10 games
- Offensive turnover percentage
- Defensive efficiency
- Road winning percentage
When you think about it, the fact that these particular statistics matter the most makes a heckuva lot of sense. Basically, the teams that take the ball away, have more experience and are hotter down the stretch have a good chance of shocking the world. In addition, these teams tend to take care of the ball and play great defense while demonstrating the ability to gut out victories in tough environments throughout the regular season.
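If you'd like to try a ranking like this yourself, here is one rough way to do it in Python. To be clear, this is not Andy's spreadsheet method. It is a stand-in that uses a paired t-statistic (which also leans on a Gaussian assumption) to score how consistently the upset winner beat the loser in each stat. The stat names and every number in `toy` are made up for illustration.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(winner_vals, loser_vals):
    """t-statistic of the winner-minus-loser differences across upsets."""
    diffs = [w - l for w, l in zip(winner_vals, loser_vals)]
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))


def rank_stats(table):
    """Order stat names from most to least significant by |t|."""
    scores = {name: paired_t(w, l) for name, (w, l) in table.items()}
    return sorted(scores, key=lambda name: abs(scores[name]), reverse=True)


# Hypothetical data: one value per upset, (winners, losers) for each stat.
toy = {
    "free_throw_pct": ([70.0, 68.0, 73.0, 69.0], [71.0, 66.0, 72.0, 70.5]),
    "turnovers_forced_per_game": ([15.1, 14.2, 16.0, 13.8], [12.0, 12.5, 12.9, 12.1]),
}
ranking = rank_stats(toy)
print(ranking)
```

In the toy data the winners force more turnovers in every single game, while free throw shooting goes back and forth, so turnovers forced comes out on top.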
Testing the Analysis
Let’s revisit the aforementioned (15) Lehigh over (2) Duke upset:
- Turnovers forced: Lehigh ranked 111th in the nation, Duke ranked 234th
- Average age of starting five (1 = true freshman, 5 = redshirt senior): Lehigh 3.2, Duke 2.6
- Winning percentage in last 10 games: Lehigh .900, Duke .800
- Offensive turnover percentage: Lehigh 10th in nation, Duke 31st
- Defensive efficiency: Lehigh 58th, Duke 160th (Wow!)
- Road winning percentage: Lehigh .667, Duke .889
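The comparison above can be tallied in a few lines of Python. The numbers come straight from the list. The one assumption I'm making is that for the national ranks a lower number is better, while for age and winning percentage a higher number is better.

```python
# (stat, Lehigh value, Duke value, higher_is_better)
matchup = [
    ("Turnovers forced (national rank)", 111, 234, False),
    ("Average age of starting five", 3.2, 2.6, True),
    ("Winning pct, last 10 games", 0.900, 0.800, True),
    ("Offensive turnover pct (national rank)", 10, 31, False),
    ("Defensive efficiency (national rank)", 58, 160, False),
    ("Road winning pct", 0.667, 0.889, True),
]


def edge(lehigh, duke, higher_is_better):
    """Return which team holds the advantage in a single stat."""
    if lehigh == duke:
        return "even"
    lehigh_better = (lehigh > duke) == higher_is_better
    return "Lehigh" if lehigh_better else "Duke"


edges = {name: edge(l, d, hib) for name, l, d, hib in matchup}
lehigh_edges = sum(1 for team in edges.values() if team == "Lehigh")
print(f"Lehigh holds the edge in {lehigh_edges} of {len(matchup)} stats")
```

Under that assumption, the 15-seed holds the edge everywhere except road winning percentage.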
Let’s look at the game itself:
- Lehigh committed just 8 turnovers while forcing Duke into 12 giveaways
- Lehigh junior guard C.J. McCollum (30 pts, 6 assts) outplayed Duke freshman guard Austin Rivers (19 pts, 1 asst)
- Duke shot nearly 46% from the field in 2011-12, but Lehigh’s defensive efficiency was superior. As a result, Duke was held to 41% in the upset.
As you can see, the game followed suit.
Dispelling the Myths: The 7 Stats That Don’t Appear To Matter
Let’s move on to the stats that may not matter quite as much as you think. Ever hear a college basketball analyst discuss how a team’s strength of schedule is in the 200s and it doesn’t even deserve to be in the tourney? Or how a team shoots a miserable 39% from the field and a dismal 29% from 3, neither of which is a recipe for success in March, when scoring is at a premium?
Check that analysis at the door. In the 30 upsets analyzed, the “David” in the “David vs. Goliath” matchup had a strength of schedule (SOS) ranked almost 140 spots below its opponent’s. The “Davids” were also, on average, measurably worse at knocking down shots.
Here are the least significant stats (in descending order):
- Strength of schedule
- Field goal percentage
- Three-point field goal percentage
- Rebounds per game
- Assist-to-turnover ratio
- Offensive efficiency (points scored per 100 possessions)
- Free throw percentage
Let’s examine the (14) Bucknell over (3) Kansas upset of 2005:
- Strength of schedule: Bucknell 172nd, Kansas 2nd
- Field goal percentage rank: Bucknell 138th, Kansas 25th
- Rebounds per game: Bucknell 298th, Kansas 73rd
- Assist-to-turnover ratio: Bucknell 222nd, Kansas 40th
- Offensive efficiency: Bucknell 211th, Kansas 36th
The game played out according to those stats. Kansas shot a better percentage and outrebounded the Bison. But if you refer to the stats that do matter, Kansas was 225th nationally in turnovers forced that season (Bucknell committed only 9 in the game) and hit only one of 11 three-point shots against Bucknell’s 16th-ranked defense. Those factors allowed the Patriot Leaguers to hang around and eventually bust everybody’s brackets with the 64-63 win.
Summing Up and Looking Ahead
Upsets happen every year. It’s why the NCAA Tournament is regarded by many as the best postseason in sports.
I will spend the next two weeks gathering data on this year’s teams, so we can help you with your bracket.
Stay tuned for Selection Sunday (March 17th) and the week leading up to the Round of 64, when I will again team with Andy the Statistical Maven to break down this year’s tournament teams. We will use a multivariable linear regression model to determine the most vulnerable ‘big namers’ and the most dangerous ‘little guys.’




