Raymond's highly scientific predictions for the 2008 NCAA men's basketball tournament

It's that time again: Raymond comes up with an absurd, arbitrary criterion for filling out his NCAA bracket.

This time, I studied all the games played in the NCAA men's basketball tournament since 1985 and computed how many of the games were won by the favorite and how many were upsets, broken down by the numerical difference between the seeding of the two teams.

Seed         Winner                Upset
Difference   Favorite   Underdog   Rate
1            121        105        46%
2            14         18         56%
3            110        76         41%
4            55         23         29%
5            113        45         28%
6            9          2          18%
7            100        42         30%
8            114        40         26%
9            81         19         19%
10           3          2          40%
11           86         15         15%
12           3          0          0%
13           84         4          5%
14           0          0          N/A
15           88         0          0%

I found it interesting that when the teams are seeded N and N+2, you get an upset more than half the time!

If the probability of the favorite winning is p and I choose the favorite with probability q, then the prediction would be correct pq + (1−p)(1−q) = (2p−1)q + (1−p) of the time. If you hold p constant, then this is maximized when q = 0 if p < ½, or when q = 1 if p > ½. (If p = ½, then it doesn't matter what you pick for q.)
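A minimal Python sketch to check this formula numerically. The function name `accuracy` is made up for illustration; the value of p is taken from the seed-difference-1 row of the table (121 favorite wins out of 226 games).

```python
def accuracy(p: float, q: float) -> float:
    """Chance a pick is right when the favorite wins with probability p
    and the favorite is picked with probability q."""
    return p * q + (1 - p) * (1 - q)


# Seed difference 1: favorites won 121 of 226 games.
p = 121 / 226
for q in (0.0, p, 1.0):
    print(f"q = {q:.2f}: correct {accuracy(p, q):.1%} of the time")
```

Because the expression is linear in q with slope 2p−1, the best value of q always sits at one of the two endpoints, which is what the loop lets you compare.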

Therefore, to maximize the number of correct predictions, I should always choose the favorite, unless the two teams are seeded N and N+2, in which case I always choose the underdog. But that makes for a boring bracket. Consequently, I went for the suboptimal algorithm of choosing q = p. Here is the result:
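The q = p rule could be sketched in Python roughly as follows. This is an illustration, not the procedure actually used: the names `RECORD`, `favorite_rate`, and `pick` are hypothetical, and the post does not say how games with no history (seed difference 14, or two teams with the same seed) were handled, so the coin flip below is purely an assumption.

```python
import random

# (favorite wins, underdog wins) by seed difference, copied from the table.
RECORD = {
    1: (121, 105), 2: (14, 18), 3: (110, 76), 4: (55, 23), 5: (113, 45),
    6: (9, 2), 7: (100, 42), 8: (114, 40), 9: (81, 19), 10: (3, 2),
    11: (86, 15), 12: (3, 0), 13: (84, 4), 15: (88, 0),
}


def favorite_rate(seed_a: int, seed_b: int) -> float:
    """Historical fraction of games won by the better seed (p)."""
    fav, dog = RECORD.get(abs(seed_a - seed_b), (0, 0))
    if fav + dog == 0:
        return 0.5  # assumption: coin flip when the table has no history
    return fav / (fav + dog)


def pick(seed_a: int, seed_b: int, rng: random.Random) -> int:
    """Return the seed picked to win, choosing the favorite with q = p."""
    favorite, underdog = sorted((seed_a, seed_b))
    p = favorite_rate(seed_a, seed_b)
    return favorite if rng.random() < p else underdog


rng = random.Random(2008)  # arbitrary seed so the picks are repeatable
print(pick(2, 15, rng))    # a seed-difference-13 game
```

With q = p, each pick is right p² + (1−p)² of the time, which is never better than always taking the more likely side; that is the price of a less boring bracket.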


Opening Round Game

Seed  Team               Pick
16    Mount St. Marys    Mount St. Marys
16    Coppin St.

East bracket

Seed  Team               Round 1 pick     Round 2 pick     Round 3 pick     Round 4 pick
1     North Carolina     North Carolina   North Carolina   North Carolina   Louisville
16    Mount St. Marys
8     Indiana            Arkansas
9     Arkansas
5     Notre Dame         Notre Dame       Washington St.
12    George Mason
4     Washington St.     Washington St.
13    Winthrop
6     Oklahoma           Oklahoma         Louisville       Louisville
11    St. Joes
3     Louisville         Louisville
14    Boise St.
7     Butler             South Alabama    Tennessee
10    South Alabama
2     Tennessee          Tennessee
15    American

Midwest bracket

Seed  Team               Round 1 pick     Round 2 pick     Round 3 pick     Round 4 pick
1     Kansas             Kansas           Kansas           Kansas           Gonzaga
16    Portland St.
8     UNLV               UNLV
9     Kent St.
5     Clemson            Villanova        Villanova
12    Villanova
4     Vanderbilt         Siena
13    Siena
6     USC                Kansas St.       Wisconsin        Gonzaga
11    Kansas St.
3     Wisconsin          Wisconsin
14    Cal St. Fullerton
7     Gonzaga            Gonzaga          Gonzaga
10    Davidson
2     Georgetown         UMBC
15    UMBC

South bracket

Seed  Team               Round 1 pick     Round 2 pick     Round 3 pick     Round 4 pick
1     Memphis            Memphis          Oregon           Michigan St.     Michigan St.
16    Texas Arlington
8     Mississippi St.    Oregon
9     Oregon
5     Michigan St.       Michigan St.     Michigan St.
12    Temple
4     Pittsburgh         Pittsburgh
13    Oral Roberts
6     Marquette          Marquette        Stanford         Stanford
11    Kentucky
3     Stanford           Stanford
14    Cornell
7     Miami Fla          Miami Fla        Texas
10    St. Marys
2     Texas              Texas
15    Austin Peay

South bracket

Seed  Team               Round 1 pick     Round 2 pick     Round 3 pick     Round 4 pick
1     UCLA               UCLA             UCLA             UCLA             Xavier
16    Miss Valley St.
8     BYU                Texas A&M
9     Texas A&M
5     Drake              Drake            Drake
12    W. Kentucky
4     Connecticut        San Diego
13    San Diego
6     Purdue             Purdue           Xavier           Xavier
11    Baylor
3     Xavier             Xavier
14    Georgia
7     West Virginia      West Virginia    Duke
10    Arizona
2     Duke               Duke
15    Belmont

Finals

Seed  Team               Semifinal pick   Champion pick
3     Louisville         Michigan St.     Michigan St.
5     Michigan St.
7     Gonzaga            Xavier
3     Xavier
Published Friday, March 21, 2008 7:00 AM by oldnewthing

Comments

# re: Raymond's highly scientific predictions for the 2008 NCAA men's basketball tournament

Friday, March 21, 2008 11:01 AM by Tom

Well, Raymond, as of today you are 17 for 17.  You even got the opening game right.

# Money, man

Friday, March 21, 2008 11:29 AM by Nathan_works

You got skin in the game or just pontificating? Surely your coworkers set up a $5 bracket or somesuch.

# re: Raymond's highly scientific predictions for the 2008 NCAA men's basketball tournament

Friday, March 21, 2008 12:30 PM by Tom

I'd say Raymond probably doesn't have any "skin" in the game. If you check out this post < http://blogs.msdn.com/oldnewthing/archive/2006/03/16/552822.aspx > you'll see he doesn't know squat about basketball and just does this for fun. The first year the teams were ranked based on whose president served the longest, and the next year it was based on the pay of the head coach.

I gave a copy of Raymond's brackets to an NCAA freak here at work and his jaw dropped when he saw Raymond had predicted the UMBC upset of Georgetown. I dunno, Raymond -- you might have found a good method here!

# re: Raymond's highly scientific predictions for the 2008 NCAA men's basketball tournament

Friday, March 21, 2008 1:16 PM by Josh

WHOO! Go Retrievers!

I graduated from UMBC last year, and I can't remember any sort of basketball team on campus. I have no idea how they managed to make it to the NCAA tournament, much less beat the #2 team.

# re: Raymond's highly scientific predictions for the 2008 NCAA men's basketball tournament

Friday, March 21, 2008 3:08 PM by SM

Shouldn't they play the game before a winner is declared?  It's starting on 3/21 at 3:10...

# re: Raymond's highly scientific predictions for the 2008 NCAA men's basketball tournament

Friday, March 21, 2008 3:24 PM by Neil

OK, I'll bite: why are so few matches played between teams with an even seed difference?

# re: Raymond's highly scientific predictions for the 2008 NCAA men's basketball tournament

Friday, March 21, 2008 3:40 PM by Euro

You ought to read Isaac Asimov's 'The Machine That Won The War' for the most effective statistical algorithm ever created to make the right choices in situations like this.

# re: Raymond's highly scientific predictions for the 2008 NCAA men's basketball tournament

Friday, March 21, 2008 4:10 PM by Craig

>OK, I'll bite: why are so few matches played between teams with an even seed difference?

All the first round games are played with an odd seed difference (1 plays 16, etc).    More than half the games in any bracket are first round games, so assuming a 50-50 distribution of even and odd seed differences after the first round means that 75% of the games are odd.

# re: Raymond's highly scientific predictions for the 2008 NCAA men's basketball tournament

Friday, March 21, 2008 4:21 PM by Daniel

Raymond, K St. over USC, impressive.

> OK, I'll bite: why are so few matches played between teams with an even seed difference?

Because of the way the tournament is structured, the best team plays the worst team. If all the favorites win (they often do), you will always have an odd number for the seed difference (only an upset produces an even seed difference in the next round). The only exception is when the 8th seed plays the 9th seed in the first round, which is a virtual coin toss. Therefore, it's 50-50 whether a first seed meets an 8th or a 9th seed in the second round of regional action (look at rows 7 and 8: nearly identical).
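The parity argument in the two replies above can be checked with a short Python sketch. It assumes the first-round pairing order shown in the region brackets of the post; the names `REGION` and `chalk_differences` are made up for illustration.

```python
# First-round pairings in the order the region brackets list them.
REGION = [(1, 16), (8, 9), (5, 12), (4, 13), (6, 11), (3, 14), (7, 10), (2, 15)]


def chalk_differences(games):
    """Seed differences round by round if the better seed always wins."""
    rounds = []
    while True:
        rounds.append([abs(a - b) for a, b in games])
        if len(games) == 1:
            return rounds
        # Winners of adjacent games meet in the next round.
        winners = [min(a, b) for a, b in games]
        games = list(zip(winners[0::2], winners[1::2]))


for number, diffs in enumerate(chalk_differences(REGION), start=1):
    print(f"round {number}: {diffs}")
```

Every difference it prints is odd: the two seeds in a first-round game always sum to 17, and with no upsets the seeds in later rounds sum to 9, 5, and 3. An even difference can only appear after at least one upset.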

# re: Raymond's highly scientific predictions for the 2008 NCAA men's basketball tournament

Friday, March 21, 2008 4:30 PM by Julia

So far you've only missed two... looks like you have a pretty good method.

Go Michigan St.!

# re: Raymond's highly scientific predictions for the 2008 NCAA men's basketball tournament

Friday, March 21, 2008 5:17 PM by Brandon Turner

I like the looks of your bracket, Michigan State out front like they should be!

Go Green!

# re: Raymond's highly scientific predictions for the 2008 NCAA men's basketball tournament

Friday, March 21, 2008 7:05 PM by john

I have nothing to add other than the fact that you have two South brackets and no West bracket.  I nitpick at thee!

# March Madness Craziness: Part One « Esoteric Dissertations from a One-Track Mind

# re: Raymond's highly scientific predictions for the 2008 NCAA men's basketball tournament

Wednesday, March 26, 2008 10:07 AM by patrick

You did extremely well with your first round picks. Nice job.  I'm trying to understand your methodology. You wrote that you "went for the suboptimal algorithm of choosing q = p."  Can you explain that more? I don't understand but am fascinated by what you came up with.

# re: Raymond's highly scientific predictions for the 2008 NCAA men's basketball tournament

Thursday, March 27, 2008 3:21 PM by ScottB

@patrick -- it means that instead of always picking the favorite, he'll pick the favorite most of the time: for seed difference 1 (upset rate 46%), he's 46% likely to choose the upset (probably by RNG).

# re: Raymond's highly scientific predictions for the 2008 NCAA men's basketball tournament

Friday, March 28, 2008 8:22 AM by patrick

@ScottB & Raymond.  I don't understand... Take, for example, the choices when #4 vs. #13.  Twice the #4 seed is picked to win, twice the #13 seed is picked, but the upset rate for this differential is not 50%.  And the two times the #13 was picked (San Diego over UCONN, Siena over Vanderbilt), Raymond was correct. Why were these particular #13's chosen and not the other two #13s?  Just trying to figure out whether this was random chance.

[You found me out. It was not random chance. I used my time machine. -Raymond]