The New “BI” (Baseball Intelligence)
I spent last week at an off-site meeting with several other Microsoft people. With the unbelievable playoff action that was going on, the lunchtime topic inevitably turned to baseball. Given the international flavor of the table, it was an interesting set of perspectives on the sport. I am a huge baseball fan and I always have fun trying to get my wife to appreciate what is going on. But while I hold out hope for her (actually, I don’t and that’s OK; I think our marriage will survive :->), I don’t think I will ever get through to the international contingent. They just hate the start/stop/start cadence of American sports (baseball and American football, in particular). One of them pointed out that he always hears that newcomers can't see all the different pieces of the game, but he disagreed. He understood the count changes, the batter changes, and so on, and that the game was in a constant state of flux. It didn’t impress him. I started to explain that noticing the moving parts isn’t nearly as important as understanding and appreciating them. But I gave up mid-argument. I’ve tried so many times to describe it and I have never managed to. Well, now I have a blog, so let me try here…
There’s a ton of beauty in baseball. There is indeed flux and it occurs in each individual count, each inning, each game, each week, and each season. Every season is a 26-week soap opera, complete with twists and turns. Watching a baseball game without some context of the situation or what is at stake is like non-alcoholic beer or decaffeinated coffee: it may be satisfying to some, but it misses the point for me. Sometimes, it’s something emotional (like the relationship between the city of Boston and the so-called “Curse of the Bambino”) and sometimes it’s flat-out gutty (did anyone else see Curt Schilling’s bloody ankle on Sunday?). But here’s why I think it’s sad that scientists of any nationality are missing out: it’s a beautiful chaos that can be quantified and analyzed in the world of statistics and probabilities (sounds like Physics, Chemistry, and even Biology). What dominates the game more than anything else are the numbers. The downtime between pitches is literally the time it takes to run through all the data and re-assess the situation. A true fan often turns into a one-man “BI” system (time to pull out your SQL Server Analysis Services manuals). Each pitch changes the queried parameters on a multi-dimensional expression. There’s an entire group called SABR (Society for American Baseball Research) that lives to analyze these statistics and get to the heart of the game. These “sabermetricians” take a macro approach to what an astute baseball fan does on a micro-level: spot the trends and understand the behaviors that drive outcomes. Why, you might ask? Well, the game changes in a matter of pitches. Let’s run through a realistic example of when the Orioles play the Red Sox…
When David Ortiz bats against Sidney Ponson with runners on base, Ortiz would likely have the advantage. Ortiz hits better with runners in scoring position (35% vs. 28% with no one on base) and hits right-handers far better than lefties (33% vs. 25%). Therefore, the manager brings in left-hander BJ Ryan. The parameters change: not only have you taken Ortiz out of his strength, you have put BJ in his strength of facing a lefty (90% success vs. only 75% against righties). But BJ throws ball 1 and ball 2, and suddenly the advantage turns back to Ortiz because he knows Ryan doesn’t want to throw another ball and Ortiz likes to take a huge swing, which he can do without risk. The statistics capture this: Ortiz is successful over 37% of the time in these situations, while Ryan only gets 66% of his hitters out once he reaches this count. Ryan throws the next two pitches over the plate and Ortiz fouls both back. Ryan has the advantage again because he excels at two-strike counts and Ortiz has to be defensive to avoid the strikeout. We’re back to a 16% success rate for Ortiz and 83% for Ryan. Ryan throws his wicked slider, Ortiz misses and the at-bat is over. Now that was five pitches and a couple of minutes to close out the at-bat. According to the new fan, we waited all that time for a strikeout. But to me, the constant re-assessment of the situation and evolution of the showdown keeps me engaged. Situation re-evaluation is what gets the anxiety level rising and falling even though “nothing is happening”. The new baseball fan will never understand this because there’s just too much to go through. In this example, we only talked about the count and the base runners. We didn’t talk about what inning the game was in or what the score was, which also influence the outcome. There’s too much to learn about where the levers are before you can make the re-assessment. And there’s always the psychological element of the broader picture.
Say this happens in September: Ryan is trying to gain a greater role in the Orioles bullpen, the Orioles are making a last dash at finishing with a winning record for the first time in seven years, Ortiz is one of the top contenders for the MVP award, and the Red Sox need this win to make the playoffs. Suddenly the highs and lows are magnified.
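For the BI-minded, here is a rough Python sketch of the "re-query after every pitch" idea from the Ortiz/Ryan at-bat. It is only an illustration: the table and function names are made up for this post, the only numbers in it are the ones quoted in the example, and counts with no quoted figure are left blank rather than guessed.

```python
# Rates quoted in the example, keyed by (balls, strikes).
# (0, 0) holds Ryan's overall rate against lefties before the first pitch.
# Counts the example gives no figure for are deliberately left out.
RYAN_OUT_RATE = {(0, 0): 0.90, (2, 0): 0.66, (2, 2): 0.83}
ORTIZ_SUCCESS_RATE = {(2, 0): 0.37, (2, 2): 0.16}

# The five pitches of the at-bat, in order.
AT_BAT = ["ball", "ball", "foul", "foul", "strike"]


def replay(pitches):
    """Return the (balls, strikes) count after each pitch."""
    balls = strikes = 0
    counts = []
    for pitch in pitches:
        if pitch == "ball":
            balls += 1
        elif pitch == "strike":
            strikes += 1
        elif pitch == "foul" and strikes < 2:
            # A foul counts as a strike unless the batter already has two.
            strikes += 1
        counts.append((balls, strikes))
    return counts


def requery(count):
    """Look up both players' quoted rates for a count (None if not quoted)."""
    return ORTIZ_SUCCESS_RATE.get(count), RYAN_OUT_RATE.get(count)


if __name__ == "__main__":
    for count in replay(AT_BAT):
        ortiz, ryan = requery(count)
        print(count, "Ortiz:", ortiz, "Ryan:", ryan)
```

Each pass through the loop is one "nothing is happening" moment: the count changes, the lookup key changes, and so does the answer.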
This is what I see when I watch a baseball game. So when someone asks me to teach them to love baseball, I know I am doomed to failure. They used to have a saying about Oriole legend Cal Ripken: “Watch him for one day and you’ll be bored. Watch him for a year and you’ll be amazed.” That describes how I see the sport. So, for those of you who are taken to a game and bored by what you see, just know that you probably aren't running the right "analytics"…
{Green Day – American Idiot}