The InteriorPointSolver class is capable of solving linear, quadratic, and (in our upcoming 2.0 release) second order conic optimization problems. Let's take as an example the production planning problem I wrote about in a previous post. It is a simple linear programming problem with two decisions (the production for each refinery) and three constraints (the demand for each refined product). In our SFS sample we simply declared the decisions, goals, and constraints and expressed them algebraically using the SolverContext and Model. The InteriorPointSolver class represents linear models in a slightly different way: using variables and rows. Variables are the analogue of SFS Decisions, whereas rows can represent either goals or constraints. Both variables and rows can have upper and/or lower bounds. In a linear model, both goals and constraints consist of linear combinations of variables, such as "3*x + 4*y + 5*z". Whereas in the SFS we can express these terms very naturally, with the InteriorPointSolver, the terms are expressed one at a time by specifying the coefficient associated with each variable in the row. Here is the C# code:
private static void PetrochemLinearModel() { InteriorPointSolver solver = new InteriorPointSolver(); int sa, vz; solver.AddVariable("SA", out sa); solver.SetBounds(sa, 0, 9000); solver.AddVariable("VZ", out vz); solver.SetBounds(vz, 0, 6000); int goal; solver.AddRow("goal", out goal); solver.AddGoal(goal, 0, true); solver.SetCoefficient(goal, sa, 20); solver.SetCoefficient(goal, vz, 15); int demand1, demand2, demand3; solver.AddRow("demand1", out demand1); solver.SetCoefficient(demand1, sa, 0.3); solver.SetCoefficient(demand1, vz, 0.4); solver.SetLowerBound(demand1, 1126); solver.AddRow("demand2", out demand2); solver.SetCoefficient(demand2, sa, 0.4); solver.SetCoefficient(demand2, vz, 0.2); solver.SetLowerBound(demand2, 1500); solver.AddRow("demand3", out demand3); solver.SetCoefficient(demand3, sa, 0.2); solver.SetCoefficient(demand3, vz, 0.3); solver.SetLowerBound(demand3, 500); var param = new InteriorPointSolverParams(); var solution = solver.Solve(param); Console.WriteLine("goal = " + solution.GetValue(goal).ToDouble()); Console.WriteLine("sa = " + solution.GetValue(sa).ToDouble()); Console.WriteLine("vz = " + solution.GetValue(vz).ToDouble()); }
You'll also need to add "using Microsoft.SolverFoundation.Common;" and "using Microsoft.SolverFoundation.Solvers;" to the top of your program. Extending this code to solve a quadratic program is easy: there is an overload of SetCoefficient that takes two variables as input. In my next post I will talk about second order conic programming, and a few other improvements in the 2.0 version of the InteriorPointSolver.
Lengning Liu has written a great post that highlights another exciting Solver Foundation 2.0 area: a Solver Foundation ODSL in F#. (F# is a functional programming language for .Net.) The ODSL provides an intuitive, cool way to express LP and MIP problems. My favorite addition to the ODSL is units of measure. Parameters and decisions can be annotated with their units of measure, and the associated consistency checks are enforced by F# at compile time. This is a great way to provide clarity to models as well as enforce correctness. Here's an example from Lengning's post:
let a = 20.0<Dollar/Barrel>
let b = 15.0<Dollar/Barrel>
let sa = var<Barrel/Day>()
let vz = var<_>()
minimise (a * sa + b * vz)
where
[
0.3 * sa + 0.4 * vz >= 2000.<_>;
0.4 * sa + 0.2 * vz >= 1500.<_>;
0.2 * sa + 0.3 * vz >= 500.<_>;
sa <= 9000.<_>;
vz <= 6000.<_>;
sa >= 0.<_>;
vz >= 0.<_>
]
I left a small cliffhanger in my last post. After a long week I finally had a chance to read through the Adams paper about estimating the value of "going for it" on 4th down. I admit I was a little bit let down. As a reminder - the question is what action a football team should take on fourth down. Failure to gain the necessary yards means the ball is turned over to the opposing side, kicking turns over the ball but with better field position, and making the first down allows the drive to continue, potentially leading to more points. The conclusion of the Romer paper was that coaches are too conservative and kick the ball away in situations where they should go for it instead.
Adams hits the nail on the head by asserting that the results of the Romer paper just do not pass the "smell test". It's nuts to suggest that it's a good idea to go for it on 4th and 4 on your own 25 yard line. But that leaves us only with more questions - is the conclusion of the Romer paper still valid, even if overstated? Can we identify a flaw in the reasoning? Is there a better way to model the problem?
Adams first suggestion for improving the model is to include more historical data. Adams and Romer both claim it's hard to come up with a good model for the "going for it" problem because teams seldom go for it on fourth down in practice - data is hard to come by. Romer and Adams both use game data from the 1998 - 2000 seasons, but Adams uses data from the entire game, not just the first quarter. But why not include more recent data? [The Adams paper was written in '06, so he could have doubled the data set. We have a couple more seasons-worth of data now.] So I'm not sure I even buy the premise that data is lacking.
Adams' second approach is to use Madden '07 to simulate 4th down situations. I initially thought this was a really cool idea, and it kind of is, and then I remembered something I once read. Madden himself asked the designers at EA to make 4th downs more difficult to convert! You cannot find a better example of Galbraith's notion of "conventional wisdom" in action. So as far as I am concerned, you have to throw out the middle section of the paper. Madden is not a simulation: it is pretending to be a simulation. It wants to make you feel like you are experiencing real NFL football. But the problem is that we as players do not make decisions the way that GMs, coaches, and players do. Our motivations are completely different, and there are no real consequences for our actions (other than bragging rights over your roommate). My GM will not fire me if I go for it on 4th and 5 on my own 25. Thus the game must be tuned to correct for this, otherwise you will get Tecmo-like gameplay.
The last section proposes a game-theoretic approach. Adams introduces a zero-sum game with the offense and defense as opponents. The offense and defense both have the choice of choosing a pass- or run-oriented strategy. The payoffs depend on their choices. Adams points out that this is a "simplified version of reality." (It's very close to the original Tecmo Bowl - two choices instead of four.) He uses this approach primarily to make the point that it is not a good idea (as Romer proposes) to use third down data to model fourth down choices, because the payoffs change enough to matter. It is an interesting line of argument for the claim that Romer's conclusions are overstated, but it does not provide insight into how to better model the problem.
Anyway, in the course of poking around the web I came across the ZEUS Football simulation engine. It is frequently referenced in the NYTimes "5th down" blog. For example, here is an interesting discussion about taking an intentional safety late in the game. (I won't bother to explain what that means, because if you have made it this far, you clearly already know what I am talking about.)
All the questions I raised at the beginning of this post are probably best answered by a simulation engine. Which reminds me - did I mention that Solver Foundation is adding stochastic capabilities for our version 2?
Last weekend marked the first big college football Saturday of the year. The only game I really cared about (now that I am married and have kids) was the Northern Iowa – Iowa game: I went to Iowa, and I grew up in Cedar Falls (home of the UNI campus). Iowa came from behind and won 17-16 after blocking two field goal attempts in the final seconds. My brother and I were talking on the phone during the 4th quarter. My brother, a UNI fan, was bothered by the conservative coaching that he feels let Iowa back into the game, and that spun into a more general conversation about risk-averse coaching. I don't know anything about sabermetrics, but I did read Moneyball, and I love sports.
Our conversation reminded me of a paper by economist David Romer that I had always intended on reading: “It’s Fourth Down and What Does the Bellman Equation Say?” (I actually recommend this updated 2005 version.) So I read it. It received some attention from the sports world when it came out because Romer’s claim is that the conventional wisdom is wrong: it’s often a much better idea to try to convert on 4th down rather than kick the ball away to the other team. It depends on field position and yards-to-go, of course. He sets up a not-too-complicated dynamic programming model where he is able to place values on particular game situations, and then compares the difference between kicking and “going for it”. It’s interesting but I have some problems with it – in particular the use of 3rd down outcomes rather than 4th down outcomes. The justification is that because teams don’t go for it on 4th down very often there is not enough data, so 3rd down data is a reasonable substitute. I have issues with this because for one thing, playcalling is very different on 3rd down, especially when one is approaching field goal range. Players are also taught to handle 3rd down differently - throw the ball away and avoid taking a sack. Romer does address these sorts of issues, but it still bothers me.
To overcome the lack of 4th down “going for it” data, Christopher Adams from the FTC (!) uses Madden NFL 07 (!!) for simulation purposes and constructs a game theoretic model for 4th down attempts in this paper. He comes to a different conclusion: the conventional wisdom may not be so bad after all. I am looking forward to reading the Adams paper in detail - it's in my backpack. I hope to do some experimentation in this area once I get a grip on the concepts. I would like to write a paper about the prevent defense that Gregg Easterbrook and Bill Simmons despise so much! But right now, I’ve got to get back to work ;)
I'm back from ISMP and helping to put the finishing touches on Solver Foundation 2.0. In our first version we introduced: our declarative modeling language OML; Solver Foundation Services, which provides a powerful, consistent .Net API for modeling and solving a wide range of decision problems; a set of solvers for linear, quadratic, mixed integer, constraint, and unconstrained nonlinear programming; third-party solver plug-in support; and an Excel add-in that provides easy access to all of Solver Foundation's functionality through Office 2007. Our aim in version 2.0 is to make it easier for non-experts to model and solve problems, and to expand the range of problems that can be solved using Solver Foundation. I will spend the next few posts talking about our plans for 2.0.
Many real-world problems involve uncertainty. For example, task durations, sales estimates, rate of return, and so on. Simulation and/or stochastic programming techniques are often used to account for randomness, but applying such techniques can often be quite tricky. In Solver Foundation 2.0 we are adding OML, Solver Foundation Services and solver support for linear stochastic models. The two new modeling concepts are:
As you will see in a moment, if you know how to build regular models using OML, building models with random parameters and recourse decisions will be no problem. Solver Foundation Services will contain new APIs that allow for stochastic modeling using random parameters and recourse decisions. Indexed random parameters and data binding will be fully supported. We will support the following distributions in V2:
Here is a simple OML example. It is a simple production planning problem with scalar input (here is a non-stochastic sample in C#). We introduce recourse decisions that represent the amount of pre-refined product to buy, in case demand cannot be met by production. The demand for each product is random; the demand for gas is modeled using two scenarios of equal probability, the others are uniformly distributed. Not very realistic, but I wanted to cram all the new concepts in a short model! Indexed random parameters are of course possible. In that case you will be able to use data binding to define the parameters associated with the distributions - keeping a clean separation between data and model, and allowing for easy experimentation. Model[ Decisions[Reals[0, Infinity], SaudiArabia, Venezuela], Decisions[Reals[0, Infinity], // To define a recourse decision, just wrap Recourse[] around it. Recourse[GasBuy], Recourse[JetFuelBuy], Recourse[LubricantBuy] ], Parameters[ // Here is a scenario-based random parameter with two scenarios of equal probability. Scenarios[Reals[0, Infinity]], GasDemand = {{0.5, 1900}, {0.5, 2100}} ], // Here is a continuous random parameter. Parameters[UniformDistribution[1500, 1600], JetFuelDemand], Parameters[UniformDistribution[400, 500], LubricantDemand], Goals[ Minimize[ goal -> 20 * SaudiArabia+ 15 * Venezuela + (38.4 * GasBuy + 35.2 * JetFuelBuy + 28.8 * LubricantBuy) ] ], Constraints[ 0.3 * SaudiArabia + 0.4 * Venezuela + GasBuy >= GasDemand, 0.4 * SaudiArabia + 0.2 * Venezuela + JetFuelBuy >= JetFuelDemand, 0.2 * SaudiArabia + 0.3 * Venezuela+ LubricantBuy >= LubricantDemand, SaudiArabia <= 9000, Venezuela <= 6000 ]] The modeling constructs are simple, but underneath the hood there is a lot going on:
Parameters[ // Here is a scenario-based random parameter with two scenarios of equal probability. Scenarios[Reals[0, Infinity]], GasDemand = {{0.5, 1900}, {0.5, 2100}} ], // Here is a continuous random parameter. Parameters[UniformDistribution[1500, 1600], JetFuelDemand], Parameters[UniformDistribution[400, 500], LubricantDemand], Goals[ Minimize[ goal -> 20 * SaudiArabia+ 15 * Venezuela + (38.4 * GasBuy + 35.2 * JetFuelBuy + 28.8 * LubricantBuy) ] ], Constraints[ 0.3 * SaudiArabia + 0.4 * Venezuela + GasBuy >= GasDemand, 0.4 * SaudiArabia + 0.2 * Venezuela + JetFuelBuy >= JetFuelDemand, 0.2 * SaudiArabia + 0.3 * Venezuela+ LubricantBuy >= LubricantDemand, SaudiArabia <= 9000, Venezuela <= 6000 ]]