Several days ago, Microsoft Solver Foundation 3.0 was released, providing new features to build and solve nonlinear models. These features are at the solver, Solver Foundation Services, and modeling levels. Solver Foundation’s Compact Quasi Newton solver handles unconstrained nonlinear models – it has been supported since version 1.0. Before our 3.0 release, the Compact Quasi Newton solver (CQN) had its own API for modeling and solving unconstrained, unbounded, non-linear models, and there was no access to this solver from Solver Foundation Services. In version 3.0, CQN solver uses the same interfaces that any other third party non-linear solver can use for modeling and solving a non-linear model: the INonlinearModel and INonlinearSolver interfaces. In this post we will show you how to model and solve nonlinear models using the new CQN solver API. In some later post we will show how to use Solver Foundation Services to model your problem, without needing to specify derivatives.
In this walkthrough we consider the Rosenbrock optimization problem. The original problem is a simple function with two dimensions (variables) that is often used as a test problem in optimization. The reason we use this function is that it can very easily be expanded to n-dimensions. We will use a multi-dimensional variant when each variable is coupled just with one other variable.
Here are the logical steps you need to take in order to model and solve problem with the CQN solver. These steps apply to any other nonlinear solver (with a few small variations).
1. Create the modeling object.
2. Define your model. This includes the variables, the function to optimize (the goal) and the gradient vector. Note that only unconstrained, unbounded models can be solved with CQN, no constraints are specified.
3. Solve the model and optionally report the solution.
Let’s see how those steps are carried out with the new solver API:
1. Create the solver and parameters. Solver parameters can be used to configure solver behavior.
var solverParams = new CompactQuasiNewtonSolverParams();
var solver = new CompactQuasiNewtonSolver();
2. Next, create the variables. For this sample we are solving the 100 dimensional model, so we create 100 variables. Note that we use null for the variable keys. Keys can be useful if you want to refer to a variable by its key instead of its index. The tradeoff is a bigger memory footprint (to store the keys).After that, we add a single row for the objective function and register it as a goal. Finally, we define callbacks for function and gradient evaluation.
int dimentions = 100;
int vidRow;
int[] vidVariables = new int[dimentions];
// add variables, using null as the key.
for (int i = 0; i < dimentions; i++)
solver.AddVariable(null, out vidVariables[i]);
//add a row and set it as the minimization goal
solver.AddRow(null, out vidRow);
solver.AddGoal(vidRow, 0, true);
// set the function and gradient evaluators.
solver.FunctionEvaluator = RosenbrockFunction;
solver.GradientEvaluator = RosenbrockGradient;
3. Solve the model and print out the result, using the solver’s ToString() override.
//Solve the model
solver.Solve(solverParams);
Console.WriteLine("== Multidimensional variant of Rosenbrock ==");
Console.WriteLine(solver.ToString());
That’s it. Well, not quite, as the interesting part is the definition of the function and gradient callbacks. Let’s see the definition for them:
/// <summary>
/// Function value callback for first variant of Rosenbrock's function.
/// This is the multidimensional variant of Ronsenbrock when n must be pair
/// f(x) = sum(i, from 1 to N){ [alpha(x(2i) - x(2i-1)^2)^2] + [(1 - x(2i-1))^2] }
/// <param name="model">the model.</param>
/// <param name="rowVid">the row index.</param>
/// <param name="values">the variable values.</param>
/// <param name="newValues">is first evaluator call with those variable values.</param>
/// <returns>the row value</returns>
private static double RosenbrockFunction(INonlinearModel model, int rowVid, ValuesByIndex values, bool newValues) {
const int firstVid = 1;
const int alpha = 100;
double value = 0;
int dimentions = model.VariableCount;
for (int i = firstVid; i <= dimentions / 2; i++) {
value += alpha * (Math.Pow((values[2 * i] - values[2 * i - 1] * values[2 * i - 1]), 2)) + (Math.Pow(1 - values[2 * i - 1], 2));
}
return value;
/// Gradient value callback for the first variant of Rosenbrock's function.
/// </summary>
/// <param name="gradient">the gradient values (set by the user).</param>
private static void RosenbrockGradient(INonlinearModel model, int rowVid, ValuesByIndex values, bool newValues, ValuesByIndex gradient) {
gradient[2 * i - 1] = (4 * alpha * values[2 * i - 1] * (values[2 * i - 1] * values[2 * i - 1] - values[2 * i]) + 2 * values[2 * i - 1] - 2);
gradient[2 * i] = (2 * alpha * -1 * (values[2 * i - 1] * values[2 * i - 1] - values[2 * i]));
There are few types and concepts worth mentioning.
- ValuesByIndex – This type allows you to map from a variable index to underlying solver value. The values argument represent the current values for the variables when the callback is called, so the function and gradient callbacks can get the values in order to compute the function value and the gradient values. If for example there is a variable with index x, values[x] is the current value of x. The gradient argument represents the gradient value associated with each variable, so if for example there is a variable with index x, gradient[x] represent the value of df/dx. Gradient evaluator uses gradient argument to set the right value for the gradient.
- Index allocation convention – There is a convention for how the CQN solver allocates variable and row indexes. By convention variables get indexes from 1 ... VariableCount, in the order they were added. The only row (the goal) will always get the index 0. You can rely on this convention in your callbacks to access variable and row indexes without having to do any bookkeeping. In fact, notice that we did not use the vidVariables array after creating the variables, and we could have even omitted it altogether.
- bool newValues – This Boolean flag indicates if this is first evaluator call with those variable values. This can be used to make the computation of gradients in your evaluator more efficient, as long as you have some computation on the function callback, which can be carried over to be used when computing the gradient. For each point, CQN solver always calls the function evaluator first followed by gradient evaluator. Therefore this Boolean will always be true when function evaluator is called, and false when gradient evaluator is called. In this sample code we did not take advantage of this flag.
With that in mind, the rest is straightforward. The function evaluator gets back the value of the function in the current point, and the gradient evaluator sets the values for the gradient.
Running the code, the output is :
== Multidimensional variant of Rosenbrock ==
Minimize problem
Dimensions = 100
Variable indexes:
1, 2, 3, 4, 5, 6, 7, 8, 9, 10...
Starting point = 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
Solution quality is: LocalOptima
Number of iterations performed: 22
Number of evaluation calls: 28
Finishing point =1.00000000000559, 1.00000000001197, 1.00000000000559, 1.00000000001197, 1.00000000000559, 1.00000000001197, 1.00000000000559, 1.00000000001197, 1.00000000000559, 1.00000000001197...
Finishing value =4.72745153676104E-21
All of this information can also be programmatically accessed from the solver through methods or properties.
The code for this sample is a modification of code provide as a sample project in your document directory under “…\Documents\Microsoft Solver Foundation\Samples\Solvers\C#\CQN”.
In the next post, we will consider the following question: how would you model Rosenbrock function optimization using the Solver Foundation Services? (Hint: easily, naturally and with no gradient definition!)
-Solver Foundation Team