Jumping Into C++: Calculating Unknowns

Eric Battalio

February 20th, 20130 0

Last time, I wrote a C++ program to count words in text files. This time, write some code to calculate the motion of an object as an excuse to create a class, use function pointers and mess with some new containers. Thanks to my colleagues Ale Contenti and Andy Rich for providing the inspiration and original Simple Physics Calculator application, James McNellis for reviewing drafts and providing code snippets, and Stephan Lavavej for additional feedback.

Requirements

This time, I need to develop a class that provides information about the motion of an object. Requirements include:

There are five equation variables: distance, time, acceleration, initial velocity, and final velocity.
Use the Kinematic Equations to calculate for two unknown equation variables when the other three variables are known.
Provide a way to clear or reset the variables back to an unset state.
Avoid calculations with division by zero, negative square roots, or when there are not exactly two unknown and three known values.

For example, if the class is given distance (250 meters), time (7.3 seconds) and acceleration (3.2 meters/second squared), if should calculate the unknown values initial velocity (22.5666 m/s) and final velocity (45.9266 m/s). Resetting the class makes it ready to accept a new set of known values. If there is a problem calculating the unknowns, the operation should fail with an error message. For this exercise the code is used in a console application, but it should be general enough to be usable to drive a UI like this one:

As a bonus requirement, the code should be portable. For those who cannot wait, grab the codeand dig in. [Updated 2/25/2013: Code is also on GitHub at https://github.com/ebattalio/calculate-unknowns]

First Version: The General Problem

My first attempt addressed the general problem of solving for two unknowns with three known values. The final solution will still need to use the Kinematic Equations, but the general solution is simpler for me to work with (I don’t need to worry about the subtleties of physics and the equations) and can be used as the starting point for similar applications in other domains or with more variables. The math is not difficult, but I’m here for the C++. The real equations will be mixed in later. (Thanks XKCD!) Before cracking open Visual Studio, I wondered what code would handle the combination of known and unknown values. The first thing that popped into my head was “if statements”; use a bunch of cascading if statements to cull known values and calculate unknown values. Each if statement would check for a particular combination of unknown values, one for each combination. How many combinations are there?

I doodled a chart and found there are only 10 combinations of unknowns that matter. The combinations on the diagonal (circled) use the same variable twice so can be ignored. The combinations in the upper-right triangle duplicate those in the lower-left, so they too can be ignored. What remains is the lower-left triangle, ten different combinations of unknowns. Not too many for if statements, but I preferred using a combination of the two unknowns as a key value (for example, “ae” or “bd”) along with a switch statement to execute the right computations. Switch statements are still cool, right? They are easy to follow, efficient and can be extended with little effort. With that solved, I thought about how to store equation values. An equation value is a number but may empty (unset and unknown), so a single variable would not work. I needed two, one to track the value and one to track the presence of a value. In my first version, I went newbie and used simple member variables (one pair for each variable) and custom helper functions (get, set, and has value). This implementation is not ideal, but it is simple, easy to test, and easy to evolve. I’ve cribbed a couple of parts from the FormulaV1.h. The first example shows how I declared the variables to track equation values and the second shows how the key code is calculated. I’ve also copied one of the get functions.

double _a, _b, _c, _d, _e;

bool _a_hasval, _b_hasval, _c_hasval, _d_hasval, _e_hasval;

unsigned unknownsKey() const

{

return !hasA() + (!hasB() << 1) + (!hasC() << 2) + (!hasD() << 3) + (!hasE() << 4);

}

double getA() const

{

if(hasA()) return _a; throw FormulaV1Exception(“Missing A”);

}

The code throws an exception to tell the caller “hey, there is no value for this variable” when there is no value. Is this better than rewriting the function to pass the value as an output parameter and return true if a value exists? In both cases, the onus is still on the caller to track the state of the variables. If the caller fails, the code will fail and the result will be the same. Herb Sutter captures the essence in When and How to Use Exceptions:

Distinguish between errors and nonerrors. A failure is an error if and only if it violates a function’s ability to meet its callees’ preconditions, to establish its own postconditions, or to reestablish an invariant it shares responsibility for maintaining. Everything else is not an error.

I’ll treat this as a best practice.

Second Version: Using Ordered Map and Function Pointers

The first version used a switch statement to solve for the two unknowns when the calculation is invoked. The default case handles unrecognized combinations of unknowns, something that short of a bad coding should never happen. Thinking about this, all the switch statement really does is map a key (the set of unknowns) to a value (a code block or function). In the second version of the code, I replaced the switch with a map storing an integer key and function pointer value:

typedef std::map<unsigned int, std::function<void()>> FormulasV2;

FormulasV2 _formulas;

It is initialized in the constructor using a static function that returns a populated formulas object, a “neat trick” that I will use again in the future. In some cases, it might also be faster. For this to work, the initializing function needs to bind to the right this pointer. In the code, it is passed from the caller:

FormulaV2() : _formulas(create_function_map(this))

{

}

static FormulasV2 create_function_map(FormulaV2* p)

{

FormulasV2 formulas;

formulas[3] = std::bind(&FormulaV2::compute_ab, p);

formulas[5] = std::bind(&FormulaV2::compute_ac, p);

// ..

formulas[24] = std::bind(&FormulaV2::compute_de, p);

return formulas;

}

The original version used an unordered_map, but after speaking with Stephan the best practice is to use ordered maps by default, reaching for unordered maps only when their very special characteristics are needed. On Stack Overflow, GManNickG mostly agrees but makes a case for unordered_map when pure lookup speed is required. I didn’t use lambdas in this version, but I could have. It would have made the code easier to extend because all of the code to handle the mapping and compute would be in a single location, not scattered about as discrete class methods. It also would have removed the need to use std::bind. I probably should have used lambdas, but by avoiding them the code was easier to debug and maintain. For those looking for lambda guidance, consider Lambdas, Lambdas Everywhere (Channel 9, Herb Sutter, from PDC2010), Lambda Expression in C++ (MSDN), Lambda Expression vs Functor in C++ (accepted answer, Crazy Eddie, StackOverflow), Anonymous Functions (Wikipedia), and for the die-hard fan, the C++ standard. One other notable change was made. The exception class was modified to derive from std::runtime_error. The previous exception class definition lacked a user-defined destructor. James McNellis dishes the dirt on this situation in “How does an exception specification affect virtual destructor overriding?” over on Stack Overflow.

Third Version: A Better Way to Handle Equation Variables

In the third version (FormulaV3), equation variables are managed in containers instead of individual member variables and manipulated using a single set of helper functions using enumerator tags and a couple utility functions to access the correct item:

enum struct tag : unsigned { a, b, c, d, e, count };

typedef std::array<double, static_cast<unsigned>(tag::count)> value_sequence;

typedef std::bitset<static_cast<unsigned>(tag::count)> presence_sequence;

static unsigned as_underlying(tag const t) { return static_cast<unsigned>(t); }

static unsigned as_bit (tag const t) { return 1 << as_underlying(t); }

as_underlying takes a tag name (like tag::a) and returns its underlying value (like 0). The return value is used as an index into a value_sequence array. as_bit returns a value with the corresponding bit set as an unsigned integer, handy for working with a presence_sequence. By handling equation variables this way, the code is easier to read, easier to extend and there are fewer functions to test. Adopting the code is as easy as modifying tag and defining the compute functions.

Final Version: Mixing in Physics

FormulaV4b is the final version of my code. The design from the previous version was carried forward with customizations to handle object motion calculations:

Equation variable names were changed in the tag enumeration.
Physics equations were used to compute the unknowns.
Since some of the equations might result in an attempt to divide by zero or take a negative square root, more exceptions can potentially be thrown.

Of all the bugs I had to chase down, the most insidious was a misplaced decimal point in a constant used in one of the equations. Instead of “0.5”, I had written “.05” and it threw off the results. The results pointed to the problem but my eyes and brain did not see the mistake for about 15 minutes and a bunch of strategically placed output statements. Compiler warnings were a comparative joy. This and previous versions are attached. [Updated 2/25/2013: Code is also on GitHub]. I’ve also attached FormulaV4a, code I cribbed from a demo Simple Physics Calculator XAML/C++ application written by Ale Contenti and Andy Rich. It uses lambdas and manages equation variables differently than my implementation. I peeked at this implementation before writing my own, which is cheating, but not really since I used the application for inspiration and wrote it from my own newbie perspective. The code includes an example calling each version.

Bonus: Unedited Reviewer Feedback

Grab some popcorn!

Using bits to track unknown values: “This is problematic in general because it sets up a limitation – however many bits are in the integer you select. When space is at a premium, you cannot beat bits, but otherwise avoiding unnecessary bit twiddling is a good idea.” But…but… L
More on unknown values: “Boost.Optional is the thing to use.” Had I not avoided external libraries, Boost.Optional looks like a good solution. As a bonus, no more bit twiddling.
Bit twiddling revisited: “But, there are tradeoffs: boost::optional cannot benefit from the compressed representation of a bitfield, potentially increasing the size of an equation object by N bytes where N is the number of variables. If N <= 64, I would argue that the bitfield solution is preferable, especially since the number of variables must be known at compile time with your implementation anyway, so you could easily use templates to utilize less efficient, more general code for N > 64.”
Using the term “hash” to describe the value indicating the two unknowns (returned by getUnknownsKey): “The use of the word “hash” here is potentially confusing. “hash” is a Word Of Power and generally indicates that collisions are possible.” If I were Humpty Dumpty, I could say (in rather a scornful tone) that “the word “hash” means just what I choose it to mean – neither more nor less”, but I am not. I fixed the code.
Worrying about following proper throwing conventions for exceptions: “Fortunately, there are enough conventions, especially in this area, that regardless of what you do you are probably conforming to some convention.” I suspect this is true in life as much as in C++ code.
Fretting over the etiquette of if statements: “[if] is not a function. I strongly recommend putting a space after it. People who don’t are bad and should feel bad.” I felt bad. I fixed it.
Naming Boolean functions: “Functions that return bool but aren’t named in a manner that makes their return value obvious, are bad. In this case, it is not clear what true and false mean.” The offender was an earlier version of calculate. I changed it to a void function and relied on exceptions to communicate calculation failures. The caller must keep track of how many variables have been set.
I am almost embarrassed to include this one. I mistakenly used “throw new” when raising an exception: “DANGER! DANGER! WRONG! “throw new” is bogus in C++.” Old habits die hard.

Each of these was a learning experience in itself.

In Closing

When you were learning C++, what types of small projects did you work on? Do you have suggestions for future articles? Can this code be improved? Do you want to write a “jumping into C++” article? Let us know in the usual places – Facebook, Twitter, ebattali@microsoft.com or comments below. CalculatingWithUnknowns.zip