For a compiler, the common things are code size vs. speed.

The conventional wisdom is to optimize for code-size, because that tends to reduce page faults / get better cache usage and thus ultimately be faster. (I've just exhausted my knowledge on that topic. Head over to Rico's blog if you need more.)

I think an often missed optimization goal is for simplicity.

In my opinion, prefer to optimize for simplicity.   Why?

  1. I'd rather have code be correct and slow, than fast and wrong.
  2. That generally makes the whole dev/test cycle faster, thus giving you more development cycles to focus on optimizing the real bottlenecks.
  3. The gap between the performance of the (simple + easy) solution versus the (complex and "fast") solution is probably smaller than you expect.


"...about 97% of the time: Premature optimization is the root of all evil". (That's Rico, quoting Knuth, quoting Tony Hoare). I'm not advocating being dumb: common sense should still rule. For example, if you need to lookup in a data structure, don't quadruple index it. But as least use a single indexing scheme that you expect to be the common case.  [Update, fixed quote thanks to Peter]


Some practical cases:

  1. Don't try to be fancy with threads. Don't try to do these fancy complex lockless operations without perf measurements to show you that the complexity is actually buying you speed. Just take the leaf level lock and spend those brain cycles somewhere else. Keep your threading bugs as simple as possible.
  2. Not every search has to be a super-smart multiple indexed thing. Linear searches are ok for small sets.
  3. Use conventions that your team understands (or take on the burden of helping your team understand the conventions).  Sometimes simplicity is relative.
  4. For C++, use destructor cleanup (eg, smart pointers) instead of manually managing all cleanup yourself.
  5. Don't fight the optimizer.


The irony is that some of these simplifications are not only easier to understand, but can perform faster too.  For example, a binary search can wreck your cache lines whereas a linear search might play well.  Or you may try to optimize something yourself (eg,  inline assembly) where the C++ optimizer could do it better.