(Some additional remarks on this posting can be found here -- feel free to continue comments on that chain)

Here are a few things that I often look for when reviewing code or APIs for performance issues.  None of these are absolutes but they’re little things that seem to come up fairly often.  I've covered many of these before.

In no particular order:

Delegates:  Are you using delegates when you could just be using polymorphism?  Delegates let you arrange for any method on any object to be called.  With just an interface or a virtual method you get a fixed method on any object, which is often good enough.  Delegates cost you on a per instance basis – virtual methods cost on a per class basis.

Virtual Methods:  Are you using virtual methods when direct calls would do?  Many times people go with virtual methods to allow for future extensibility.  Extensibility is a good thing but it does come at a price – make sure your full extensibility story is worked out and that your use of virtual functions is actually going to get you to where you need to be.  For instance, sometimes people think through the call site issues but then don’t consider how the “extended” objects are going to be created.  Later they realize that (most of) the virtual functions didn’t help at all and they needed an entirely different model to get the “extended” objects into the system.

Sealing: Sealing can be a way of limiting the polymorphism of your class to just those sites where polymorphism is needed.  If you fully control the type then sealing can be a great thing for performance, as it lets the runtime make direct calls and opens the door to inlining.
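
As a rough illustration of sealing, here is a minimal sketch in Java, which stands in for managed code in general.  Java's `final` is used as the counterpart of a sealed class (Java's own `sealed` keyword means something different), and the `Shape`, `Circle` and `SealingDemo` names are hypothetical.  Whether a particular runtime actually turns the call into a direct or inlined one is up to its JIT; the sketch only shows the shape of the code that makes it possible.

```java
// Hypothetical types for illustration only.
abstract class Shape {
    abstract double area();
}

// 'final' plays the role of sealing: nothing can override area() below
// this point, so a call through a Circle reference has one possible target.
final class Circle extends Shape {
    private final double radius;

    Circle(double radius) {
        this.radius = radius;
    }

    @Override
    double area() {
        return Math.PI * radius * radius;
    }
}

class SealingDemo {
    // The parameter is typed as the sealed class, not as Shape, so this
    // call site does not need polymorphic dispatch.
    static double totalArea(Circle[] circles) {
        double sum = 0;
        for (Circle c : circles) {
            sum += c.area();
        }
        return sum;
    }

    public static void main(String[] args) {
        Circle[] circles = { new Circle(1.0), new Circle(2.0) };
        System.out.println(totalArea(circles));
    }
}
```

Code that still needs polymorphism can keep taking a `Shape`; only the sites that know they have a `Circle` get the narrower type.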

Type and API Inflation:  In the managed world there seems to be a tendency to add more classes with more members, each of which does less.  This can be a great design principle but it isn’t always appropriate.  Consider your unit-of-work carefully and make sure the APIs are chunky enough to do their job well (see below).  Don’t forget that each class and function has a static overhead associated with it – all things being equal, fewer classes with fewer functions gives better performance.  Make sure that you aren’t adding the kitchen sink to your classes just because it fits.

API Chunkiness:  Wrong-sized APIs often translate into wrong-sized transactions to an underlying database or memory store.  Carefully consider issues like the unit of work (how big the transactions are) and isolation even if you aren’t talking to a database.  It’s often good to think about in-memory structures as though they were a database that needed to be accessed concurrently even if they aren't.
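
To make the chatty-versus-chunky distinction concrete, here is a small sketch in Java (standing in for managed code generally).  `OrderStore` and its `roundTrips` counter are hypothetical; the counter simply stands for whatever a trip to the real store costs, such as a network hop, a lock acquisition or a transaction.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical store; roundTrips counts simulated trips to the backing store.
class OrderStore {
    private final List<String> rows = new ArrayList<>();
    int roundTrips = 0;

    // Chatty shape: every item is its own trip (and its own "transaction").
    void add(String row) {
        roundTrips++;
        rows.add(row);
    }

    // Chunky shape: the whole unit of work goes over in a single trip.
    void addAll(List<String> batch) {
        roundTrips++;
        rows.addAll(batch);
    }

    int size() {
        return rows.size();
    }
}

class ChunkinessDemo {
    public static void main(String[] args) {
        List<String> batch = Arrays.asList("order-1", "order-2", "order-3");

        OrderStore chatty = new OrderStore();
        for (String row : batch) {
            chatty.add(row);
        }

        OrderStore chunky = new OrderStore();
        chunky.addAll(batch);

        System.out.println("chatty trips: " + chatty.roundTrips);
        System.out.println("chunky trips: " + chunky.roundTrips);
    }
}
```

The batch call also gives the store one natural place to apply isolation to the whole unit of work, which the item-at-a-time shape cannot offer.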

Concurrency:  Don’t use a more complex concurrency model than is necessary.  Often the very simplest model is all that is necessary – complex sharing rules or low level synchronization frequently end up hurting much more than they help.  Put synchronization at the layer of your implementation that best understands the “transaction” and have none at higher or lower levels if you can possibly avoid it.  Steer clear of complicated synchronization methods, especially those that require specific knowledge of how strong or weak the processor's memory model is.
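
One way to picture "synchronize at the layer that understands the transaction" is the sketch below, written in Java as a stand-in for managed code.  The `Ledger` and `ConcurrencyDemo` names are hypothetical.  The balances are plain unsynchronized data; the only lock is taken by the operation that knows a transfer must be all-or-nothing.

```java
import java.util.Arrays;

// Hypothetical ledger: the array is plain data with no locking of its own.
class Ledger {
    private final long[] balances;

    Ledger(int accounts, long openingBalance) {
        balances = new long[accounts];
        Arrays.fill(balances, openingBalance);
    }

    // The transfer is the "transaction", so this is the one layer that locks.
    // Returns false (and changes nothing) if the source account is short.
    synchronized boolean transfer(int from, int to, long amount) {
        if (balances[from] < amount) {
            return false;
        }
        balances[from] -= amount;
        balances[to] += amount;
        return true;
    }

    synchronized long total() {
        long sum = 0;
        for (long b : balances) {
            sum += b;
        }
        return sum;
    }
}

class ConcurrencyDemo {
    // Two threads shuffle money in opposite directions; the total is invariant.
    static long runTransfers() {
        Ledger ledger = new Ledger(2, 1000);
        Thread a = new Thread(() -> {
            for (int i = 0; i < 10000; i++) ledger.transfer(0, 1, 1);
        });
        Thread b = new Thread(() -> {
            for (int i = 0; i < 10000; i++) ledger.transfer(1, 0, 1);
        });
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return ledger.total();
    }

    public static void main(String[] args) {
        System.out.println(runTransfers());
    }
}
```

A single coarse lock like this is deliberately the simplest model that is correct; it needs no reasoning about memory ordering beyond what the language's own lock already guarantees.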

Fewer DLLs:  All things being equal you tend to get better performance out of fewer large DLLs than you do out of lots of smaller DLLs.  This does break down at some point – especially if the DLL eagerly initializes parts of itself.  Where more DLLs tend to win is in superior patching opportunities, and in cases where you can avoid loading many or most of the DLLs at all in the normal case.  Consolidate DLLs where it makes sense to do so.

Late Bound Semantics: Simply put, if you don’t need Reflection then don’t use it. However hard we work on reflective access to types and members it will never be as fast or economical as the early-bound semantics.  If you are using Reflection for extensibility, consider patterns where only the extended cases pay the cost of Reflection.
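
A pattern where only the extended cases pay for reflection might look like the following sketch.  It is in Java (standing in for managed code), and `Transform`, `Upper`, `Reverse` and `Transforms` are hypothetical names.  The built-in cases are created with ordinary early-bound `new`; only a name the factory does not recognize is treated as a class name and loaded reflectively.

```java
import java.util.Locale;

// Hypothetical extension point.
interface Transform {
    String apply(String s);
}

final class Upper implements Transform {
    public String apply(String s) {
        return s.toUpperCase(Locale.ROOT);
    }
}

final class Reverse implements Transform {
    public String apply(String s) {
        return new StringBuilder(s).reverse().toString();
    }
}

class Transforms {
    static Transform create(String name) {
        switch (name) {
            case "upper":
                return new Upper();    // early-bound: no reflection cost
            case "reverse":
                return new Reverse();  // early-bound: no reflection cost
            default:
                // Extended case: treat the name as a class name and pay
                // for the reflective lookup and construction only here.
                try {
                    return (Transform) Class.forName(name)
                            .getDeclaredConstructor()
                            .newInstance();
                } catch (ReflectiveOperationException | ClassCastException e) {
                    throw new IllegalArgumentException("unknown transform: " + name, e);
                }
        }
    }
}

class LateBoundDemo {
    public static void main(String[] args) {
        System.out.println(Transforms.create("upper").apply("abc"));
        // Passing a class name exercises the reflective path.
        System.out.println(Transforms.create(Reverse.class.getName()).apply("abc"));
    }
}
```

Once an extension has been created, calls go through the `Transform` interface like any other, so the reflective cost is paid at creation rather than on every call.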

Fewer pointers:  I’d love to see more arrays of primitives, and fewer forests of pointers.  Pointer-rich data structures generally do less well on modern processors.  Pretend you have to pay me one picopenny per pointer.  How big would the check be by the end of the year?
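
For a sense of what the picopenny bill is counting, compare the two ways of holding the same numbers in the sketch below (Java, standing in for managed code; `PointerDemo` is a hypothetical name).  The linked list needs a node object per element, each with links to its neighbors and a reference to a boxed value, while the array is one contiguous block with no per-element pointers.  No timings are claimed here; the point is only the difference in shape.

```java
import java.util.LinkedList;
import java.util.List;

class PointerDemo {
    // Walks a forest of pointers: node -> next node, node -> boxed Integer.
    static long sumLinked(List<Integer> values) {
        long sum = 0;
        for (int v : values) {
            sum += v;
        }
        return sum;
    }

    // Walks one contiguous block of primitives.
    static long sumArray(int[] values) {
        long sum = 0;
        for (int v : values) {
            sum += v;
        }
        return sum;
    }

    public static void main(String[] args) {
        int n = 1000;
        List<Integer> linked = new LinkedList<>();
        int[] array = new int[n];
        for (int i = 0; i < n; i++) {
            linked.add(i);   // allocates a node (and possibly a boxed Integer)
            array[i] = i;    // just stores the value in place
        }
        System.out.println(sumLinked(linked) + " " + sumArray(array));
    }
}
```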

Cache with Policy:  Make sure that any caches you build have a well understood policy for removing/aging items and don’t just grow forever.  A cache without a proper policy isn’t a cache, it’s a memory leak.  Caches based on weak pointers look cute on paper but often still suffer from bad policy.
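
As one example of an explicit policy, the sketch below bounds a cache by entry count and evicts the least recently used item.  It is in Java (standing in for managed code) and leans on `LinkedHashMap`'s access-order mode and its `removeEldestEntry` hook; `LruCache` is a hypothetical name, and least-recently-used is just one possible policy, not necessarily the right one for a given workload.  The class is not thread-safe as written.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical bounded cache: at most maxEntries items, LRU eviction.
class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    LruCache(int maxEntries) {
        // accessOrder = true: iteration order runs from least to most
        // recently accessed, so the "eldest" entry is the LRU one.
        super(16, 0.75f, true);
        this.maxEntries = maxEntries;
    }

    // Called by LinkedHashMap after each insertion; returning true
    // removes the eldest entry, which keeps the cache bounded.
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries;
    }
}

class CacheDemo {
    public static void main(String[] args) {
        LruCache<String, Integer> cache = new LruCache<>(2);
        cache.put("a", 1);
        cache.put("b", 2);
        cache.get("a");     // touch "a" so "b" becomes least recently used
        cache.put("c", 3);  // exceeds the bound, so "b" is evicted
        System.out.println(cache.keySet());
    }
}
```

The important part is that the bound and the eviction rule are stated in the code, so the cache's worst-case footprint can be reasoned about.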

For more information have a look at the Performance and Scalability PAG -- chapter 5 particularly targets managed code.