I'm often asked "What's new in Whidbey" and so I thought I'd put together this (very) brief list of some of the more important items that got attention during this product cycle. This is by no means exhaustive but it's a taste of some of the nice improvements you'll see performance-wise.

Ngen for improved code sharing

Possibly the largest single investment area for Whidbey performance has been the drive to decrease the number of private bytes that are associated with precompiled managed modules.  Everett had roughly 40% of the bytes in any given ngen'd module as private.  Whidbey reduces that level to the 10% neighbourhood while at the same time reducing the overall working set of similar modules with better layout.  Overall you may see your private bytes in any given module fall to as low as 1/8th of what they used to be.

The benefits of this are fairly immediate -- fewer private bytes mean less IO and less memory pressure, both of which tend to translate directly into improved startup times, warm and cold alike.

Note: Beware the Private Bytes counter; it does not tell the full story.  Details to come in another entry, "When is a private byte not a private byte".

See also: Jason Zander's Ngen tutorial

Advanced Ngen features

Hard binding

You can use System.Runtime.CompilerServices.DependencyAttribute to tell ngen that one assembly always requires another to be loaded.  Opting into this hard link between the assemblies causes ngen to code certain offsets from the always-loaded assembly directly into the image it is creating, thereby reducing or eliminating the need for fixups.
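
As a rough sketch of what opting in might look like, the attribute is applied at assembly level in the dependent assembly.  The assembly name "MyCompany.Core" is a hypothetical placeholder, and LoadHint.Always is my reading of how "always requires the other to be loaded" is expressed; check the documentation before relying on it.

```csharp
using System.Runtime.CompilerServices;

// Assumption: "MyCompany.Core" is a hypothetical assembly that this
// assembly always needs. LoadHint.Always tells ngen the dependency is
// always loaded, which is what makes hard binding a candidate.
[assembly: Dependency("MyCompany.Core", LoadHint.Always)]
```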

See also: the MSDN article on Native Image Service

String Freezing

Use System.Runtime.CompilerServices.StringFreezingAttribute to indicate that ngen should include a GC segment in your assembly that contains all of your string literals, pre-created.  This further reduces the need for fixups and thereby reduces private pages.  However, an assembly with frozen strings cannot be unloaded, so it is a bad idea to set this attribute on assemblies that are "transient" in nature.
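
A minimal sketch of the opt-in, assuming you want it for the whole assembly (the attribute takes no arguments):

```csharp
using System.Runtime.CompilerServices;

// Ask ngen to pre-create this assembly's string literals in a frozen segment.
// Only sensible for assemblies that stay loaded for the life of the process,
// because an assembly with frozen strings cannot be unloaded.
[assembly: StringFreezing]
```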

Generics

Generics are a double-edged sword, so I won't come out and say they are the solution to all your performance woes, but they are a great tool to have in your arsenal.  In Whidbey you can easily create strongly-typed collections of value types that eradicate boxing in many important cases.  Just take care not to go crazy creating collection types or you may find that you have added far too much code to your project.
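
To make the boxing point concrete, here is a small sketch comparing ArrayList with List<int>.  The class and method names are made up for illustration; the point is only where the boxing happens, not how much time it costs in your application.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

static class BoxingDemo
{
    // ArrayList stores object references, so every int is boxed on the
    // way in and unboxed (with a cast) on the way out.
    public static int SumBoxed(int count)
    {
        ArrayList list = new ArrayList();
        for (int i = 0; i < count; i++)
            list.Add(i);              // boxes i
        int sum = 0;
        foreach (object o in list)
            sum += (int)o;            // unboxes
        return sum;
    }

    // List<int> stores the ints directly; no boxing occurs.
    public static int SumGeneric(int count)
    {
        List<int> list = new List<int>();
        for (int i = 0; i < count; i++)
            list.Add(i);
        int sum = 0;
        foreach (int value in list)
            sum += value;
        return sum;
    }
}
```

Both methods return the same result; the generic version simply never allocates a box per element.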

See also: Six Questions about Generics and Performance

Enumerators and Foreach

The generic collections such as List<T> and Dictionary<X,Y> implement enumeration in a better way than ArrayList and Hashtable did.  If enumeration costs are significant in your application you may get significant savings by using the new collections even if you don't need the strong typing.
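
One visible piece of the difference, going only by the public API, is that List<T>.GetEnumerator returns a struct (List<T>.Enumerator), while ArrayList.GetEnumerator is declared to return the IEnumerator interface.  The sketch below just demonstrates that; the class name is hypothetical and it makes no claim about what else changed internally.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

static class EnumeratorDemo
{
    // foreach over a variable typed as List<int> binds to the struct
    // enumerator, so no enumerator object is allocated and Current is an int.
    public static int Sum(List<int> values)
    {
        int sum = 0;
        foreach (int v in values)
            sum += v;
        return sum;
    }

    // List<int>.Enumerator is a value type.
    public static bool ListEnumeratorIsStruct()
    {
        return typeof(List<int>.Enumerator).IsValueType;
    }

    // ArrayList's parameterless GetEnumerator() returns an interface type.
    public static bool ArrayListEnumeratorIsStruct()
    {
        return typeof(ArrayList)
            .GetMethod("GetEnumerator", Type.EmptyTypes)
            .ReturnType.IsValueType;
    }
}
```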

GC Improvements

A great deal of tuning, bug fixing, and the addition of important new heuristics make the Whidbey GC the best we have ever released.  In addition it is much easier to choose the particular GC mode you want (server vs. workstation, concurrent or not) -- you no longer have to use the hosting API to get that flexibility.
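
As a sketch of what choosing a mode without the hosting API can look like, I'm assuming here that the mechanism meant is the gcServer and gcConcurrent elements in the application configuration file.  The fragment is shown in a comment, and the code only reports which mode the process ended up with; the class name is hypothetical.

```csharp
using System;
using System.Runtime;

static class GcModeDemo
{
    // Assumed app.config fragment (values are examples only):
    //
    //   <configuration>
    //     <runtime>
    //       <gcServer enabled="true"/>
    //       <gcConcurrent enabled="false"/>
    //     </runtime>
    //   </configuration>

    // Reports the mode the runtime actually selected for this process.
    public static string Describe()
    {
        return GCSettings.IsServerGC ? "server" : "workstation";
    }
}
```

Checking the mode at startup is a cheap way to confirm the configuration was picked up, since the runtime may fall back to workstation mode (for example on a single-processor machine).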

See also: Maoni's articles on the new GC 

Exceptions Improvements

A variety of improvements in the cost of thrown exceptions have helped some cases, especially those with fairly deep stacks.  Our guidance is still to avoid exceptions on the main path but you might find them somewhat less deadly in recursive algorithms.

Exception Avoidance

Many classes now support exception-free variations such as TryParse, which give a return code instead of throwing an exception in failure cases.  These are highly recommended in cases where parsing failures can reasonably be expected.
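
For example, here is a minimal sketch using Int32.TryParse.  The helper name and the fallback behavior are my own choices for illustration.

```csharp
using System;

static class ParseDemo
{
    // Returns the parsed value, or fallback when text is not a valid int.
    // No exception is thrown on bad input.
    public static int ParseOrDefault(string text, int fallback)
    {
        int value;
        if (int.TryParse(text, out value))
            return value;
        return fallback;
    }
}
```

Calling ParseDemo.ParseOrDefault("42", -1) yields 42, while ParseDemo.ParseOrDefault("forty-two", -1) yields -1 without ever entering the exception machinery.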

Security

A great deal of work went into the security system as a whole, from basic things like reducing the cost of startup by improving the security system's XML parser to making declarative security more efficient.  In Whidbey the most common demands -- for things like unmanaged code access and full trust -- are greatly reduced in cost, as is the cost of asserts.  Generally "full trust" performance is excellent with very low throughput overhead.  These overhead reductions especially pay dividends in interop cases that are very chatty.


Strings

There are new string overloads available that make it possible to specify the comparison mode.  See Dave Fetterman's excellent article describing when to use which type of comparison, and be sure to take advantage of ordinal-based comparisons (where appropriate) for both speed and security.
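
A short sketch of the overloads in question, using StringComparison.OrdinalIgnoreCase for strings that are identifiers rather than human-language text.  The helper names and the inputs they check are illustrative assumptions.

```csharp
using System;

static class CompareDemo
{
    // Ordinal comparison looks only at character values, so the result
    // does not change with the current culture.
    public static bool IsHttpScheme(string scheme)
    {
        return string.Equals(scheme, "http", StringComparison.OrdinalIgnoreCase);
    }

    public static bool HasTxtExtension(string fileName)
    {
        return fileName.EndsWith(".txt", StringComparison.OrdinalIgnoreCase);
    }
}
```

Ordinal modes suit things like protocol names, file extensions, and registry keys; text shown to or sorted for users is usually a case for a culture-sensitive mode instead.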

Reflection

Whidbey includes a major overhaul of the caching policy in the reflection system which results in much more economical overall behavior.  This, combined with some useful new APIs for getting just what you need, can mean great improvements.  See Joel Pobar's excellent article on the costs of reflection and his insights into best practices as well as his blog generally.

Cross App Domain calls

Marshalling of simple data types (e.g. strings, ints, copyable structs, etc.) between application domains was vastly improved.  In some cases we observed throughput improvements as high as 10x.

AddMemoryPressure

You can use GC.AddMemoryPressure and GC.RemoveMemoryPressure to inform the collector that you are allocating and freeing unmanaged memory.  This helps the collector to better understand the true memory pressure in your application and collect appropriately.  It's important to remember that using this API improperly can do a lot of damage because it so directly affects GC behavior.  Be sure to verify improvements with measurements.
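
Here is one way the pairing might look, as a sketch only.  NativeBuffer is a hypothetical wrapper, and Marshal.AllocHGlobal stands in for whatever unmanaged allocation you actually make.  The important part is that every AddMemoryPressure is matched by a RemoveMemoryPressure for the same number of bytes.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical wrapper around an unmanaged buffer.
sealed class NativeBuffer : IDisposable
{
    private IntPtr _ptr;
    private readonly long _size;

    public NativeBuffer(int size)
    {
        if (size <= 0) throw new ArgumentOutOfRangeException("size");
        _size = size;
        _ptr = Marshal.AllocHGlobal(size);
        // Tell the GC this small managed object holds _size unmanaged bytes.
        GC.AddMemoryPressure(_size);
    }

    public bool IsAllocated
    {
        get { return _ptr != IntPtr.Zero; }
    }

    public void Dispose()
    {
        Release();
        GC.SuppressFinalize(this);
    }

    // Safety net in case Dispose is never called.
    ~NativeBuffer()
    {
        Release();
    }

    private void Release()
    {
        if (_ptr == IntPtr.Zero) return;   // already released, or never allocated
        Marshal.FreeHGlobal(_ptr);
        _ptr = IntPtr.Zero;
        // Must remove exactly the amount that was added.
        GC.RemoveMemoryPressure(_size);
    }
}
```

Keeping the add and remove calls inside one small type makes it much harder for them to get out of balance, which is the kind of improper use that does the damage.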


Profiling API

Many additional callbacks have been added to make it possible to completely track the lifetime of objects without ad hoc heuristics regarding object promotion and so forth.  As a result finding sources of Mid-Life-Crisis, undisposed objects, and memory leaks generally is greatly improved.  A new CLRProfiler (Beta 2 version) supports these features and of course you can write your own custom profilers that do likewise with comparative ease.


Threadpool

The thread pool now has adaptive thread injection based on throughput and better coordination with the GC.