So I've checked in the new (unmanaged) Longhorn threadpool.
If you use the threadpool, I think you'll enjoy it. A number of API features should make it much easier to write correct code against it, particularly around cleanup and DLL unload synchronization, where the old QueueUserWorkItem() interface was sorely lacking. Performance and scalability should be a little better, too.
So now it's back to fixing bugs, at least for a little while; there are a lot of edge cases in ntdll.dll and kernel32.dll which were never really thought through properly. I spent a bit of time this past fall plowing through these, and this is a good time to do some more before the next big project.
I wish I had time, before the kernel team stops Longhorn work, to do some serious work on the PE loader code. Having one big loader lock per process causes a number of performance problems, so making that locking a bit more granular would be great. It'd also be nice to untangle the code a bit; currently there are callouts to various components sprinkled throughout it. Maybe once Longhorn ships...