Larry Osterman is doing a really great series on concurrency and scalability issues. He's touching on how to identify them and on some of the hidden places where they show up. The great thing is that we've actually found similar issues in the profiler analysis engine by using the profiler on itself. For example, if you are taking a lot of samples in the Win32 heap processing functions, there's a good chance that you are running into the heap locking issues discussed in the posts listed below. (Or in any case, you are probably allocating too frequently!)
Concurrency, Part 10 - How do you know if you've got a scalability issue?
Concurrency, part 11 - Hidden scalability issues
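
To make the "allocating too frequently" case a bit more concrete, here's a minimal sketch in portable C++. It is not code from the profiler or from Larry's posts, and the names (`process`, `sum_allocating`, `sum_reusing`, `run_on_threads`) are made up for illustration. The first worker allocates a scratch buffer on every iteration, so every thread keeps going back to the shared heap; the second allocates once per thread and reuses the buffer. Whether the first version actually contends on a heap lock depends on the allocator in use, so treat this as the pattern to look for when your samples pile up in heap functions, not as a benchmark.

```cpp
#include <cstddef>
#include <numeric>
#include <thread>
#include <vector>

// Hypothetical per-item work: fill the scratch buffer and sum its contents.
static long long process(std::vector<int>& scratch, int item) {
    std::iota(scratch.begin(), scratch.end(), item);
    return std::accumulate(scratch.begin(), scratch.end(), 0LL);
}

// Allocates a fresh buffer for every item, so each iteration enters the heap.
long long sum_allocating(int items, std::size_t size) {
    long long total = 0;
    for (int i = 0; i < items; ++i) {
        std::vector<int> scratch(size);  // heap allocation inside the hot loop
        total += process(scratch, i);
    }
    return total;
}

// Allocates the buffer once and reuses it for every item.
long long sum_reusing(int items, std::size_t size) {
    std::vector<int> scratch(size);  // single allocation per call
    long long total = 0;
    for (int i = 0; i < items; ++i) {
        total += process(scratch, i);
    }
    return total;
}

// Runs `worker` on `n` threads and adds up the per-thread results.
template <typename Fn>
long long run_on_threads(unsigned n, Fn worker) {
    std::vector<long long> results(n, 0);
    std::vector<std::thread> threads;
    for (unsigned t = 0; t < n; ++t) {
        threads.emplace_back([&results, t, worker] { results[t] = worker(); });
    }
    for (auto& th : threads) {
        th.join();
    }
    return std::accumulate(results.begin(), results.end(), 0LL);
}
```

Both workers return the same totals; the only difference is how often they call into the allocator. Profiling something like `run_on_threads(8, [] { return sum_allocating(100000, 256); })` against the same call with `sum_reusing` is one way to see whether heap functions dominate your samples.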