I did a little work today on a significant perf issue that cropped up and was preventing us from meeting our Beta2 exit criteria.  Basically when reloading an extremely large solution we were seeing the whole UI hang while doing some intensive CPU labor.  After doing some profiling with some of our internal tools we discovered it was in my code.  Sigh...

Before explaining the issue I thought I'd discuss a little bit about the portion of the architecture that was affected.  To start with, consider this code and the following sequence of events:

public class MyClass {
    public SomeType someField;
}

  1. We start compiling the type so that we'll be able to provide intellisense on it
  2. We start compiling the field of that type
  3. We attempt to resolve the type (SomeType) of the field.  (and say we find it in A.DLL)
  4. Later you then remove A.DLL and add B.DLL (which also contains a type 'SomeType')
  5. You then try to use some bit of intellisense (like 'goto definition') on 'SomeType'

In order to make sure that works we need to correctly bind 'SomeType' to the type in B.DLL.  In our old model we used a 'push'-based approach.  When the set of imported DLLs changed we'd "push" that information out to all interested parties.  In this case the interested parties were all the types we'd parsed out of your source code that had bound to some type from a DLL.  Cases where those types would bind to a type from a DLL would be (as above) the type of a field, the return type or parameters of a method, the supertype of a class, etc. etc.  Then when a listener received notification that a DLL had been added or removed it would go and unbind/rebind any previously bound types.
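
To make that concrete, here's a very rough sketch of what the push approach looked like.  The names (IReferenceListener, ParsedType, ResolveAgainstCurrentReferences, etc.) are made up purely for illustration; they aren't our actual internals:

using System.Collections.Generic;

interface IReferenceListener {
    void OnReferencesChanged();
}

class ParsedType : IReferenceListener {
    private object boundFieldType;    // e.g. the binding for 'SomeType'

    public void OnReferencesChanged() {
        // The set of imported DLLs just changed, so any previously bound type
        // may be stale: throw the old binding away and rebind against the new set.
        boundFieldType = null;
        boundFieldType = ResolveAgainstCurrentReferences();
    }

    private object ResolveAgainstCurrentReferences() {
        return new object();          // stand-in for the real name resolution
    }
}

class Project {
    private List<IReferenceListener> listeners = new List<IReferenceListener>();

    public void AddListener(IReferenceListener listener) {
        listeners.Add(listener);
    }

    public void AddReference(string dllPath) {
        // ...record the new reference...
        // Push the change to every interested party right away, even if another
        // add or remove is about to arrive immediately after this one.
        foreach (IReferenceListener listener in listeners) {
            listener.OnReferencesChanged();
        }
    }
}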

As you can imagine there are a *lot* of places that bind to types from metadata (just think about your own code), and pushing this information to all of them was actually quite costly.  What's worse was that we were compounding that cost.  When you load or reload a solution we get notified of all the DLLs your project imports.  In the case of this uber-solution we were getting notified about many tens of DLLs that were included in the project.   So not only were we pushing the information about the add or remove of a DLL, we were doing it many times.  So we’d unbind/rebind on the addition of the first DLL, then we’d do it again on the addition of the second, then the third, etc.  Of course, all but the last unbind/rebind was unnecessary.

We had a few ideas about how to make things better.  The first was to just push this work into a task that our background thread would process.  This would allow us to start processing user interaction on the foreground thread again, which would make the experience much better in terms of user perception.  We decided against that for two reasons.  The first was that we're very wary about moving work to the background thread, especially this late in the game.  We spent a lot of time in Whidbey trying to eliminate deadlocks from C#, and moving work to the background thread is just inviting a critical hang to occur right as we're shutting down.  The second reason was that even if we did move this to the background thread we'd still be doing a lot of unnecessary duplicated work.  This would mean our background thread would be just chewing up CPU when it could be doing important things like updating the core intellisense understanding of your code as you're typing.

Another idea we had was to try to batch up all the changes, and then dispatch them all at once.  So if three DLLs were added to the project we'd only issue the 'push' once at the end instead of three times.  We decided against this because it would involve changing the interfaces we have with the VS project system which would require a lot of coordination amongst all the different languages in order to make sure they would all support these new interfaces.

So, instead we decided to reverse the operation entirely and move from a push model to a pull model.  So now the project stores a version number, and each parsed type stores the version number of the project from when it last sync'ed with it.  Whenever references are added or removed from the project that version is incremented.  Then when you ask for something (like the type of 'someField') we compare the type's stored version against the version of the project.  If they disagree then we know we can't trust the current type, so we recompile it, getting the right information, and then store the version of the world we are now consistent with.
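
Sketched in the same style, the pull model looks roughly like this (again, the names are made up for illustration):

class Project {
    private int version;

    public int Version {
        get { return version; }
    }

    public void AddReference(string dllPath) {
        // ...record the new reference...
        version++;                    // no push -- just note that the world changed
    }

    public void RemoveReference(string dllPath) {
        // ...drop the reference...
        version++;
    }
}

class ParsedType {
    private Project project;
    private int syncedVersion = -1;   // the project version we last sync'ed with
    private object boundFieldType;

    public ParsedType(Project project) {
        this.project = project;
    }

    public object GetFieldType() {
        // Pull: only rebind if the project has changed since we last looked,
        // then remember which version of the world we're now consistent with.
        if (syncedVersion != project.Version) {
            boundFieldType = ResolveAgainstCurrentReferences();
            syncedVersion = project.Version;
        }
        return boundFieldType;
    }

    private object ResolveAgainstCurrentReferences() {
        return new object();          // stand-in for the real name resolution
    }
}

With this in place, ten reference changes that happen before the next access cost ten increments of a counter plus a single rebind when something is finally asked for, rather than ten full unbind/rebind passes.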

This change solved a couple of issues for us.  First, it dropped the perf impact of solution load/reload down to almost nothing (in terms of the time spent in C# at least).  Second, it allowed us to drastically simplify our code since we now were able to cut out all the push logic that was threaded through a fair amount of code.

What's interesting is that if you consider the simple one-DLL case, the total amount of CPU time spent doing this work is actually more after my change.  Why?  Well, before the change we had a set amount of work to do and it was accomplished in one pass (i.e. pushing the information to all interested parties).  After the change the same work needs to be done *and* even more work needs to be done so that the interested parties check versions and pull in information every time they're accessed.  Of course, in the multiple-DLL case we should be a lot better since instead of pushing the information n times to each listener, we will only pull it once.

Of course, even in the single-DLL case, while our overall performance might be worse, our perceived performance will be much better.  This is because we've taken operations that used to stall the UI and turned them into demand-driven operations that run fast enough to be performed in the idle time between your keystrokes.

When I get into work I’ll send out some information on the project that we use on our team to measure performance on large solutions.  In return I’d like to hear from you guys on the kinds of projects you work on.  Number of projects in your solution, number of files in your project/solution, total size of all your source, largest single source code file.  Those stats would be invaluable in making sure we can make our performance scale to the projects that you’ll be working on.

Edit: The solution I work with and routinely do perf testing on has 8 projects, 1400 .cs files (the largest being 352KB), totaling 25 MB of source.
I've worked on fixing bugs in VS7.1 for enterprises with larger solutions than that (i.e. 250 projects), but I'm curious about you guys.

Edit #2: When loading a project there are actually two components interacting: the "project system" and the "C# compiler/language service".  Scalability across many projects (i.e. handling 250 projects) is mainly dealt with on the project system side.  Scalability across many files and large files is mainly dealt with on the C# side.  Also, for the most part loading all the projects happens on the main thread, whereas loading all the files happens on the background thread.  This will hopefully allow us to load each project very quickly and allow you to start editing/working right away.  However, we'll be sitting there churning away in the background to bring ourselves up to a full understanding of your code.