So how does an application like Visual Web Developer wind up with a working set like 150 megabytes? I've got an unusual perspective on this problem...

About a year ago I was diagnosed with adult-onset diabetes.  You have to learn and change to live with diabetes.  One of the first things you learn is that eating sugar doesn't cause diabetes; you have to have a genetic predisposition for it.  It's not caused by germs, and it's not curable.  Due to a lot of factors it's basically the inability to control your blood glucose.

I was looking at a graph of the working set size of a particular test case over time when I was struck by how similar it looked to a blood glucose plot.  I started thinking about "digital diabetes"--the inability to control working set.  The genetic predisposition in this case comes from building an application on a complex framework, which is not an unusual situation these days.

Like the disease, this condition is incurable.  Once your application loses control of its working set it's not going to regain it except in extraordinary cases, like when you're in sole control of the code base, or you have the resources to do a complete analysis and rewrite.  So what can be done?  I'm convinced that the same response which is effective in controlling human diabetes will work for digital diabetes as well.

First, you have to monitor the situation.  Put tests in place which measure the working set under a number of situations.  Run them daily.  Respond quickly to spikes.  Don't let the average creep up.  Keep it under control.
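Here is a minimal sketch of what such a daily check might look like, assuming you already record one working set number (in megabytes) per test scenario per day.  The function name, the 10% spike threshold, the 3% creep threshold, and the seven-day window are all my own placeholders, not recommendations; tune them to your application.

```python
from statistics import mean


def check_working_set(history_mb, today_mb,
                      spike_ratio=1.10, creep_ratio=1.03, window=7):
    """Compare today's working set with recent history.

    history_mb: earlier daily measurements, oldest first (hypothetical data).
    Returns a list of warnings; an empty list means nothing was flagged.
    All thresholds are illustrative assumptions.
    """
    warnings = []
    if not history_mb:
        return warnings  # nothing to compare against yet

    # Baseline: the average of the most recent `window` days.
    baseline = mean(history_mb[-window:])

    # Spike: today is well above the recent average.
    if today_mb > baseline * spike_ratio:
        warnings.append("spike")

    # Creep: the recent average is above the average of the window before it.
    if len(history_mb) >= 2 * window:
        older = mean(history_mb[-2 * window:-window])
        if baseline > older * creep_ratio:
            warnings.append("creep")

    return warnings


# A sudden jump from a steady 100 MB to 120 MB is flagged as a spike.
print(check_working_set([100] * 7, 120))

# A slow rise from 100 MB to 104 MB over two weeks is flagged as creep.
print(check_working_set([100] * 7 + [104] * 7, 105))
```

Run something like this after each daily test pass and fail the build, or at least send mail, whenever the list comes back non-empty.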

With adult-onset diabetes, a general rule is that if you reduce your weight to 90% of what it was at the point of onset you will halve your insulin resistance (which is a very good thing).  If you're serious about dealing with digital diabetes I'd recommend the radical step of trimming 10% of your code immediately.  I have not had the opportunity to test this myself, but it makes sense.  I'd love to hear from anyone who tries this approach.

And finally, all this comes to nothing unless you make a serious and successful effort to modify the habits which lead to the condition in the first place.  For diabetics this means diet, exercise, and healthy living.  For programmers this means changing coding habits that lead to working set bloat.  I would love to develop a list of habits to modify; please send me your suggestions.  Here's what I have so far:

  • Stop fixing bugs by adding code.  I must have reviewed 10,000 bug fixes in the last eight years, and I'd estimate that about five of them removed code.  Some added hundreds of lines.  If you add it up it's easy to identify a controllable source of bloat.
  • Reduce complexity to improve performance; don't increase it.  Most discussions I hear about performance improvements begin "We could be a lot smarter about how we handle ...".  As a result the logic flow becomes more complex, caches are added, and classes grow and multiply.  While this addresses the immediate problem, the result of a dozen or a hundred programmers taking this approach routinely is a working set explosion.
  • Don't re-purpose components.  I've been involved with aggregating a few sizable run-time components and reusing them as the foundations for design-time features.  There are perfectly valid reasons for doing this, such as fidelity and efficiency, but working set is rarely given sufficient consideration when determining the feasibility of this approach.  Proof of concept code may not reveal the problem in advance.
  • Don't add features without considering their performance impact. This sounds deceptively simple. When performance comes up at all in feature discussions it's usually restricted to the algorithm to be employed; "Yeah, that could be order log n..." and the discussion is over. What about working set? How will the new feature perform on your minimum target machine? What's the cost of getting from initial implementation to acceptable performance? How will the additional overhead affect the performance of other features?