FYI: C# and VB Closures are per-scope

This post assumes that you understand how closures are implemented in C#. They're implemented in essentially the same way in the upcoming version of Visual Basic.  As Raymond and Grant point out in their various articles on the subject, the question of whether or not two instances of a delegate share a closed-over variable or have their own copy depends on where the variable is in scope in relation to the delegate creation. I think that this issue is reasonably well-documented by these guys. Lots has been written on this subject already; no need for me to recap it all here.

However, a related issue which I haven't seen anyone talk much about is what the consequences of having one closure per scope are. Though it makes the closure semantics conceptually easier to think about (and implement!), it can lead to an unfortunate problem with garbage collection. Consider the following:

Func<Cheap> M() {
  Cheap c = new Cheap();
  Expensive e = new Expensive();
  Func<Expensive> shortlived = () => e;
  Func<Cheap> longlived = () => c;
  // use shortlived
  // use longlived
  return longlived;
}

If the short-lived delegate does not survive past the end of the method, when is the expensive resource released?

The closure for the short-lived delegate owns the expensive resource. But since there is one closure per scope, both the short-lived and the long-lived delegates own a single closure. The closure cannot be collected until every delegate that owns it is dead. Therefore the expensive resource is not released until the long-lived delegate is released, even though the long-lived delegate does not reference the expensive resource!
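
To make that concrete, here is a rough sketch of what the compiler generates for the method above. The class and member names are invented for illustration; the real generated names are compiler-internal. The point is simply that both lambdas share one closure object:

// Illustrative sketch only; actual generated names differ.
class DisplayClass {
  public Cheap c;
  public Expensive e;
  public Expensive ShortLived() { return e; }
  public Cheap LongLived() { return c; }
}

Func<Cheap> M() {
  DisplayClass closure = new DisplayClass();
  closure.c = new Cheap();
  closure.e = new Expensive();
  Func<Expensive> shortlived = closure.ShortLived;
  Func<Cheap> longlived = closure.LongLived;
  // use shortlived
  // use longlived
  return longlived; // keeps the whole closure, including e, alive
}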

We could solve this problem in the compilers by coming up with a smarter mechanism for determining how to create closures, and perhaps some day we will, but it will not be in C# 3.0. Until that day, if you are creating what you think are short-lived anonymous methods, lambdas or queries which close over expensive resources, you might want to explicitly release those resources, and clear the captured variables, when you know that you're done with them. Otherwise it is easy to accidentally make a resource live much longer than you intend.
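
For example, one way to do that in the method above is sketched below. This is only a sketch; it assumes, purely for illustration, that Expensive implements IDisposable:

Func<Cheap> M() {
  Cheap c = new Cheap();
  Expensive e = new Expensive();
  Func<Expensive> shortlived = () => e;
  Func<Cheap> longlived = () => c;
  // use shortlived
  e.Dispose(); // assumes Expensive : IDisposable; releases the underlying resource deterministically
  e = null;    // clears the captured variable so the shared closure no longer keeps the object alive
  // use longlived
  return longlived;
}

Clearing the captured variable matters because Dispose alone releases the underlying resource but the closure field still references the object; nulling it out lets the garbage collector reclaim the instance even though the closure object itself survives with the long-lived delegate.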

  • Raymond said: "It turns out that most computer programming doesn't consist of being clever or making hard decisions. You just have one kernel of an idea ("hey let's have anonymous methods") and then the rest is just doing what has to be done, no actual decisions needed." Seems to me that your example is a situation where being clever would help.

    Wouldn't static data flow analysis tell you when shortlived can be collected?

  • First off, Raymond said "most", not "all".

    Second, though I understand what Raymond is getting at -- most of writing a compiler really does consist of "throw another hash table at the problem" -- I disagree somewhat with his statement.  The germ of the idea (e.g., "hey, let's add a query syntax to the language!") does not always lead to an obvious design or implementation. Design and implementation are always processes where multiple decisions must be made in the context of difficult tradeoffs.

    Third, in general, static analysis is frequently insufficient to determine when a given object can be collected.  We _could_ of course do as much static analysis as possible, and then give hints to the garbage collector about what can go away when.  But we have invested in a highly tuned dynamic garbage collector for a reason, and that reason is to make it easier on us compiler writers and our users. Generally we feel that it is better to give the GC the stuff it needs to do its work well at runtime and let it handle the heuristics about when is the right time to collect what.

    The situation I am describing here is one where the choices we've made work against the efficacy of the garbage collector.  We have to do an analysis of every hoisted local and every scope anyway. I think we would do better to solve this problem by improving our hoisting algorithm to support more than one closure class per scope and letting the GC do its work dynamically, rather than investing in some new static analysis model that would not always give good results.
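
    To make that concrete, "more than one closure class per scope" for the example in the post might look something like the following sketch (illustrative only; this is not what the shipping compiler does):

    // Illustrative only -- not what the C# 3.0 compiler generates.
    class ClosureExpensive { public Expensive e; public Expensive Get() { return e; } }
    class ClosureCheap { public Cheap c; public Cheap Get() { return c; } }
    // shortlived would capture only a ClosureExpensive and longlived only a ClosureCheap,
    // so the Expensive instance becomes collectible as soon as the short-lived delegate dies.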

  • It is talked about a lot, just not in relation to C#. JavaScript uses the same one-closure-per-scope implementation, and in the JavaScript community there has been a lot of discussion about it, because it causes memory leaks when combined with the way Internet Explorer implements the DOM.

  • Yes, I'm the guy at Microsoft who diagnosed that problem with JScript originally, so I have some familiarity with it. :-)

    Though JScript also suffers from the problem that short-lived objects end up having their lifetimes extended to that of the longest-lived object in the same closure, that is not the cause of the memory leak.  Rather, the cause of the memory leak is the mixture of JScript's mark-and-sweep garbage collection with IE's COM-style reference counting. The two problems are independent; solving one will not solve the other.

  • Since you are familiar with the JScript memory leak problem, do you also know if there are any plans for fixing it in upcoming IE versions?

    We've been developing a very large IE application for five years, and we have basically hit a brick wall with this problem. It has reached the point where we are contemplating reimplementing everything in managed code, but that would be extremely expensive. It would be great news for us if this problem could simply "go away"...

  • I have not worked on scripting for many years and I've never worked on IE, so I do not know what their plans are.  I do know that the scripting team is aware of the issue.
