Holy cow, I wrote a book!
Like anonymous methods, iterators in C# are very complex syntactic sugar. You could do it all yourself (after all, you did have to do it all yourself in earlier versions of C#), but the compiler transformation makes for much greater convenience.
The idea behind iterators is that they take a function with yield return statements (and possible some yield break statements) and convert it into a state machine. When you yield return, the state of the function is recorded, and execution resumes from that state the next time the iterator is called upon to produce another object.
yield return
yield break
Here's the basic idea: All the local variables of the iterator (treating iterator parameters as pre-initialized local variables, including the hidden this parameter) become member variables of a helper class. The helper class also has an internal state member that keeps track of where execution left off and an internal current member that holds the object most recently enumerated.
this
class MyClass { int limit = 0; public MyClass(int limit) { this.limit = limit; } public IEnumerable<int> CountFrom(int start) { for (int i = start; i <= limit; i++) { yield return i; } } }
The CountFrom method produces an integer enumerator that spits out the integers starting at start and continuing up to and including limit. The compiler internally converts this enumerator into something like this:
CountFrom
start
limit
class MyClass_Enumerator : IEnumerable<int> { int state$0 = 0;// internal member int current$0; // internal member MyClass this$0; // implicit parameter to CountFrom int start; // explicit parameter to CountFrom int i; // local variable of CountFrom public int Current { get { return current$0; } } public bool MoveNext() { switch (state$0) { case 0: goto resume$0; case 1: goto resume$1; case 2: return false; } resume$0:; for (i = start; i <= this$0.limit; i++) { current$0 = i; state$0 = 1; return true; resume$1:; } state$0 = 2; return false; } ... other bookkeeping, not important here ... } public IEnumerable<int> CountFrom(int start) { MyClass_Enumerator e = new MyClass_Enumerator(); e.this$0 = this; e.start = start; return e; }
The enumerator class is auto-generated by the compiler and, as promised, it contains two internal members for the state and current object, plus a member for each parameter (including the hidden this parameter), plus a member for each local variable. The Current property merely returns the current object. All the real work happens in MoveNext.
Current
MoveNext
To generate the MoveNext method, the compiler takes the code you write and performs a few transformations. First, all the references to variables and parameters need to be adjusted since the code moved to a helper class.
this$0
m
this$0.m
this.m
v
this.v
The compiler also has to deal with all those yield return statements.
yield return x
current$0 = x; state$0 = n; return true; resume$n:;
where n is an increasing number starting at 1.
n
And then there are the yield break statements.
state$0 = n2; return false;
n2
Finally, the compiler puts the big state dispatcher at the top of the function.
switch (state$0) { case 0: goto resume$0; case 1: goto resume$1; case 2: goto resume$2; ... case n: goto resume$n; case n2: return false; }
with one case statement for each state, plus the initial zero state and the final n2 state.
case
Notice that this transformation is quite different from the enumeration model we built based on coroutines and fibers. The C# method is far more efficient in terms of memory usage since it doesn't consume an entire stack (typically a megabyte in size) like the fiber approach does. Instead it just borrows the stack of the caller, and anything that it needs to save across calls to MoveNext are stored in a helper object (which goes on the heap rather than the stack). This fake-out is normally quite effective—most people don't even realize that it's happening—but there are places where the difference is significant, and we'll see that shortly.
Exercise: Why do we need to write state$0 = n2; and add the case n2: return false;? Why can't we just transform each yield break into return false; and stop there?
state$0 = n2;
case n2: return false;
return false;