Iterator Blocks, Part One

Iterator Blocks, Part One

Rate This
  • Comments 8

There is a constant tension in language design between solving general problems and solving specific problems; finding the right spot on the general-to-specific spectrum can be quite tricky.

The design of iterator blocks yields (ha ha) a germane example. At almost every step along the way, there are opportunities for making choices that determine whether a more general or a more specific problem is being solved.

Let’s start with the high-level design of the feature. Iterator blocks are, as the name implies, all about making it easier to write code that iterates over some collection of items in a natural way. This is a pretty darn general scenario; there are lots of potential collections (stacks, queues, lists, dozens of different kinds of trees…) containing arbitrarily many different types of items, and lots of ways to iterate over them (in order, post order, pre order, filtered, projected, grouped, sorted…)

We could have made it much more general. Our iterator blocks can be seen as a weak kind of coroutine. We could have chosen to implement full coroutines and just made iterator blocks a special case of coroutines. And of course, coroutines are in turn less general than first-class continuations; we could have implemented continuations, implemented coroutines in terms of continuations, and iterators in terms of coroutines.

But we didn’t. We decided that the sweet spot for the high-level scenario-driven design of the feature was iterators over collections, and so we concentrated on that. Some adventurous people use the fact that iterators are “poor-man’s coroutines” as a shortcut to building coroutine-like systems that have only a tenuous connection to the semantics of iterating over the items in a collection, but those are the exception, not the rule. Their scenarios certainly did not drive the design of the feature.

We want to balance the generality of the feature against the high cost of that generality. Premature optimization is often cited as the root of all evil, but I don’t think that’s quite true. Premature generality is responsible for a lot of evil too! As we’ll see, a lot of the time when faced with a design decision we take the YAGNI position; we choose to implement a little bit less generality to get a large cost savings, with the assumption that hardly anyone would benefit from that spending.

Over our next few fabulous adventures in coding I’ll discuss some of the reasons for the seemingly odd restrictions on the generality of iterator blocks – things like, why can there be no unsafe code in iterator blocks? Why can an iterator block not take a ref or out parameter? Why can’t you put a yield in a finally? And so on.

  • It's exactly that kind of attitude that has made for a programming language that feels light years more productive for building a very large class of applications than so many of its ancestor languages.  People don't realize that there's so much software to be built and so much hardware to throw around that the weak link in the chain is human comprehension.

    It strikes me that a similar analysis could be made of C# generics vs C++ templates.

  • Let me guess :-)

    Iterator blocks cannot take ref or out parameters because those cannot be warped into fields inside a helper class which implements the iterator interface.

    You cannot put a yield in a finally because you cannot goto into finally. Or maybe because that would break the invariant of finally clause: statements after yield might never be executed.

    You cannot put unsafe code in the iterator block because then you could run into a situation where one part of code that e. g. allocates non-GC-ed memory is executed, and the other part that frees it is not executed, leading to memory leaks and other scary unsafe and unmanaged stuff.

    (sorry for my English, I am not quite native speaker)

  • A similar argument could have been made about IDisposable and using blocks,

    The big problem I see with this is that .Net now has 2 specific structures (foreach and using) which both could have been implemented as continuations.

    The restrictive nature of the implementation, kind of hurts you, you can not intercept and track exceptions in using blocks, you can not pass around foreach blocks (or track exceptions)

  • Eric

    Guess being one of the "adventurous people" iterators & async have worked great for the last 5 years for me :-)

    http://codeplex.com/easyasync

    Look forward to the rest of the posts in this series.

    Could you also elaborate on how you determine if a feature such as coroutine or continuation should be implemented in the *language or the run time*?

    I remember in 1.1 the undocumented hosting API hack I used to implement coroutines for .NET

    http://msdn.microsoft.com/en-us/magazine/cc164086.aspx

    Also in 2.0 the hosting API was revamped to support different tasking models to address similar continuation based scenarios (and also SQL server fiber mode)

    Where does the language designer fit into all this?

    Ajai

  • What's the likelihood of being able to yield inside the try of a try/catch block in a future version? Or is there some basic reason why that can't make sense. Aside from that I think iterator methods are fine as they are - close enough to general coroutines, and lacking the "spooky" aspects of general continuations.

    As for yield return in a finally block, that would be odd because finally blocks run if the iterator is abandoned via Dispose. Yielding more items wouldn't be the sanest response to that.

  • What's the likelihood of getting working coroutines in a future version of C# or CLR?  the problem with using iterator syntax has been demonstrated by you before:

       IEnumerator InOrderTraverseTree() {

           if (this.left != null)

           foreach(object o in this.left.InOrderTraverseTree())  yield return o;

           yield return this.value;

           if (this.right != null)

           foreach(object o in this.right.InOrderTraverseTree()) yield return o;

       }

    The problem being that you zip up and down the co-routine's call stack on every iteration step.  

    Don't get me wrong, the iteration syntax captures a nice pattern.  But people have noticed that it doesn't get you as far as 'something else', though maybe they don't know the name of that 'something else' yet, having never worked in a language that has coroutines or first class continuations.

    I'm curious if you'd agree, but I've always thought of the thing missing from iterator syntax is the method call abstraction.  In a hypothetical co-routine environment, instead of 'yield return x', you would 'push(x)', where 'push' is some sort of callable thing -- delegate in C# parlance -- where you could pass that delegate to recursive calls.

       // in hypothetical C#++

       [Coroutine(IEnumerable)]

       void InOrderTraverseTree(Action<object> push) {

           if (this.left != null)

               this.left.InOrderTraverseTree(push);

           yield return this.value;

           if (this.right != null)

               this.right.InOrderTraverseTree(push);

       }

    Slightly improved readibility, drastically improved asymptotic performance.  

    One alternate approach I've heard suggested is chainable iterators -- that 'yield return' could be passed an IEnumerable to fold in.  I'm curious which you'd prefer.

  • You can't yield from a finally block because yield is really a return, and you cannot return from a finally block -- you must leave the block to go to a return outside of it. And once you leave that block, there's no way to get back into the finally to resume execution on the next iteration.

    And if there was an unhandled exception, when would it propogate to the caller? After all the yields in the finally block had been enumerated? But what if you never finished enumerating them? And where would you put the exception in the mean time?

  • I'm a dyed-in-the-wool YAGNI guy. Thank you for everything, including Windows Script!!! Some time I'd like you to see what I did with (every bit of) Windows Script in my astronomy automation system. You'll probably not believe it. I even found named parameters, ha ha.

    I love C# and have used the hell out of it for the last 5 years or so. Keep up the great work, and keep blogging. I'm reading all of the parts to this interesting article on iterators now.

Page 1 of 1 (8 items)