Implementing Iterators - VB

This topic on iterators discusses the yield return feature of C#, and its implications for us when writing Visual Basic code in the functional style.  Visual Basic doesn’t have a feature that corresponds to yield return, but there are several ways to work around the lack of it.

The key concept behind iterators (and yield return) is that sometimes we want to write a function (either an extension method or not) that performs some projection or transformation that we can’t accomplish using the standard query extension methods.  For example, further on in this tutorial, we’ll implement a function that implements grouping in a different way than the grouping extension methods that come with the .NET framework.  In another program, I wanted an extension method to skip the last n items in a collection – this was convenient to implement using yield return.

This topic will first discuss the semantics of the yield return keyword in C#.  It will then show an implementation of an iterator in Visual Basic, and compare/contrast this implementation to the C# one.  We’ll also discuss the various ways to work around the lack of yield return in VB.

Yield Return

Yield return, introduced in C# 2.0, is a more elegant means of implementing the plumbing for iterators in C#.

Another way to say this: yield return allows us to write lazy functions that iterate over some source.  Lazy functions allow LINQ to delay execution of a query until its results are actually enumerated.  They also allow queries to be written in such a way that LINQ does not need to assemble massive intermediate results.  If every step of a query materialized its intermediate results, the system would rapidly become unwieldy and unworkable.
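
To make the idea of delayed execution concrete, here is a minimal sketch in Visual Basic that uses the standard Select extension method, which is already lazy.  The module name, the Journal list, and the TimesTen helper are hypothetical names invented for this illustration; the only point is to record the order in which things happen.

```vb
Imports System
Imports System.Collections.Generic
Imports System.Linq

Module DeferredExecutionDemo
    ' Records the order in which things happen (illustrative only).
    Public Journal As New List(Of String)

    ' Hypothetical helper: notes each projection so we can see when it runs.
    Function TimesTen(ByVal n As Integer) As Integer
        Journal.Add("projecting " & n.ToString())
        Return n * 10
    End Function

    Sub Main()
        Dim source() As Integer = {1, 2, 3}

        ' Defining the query does not call TimesTen.
        Dim query As IEnumerable(Of Integer) = _
            source.Select(Function(n) TimesTen(n))
        Journal.Add("query defined")

        ' Each projection runs only when For Each asks for the next item.
        For Each v As Integer In query
            Journal.Add("got " & v.ToString())
        Next

        For Each entry As String In Journal
            Console.WriteLine(entry)
        Next
    End Sub
End Module
```

Running this prints "query defined" first, followed by alternating "projecting" and "got" lines (projecting 1, got 10, projecting 2, got 20, and so on).  No intermediate collection of projected values is ever built.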

The following two small programs demonstrate the difference between implementing a collection by hand via the IEnumerable interface and using yield return in an iterator block.

The first example, written in Visual Basic, shows that there is a lot of plumbing that you have to write.  You have to write one class that implements IEnumerable, and another class that implements IEnumerator.  The GetEnumerator() method in MyListOfStrings returns an instance of the class that implements IEnumerator.  The end result is that you can iterate through the collection using For Each.

Imports System
Imports System.Collections
 
Public Class MyListOfStrings
    Implements IEnumerable
    Private _strings() As String
    Sub New(ByVal sArray() As String)
        ' ReDim takes the upper bound, not the element count.
        ReDim _strings(sArray.Length - 1)
        For i = 0 To sArray.Length - 1
            _strings(i) = sArray(i)
        Next
 
    End Sub
    Function GetEnumerator() As IEnumerator _
            Implements IEnumerable.GetEnumerator
        Return New StringEnum(_strings)
    End Function
End Class
 
Public Class StringEnum
    Implements IEnumerator
    Private _strings() As String
    ' Enumerators are positioned before the first element
    ' until the first MoveNext() call.
    Private position As Integer = -1
    Public ReadOnly Property Current() As Object _
            Implements System.Collections.IEnumerator.Current
        Get
            Try
                Console.WriteLine("about to return {0}", _
                                  _strings(position))
                Return _strings(position)
            Catch e As IndexOutOfRangeException
                Throw New InvalidOperationException()
            End Try
        End Get
    End Property
 
    Public Function MoveNext() As Boolean _
            Implements System.Collections.IEnumerator.MoveNext
        position = position + 1
        Return (position < _strings.Length)
    End Function
 
    Public Sub Reset() _
            Implements System.Collections.IEnumerator.Reset
        position = -1
    End Sub
    Sub New(ByVal list() As String)
        _strings = list
    End Sub
End Class
 
Module Module1
    Sub Main()
        Dim sa() As String = {"aaa", "bbb", "ccc"}
        Dim p As MyListOfStrings = New MyListOfStrings(sa)
        For Each s In p
            Console.WriteLine(s)
        Next
    End Sub
End Module
 

Using the yield return statement in C# 3.0, the functionally equivalent program is as follows:

using System;
using System.Collections.Generic;

class Program
{
    public static IEnumerable<string> MyListOfStrings(string[] sa)
    {
        foreach (var s in sa)
        {
            Console.WriteLine("about to yield return");
            yield return s;
        }
    }
 
    static void Main(string[] args)
    {
        string[] sa = new[] {
            "aaa",
            "bbb",
            "ccc"
        };
 
        foreach (string s in MyListOfStrings(sa))
            Console.WriteLine(s);
    }
}
 

As you can see, this is significantly easier.

This isn't as magical as it looks.  When you use the yield keyword, the compiler automatically generates an enumerator class that keeps the current state of the iteration.  The enumerator can be in one of four states: before, running, suspended, and after.  The class has MoveNext and Reset methods and a Current property, although the generated Reset simply throws NotSupportedException.  When you iterate through a collection that is implemented using yield return, each step calls MoveNext, which runs the iterator block up to the next yield return and then suspends it.  The implementation of iterator blocks is fairly involved; a technical discussion can be found in the C# language specification.
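
To give a feel for what that generated class does, here is a rough hand-written equivalent in Visual Basic.  It is only a sketch: the class name StringIterator, the IteratorState enum, and the field names are all invented for this illustration, and the code a real C# compiler emits is laid out differently.  What it does show is the four states and how MoveNext moves between them.

```vb
Imports System
Imports System.Collections
Imports System.Collections.Generic

' Hand-written stand-in for a compiler-generated enumerator class.
' All names here are illustrative, not what a compiler actually emits.
Public Class StringIterator
    Implements IEnumerable(Of String), IEnumerator(Of String)

    Private Enum IteratorState
        Before
        Running
        Suspended
        After
    End Enum

    Private ReadOnly _source() As String
    Private _state As IteratorState = IteratorState.Before
    Private _index As Integer = -1
    Private _current As String

    Public Sub New(ByVal source() As String)
        _source = source
    End Sub

    Public Function MoveNext() As Boolean _
            Implements IEnumerator.MoveNext
        If _state = IteratorState.After Then Return False
        _state = IteratorState.Running
        _index += 1
        If _index < _source.Length Then
            Console.WriteLine("about to yield return")
            _current = _source(_index)
            _state = IteratorState.Suspended   ' plays the role of yield return
            Return True
        End If
        _state = IteratorState.After
        Return False
    End Function

    Public ReadOnly Property Current() As String _
            Implements IEnumerator(Of String).Current
        Get
            Return _current
        End Get
    End Property

    Private ReadOnly Property CurrentObject() As Object _
            Implements IEnumerator.Current
        Get
            Return _current
        End Get
    End Property

    Public Sub Reset() Implements IEnumerator.Reset
        Throw New NotSupportedException()
    End Sub

    Public Sub Dispose() Implements IDisposable.Dispose
        _state = IteratorState.After
    End Sub

    ' Simplification: always hand out a fresh enumerator.
    Public Function GetEnumerator() As IEnumerator(Of String) _
            Implements IEnumerable(Of String).GetEnumerator
        Return New StringIterator(_source)
    End Function

    Private Function GetEnumeratorObject() As IEnumerator _
            Implements IEnumerable.GetEnumerator
        Return GetEnumerator()
    End Function
End Class

Module IteratorDemo
    Sub Main()
        Dim sa() As String = {"aaa", "bbb", "ccc"}
        For Each s As String In New StringIterator(sa)
            Console.WriteLine(s)
        Next
    End Sub
End Module
```

Like the C# program above, this prints "about to yield return" before each of the three strings.  Each item is produced only when For Each calls MoveNext, so this version is lazy in the same sense as the iterator block.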

Implementing Iterators in Visual Basic

So what’s a Visual Basic developer to do?

One option is to code your iterator in C#, and then use the function from Visual Basic.  This is easy and straightforward, provided you know C#.

But what I recommend in most situations is to not worry about making your extension method lazy.  Coding a non-lazy extension method is easy, and for many scenarios, laziness just doesn’t matter.  I’ve written many little programs that include queries into Open XML documents, and for just about all of them, it would not matter if one of my extension methods weren’t lazy.  It would be a different matter if all extension methods materialized intermediate results, but most of the extension methods in the framework are lazy already, and as Visual Basic developers we enjoy the benefits of that.  So I’d recommend a strategy of first coding your extension method in a non-lazy fashion, and seeing if you really have any performance issues.  If you do, then you can revisit the extension method, and either implement an iterator in VB, or write it in C# using an iterator block.

This is the approach that I’ll take when implementing the GroupAdjacent extension method later in this tutorial.
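
As an illustration of how little code the non-lazy approach needs, here is a sketch of the "skip the last n items" idea mentioned at the start of this topic.  The module name and the method name SkipLastItems are hypothetical, chosen for this example; this is one possible way to write it, not the code from the program I mentioned.

```vb
Imports System
Imports System.Collections.Generic
Imports System.Runtime.CompilerServices

Public Module EagerExtensions
    ' Hypothetical extension method: returns every item of source except
    ' the last count items.  Not lazy: it copies source into a list first.
    <Extension()> _
    Public Function SkipLastItems(Of T)(ByVal source As IEnumerable(Of T), _
                                        ByVal count As Integer) _
                                        As IEnumerable(Of T)
        If source Is Nothing Then Throw New ArgumentNullException("source")
        ' Materializes the whole sequence up front.
        Dim items As New List(Of T)(source)
        ' Clamp so that negative or oversized counts are harmless.
        Dim keep As Integer = Math.Max(0, items.Count - Math.Max(0, count))
        Return items.GetRange(0, keep)
    End Function
End Module

Module EagerDemo
    Sub Main()
        Dim words() As String = {"aaa", "bbb", "ccc", "ddd"}
        For Each w As String In words.SkipLastItems(1)
            Console.WriteLine(w)
        Next
    End Sub
End Module
```

This prints aaa, bbb, and ccc.  The trade-off is that the entire source is read into memory before the first item is returned, which is fine for small collections and is the thing to revisit if performance becomes an issue.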
