Let’s try out our little cache. First I want to write a synchronous version of it as a baseline.

    Private Shared Sub TestSync(ByVal sites() As String, ByVal sitesToDownload As Integer, ByVal howLong As Integer)
        Dim syncCache As New Dictionary(Of String, String)
        Dim count = sites.Count()
        Dim url1 = "http://moneycentral.msn.com/investor/invsub/results/statemnt.aspx?Symbol="

        For i = 0 To sitesToDownload - 1
            Dim html As String = ""
            Dim url = url1 & sites(i Mod count)
            If Not syncCache.TryGetValue(url, html) Then
                html = LoadWebPage(url)
                syncCache(url) = html
            End If
            DoWork(html, howLong)
        Next
    End Sub

This loop loads each web page into the cache if it is not already there. sites is a list of tickers used to compose the URLs; sitesToDownload is the total number of page requests, so a single URL can be requested multiple times; howLong is the time, in milliseconds, spent simulating work on each loaded page.

In this version the cache is simply a Dictionary and there is no parallelism. The cache is managed in two places: the TryGetValue check and the syncCache(url) = html assignment.

DoWork just sleeps for howLong milliseconds to simulate processing the page.

    Public Shared Sub DoWork(ByVal html As String, ByVal howLong As Integer)
        Thread.Sleep(howLong)
    End Sub

Let’s take a look at the asynchronous version.

    Private Shared Sub TestAsync(ByVal sites() As String, ByVal sitesToDownload As Integer, ByVal howLong As Integer)
        Dim htmlCache As New HtmlCache
        Dim count = sites.Count()
        Dim url = "http://moneycentral.msn.com/investor/invsub/results/statemnt.aspx?Symbol="
        Using ce = New CountdownEvent(sitesToDownload)
            For i = 1 To sitesToDownload
                htmlCache.GetHtmlAsync(
                    url & sites(i Mod count),
                    Sub(s)
                        DoWork(s, howLong)
                        ce.Signal()
                    End Sub)
            Next
            ce.Wait()
        End Using
    End Sub

There are several points worth making on this:

  • The lambda passed as the second parameter to GetHtmlAsync is invoked on a different thread once the html has been retrieved (which could be immediately if the cache has already downloaded the url).
  • CountdownEvent allows a thread to wait until a given number of signals have been sent. The waiting happens on the main thread, in the ce.Wait() call. The signaling happens in the lambda described in the point above (the ce.Signal() call).
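
If you have not used CountdownEvent before, here is a minimal standalone sketch of the same wait-for-N-signals pattern. ThreadPool.QueueUserWorkItem stands in for GetHtmlAsync, and the count of 5 is arbitrary; both are there only for illustration.

    Imports System
    Imports System.Threading

    Module CountdownDemo
        Sub Main()
            ' Hypothetical number of callbacks to wait for.
            Const workItems As Integer = 5

            Using ce = New CountdownEvent(workItems)
                For i = 1 To workItems
                    ' Copy the loop variable so each lambda captures its own value.
                    Dim id = i
                    ThreadPool.QueueUserWorkItem(
                        Sub(state)
                            Thread.Sleep(20)                        ' stand-in for DoWork
                            Console.WriteLine("item {0} done", id)
                            ce.Signal()                             ' decrement the remaining count
                        End Sub)
                Next
                ' Blocks the main thread until Signal has been called workItems times.
                ce.Wait()
            End Using

            Console.WriteLine("all items done")
        End Sub
    End Module

The "item ... done" lines can appear in any order, but "all items done" is always printed last, because Wait does not return until the count reaches zero.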

This is the driver for the overall testing.

    Private Shared Sub TestPerf(ByVal s As String, ByVal a As Action, ByVal iterations As Integer)
        Dim clock As New Stopwatch

        clock.Start()
        For i = 1 To iterations
            a()
        Next
        clock.Stop()
        Dim ts = clock.Elapsed
        ' Integer division (\) keeps the hundredths field in the 00-99 range.
        Dim elapsedTime = String.Format("{0}: {1:00}:{2:00}:{3:00}.{4:00}", s, ts.Hours, ts.Minutes, ts.Seconds, ts.Milliseconds \ 10)
        Console.WriteLine(elapsedTime)
    End Sub

There is not much to say about it. Start the clock, perform a bunch of iterations of the passed lambda, stop the clock, print out performance.

And finally the main method. Note that all the adjustable parameters are factored out before the calls to TestPerf.

    Public Shared Sub Main()
        Dim tickers = New String() {"mmm", "aos", "shlm", "cas", "abt", "anf", "abm", "akr", "acet", "afl", "agl", "adc", "apd",
                                    "ayr", "alsk", "ain", "axb", "are", "ale", "ab", "all"}
        Dim sitesToDownload = 50
        Dim workToDoOnEachUrlInMilliSec = 20
        Dim perfIterations = 5

        TestPerf("Async", Sub() TestAsync(tickers, sitesToDownload, workToDoOnEachUrlInMilliSec), perfIterations)
        TestPerf("Sync", Sub() TestSync(tickers, sitesToDownload, workToDoOnEachUrlInMilliSec), perfIterations)
    End Sub

Feel free to change tickers, sitesToDownload, workToDoOnEachUrlInMilliSec and perfIterations. Depending on the ratios between these parameters and the number of cores on your machine, you're going to see different results. This highlights the fact that parallelizing your algorithms may or may not yield performance gains, depending on both software and hardware considerations. I get a ~3X improvement on my box. I attached the full source file for your amusement.
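
To get a feel for what the default parameters mean, here is a small back-of-the-envelope sketch. It only does arithmetic on the values used in Main (21 tickers, 50 requests, 20 ms of work per page); it does not measure anything, and the module name is made up for the example.

    Imports System
    Imports System.Linq

    Module WorkloadEstimate
        Sub Main()
            Dim tickerCount = 21                    ' number of entries in the tickers array
            Dim sitesToDownload = 50
            Dim workToDoOnEachUrlInMilliSec = 20

            ' Distinct urls requested by the sync loop (i runs from 0 to sitesToDownload - 1).
            Dim distinctUrls = Enumerable.Range(0, sitesToDownload).
                                   Select(Function(i) i Mod tickerCount).
                                   Distinct().
                                   Count()
            Dim cacheHits = sitesToDownload - distinctUrls
            Dim sequentialWorkMs = sitesToDownload * workToDoOnEachUrlInMilliSec

            Console.WriteLine("downloads per run: {0}", distinctUrls)                  ' 21
            Console.WriteLine("cache hits per run: {0}", cacheHits)                    ' 29
            Console.WriteLine("sequential DoWork per run: {0} ms", sequentialWorkMs)   ' 1000 ms
        End Sub
    End Module

So with the defaults, each sync run pays for 21 downloads plus a full second of DoWork, all back to back. Any gain in the async version presumably comes from overlapping those two costs, which is why changing the ratio between them changes the result.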