
I'm a VB at PDC

Hello from PDC!

 


The VB session was here: http://microsoftpdc.com/Sessions/FT32 - I'll post the code from it shortly, and let you know when the video goes live. The talk covered many of the new features in VB10, and showed a Silverlight4 application written in VB which fetched Twitter feeds...

gduncan411 (Greg Duncan): #vb10 Rocks!  :)
jwk4946 (Jeff Keys): #vb10 rocks #pdc09
BenRoy (Ben Roy) - Twitter is hot at #pdc09. This sesion using #vb10 to pull stuff into Silverlight
jimseiwert (Jim Seiwert): Death to the underscore #vb10 #pdc09
itrainulearn (ITrain (Malaysia)): wassap #vb10 looking good #pdc09
athman20 (athman): :)???? anything new?

I have a limited number of "IM A VB" nametapes left over, as pictured above -- if you'd like one, email me (lwischik@microsoft.com) and tell me your address and if I still have any left then I'll post one over. They look cool on army surplus shirts!

It was good to meet Miguel (of Mono fame) in person. I've personally been experimenting with Mono at home for a while. I was abashed to realize that the Mono I'd been experimenting with was about two years old now, which could explain a lot of its quirks! Anyway, I told him what I tell everyone: do feel free to email me whenever you have questions about the VB specification or compiler.

Posted by ljw1004 | 0 Comments

I'm a VB

Check out the license plates on my new motorcycle!

Posted by ljw1004 | 0 Comments

Web-scraping with VB's XML support

There was an interesting article about using VB's XML support for generating HTML: http://www.infoq.com/news/2009/02/MVC-VB.

I've been using VB and XML for the reverse purpose -- scraping web pages to retrieve information. I enjoy sailing, and I wanted to find historical data on windspeeds to know when would be the best time of year to set out on a long trip. (Answer: March and April have the best winds around Seattle).

I found an excellent site to scrape from, http://www.almanac.com/, which has historical weather data for many places around the country. The first step in scraping is copyright law. Facts alone are not copyrightable, but the act of selecting and compiling facts can be a creative work, in which case the compilation is copyrightable. Hence, for instance, a telephone directory may be protected by copyright in some jurisdictions. So too may almanac.com's compilation. And that's why I only scraped their pages for my own personal use.

The almanac has URLs like this: http://www.almanac.com/weatherhistory/oneday.php?number=994014&wban=99999&day=1&month=4&year=2008&searchtype=. It's easy to see what the format is, and generate similar URLs myself.

 

The code to parse XHTML:

I looked at the HTML source code of a page from the almanac in Notepad, figured out its structure, and wrote some simple XML queries to dig into it. (Note: the function "Fetch" fetches HTML pages from the web, but converts them into XHTML ready for VB XML queries. More on that later). Here's the VB code. I highlighted the XML queries.

 

Option Strict On
Imports System.Net
Imports System.IO
Imports <xmlns:xhtml="http://www.w3.org/1999/xhtml">

Module Module1

    Dim Places As Integer() = {994014}
    Dim Years As Integer() = {2008}
    Dim Months As Integer() = {4, 5}

    Sub Main()
        Console.WriteLine("{1}{0}{2}{0}{3}{0}{4}{0}{5}{0}{6}{0}{7}{0}{8}", vbTab, "Date (Y/M/D)", "Location", "Temp (^F)", "Precipitation (in)", "Visibility (miles)", "Wind Mean (mph)", "Wind Sustained (mph)", "Wind Gust (mph)")
        For Each year As Integer In Years
            For Each month As Integer In Months
                Dim d = New DateTime(year, month, 1)
                Dim dnm = New DateTime(If(d.Month = 12, d.Year + 1, d.Year), If(d.Month = 12, 1, d.Month + 1), d.Day)
                Dim lastDay = CInt((dnm - d).TotalDays)
                For day As Integer = 1 To lastDay
                    For Each place As Integer In Places
                        Dim url = String.Format("http://www.almanac.com/weatherhistory/oneday.php?number={0}&wban=99999&day={1}&month={2}&year={3}&searchtype=", place, day, month, year)
                        Dim fn = Fetch(url)
                        Dim xml = XElement.Load(fn)
                        Dim body = (From i In xml...<xhtml:div> Where i.GetAttr("class") = "yui-u first").FirstOrDefault
                        If body Is Nothing Then Continue For
                        Dim title = body.<xhtml:h2>.Value.ToString.Replace(",", " ")
                        If title.ToLower.StartsWith("no data") Then Continue For
                        Dim temp, precipitation, visibility, windMean, windSustained, windGust As Double?
                        Dim data = From i In body...<xhtml:td>
                        For Each td In data
                            Dim text = td.<xhtml:p>.FirstOrDefault
                            If text Is Nothing Then Continue For
                            Dim category = text.Value.Replace(vbCrLf, " ").Replace(vbCr, " ").Replace(vbLf, " ").ToLower
                            text = td.<xhtml:b>.FirstOrDefault
                            If text Is Nothing Then Continue For
                            Dim svalue = text.Value.Replace(vbCrLf, " ").Replace(vbCr, " ").Replace(vbLf, " ").ToLower
                            Dim value = 0.0 : If Not Double.TryParse(svalue, value) Then Continue For
                            If category Like "mean temperature" Then temp = value
                            If category Like "total precipitation" Then precipitation = value
                            If category Like "visibility" Then visibility = value
                            If category Like "mean wind speed" Then windMean = value
                            If category Like "maximum sustained" Then windSustained = value
                            If category Like "maximum gust" Then windGust = value
                        Next
                        Dim s = String.Format("{0:0000}/{1:00}/{2:00}", year, month, day)
                        Console.WriteLine("{1}{0}{2}{0}{3}{0}{4}{0}{5}{0}{6}{0}{7}{0}{8}", vbTab, s, title, temp, precipitation, visibility, windMean, windSustained, windGust)
                    Next
                Next
            Next
        Next
    End Sub
End Module
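To try the XML queries in isolation, without fetching anything, here is a minimal sketch. The XHTML fragment is made up to resemble the structure described above (it is not real almanac.com markup), and the names SamplePage and ReadValue are hypothetical. It reads the class attribute through XElement.Attribute rather than the GetAttr helper, so that it stands alone.

```vb
Imports System
Imports System.Globalization
Imports System.Linq
Imports System.Xml.Linq
Imports <xmlns:xhtml="http://www.w3.org/1999/xhtml">

Module XmlQuerySketch

    ' A made-up fragment shaped like the pages described above; it is not real almanac.com markup.
    Function SamplePage() As XElement
        Return <xhtml:html>
                   <xhtml:body>
                       <xhtml:div class="yui-u first">
                           <xhtml:h2>Seattle, WA</xhtml:h2>
                           <xhtml:table>
                               <xhtml:tr>
                                   <xhtml:td><xhtml:p>Mean Wind Speed</xhtml:p><xhtml:b>9.2</xhtml:b></xhtml:td>
                                   <xhtml:td><xhtml:p>Maximum Gust</xhtml:p><xhtml:b>21.9</xhtml:b></xhtml:td>
                               </xhtml:tr>
                           </xhtml:table>
                       </xhtml:div>
                   </xhtml:body>
               </xhtml:html>
    End Function

    ' Returns the number in the <b> of the first <td> whose <p> text matches the category, or Nothing.
    Function ReadValue(ByVal page As XElement, ByVal category As String) As Double?
        ' Descendant axis (...) finds every <div>; keep the one with the class we want.
        Dim body = (From i In page...<xhtml:div> Where CStr(i.Attribute("class")) = "yui-u first").FirstOrDefault
        If body Is Nothing Then Return Nothing
        For Each td In body...<xhtml:td>
            ' Child axis (.) picks the <p> label and the <b> value inside this cell.
            If String.Equals(td.<xhtml:p>.Value, category, StringComparison.OrdinalIgnoreCase) Then
                Dim value As Double
                If Double.TryParse(td.<xhtml:b>.Value, NumberStyles.Float, CultureInfo.InvariantCulture, value) Then Return value
            End If
        Next
        Return Nothing
    End Function

    Sub Main()
        Console.WriteLine(ReadValue(SamplePage(), "mean wind speed"))
    End Sub
End Module
```

The sketch parses numbers with the invariant culture so that "9.2" reads the same on any machine; the program above uses the current culture instead.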

 

Fetching pages: HTML into XHTML

Goal: to use VB's XML support for reading the web page. That's because VB has such nice syntax (I find it easier than xpath, or beautiful soup, or the alternatives). The problem is that most web-pages are written in a sloppy kind of HTML that might render okay but certainly can't be loaded into XElement.Load.

Solution: download Tidy, an awesome open-source library and executable for, well, tidying HTML into proper XHTML. I downloaded "tidy.exe" and put it into my windows directory, so I could execute it without messing around with the path.

The above code calls a function "Fetch". This is the one that fetches pages, and invokes "tidy" to clean up the html. Here is the implementation of Fetch. It uses a function "InputAndOutputToEnd" to redirect input and output of tidy.exe when it runs it. I wrote about InputAndOutputToEnd last month.

 

Module Helpers

    ''' <summary>
    ''' GetAttr: x.GetAttr("attr") is equivalent to x.@attr. It's here to work around a MONO bug: MONO
    ''' will throw an exception on x.@attr if the attribute is absent; the CLR doesn't. This function
    ''' also doesn't throw.
    ''' </summary>
    <System.Runtime.CompilerServices.Extension()> Function GetAttr(ByVal e As XElement, ByVal attr As String) As String
        If e Is Nothing Then Return ""
        For Each a In e.Attributes
            If String.Compare(attr, a.Name.LocalName, True) = 0 Then Return a.Value
        Next
        Return ""
    End Function

 

    ''' <summary>
    ''' Fetch: this function fetches the given Url and saves it into a cache in a temporary directory.
    ''' It returns the filename. If the Url had given back "text/html", then this function invokes
    ''' "tidy.exe" (from http://tidy.sourceforge.net/) to turn the html into valid XHTML such as can
    ''' be read with XElement.Load. The function will throw an exception if anything bad happened,
    ''' e.g. WebException or UriFormatException. If asked to fetch a url but this url had already been downloaded
    ''' previously, and the previous download was no more than "CacheAtLeastDays" old and hadn't
    ''' been deleted, then the previous download is used. The idea is that our program might well hammer
    ''' web-services, and we don't want to be too cruel on them, so even if they didn't specify caching
    ''' for a page then we might still want to cache it. (If the webservice specified a cache longer than
    ''' CacheAtLeastDays, then any number of internet proxies along the way might cache it, and so
    ''' CacheAtLeastDays is a minimum rather than a maximum.) This function is not protected against
    ''' multiple threads calling it. There might be contention if multiple threads call it and try to
    ''' download and write to the same file. Note: in the cache, URLs are escaped then truncated to 240
    ''' characters. So if they were longer than that (e.g. long query strings) then there'll be cache
    ''' conflicts and the wrong data might be returned.
    ''' </summary>
    Function Fetch(ByVal Url As String, Optional ByVal CacheAtLeastDays As Double = 7) As String
        Dim dir = IO.Path.GetTempPath & My.Application.Info.AssemblyName & "\fetch"
        If Not Directory.Exists(dir) Then Directory.CreateDirectory(dir)
        ' Note: if the directory already existed, then CreateDirectory just proceeds silently without fuss.

        Dim fn = dir & "\" & Uri.EscapeDataString(Url.Replace("http://", "").Replace("/", "_")).Replace("%", "#")
        ' MONO: If you try to XElement.Load(fn) where fn includes %escapes, then it tries to unescape them.
        ' So we make sure there are no %escapes in the filename.  (CLR doesn't have this quirk.)

        fn = fn.Substring(0, Math.Min(240, fn.Length))
        ' MONO on unix: is fine so long as every directory/filename component is <=240 characters.
        ' CLR on windows: requires the entire path "fn" to be <=240 characters.
        ' http://blogs.msdn.com/bclteam/archive/2007/02/13/long-paths-in-net-part-1-of-3-kim-hamilton.aspx

        If File.Exists(fn) Then
            Dim age = DateTime.Now - File.GetLastWriteTime(fn)
            If age.TotalDays <= CacheAtLeastDays Then Return fn
            File.Delete(fn)
        End If

        Dim x = WebRequest.Create(Url)
        Using r = x.GetResponse
            Dim t = ""
            Using rs As New StreamReader(r.GetResponseStream)
                t = rs.ReadToEnd
            End Using
            If Not r.ContentType.StartsWith("text/html") Then
                My.Computer.FileSystem.WriteAllText(fn, t, False, Text.Encoding.UTF8)
                Return fn
            End If
            Using tidy As New System.Diagnostics.Process
                Dim cmd = "tidy"
                Dim args = "-asxml -numeric -quiet --doctype omit"
                ' MONO: XElement.Load throws an exception if DOCTYPE is present. CLR doesn't. Hence we omit the DOCTYPE.
                tidy.StartInfo.FileName = cmd
                tidy.StartInfo.Arguments = args
                tidy.StartInfo.UseShellExecute = False
                tidy.StartInfo.RedirectStandardInput = True
                tidy.StartInfo.RedirectStandardOutput = True
                tidy.StartInfo.RedirectStandardError = True
                tidy.Start()
                Dim err = "", op = ""
                tidy.InputAndOutputToEnd(t, op, err)
                tidy.WaitForExit(5000)
                If tidy.HasExited Then
                    ' We had already asked ("-numeric") for tidy to escape non-ascii characters. But
                    ' nonetheless, XElement.Load will throw an exception if there are any, and we really
                    ' don't want that, so we'll do belt-and-braces here:
                    Dim op2 As New Text.StringBuilder(op.Length)
                    For i = 0 To op.Length - 1
                        Dim c = AscW(op(i))
                        If (c >= 32 AndAlso c < 127) OrElse c = 13 OrElse c = 10 OrElse c = 9 Then
                            op2.Append(op(i))
                        End If
                    Next
                    My.Computer.FileSystem.WriteAllText(fn, op2.ToString, False, Text.Encoding.ASCII)
                    Return fn
                End If
                tidy.Kill()
                tidy.WaitForExit(2000)
            End Using
        End Using
        Return ""
    End Function

 

 

    ''' <summary>
    ''' InputAndOutputToEnd: Given a started process, this lets you supply a string as input if you want,
    ''' and will read all output and error to the end. This function has no timeout: if we give it an input string
    ''' but the process fails to read it to completion, or if we ask for standard-output/error but the process
    ''' fails to close these streams, then the function will block indefinitely. The function will throw
    ''' an exception if there was an error reading from the streams. The caller is expected to have started
    ''' the process before calling the function, and the caller is expected to wait for the process to close
    ''' and to dispose of it afterwards. If the caller uses this function, then the caller should do no
    ''' other input/output to the process.
    ''' </summary>
    <Runtime.CompilerServices.Extension()> Sub InputAndOutputToEnd(ByVal p As Diagnostics.Process, ByVal StandardInput As String, ByRef StandardOutput As String, ByRef StandardError As String)
        If p Is Nothing Then Throw New ArgumentException("process must be non-null", "p")
        ' Assume p has started. Alas there's no way to check.
        If p.StartInfo.UseShellExecute Then Throw New ArgumentException("Set StartInfo.UseShellExecute to false")
        If (p.StartInfo.RedirectStandardInput <> (StandardInput IsNot Nothing)) Then Throw New ArgumentException("Provide a non-null Input only when StartInfo.RedirectStandardInput")
        If (p.StartInfo.RedirectStandardOutput <> (StandardOutput IsNot Nothing)) Then Throw New ArgumentException("Provide a non-null Output only when StartInfo.RedirectStandardOutput")
        If (p.StartInfo.RedirectStandardError <> (StandardError IsNot Nothing)) Then Throw New ArgumentException("Provide a non-null Error only when StartInfo.RedirectStandardError")
        '
        ' MSDN notes, http://msdn.microsoft.com/en-us/library/system.diagnostics.processstartinfo.redirectstandardoutput.aspx,
        ' that "Synchronous read operations introduce a dependency between the caller reading from the StandardOutput stream
        ' and the child process writing to that stream. These dependencies can cause deadlock conditions." We avoid the deadlock
        ' by running in a separate thread.
        '
        Dim outputData As New InputAndOutputToEndData
        Dim errorData As New InputAndOutputToEndData
        '
        If p.StartInfo.RedirectStandardOutput Then
            outputData.Stream = p.StandardOutput
            outputData.Thread = New Threading.Thread(AddressOf InputAndOutputToEndProc)
            outputData.Thread.Start(outputData)
        End If
        If p.StartInfo.RedirectStandardError Then
            errorData.Stream = p.StandardError
            errorData.Thread = New Threading.Thread(AddressOf InputAndOutputToEndProc)
            errorData.Thread.Start(errorData)
        End If
        '
        If p.StartInfo.RedirectStandardInput Then
            p.StandardInput.Write(StandardInput)
            p.StandardInput.Close()
        End If
        '
        If p.StartInfo.RedirectStandardOutput Then outputData.Thread.Join() : StandardOutput = outputData.Output
        If p.StartInfo.RedirectStandardError Then errorData.Thread.Join() : StandardError = errorData.Output
        If outputData.Exception IsNot Nothing Then Throw outputData.Exception
        If errorData.Exception IsNot Nothing Then Throw errorData.Exception
    End Sub

    Private Class InputAndOutputToEndData
        Public Thread As Threading.Thread
        Public Stream As IO.StreamReader
        Public Output As String
        Public Exception As Exception
    End Class

    Private Sub InputAndOutputToEndProc(ByVal data_ As Object)
        Dim data = DirectCast(data_, InputAndOutputToEndData)
        Try : data.Output = data.Stream.ReadToEnd : Catch e As Exception : data.Exception = e : End Try
    End Sub

End Module

 

 

Posted by ljw1004 | 1 Comments

System.Diagnostics.Process: redirect StandardInput, StandardOutput, StandardError

Sometimes you want to launch an external utility and send input to it and also capture its output. But it's easy to run into deadlock this way...

' BAD CODE
Using p As New System.Diagnostics.Process
    p.StartInfo.FileName = "cat"
    p.StartInfo.UseShellExecute = False
    p.StartInfo.RedirectStandardOutput = True
    p.StartInfo.RedirectStandardInput = True
    p.Start()
    p.StandardInput.Write("world" & vbCrLf & "hello")
    ' deadlock here if p needs to write more than 12k to StandardOutput
    p.StandardInput.Close()
    Dim op = p.StandardOutput.ReadToEnd()
    p.WaitForExit()
    p.Close()
    Console.WriteLine("OUTPUT:") : Console.WriteLine(op)
End Using

The deadlock in this case arises because "cat" (a standard unix utility) first reads from StandardInput, then writes to StandardOutput, then reads again, and so on until there's nothing left to read. But if its StandardOutput fills up with no one to read it, then it can't write any more, and blocks.

The number "12k" is arbitrary and I wouldn't rely on it...

' BAD CODE
Using p As New System.Diagnostics.Process
    p.StartInfo.FileName = "findstr"
    p.StartInfo.UseShellExecute = False
    p.StartInfo.RedirectStandardOutput = True
    p.StartInfo.RedirectStandardError = True
    p.Start()
    ' deadlock here if p needs to write more than 12k to StandardError
    Dim op = p.StandardOutput.ReadToEnd()
    Dim err = p.StandardError.ReadToEnd()
    p.WaitForExit()
    Console.WriteLine("OUTPUT:") : Console.WriteLine(op)
    Console.WriteLine("ERROR:") : Console.WriteLine(err)
End Using

The MSDN documentation says, "You can use asynchronous read operations to avoid these dependencies and their deadlock potential. Alternately, you can avoid the deadlock condition by creating two threads and reading the output of each stream on a separate thread." So that's what we'll do...

Using threads to redirect without deadlock

' GOOD CODE: this will not deadlock.
Using p As New Diagnostics.Process
    p.StartInfo.FileName = "sort"
    p.StartInfo.UseShellExecute = False
    p.StartInfo.RedirectStandardOutput = True
    p.StartInfo.RedirectStandardInput = True
    p.Start()
    Dim op = ""
    ' do NOT WaitForExit yet since that would introduce deadlocks.
    p.InputAndOutputToEnd("world" & vbCrLf & "hello", op, Nothing)
    p.WaitForExit()
    p.Close()
    Console.WriteLine("OUTPUT:") : Console.WriteLine(op)
End Using

 

 

''' <summary>
''' InputAndOutputToEnd: a handy way to use redirected input/output/error on a process.
''' </summary>
''' <param name="p">The process to redirect. Must have UseShellExecute set to false.</param>
''' <param name="StandardInput">This string will be sent as input to the process. (must be Nothing if not StartInfo.RedirectStandardInput)</param>
''' <param name="StandardOutput">The process's output will be collected in this ByRef string. (must be Nothing if not StartInfo.RedirectStandardOutput)</param>
''' <param name="StandardError">The process's error will be collected in this ByRef string. (must be Nothing if not StartInfo.RedirectStandardError)</param>
''' <remarks>This function solves the deadlock problem mentioned at http://msdn.microsoft.com/en-us/library/system.diagnostics.process.standardoutput.aspx</remarks>
<Runtime.CompilerServices.Extension()> Sub InputAndOutputToEnd(ByVal p As Diagnostics.Process, ByVal StandardInput As String, ByRef StandardOutput As String, ByRef StandardError As String)
    If p Is Nothing Then Throw New ArgumentException("p must be non-null")
    ' Assume p has started. Alas there's no way to check.
    If p.StartInfo.UseShellExecute Then Throw New ArgumentException("Set StartInfo.UseShellExecute to false")
    If (p.StartInfo.RedirectStandardInput <> (StandardInput IsNot Nothing)) Then Throw New ArgumentException("Provide a non-null Input only when StartInfo.RedirectStandardInput")
    If (p.StartInfo.RedirectStandardOutput <> (StandardOutput IsNot Nothing)) Then Throw New ArgumentException("Provide a non-null Output only when StartInfo.RedirectStandardOutput")
    If (p.StartInfo.RedirectStandardError <> (StandardError IsNot Nothing)) Then Throw New ArgumentException("Provide a non-null Error only when StartInfo.RedirectStandardError")
    '
    Dim outputData As New InputAndOutputToEndData
    Dim errorData As New InputAndOutputToEndData
    '
    ' Each redirected stream gets its own reader thread, so neither pipe can fill up unread.
    If p.StartInfo.RedirectStandardOutput Then
        outputData.Stream = p.StandardOutput
        outputData.Thread = New Threading.Thread(AddressOf InputAndOutputToEndProc)
        outputData.Thread.Start(outputData)
    End If
    If p.StartInfo.RedirectStandardError Then
        errorData.Stream = p.StandardError
        errorData.Thread = New Threading.Thread(AddressOf InputAndOutputToEndProc)
        errorData.Thread.Start(errorData)
    End If
    '
    If p.StartInfo.RedirectStandardInput Then
        p.StandardInput.Write(StandardInput)
        p.StandardInput.Close()
    End If
    '
    If p.StartInfo.RedirectStandardOutput Then outputData.Thread.Join() : StandardOutput = outputData.Output
    If p.StartInfo.RedirectStandardError Then errorData.Thread.Join() : StandardError = errorData.Output
    If outputData.Exception IsNot Nothing Then Throw outputData.Exception
    If errorData.Exception IsNot Nothing Then Throw errorData.Exception
End Sub

Private Class InputAndOutputToEndData
    Public Thread As Threading.Thread
    Public Stream As IO.StreamReader
    Public Output As String
    Public Exception As Exception
End Class

Private Sub InputAndOutputToEndProc(ByVal data_ As Object)
    Dim data = DirectCast(data_, InputAndOutputToEndData)
    Try : data.Output = data.Stream.ReadToEnd : Catch e As Exception : data.Exception = e : End Try
End Sub
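
The MSDN passage quoted earlier also mentions asynchronous read operations as an alternative to threads. Here is a minimal sketch of that route, not a replacement for the function above. It assumes VB10 multi-line lambdas, and the name RunWithAsyncReads is hypothetical. It relies on Process.BeginOutputReadLine and Process.BeginErrorReadLine, which deliver output line by line through events, so line endings are normalized by AppendLine.

```vb
Imports System
Imports System.Diagnostics
Imports System.Text

Module AsyncRedirectSketch

    ' Runs a command, sends it the given input, and returns its standard output.
    ' Standard error is returned through errorText. RunWithAsyncReads is a hypothetical name.
    Function RunWithAsyncReads(ByVal fileName As String, ByVal input As String, ByRef errorText As String) As String
        Dim output As New StringBuilder
        Dim errors As New StringBuilder
        Using p As New Process
            p.StartInfo.FileName = fileName
            p.StartInfo.UseShellExecute = False
            p.StartInfo.RedirectStandardInput = True
            p.StartInfo.RedirectStandardOutput = True
            p.StartInfo.RedirectStandardError = True
            ' Each event carries one line; e.Data is Nothing when the stream closes.
            AddHandler p.OutputDataReceived,
                Sub(sender As Object, e As DataReceivedEventArgs)
                    If e.Data IsNot Nothing Then output.AppendLine(e.Data)
                End Sub
            AddHandler p.ErrorDataReceived,
                Sub(sender As Object, e As DataReceivedEventArgs)
                    If e.Data IsNot Nothing Then errors.AppendLine(e.Data)
                End Sub
            p.Start()
            ' Start draining both streams before writing any input, so neither pipe can fill up.
            p.BeginOutputReadLine()
            p.BeginErrorReadLine()
            p.StandardInput.Write(input)
            p.StandardInput.Close()
            ' The parameterless WaitForExit also waits for the asynchronous readers to reach end-of-stream.
            p.WaitForExit()
        End Using
        errorText = errors.ToString()
        Return output.ToString()
    End Function

    Sub Main()
        Dim err = ""
        Dim op = RunWithAsyncReads("sort", "world" & vbCrLf & "hello", err)
        Console.WriteLine("OUTPUT:") : Console.WriteLine(op)
    End Sub
End Module
```

The thread-based version keeps the output byte-for-byte as one string; this one trades that for less plumbing.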

 

Posted by ljw1004 | 0 Comments

Romeo and Juliet and Windows Azure

1. Juliet sends a message "I'll take a drug which makes me look dead but I'm not really"
2. Romeo receives the message
3. Romeo finds Juliet looking dead, but knows she's not really dead
4. They live happily ever after

vs.

1. Juliet sends a message "I'll take a drug which makes me look dead but I'm not really"
                [the message is lost in a plague-related network outage]
2. Romeo never receives the message
3. Romeo finds Juliet looking dead, thinks she's dead, and kills himself
4. Juliet wakes to find Romeo dead and kills herself too.

Cloud computing is about distributed protocols with message-passing, usually in XML. As VB developers, you'll be the ones responsible for saving Romeo and Juliet. How? The quick answer is "make your operations idempotent". For the long answer, read on...

Every distributed protocol has a window of vulnerability, where one party can't be sure that the other party has received the message. This is a fundamental law of computer science. I've been learning about Windows Azure. As a distributed architect, my first question is always going to be: "where are the windows of vulnerability?"

The Azure infrastructure

[Diagram: Azure Roles]

This diagram shows the basic Azure infrastructure. Everything is done by message-passing, and all parties are potentially distributed. When an end-user makes an HTTP request, it arrives at an instance of the WebRole, written in ASP.NET. The WebRole might make a synchronous request to fetch from a Table in the StorageService to help construct the page it gives back. I think I'm going to write my WebRoles entirely in VB, and have them return just XML literals. The WebRole might also synchronously deposit messages to the Queue in the StorageService in case further work needs to be done, e.g. generating thumbnails or submitting an expense claim. The message can be in any format.

Messages in the queue will be picked up by an instance of the WorkerRole. The WorkerRole is a DLL written by you in VB or C# or other .NET languages. Azure loads this DLL in a VM and invokes its StartReceiving method. You implement this method, usually by creating a thread which (in a loop) synchronously gets messages from the Queue and handles them appropriately, maybe by calling third-party web-services, or maybe by fetching data from blobs and writing them to other blobs.

The Azure framework might create multiple instances of WebRole or of WorkerRole to cope with load, each instance indistinguishable. It might also "freeze" a role and unfreeze it again on a new machine. It might also terminate a Role, by calling its Stop method. A WorkerRole can create and use temporary local storage on the machine it's running on, using the ILocalResource interface, but this local storage is discarded each time the WorkerRole is stopped or frozen.

NOTE: the current Azure community preview runs all of these things in just a single datacenter. The chance of message-loss between them is therefore vanishingly small. Things will only start to get interesting once Azure becomes geo-located. At that time, presumably you'll be able to choose in which datacenter each part of your service runs.

Each of the interactions sketched out above has a window of vulnerability. I'm going to focus just on the WorkerRole's message-loop. In the samples the message-loop is written like this:

1. WorkerRole sends a GET request to the StorageService queue.
                [Message A: GET message in transit from WorkerRole to Queue]
2. Queue receives the GET request
3. Queue figures which message to return, and marks it "invisible" for the next 120 seconds.
4. Queue sends back a "200 OK" status response along with the content of that message
                [Message B: OK message in transit from Queue to WorkerRole]
5. Worker receives the "200 OK" message
6. Worker processes the message
7. Worker sends a DELETE to the StorageService queue asking it to delete the message
                [Message C: DELETE message in transit from WorkerRole to Queue]
8. Queue receives the DELETE request
9. Queue deletes the message
10. Queue sends back a "200 OK" status response
                [Message D: OK message in transit from Queue to WorkerRole]
11. Worker receives the response.

Correctness

Now we have to do an exhaustive case analysis of every case where one of the four messages is lost, and also an exhaustive case analysis of every case where one of the two machines crashes in between steps. I'm going to focus on the case where the third message, Message "C", is lost in transit.

If message "C" is lost, then (step 8) the Queue will never receive a DELETE request, and (step 11) the Worker will never receive an "OK". After a delay the worker will time out waiting for the OK. At this point it could decide to retry, but the damage has already been done. (The decision of whether to retry is governed by a user-settable "RetryPolicy" property which defaults to RetryPolicies.NoRetry. The timeout is governed by a user-settable "Timeout" property which defaults to 30 seconds).

Probably by now the Queue's timeout of 120 seconds will have expired, and the Queue will mark the message in question as "Visible" again. Now this instance of the WorkerRole, or another instance, will pick up the work item again and process it. The end result is that the message will have been processed TWICE.
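
The lost-DELETE case can be simulated in a few lines. This is a minimal sketch with a simulated clock and a hypothetical in-memory FakeQueue; none of these names come from the Azure storage API. The only thing taken from the description above is the 120-second invisibility window.

```vb
Imports System
Imports System.Collections.Generic

Module VisibilityTimeoutSketch

    Class QueuedMessage
        Public Id As String
        Public Body As String
        Public InvisibleUntil As Double ' simulated clock, in seconds
    End Class

    ' Hypothetical in-memory queue; illustrative only, not the Azure storage API.
    Class FakeQueue
        Public Const VisibilityTimeoutSeconds As Double = 120
        Private ReadOnly messages As New List(Of QueuedMessage)

        Sub Put(ByVal id As String, ByVal body As String)
            messages.Add(New QueuedMessage With {.Id = id, .Body = body, .InvisibleUntil = 0})
        End Sub

        ' Steps 2-4: pick a visible message and mark it invisible for the next 120 seconds.
        Function TryGet(ByVal now As Double) As QueuedMessage
            For Each m In messages
                If m.InvisibleUntil <= now Then
                    m.InvisibleUntil = now + VisibilityTimeoutSeconds
                    Return m
                End If
            Next
            Return Nothing
        End Function

        ' Steps 8-9: delete the message for good.
        Sub Delete(ByVal id As String)
            messages.RemoveAll(Function(m) m.Id = id)
        End Sub
    End Class

    ' Polls the queue every 30 simulated seconds and returns how many times the one message was processed.
    Function CountProcessings(ByVal firstDeleteIsLost As Boolean) As Integer
        Dim queue As New FakeQueue
        queue.Put("msg-1", "generate thumbnail")
        Dim processed = 0
        Dim loseNextDelete = firstDeleteIsLost
        For now As Double = 0 To 300 Step 30
            Dim m = queue.TryGet(now)
            If m Is Nothing Then Continue For
            processed += 1                 ' step 6: do the work
            If loseNextDelete Then
                loseNextDelete = False     ' message "C" is lost: the queue never sees the DELETE
            Else
                queue.Delete(m.Id)         ' steps 7-9
            End If
        Next
        Return processed
    End Function

    Sub Main()
        Console.WriteLine(CountProcessings(False)) ' prints 1
        Console.WriteLine(CountProcessings(True))  ' prints 2
    End Sub
End Module
```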

For the sake of correctness, you have to structure your program so that whatever work you do in your WorkerRole (step 6), it doesn't matter if that work accidentally happens twice. Such work is called idempotent.

Some easy operations are already idempotent, e.g. generating a thumbnail of an image. It doesn't matter if the same thumbnail is generated twice and stored in a blob.

Other operations are harder to make idempotent, e.g. adding "+1" to a number stored in a blob. The storage APIs provide atomicity through HTTP/1.1 conditional headers, e.g. "Put this blob only if it has not been modified since <datestamp>", or "Put this blob only if its contents match <etag>". This is similar to Interlocked.CompareExchange in .NET. Out of this atomicity you will have to build idempotency.
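
One way to build idempotency out of that atomicity is to store, alongside the number, the ids of the messages already applied, and to write both back in a single conditional put. The following is a minimal sketch of that idea, not a recommendation from the Azure team. FakeBlob and PutIfMatch are hypothetical in-memory stand-ins for a blob and an etag-conditional PUT, and the sketch is single-threaded.

```vb
Imports System
Imports System.Collections.Generic

Module IdempotentIncrementSketch

    ' Hypothetical stand-in for a blob: a counter, the ids of messages already applied, and an etag.
    Class FakeBlob
        Public Count As Integer
        Public AppliedIds As New List(Of String)
        Public ETag As Integer

        ' Mimics "put this blob only if its contents match <etag>".
        Function PutIfMatch(ByVal expectedETag As Integer, ByVal newCount As Integer, ByVal newIds As List(Of String)) As Boolean
            If ETag <> expectedETag Then Return False ' someone else wrote first
            Count = newCount
            AppliedIds = newIds
            ETag += 1
            Return True
        End Function
    End Class

    ' Applies "+1" for the given message at most once, however many times it is called.
    Sub ApplyIncrement(ByVal blob As FakeBlob, ByVal messageId As String)
        Do
            ' Read a snapshot of the blob, remembering the etag it had at that moment.
            Dim etag = blob.ETag
            Dim count = blob.Count
            Dim ids = New List(Of String)(blob.AppliedIds)
            ' Already applied: a redelivered message becomes a no-op.
            If ids.Contains(messageId) Then Return
            ids.Add(messageId)
            ' Write back only if nobody changed the blob since we read it; otherwise loop and retry.
            If blob.PutIfMatch(etag, count + 1, ids) Then Return
        Loop
    End Sub

    Sub Main()
        Dim blob As New FakeBlob
        ApplyIncrement(blob, "msg-1")
        ApplyIncrement(blob, "msg-1") ' redelivered after a lost DELETE
        ApplyIncrement(blob, "msg-2")
        Console.WriteLine(blob.Count) ' prints 2
    End Sub
End Module
```

The list of applied ids grows without bound in this sketch; a real service would need some way to trim it.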

Performance

For performance tuning, what values should you use for the two timeouts? And what retry strategy should you use? At this stage I have no idea. I have an idea that the WorkerRole would store metrics on its historical performance in its local storage, via ILocalResource.

I'm especially eager to experiment with Azure programming using VB's XML literals to send data and construct web-pages. I have to stress that I'm just at the start of learning about Azure. I might have misunderstood parts of Azure, and parts of Azure might still change. If you have any corrections, questions or comments, then please post!

 

Posted by ljw1004 | 1 Comments

Where are the SDK tools? Where is ildasm?

C:\Program Files\Microsoft SDKs\Windows\v6.0\bin\ildasm.exe
C:\Program Files\Microsoft SDKs\Windows\v6.0A\bin\ildasm.exe
C:\Program Files\Microsoft SDKs\Windows\v7.0A\bin\ildasm.exe
C:\Program Files\Microsoft SDKs\Windows\v7.0A\bin\x64\ildasm.exe
C:\Program Files\Microsoft SDKs\Windows\v7.0A\bin\NETFX 4.0 Tools\ildasm.exe
C:\Program Files\Microsoft SDKs\Windows\v7.0A\bin\NETFX 4.0 Tools\x64\ildasm.exe
C:\Program Files (x86)\Microsoft SDKs\Windows\v7.0A\bin\ildasm.exe
C:\Program Files (x86)\Microsoft SDKs\Windows\v7.0A\bin\x64\ildasm.exe
C:\Program Files (x86)\Microsoft SDKs\Windows\v7.0A\bin\NETFX 4.0 Tools\ildasm.exe
C:\Program Files (x86)\Microsoft SDKs\Windows\v7.0A\bin\NETFX 4.0 Tools\x64\ildasm.exe

What are all these different versions for? Which one should I use? Sumit Kumar is the Project Manager for “Windows SDK”, so I sat down with him to find out. The Windows SDK blog is at http://blogs.msdn.com/windowssdk/.

(Note: the last four directories exist but don't actually contain a copy of ildasm.exe. I just put them there for dramatic effect.)

What are the SDK tools?

The SDK tools are a sort of grab-bag of utilities that advanced programmers will find useful. Some are general tools put forward by the Windows and C++ teams. Others are for .Net development from the CLR or Visual Studio teams. There are over 100 tools but so far I’ve only used these:

                ildasm.exe – dumps out the IL bytecode from a .net assembly.
                peverify.exe – verifies that Visual Basic produces valid .net assemblies.
                gacutil.exe – checks which assemblies are in the Global Assembly Cache
                guidgen.exe – generates a new GUID
                consume.exe – consumes memory or CPU or page-file or disk, as a stress test

The Windows SDK blog is a great place to find out about them. Bonus: a .net game called Terrarium!

The “Windows release track” of the SDK tools means that whenever a new version of Windows is released, they release a new version of the SDK tools as a free download. The intention is that this release of the tools is good for targeting this release of Windows. So when Version 6.0 of Windows was released, i.e. Vista, they released the tools in “C:\Program Files\Microsoft SDKs\Windows\v6.0”.

The “Visual Studio release track” means that whenever a new version of Visual Studio is released, it incorporates a version of the SDK tools as well. The intention is that this release of the tools is good for targeting anything that this release of Visual Studio can target. (Because VS can target most prior versions of the .Net framework, it means that the VS version of the SDK tools will also include prior versions of the .Net-specific tools.) For the VS release track, the tools directory name has the suffix “A” after it. So Visual Studio 9 (which came out after v6.0 of Windows) went in “C:\Program Files\Microsoft SDKs\Windows\v6.0A”.

Sharp readers will look at the list above and think “Ahah! In the list above he wrote v7.0A! The “A” means it’s a release of Visual Studio. And the 7.0 means that the next Windows will be released before Visual Studio 2010!” Hold your horses! I’m sorry to disappoint you, but we picked these numbers and directory names for the SDK a long time in advance, there's no guarantee that they're final, and the numbers end up bearing no relation to release schedules.

 

.Net 4.0 ?  64 bit ?

Visual Studio 2010 will ship with a new version of .Net, version 4.0. But it supports "multi-targeting", where you can target older versions of .Net as well. And so it has to ship .Net3.5 versions of the tools as well as .Net4.0 versions of the tools.

Also, some of the tools have 64bit versions; others are 32-bit only. The 64bit versions are only installed if you have a 64bit operating system. The 64bit versions are found in subdirectories called "...\x64". If an x64 directory does not contain a particular tool, then it means that we haven't shipped a 64bit version of that tool, and you're expected to fall back to the 32bit version.

So here's the complete table.  Only the shaded rows actually exist. The rest are all potential future versions that Microsoft won't commit to until it ships them.

 

C:\Program Files\Microsoft SDKs\Windows\...   Release ("A" = VS)  Version          32/64bit  CLR
--------------------------------------------  ------------------  ---------------  --------  -------
...\v6.0\bin\ildasm.exe                       Windows             Vista            x86       .Net3.5
...\v6.0A\bin\ildasm.exe                      VS                  2008             x86       .Net3.5
...\v6.1\bin\ildasm.exe                       Windows             Server2008       x86       .Net3.5
...\v7.0\bin\ildasm.exe                       Windows             ?? next version  x86       ??
...\v7.0\bin\x64\ildasm.exe                   Windows             ?? next version  x64       ??
...\v7.0A\bin\ildasm.exe                      VS                  2010             x86       .Net3.5
...\v7.0A\bin\x64\ildasm.exe                  VS                  2010             x64       .Net3.5
...\v7.0A\bin\NETFX 4.0 Tools\ildasm.exe      VS                  2010             x86       .Net4.0
...\v7.0A\bin\NETFX 4.0 Tools\x64\ildasm.exe  VS                  2010             x64       .Net4.0

 

C:\Program Files (x86)\Microsoft SDKs\Windows\...

This directory only exists on x64 systems, and it doesn't contain any tools executables.
Ignore it.

 

 

Registry and folder locations

When you launch the "Visual Studio Command Line Tools" shortcut from the start menu, it sets the PATH environment variable to point to appropriate versions of the tools.

But if you have your own tooling and want to point it to the correct locations, you should use the registry. For release 7.0A of the tools we expect to use these registry keys. Once again, these are not final and might change when we ship Visual Studio 2010.

[HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Microsoft SDKs\Windows\v7.0A\WinSDK-NetFx35Tools-x64]
"InstallationFolder"="C:\Program Files\Microsoft SDKs\Windows\v7.0A\bin\x64\"

[HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Microsoft SDKs\Windows\v7.0A\WinSDK-NetFx35Tools-x86]
"InstallationFolder"="C:\Program Files\Microsoft SDKs\Windows\v7.0A\bin\"

[HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Microsoft SDKs\Windows\v7.0A\WinSDK-NetFx40Tools-x64]
"InstallationFolder"="C:\Program Files\Microsoft SDKs\Windows\v7.0A\bin\NETFX 4.0 Tools\x64\"

[HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Microsoft SDKs\Windows\v7.0A\WinSDK-NetFx40Tools-x86]
"InstallationFolder"="C:\Program Files\Microsoft SDKs\Windows\v7.0A\bin\NETFX 4.0 Tools\"

[HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Microsoft SDKs\Windows\v7.0A\WinSDK-VSHeadersLibs]
"InstallationFolder"="C:\Program Files\Microsoft SDKs\Windows\v7.0A\"

[HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Microsoft SDKs\Windows\v7.0A\WinSDK-VSTools]
"InstallationFolder"="C:\Program Files\Microsoft SDKs\Windows\v7.0A\"

[HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Microsoft SDKs\Windows\v7.0A\WinSDK-VSWin32Tools]
"InstallationFolder"="C:\Program Files\Microsoft SDKs\Windows\v7.0A\"

[HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Microsoft SDKs\Windows\v7.0A\WinSDKIntellisenseRefAssys]
"InstallationFolder"="C:\Program Files\Microsoft SDKs\Windows\v7.0A\"

 

The set of registry keys isn't complete for all past versions of the tools. To compute the path to earlier versions of the tools is awkward. That's because different languages use different paths (e.g. Italian versions of Windows use C:\Programmi) and because users can themselves choose different paths (e.g. E:\Apps).

Notionally in VB you should use My.Computer.FileSystem.SpecialDirectories.ProgramFiles. This returns "C:\Program Files" or its equivalent, and works great most of the time. The only exception is if you're running on a 64bit operating system and your assembly is set to target "CPU:x86" rather than AnyCPU or x64. In this case it returns "C:\Program Files (x86)" or its equivalent, which isn't any use.

Anyway, the Windows SDK tools use non-localized directory and filenames. So once you get to the path to the program files directory, you can append "\Microsoft SDKs\Windows\v6.0\bin\ildasm.exe" or similar to point to a particular tool.
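
Here is a rough sketch of that lookup order: try the registry first, then fall back to a hand-built path under Program Files. It is only an illustration. The helper names (FolderFromRegistry, FallbackFolder, FirstExisting) are made up for this post, the key names are the not-yet-final 7.0A ones listed above, and it does not deal with the redirected view of HKLM\SOFTWARE that a 32bit process sees on a 64bit operating system.

```vbnet
Option Strict On
Imports System
Imports System.Collections.Generic
Imports System.IO
Imports Microsoft.Win32

Module SdkToolLocator

    ' Reads "InstallationFolder" from one of the SDK registry keys.
    ' Returns Nothing if the key or the value is missing.
    Function FolderFromRegistry(ByVal sdkVersion As String, ByVal toolSet As String) As String
        Dim key As String = "HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Microsoft SDKs\Windows\" & sdkVersion & "\" & toolSet
        Return TryCast(Registry.GetValue(key, "InstallationFolder", Nothing), String)
    End Function

    ' Builds the non-localized bin folder underneath a given Program Files path.
    Function FallbackFolder(ByVal programFiles As String, ByVal sdkVersion As String) As String
        Return programFiles.TrimEnd("\"c) & "\Microsoft SDKs\Windows\" & sdkVersion & "\bin\"
    End Function

    ' Returns the full path of the tool in the first candidate folder that contains it, or Nothing.
    Function FirstExisting(ByVal toolName As String, ByVal folders As IEnumerable(Of String)) As String
        For Each folder As String In folders
            If folder IsNot Nothing AndAlso File.Exists(Path.Combine(folder, toolName)) Then
                Return Path.Combine(folder, toolName)
            End If
        Next
        Return Nothing
    End Function

    Sub Main()
        Dim programFiles As String = Environment.GetFolderPath(Environment.SpecialFolder.ProgramFiles)
        ' Prefer the 64bit tool, then the 32bit one, then a hand-built path to an older release.
        Dim candidates As String() = New String() { _
            FolderFromRegistry("v7.0A", "WinSDK-NetFx40Tools-x64"), _
            FolderFromRegistry("v7.0A", "WinSDK-NetFx40Tools-x86"), _
            FallbackFolder(programFiles, "v6.0A")}
        Dim ildasm As String = FirstExisting("ildasm.exe", candidates)
        Console.WriteLine(If(ildasm, "ildasm.exe not found"))
    End Sub

End Module
```

Checking File.Exists on each candidate is what implements the "fall back to the 32bit version" rule: a missing x64 tool simply fails the check and the next folder is tried.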

Posted by ljw1004 | 0 Comments

My dog has no type (Expressions with "Superposition" types)

"My dog has no type."
"How does he smell?"
"Awful!"

This article originally was called "Expressions with no types" but that was a misleading title. This article is really about expressions which have a "quantum superposition" of several types: that's to say, the expression on its own could be one of several different types, but then the immediate context of that expression makes it collapse down to just one of those types.

Superposition types are used in two of the new features of VB10 -- array literals and multi-line lambdas. Really, for just about all programming in VB, users of the language don't need to care about them: the language just picks the types that are obviously correct. It's only language lawyers and pedants who will want to understand these special expressions. I remember at my very first undergraduate computer science lecture, the lecturer Frank King handed out the class list and asked for corrections. One student raised his hand, apologized for being pedantic, and said that his name had an "å" with a ring above it, not a plain "a". Dr King praised him for the correction, saying "computer science needs pedants". Pedants, please read on...

For completeness, I'll discuss all the VB expressions that have superposition type: Nothing, Lambdas, Array Literals, AddressOf and CallsToSubs.

To stress again, we're not talking about "Option Strict Off" and untyped expressions. We're talking about strongly-typed VB, with Option Strict On, and about which type the compiler picks for these expressions. Answer: it depends on the context they're in...

Nothing

Dim x0 As Integer = Nothing ' can reclassify Nothing as Integer

Dim y0 As Date = Nothing ' can reclassify Nothing as Date

 

Dim x1 As Integer = 0

Dim y1 As Date = x1  ' error: "cannot convert Integer to Date"

Dim z = Nothing ' infers "z As Object"

Let's start with the biggest example of an expression that has a superposition of types, Nothing. You can assign Nothing to an Integer and to a Date, even though there's no conversion allowed between Integers and Dates. So "Nothing" is obviously special in some way:

·         In a context where the target type is known (e.g. x0 and y0 above), then the "Nothing" expression has the type of that target, and its value is the default value of that type.

·         In a context where the target type is unknown (e.g. z above), then the "Nothing" expression has type Object and its value is a null pointer.
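
To see the two rules in action, here is a tiny console program. It is just a sketch (the module name is arbitrary) that checks which value Nothing turns into for each target type.

```vbnet
Option Strict On
Imports System

Module NothingDemo
    Sub Main()
        Dim i As Integer = Nothing   ' Nothing is reclassified as Integer
        Dim d As Date = Nothing      ' Nothing is reclassified as Date
        Dim o As Object = Nothing    ' Nothing is a null reference

        Console.WriteLine(i = 0)               ' True: the default Integer is 0
        Console.WriteLine(d = Date.MinValue)   ' True: the default Date is Date.MinValue
        Console.WriteLine(o Is Nothing)        ' True: no object at all
    End Sub
End Module
```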

Lambdas

Dim f0 As Func(Of String) = Function() "hello" ' can reclassify lambda as delegate

Dim f1 As Expressions.Expression(Of Func(Of String)) = Function() "hello" ' can reclassify lambda as expression

 

Dim f2 = Function() "hello" ' infers "f2 As VB$AnonymousDelegate(Of String)"

 

Dim f3 As Func(Of String) = f0

Dim f4 As Func(Of String) = f1 ' error: "cannot convert Expression(Of Func(Of String)) to Func(Of String)"

Dim f5 As Func(Of String) = f2

f2 = f5 ' error: "cannot convert Func(Of String) to VB$AnonymousDelegate(Of String)"

 

Dim f6 As Expressions.Expression(Of Func(Of String)) = f0 ' error: "cannot convert Func(Of String) to Expression(Of Func(Of String))"

Dim f7 As Expressions.Expression(Of Func(Of String)) = f1

Dim f8 As Expressions.Expression(Of Func(Of String)) = f2 ' error: "cannot convert VB$AnonymousDelegate(Of String) to Expression(Of Func(Of String))"

 

I've structured this source code similarly to that for Nothing because they work in similar ways. You can assign a lambda to a Func(Of String) [f0], and to an Expression(Of Func(Of String)) [f1], but you can't convert a Func(Of String) to an Expression(Of Func(Of String)) [f6] nor vice versa [f4].

·         In a context where the target type is known and a delegate, then a lambda expression has the type of that target, and its value is a newly constructed delegate.

·         In a context where the target type is known and an Expression(Of DelegateType), then a lambda expression has the type of that Expression and its value is an expression-tree corresponding to its body. Note: in VB10 this only works for single-line lambdas: it is an error for multi-line lambdas.

·         In contexts where the target type is unknown, or where it is known but neither a delegate nor an Expression(Of DelegateType), then the lambda expression has the appropriately constructed VB$AnonymousDelegate generic type, and its value is a newly constructed delegate.

This VB$AnonymousDelegate type is interesting because it can be converted to any compatible named delegate type [f5], even though the reverse isn't true [f2=f5]. Sometimes the conversion from VB$AnonymousDelegate to named delegate can be done just by a CLR cast; sometimes it requires the construction of an intermediate lambda. I can write more about this if anyone asks!
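
The practical difference between the three cases is easier to see when the same lambda text is given three different contexts. This is a small sketch with made-up variable names. The only assumption beyond the rules above is the standard Expression(Of T).Compile method, which turns a tree back into a delegate.

```vbnet
Option Strict On
Option Infer On
Imports System
Imports System.Linq.Expressions

Module LambdaDemo
    Sub Main()
        ' Same lambda text, two different target types.
        Dim asDelegate As Func(Of Integer, Integer) = Function(x) x + 1
        Dim asTree As Expression(Of Func(Of Integer, Integer)) = Function(x) x + 1

        Console.WriteLine(asDelegate.Invoke(41))        ' runs the delegate directly
        Console.WriteLine(asTree.Compile().Invoke(41))  ' compiles the tree to a delegate, then runs it
        Console.WriteLine(asTree.Body.NodeType)         ' the tree can also be inspected as data

        ' With no target type, the lambda gets an anonymous delegate type...
        Dim anon = Function(x As Integer) x + 1
        ' ...which converts to a compatible named delegate type.
        Dim named As Func(Of Integer, Integer) = anon
        Console.WriteLine(named.Invoke(41))
    End Sub
End Module
```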

Array Literals

Dim i0 As Integer() = {1, 2, 3}

Dim d0 As Double() = {1, 2, 3}

 

Dim d1 As Double() = i0 ' error: "Cannot convert Integer() to Double()"

Dim i1 As Integer() = d0 ' error: "Cannot convert Double() to Integer()"

 

Dim i2 = {1, 2, 3} ' infers "i2 As Integer()"

 

Again, array literals work in similar ways to Nothing. You can assign an array literal to an Integer() [i0], and you can assign the same array literal to a Double() [d0], even though you can't convert from Integer() to Double() nor vice versa.

·         In a context where the target type is known and an array, then an array literal has the type of that target, and its value is a newly constructed array. Each of its element expressions is interpreted in a context where its target type is the array element type.

·         In a context where the target type is unknown, or is known but not an array type, then an array literal has type "Array of T with rank r" where T is the dominant type of the element expressions, and r is the inferred rank of the array literal. The "dominant type of expressions" algorithm is new to VB10. It is also used in multi-line lambdas, and for the IF operator. I'll write more about it in a later post.

·         Special case: in an initializer context "Dim x(,) = {}" the target type is partially known: its rank is known, but its element type is not. In the special case where the target rank is known to be r, where target element type is unknown, and where the array literal is empty, then the array literal has the type "Array of Object with rank r" and its value is an empty array. In all other cases of partially known target types, we interpret the array literal as though its target type were unknown.

The special case looks odd. We needed it because "Dim x(,) = {}" looks like a reasonable statement that should work, and to maintain backwards compatibility with Visual Studio 2008.

Actually, there's one case where we broke backwards compatibility. "Dim x() = {1,2,3}" would infer "x As Object()" in VS2008, but will infer "x As Integer()" in VS2010. We take backwards compatibility breaks very seriously. In this case we opted for the break to make array literals more intuitive.
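
A quick way to check these rules is to ask the resulting arrays for their runtime types. This sketch needs VB10 and Option Infer On. The variable names are arbitrary, and the mixed literal {1, 2.5} is my own example of a dominant type being picked.

```vbnet
Option Strict On
Option Infer On
Imports System

Module ArrayLiteralDemo
    Sub Main()
        Dim i0 As Integer() = {1, 2, 3}   ' target type known: Integer()
        Dim d0 As Double() = {1, 2, 3}    ' same literal, target type Double()
        Dim i2 = {1, 2, 3}                ' no target type: dominant type is Integer
        Dim d2 = {1, 2.5}                 ' no target type: dominant type is Double
        Dim m2 = {{1, 2}, {3, 4}}         ' nested literal: inferred rank is 2

        Console.WriteLine(i0.GetType())   ' System.Int32[]
        Console.WriteLine(d0.GetType())   ' System.Double[]
        Console.WriteLine(i2.GetType())   ' System.Int32[]
        Console.WriteLine(d2.GetType())   ' System.Double[]
        Console.WriteLine(m2.Rank)        ' 2
    End Sub
End Module
```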

AddressOf

Dim a1 As Action(Of Integer) = AddressOf f

a1(Nothing) ' prints "integer"

 

Dim a2 As Action(Of String) = AddressOf f

a2(Nothing) ' prints "string"

 

Dim a0 = AddressOf f

' Error: "AddressOf can't convert to Object since it's not a delegate"

 

 

Sub f(ByVal x As Integer)

    Console.WriteLine("integer")

End Sub

 

Sub f(ByVal x As String)

    Console.WriteLine("string")

End Sub

 

As seen here, when you write "AddressOf f", the compiler can't figure out which "f" you're referring to without also knowing the target type.

·         In a context where the target type is known and a delegate (e.g. a1 and a2 above), then the "AddressOf f" expression has the type of that target, and its value is a newly constructed delegate.

·         In a context where the target type is unknown, or known and not a delegate, then the "AddressOf f" expression is an error.

Calls to Subs

Dim f = Sub() Main()

Dim g = Function() Main() ' error: "Expression does not produce a value"

 

I've added this section just for completeness. In a single-line lambda, the body of the lambda is an expression. In this case the expression is a call to the sub "Main". What is the type of this call-to-sub expression? Answer: it doesn't have a type. That's why [g] gives the error message it does.

Posted by ljw1004 | 2 Comments

LiveRun - a VS plugin to see the output of your program immediately

Say you're demonstrating a compiler at a conference. What's the best way to do it?

Should you just type in code in the code window? Doing this, you're relying on the audience's imagination -- that they form a mental picture of how the program will behave. You're also relying on their trust that your code really does what you say it does.

Or should you execute your code every minute or so, so the program's output window pops up and the audience can see that the code really works? Here it's risky because each time you switch it breaks the flow. And you're relying on the audience to remember what the code was each time they look at the output.

I think this is one of those problems that can be solved by technology! I wrote a small plugin for Visual Studio 2008. It looks at what the current text buffer contains, compiles it in the background, and displays the output in a topmost window. It does this every two seconds or so. You don't even need to save or recompile to see the output. It only makes sense for standalone console programs that don't take input. Here's a screenshot:

[Screenshot of LiveRun]

 

 

The source code is small and straightforward, and available for download at the link above.

 

There were two "gotcha" moments. The first was to do with multi-threading. I wanted the source code to be compiled in a background thread so it wouldn't interfere with the Visual Studio UI. But to grab the text of the current buffer you have to be in the UI thread, and also to display the output you have to be in the UI thread. I used a System.Timers.Timer, which fires its events on a background (thread-pool) thread, and called form.Invoke(...) for any tasks that needed the UI thread.

I also used a "non-AutoReset" timer. I wanted it to get the source code and compile+run+display it, then pause for two seconds, then get the source code and compile+run+display it, then pause for two seconds, and so on. In other words the timer interval has to be two seconds after the end of handling the previous timer event.

''' <summary>

''' OnTimer handles the non-autoreset timer signal. It runs in a background thread. It gets the source

''' code from the current buffer, and compiles it, and displays the output.

''' </summary>

''' <remarks></remarks>

Sub OnTimer() Handles t.Elapsed

    Try

        Dim oldsrc = src

        ' We're in a background thread. But the source can only be obtained from the UI thread...

        ' This delegate will get the source and store it in the "src" field

        f.Invoke(New Action(AddressOf GetSource))

        If src <> oldsrc Then

            Dim oldoutput = output

            ' We want to compile-and-run in the background thread

            output = CompileAndRun(src)

            If output <> "" OrElse oldoutput = "" Then

                ' Displaying the output on-screen must be done in the UI thread.

                ' This delegate gets the content of the "output" field and displays it

                f.Invoke(New Action(AddressOf ShowOutput))

            End If

        End If

    Finally

        t.Start()

    End Try

End Sub
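
OnTimer relies on a timer that is declared elsewhere in the plugin. As a rough standalone sketch of the same stop-then-restart pattern, here is a console program. The 200ms interval, the three-tick limit and the Sleep are stand-ins I picked for the demo, not what LiveRun uses.

```vbnet
Option Strict On
Imports System
Imports System.Threading

Module NonAutoResetTimerDemo

    ' Hypothetical stand-ins: the interval, the tick limit and the Sleep are only for this demo.
    Private WithEvents t As New System.Timers.Timer(200)
    Private ticks As Integer = 0
    Private ReadOnly done As New ManualResetEvent(False)

    Sub OnTimer() Handles t.Elapsed
        Try
            ticks += 1
            Thread.Sleep(100) ' pretend to compile+run+display
        Finally
            ' Because AutoReset is False the timer has stopped itself, so the next
            ' interval only starts counting once this handler has finished.
            If ticks < 3 Then t.Start() Else done.Set()
        End Try
    End Sub

    Sub Main()
        t.AutoReset = False
        t.Start()
        done.WaitOne()
        Console.WriteLine("ticks = {0}", ticks) ' prints "ticks = 3"
    End Sub

End Module
```

With AutoReset left at its default of True, a slow handler could still be running when the next Elapsed event fires on another thread. Restarting the timer in the Finally block rules that out.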

 

The other "gotcha" moment had to do with how to execute the code and capture its output. VB has very nice helper functions surrounding this, in the "My" namespace. My main concern was to recover from exceptions gracefully without leaving any mess. (Note: the code for getting a temporary filename isn't quite correct: the mere fact that you got a temporary unused filename one statement ago does not mean that the filename will still be unused; nor does it mean that the filename with ".vb" appended to it will be unused. But doing it more correctly didn't seem worth the bother; in any case, the exception handling means we'll recover okay from problems.)

 

Function CompileAndRun(ByVal src As String) As String

    Dim fn_exe = ""

    Dim fn_src = ""

    Dim vbc As System.Diagnostics.Process = Nothing

    Dim exe As System.Diagnostics.Process = Nothing

    Try

        ' Prepare for compilation

        fn_src = My.Computer.FileSystem.GetTempFileName() & ".vb"

        My.Computer.FileSystem.WriteAllText(fn_src, src, False)

        fn_exe = My.Computer.FileSystem.GetTempFileName() & ".exe"

        Dim framework = Environment.ExpandEnvironmentVariables("%windir%\Microsoft.Net\Framework")

        Dim latest_framework = (From d In My.Computer.FileSystem.GetDirectories(framework) Where d Like "*\v*" Select d).Last

 

        ' Compile it

        vbc = System.Diagnostics.Process.Start(New ProcessStartInfo _

                            With {.CreateNoWindow = True, _

                                  .UseShellExecute = False, _

                                  .FileName = latest_framework & "\vbc.exe", _

                                  .Arguments = String.Format("/out:""{0}"" /target:exe ""{1}""", fn_exe, fn_src)})

        Dim vbc_done = vbc.WaitForExit(3000)

        If Not vbc_done Then Return ""

        If vbc.ExitCode <> 0 Then Return ""

 

        ' Execute it

        Dim pinfo = New ProcessStartInfo With {.CreateNoWindow = True, _

                                               .UseShellExecute = False, _

                                               .FileName = fn_exe, _

                                               .RedirectStandardOutput = True}

        exe = New System.Diagnostics.Process With {.StartInfo = pinfo}

        exe.Start()

        Dim output = exe.StandardOutput.ReadToEnd

        Dim exe_done = exe.WaitForExit(3000)

        If Not exe_done Then Return ""

        Return output

    Finally

        ' Close the VBC process as neatly as we can

        If vbc IsNot Nothing Then

            If Not vbc.HasExited Then

                Try : vbc.Kill() : Catch ex As Exception : End Try

                Try : vbc.WaitForExit() : Catch ex As Exception : End Try

            End If

            Try : vbc.Close() : Catch ex As Exception : End Try

            vbc = Nothing

        End If

 

        ' Close the EXE as neatly as we can

        If exe IsNot Nothing Then

            If Not exe.HasExited Then

                Try : exe.Kill() : Catch ex As Exception : End Try

                Try : exe.WaitForExit() : Catch ex As Exception : End Try

            End If

            Try : exe.Close() : Catch ex As Exception : End Try

            exe = Nothing

        End If

 

        ' Delete leftover files

        Try : My.Computer.FileSystem.DeleteFile(fn_exe) : Catch ex As Exception : End Try

        Try : My.Computer.FileSystem.DeleteFile(fn_src) : Catch ex As Exception : End Try

    End Try

End Function
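
For what it's worth, here is one way the temporary-file handling could be tightened up. It is a sketch, not what LiveRun does: make a fresh, uniquely named scratch directory and put both files inside it. The "liverun_" prefix and the file names are arbitrary. A side benefit is that nothing is left behind, whereas GetTempFileName creates a zero-length file on disk and the code above only deletes the ".vb" and ".exe" names derived from it.

```vbnet
Option Strict On
Imports System
Imports System.IO

Module TempFileDemo

    ' Creates a brand-new, uniquely named directory under the temp folder and returns its path.
    Function CreateScratchDirectory() As String
        Dim folder As String = Path.Combine(Path.GetTempPath(), "liverun_" & Guid.NewGuid().ToString("N"))
        Directory.CreateDirectory(folder)
        Return folder
    End Function

    Sub Main()
        Dim scratch As String = CreateScratchDirectory()
        Try
            Dim fn_src As String = Path.Combine(scratch, "program.vb")
            Dim fn_exe As String = Path.Combine(scratch, "program.exe")
            File.WriteAllText(fn_src, "Module M : Sub Main() : End Sub : End Module")
            Console.WriteLine("source written: {0}", File.Exists(fn_src))
            Console.WriteLine("would compile to: {0}", fn_exe)
            ' ... compile fn_src into fn_exe and run it here ...
        Finally
            ' One recursive delete cleans up everything inside the scratch directory.
            Try : Directory.Delete(scratch, True) : Catch ex As Exception : End Try
        End Try
    End Sub

End Module
```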

 

As always, I love to hear suggestions and bugfixes and code improvements and comments!

Posted by ljw1004 | 1 Comments

Reflection on COM objects

I'd like to own a "Gestalt Camera". When you photograph an object it wouldn't just save a flat 2-dimensional projection of the object onto an SD card; instead it'd record the "gestalt", an understanding of the whole object and its complete web of relations. This would include a 3d representation of the object from all angles, an essay on its historical significance, a description of the cultural and economic role it plays, detailed internal diagrams showing how it works, a set of hyperlinks to related topics -- and it'll save all this in a wikipedia article.

How would you build such a camera? It's easy! Just take an existing Gestalt Camera, point it at a mirror, and have it take a gestalt photo of its own reflection! Here's the result: http://en.wikipedia.org/wiki/Gestalt_camera. [just a joke: the link doesn't really work.]

That's my roundabout introduction to reflection...

Reflection on .Net objects is done through System.Type and is very easy. For instance, "Dim type = GetType(System.String)" and now you can look at all the members and inheritance hierarchy of the System.String class.

Reflection on COM types is also easy if they have an interop assembly. For instance, add a project reference to the COM Microsoft Speech Library and again do "GetType(SpeechLib.SpVoice)". This lets you reflect on the .Net "Runtime Callable Wrapper" that's in the interop assembly, that was generated from the COM type's type library, and that contains all information that the type library had.

But sometimes you'll be given COM objects that don't have .Net interop assemblies in your code. I ran into this when I wrote a managed plugin for Visual Studio. For reflection here you have to use ITypeInfo instead of System.Type. Here's code to get that ITypeInfo, then dig through it and print out all the members. I'm a novice at COM programming, so I'd welcome suggestions and improvements. (Note: I deliberately didn't attempt to invent some API that would wrap ITypeInfo/TYPEDESC, but it looks ripe for it...)

' REFLECTION ON COM OBJECTS. Lucian Wischik, October 2008.

' (with thanks to Eric Lippert and Sonja Keserovic for their help)

'

' CLR objects let you use .Net reflection on them, via GetType().

' But for COM objects you sometimes have to use the more awkward COM reflection via ITypeInfo/TYPEDESC.

' It all boils down to type libraries...

' * If the COM object's type library has been translated into a managed Runtime Callable Wrapper (RCW)

'   then you can reflect on it using .Net reflection. RCWs are generated automatically when you

'   add a reference to a COM library.

' * If there's no RCW, then you have to use ITypeInfo to query the type library.

'   An ITypeInfo is a pointer to a COM type's information within the type library, and gives

'   you the same kind of information as does System.Type. Incidentally, Visual Studio uses the same kind

'   of reflection to provide intellisense for COM objects.

' * And if there's no type library at all, then you can't do any reflection on an object

'   (unless it happens to implement IDispatchEx -- which we don't go into here).

'

' ITypeInfo -- Represents a class/interface/structure defined in a type library

' TYPEDESC -- Represents atomic types (e.g. Integer), and also compound types

'             (e.g. an array whose element type is an ITypeInfo, or a reference

'             to an ITypeInfo). Used to describe function parameter types and

'             return types.

'

' Here's how to reflect using ITypeInfo...

'

 

Option Strict On

Imports System.Runtime.InteropServices

 

 

Module Module1

 

    ''' <summary>

    ''' UnmanagedCreateCOM: this is an unmanaged function which calls CoCreateInstance

    ''' to create an instance of CLSID_WebBrowser.

    ''' </summary>

    ''' <returns>returns a new COM object. The caller is expected to AddRef on it.</returns>

    <DllImport("createcom.dll", SetLastError:=False)> _

    Function UnmanagedCreateCOM() As IntPtr

    End Function

 

 

    Sub Main()

        ' Reflection on .net objects is straightforward:

        Console.WriteLine("=== REFLECTION ON .NET TYPE VIA .NET REFLECTION ===")

        ReflectOnDotNetType(GetType(System.String))

 

        ' Reflection on COM objects is easy when they've been added as references...

        ' We have added a COM reference to the Microsoft Speech Library. And now we reflect

        ' on it using normal .net reflection:

        Console.WriteLine("=== REFLECTION ON RCW'D COM TYPE VIA .NET REFLECTION ===")

        ReflectOnDotNetType(GetType(SpeechLib.SpVoice))

 

        ' But .net reflection gives pointless results on COM objects which lack an interop assembly:

        ' GetObjectForIUnknown just creates a tiny stub RCW for them with a handful of common functions.

        Console.WriteLine("=== REFLECTION ON NON-RCW'D COM TYPE VIA .NET REFLECTION ===")

        ReflectOnDotNetType(Marshal.GetObjectForIUnknown(UnmanagedCreateCOM()).GetType())

 

        ' Instead we have to reflect using ITypeInfo:

        Console.WriteLine("=== REFLECTION ON NON-RCW'D COM TYPE VIA COM REFLECTION ===")

        ReflectOnCOMObjectThroughITypeInfo(Marshal.GetObjectForIUnknown(UnmanagedCreateCOM()))

    End Sub

 

 

 

    ''' <summary>

    ''' ReflectOnDotNetType: reflects on a System.Type using .Net reflection

    ''' </summary>

    ''' <param name="tt">the type to reflect upon</param>

    Sub ReflectOnDotNetType(ByVal tt As System.Type)

        Dim qt As New Queue(Of System.Type)

        qt.Enqueue(tt)

        While qt.Count > 0

            Dim t = qt.Dequeue

            Console.WriteLine("TYPE {0}", t.ToString)

            For Each i In t.GetInterfaces

                Console.WriteLine("  inherits {0}", i.ToString)

                qt.Enqueue(i)

            Next

            For Each m In t.GetMembers

                Console.WriteLine("  member {0}", m.ToString)

            Next

        End While

    End Sub

 

    ''' <summary>

    ''' IDispatch: this is a managed version of the IDispatch interface

    ''' </summary>

    ''' <remarks>We don't use GetIDsOfNames or Invoke, and so haven't bothered with correct signatures for them.</remarks>

    <ComImport(), Guid("00020400-0000-0000-c000-000000000046"), InterfaceType(ComInterfaceType.InterfaceIsIUnknown)> _

    Interface IDispatch

        Sub GetTypeInfoCount(ByRef pctinfo As UInteger)

        Sub GetTypeInfo(ByVal itinfo As UInteger, ByVal lcid As UInteger, ByRef pptinfo As IntPtr)

        Sub GetIDsOfNames_unused()

        Sub Invoke_unused()

    End Interface

 

 

    ''' <summary>

    ''' ReflectOnCOMObjectThroughITypeInfo: given a com object that supports IDispatch, attempts

    ''' to get its ITypeInfo interface (which represents the object's entry in its type-library),

    ''' and reflect on the object through this.

    ''' </summary>

    ''' <param name="com">the com object upon which to reflect</param>

    Sub ReflectOnCOMObjectThroughITypeInfo(ByVal com As Object)

        ' How do we get ITypeInfo for a COM object?

        ' It would be nice to use Marshal.GetITypeInfoForType. But that fails when the com object

        ' doesn't have an interop assembly (e.g. when the com object was created for us

        ' by native code). So instead we have to use IDispatch::GetTypeInfo.

        Dim idisp = CType(com, IDispatch)

        Dim count As UInteger = 0 : idisp.GetTypeInfoCount(count)

        If (count < 1) Then Throw New ArgumentException("No type info", "com")

        Dim _typeinfo As IntPtr : idisp.GetTypeInfo(0, 0, _typeinfo)

        If (_typeinfo = IntPtr.Zero) Then Throw New ArgumentException("No ITypeInfo", "com")

        Dim typeInfo = CType(Marshal.GetTypedObjectForIUnknown(_typeinfo, GetType(ComTypes.ITypeInfo)), ComTypes.ITypeInfo)

        Marshal.Release(_typeinfo) ' to release the AddRef that GetTypeInfo did for us.

 

        AddTypeInfoToDump(typeInfo)

        While typeInfosToDump.Count > 0

            DumpTypeInfo(typeInfosToDump.Dequeue())

        End While

    End Sub

 

 

    ''' <summary>

    ''' DumpTypeInfo: prints information about an ITypeInfo type to the console -- name, inheritance, members

    ''' </summary>

    ''' <param name="typeInfo">the type to dump</param>

    Sub DumpTypeInfo(ByVal typeInfo As ComTypes.ITypeInfo)

 

        ' Name:

        Dim typeName = "" : typeInfo.GetDocumentation(-1, typeName, "", 0, "")

        Console.WriteLine("TYPE {0}", typeName)

 

 

        ' TypeAttr: contains general information about the type

        Dim pTypeAttr As IntPtr : typeInfo.GetTypeAttr(pTypeAttr)

        Dim typeAttr = CType(Marshal.PtrToStructure(pTypeAttr, GetType(ComTypes.TYPEATTR)), ComTypes.TYPEATTR)

 

 

        ' Inheritance:

        For iImplType = 0 To typeAttr.cImplTypes - 1

            Dim href As Integer : typeInfo.GetRefTypeOfImplType(iImplType, href)

            ' "href" is an index into the list of type descriptions within the type library.

            Dim implTypeInfo As ComTypes.ITypeInfo = Nothing : typeInfo.GetRefTypeInfo(href, implTypeInfo)

            ' And GetRefTypeInfo looks up the index to get an ITypeInfo for it.

            Dim implTypeName = "" : implTypeInfo.GetDocumentation(-1, implTypeName, "", 0, "")

            Console.WriteLine("  Implements {0}", implTypeName)

            AddTypeInfoToDump(implTypeInfo)

        Next

 

 

        ' Function/Sub/Property members:

        ' Note that property accessors are flattened, e.g. for a property "Fred as Integer"

        ' it will be represented as two members "[Get] Function Fred() As Integer", and "[Put] Sub Fred(Integer)"

        ' Each member is uniquely identified by an integer "MEMID".

        ' This memid is what's used e.g. when invoking the member.

        For iFunc = 0 To typeAttr.cFuncs - 1

 

            ' FUNCDESC is the key datastructure here:

            Dim pFuncDesc As IntPtr : typeInfo.GetFuncDesc(iFunc, pFuncDesc)

            Dim funcDesc = CType(Marshal.PtrToStructure(pFuncDesc, GetType(ComTypes.FUNCDESC)), ComTypes.FUNCDESC)

 

            ' Each function notionally has a list of names associated with it. I'll just pick the first.

            Dim names As String() = {""}

            typeInfo.GetNames(funcDesc.memid, names, 1, 0)

            Dim funcName = names(0)

 

            ' Function formal parameters:

            Dim cParams = funcDesc.cParams

            Dim s = ""

            For iParam = 0 To cParams - 1

                Dim elemDesc = CType(Marshal.PtrToStructure(New IntPtr(funcDesc.lprgelemdescParam.ToInt64 + Marshal.SizeOf(GetType(ComTypes.ELEMDESC)) * iParam), GetType(ComTypes.ELEMDESC)), ComTypes.ELEMDESC)

                If s.Length > 0 Then s &= ", "

                If (elemDesc.desc.paramdesc.wParamFlags And 2) <> 0 Then s &= "out "

                s &= DumpTypeDesc(elemDesc.tdesc, typeInfo)

            Next

 

            ' And print out the rest of the function's information:

            Dim props = ""

            If (funcDesc.invkind And ComTypes.INVOKEKIND.INVOKE_PROPERTYGET) <> 0 Then props &= "Get "

            If (funcDesc.invkind And ComTypes.INVOKEKIND.INVOKE_PROPERTYPUT) <> 0 Then props &= "Set "

            If (funcDesc.invkind And ComTypes.INVOKEKIND.INVOKE_PROPERTYPUTREF) <> 0 Then props &= "Set "

            Dim isSub = (funcDesc.elemdescFunc.tdesc.vt = VarEnum.VT_VOID)

            s = props & If(isSub, "Sub ", "Function ") & funcName & "(" & s & ")"

            s &= If(isSub, "", " as " & DumpTypeDesc(funcDesc.elemdescFunc.tdesc, typeInfo))

            Console.WriteLine("  " & s)

            typeInfo.ReleaseFuncDesc(pFuncDesc)

        Next

 

 

        ' Field members:

        For iVar = 0 To typeAttr.cVars - 1

            Dim pVarDesc As IntPtr : typeInfo.GetVarDesc(iVar, pVarDesc)

            Dim varDesc = CType(Marshal.PtrToStructure(pVarDesc, GetType(ComTypes.VARDESC)), ComTypes.VARDESC)

            Dim names As String() = {""}

            typeInfo.GetNames(varDesc.memid, names, 1, 0)

            Dim varName = names(0)

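            Console.WriteLine("  Dim {0} As {1}", varName, DumpTypeDesc(varDesc.elemdescVar.tdesc, typeInfo))

            ' Release the VARDESC, just as the function loop releases each FUNCDESC:

            typeInfo.ReleaseVarDesc(pVarDesc)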

        Next

 

        Console.WriteLine()

    End Sub

 

 

 

    ''' <summary>

    ''' DumpTypeDesc: given a TYPEDESC, dumps it out into a string e.g. "Ref Integer" or

    ''' "Array of MyTypeInfo". Also calls AddTypeInfoToDump for every ITypeInfo encountered.

    ''' </summary>

    ''' <param name="tdesc">the TYPEDESC to dump</param>

    ''' <param name="context">the ITypeInfo that contained this TYPEDESC, for context</param>

    ''' <returns>a string representation of the TYPEDESC</returns>

    Function DumpTypeDesc(ByVal tdesc As ComTypes.TYPEDESC, ByVal context As ComTypes.ITypeInfo) As String

        Dim vt = CType(tdesc.vt, VarEnum)

        Select Case vt

 

            Case VarEnum.VT_PTR

                Dim tdesc2 = CType(Marshal.PtrToStructure(tdesc.lpValue, GetType(ComTypes.TYPEDESC)), ComTypes.TYPEDESC)

                Return "Ref " & DumpTypeDesc(tdesc2, context)

 

            Case VarEnum.VT_USERDEFINED

                Dim href = tdesc.lpValue.ToInt32()

                Dim refTypeInfo As ComTypes.ITypeInfo = Nothing : context.GetRefTypeInfo(href, refTypeInfo)

                AddTypeInfoToDump(refTypeInfo)

                Dim refTypeName = "" : refTypeInfo.GetDocumentation(-1, refTypeName, "", 0, "")

                Return refTypeName

 

            Case VarEnum.VT_CARRAY

                Dim tdesc2 = CType(Marshal.PtrToStructure(tdesc.lpValue, GetType(ComTypes.TYPEDESC)), ComTypes.TYPEDESC)

                Return "Array of " & DumpTypeDesc(tdesc2, context)

                ' lpValue is actually an ARRAYDESC structure, which also has information on the array dimensions,

                ' but alas .Net doesn't predefine ARRAYDESC.

 

            Case VarEnum.VT_VOID ' e.g. IUnknown::QueryInterface(Ref GUID, out Ref Ref Void)

                Return "Void"

            Case VarEnum.VT_VARIANT

                Return "Object"

            Case VarEnum.VT_UNKNOWN

                Return "IUnknown*"

 

            Case VarEnum.VT_BSTR

                Return "String"

            Case VarEnum.VT_LPWSTR

                Return "wchar*"

            Case VarEnum.VT_LPSTR

                Return "char*"

 

            Case VarEnum.VT_HRESULT

                Return "HResult"

 

            Case VarEnum.VT_BOOL

                Return "Bool"

            Case VarEnum.VT_I1

                Return "SByte"

            Case VarEnum.VT_UI1

                Return "Byte"

            Case VarEnum.VT_I2

                Return "Short"

            Case VarEnum.VT_UI2

                Return "UShort"

            Case VarEnum.VT_I4, VarEnum.VT_INT ' VT_INT is the platform's native int, which is 32-bit on Windows, the same as VT_I4

                Return "Integer"

            Case VarEnum.VT_UI4, VarEnum.VT_UINT ' likewise VT_UINT is the native unsigned int, 32-bit on Windows

                Return "UInteger"

            Case VarEnum.VT_I8

                Return "Long"

            Case VarEnum.VT_UI8

                Return "ULong"

 

            Case Else

                ' There are many other VT_s that I haven't special-cased yet.

                ' That's just because I haven't encountered them yet in my test-cases.

                Return vt.ToString()

        End Select

    End Function

 

 

    Dim typeInfosToDump As New Queue(Of ComTypes.ITypeInfo)

    Dim typeInfosDumped As New HashSet(Of String)

    '

    Sub AddTypeInfoToDump(ByVal typeInfo As ComTypes.ITypeInfo)

        Dim typeName = "" : typeInfo.GetDocumentation(-1, typeName, "", 0, "")

        If typeInfosDumped.Contains(typeName) Then Return

        typeInfosToDump.Enqueue(typeInfo)

        typeInfosDumped.Add(typeName)

    End Sub

 

End Module 
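
The parameter loop above finds each ELEMDESC by adding "size of one element times index" to the base pointer, because the COM structure only gives us a pointer to the first element of an unmanaged array. Here's a minimal self-contained sketch of that same pointer walk. It doesn't touch COM at all: Pair is a hypothetical stand-in for ELEMDESC, and ElementAt and SumOfB are names I made up for the illustration.

```vb
Imports System
Imports System.Runtime.InteropServices

Module PointerWalkSketch
    ' Hypothetical stand-in for ELEMDESC; any sequential-layout structure is walked the same way.
    <StructLayout(LayoutKind.Sequential)> Structure Pair
        Public A As Integer
        Public B As Integer
    End Structure

    ' Address of element number index in an unmanaged array that starts at basePtr.
    Function ElementAt(ByVal basePtr As IntPtr, ByVal index As Integer) As IntPtr
        Return New IntPtr(basePtr.ToInt64() + CLng(Marshal.SizeOf(GetType(Pair))) * index)
    End Function

    ' Writes count structures into unmanaged memory, reads them back and sums their B fields.
    Function SumOfB(ByVal count As Integer) As Integer
        Dim basePtr As IntPtr = Marshal.AllocHGlobal(Marshal.SizeOf(GetType(Pair)) * count)
        Try
            For i As Integer = 0 To count - 1
                Dim p As New Pair With {.A = i, .B = i * 10}
                Marshal.StructureToPtr(p, ElementAt(basePtr, i), False)
            Next
            Dim total As Integer = 0
            For i As Integer = 0 To count - 1
                Dim p As Pair = CType(Marshal.PtrToStructure(ElementAt(basePtr, i), GetType(Pair)), Pair)
                total += p.B
            Next
            Return total
        Finally
            Marshal.FreeHGlobal(basePtr)
        End Try
    End Function

    Sub Main()
        Console.WriteLine(SumOfB(3))
    End Sub
End Module
```

The arithmetic goes through ToInt64 so that it works for both 32-bit and 64-bit pointers, which is also why the dumper above uses ToInt64 rather than ToInt32 for this step.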

 

Posted by ljw1004 | 1 Comments

Co- and contra-variance: how do I convert a List(Of Apple) into a List(Of Fruit)?

This is the first in a series of posts exploring how we might implement generic co- and contra-variance in a hypothetical future version of VB. This is not a promise about the next version of VB; it's just one possible proposal, written up here to get early feedback from potential users.

 

Sub EatFruit(ByVal x As IEnumerable(Of Fruit))

...

 

Dim x As New List(Of Apple)

x.Add(New GrannySmith)

x.Add(New GoldenDelicious)

EatFruit(x)

' ERROR: cannot convert List(Of Apple) to IEnumerable(Of Fruit)

Look at the above code. You'd think it should work. It's a common enough scenario: there's a library function which handles some kind of data type, but you've inherited from that type for your own purposes. How can you pass a collection of your own inherited type into the library function?

We're considering a VB language feature to support this kind of conversion. The topic is called "Co- and contra-variance", or just "variance" for short. Variance has actually been in the CLR since 2005 or so, but no one's yet released a .Net language that uses it. There are other languages with it, though. Here are some links to what people have written on the topic.

I'll talk about how you could use variance practically in VB, where it could make your code easier or cleaner, and what problems it might solve if we implement it. There's much more to variance than just converting apples into fruit, and it gets trickier as the above articles show, but I think the practical syntax and examples that we're proposing for VB could demystify it.

Here's a practical problem I had just yesterday that could have been solved by variance:

 

Function Call(instance As Expression, method As MethodInfo, arguments As IEnumerable(Of Expression)) As MethodCallExpression

...

 

' Create a new callsite that takes two arguments:

Dim args As New List(Of ConstantExpression)

args.Add(Expression.Constant("x"))

args.Add(Expression.Constant("y"))

'

Dim call1 = Expression.Call(instance, method, args)

' args implements IEnumerable(Of ConstantExpression), which

' variance-converts to IEnumerable(Of Expression)

 

For this first article, though, we'll stick to just fruit.

 

' some example classes to get us started

Class Food : End Class

Class Fruit : Inherits Food : End Class

Class Apple : Inherits Fruit : End Class

Class GrannySmith : Inherits Apple : End Class

Class GoldenDelicious : Inherits Apple : End Class

 

' GoldenDelicious < Apple < Fruit < Food

' using < in the mathematical sense of "is smaller than",

' and in the VB sense of "can be converted to"

 

Class AppleBasket

    Implements IReadOnly(Of Apple)

    Implements IWriteOnly(Of Apple)

End Class

 

 

"Out" parameters

We're thinking of using contextual keywords "Out" and "In" to introduce variance:

Interface IReadOnly(Of Out T)

    Function Read() As T

End Interface

' "Out" declares that T will only ever be used

' as return type of functions *

 

Dim x As IReadOnly(Of Apple) = New AppleBasket

Dim y As IReadOnly(Of Fruit) = x

 

Dim f As Fruit = y.Read()

' This is guaranteed not to throw InvalidCastException

 

When the interface declares its type parameter as "Out", it makes a promise to only ever use that type for function returns (* or other places where it outputs data). The interface will be held to that promise: if it tries to do "Sub f(ByVal x As T)" then it's a compile-time error. (A lot of the design is constrained by how the CLR uses and represents variance; we want compatibility with other .Net languages.)

It's this "Out" promise that lets the CLR convert the interface:

 

 

' GoldenDelicious < Apple < Fruit < Food < Object

 

Dim apples As IReadOnly(Of Apple) = New AppleBasket

 

' It is allowed to change to an IReadOnly of something bigger:

Dim fruits As IReadOnly(Of Fruit) = apples

Dim foods As IReadOnly(Of Food) = apples

Dim things As IReadOnly(Of Object) = fruits

 

' It is an ERROR to change to an IReadOnly that is smaller:

Dim golds As IReadOnly(Of GoldenDelicious) = apples

 

' Also an ERROR to change to something unrelated

Dim cars As IReadOnly(Of Car) = apples

 

 

In general, if you have a generic interface IReadOnly(Of Out T), then you can convert an IReadOnly(Of T) to an IReadOnly(Of U) for any type U that T inherits from or implements. It's typesafe because the interface only ever hands T values out, and every T is also a U.

Variance conversions are typesafe and efficient. It takes only a single IL instruction to do a variance conversion. There are NO runtime checks required. (This differs from arrays, which have to do a runtime type-check every time you put something into the array.)

Interfaces with "Out" parameters are called covariant in the literature.
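
To make the comparison with arrays concrete, here's a minimal sketch of the runtime check that arrays have to do on every store. Banana is a hypothetical class that isn't part of the fruit hierarchy above, and StoreIsRejected is just a name for the illustration.

```vb
Imports System

Class Fruit : End Class
Class Apple : Inherits Fruit : End Class
Class Banana : Inherits Fruit : End Class   ' hypothetical, only for this sketch

Module ArrayCheckSketch
    ' Returns True if storing a Banana through a Fruit() view of an Apple() array fails at runtime.
    Function StoreIsRejected() As Boolean
        Dim apples As Apple() = New Apple() {New Apple(), New Apple()}
        Dim fruits As Fruit() = apples      ' the compiler allows this conversion for arrays
        Try
            fruits(0) = New Banana()        ' compiles, but the CLR type-checks the store
            Return False
        Catch ex As ArrayTypeMismatchException
            Return True
        End Try
    End Function

    Sub Main()
        Console.WriteLine(StoreIsRejected())
    End Sub
End Module
```

An "Out" interface has no member that takes a T in, so there is no store that could go wrong and nothing left to check at runtime.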

"In" parameters

Interface IWriteOnly(Of In T)

    Sub Write(ByVal x As T)

End Interface

' "In" declares that T will only ever be used

' as ByVal arguments to functions.

 

Dim x As IWriteOnly(Of Apple) = New AppleBasket

Dim z As IWriteOnly(Of GoldenDelicious) = x

 

z.Write(New GoldenDelicious)

 

"In" parameters are the opposite. When an interface declares one of its type parameters, T, as "In", it's promising only ever to use T for ByVal arguments (or other places where the interface takes data in). Again the interface will be held to that promise: if it tries to do "Function f() As T" then it's a compile-time error.

And "In" parameters let you do the opposite kinds of conversion:

' GoldenDelicious < Apple < Fruit < Food < Object

 

Dim apples As IWriteOnly(Of Apple) = New AppleBasket

 

' It is allowed to convert to an IWriteOnly of something smaller:

Dim golds As IWriteOnly(Of GoldenDelicious) = apples

 

' It is an ERROR to convert to something bigger, or unrelated:

Dim foods As IWriteOnly(Of Food) = apples

Dim cars As IWriteOnly(Of Car) = apples

 

Interfaces with "In" parameters are called contravariant in the literature.
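
Here's a sketch of where an "In" conversion is useful in practice. It assumes a compiler and framework in which IComparer is declared as IComparer(Of In T), which is the case in .Net Framework 4; on anything earlier the marked line won't compile. The Weight field, the FruitByWeight class and the LightestAppleWeight function are all hypothetical names for the illustration.

```vb
Imports System
Imports System.Collections.Generic

Class Fruit
    Public Weight As Integer   ' hypothetical field, only for this sketch
End Class
Class Apple : Inherits Fruit : End Class

' Knows how to compare any two pieces of fruit, so it can certainly compare two apples.
Class FruitByWeight
    Implements IComparer(Of Fruit)

    Public Function Compare(ByVal x As Fruit, ByVal y As Fruit) As Integer Implements IComparer(Of Fruit).Compare
        Return x.Weight.CompareTo(y.Weight)
    End Function
End Class

Module ComparerSketch
    Function LightestAppleWeight() As Integer
        Dim apples As New List(Of Apple)
        apples.Add(New Apple With {.Weight = 30})
        apples.Add(New Apple With {.Weight = 10})
        apples.Add(New Apple With {.Weight = 20})

        ' "In" conversion: an IComparer(Of Fruit) is used as an IComparer(Of Apple).
        Dim byWeight As IComparer(Of Apple) = New FruitByWeight
        apples.Sort(byWeight)
        Return apples(0).Weight
    End Function

    Sub Main()
        Console.WriteLine(LightestAppleWeight())
    End Sub
End Module
```

This is the same direction as IWriteOnly above: the comparer only ever takes T in, so something written for the bigger type can stand in for the smaller one.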

"In" and "Out" together

Up until the early 1990s, people used to argue about whether "In" or "Out" parameters were the right thing to have. We now know that they're both right! The first convincing argument for this was Giuseppe Castagna's 1995 research paper "Covariance and Contravariance: Conflict without a Cause".

Here are two examples of why they're both right, and how they both work together:

 

Class AppleBasket

  Implements IReadOnly(Of Apple)

  Implements IWriteOnly(Of Apple)

 

  Private m_value As Apple

 

  Public Function Read() As Apple Implements IReadOnly(Of Apple).Read

    Return m_value

  End Function

 

  Public Sub Write(ByVal x As Apple) Implements IWriteOnly(Of Apple).Write

    m_value = x

  End Sub

End Class

 

 

Pipes: using "In" and "Out" for internal and external contracts

 

' Here we implement a Pipe. Each element in the pipe is an ICollection.

'    IList <  ICollection  <  IEnumerable

'

' When we give out reader ("Out") access to the public, we force it so

' readers can only ever assume that elements are IEnumerable.

' And when we give out writer ("In") access, we force it so

' that writers must always put in IList

'

' This future-proofs our code in TWO directions: it forces the

' implementation to provide IList in case in the future we want

' to expose more to the clients; but it does so without making

' a public commitment to the clients that future implementations

' would have to uphold.

 

Class MyPipe(Of T)

  Implements IWriteOnly(Of T)

  Implements IReadOnly(Of T)

 

  Private contents As New Stack(Of T)

 

  Public Sub Write(ByVal x As T) Implements IWriteOnly(Of T).Write

    contents.Push(x)

  End Sub

 

  Public Function Read() As T Implements IReadOnly(Of T).Read

    Return contents.Pop()

  End Function

End Class
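
The comment above describes handing out two different views of one pipe. Here's a sketch of how that might look, using the proposed "In"/"Out" syntax and the non-generic collection interfaces from System.Collections. The interfaces and MyPipe are repeated so that the sketch compiles on its own; PipeSketch and CountThroughPipe are hypothetical names.

```vb
Imports System
Imports System.Collections
Imports System.Collections.Generic

Interface IReadOnly(Of Out T)
    Function Read() As T
End Interface

Interface IWriteOnly(Of In T)
    Sub Write(ByVal x As T)
End Interface

' Same shape as the MyPipe class above.
Class MyPipe(Of T)
    Implements IWriteOnly(Of T)
    Implements IReadOnly(Of T)

    Private contents As New Stack(Of T)

    Public Sub Write(ByVal x As T) Implements IWriteOnly(Of T).Write
        contents.Push(x)
    End Sub

    Public Function Read() As T Implements IReadOnly(Of T).Read
        Return contents.Pop()
    End Function
End Class

Module PipeSketch
    Function CountThroughPipe() As Integer
        ' Internally, every element of the pipe is an ICollection.
        Dim pipe As New MyPipe(Of ICollection)

        ' Writers get the "In" view, narrowed to IList: they must supply at least an IList.
        Dim writer As IWriteOnly(Of IList) = pipe
        ' Readers get the "Out" view, widened to IEnumerable: they can rely on nothing more.
        Dim reader As IReadOnly(Of IEnumerable) = pipe

        writer.Write(New ArrayList From {1, 2, 3})

        Dim count As Integer = 0
        For Each item As Object In reader.Read()
            count += 1
        Next
        Return count
    End Function

    Sub Main()
        Console.WriteLine(CountThroughPipe())
    End Sub
End Module
```

Neither view requires a wrapper object: writer and reader are both the same pipe instance, just seen through differently-typed interfaces.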

 

We are eager for customer feedback as we consider whether to add this feature to the VB language, and think about how it might work. Please add your comments.

I'll be writing more on variance (a lot more) in the weeks to come.

PS. As for the title of this article, here's what we envisage...

Dim x As New List(Of Apple)

Dim y As List(Of Fruit) = x

'

' ERROR: List(Of Apple) cannot be converted to List(Of Fruit)

' Consider using IEnumerable(Of Fruit) instead.
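
And here's a sketch of what following that suggestion looks like. It assumes a compiler and framework in which IEnumerable is declared as IEnumerable(Of Out T), as it is in .Net Framework 4; CountFruit is a hypothetical stand-in for a library function like EatFruit.

```vb
Imports System
Imports System.Collections.Generic

Class Fruit : End Class
Class Apple : Inherits Fruit : End Class
Class GrannySmith : Inherits Apple : End Class
Class GoldenDelicious : Inherits Apple : End Class

Module EnumerableSketch
    ' Hypothetical library function that only ever reads fruit out of the collection.
    Function CountFruit(ByVal fruits As IEnumerable(Of Fruit)) As Integer
        Dim n As Integer = 0
        For Each f As Fruit In fruits
            n += 1
        Next
        Return n
    End Function

    Function CountApplesAsFruit() As Integer
        Dim x As New List(Of Apple)
        x.Add(New GrannySmith)
        x.Add(New GoldenDelicious)

        ' List(Of Apple) implements IEnumerable(Of Apple), which "Out"-converts to IEnumerable(Of Fruit).
        Dim y As IEnumerable(Of Fruit) = x
        Return CountFruit(y)
    End Function

    Sub Main()
        Console.WriteLine(CountApplesAsFruit())
    End Sub
End Module
```

List(Of Fruit) itself can't be the target because it has an Add method that takes a Fruit in; IEnumerable(Of Fruit) only hands fruit out.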

 

 

 

Posted by ljw1004 | 4 Comments

Hello!

Hello! I'm starting this blog as a way to communicate with VB users -- to hear what you want, to answer what questions I can, and to share my ideas about things the language could include in the future.

I've recently become the Visual Basic specification lead, taking over this role from Paul Vick. He did a great job, and I'm happy that he'll continue as "VB Language Designer Emeritus". Since joining the VB compiler team a year ago I've worked on new features for Visual Studio 2010 relating to type inference, lambdas and generic covariance. In my four years at Microsoft I've also worked on the Robotics SDK and concurrency, and published several academic papers on the subject. Before Microsoft I did my PhD in concurrency theory at the University of Cambridge with Philippa Gardner, and worked as a researcher at Bologna University in Italy with Cosimo Laneve. Despite this theoretical slant, I'm most at home when writing practical code.

Posted by ljw1004 | 0 Comments
 