One thing I'd better post about is a tool I've written and put up on the Internet, Regex Builder. You can find it on a GotDotNet workspace. In my own work (testing my parts of Visual Studio) I frequently work with strings, and I've discovered that Regular Expressions are a very nice way to take care of many tasks. Unfortunately, they can be very painful to debug when something is wrong; the framework can't tell you much more than "Nope, no match".

Regex Builder is a tool designed to help with that. It's not terribly complex - you provide the Source Text and the Expression, and pick the options. It will tell you about the Exception thrown, if any, and otherwise will indicate the number of matches. It will also give you a tree of all of the Matches, Groups, and Captures produced by your expression. You can select anything in the tree to see it highlighted in the Source Text, which is useful for seeing how whitespace is being matched. You can also select a portion of the expression to execute just that portion. I use this to split my expression into its components and quickly determine where the problem is. Finally, there's a menu which shows you many of the syntactical elements of .NET Regular Expressions, with easy links to the help.
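To give a feel for what that tree contains, here's a rough sketch in Python rather than .NET. It is not how Regex Builder is implemented; the function name match_tree and the pattern with its named group are made up for illustration, and Python's re module keeps only the last capture for each group, so there is no equivalent of the .NET Captures level.

```python
import re

# Sample markup (two adjacent Label controls).
SOURCE = (
    '<asp:Label\n'
    '    id="Label1"\n'
    '    runat="server"\n'
    '    Text="Label" /><asp:Label id="Label2"\n'
    '    runat="server" Text="Label" />'
)

# Hypothetical pattern for illustration: capture each Label's id.
PATTERN = r'<asp:Label[^>]+id="(?P<id>\w+)"'


def match_tree(pattern, text):
    """Return [(match_span, match_text, [(group_name, span, text), ...]), ...]."""
    compiled = re.compile(pattern)
    # Map group numbers back to their names, where they have one.
    names = {index: name for name, index in compiled.groupindex.items()}
    tree = []
    for m in compiled.finditer(text):
        groups = [
            (names.get(i, str(i)), m.span(i), m.group(i))
            for i in range(1, compiled.groups + 1)
        ]
        tree.append((m.span(), m.group(), groups))
    return tree


for span, text, groups in match_tree(PATTERN, SOURCE):
    # repr() makes newlines and spaces visible, much like highlighting does.
    print(f"Match {span}: {text!r}")
    for name, gspan, gtext in groups:
        print(f"  Group {name} {gspan}: {gtext!r}")
```

The spans are character offsets into the source text, which is the information a tool needs in order to highlight a match or group.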

Want an example? Today I was trying to write an expression to verify that two Label controls in an ASP.NET page were placed directly next to each other. The (desired) markup looks like this:

<asp:Label
    id="Label1"
    runat="server"
    Text="Label" /><asp:Label id="Label2"
    runat="server" Text="Label" />

My initial guess at the expression was this: <asp:Label[^>]+Label1[^>]/><asp:Label>. Conceptually, I'm looking for a Label tag which contains the string "Label1", without leaving the tag, followed immediately by a second <asp:Label>. This, of course, didn't match.

Fortunately, the support code which executes my expression catches the failure and generates an XML file with the Source, Expression, and Options, and this file format can be read by Regex Builder. It then writes to the log a command I can run to diagnose the problem. I copied and pasted this into a command prompt, and there was the non-match in front of me.

First, I selected the first half of the expression, "<asp:Label[^>]+Label1[^>]/>". This didn't match anything, so there's a problem here. I selected just "<asp:Label[^>]+", which did match, and then noticed that the second [^>] does not have a '+' after it. This means the expression allows only one character between Label1 and the closing "/>" - oops! I changed this half to "<asp:Label[^>]+Label1[^>]+/>", which successfully matched the correct part of the file.
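The same divide-and-test approach can be reproduced without the tool. This is only a sketch in Python (the simple patterns here behave the same way as in .NET); count_matches is a hypothetical helper, not anything from Regex Builder.

```python
import re

# The desired markup from earlier in this post.
SOURCE = (
    '<asp:Label\n'
    '    id="Label1"\n'
    '    runat="server"\n'
    '    Text="Label" /><asp:Label id="Label2"\n'
    '    runat="server" Text="Label" />'
)


def count_matches(pattern, text=SOURCE):
    """Return the number of non-overlapping matches of pattern in text."""
    return len(re.findall(pattern, text))


# Try progressively smaller or corrected pieces of the expression.
pieces = [
    r'<asp:Label[^>]+Label1[^>]/>',   # first half as originally written
    r'<asp:Label[^>]+',               # just the opening portion
    r'<asp:Label[^>]+Label1[^>]+/>',  # first half with the missing '+'
]
for piece in pieces:
    print(f"{count_matches(piece)} match(es): {piece}")
```

The first piece finds nothing, the opening portion matches both tags, and the corrected half matches once, which lines up with what I saw in the tool.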

However, the whole expression still wasn't matching. Hmm... The end portion is quite simple, and I noticed that I had put a '>' right on the end of it - I've been reviewing the documentation recently, and that's the way they write out tags. The real tag has attributes after "<asp:Label", so the '>' can never match there. I removed it, and now things are working nicely. Three minutes spent, and "<asp:Label[^>]+Label1[^>]/><asp:Label>" became "<asp:Label[^>]+Label1[^>]+/><asp:Label".
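For completeness, here's a small sketch of the check the finished expression supports, again in Python as a stand-in for the .NET code in my test harness. The helper name labels_adjacent and the "separated" markup are invented for the example.

```python
import re

ORIGINAL = r'<asp:Label[^>]+Label1[^>]/><asp:Label>'
FIXED = r'<asp:Label[^>]+Label1[^>]+/><asp:Label'

# The desired markup: Label2 starts immediately after Label1 closes.
ADJACENT = (
    '<asp:Label\n'
    '    id="Label1"\n'
    '    runat="server"\n'
    '    Text="Label" /><asp:Label id="Label2"\n'
    '    runat="server" Text="Label" />'
)

# Hypothetical failing case: a line break separates the two controls.
SEPARATED = ADJACENT.replace('/><asp:Label', '/>\n<asp:Label')


def labels_adjacent(markup, pattern=FIXED):
    """True if Label1 is followed directly by another asp:Label tag."""
    return re.search(pattern, markup) is not None


print("fixed, adjacent markup:   ", labels_adjacent(ADJACENT))
print("fixed, separated markup:  ", labels_adjacent(SEPARATED))
print("original, adjacent markup:", labels_adjacent(ADJACENT, ORIGINAL))
```

Because [^>] can never cross a '>', the fixed expression can't skip past the end of the Label1 tag, so anything between the two controls (even whitespace) makes the check fail.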