I’ve described the ScriptDom parser previously on this blog. The visitor pattern is great when you know exactly what you are looking for in the parse tree, but as you would have seen, due to the polymorphic nature of the ScriptDom classes, it is quite difficult for a truly generic tree walk when you don’t know what to expect.

The subtypes problem

For example, let’s look at the SelectStatement class. Here are the members of that class:

SelectStatement
              QueryExpression QueryExpression
              SchemaObjectName Into
              List<> ComputeClauses
              WithCtesAndXmlNamespaces WithCtesAndXmlNamespaces
              …
              List<> ScriptTokenStream

The QueryExpression in turn looks like the below:

QueryExpression
               OrderByClause OrderByClause
              OffsetClause OffsetClause
              ForClause ForClause
              …
              IList<> ScriptTokenStream

QueryExpression is actually an abstract class. In the real world, what you end up getting one of the subtypes of QueryExpression:

  • QueryParenthesisExpression
  • BinaryQueryExpression
  • QuerySpecification

This means you will end up hard-coding some IF…ELSE structures if you want to handle each of the above 3 cases. This quickly becomes very complex when you consider the pattern repeats in many classes. For example, within QuerySpecification you have a FromClause:

QuerySpecification
              UniqueRowFilter UniqueRowFilter
              TopRowFilter TopRowFilter
              IList<> SelectElements
              FromClause FromClause
              WhereClause WhereClause
               GroupByClause GroupByClause
              HavingClause HavingClause
              OrderByClause OrderByClause
               OffsetClause OffsetClause
              ForClause ForClause
              …
              IList<> ScriptTokenStream

The FromClause is in turn a List<> of TableReference. That class in turn has a hierarchy of subtypes, which if you navigate (painfully sometimes) you will typically find two types of interest:

This goes on and on Smile So it can be very difficult for a normal developer to figure out what classes to expect for a given T-SQL snippet. What we need is a way to walk the tree for a given T-SQL fragment and document exactly what classes are present for that T-SQL.

Creating a XML ‘Visualizer’

XML is a very convenient way to document parse trees. This approach has been previously applied to another T-SQL parser from Microsoft SQL product team. (That parser was never publicly supported and it is now unavailable and has effectively been replaced by the ScriptDom parser.) You can find a thread about the older parser and the XML it produced at this thread.

So unfortunately for our ScriptDom parser we don’t have a Xml property which would neatly provide the parse tree in XML format. Instead we have to do the hard work ourselves. Actually it is not very hard – with a little bit of knowledge of the Reflection API, one can easily do this. The advantage of the reflection method is that it allows us to see the actual derived class which has been produced by the parser.

Sample application

I’ve attached a sample C# console application which does this and writes the output XML to a file on disk. I personally find this very useful to visualize the parse tree offline. You can use the free XML Notepad or XML Explorer tools to examine the resultant XML. You can then run XPath queries as well to look for patterns.

Here are some sample fragments of the XML produced by the application for the AdventureWorks procedure called uspGetManagerEmployees.

Here is the FromClause depicted in XML:

image

Here is the JOIN within the FROM clause:

image

That’s it for now. Hope you enjoyed this post, do leave your comments on the blog or get in touch with me via Twitter.

Disclaimer

This Sample Code is provided for the purpose of illustration only and is not intended to be used in a production environment.  THIS SAMPLE CODE AND ANY RELATED INFORMATION ARE PROVIDED "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE IMPLIED WARRANTIES OF MERCHANTABILITY AND/OR FITNESS FOR A PARTICULAR PURPOSE.  We grant You a nonexclusive, royalty-free right to use and modify the Sample Code and to reproduce and distribute the object code form of the Sample Code, provided that You agree: (i) to not use Our name, logo, or trademarks to market Your software product in which the Sample Code is embedded; (ii) to include a valid copyright notice on Your software product in which the Sample Code is embedded; and (iii) to indemnify, hold harmless, and defend Us and Our suppliers from and against any claims or lawsuits, including attorneys’ fees, that arise or result from the use or distribution of the Sample Code.

This posting is provided "AS IS" with no warranties, and confers no rights. Use of included script samples are subject to the terms specified at http://www.microsoft.com/info/cpyright.htm.