One of the features I would like to see in our tools for SQL Server is a T-SQL language service. This would give customers features like intellisense, text completion, refactoring (e.g. renaming a table everywhere at once), error highlighting (as in Word’s red squiggly underlining), and host of other features that C# and Visual Basic have in Visual Studio 2005. Unfortunately, implementing a language service for T-SQL requires solving some difficult problems. Below I discuss the implementation problems and some ideas I have for fixing them. It would be great to hear your thoughts too.

 

T-SQL has a large and complex syntax with features that make parsing it with conventional parsers difficult. Language constructs like selectfrom … make intellisense hard to do since you need to have the from-clause written before you can offer intellisense on the select-clause yet T-SQL encourages people to write the select clause first. The only viable solution, I can think of, is when the user has typed select to automatically add the from-clause and move the cursor there. Perhaps in a future version of T-SQL we could support a fromselect … syntax which would also be a more natural syntax. Moving users to the from-clause may be too intrusive, what do you think?

 

To allow intellisense and refactoring work well, an accurate semantic analysis of the T-SQL script needs to take place. However, T-SQL frequently refers to other databases and some parsing depends on server settings (e.g. ANSI nulls). To provide enough information for the language service to work correctly and efficiently, we will need to cache server and database related information locally in database models. Since we need to do this, it makes sense to build the language service on the modeling engine in Microsoft’s new graphical DSL framework. Some changes will be required but the modeling framework is close to a good AST framework and so it makes sense to reuse it.

 

Batch parsers are fast but for large files, they are not fast enough to support character at a time reparsing. There are ways to disguise some of this delay but ultimately intellisense becomes too slow to be usable and frustrates users. In addition, batch parsers require heavy modification to minimize the amount of change events from the object model they expose to tools. For example, imagine an explorer that lets you explore your T-SQL code like class view lets you explore your C# code. It would be unacceptable if at every key stroke all of the tables defined in the file disappeared and then reappeared in the explorer. Thus, a best practice for language services is to base them on a new incremental parser for the language. This involves a lot of work that we could use to improve other aspects of the T-SQL development experience. Is intellisense, refactoring, etc worth it?

 

A language service needs to handle errors robustly since users may be partly through typing a statement but intellisense etc should continue to work well. Batch parsers (i.e. compiler parsers) must handle large amounts of correct text quickly. Batch parsers are optimized for throughput not error recovery so adapting them for a language service is challenging. Adding to the complexity, in SQL Server 2005, T-SQL may contain imbedded XQuery expressions. For a language service parser, I would normally recommend either a recursive descent parser (for speed and great error recovery options) or an incremental GLR parser (for minimal reparsing costs, good error recovery and good ambiguity handling). However, the need to deal with multiple languages makes an incremental GLR parser the best choice. I particularly like the incremental GLR parser in Tim A. Wagner’s thesis when paired with the incremental semantic analysis in William Maddox’s thesis.

 

One of the nice things in Visual Studio 2005 is the managed language service (the wonderful new Visual Studio XML editor uses it). Provided you have a good incremental parser and an almost LR language then it is easy to build a great language service. It would be great to combine our graphical DSL framework, the managed language service and an incremental GLR parser framework to create a framework for building textual DSLs. This would allow people to create their own special purpose languages complete with intellisense, text completion, etc.

 

This posting is provided "AS IS" with no warranties, and confers no rights.