This post is part of a series that takes a deep dive into the inner workings and extensibility of the Razor parser. In this post we take a detailed look at the parse tree that the parser generates.
To learn more about how the parser works, please see the post by Andrew.
In the Razor parser, the document starts in markup mode (MarkupBlock). Depending on what comes next, the parser switches between markup mode and code mode. This post takes you through the different kinds of markup/code switches that can happen. There is also a tool you can download that generates a parse tree from Razor syntax input. The tool works for both C# and VB.
So let's dive into the building blocks.
At a high level, the parse tree is made up of Blocks, each of which identifies the kind of construct the parser is parsing. Blocks are the non-leaf nodes of the parse tree. A Block can contain other Blocks, and every branch eventually terminates in Spans.
The following diagram shows the high-level structure of the Razor parse tree.
Blocks can be of the following types:
grid.Column("Description", format:@<i>@item.Description</i>)
Blocks contain the following information:
BlockType: One of the kinds listed above
SourceLocation: The location of the character in the file where the block starts, written as (AbsoluteIndex:LineIndex,CharIndex)::Length of Block
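To make the location notation concrete, here is a minimal sketch in Python that computes and prints a location in the same shape as the sample line "Markup Block at (0:0,0)::16" shown later in this post. It is only a model: the names SourceLocation, locate and describe are my own for illustration and are not the Razor parser's API, and I am assuming that line and character indexes are zero-based because the sample starts at (0:0,0).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SourceLocation:
    """Hypothetical stand-in for a parser source location."""
    absolute_index: int  # offset from the start of the file
    line_index: int      # zero-based line number (assumption)
    char_index: int      # zero-based column within the line (assumption)


def locate(text: str, absolute_index: int) -> SourceLocation:
    """Derive line and column for an absolute offset by counting newlines."""
    before = text[:absolute_index]
    line_index = before.count("\n")
    # The column is the distance from the character after the last newline.
    char_index = absolute_index - (before.rfind("\n") + 1)
    return SourceLocation(absolute_index, line_index, char_index)


def describe(kind: str, start: SourceLocation, length: int) -> str:
    """Render a node as: <kind> at (abs:line,char)::length."""
    return (f"{kind} at ({start.absolute_index}:"
            f"{start.line_index},{start.char_index})::{length}")


text = "<p>Hello</p>\r\n@name"
print(describe("Markup Block", locate(text, 0), len(text)))
print(describe("Expression Block", locate(text, text.index("@")), len("@name")))
```

The second line printed shows why all three numbers are useful: the @ sits at absolute index 14, but at character 0 of line 1, which is what an editor needs in order to place an error marker.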
Blocks are divided into the following Spans. Think of Spans as the leaf nodes of the parse tree.
Spans contain information about the position of the span (line, column) and the content being parsed. This is useful for error reporting and for syntax highlighting in the editor.
I hope you now have a high-level idea of the structure of the parse tree. Next, I will take a sample Razor input and walk you through the parse tree that is generated for it.
Spans contain the following information:
SpanType: One of the kinds listed above
SourceLocation: The location of the character in the file where the span starts, written as (AbsoluteIndex:LineIndex,CharIndex)::Length of Span
Content: The content that is parsed as the Span
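The relationship between Blocks and Spans can be sketched as a small tree data structure. The Python below is a simplified model, not the parser's actual classes: Span, Block and leaves are names I made up for illustration, and the tree is built by hand rather than by a parser.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Union


@dataclass
class Span:
    """Leaf node: a kind plus the exact text it covers."""
    kind: str
    content: str


@dataclass
class Block:
    """Non-leaf node: a type plus child blocks and spans, in document order."""
    type: str
    children: List[Union["Block", Span]] = field(default_factory=list)


def leaves(node: Union[Block, Span]) -> Iterator[Span]:
    """Yield every span under a node, left to right."""
    if isinstance(node, Span):
        yield node
    else:
        for child in node.children:
            yield from leaves(child)


# Hand-built tree for the input: Total: @(price)
tree = Block("Markup", [
    Span("Markup", "Total: "),
    Block("Expression", [
        Span("Transition", "@"),
        Span("MetaCode", "("),
        Span("Code", "price"),
        Span("MetaCode", ")"),
    ]),
])

# In this model only spans carry text, so joining the leaves gives the input back.
print("".join(span.content for span in leaves(tree)))
```

Because only the leaves hold content in this model, the length of a block is simply the sum of the lengths of the spans beneath it.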
1 + 1 = @(1+1)
Markup Block at (0:0,0)::16
As you know, the Razor parser starts parsing in a MarkupBlock, so in this case the first block is the MarkupBlock. After creating the MarkupBlock, the parser sees that the next characters are markup, so it creates a MarkupSpan and puts all the markup content in this span.
When the parser sees @, it knows that the next characters are code, so it creates an ExpressionBlock. After creating the ExpressionBlock, the parser parses the @ as a TransitionSpan, which means that we have transitioned from markup to code. An ExpressionBlock has the form @(), so the parser parses the ( as a MetaCodeSpan. At this point the parser parses the remaining characters as a CodeSpan until it sees the terminator character ), which is parsed as a MetaCodeSpan.
After the closing MetaCodeSpan, the ExpressionBlock has nothing else to parse, and thus the parser consumes the newline character as part of the MarkupBlock.
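The walkthrough above can be sketched as a toy parser. This Python model handles only the explicit @( ... ) expression form and nothing else in Razor. All names in it (Node, parse_explicit_expressions, render) are hypothetical, and I am assuming the sample line ends with a carriage return and line feed, since that is what makes the block 16 characters long.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Node:
    label: str  # e.g. "Markup Block" or "Code Span"
    start: int  # absolute index of the first character
    text: str   # the source text this node covers
    depth: int  # nesting level, used only for indentation


def find_close(source: str, open_index: int) -> int:
    """Return the index of the ')' that balances the '(' at open_index."""
    depth = 0
    for i in range(open_index, len(source)):
        if source[i] == "(":
            depth += 1
        elif source[i] == ")":
            depth -= 1
            if depth == 0:
                return i
    raise ValueError("unterminated @( expression")


def parse_explicit_expressions(source: str) -> List[Node]:
    """Split source into markup spans and @( ... ) expression blocks."""
    nodes = [Node("Markup Block", 0, source, 0)]  # parsing starts in markup
    pos = 0
    while pos < len(source):
        at = source.find("@(", pos)
        markup_end = len(source) if at == -1 else at
        if markup_end > pos:
            nodes.append(Node("Markup Span", pos, source[pos:markup_end], 1))
        if at == -1:
            break
        close = find_close(source, at + 1)
        nodes.append(Node("Expression Block", at, source[at:close + 1], 1))
        nodes.append(Node("Transition Span", at, "@", 2))
        nodes.append(Node("MetaCode Span", at + 1, "(", 2))
        if close > at + 2:
            nodes.append(Node("Code Span", at + 2, source[at + 2:close], 2))
        nodes.append(Node("MetaCode Span", close, ")", 2))
        pos = close + 1  # back in markup mode after the closing ')'
    return nodes


def render(source: str, nodes: List[Node]) -> str:
    """Print each node as: <label> at (abs:line,char)::length."""
    lines = []
    for node in nodes:
        before = source[:node.start]
        line = before.count("\n")
        col = node.start - (before.rfind("\n") + 1)
        entry = (f"{'  ' * node.depth}{node.label} at "
                 f"({node.start}:{line},{col})::{len(node.text)}")
        if node.label.endswith("Span"):
            entry += f" {node.text!r}"  # show content for leaf nodes only
        lines.append(entry)
    return "\n".join(lines)


sample = "1 + 1 = @(1+1)\r\n"
tree = parse_explicit_expressions(sample)
print(render(sample, tree))
```

For the sample input, the sketch prints the following. This is the output of the model, and the formatting of the real tool may differ:

```
Markup Block at (0:0,0)::16
  Markup Span at (0:0,0)::8 '1 + 1 = '
  Expression Block at (8:0,8)::6
    Transition Span at (8:0,8)::1 '@'
    MetaCode Span at (9:0,9)::1 '('
    Code Span at (10:0,10)::3 '1+1'
    MetaCode Span at (13:0,13)::1 ')'
  Markup Span at (14:0,14)::2 '\r\n'
```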
If you found the above description of the parse tree generated for Razor syntax interesting, then you should download this tool, which generates the parse tree for a given Razor input.
This tool can be used for debugging your application, though I would say it is an advanced use. If you think that the parser is not parsing the input as expected, then you can use this tool to see the parse tree that gets generated and figure out what is wrong.
Screenshot of the tool
1. You need to have ASP.NET Web Pages installed on the machine.
2. If you select the “View In Browser” option, the tool generates a temp file named “test.htm”.
Hopefully this helps you understand the structure of the parse tree that is generated for Razor syntax.