This post provides a simple introduction to customizing the default output projection that MGrammar produces when input is parsed against a grammar. See the attached sample for the final version of the grammar and some corresponding legal input.

Consider the following MGrammar:

module MLanguage
{
    language Scrum
    {
        // Keywords
        token kwYesterday = "Yesterday";
        token kwToday = "Today";   
        token kwBlocked = "Blocked";
        token kwNotBlocked = "Not Blocked";
       
        token TaskStart = "*" " "*;
        token TaskText = ^"."+;
        token TaskEnd = ".";
        nest syntax Task = TaskStart TaskText TaskEnd;
       
        syntax Yesterday = kwYesterday Task+;
        syntax Today = kwToday Task+;
        syntax Status = kwBlocked | kwNotBlocked;
       
        syntax Main = Yesterday Today Status;
       
        interleave Skippable = " " | "\r" | "\n";
    }
}

Given the following input:

Yesterday
* Some work.
Today
* Some work.
* More work.
Blocked

The following projection is generated by MGrammar:

Main[
  Yesterday[
    "Yesterday",
    [
      Task["* ", "Some work", "."]
    ]
  ],
  Today[
    "Today",
    [
      Task["* ", "Some work", "."],
      Task["* ", "More work", "."]
    ]
  ],
  Status["Blocked"]
]

For successfully parsed input, MGrammar generates a default projection as a labeled directed graph. The graph contains labeled non-leaf nodes (one for each syntax rule that matched) and leaf text nodes (one for each piece of input that matched a token rule). Each non-leaf node can have zero or more child nodes, each of which is either a leaf or another non-leaf node.
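To make that shape concrete, here is a minimal Python sketch that models the default projection shown above as a tree. The `Node` class and the `leaves` helper are hypothetical names invented for this illustration and are not part of MGrammar; the unlabeled bracket pairs in the output are modeled as nodes whose label is `None`.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Union


@dataclass
class Node:
    # None models an unlabeled node (a bare [...] in the projection).
    label: Optional[str]
    # Children are either nested nodes or leaf text (plain strings).
    children: List[Union["Node", str]] = field(default_factory=list)


def leaves(node):
    """Collect all leaf text under a node, in input order."""
    out = []
    for child in node.children:
        if isinstance(child, str):
            out.append(child)
        else:
            out.extend(leaves(child))
    return out


# The default projection for the sample input, transcribed by hand.
default_projection = Node("Main", [
    Node("Yesterday", [
        "Yesterday",
        Node(None, [Node("Task", ["* ", "Some work", "."])]),
    ]),
    Node("Today", [
        "Today",
        Node(None, [
            Node("Task", ["* ", "Some work", "."]),
            Node("Task", ["* ", "More work", "."]),
        ]),
    ]),
    Node("Status", ["Blocked"]),
])

print(leaves(default_projection))
```

Reading the leaves back in order gives every token that matched in the input, which is what makes the default projection a complete record of the parse.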

The structure and content of the default projection are the "kitchen sink": everything is included. If you don't like the default, MGrammar provides a variety of simple tools for reshaping the projection, including adding, removing, and relabeling non-leaf nodes.

Projection is performed at the syntax production level; you customize the default by creating a right-hand side for the production, using the => operator, and defining your own node:

...
nest syntax Task = TaskStart TaskText TaskEnd => Task[];
...

Here, the Task syntax rule's default projection is overridden with a custom Task node. The projection now looks like the following:

Main[
  Yesterday[
    "Yesterday",
    [
      Task[]
    ]
  ],
  Today[
    "Today",
    [
      Task[],
      Task[]
    ]
  ],
  Status["Blocked"]
]

Each custom Task node is empty because no child nodes have been specified. One way to specify child nodes is to bind an expression in the production to a variable that you reference from the new node, like the variable binding t in the following:

...
nest syntax Task = TaskStart t:TaskText TaskEnd => Task[t];
...

In this example, the variable t is referenced from the new node to produce the following projection:

Main[
  Yesterday[
    "Yesterday",
    [
      Task["Some work"]
    ]
  ],
  Today[
    "Today",
    [
      Task["Some work"],
      Task["More work"]
    ]
  ],
  Status["Blocked"]
]
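The difference between the default Task projection and the one built from the bound variable can be sketched in Python. This is only a model of the behavior described above, not MGrammar itself: the regular expression is an assumed stand-in for the TaskStart, TaskText, and TaskEnd tokens, and both function names are hypothetical.

```python
import re

# Assumed stand-in for: TaskStart ("*" then spaces), TaskText (one or
# more characters that are not "."), and TaskEnd (".").
TASK = re.compile(r"(\* *)([^.]+)(\.)")


def project_task_default(line):
    """Model of the default projection: every matched token is a child."""
    start, text, end = TASK.fullmatch(line).groups()
    return ("Task", [start, text, end])


def project_task_bound(line):
    """Model of `t:TaskText ... => Task[t]`: only the bound token is kept."""
    _start, t, _end = TASK.fullmatch(line).groups()
    return ("Task", [t])


print(project_task_default("* Some work."))
print(project_task_bound("* Some work."))
```

Both functions match the same input; only the set of children passed to the new node differs. The sketch assumes well-formed input and would raise an error on a line that does not match.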

As you can see, by selectively including the nodes we want in the new node, we are also selectively excluding others, such as the "*" and "." text. These nodes are for tokens that are required in the input, to help the user write it, but aren't required in the projection.

Let's use the same technique to get rid of other superfluous nodes, namely the "Yesterday" and "Today" keyword text:

...
nest syntax Task = TaskStart t:TaskText TaskEnd => Task[t];
       
syntax Yesterday = kwYesterday t:Task+ => Yesterday[t];
syntax Today = kwToday t:Task+ => Today[t];
...

This produces the following projection:

Main[
  Yesterday[
    [
      Task["Some work"]
    ]
  ],
  Today[
    [
      Task["Some work"],
      Task["More work"]
    ]
  ],
  Status["Blocked"]
]

In the projection, you can see two nodes (children of the Yesterday and Today nodes, respectively) that don't have a label. By default, MGrammar produces a new anonymous node for each expression that uses repetition (e.g. ?, +, and *). This node is needed to contain the zero or more child nodes that could match. In our example, the Task nodes are surrounded by an unlabeled node that is generated by the t:Task+ expression of the Yesterday and Today productions:

...
nest syntax Task = TaskStart t:TaskText TaskEnd => Task[t];
       
syntax Yesterday = kwYesterday t:Task+ => Yesterday[t];
syntax Today = kwToday t:Task+ => Today[t];
...

We don't need an anonymous node in this case, as the surrounding Yesterday[...] and Today[...] nodes provide enough containment. We can remove the anonymous node from the output by telling MGrammar to take the child nodes from the anonymous node and make them children of the parent Yesterday and Today nodes instead. We can do this using valuesof in the custom nodes:

...
syntax Yesterday = kwYesterday t:Task+ => Yesterday[valuesof(t)];
syntax Today = kwToday t:Task+ => Today[valuesof(t)];
...
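The effect of valuesof on the anonymous node can be modeled with a short Python sketch. Nodes are represented here as `(label, children)` tuples with `None` as the label of an anonymous node; this representation and the Python `valuesof` function are assumptions made for illustration, based only on the behavior described above.

```python
def valuesof(node):
    """Model of valuesof: return a node's children without the node itself."""
    _label, children = node
    return list(children)


# The anonymous node that t:Task+ produces for the Today rule.
t = (None, [("Task", ["Some work"]), ("Task", ["More work"])])

# Today[t] keeps the anonymous node as a single child.
nested = ("Today", [t])

# Today[valuesof(t)] makes the Task nodes direct children of Today.
flattened = ("Today", valuesof(t))

print(nested)
print(flattened)
```

The Task nodes themselves are unchanged; only the layer of containment between them and the Today node is removed.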

We can also replace a node with its child node. For example, if we decided we didn't need the Status node, we could simply include just the status value ("Blocked" or "Not Blocked"), like so:

...
syntax Main = y:Yesterday t:Today s:Status => Main[y, t, valuesof(s)];
...

The same effect can be created without using valuesof:

...
syntax Status = s:kwBlocked => s | s:kwNotBlocked => s;
...

Here, each production of the Status rule generates a leaf node whose parent will be the node for any production that references the Status rule. In this case, that's the Main rule. Of course, Main is not the right label for the output, which is really Scrum data. To change a node's label, you can create a custom node with the desired label:

...
syntax Main = y:Yesterday t:Today s:Status => Scrum[y, t, s];
...

This produces the following:

Scrum[ ... ]

So far, the values of our node labels have been hard-coded, even though the labels of the Yesterday and Today nodes are really the same as the values of the kwYesterday and kwToday tokens. In this case, we can use the id operator to get the values of the keyword tokens instead of hard-coding them:

...
syntax Yesterday = kw:kwYesterday t:Task+ => id(kw)[valuesof(t)];
syntax Today = kw:kwToday t:Task+ => id(kw)[valuesof(t)];
...

This produces the same projection as before:

Scrum[
  Yesterday[
    Task["Some work"]
  ],
  Today[
    Task["Some work"],
    Task["More work"]
  ],
  "Blocked"
]

What's nice about this is the extra layer of abstraction provided by id: if you change the values of the keyword tokens, the corresponding node labels change automatically. For example, given the following grammar:

...
token kwYesterday = "WhatIDidYesterday";
token kwToday = "WhatImDoingToday"; 
...

and the following input:

WhatIDidYesterday
* Some work.
WhatImDoingToday
* Some work.
* More work.
Blocked

The labels of the corresponding nodes are automatically updated without requiring the right-hand sides for the Yesterday and Today rules to be updated:

Scrum[
  WhatIDidYesterday[
    Task["Some work"]
  ],
  WhatImDoingToday[
    Task["Some work"],
    Task["More work"]
  ],
  "Blocked"
]
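The idea behind id, taking the node label from the matched keyword text rather than from a literal in the rule, can be modeled in a few lines of Python. The function name `project_section` and the tuple representation of nodes are hypothetical; this is a sketch of the behavior described above, not of how MGrammar implements it.

```python
def project_section(kw, tasks):
    """Model of `kw:... t:Task+ => id(kw)[valuesof(t)]`.

    The label comes from the matched keyword text, so the projection rule
    does not need to change when the keyword token changes.
    """
    return (kw, list(tasks))


tasks = [("Task", ["Some work"]), ("Task", ["More work"])]

# Same projection rule, two different keyword token values.
print(project_section("Today", tasks))
print(project_section("WhatImDoingToday", tasks))
```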

As you can see, a few simple tools enable you to manipulate the shape of the output projection to your liking. As with anything you see discussed on this blog, we'd love to hear your feedback on this topic, via the "Oslo" Connect site.