ReuvenLax

Extending Value Numbering 2

In the last post we extended the value-numbering algorithm to extended basic blocks.  This is generally a win, but still leaves us wanting a bit more.  Consider the following contrived program:

x = y + z         B1

if x == 0
    a = y + z     B2
else
    b = y + z     B3

c = y + z         B4

This program contains 4 basic blocks as marked above.  The control-flow graph for this program presents three extended basic blocks:  B1,B2  B1, B3  and B4.  If we apply value numbering to these extended basic blocks, we improve the program to the following

x = y + z

if x == 0
    a = x
else
    b = x

c = y + z   

An improvement surely, but what about block B4?  B4 has both B2 and B3 as predecessors, so we can't include it in any extended block bigger than itself.  However, it's obvious from inspection that the expression y+z evaluated in B4 is redundant.  No matter which branch of the if statement is taken, y+z has already been evaluated by the time B4 is reached.

In order to optimize B4, we need to generalize our concept of extended basic blocks a bit.  We mentioned last time that the key property of an extended basic block B1, B2, ..., Bn (at least for value numbering) is that whenever some Bi executes it's guaranteed that all previous blocks in the list have already executed in order.  We can formalize this property a bit more by introducing dominators:

 Definition:  Let G be a control-flow graph with start node s.  Given two nodes x and y in G, x is said to dominate y if every path from s to y must pass through x.

We've now formalized the key property of extended basic blocks.  Each of the blocks is dominated by all predecessors in the extended basic block.  By the time any block has executed, all it predecessors have executed in order. 

Let's introduce one more idea. that of immediate dominators:

Definition:  Let G be a control-flow graph with start node s, and let x be an arbitrary node in G that is not s.  The immediate dominator of x, referred to as IDOM(x), is the unique dominator of x that is not x itself with the property that any other dominator of x, not including x, dominates IDOM(x).  The nodes of G form a tree structure rooted at s such that the parent of any node x is IDOM(x).

Each node in an extended basic block is the immediate dominator of the next node in the extended block.  This means that every extended basic block appears somewhere in the immediate-dominator tree as a subtree.  So, if we apply our scoped hash table algorithm from last time to the immediate-dominator tree, we should get all the benefit of extended-basic block value numbering, and then some!

In our example above, the immediate-dominator tree contains paths      B1, B2  B1, B3 and B1, B4.  Running our scoped hash table value numbering over this gives us the following

x = y + z

if x == 0
    a = x
else
    b = x

c = x

eliminating all common subexpressions.

In reality, value numbering over immediate-dominator trees has other complications due to the static single assignment form I mentioned previously.  If you're interested in this, take a look at the paper by Briggs, Cooper, and Simpson http://citeseer.ist.psu.edu/329118.html

Tomorrow we'll move on to global methods of finding common expressions.

Published Thursday, December 15, 2005 7:28 PM by ReuvenLax
Anonymous comments are disabled

© 2009 Microsoft Corporation. All rights reserved. Terms of Use  |  Trademarks  |  Privacy Statement
Microsoft
Page view tracker