In the last few posts, we examined ways of eliminating redundant expressions in basic blocks and in tree-structures of basic blocks. We examined the tree of extended basic blocks and then enhanced this algorithm to run over the immediate-dominator tree of a procedure. These algorithms all worked on acyclic regions of the control-flow graph. In this post, we're going to start examining algorithms that work on the entire control-flow graph at one time.
A particularly elegant algorithm for globally discovering redundant expressions comes froma 1988 paper by Alpern, Wegman, and Zadeck. They define a notion of congruent operations.
Definition: Let x and y be two operations in a procedure. x and y are said to be congruent if they have the same operator and their operands are congruent.
Notice that two congruent operations aren't necessarily redundant. This definition does not take control flow into account, so there may be no control-flow path that includes x and y.
In this paper, Alpern,Wegman, and Zadeck propose an algorithm to determine which operations are congruent. This algorithm is based off of Hopcroft's algorithm for minimizing finite-state machines. The idea is quite simple. To start off with, any operation that loads an immediate value is obviously congruent to any other operation that loads the same immediate value. We start by assuming that all operations the same operator are congruent and that all immediate loads are congruent only if they load the same immediate value. We create a worklist containing all of these congruence classes, and we keep trying to make the classes smaller. As usual, the control-flow graph must have this static-single assignment property that any variable use is the result of only one variable definition.
Partition all operations into classes by opcode
Put all of the classes into Worklist
While Worklist is not empty
Remove a class c from Worklist
For each class s (including classes not on Worklist)
If s uses the result of any operation in c
split s into two classes. One which contains all operations that use some operation in c,
and one which contains all other operations in s.
put the smaller of these two classes back on the worklist
When this algorithm terminates, we're guaranteed that all operations in each class are congruent.
There are a couple interesting differences between this algorithm and the previous value-numbering algorithms. For one thing, value numbering is pessimistic. Value number starts off by assuming that no expressions are redundant, and it builds up it's hash table as it find expressions in the code. This partitioning algorithm, on the other hand, is optimistic. It starts by assuming that all operations with the same operation are congruent, and then refines this assumption. Optimistic algorithms often find more opportunities then pessimistic algorithms, however one advantage of pessimistic algorithms is that they can be terminated early and still provide benefit. If this partitioning algorithm is terminated early, all of the classes computed must be thrown away.
Another key difference, is that the hash-based, value-numbering algorithms worked in an online manner. The algorithm would replace common subexpressions as it found them. This partitioning algorithm doesn't replace any expressions. It merely computes a bunch of congruency classes and leaves us with the problem of how to use this information. In the paper, Alpern, Wegman, and Zadeck suggest using the dominator information we talked about last time. Remember, that an operation x is said to dominate another operation y if every control flow path that reaches y flows through x. Using this idea, we can do the following:
for every congruence class c
for every pair of operations x and y in c
if one operation dominates the other
rewrite the dominated operation as a load of the result of the other
Others have proposed better ways of using this information.
Partitioning and hash-based value numbering are hard to compare directly, since they find very different sets of expressions. However, partitioning has some disadvantges with respect to dominator value numbering:
- In practice, partitioning can be very slow compared to value numbering.
- It's not obvious how to handle algebraic identities such as x+0 = 0+x = x. Value numbering is great at this.
- 2*a and a+a will always be in seperate classes.
In practice, dominator-based value numbering (especially after it's been improved in various ways) is often better than the partitioning method. If you're not constricted by compile times though, there's no reason why you couldn't run both.