[The sixth in a series of posts on the evolution of TransForth]

There’s beginning to be more Forth than F# in these posts! The last major piece we’re missing in the language, aside from some compile-time trickery we’ll get into later, is the standard Forth looping constructs. We’re just about done with the whole core language!

 

Recursion

 

We’re going to add the standard Forth words for structured loops, but I should mention that “recursion is the new iteration” and it’s already working. For example, we can define a plain recursive factorial function like:

 

: FAC   DUP 1 > IF DUP 1- FAC * THEN ;

 

Since words are added to the dictionary and are visible to find as soon as they’re CREATEd, we’re free to compile into them references to themselves as they’re still being compiled. Many (classic) Forth’s don’t allow this actually, but I’m taking a page from the man himself in his colorForth design.

 

Aside: Just so you know though, classic Forth flags words with a so called “smudge” bit while being compiled; hiding them from be found. The colon word HIDEs, semicolon REVEALs and in between RECURSE can be used as a self-reference. We’ll simplify things and not add this mechanism or these extra words. The only drawback is that it makes redefinition of words in terms of their old selves a little more cumbersome. For example, if we want to redefine SWAP to also print the stack (maybe for debugging), we can’t just say : SWAP SWAP .S ; as we could in classic Forth. This would be an infinite loop in our Forth! Instead we’d need to first alias the name with : OLDSWAP SWAP ; and then : SWAP OLDSWAP .S ;.

 

Indefinite Loops

 

A simple loop can be formed by BEGINUNTIL:

 

 

For example:

 

: LOOPY BEGIN 1+ DUP . DUP 9 > UNTIL ;

0 LOOPY
1 2 3 4 5 6 7 8 9 10 ok

5 LOOPY
6 7 8 9 10 ok

 

Very similar to the way IF/ELSE/THEN words work (added in the last post), these looping words are immediate and are defined in terms of the bare-metal branch words:

 

: BEGIN   HERE@ ; IMMEDIATE

: UNTIL   ' 0BRANCH , , ; IMMEDIATE

 

 

BEGIN pushes the current address to the stack for use by the matching UNTIL which appends a 0BRANCH followed by the beginning address.

 

By the way, you’re free you make infinite loops if you like with : INFINITE BEGIN … 0 UNTIL ;. Notice however, that this pair of words is something like a do { … } while(…) loop in your favorite curly brace language. It will execute once even if the condition (… 9 > UNTIL) isn’t met:

 

100 LOOPY
101 ok

 

For something more like while(…) { … } give BEGIN … WHILE … REPEAT a try:

 

 

For example now:

 

: LOOPY BEGIN DUP 10 < WHILE 1+ DUP . REPEAT ;

0 LOOPY
1 2 3 4 5 6 7 8 9 10 ok

100 LOOPY
ok

 

They’re a bit more complicated, but WHILE and REPEAT too are built with the same basic technique.

 

: WHILE   ' 0BRANCH , HERE@ 0 , ; IMMEDIATE

: REPEAT   ' BRANCH , HERE@ 1+ SWAP ! , ; IMMEDIATE

 

WHILE prepares to branch out of the loop if the condition isn’t met and REPEAT unconditionally branches back to the beginning, but first fills in WHILE’s dummy address.

 

You can also more explicitly exit a loop with a LEAVE (presumably inside an IF). Here’s an example:

 

: LOOPY BEGIN 1+ DUP 10 > IF LEAVE THEN DUP . REPEAT ;

 

LEAVE branches out of the loop and does a little tricky swap to get the (expected) following ELSE/THEN and REPEAT to update the correct branches.

 

: LEAVE   ' BRANCH , HERE@ SWAP 0 , ; IMMEDIATE

 

Painstaking Example

 

Let’s walk through a sample in detail to make sure we have the compile-time behavior correct.

 

: TEDIOUS   BEGIN FOO IF BAR LEAVE THEN BAZ REPEAT ;

 

 

First, colon (:) creates a new word called “TEDIOUS” and switches to compiling mode. Then, BEGIN pushes the current address (say 0100) to the stack but doesn’t append anything to the dictionary. Then FOO is appended by the outer interpreter and IF (defined in the last post) appends a 0BRANCH, pushes the current address (102) and appends a dummy zero address. Next, BAR is appended. So far we have 100 and 102 on the stack and the following in the dictionary:

 

0100    FOO
0101    0BRANCH
0102    0
(becomes 106)

0103    BAR

 

Now LEAVE is a little tricky. It appends a BRANCH and pushes the current address (105) but, to trick the following THEN and REPEAT into doing the right thing, it swaps the addresses before appending a dummy zero. So we have the following added to the dictionary:

 

0104    BRANCH
0105    0
(becomes 109)

And have 100 (BEGIN), 105 (LEAVE), and 102 (IF) on the stack in that order. Next, THEN (also defined in the last post) updates the value at 102 (IF) with the current address (106). BAZ is appended:

 

0106    BAZ

Finally REPEAT finishes things off; appending a BRANCH, updating the value at 105 (LEAVE) with just beyond the end address (109) and appends 100 (BEGIN).

 

0107    BRANCH
0108    100
0109   


Now pretend you’re an inner interpreter and step through that compiled code. It works!

 

Return Stack

 

If you took the time to walk through the painstaking example above, then you probably need a break from thinking about loops. Let’s go on a little tangent (which will relate to the next section on DO … LOOPs I assure you).

 

All this time we’ve been dealing with “the stack” but in a standard Forth there are actually two stacks. The data stack we have.  We’re missing the return stack. Remember our inner interpreter?

 

let interpret w () =

    let r = i

    i <- (index w).Def

    while mem.[i] <> -1 do

        let d = index mem.[i]

        i <- i + 1

        d.Code()

    i <- r

 

We’re cheating and holding the return address in a local (r) in the call stack frame as we recurse into d.Code(). A real Forth doesn’t rely on such F#-world niceties. Instead, we should be maintaining our own stack. Notice that all we need is a stack of return pointers, not of whole frames as in most languages. It’s precisely this light-weight approach that makes refactoring into many tiny functions viable (nearly free) in Forth.

 

let mutable rstack = []

let rpush value = rstack <- value :: rstack

let rpop () = match rstack with h :: t -> rstack <- t; h | _ -> underflow ()

 

let interpret w () =

    rpush i

    i <- (index w).Def

    while mem.[i] <> -1 do

        let d = index mem.[i]

        i <- i + 1

        d.Code()

    i <- rpop ()

 

It would be very un-Forth-like to leave this as some hidden internal mechanics. The following are standard words for moving values to (>R) and from (R>) the return stack as well as to peek (R@) at them.

 

define ">R" (fun () -> pop () |> rpush)

define "R>" (fun () -> rpop () |> push)

define "R@" (fun () -> push rstack.Head)

 

A lot of craziness can be accomplished by playing with the return stack. Perhaps in another post we’ll talk about manually affecting tail recursion. That kind of thing is not very common, but it is routine to use the return stack as a temporary workspace. This is safe to do as long as you leave things the way you found them by the time your word exits (otherwise you can break the inner interpreter).

 

As an example, think of how you’d write a word to solve the quadratic formula ax² + bx + c. Even after factoring out to x(ax + b) + c and assuming an ideal parameter order (c b a x) it’s not easy to do without a lot of stack manipulation:

 

: QUADRATIC ( c b a x -- n) DUP -ROT * ROT + * + ;

 

The main issue is the need to tuck x away to use twice. The return stack can help:

 

: QUADRATIC ( c b a x -- n) >R R@ * + R> * + ;

 

It’s not any more concise really, but is perhaps clearer.

 

Definite Loops

 

Back to loops. The final form of loop we’ll be adding is a DO … LOOP:

 

 

This is a counted loop that iterates from one to the other of a pair of numbers from the stack.

 

: DECADE   10 0 DO I . LOOP ;


DECADE
0 1 2 3 4 5 6 7 8 9 ok

 

The I word we’ve been using is for easy access to the loop counter. You can nest loops as well and use J and K.

 

You’re free to make use of the data stack within the loop of course:

 

: MULTIPLICATIONS 11 1 DO DUP I * . LOOP DROP ;

7 MULTIPLICATIONS
7 14 21 28 35 42 49 56 63 70 ok

 

But wait, how does that work? These DO and LOOP words have runtime behavior after all. The others we’ve talked about were compile-time only and could get away with using the stack without interfering with runtime use of the same stack. So where are the count and limit values for the DO … LOOP being kept while you’re messing with the stack? Ah, but of course! This is why we needed to expose the return stack. Here are the definitions:

 

: DO   HERE@ ' >R , ' >R , ; IMMEDIATE

: LOOP   ' R> , ' R> , ' 1+ , ' 2DUP , ' = , ' 0BRANCH , , ' 2DROP , ; IMMEDIATE

 

Notice the that only word executed at compile-time is the HERE@ (and the ticks and commas of course). Everything else is being compiled into the definitions to be executed at runtime. We’re used to the technique by now. DO pushes the current address to be later picked up by LOOP (at compile time). DO also appends runtime words to stow away the count and limit values on the return stack. Nested loops just bury these values deeper. Later, LOOP restores them, increments the count, compares with the limit and branches back until they become equal. The definition is pretty funny looking with all the ticks (') and commas (,). We’ll develop a better way to handle this kind of thing in the future.

 

The I, J and K words are easily defined. All they’re doing is peeking at the return stack. Technically we could define them in Forth, but I think F# is feeling lonely in this post (notice we’ve only added a few lines of F#!):

 

let counter name offset = define name (fun ()-> List.nth rstack offset |> push)

counter "I" 1

counter "J" 3

counter "K" 5

 

Give nested loops a try now:

 

: NESTED   5 1 DO 5 1 DO I J * . LOOP LOOP ;

NESTED
1 2 3 4 2 4 6 8 3 6 9 12 4 8 12 16 ok

 

One last kind of loop is a DO … +LOOP in which you can supply a counter increment, which may even be negative to count backward:

 

: +LOOP   ' R> , ' R> , ' ROT , ' + , ' 2DUP , ' = , ' 0BRANCH , , ' 2DROP , ; IMMEDIATE

: COUNTDOWN 0 100 DO I . -10 +LOOP ;

COUNTDOWN
100 90 80 70 60 50 40 30 20 10 ok

 

Tests

 

case "1 2 3 .S >R >R >R R@ . R> . R> . R> ." "1 2 3 1 1 2 3 " // return stack operators

case ": FAC   DUP 1 > IF DUP 1- FAC * THEN ; 7 FAC . FORGET FAC" "5040 " // recursive definition

case ": QUADRATIC ( a b c x -- n) >R SWAP ROT R@ *  +  R> *  + ; 2 7 9 3 QUADRATIC . FORGET QUADRATIC" "48 " // taken from Starting Forth, Pg 100

case ": LOOPY BEGIN 1+ DUP . DUP 9 > UNTIL ; 0 LOOPY 5 LOOPY 100 LOOPY FORGET LOOPY" "1 2 3 4 5 6 7 8 9 10 6 7 8 9 10 101 " // BEGIN ... UNTIL

case ": LOOPY BEGIN DUP 10 < WHILE 1+ DUP . REPEAT ; 0 LOOPY 5 LOOPY 100 LOOPY FORGET LOOPY" "1 2 3 4 5 6 7 8 9 10 6 7 8 9 10 " // BEGIN ... WHILE ... UNTIL

case ": LOOPY BEGIN 1+ DUP 10 > IF LEAVE THEN DUP . REPEAT ; 0 LOOPY 5 LOOPY 100 LOOPY FORGET LOOPY" "1 2 3 4 5 6 7 8 9 10 6 7 8 9 10 " // BEGIN ... IF ... LEAVE ... THEN ... UNTIL

case ": DECADE   10 0 DO I . LOOP ; DECADE FORGET DECADE" "0 1 2 3 4 5 6 7 8 9 " // DO ... LOOP

case ": MULTIPLICATIONS 11 1 DO DUP I * . LOOP DROP ; 7 MULTIPLICATIONS FORGET MULTIPLICATIONS" "7 14 21 28 35 42 49 56 63 70 " // DO ... LOOP with stack work

case ": NESTED   5 1 DO 5 1 DO I J * . LOOP LOOP ; NESTED FORGET NESTED" "1 2 3 4 2 4 6 8 3 6 9 12 4 8 12 16 " // nested DO ... LOOPs

case ": COUNTDOWN 0 100 DO I . -10 +LOOP ; COUNTDOWN FORGET COUNTDOWN" "100 90 80 70 60 50 40 30 20 10" // +LOOP

 

Next>