I talked about Rijndael in a couple of previous posts: Efficient multiplication and division in GF(28), Sieving irreducible monic polynomials over a finite field, Addition and multiplication table for GF(22).

I'm going to talk some more about it today.

One of the more interesting steps used in the Rijndael transformation is the non-linear S-box. This is a function S(a) that takes an element of GF(28) (which could be represented as a byte) and returns another one. It is invertible but non-linear.

The spec defines S(a) in terms of two other invertible functions g(a) and f(a). In particular S(a) = f(g(a)). It follows that S-1(a) = g-1(f-1(a)).

g(a) is the non-linear piece, and is quite simple:
g(00) is defined as 00.
g(a) = a-1 for all other a in GF(28). In particular, if a = 03b, then g(a) = 03255 - b.
This is clearly invertible; in fact, it is its own inverse.

Daemen and Rijmen seem to have been almost embarrassed by the simplicity of g so they introduced f. To quote from section 3.4.1:

By definition, g has a very simple algebraic expression. This could allow algebraic manipulations that can be used to mount attacks such as interpolation attacks. Therefore, we built the S-box as the sequence of g and an invertible affine transformation f.

I don't know if I buy the "simplicity" argument. It seems to me that if Rijndael, without f, is robust, then it's robust. And if you add f to a non-robust scheme, I don't understand how that makes it robust; I do see that it complicates analysis, but that seems like a drawback rather than an advantage.

But I'll play along for now. What is f?

b = f(a) is defined using the following matrix multiplication over GF(2) (each entry can be represented as a bit; a row or column as a byte.) Multiplication can be implemented as bitwise AND, and addition by bitwise XOR.

In particular, note that f(00) = 63. So S(00) = f(g(00)) = f(00) = 63. So S-1(63) = 00.

The reference implementation in the book uses hardcoded 256-byte lookup tables for S(a) and S-1(a). I wrote a Perl script which generates these lookup tables and prints them out. This script is attached.

Here's its output, which matches the listing in the book:

>perl -w s-box.pl
                          S(xy)
   | x0 x1 x2 x3 x4 x5 x6 x7 x8 x9 xa xb xc xd xe xf
---+------------------------------------------------
0y | 63 7c 77 7b f2 6b 6f c5 30 01 67 2b fe d7 ab 76
1y | ca 82 c9 7d fa 59 47 f0 ad d4 a2 af 9c a4 72 c0
2y | b7 fd 93 26 36 3f f7 cc 34 a5 e5 f1 71 d8 31 15
3y | 04 c7 23 c3 18 96 05 9a 07 12 80 e2 eb 27 b2 75
4y | 09 83 2c 1a 1b 6e 5a a0 52 3b d6 b3 29 e3 2f 84
5y | 53 d1 00 ed 20 fc b1 5b 6a cb be 39 4a 4c 58 cf
6y | d0 ef aa fb 43 4d 33 85 45 f9 02 7f 50 3c 9f a8
7y | 51 a3 40 8f 92 9d 38 f5 bc b6 da 21 10 ff f3 d2
8y | cd 0c 13 ec 5f 97 44 17 c4 a7 7e 3d 64 5d 19 73
9y | 60 81 4f dc 22 2a 90 88 46 ee b8 14 de 5e 0b db
ay | e0 32 3a 0a 49 06 24 5c c2 d3 ac 62 91 95 e4 79
by | e7 c8 37 6d 8d d5 4e a9 6c 56 f4 ea 65 7a ae 08
cy | ba 78 25 2e 1c a6 b4 c6 e8 dd 74 1f 4b bd 8b 8a
dy | 70 3e b5 66 48 03 f6 0e 61 35 57 b9 86 c1 1d 9e
ey | e1 f8 98 11 69 d9 8e 94 9b 1e 87 e9 ce 55 28 df
fy | 8c a1 89 0d bf e6 42 68 41 99 2d 0f b0 54 bb 16

                        S^(-1)(xy)
   | x0 x1 x2 x3 x4 x5 x6 x7 x8 x9 xa xb xc xd xe xf
---+------------------------------------------------
0y | 52 09 6a d5 30 36 a5 38 bf 40 a3 9e 81 f3 d7 fb
1y | 7c e3 39 82 9b 2f ff 87 34 8e 43 44 c4 de e9 cb
2y | 54 7b 94 32 a6 c2 23 3d ee 4c 95 0b 42 fa c3 4e
3y | 08 2e a1 66 28 d9 24 b2 76 5b a2 49 6d 8b d1 25
4y | 72 f8 f6 64 86 68 98 16 d4 a4 5c cc 5d 65 b6 92
5y | 6c 70 48 50 fd ed b9 da 5e 15 46 57 a7 8d 9d 84
6y | 90 d8 ab 00 8c bc d3 0a f7 e4 58 05 b8 b3 45 06
7y | d0 2c 1e 8f ca 3f 0f 02 c1 af bd 03 01 13 8a 6b
8y | 3a 91 11 41 4f 67 dc ea 97 f2 cf ce f0 b4 e6 73
9y | 96 ac 74 22 e7 ad 35 85 e2 f9 37 e8 1c 75 df 6e
ay | 47 f1 1a 71 1d 29 c5 89 6f b7 62 0e aa 18 be 1b
by | fc 56 3e 4b c6 d2 79 20 9a db c0 fe 78 cd 5a f4
cy | 1f dd a8 33 88 07 c7 31 b1 12 10 59 27 80 ec 5f
dy | 60 51 7f a9 19 b5 4a 0d 2d e5 7a 9f 93 c9 9c ef
ey | a0 e0 3b 4d ae 2a f5 b0 c8 eb bb 3c 83 53 99 61
fy | 17 2b 04 7e ba 77 d6 26 e1 69 14 63 55 21 0c 7d

And here's the interesting part, the implementation of g(a) and f(a):

sub g($) {
    my ($a) = shift;

    # g(0) we define to be 0
    return 0 unless $a;

    # otherwise g(a) = a^(-1)
    # a = (x + 1)^loga
    # so a^(-1) = (x + 1)^(-loga) = (x + 1)^(255 - loga)
    my $loga = $log_xplusone_of[$a];
    my $logb = 255 - $loga;
    $logb -= 255 if $logb >= 255;

    return $xplusone_to[$logb];
}

# f(a) = b is defined as follows:
#
# [ b7 ]   ( [ 1 1 1 1 1 0 0 0 ] [ a7 ] )   [ 0 ]
# [ b6 ]   ( [ 0 1 1 1 1 1 0 0 ] [ a6 ] )   [ 1 ]
# [ b5 ]   ( [ 0 0 1 1 1 1 1 0 ] [ a5 ] )   [ 1 ]
# [ b4 ] = ( [ 0 0 0 1 1 1 1 1 ] [ a4 ] ) + [ 0 ]
# [ b3 ]   ( [ 1 0 0 0 1 1 1 1 ] [ a3 ] )   [ 0 ]
# [ b2 ]   ( [ 1 1 0 0 0 1 1 1 ] [ a2 ] )   [ 0 ]
# [ b1 ]   ( [ 1 1 1 0 0 0 1 1 ] [ a1 ] )   [ 1 ]
# [ b0 ]   ( [ 1 1 1 1 0 0 0 1 ] [ a0 ] )   [ 1 ]
#
# where the + is XOR
sub f($) {
    my ($a) = @_;

    # start with the addition
    my $b = 0x63; # 0b01100011;

    # do the matrix multiplication
    # one matrix column at a time
    for (
        my ($c, $i) = (0x8f, 0x80); # 0b10001111, 0b10000000
        $i;
        $c = rotate_byte_right($c), $i >>= 1
    ) {
        # i is used to select a bit out of the a column
        # c is the matrix column which is multiplied by that bit
        # the resulting product influences the eventual b column

        # printf("%02x %02x\n", $c, $i);

        # if this bit in the a column is 0, all of the products are 0, so don't bother
        next unless $a & $i;

        # this bit in the a column is 1
        # so XOR b with the matrix column
        $b ^= $c;
    }

    return $b;
}