What do these books have in common?

CRC Standard Mathematical Tables and Formulae, Numerical Recipes in C, Seminumerical Algorithms, Applied Cryptography, The Twofish Encryption Algorithm

For starters, they are all published earlier than 1999, mostly before 1996, and in one case, 1969.  They all cost about $50 apiece.  They all have information that was probably at one point or another classified as munitions by the US government.  In my computing history, they have all led to my understanding of encryption, random numbers, and other esoteric subjects required to do some basic computing tasks.

Why do I look at them now?  Well, this is one of those cases where all the information they contain can be found on the internet in one form or another.  So, it's time to de Treasure them.  But, before they go, I thought I'd look at some of the bookmarks that I had made and see what I was interested in almost 10 years ago.

From the CRC, I find a bookmark on the page headed:



Linear Congruential generators.

Turns out, you can generate pseudo random numbers by a simple formula.  Pseudo random because they're not random at all.  You can predict exactly what the next number will be based on the current state of the calculation, and the initial state of the calculation.

Xn = aXn-1 + b (mod M)

That is, a 'random' number Xn can be generated by taking Xn -1, multiplying by 'a', adding 'b' and modulus 'M'.  It works out nicely because it's a simple multiplication, and add, and some other little thing at the end.  Really fast stuff.  Depending on what numbers you pick for a, b, M , and the initial seed, you can generate these numbers rippin fast on modern computing hardware.

What's a Mersenne prime?  I don't know, I'm sure you can look it up on the internet.  It turns out that if you use one of these as the 'M', you get really fast computations.

One concern with LCG RNGs (Linear Congruential Random Number Generators) is the periodicity.  That is, how many times can you generate a number before the sequence begins to repeat itself?  The trick is in the numbers you choose.

There are plenty of tables for what a, b, and M should be

a               b           M            Source

7(5)          0        2(31)-1    Park-Miller

131           0        2(35)       Neave

16333   25887    2(15)       Oakenfull

3432       6789     9973       Oakenfull

171            0       30269      Wichman-Hill


For those who are in the know, I'm sure these names mean much, and a hush goes over the audience when they speak.  For most of us though, the intricacies of LCG constants are lost in the bowels of the rand() function.  Most people neither know, nor care about the details.  And that my friends is the beginning of all our troubles with writing secure software.

What about shift register generators, Lagged-Fibonacci generators, and chained LCGs?  What kind of random numbers are you consuming in your applications?  Do you care about the distribution of the lower bits?  Do you care that your numbers may overflow before the modulus?  Do you care that you are losing some precision when you're flipping between int and float?

These are little tidbits from the CRC.  A book chock full of tables, formulas, and the like.  I used the CRC when I was in high school to look up log tables.  And I'm sure it's been in existance in various forms for a couple hundred years before that.

 Donald Knuth goes over the top in “Seminumerical Algorithms”.  Knuth spends an unprecedented 177 pages on the subject of random numbers, random sequences, and anything esoteric with the word 'random' in it.  Can you believe it?  177 pages?  Surely the spec, docs, code, and tests for the typical rand() function do not take up that much space.  I'd guess they typically take up no more than two pages worth of text layed end to end.

What's the point?  I don't know, I guess I'm just being a little random.  How random you can't possibly know unless you understand where I started, and under what conditions I change the subject.

As I de treasure these books, I can't help but think that most programmers using modern programming languages today, aren't likely to amass such a library, nor understand the esoteric little pieces embodied in the deep literature.  Maybe we've gone beyond the point of having to care about such things.  To a certain extent, this is true.  The general computing public does not need to care about the multipliers that are used in the LCG found deep within our rand() implementation.  But, I hope the systems programmer who wrote that code sure understood what they were doing, and I hope they didn't de treasure their library of knowledge too soon.