This type of algorithm is known as Linear Congruential Random Number Generator or LCRNG for short. It consists of three or more fixed constants, usually large prime numbers, which have certain mathematical properties.
Bad Seeds
In a general way, a bad seed is one that can be guessed after a relatively small number of tries or in a short interval of time. If a hacker (or an Adversary, to use a cryptography term) can predict the value of the constants used in your random algorithm, for instance the values of a, b, c and d from Listing 1, the seed can be computed and the whole scheme is compromised.
A typical example of this is some earlier implementations of SSL key generation in web browsers. They would encrypt the message in a large sequence of random numbers.
Some of those seeds generators would use the time of the day as a means of obtaining the values for the constants used in their algorithm.
The basic algorithm can usually be reversed-engineered using a disassembler and the time of the day can be known to within a certain precision often better than a second, which means that there are only one million possible choices for the microsecond part (and often a lot less because the clocks on most computer systems do not have a true microsecond resolution). See Table 1 for the resolution of some Windows API functions.
| Function | Unit | Resolution |
| Now, Time, Timer | seconds | 1 second |
| GetTickCount | milliseconds | 10 ms |
| TimeGetTime | milliseconds | 10 ms |
| QueryPerformanceCounter | milliseconds | < 1 ms |
Table 1: Different time accuracy offered by Windows.
The values shown in Table 1 do not take into account the overheads of calling those functions. For example, it takes 5 to 10us (micro seconds) to call QueryPerformanceCounter, because it has to do several port I/O instructions to read the computer’s clock counter.
Note: If you decide to use QueryPerformanceCounter remember that it doesn't run at the same rate on all machines,
Another good example of a flawed seed generator algorithm is one that uses the process id (or PID) of an application. You should never consider PID’s a secret. These PID’s are usually present in message packets from the system.
You can see that generating a good random number is very hard, and as Donald Knuth once said, “A random number generator should not be chosen at random”. A good random process is not just a complicated or an obscure procedure. Another famous quote says “compromise of the system should not inconvenience the correspondents”. That means the system should remain secure even if all the details about its inner workings, down to the actual source code, are known to all.
Entropy = Randomness
In general the importance of generating a good sequence of random numbers is a bit more relaxed when we are working with mathematical or scientific applications given that we only need to fool our simulations or modeling programs into thinking that true random numbers have been used. This approach is not possible (or at least not wise) when writing security applications (to fool a strong-willed hacker or the NSA you need to do a lot better than simply shuffling your numbers).
Pseudo random numbers as the name itself implies are not really random. They just look random. As we saw earlier they are pseudo (or knowable) because they come from a mathematical function.
To the increase the randomness of our selected numbers we should collect as much entropy as possible before using a system. Anything that is determined by external factors can be and should be used as input, such as the time between keystrokes, the timing of disk interrupts, number of network packages arrived, and the like. On a multi-user system, the number of page faults, the number of disk read/writes, the time to wait until eligible to get the next time slice, and other such information that depends on the overall activity in a manner that is hard to predict or precisely control.
Typically a cryptographically sound “message digest”, such as MD5 or SHA (the Secure Hash Algorithm) is computed over a entropy pool and used as your next random number or seed.
It’s a good idea to have a tenth to a fifth as much entropy as the length of the key. For example if you are generating a 1024-bit key, it’s a good idea to have between 100 and 200 bits of entropy. More of course is always better but as you would expect it’s also slower. Sources of entropy usually carry a significant overhead.
Note: By one bit of entropy I mean the equivalent of a true random event with two, and only two, possible outcomes (such as a coin-flip for example).
There are many ways to get true random numbers. Some methods include making hardware devices that generate noise, observing cosmic ray flux, and observing light emissions from trapped mercury atoms. They're great in theory and in some sorts of practice, and there are some very high-quality random generator chips out there but most of the time we have to work with existing equipment.
Here is a list of ways to get entropic numbers:
- User input: A few examples are time between keystrokes, mouse movements etc. Users are among the most entropic (meaning unpredictable) things there are. The obvious drawbacks are that it takes time to get entropic responses from the user, and if you have no user (in the case of a server), this doesn't help. Nonetheless, if you do have one, you should use user inputs, as these are the hardest for the adversary to acquire or spoof.
- Hardware: If you have no user, you have to use your hardware for a stream of entropy. This is annoying, but not impossible. Some methods you might use are: clock, hard disk, video etc. Most of these devices will give you a source of nearly true random data if you know how/where to get them.
- Network: There are all sorts of things on a network that are unpredictable. The downside of these methods is that they are also accessible to other people as they are to you.
Microsoft Cryptography API
Starting from PIII processors Intel has added support for a Security Driver that provides software applications the ability to access the Firmware Hub's hardware Random Number Generator. This RNG is based on sampling the thermal noise in resistors. It uses SHA-1 as mixing function for the outputs. It also runs some FIPS 140-1 autotests.
A number of Microsoft Operating Systems are currently supported by Intel and you can take advantage of this feature using the Microsoft’s Cryptography API.
To start using this RNG in your programs you will have to first download the driver from Intel and then read the directions presented in the CryptoAPI documentation in the Platform SDK.
Some people consider this the ultimate solution. Others are afraid because all the entropy comes from one source (albeit a good one).
Note: In the SDK you can find some good example on how to use the CryptoAPI.
I will stop now and leave the testing of randomness for the next posting when we will look into makes a sequence of numbers a good or a bad apporximation of a "random" sequence.
Further Reading
• Using and Creating Cryptographic - Quality Random Numbers:
www.merrymeet.com/jon/usingrandom.html
• random.org:
http://www.random.org/essay.html.
• Generating a Truly Random Number by Leif Svalgaard:
http://cobolreport.com/columnists/leif/
• Use Query Performance Counterto Time Code:
http://support.microsoft.com/default.aspx?scid=http://support.microsoft.com:80/support/kb/articles/Q172/3/38.asp&NoWebContent=1
• The Cryptography API, or How to Keep a Secret by Robert Coleridge:
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dncapi/html/msdn_cryptapi.asp
• Intel Security Driver:
http://www.intel.com/design/software/drivers/platform/security.htm