Contest Winner
So here is my answer to Monday’s contest:
unsigned __int32 factorial( unsigned __int32 n )
{
static const unsigned __int32 values[] = {1,1,2,6,24,120,720,5040,40320,362880,3628800,39916800,479001600};
if( n >= sizeof( values ) / sizeof( values[0] ) )
return 0;
return values[n];
}
Note how there’s no need for any loops or recursion. Why? Because 13! is greater than 2^32-1, so it can’t be calculated with unsigned 32-bit integers. And as the problem stated, if it can’t be calculated, return 0. Thus only the first 13 values (0 through 12) need to return a number that’s not 0. Pre-calculating this values and storing them in a look up table saves us a lot of time at run time.
Some things that indicate a good answer:
· It’s complexity is O(1), so it executes in constant time.
· The look up values are static so they aren’t initialized each time we call the function.
· It works on any architecture.
Some things I saw in answers that made them less efficient:
· Loops, recursion
· The lookup table was not static, so it was pushed on the stack (or in some languages on the heap) each time the function would get called.
· The look up table was signed. This means we need to convert to unsigned before we return which is unnecessary extra work.
· The look up table was wrong (be sure to proof read!)
· The code would only work on one architecture (such as x64 or x86, but not both and not anything else).
The last point wasn’t specified in the problem, so that’s ok if yours did, but it just keep that in mind when you do lots of bit work. Are you in big-endian or not? Are you on x64 or x86 or IA64 or what?
Which brings me to the winning entry submitted by ‘shaoxp’:
unsigned __int32 factorial( unsigned __int32 n )
{
// return the factorial of n
// if it can't be calculated return 0
//
//the max number of unsigned __int32 is 4294967295,which is between 12! and 13!
//we use index 0 to store result for n >12
//
static unsigned __int32 fac[] = {0,1,1,2,6,24,120,720,5040,40320,362880,3628800,39916800,479001600};
return fac[(((signed)n-13)>>31)&(n+1)];
//by the way,if the function can be inline,i think, because there is no branch,the CPU can optimize the execution.
}
I like this one because it explicitly avoids the if statement. Why is this good (or bad)? CPUs need to wait for a branch instruction, such as an if statement, to fully execute before it knows which instruction to execute next. With no branches in the code, we never hit this case. However this solution is performing additional calculations than mine does. Also, modern day CPUs include branch prediction: the CPU will guess on which branch to take and begin executing that route. When the if statement finishes executing, it will see if its guess was right or wrong. If it was right, it’s already calculated what would have happened and continues on. If not, it wipes what it already had done and begins executing the other branch. So, depending on the program that uses this function would depend on how good the branch prediction is. Thus the extra calculations needed to avoid the branch may back fire as the CPU could avoid the branch all together. Either way, it’s a neat solution.
Thanks to everyone who submitted an answer! Stay tuned next Monday for your next chance to win the best contest on the Internet.