<?xml version="1.0" encoding="UTF-8" ?>
<?xml-stylesheet type="text/xsl" href="http://blogs.msdn.com/utility/FeedStylesheets/rss.xsl" media="screen"?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" xmlns:wfw="http://wellformedweb.org/CommentAPI/"><channel><title>Developing for Developers : Theory</title><link>http://blogs.msdn.com/devdev/archive/tags/Theory/default.aspx</link><description>Tags: Theory</description><dc:language>en-US</dc:language><generator>CommunityServer 2.1 SP1 (Build: 61025.2)</generator><item><title>P-complete and the limits of parallelization</title><link>http://blogs.msdn.com/devdev/archive/2007/09/07/p-complete-and-the-limits-of-parallelization.aspx</link><pubDate>Sat, 08 Sep 2007 06:11:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:4821073</guid><dc:creator>dcoetzee</dc:creator><slash:comments>3</slash:comments><comments>http://blogs.msdn.com/devdev/comments/4821073.aspx</comments><wfw:commentRss>http://blogs.msdn.com/devdev/commentrss.aspx?PostID=4821073</wfw:commentRss><description>&lt;P&gt;We're entering an era where CPU clock speeds will soon cease to scale upwards and instead CPU manufacturers are planning to put more and more independent cores on a chip.&amp;nbsp; &lt;A class="" href="http://news.com.com/2100-1006_3-6119618.html" mce_href="http://news.com.com/2100-1006_3-6119618.html"&gt;Intel plans to release an 80-core chip within 5 years&lt;/A&gt;. Consequently the research community is setting their eye on the manifold barriers to writing correct and efficient highly-parallel programs. Will we be able to continue to get the same boosts in computational performance that we have been getting through raw clock speed?&lt;/P&gt;
&lt;P&gt;You might think it's just a matter of building effective tools for producing programs that can take advantage of highly parallel platforms, and training programmers to take advantage of them. But there are&amp;nbsp;complexity theory results to show that there are some problems that, under widely accepted assumptions,&amp;nbsp;fundamentally cannot be parallelized, and this is the theory of &lt;A class="" href="http://en.wikipedia.org/wiki/P-complete" mce_href="http://en.wikipedia.org/wiki/P-complete"&gt;P-completeness&lt;/A&gt;. For these problems, it may be literally impossible to get any significant benefit out of throwing additional cores at them - such problems are called &lt;EM&gt;inherently sequential&lt;/EM&gt;. More to the point, however long it takes your computer to solve these problems &lt;EM&gt;right now&lt;/EM&gt;, that's about how fast it's going to stay for a long time.&lt;/P&gt;
&lt;P&gt;The complexity class NC (Nick's Class, named after &lt;A title="" href="http://en.wikipedia.org/wiki/Nick_Pippenger" originalTitle="Nick Pippenger" popData="undefined" hasPopup="true"&gt;Nick Pippenger&lt;/A&gt;&amp;nbsp;by Stephen Cook) is the set of problems that can be solved in polylogarithmic time (O(log&lt;SUP&gt;&lt;I&gt;k&lt;/I&gt;&lt;/SUP&gt; &lt;I&gt;n&lt;/I&gt;) time for some integer &lt;EM&gt;k&lt;/EM&gt;) by a machine with polynomially many processors. For example, given a list of &lt;EM&gt;n&lt;/EM&gt; numbers, the typical way to add them up on a single processor would be with a loop, as in this C# snippet:&lt;/P&gt;&lt;PRE&gt;    public static int AddList(List&amp;lt;int&amp;gt; list) {
        int sum = 0;
        foreach (int i in list) {
            sum += i;
        }
        return sum;
    }
&lt;/PRE&gt;
&lt;P&gt;But now suppose we had &lt;EM&gt;n&lt;/EM&gt;/2 processors. Then each processor can add just two of the numbers, producing a list of &lt;EM&gt;n&lt;/EM&gt;/2 numbers with the same sum. We add these in pairs as well, continuing until just one number remains, the sum. This fan-in algorithm requires only O(log &lt;EM&gt;n&lt;/EM&gt;) time, placing the problem in NC (or technically the decision problem version of it).&lt;/P&gt;
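&lt;P&gt;As a rough illustration, the fan-in idea might be sketched in C# as follows. This is only a sketch under some assumptions: it relies on Parallel.For from System.Threading.Tasks (so it needs a version of .NET that provides it), the class name FanIn is made up for this example, and on a real machine the runtime spreads the work over however many cores exist rather than over &lt;EM&gt;n&lt;/EM&gt;/2 processors. What it does show is the structure: O(log &lt;EM&gt;n&lt;/EM&gt;) rounds, each made of additions that don't depend on one another.&lt;/P&gt;&lt;PRE&gt;    using System;
    using System.Threading.Tasks;

    public static class FanIn {
        // Sums the array by adding elements in pairs, round after round.
        // Each pass of the outer loop is one parallel round; there are O(log n) rounds.
        public static int AddList(int[] input) {
            if (input.Length == 0) return 0;
            int[] values = (int[])input.Clone();   // work on a copy
            for (int stride = 1; stride &amp;lt; values.Length; stride *= 2) {
                int step = 2 * stride;
                int pairs = (values.Length + step - 1) / step;
                // The additions in one round touch disjoint elements, so they can run concurrently.
                Parallel.For(0, pairs, k =&amp;gt; {
                    int i = k * step;
                    if (i + stride &amp;lt; values.Length) {
                        values[i] += values[i + stride];
                    }
                });
            }
            return values[0];
        }

        public static void Main() {
            Console.WriteLine(AddList(new int[] { 3, 1, 4, 1, 5, 9, 2, 6 }));   // prints 31
        }
    }
&lt;/PRE&gt;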
&lt;P&gt;We can use similar ideas to parallelize many algorithms. For example, the best known sequential algorithm for matrix multiplication requires time O(&lt;I&gt;n&lt;/I&gt;&lt;SUP&gt;2.376&lt;/SUP&gt;) for an &lt;I&gt;n&lt;/I&gt; × &lt;I&gt;n&lt;/I&gt; matrix, but given O(&lt;EM&gt;n&lt;/EM&gt;&lt;SUP&gt;3&lt;/SUP&gt;) processors, we can solve it in O(log &lt;EM&gt;n&lt;/EM&gt;) time: for each of the &lt;EM&gt;n&lt;/EM&gt;&lt;SUP&gt;2&lt;/SUP&gt; entries, we assign&amp;nbsp;&lt;EM&gt;n&lt;/EM&gt; processors to compute, in a single step, the &lt;EM&gt;n&lt;/EM&gt; products that are added to form that entry, then&amp;nbsp;use the fan-in procedure above to add them together in O(log &lt;EM&gt;n&lt;/EM&gt;) further steps. There are efficient parallel algorithms for other common problems too, like sorting, string matching, and algorithms in graph theory and computational geometry.&lt;/P&gt;
&lt;P&gt;In real life of course we don't have such a polynomial number of processors, so this is in some ways an impractical definition; all it really tells us is that if we have &lt;EM&gt;many&lt;/EM&gt; processors, we can solve the problem dramatically faster.&amp;nbsp;Practical parallel algorithms pay more attention to &lt;EM&gt;speedup&lt;/EM&gt; - how many times faster the algorithm is with &lt;EM&gt;k&lt;/EM&gt; processors than with one. Effective speedup requires a good way of dynamically decomposing the problem. But speedup is difficult to characterize formally in a machine-independent way, so we'll stick with the definition of NC.&lt;/P&gt;
&lt;P&gt;The famous P = NP problem asks whether all problems whose solutions can be verified in polynomial time can also be solved in polynomial time. In this setting, the problems in P are considered "feasible to solve", and most computer scientists believe P ≠ NP - that is, that some problems which we can feasibly verify cannot be solved feasibly. We believe this in part because many important problems have been shown to be NP-hard, yet no polynomial-time algorithm has been found for any of them. To say a problem is NP-hard means there is a polynomial-time algorithm that can take any problem in NP and convert it into an instance of that problem; consequently, if an NP-hard problem can be solved in polynomial time, so can any problem in NP.&lt;/P&gt;
&lt;P&gt;But when you have a large number of processors, the question turns around: we no longer consider P feasible, because any algorithm using just one CPU would face a dramatic slowdown compared to algorithms that take advantage of all of them. Now NC algorithms are considered feasible, and we ask: can all problems with efficient sequential solutions be highly parallelized? That is, is&amp;nbsp;NC = P?&lt;/P&gt;
&lt;P&gt;Just as with P = NP, nobody knows. However, just as there are NP-hard problems, there are P-hard problems. A P-hard problem is a problem where there is an NC algorithm (running in polylogarithmic time on a polynomial number of processors) to convert any problem in P into an instance of that problem. A P-hard problem that is itself in P is called P-complete. If we had an NC algorithm to solve a P-hard problem, it would imply that NC = P, providing a constructive and mechanical way of parallelizing any problem we can solve sequentially in polynomial time. Just as most researchers believe P ≠ NP, most researchers believe&amp;nbsp;NC ≠ P, which implies that no P-hard problem has an NC algorithm to solve it.&lt;/P&gt;
&lt;P&gt;One of the most important P-complete problems is &lt;EM&gt;&lt;A class="" href="http://en.wikipedia.org/wiki/Linear_programming" mce_href="http://en.wikipedia.org/wiki/Linear_programming"&gt;linear programming&lt;/A&gt;&lt;/EM&gt;: given a linear target function in terms of some variables, and a number of linear inequalities expressing constraints between those variables, find the optimal (maximum) value of the target function. It finds uses in operations research, microeconomics, and business management, and it would be great if we could exploit large numbers of cores to tackle larger instances of these problems. But no one has found an efficient parallel algorithm for this problem.&lt;/P&gt;
&lt;P&gt;Another important example is the &lt;EM&gt;circuit-value problem&lt;/EM&gt;: given a digital logic circuit and its inputs, compute the output of the circuit. There is a trivial solution that takes time linear in the number of gates: repeatedly pick a gate whose inputs are all known and compute its output, until you're done. It would seem that certain gates don't influence each other and could be computed in parallel. But because the wires can express complex interdependencies between values and can create long chains of dependencies, there's no clear way to parallelize a general circuit. Besides obvious applications to circuit design and simulation, many problems can be solved by families of circuits, and if we could evaluate them quickly we could solve these problems. But not only has no one found a general way of parallelizing such evaluation, it has been shown that even if we restrict the circuits so that wires can't cross, or so that only the inputs can be negated, the problem remains P-complete, and so difficult to parallelize (these are the &lt;EM&gt;planar &lt;/EM&gt;and &lt;EM&gt;monotone &lt;/EM&gt;circuit-value problems, respectively).&lt;/P&gt;
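&lt;P&gt;The trivial sequential solution might look something like the C# sketch below. The representation is an assumption made for the example, not a standard one: the names CircuitValue, Gate and GateKind are hypothetical, and the gates are assumed to be listed so that every gate comes after the gates it reads from. Notice that each gate's value may depend on the one computed just before it, which is exactly the kind of dependency chain that resists parallelization.&lt;/P&gt;&lt;PRE&gt;    using System;

    public static class CircuitValue {
        public enum GateKind { Input, And, Or, Not }

        // A gate names its operands by index; operands must appear earlier in the array.
        public struct Gate {
            public GateKind Kind;
            public int A, B;      // operand indices (B unused for Not, both unused for Input)
            public bool Value;    // only meaningful for Input gates
        }

        // Evaluates the gates in order and returns the value of the last one.
        public static bool Evaluate(Gate[] gates) {
            bool[] value = new bool[gates.Length];
            for (int i = 0; i &amp;lt; gates.Length; i++) {
                Gate g = gates[i];
                switch (g.Kind) {
                    case GateKind.Input: value[i] = g.Value; break;
                    case GateKind.And:   value[i] = value[g.A] &amp;amp;&amp;amp; value[g.B]; break;
                    case GateKind.Or:    value[i] = value[g.A] || value[g.B]; break;
                    case GateKind.Not:   value[i] = !value[g.A]; break;
                }
            }
            return value[gates.Length - 1];
        }

        public static void Main() {
            Gate[] gates = {
                new Gate { Kind = GateKind.Input, Value = true },    // gate 0: x
                new Gate { Kind = GateKind.Input, Value = false },   // gate 1: y
                new Gate { Kind = GateKind.And, A = 0, B = 1 },      // gate 2: x AND y
                new Gate { Kind = GateKind.Not, A = 0 },             // gate 3: NOT x
                new Gate { Kind = GateKind.Or, A = 2, B = 3 }        // gate 4: (x AND y) OR (NOT x)
            };
            Console.WriteLine(Evaluate(gates));   // prints False
        }
    }
&lt;/PRE&gt;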
&lt;P&gt;The first problem ever shown to be NP-complete, as well as an important problem in practice, is the boolean satisfiability problem: given a formula built from AND, OR, and NOT over a number of variables, each of which can be set to either true or false, determine if there is some way of setting the variables to make the whole formula evaluate to true. The P-complete version of this problem is called Horn satisfiability.&lt;/P&gt;
&lt;P&gt;Any boolean formula can be reduced efficiently to a form where it represents a three-layer circuit: a literal is either a variable or a negated variable; a clause is the OR of some literals; and an entire&amp;nbsp;formula is the AND of some clauses. A "Horn clause" is a clause where all of the variables are negated except possibly one. When every clause is a Horn clause, satisfiability can be decided efficiently on a single processor, since Horn clauses can be rewritten as implications (not(x1) or not(x2) or x3 is the same thing as (x1 and x2) implies x3): start with every variable false and keep setting to true whatever the implications force, until nothing changes or a clause with no positive literal is violated. Formulas made up of Horn clauses appear in the context of logical deduction systems, such as the resolution systems used to model problems in AI. Again, no one has found an effective way to parallelize finding solutions for these formulas.&lt;/P&gt;
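&lt;P&gt;Reading each Horn clause as an implication suggests a simple sequential procedure: start with every variable false, and keep setting a variable to true whenever all the premises of some implication are already true. The C# sketch below is one straightforward (not the fastest) way to do this; the names HornSat and Clause are made up for the example, and variables are assumed to be numbered from 0. Each newly forced variable can trigger further implications, so the work forms a chain much like the one in circuit evaluation.&lt;/P&gt;&lt;PRE&gt;    using System;

    public static class HornSat {
        // The clause (not b1 or not b2 or ... or h), read as (b1 and b2 and ...) implies h.
        // Head is -1 when the clause has no positive literal.
        public struct Clause {
            public int[] Body;
            public int Head;
            public Clause(int[] body, int head) { Body = body; Head = head; }
        }

        // Returns a satisfying assignment, or null if the formula is unsatisfiable.
        public static bool[] Solve(int variableCount, Clause[] clauses) {
            bool[] value = new bool[variableCount];   // start with every variable false
            bool changed = true;
            while (changed) {
                changed = false;
                foreach (Clause c in clauses) {
                    bool bodyHolds = Array.TrueForAll(c.Body, v =&amp;gt; value[v]);
                    if (!bodyHolds) continue;
                    if (c.Head == -1) return null;    // premises are forced true, so the clause fails
                    if (!value[c.Head]) { value[c.Head] = true; changed = true; }
                }
            }
            return value;
        }

        public static void Main() {
            Clause[] clauses = {
                new Clause(new int[0], 0),           // x0
                new Clause(new int[] { 0 }, 1),      // x0 implies x1
                new Clause(new int[] { 0, 1 }, 2)    // (x0 and x1) implies x2
            };
            bool[] model = Solve(3, clauses);
            Console.WriteLine(model != null);   // prints True: the formula is satisfiable
            Console.WriteLine(model[2]);        // prints True: x2 is forced by the chain
        }
    }
&lt;/PRE&gt;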
&lt;P&gt;In the world of P = NP, people have long been studying ways to "get around" the infeasibility of finding a polynomial-time solution for NP-hard problems, such as approximation algorithms, special cases, fixed-parameter tractable solutions, and so on. We could ask the same questions about the NC = P question: could we find an approximate solution much more quickly by using all our cores, instead of running a classical algorithm to find an exact solution on just one? Are there special cases of these problems which are parallelizable? But comparatively little investigation has been done into this area, although some examples exist (like Karger and Motwani's "&lt;SPAN class=l&gt;&lt;A class="" href="http://citeseer.ist.psu.edu/karger97nc.html" mce_href="http://citeseer.ist.psu.edu/karger97nc.html"&gt;An NC Algorithm For Minimum Cuts&lt;/A&gt;").&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN class=l&gt;Finally, I should mention that P-complete is relevant not only to the study of what problems can be efficiently parallelized, but also to what problems can be solved in a small amount of space. I'll leave this for another time though.&lt;/SPAN&gt;&lt;/P&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=4821073" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/devdev/archive/tags/Theory/default.aspx">Theory</category></item><item><title>Robin's theorem</title><link>http://blogs.msdn.com/devdev/archive/2007/07/16/robin-s-theorem.aspx</link><pubDate>Tue, 17 Jul 2007 00:12:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:3901480</guid><dc:creator>dcoetzee</dc:creator><slash:comments>2</slash:comments><comments>http://blogs.msdn.com/devdev/comments/3901480.aspx</comments><wfw:commentRss>http://blogs.msdn.com/devdev/commentrss.aspx?PostID=3901480</wfw:commentRss><description>&lt;P&gt;Most computer scientists are familiar with the P = NP problem, which asks essentially whether we can verify more problems in polynomial time than we can solve. So fundamentally does complexity theory hinge on this result that the Clay Mathematics Institute has labelled it one of their seven &lt;A class="" href="http://www.claymath.org/millennium/" mce_href="http://www.claymath.org/millennium/"&gt;Millennium Prize Problems&lt;/A&gt;, and will award $1,000,000 to the researchers that solve it.&lt;/P&gt;
&lt;P&gt;But there are six other Millennium Prize problems that are often less accessible to computer scientists, requiring background in areas such as topology and complex analysis. One of the most important of these is the Riemann Hypothesis, which is normally stated in terms of complex zeros of the Riemann Zeta function. Here, we will describe a recent characterization of the Riemann Hypothesis more suited to a computer scientist's background based on growth of integer functions.&lt;/P&gt;
&lt;P&gt;Consider the &lt;A class="" href="http://en.wikipedia.org/wiki/Divisor_function" mce_href="http://en.wikipedia.org/wiki/Divisor_function"&gt;sum-of-divisors function&lt;/A&gt; σ(&lt;EM&gt;n&lt;/EM&gt;) that sums up the divisors of the positive integer &lt;EM&gt;n&lt;/EM&gt;. For example, σ(15) = 1 + 3 + 5 + 15 = 24. If you've studied asymptotic notation, a natural question to ask is: how fast does σ(&lt;EM&gt;n&lt;/EM&gt;) grow?&lt;/P&gt;
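&lt;P&gt;For experimenting with small values, σ(&lt;EM&gt;n&lt;/EM&gt;) can be computed by trial division up to the square root of &lt;EM&gt;n&lt;/EM&gt;, since divisors come in pairs &lt;EM&gt;d&lt;/EM&gt; and &lt;EM&gt;n&lt;/EM&gt;/&lt;EM&gt;d&lt;/EM&gt;. The following C# sketch does just that; the names Sigma and SumOfDivisors are made up for this example, and it is meant for modest &lt;EM&gt;n&lt;/EM&gt; only.&lt;/P&gt;&lt;PRE&gt;    using System;

    public static class Sigma {
        // Sums the divisors of n by pairing each divisor d with n / d.
        public static long SumOfDivisors(long n) {
            long sum = 0;
            for (long d = 1; d * d &amp;lt;= n; d++) {
                if (n % d == 0) {
                    sum += d;
                    if (d != n / d) sum += n / d;   // don't count a square root twice
                }
            }
            return sum;
        }

        public static void Main() {
            Console.WriteLine(SumOfDivisors(15));   // prints 24
        }
    }
&lt;/PRE&gt;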
&lt;P&gt;It's clear that since 1 and &lt;EM&gt;n&lt;/EM&gt; always divide &lt;EM&gt;n&lt;/EM&gt;, σ(&lt;EM&gt;n&lt;/EM&gt;) ≥ &lt;EM&gt;n&lt;/EM&gt; + 1 for &lt;EM&gt;n&lt;/EM&gt; &amp;gt; 1, so σ(&lt;EM&gt;n&lt;/EM&gt;) is Ω(&lt;EM&gt;n&lt;/EM&gt;). But σ(&lt;EM&gt;n&lt;/EM&gt;) cannot be O(&lt;EM&gt;n&lt;/EM&gt;), because σ(&lt;EM&gt;n&lt;/EM&gt;)/&lt;EM&gt;n&lt;/EM&gt; takes on arbitrarily large values. To see this, consider &lt;EM&gt;n&lt;/EM&gt; equal to &lt;EM&gt;k&lt;/EM&gt; primorial, the product of the first &lt;EM&gt;k &lt;/EM&gt;primes. Because σ(&lt;EM&gt;n&lt;/EM&gt;)/&lt;EM&gt;n&lt;/EM&gt; is &lt;A class="" href="http://en.wikipedia.org/wiki/Multiplicative_function" mce_href="http://en.wikipedia.org/wiki/Multiplicative_function"&gt;multiplicative&lt;/A&gt;, σ(&lt;EM&gt;n&lt;/EM&gt;)/&lt;EM&gt;n&lt;/EM&gt; will equal the product of σ(&lt;EM&gt;p&lt;/EM&gt;)/&lt;EM&gt;p&lt;/EM&gt; = (1+&lt;EM&gt;p&lt;/EM&gt;)/&lt;EM&gt;p&lt;/EM&gt;&amp;nbsp;= (1 + 1/&lt;EM&gt;p&lt;/EM&gt;)&amp;nbsp;for each prime factor &lt;EM&gt;p&lt;/EM&gt;. But if you expand out this product, you get a sum including 1/&lt;EM&gt;p&lt;/EM&gt; for each prime factor, and the sum of the reciprocals of the primes, also known as the &lt;A class="" href="http://mathworld.wolfram.com/HarmonicSeriesofPrimes.html" mce_href="http://mathworld.wolfram.com/HarmonicSeriesofPrimes.html"&gt;harmonic series of primes&lt;/A&gt;,&amp;nbsp;diverges. But the harmonic series of primes grows quite slowly: its partial sum over the primes up to &lt;EM&gt;n&lt;/EM&gt; is O(ln ln &lt;EM&gt;n&lt;/EM&gt;).&lt;/P&gt;
&lt;P&gt;From this we might conjecture that σ(&lt;EM&gt;n&lt;/EM&gt;) grows at a rate of &lt;EM&gt;n&lt;/EM&gt; ln ln &lt;EM&gt;n&lt;/EM&gt;. &lt;A class="" href="http://mathworld.wolfram.com/GronwallsTheorem.html" mce_href="http://mathworld.wolfram.com/GronwallsTheorem.html"&gt;Gronwall's theorem&lt;/A&gt;, proven in 1913, shows that in fact:&lt;/P&gt;
&lt;P&gt;lim sup&lt;SUB&gt;&lt;EM&gt;n&lt;/EM&gt;→∞ &lt;/SUB&gt;σ(&lt;EM&gt;n&lt;/EM&gt;)&lt;EM&gt; &lt;/EM&gt;/ (&lt;EM&gt;n &lt;/EM&gt;ln ln &lt;EM&gt;n&lt;/EM&gt;) = &lt;EM&gt;e&lt;/EM&gt;&lt;SUP&gt;γ&lt;/SUP&gt;.&lt;/P&gt;
&lt;P&gt;Here, &lt;EM&gt;&lt;A class="" href="http://en.wikipedia.org/wiki/E_%28mathematical_constant%29" mce_href="http://en.wikipedia.org/wiki/E_%28mathematical_constant%29"&gt;e&lt;/A&gt;&lt;/EM&gt; and γ are both constants (the latter is the &lt;A class="" href="http://en.wikipedia.org/wiki/Euler%E2%80%93Mascheroni_constant" mce_href="http://en.wikipedia.org/wiki/Euler%E2%80%93Mascheroni_constant"&gt;Euler-Mascheroni constant&lt;/A&gt;), so &lt;EM&gt;e&lt;/EM&gt;&lt;SUP&gt;γ&lt;/SUP&gt; is a constant, equal to about 1.781. This is somewhat stronger than the statement that σ(&lt;EM&gt;n&lt;/EM&gt;) is O(&lt;EM&gt;n&lt;/EM&gt; ln ln &lt;EM&gt;n&lt;/EM&gt;) - the &lt;A class="" href="http://en.wikipedia.org/wiki/Limit_superior_and_limit_inferior" mce_href="http://en.wikipedia.org/wiki/Limit_superior_and_limit_inferior"&gt;lim sup&lt;/A&gt; actually says that for any ε &amp;gt; 0, some tail of the sequence σ(&lt;EM&gt;n&lt;/EM&gt;)&lt;EM&gt; &lt;/EM&gt;/ (&lt;EM&gt;n &lt;/EM&gt;ln ln &lt;EM&gt;n&lt;/EM&gt;) must be bounded above by &lt;EM&gt;e&lt;/EM&gt;&lt;SUP&gt;γ&lt;/SUP&gt; + ε.&lt;/P&gt;
&lt;P&gt;But instead of a lim sup, what we'd really like is&amp;nbsp;to have a hard bound; we'd like to say:&lt;/P&gt;
&lt;P&gt;σ(&lt;EM&gt;n&lt;/EM&gt;)&lt;EM&gt; &lt;/EM&gt;/ (&lt;EM&gt;n &lt;/EM&gt;ln ln &lt;EM&gt;n&lt;/EM&gt;)&amp;nbsp;&amp;lt; &lt;EM&gt;e&lt;/EM&gt;&lt;SUP&gt;γ&lt;/SUP&gt; for all &lt;EM&gt;n&lt;/EM&gt;.&lt;/P&gt;
&lt;P&gt;This is false, and a quick search identifies 27 small counterexamples: 2, 3, 4, 5, 6, 8, 9, 10, 12, 16, 18, 20, 24, 30, 36, 48, 60, 72, 84, 120, 180, 240, 360, 720, 840, 2520, and 5040. However, no search to date has located any larger counterexamples. From this we might conjecture:&lt;/P&gt;
&lt;P&gt;σ(&lt;EM&gt;n&lt;/EM&gt;)&lt;EM&gt; &lt;/EM&gt;/ (&lt;EM&gt;n &lt;/EM&gt;ln ln &lt;EM&gt;n&lt;/EM&gt;)&amp;nbsp;&amp;lt; &lt;EM&gt;e&lt;/EM&gt;&lt;SUP&gt;γ&lt;/SUP&gt; for all &lt;EM&gt;n &lt;/EM&gt;&amp;gt; 5040.&lt;/P&gt;
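&lt;P&gt;A search for small exceptions to this bound is easy to sketch in C#. The version below is an illustration under stated assumptions rather than a careful numerical study: the names RobinSearch and SumOfDivisors are made up, the limit of 10000 is arbitrary, the comparison is done in double-precision floating point, and the test is written as σ(&lt;EM&gt;n&lt;/EM&gt;) ≥ &lt;EM&gt;e&lt;/EM&gt;&lt;SUP&gt;γ&lt;/SUP&gt; &lt;EM&gt;n &lt;/EM&gt;ln ln &lt;EM&gt;n&lt;/EM&gt; so that very small &lt;EM&gt;n&lt;/EM&gt;, where ln ln &lt;EM&gt;n&lt;/EM&gt; is negative, are handled sensibly. If the arithmetic holds up, the values it prints should be the small exceptions listed above.&lt;/P&gt;&lt;PRE&gt;    using System;

    public static class RobinSearch {
        // e raised to the Euler-Mascheroni constant, about 1.781.
        static readonly double ExpGamma = Math.Exp(0.5772156649015329);

        static long SumOfDivisors(long n) {
            long sum = 0;
            for (long d = 1; d * d &amp;lt;= n; d++) {
                if (n % d == 0) {
                    sum += d;
                    if (d != n / d) sum += n / d;
                }
            }
            return sum;
        }

        public static void Main() {
            // Print every n from 2 to 10000 whose sigma(n) reaches the bound e^gamma * n * ln ln n.
            for (long n = 2; n &amp;lt;= 10000; n++) {
                double bound = ExpGamma * n * Math.Log(Math.Log(n));
                if (SumOfDivisors(n) &amp;gt;= bound) Console.WriteLine(n);
            }
        }
    }
&lt;/PRE&gt;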
&lt;P&gt;In 1984, Guy Robin showed that in fact, this statement is true if and only if the Riemann hypothesis is true. This is &lt;A class="" href="http://mathworld.wolfram.com/RobinsTheorem.html" mce_href="http://mathworld.wolfram.com/RobinsTheorem.html"&gt;Robin's theorem&lt;/A&gt;. Thus, the Riemann hypothesis could, in theory, be proven false by exhibiting a single positive integer &lt;EM&gt;n&lt;/EM&gt; &amp;gt; 5040 such that σ(&lt;EM&gt;n&lt;/EM&gt;) &amp;gt; &lt;EM&gt;e&lt;/EM&gt;&lt;SUP&gt;γ &lt;/SUP&gt;(&lt;EM&gt;n &lt;/EM&gt;ln ln &lt;EM&gt;n&lt;/EM&gt;). Because most mathematicians believe the Riemann hypothesis is true (just as most computer scientists believe P ≠ NP) it's more likely that the above statement is true, and a proof of this fact would win you $1,000,000.&lt;/P&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=3901480" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/devdev/archive/tags/Theory/default.aspx">Theory</category></item><item><title>Modular arithmetic and primality testing</title><link>http://blogs.msdn.com/devdev/archive/2005/09/07/462070.aspx</link><pubDate>Wed, 07 Sep 2005 20:29:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:462070</guid><dc:creator>dcoetzee</dc:creator><slash:comments>7</slash:comments><comments>http://blogs.msdn.com/devdev/comments/462070.aspx</comments><wfw:commentRss>http://blogs.msdn.com/devdev/commentrss.aspx?PostID=462070</wfw:commentRss><description>&lt;p&gt;&lt;a href="http://en.wikipedia.org/wiki/Number_theory"&gt;Number theory&lt;/a&gt;
is, roughly speaking, the study of properties of integers. Often a
problem which is easy for real numbers, such as factoring or linear
programming, seems to be considerably more difficult when restricted to
integers (in fact, integer programming is NP-hard). Much of the focus
of modern number theory is on solving these problems efficiently.&lt;/p&gt;
&lt;p&gt;The practical importance of number theory was greatly increased by
the introduction of RSA cryptography, an unprecedented system whereby
information could be securely transmitted without first transmitting
the key and risking its interception. Central to RSA is the ability to
efficiently generate a semiprime, which is simply a product of two very
large primes of roughly equal magnitude. What's useful about these
numbers is that we can generate them efficiently, yet given just the
semiprime, determining the two prime factors is a hard,
unsolved&amp;nbsp;problem. It turns out that there are a lot of primes to
choose from: by the prime number theorem, the probability that a randomly chosen &lt;em&gt;k&lt;/em&gt;-bit number is
prime is roughly 1/(&lt;em&gt;k&lt;/em&gt; ln 2), or about 1.44/&lt;em&gt;k&lt;/em&gt;. For example, roughly 15% of 10-bit numbers are prime. Consequently, we can find a &lt;em&gt;k&lt;/em&gt;-bit prime number in an expected O(&lt;em&gt;k&lt;/em&gt;) random guesses. The only remaining problem is to find an efficient algorithm for testing these guesses for primality.&lt;/p&gt;
&lt;p&gt;Brute force primality tests, such as simply dividing the number by
every value up to its square root, are far too slow for large numbers.
To get around this, we'll need&amp;nbsp;an important&amp;nbsp;tool called &lt;a href="http://en.wikipedia.org/wiki/Modular_arithmetic"&gt;modular arithmetic&lt;/a&gt;.
If you've written C, you may be surprised to know that you're already
familiar with modular arithmetic - the "wrap around" behavior of &lt;em&gt;k&lt;/em&gt;-bit integers is an implementation of arithmetic mod 2&lt;sup&gt;&lt;em&gt;k&lt;/em&gt;&lt;/sup&gt;. For example, this piece of code computes the value 17:&lt;/p&gt;&lt;pre&gt;    unsigned char c1 = 177, c2 = 96;&lt;br&gt;    unsigned char sum = c1 + c2;&lt;br&gt;&lt;/pre&gt;
&lt;p&gt;In mathematical notation, we would say:&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;177 + 96 ≡ 17 (mod 256)&lt;/p&gt;
&lt;p&gt;This tells us that the numbers 177 + 96 and 17 differ by a multiple of 256,
or in other words have the same last 8 bits. Multiplication mod 256 is
likewise similar to multiplication of unsigned chars in C:&lt;/p&gt;&lt;pre&gt;    unsigned char c1 = 177, c2 = 23;&lt;br&gt;    unsigned char product = c1 * c2;&lt;br&gt;&lt;/pre&gt;
&lt;p&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;177 × 23 ≡ 231 (mod 256)&lt;/p&gt;
&lt;p&gt;Modular arithmetic works exactly the same for other moduli than 256;
the only difference is the number where it wraps around. For example,
if you compute 10 + 10 mod 17, you get 3.&lt;/p&gt;
&lt;p&gt;The interesting thing about modular arithmetic is that it can be shown that the values form a commutative &lt;a href="http://en.wikipedia.org/wiki/Ring_%28algebra%29"&gt;ring&lt;/a&gt;. This means, among other things, that:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Addition and multiplication are commutative (you can reverse their operands without changing the result). 
&lt;/li&gt;&lt;li&gt;The associative property holds for both addition and
multiplication: (177 × 23) × 45 gives the same thing as 177 × (23 × 45),
even if computed with unsigned chars. &lt;/li&gt;&lt;li&gt;The distributive property holds: (177&amp;nbsp;+ 23) × 45 is equal to (177 × 45)&amp;nbsp;+ (23 × 45).&lt;/li&gt;&lt;/ul&gt;
&lt;p&gt;If none of these surprise you, this might: if the modulus is a power of 2, as in machine arithmetic, every odd number &lt;em&gt;m&lt;/em&gt; has a multiplicative&amp;nbsp;inverse. An inverse is simply a number&amp;nbsp;&lt;em&gt;n&lt;/em&gt; such that &lt;em&gt;m &lt;/em&gt;× &lt;em&gt;n&lt;/em&gt; = 1.&amp;nbsp;What good is this? Well, suppose you want to divide a number &lt;em&gt;k&lt;/em&gt; by&amp;nbsp;7 on a machine with no hardware division. You know that &lt;em&gt;k&lt;/em&gt; is divisible by 7. The inverse of 7 mod 256 is 183. Because 7 × 183 = 1, we have 7 × 183 × &lt;em&gt;k&lt;/em&gt; = &lt;em&gt;k&lt;/em&gt;. Divide both sides by 7, and we get 183 × &lt;em&gt;k&lt;/em&gt; = &lt;em&gt;k&lt;/em&gt;/7. In other words, multiplying by a number's inverse is the same as dividing by that number, provided that &lt;em&gt;k&lt;/em&gt; is evenly divisible by the number.&lt;/p&gt;
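&lt;p&gt;To make the division trick concrete, here is a small C# sketch using 8-bit arithmetic. The names InverseDemo and Inverse are made up for this example, and the inverse is found by brute force purely for illustration (a real implementation would compute it more cleverly). The unchecked casts make the wrap-around mod 256 explicit.&lt;/p&gt;&lt;pre&gt;    using System;

    public static class InverseDemo {
        // Finds the inverse of an odd byte m modulo 256 by trying every odd candidate.
        public static byte Inverse(byte m) {
            for (int n = 1; n &amp;lt; 256; n += 2) {
                if (unchecked((byte)(m * n)) == 1) return (byte)n;
            }
            throw new ArgumentException("even numbers have no inverse mod 256");
        }

        public static void Main() {
            byte inv = Inverse(7);
            Console.WriteLine(inv);                            // prints 183
            byte k = 91;                                       // 91 = 7 * 13
            Console.WriteLine(unchecked((byte)(k * inv)));     // prints 13
        }
    }
&lt;/pre&gt;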
&lt;p&gt;Now we come back to our original problem: how can modular arithmetic
help us determine primality? Well, it's not hard to show that if &lt;em&gt;p&lt;/em&gt; is prime, then:&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;For all &lt;em&gt;a &lt;/em&gt;with&lt;em&gt; &lt;/em&gt;0 &amp;lt; &lt;em&gt;a&lt;/em&gt; &amp;lt; &lt;em&gt;p&lt;/em&gt;, &lt;em&gt;a&lt;/em&gt;&lt;sup&gt;&lt;em&gt;p&lt;/em&gt;&lt;/sup&gt; ≡ &lt;em&gt;a&lt;/em&gt; (mod &lt;em&gt;p&lt;/em&gt;).&lt;/p&gt;
&lt;p&gt;This is called &lt;a href="http://en.wikipedia.org/wiki/Fermat%27s_little_theorem"&gt;Fermat's little theorem&lt;/a&gt;. What's useful about it is not only that it's true for all primes, but that if you find an &lt;em&gt;a&lt;/em&gt; such that it does not hold, you have proven that the number &lt;em&gt;p&lt;/em&gt; is composite. This can be checked efficiently because there is &lt;a href="http://en.wikipedia.org/wiki/Exponentiation_by_squaring"&gt;an efficient algorithm for computing large powers&lt;/a&gt;.
This is simply an existence proof - it tells us nothing about what the
factors are. It can be shown that for most composite numbers, if you
just select several &lt;em&gt;a&lt;/em&gt; at random and try them, one is likely to fail the test. Unfortunately, there are an infinite number of special numbers called &lt;a href="http://en.wikipedia.org/wiki/Carmichael_number"&gt;Carmichael numbers&lt;/a&gt; for which the above result holds for all &lt;em&gt;a&lt;/em&gt;, even though they are not prime.&lt;/p&gt;
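&lt;p&gt;As a sketch of how such a check might be coded, the C# below computes large powers by repeated squaring and applies the test for a single &lt;em&gt;a&lt;/em&gt;. It is an illustration under some assumptions: the names FermatTest, ModPower and PassesFermat are made up here, and it uses 64-bit integers, so the modulus has to stay below 2&lt;sup&gt;32&lt;/sup&gt; for the intermediate products not to overflow. Real use would need an arbitrary-precision integer type.&lt;/p&gt;&lt;pre&gt;    using System;

    public static class FermatTest {
        // Computes (b^e) mod m by repeated squaring; m must be below 2^32 so products fit in a ulong.
        public static ulong ModPower(ulong b, ulong e, ulong m) {
            ulong result = 1 % m;
            b %= m;
            while (e != 0) {
                if (e % 2 == 1) result = result * b % m;   // this bit of the exponent is set
                b = b * b % m;                             // square for the next bit
                e /= 2;
            }
            return result;
        }

        // True when a^p ≡ a (mod p); a false result proves that p is composite.
        public static bool PassesFermat(ulong a, ulong p) {
            return ModPower(a, p, p) == a % p;
        }

        public static void Main() {
            Console.WriteLine(PassesFermat(2, 97));    // 97 is prime: prints True
            Console.WriteLine(PassesFermat(2, 91));    // 91 = 7 * 13: prints False
            Console.WriteLine(PassesFermat(2, 561));   // 561 = 3 * 11 * 17 is a Carmichael number: prints True
        }
    }
&lt;/pre&gt;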
&lt;p&gt;To get around this, we design a new, slightly more
complicated&amp;nbsp;test which is not susceptible to this problem. I take
this from the excellent book &lt;em&gt;&lt;a href="http://www.amazon.com/exec/obidos/tg/detail/-/0387252827/"&gt;Prime Numbers: A Computational Perspective&lt;/a&gt;&lt;/em&gt;, by Richard Crandall and Carl Pomerance. Suppose&amp;nbsp;&lt;em&gt;p&lt;/em&gt; is an odd prime, and that the binary representation of &lt;em&gt;p&lt;/em&gt; - 1 is the odd number &lt;em&gt;t&lt;/em&gt; followed by &lt;em&gt;s&lt;/em&gt; zeros. Then one of the following is true for every &lt;em&gt;a&lt;/em&gt; with 0 &amp;lt; &lt;em&gt;a&lt;/em&gt; &amp;lt; &lt;em&gt;p &lt;/em&gt;- 1:&lt;/p&gt;
&lt;p&gt;&lt;em&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;a&lt;sup&gt;t&lt;/sup&gt;&lt;/em&gt; ≡&amp;nbsp;1 (mod &lt;em&gt;p&lt;/em&gt;)&lt;br&gt;&lt;em&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;a&lt;sup&gt;&lt;em&gt;t&lt;/em&gt; &amp;lt;&amp;lt; &lt;em&gt;i&lt;/em&gt;&lt;/sup&gt;&lt;/em&gt; ≡&amp;nbsp;p - 1 (mod &lt;em&gt;p&lt;/em&gt;) for some &lt;em&gt;i&lt;/em&gt; with 0 ≤ &lt;em&gt;i&lt;/em&gt; &amp;lt; &lt;em&gt;s&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;Above "&amp;lt;&amp;lt;" denotes the shift-left operator. It can be shown
that for any odd composite number &amp;gt; 9, both of the above will fail
for at least 3/4 of the &lt;em&gt;a&lt;/em&gt; between 1 and &lt;em&gt;p&lt;/em&gt;-2. We can use this to design a simple probabilistic algorithm:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Choose an &lt;em&gt;a&lt;/em&gt; at random with 0 &amp;lt; &lt;em&gt;a&lt;/em&gt; &amp;lt; &lt;em&gt;p&lt;/em&gt;-1. 
&lt;/li&gt;&lt;li&gt;Check if one of the above formulas holds. If not, &lt;em&gt;p&lt;/em&gt; is composite. 
&lt;/li&gt;&lt;li&gt;If we have iterated at least T times,&amp;nbsp;claim that &lt;em&gt;p&lt;/em&gt; is prime. Otherwise, go back to step 1.&lt;/li&gt;&lt;/ol&gt;
&lt;p&gt;Here's an (untested) piece of C# code which implements this simple
algorithm, assuming that BigInteger is a suitable arbitrary-precision
integer type (the Framework provides no such type):&lt;/p&gt;&lt;pre&gt;    public bool IsProbablePrime(BigInteger p, int T) {&lt;br&gt;        // Write p - 1 as t * 2^s with t odd.&lt;br&gt;        int s = 0;&lt;br&gt;        BigInteger t = p - 1;&lt;br&gt;        while (t.IsEven()) {&lt;br&gt;            s++;&lt;br&gt;            t &amp;gt;&amp;gt;= 1;&lt;br&gt;        }&lt;br&gt;        // Run T independent trials; a single failure proves p composite.&lt;br&gt;        for (int repeat = 0; repeat &amp;lt; T; repeat++) {&lt;br&gt;            BigInteger a = BigInteger.Random(1, p - 2);&lt;br&gt;            if (!PassesPrimalityTest(a, p, s, t)) {&lt;br&gt;                return false;&lt;br&gt;            }&lt;br&gt;        }&lt;br&gt;        return true;&lt;br&gt;    }&lt;br&gt;&lt;br&gt;    public bool PassesPrimalityTest(BigInteger a, BigInteger p, int s, BigInteger t) {&lt;br&gt;        BigInteger b = BigInteger.ModPower(a, t, p); // b = a^t mod p&lt;br&gt;        if (b == 1 || b == p - 1) return true;&lt;br&gt;        // Square repeatedly, looking for p - 1 among a^(t &amp;lt;&amp;lt; j) for 0 &amp;lt; j &amp;lt; s.&lt;br&gt;        for (int j = 1; j &amp;lt; s; j++) {&lt;br&gt;            b = BigInteger.ModPower(b, 2, p);&lt;br&gt;            if (b == p - 1) return true;&lt;br&gt;        }&lt;br&gt;        return false;&lt;br&gt;    }&lt;br&gt;&lt;/pre&gt;
&lt;p&gt;Because each trial is independent, the chance of erroneously claiming that a composite &lt;em&gt;p&lt;/em&gt; is prime is at most 1/4&lt;sup&gt;T&lt;/sup&gt;,
which is ridiculously tiny even for small T, like T = 20. We can now
use this to quickly locate large prime numbers and generate semiprimes
for RSA cryptography.&lt;/p&gt;
&lt;p&gt;Unfortunately, these tests identify composite numbers but reveal
nothing about their factors. This makes them useless for factoring
numbers. Unlike many other hard problems, the factoring problem is
believed to not be NP-complete, and so may very well have an efficient
algorithm that no one has found yet. Such an algorithm would make RSA
encryption useless and win you &lt;a href="http://www.rsasecurity.com/rsalabs/node.asp?id=2093"&gt;$625,000&lt;/a&gt;.
Combinations of advanced techniques have managed to factor very large
numbers, but only at an enormous expense in computing time and
hardware. This is one of the most active current areas of research in
number theory and computer science.&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/p&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=462070" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/devdev/archive/tags/Algorithms/default.aspx">Algorithms</category><category domain="http://blogs.msdn.com/devdev/archive/tags/Theory/default.aspx">Theory</category></item></channel></rss>