<?xml version="1.0" encoding="UTF-8" ?>
<?xml-stylesheet type="text/xsl" href="http://blogs.msdn.com/utility/FeedStylesheets/rss.xsl" media="screen"?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" xmlns:wfw="http://wellformedweb.org/CommentAPI/"><channel><title>Fabulous Adventures In Coding</title><link>http://blogs.msdn.com/b/ericlippert/</link><description>Eric Lippert&amp;#39;s Blog</description><dc:language>en-US</dc:language><generator>Telligent Evolution Platform Developer Build (Build: 5.6.50428.7875)</generator><item><title>GUID guide, part three</title><link>http://blogs.msdn.com/b/ericlippert/archive/2012/05/07/guid-guide-part-three.aspx</link><pubDate>Mon, 07 May 2012 15:18:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:10297199</guid><dc:creator>Eric Lippert</dc:creator><slash:comments>22</slash:comments><wfw:commentRss xmlns:wfw="http://wellformedweb.org/CommentAPI/">http://blogs.msdn.com/b/ericlippert/rsscomments.aspx?WeblogPostID=10297199</wfw:commentRss><comments>http://blogs.msdn.com/b/ericlippert/archive/2012/05/07/guid-guide-part-three.aspx#comments</comments><description>&lt;div class="mine"&gt; &lt;p&gt;Let's recap: a GUID is a 128 bit integer that is used as a globally unique identifier. GUIDs are not a security system; they do not guarantee uniqueness in a world where hostile parties are deliberately attempting to cause collisions; rather, they provide a cheap and easy way for mutually benign parties to generate identifiers without collisions. One mechanism for ensuring global uniqueness is to generate the GUID so that its bits describe a unique position in spacetime: a machine with a specific network card at a specific time. The downside of this mechanism is that code artifacts with GUIDs embedded in them contain easily-decoded information about the machine used to generate the GUID. This naturally raises a privacy concern. &lt;/p&gt; &lt;p&gt;To address this concern, there is a second common method for generating GUIDs, and that is to choose the bits at random. Such GUIDs have a 4 as the first hex digit of the third section.&lt;/p&gt; &lt;p&gt;First off, what bits are we talking about when we say "the bits"? We already know that in a "random" GUID the first hex digit of the third section is always 4. Something I did not mention in the last episode was that there is additional version information stored in the GUID in the bits in the fourth section as well; you'll note that a GUID almost always has 8, 9, a or b as the first hex digit of the fourth section. So in total we have six bits reserved for version information, leaving 122 bits that can be chosen at random.&lt;/p&gt; &lt;p&gt;Second, why should we suppose that choosing a number at random produces uniqueness? Flipping a coin is random, but it certain does not produce a unique result! What we rely on here is probabilistic uniqueness. Flipping a single coin does not produce a unique result, but flipping the same coin 122 times in a row almost certainly produces a sequence of heads and tails that has never been seen before and will never be seen again. &lt;/p&gt; &lt;p&gt;Let's talk a bit about those probabilities. Suppose you have a particular randomly-generated GUID in hand. What is the probability that a &lt;em&gt;specific&lt;/em&gt; time that you randomly generate another GUID will produce a collision with your particular GUID? If the bits are chosen randomly and uniformly, clearly the probability of collision is one in 2&lt;sup&gt;122&lt;/sup&gt;. Now, what is the probability that over n generations of GUIDs, you produce a collision with your particular GUID? Those are independent rare events, so the probabilities add (*); the probability of a collision is n in 2&lt;sup&gt;122&lt;/sup&gt;. This 2&lt;sup&gt;122 &lt;/sup&gt;is an astonishingly large number. &lt;/p&gt; &lt;p&gt;There are on the order 2&lt;sup&gt;30&lt;/sup&gt; personal computers in the world (and of course lots of hand-held devices or non-PC computing devices that have more or less the same levels of computing power, but lets ignore those). Let's assume that we put all those PCs in the world to the task of generating GUIDs; if each one can generate, say, 2&lt;sup&gt;20&lt;/sup&gt; GUIDs per second then after only about 2&lt;sup&gt;72&lt;/sup&gt; seconds -- &lt;strong&gt;one hundred and fifty trillion years&lt;/strong&gt; -- you'll have a &lt;em&gt;very high&lt;/em&gt; chance of generating a collision with your specific GUID. And the odds of collision get pretty good after only thirty trillion years.&lt;/p&gt; &lt;p&gt;But that's looking for a collision with a specific GUID. Clearly the chances are a lot better of generating a collision &lt;em&gt;somewhere else&lt;/em&gt; along the way. Recall that a &lt;a href="http://blogs.msdn.com/b/ericlippert/archive/2010/03/22/socks-birthdays-and-hash-collisions.aspx"&gt;couple of years ago I analyzed how often you get any collision when generating random 32 bit numbers&lt;/a&gt;; it turns out that the probability of getting any collision gets extremely high when you get to around 2&lt;sup&gt;16&lt;/sup&gt; numbers generated. This generalizes; as a rule of thumb, the probability of getting a collision when generating a random n bit number gets large when you've generated around 2&lt;sup&gt;n/2&lt;/sup&gt; numbers. So if we put those billion PCs to work generating 122-bits-of-randomness GUIDs, the probability that two of them somewhere in there would collide gets really high after about 2&lt;sup&gt;61&lt;/sup&gt; GUIDs are generated. Since we're assuming that about 2&lt;sup&gt;30&lt;/sup&gt; machines are doing 2&lt;sup&gt;20&lt;/sup&gt; GUIDs per second, we'd expect a collision after about 2&lt;sup&gt;11&lt;/sup&gt; seconds, which is about an hour.&lt;/p&gt; &lt;p&gt;So clearly this system is not utterly foolproof; if we really, really wanted to, we could with high probability generate a GUID collision in only an hour, provided that we got every PC on the planet to dedicate an hour of time to doing so. &lt;/p&gt; &lt;p&gt;But of course we are not going to do that. The number of GUIDs generated per second worldwide is not anywhere even close to 2&lt;sup&gt;50&lt;/sup&gt;! I would be surprised if it were more than 2&lt;sup&gt;20 &lt;/sup&gt;GUIDs generated per second, worldwide, and therefore we could expect to wait about 2&lt;sup&gt;41&lt;/sup&gt; seconds, for there to be a reasonable chance of collision, which is about seventy thousand years. And if we are looking for a collision with a specific GUID, then again, it will take about a billion times longer than our initial estimate if we assume that a relatively small number of GUIDs are being generated worldwide per second.&lt;/p&gt; &lt;p&gt;So, in short: you should expect that any &lt;em&gt;particular&lt;/em&gt; random GUID will have a collision some time in the next thirty billion trillion years, and that there should be a collision between &lt;em&gt;any two&lt;/em&gt; GUIDs some time in the next seventy thousand years. &lt;/p&gt; &lt;p&gt;Those are pretty good odds.&lt;/p&gt; &lt;p&gt;Now, this is assuming that the GUIDs are chosen by a perfectly uniform random process. &lt;strong&gt;They are not&lt;/strong&gt;. GUIDs are in practice generated by a high-quality &lt;strong&gt;pseudo-random&lt;/strong&gt; number generator, not by a &lt;strong&gt;crypto-strength&lt;/strong&gt; random number generator. Here are some &lt;strong&gt;questions that I do not know the answers to&lt;/strong&gt;:&lt;/p&gt; &lt;ul&gt; &lt;li&gt;What source of entropy is used to seed that pseudo-random number generator?  &lt;li&gt;How many bits of entropy are used in the seed?  &lt;li&gt;If you have two virtual machines running on the same physical machine, do they share any of their entropy?  &lt;li&gt;Are any of those entropy bits from sources that would identify the machine (such as the MAC address) or the person who created the GUID?  &lt;li&gt;Given perfect knowledge of the GUID generation algorithm and a specific GUID, would it be possible to deduce facts about the entropy bits that were used as the seed?  &lt;li&gt;Given two GUIDs, is it possible to deduce the probability that they were both generated from a pseudo-random number generator seeded with the same entropy? (And therefore highly likely to be from the same machine.)&lt;/li&gt;&lt;/ul&gt; &lt;p&gt;I do not know the answers to any of these questions, and therefore it is wise for me to assume that the answers to the bottom four questions is "yes". Clearly it is far, far more difficult for someone to work out where and when a version-four GUID was create than a version-one GUID, which has that information directly in the GUID itself. But I do not know that it is &lt;em&gt;impossible&lt;/em&gt;. &lt;/p&gt; &lt;p&gt;There are yet other techniques for generating GUIDs. If there is a 2 in the first hex digit of the third section then it is a version 1 GUID with some of the timestamp bits have slightly different meanings. If there is a 3 or 5 then the bits are created by running a cryptographic hash function over a unique string; the uniqueness of the string is then derived from the fact that it is typically a URL. But rather than go into the details of those more exotic GUIDs, I think I will leave off here.&lt;/p&gt; &lt;p&gt;&lt;strong&gt;Summing up:&lt;/strong&gt;&lt;/p&gt; &lt;ul&gt; &lt;li&gt;GUIDs are guaranteed to be unique but not guaranteed to be random. &lt;strong&gt;Do not use them as random numbers.&lt;/strong&gt;  &lt;li&gt;GUIDs that are random numbers are &lt;strong&gt;not cryptographic strength&lt;/strong&gt; random numbers.  &lt;li&gt;GUIDs are only unique when everyone cooperates; if someone wants to re-use a previously-generated GUID and thereby artificially create a collision, you cannot stop them. &lt;strong&gt;GUIDs are not a security mechanism.&lt;/strong&gt;  &lt;li&gt;GUIDs have an internal structure; at least &lt;strong&gt;six of the bits are reserved&lt;/strong&gt; and have special meanings.  &lt;li&gt;GUIDs are &lt;strong&gt;allowed to be generated sequentially&lt;/strong&gt;, and in practice often are.  &lt;li&gt;GUIDs are &lt;strong&gt;only unique when taken as a whole&lt;/strong&gt;.  &lt;li&gt;GUIDs can be generated using&lt;strong&gt; a variety of algorithms&lt;/strong&gt;.  &lt;li&gt;GUIDs that are generated randomly are &lt;strong&gt;statistically highly unlikely to collide&lt;/strong&gt; in the foreseeable future.  &lt;li&gt;GUIDs &lt;strong&gt;could reveal information&lt;/strong&gt; about the time and place they were created, either directly in the case of version one GUIDs, or via cryptanalysis in the case of version four GUIDs.  &lt;li&gt;GUIDs might be generated by some &lt;strong&gt;entirely different algorithm&lt;/strong&gt; in the future.&lt;/li&gt;&lt;/ul&gt; &lt;hr&gt;  &lt;p&gt;(*) As the comments point out, this is an approximation that only holds if the probability is small and n is relatively small compared to the total number of possible outcomes.&lt;/p&gt;&lt;/div&gt;&lt;div style="clear:both;"&gt;&lt;/div&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=10297199" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/b/ericlippert/archive/tags/Mathematics/">Mathematics</category><category domain="http://blogs.msdn.com/b/ericlippert/archive/tags/Hashing/">Hashing</category><category domain="http://blogs.msdn.com/b/ericlippert/archive/tags/GUIDs/">GUIDs</category></item><item><title>GUID guide, part two</title><link>http://blogs.msdn.com/b/ericlippert/archive/2012/04/30/guid-guide-part-two.aspx</link><pubDate>Mon, 30 Apr 2012 14:11:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:10295002</guid><dc:creator>Eric Lippert</dc:creator><slash:comments>15</slash:comments><wfw:commentRss xmlns:wfw="http://wellformedweb.org/CommentAPI/">http://blogs.msdn.com/b/ericlippert/rsscomments.aspx?WeblogPostID=10295002</wfw:commentRss><comments>http://blogs.msdn.com/b/ericlippert/archive/2012/04/30/guid-guide-part-two.aspx#comments</comments><description>&lt;div class="mine"&gt; &lt;p&gt;So how is it that a GUID can be guaranteed to be unique without some sort of central authority that ensures uniqueness, a la the ISBN system?&lt;/p&gt; &lt;p&gt;Well, first off, notice that the number of possible GUIDs is &lt;em&gt;vastly&lt;/em&gt; larger than the number of possible ISBNs. Because the last of the thirteen digits is a checksum, there are only 10&lt;sup&gt;12&lt;/sup&gt; possible ISBNs. That is about a hundred unique ISBNs for every person on earth. That's almost exactly 2&lt;sup&gt;40&lt;/sup&gt;, so you could represent an ISBN by a 40 bit number (again, ignoring the checksum). There are 2&lt;sup&gt;128&lt;/sup&gt; possible GUIDs; that's about 40 billion billion billion unique GUIDs for every person on earth. This alone gives us the intuition that it ought to be pretty easy to ensure that two of them never collide; there are a &lt;em&gt;lot&lt;/em&gt; of GUIDs to choose from!&lt;/p&gt; &lt;p&gt;There are a number of possible strategies for making a unique GUID, and in fact information about the strategy used is encoded in the first four bits of the third "group"; almost every GUID you see will be of the form {xxxxxxxx-xxxx-&lt;font style="background-color: #ffff00"&gt;1xxx&lt;/font&gt;-xxxx-xxxxxxxxxxxx} or {xxxxxxxx-xxxx-&lt;font style="background-color: #ffff00"&gt;4xxx&lt;/font&gt;-xxxx-xxxxxxxxxxxx}. &lt;/p&gt; &lt;p&gt;If there is a one in that place then the algorithm used to guarantee uniqueness is essentially a variation on the ISBN strategy. The GUID is guaranteed to be unique "in space" by choosing some of the bits as the MAC address of the network card in the machine. (The tricky problem of ensuring that no two network cards in the world have the same MAC address is &lt;a href="http://en.wikipedia.org/wiki/Organizationally_Unique_Identifier"&gt;solved somehow by someone else&lt;/a&gt;; how that problem is solved, we don't particularly care. The cost of solving that problem is passed on to you, the consumer, in the purchase cost of the network card.) &lt;/p&gt; &lt;p&gt;(&lt;strong&gt;UPDATE&lt;/strong&gt;: &lt;a href="http://blogs.msdn.com/b/larryosterman/"&gt;Larry Osterman&lt;/a&gt; pointed out to me that of course, this MAC address solution is not foolproof. First, you could deliberately or accidentally set your MAC address to an already-used value. Second, a hardware manufacturer might forget to set the address at all and have it be all zeros. Third, two virtual machines might be using the same physical network card and those virtual machines could be generating GUIDs at exactly the same time fast enough to cause collisions.)&lt;/p&gt; &lt;p&gt;We know that we can rely on this as a source of uniqueness in space. Most of the remaining bits are a high-resolution timestamp. Therefore every generated GUID is unique in both space and time, and is therefore globally unique. &lt;/p&gt; &lt;p&gt;This system does in practice have a few weak spots. The most obvious one is that it does not work well if the machine does not have a network card! Version-one GUIDs generated on machines without network cards are not guaranteed to be unique. The less obvious one is that there is a slim chance that two GUIDs could be generated "at the same time". Perhaps two GUID generators were running on two different processors in the same machine at the same time. Or perhaps a GUID was generated, the clock in the machine was deliberately "turned back", and the same GUID was then generated again, just by bad luck. There are a few "extra" bits in a GUID that are used to mitigate these timing problems, so in practice they don't arise.&lt;/p&gt; &lt;p&gt;There are a number of interesting consequences of this algorithm. The first is that such GUIDs are almost completely &lt;em&gt;non-random&lt;/em&gt;. Many people seem to be of the mistaken belief that GUIDs are a source of &lt;em&gt;randomness&lt;/em&gt;, when in fact they are only guaranteed to be a source of &lt;em&gt;uniqueness&lt;/em&gt;.&lt;/p&gt; &lt;p&gt;The second is that it is possible for GUIDs generated on a particular machine with this algorithm to be &lt;em&gt;monotone increasing&lt;/em&gt;. This is actually a very nice property for GUIDs to have; GUIDs are often used as primary keys in databases, and it can be much cheaper to insert a large number of rows into an indexed-by-primary-key database table if the rows are already sorted and you're always inserting &lt;em&gt;after&lt;/em&gt; every previous entry. This again demonstrates that using a sequence of GUIDs as a sequence of random 128 bit numbers is a terrible idea; random numbers are typically not monotone increasing!&lt;/p&gt; &lt;p&gt;The third interesting consequence is that code or documents that contains a GUID generated with this first algorithm &lt;strong&gt;contain information that uniquely identifies the machine that was used to create the GUID&lt;/strong&gt;. Just as a skilled reader can determine interesting facts about a book from its ISBN, so too can a skilled reader determine when and where a GUID was generated if it has a one as its thirteenth hex digit. This fact was used when tracking down and prosecuting the author of the famous &lt;a href="http://en.wikipedia.org/wiki/Melissa_(computer_virus)"&gt;Melissa&lt;/a&gt; virus. (We'll discuss the implications of this in more detail in an upcoming episode.)&lt;/p&gt; &lt;p&gt;The fourth interesting consequence is that &lt;strong&gt;no proper subsequence of the bits of a GUID have the global uniqueness property&lt;/strong&gt;, &lt;a href="http://blogs.msdn.com/b/oldnewthing/archive/2008/06/27/8659071.aspx"&gt;as Raymond pointed out back in 2008&lt;/a&gt;. And indeed, we have no reason to expect that a smaller set of bits would have the same properties as a larger set of bits! You don't expect to be able to saw one aircraft in half and end up with two things that both fly.&lt;/p&gt; &lt;p&gt;&lt;strong&gt;Next time&lt;/strong&gt; we'll talk about GUIDs that have a four in their thirteenth hex digit; they use a completely different technique for ensuring uniqueness.&lt;/p&gt;&lt;/div&gt;&lt;div style="clear:both;"&gt;&lt;/div&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=10295002" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/b/ericlippert/archive/tags/GUIDs/">GUIDs</category></item><item><title>GUID Guide, part one</title><link>http://blogs.msdn.com/b/ericlippert/archive/2012/04/24/guid-guide-part-one.aspx</link><pubDate>Tue, 24 Apr 2012 13:32:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:10294173</guid><dc:creator>Eric Lippert</dc:creator><slash:comments>37</slash:comments><wfw:commentRss xmlns:wfw="http://wellformedweb.org/CommentAPI/">http://blogs.msdn.com/b/ericlippert/rsscomments.aspx?WeblogPostID=10294173</wfw:commentRss><comments>http://blogs.msdn.com/b/ericlippert/archive/2012/04/24/guid-guide-part-one.aspx#comments</comments><description>&lt;div class="mine"&gt; &lt;p&gt;What is a GUID? The acronym stands for "globally unique identifier"; GUIDs are also called UUIDs, which stands for "universally unique identifier". (It is unclear to me why we need two nigh-identical names for the same thing, but there you have it.) A GUID is essentially a 128 bit integer, and when written in its human-readable form, is written in hexadecimal in the pattern {xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx}.&lt;/p&gt; &lt;p&gt;The purpose of a GUID is, as the name implies, to &lt;em&gt;uniquely identify&lt;/em&gt; something, so that we can refer to that thing by its identifier and have confidence that everyone can agree upon what thing we are referring to. Think about this problem as it applies to, say, books. It is cumbersome to refer to a book by &lt;em&gt;quoting it in its entirety every time you mention it&lt;/em&gt;. Instead, we give every book an identifier in the form of its title. The problem with using a title as an identifier is that there may be many different books with the same title. I have three different books all entitled "&lt;u&gt;The C# Programming Language&lt;/u&gt;" on my desk right now; if I want to refer to one of them in particular, I'd typically have to give the edition number. But there is nothing (apart from their good sense) stopping some entirely different publisher from also publishing a book called "The C# Programming Language, fourth edition" that differs from the others. &lt;/p&gt; &lt;p&gt;Publishers have solved this problem by creating a globally unique identifier for each book called the &lt;a href="http://en.wikipedia.org/wiki/ISBN"&gt;International Standard Book Number&lt;/a&gt;, or ISBN. This is the 13-decimal-digit bar coded number you see on pretty much every book(*). How do publishers manage to get a unique number for each of the millions of books published? They divide and conquer; the digits of an ISBN each have a different meaning. Each country has been assigned a certain range of ISBN numbers that they can allocate; governments then further allocate subsets of their numbers to publishers. Publishers then decide for themselves how to assign the remaining digits to each book. The ISBNs for my three editions of the C# spec are 978-0-321-15491-6, 978-0-321-56299-9 and 978-0-321-74176-9. You'll notice that the first seven digits are exactly the same for each; they identify that this is a publishing industry code (978), that the book was published in a primarily English-speaking region (0), by Addison-Wesley (321). The next five digits are Addison-Wesley's choice, and the final digit is a checksum. If I wish to uniquely identify the fourth edition of the C# specification I need not state the ambiguous title at all; I can simply refer you to book number 978-0-321-74176-9, and everyone in the world can determine precisely which book I'm talking about.&lt;/p&gt; &lt;p&gt;An important and easily overlooked characteristic of the ISBN uniqueness system is that &lt;strong&gt;it only works if everyone who uses it is non-hostile&lt;/strong&gt;. If a rogue publisher decides to deliberately publish books with the ISBN numbers of existing books so as to create confusion then the usefulness of the identifier is compromised because it no longer uniquely identifies a book. ISBN numbers are not a &lt;em&gt;security system&lt;/em&gt;, and neither are GUIDs; &lt;strong&gt;ISBN numbers and GUIDs&amp;nbsp; prevent &lt;em&gt;accidental&lt;/em&gt; collisions&lt;/strong&gt;. Similarly, traffic lights only prevent accidental collisions &lt;em&gt;if everyone agrees to follow the rules of traffic lights&lt;/em&gt;; if anyone decides to go when the light is red then collisions might no longer be avoided, and if someone is attempting to deliberately cause a collision then traffic lights cannot stop them. &lt;/p&gt; &lt;p&gt;The ISBN system has the nice property that you can "decode" an ISBN and learn something about the book just from its number. But it has the enormous down side that it is &lt;em&gt;extraordinarily expensive to administer&lt;/em&gt;. There has to be international agreement on the general form of the identifier and on what the industry and language codes mean. In any given country there must be some organization (either a government body or private companies contracted by the government) to assign numbers to publishers. It can cost hundreds of dollars to obtain a unique ISBN.&lt;/p&gt; &lt;p&gt;GUIDs do not have this cost problem; GUIDs are free and there is no requirement that any governing body get involved to ensure their uniqueness. A GUID is a number that you can generate yourself and be guaranteed that no one else in the world will generate that same number. That seems a bit magical. How does that work? Over the next couple of episodes we'll take a look at how that magical property is achieved.&lt;/p&gt; &lt;hr&gt;  &lt;p&gt;(*) The attentive reader will note that there are usually &lt;em&gt;two&lt;/em&gt; bar codes on a book in the United States. The first one is the ISBN; the second bar code is the number 5 followed by a four digit number that is the publisher's suggested price of the book in American pennies. &lt;/p&gt;&lt;/div&gt;&lt;div style="clear:both;"&gt;&lt;/div&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=10294173" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/b/ericlippert/archive/tags/Books/">Books</category><category domain="http://blogs.msdn.com/b/ericlippert/archive/tags/GUIDs/">GUIDs</category></item><item><title>null is not false, part three</title><link>http://blogs.msdn.com/b/ericlippert/archive/2012/04/19/null-is-not-false-part-three.aspx</link><pubDate>Thu, 19 Apr 2012 14:03:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:10293701</guid><dc:creator>Eric Lippert</dc:creator><slash:comments>11</slash:comments><wfw:commentRss xmlns:wfw="http://wellformedweb.org/CommentAPI/">http://blogs.msdn.com/b/ericlippert/rsscomments.aspx?WeblogPostID=10293701</wfw:commentRss><comments>http://blogs.msdn.com/b/ericlippert/archive/2012/04/19/null-is-not-false-part-three.aspx#comments</comments><description>&lt;div class="mine"&gt; &lt;p&gt;Returning now to the subject at hand: we would like to allow user-defined "overloads" of the &lt;span class="code"&gt;&amp;amp;&lt;/span&gt; and &lt;span class="code"&gt;|&lt;/span&gt; operators in C#, and if we are going to have &lt;span class="code"&gt;&amp;amp;&lt;/span&gt; and &lt;span class="code"&gt;|&lt;/span&gt; be overloadable, it seems desirable to have &lt;span class="code"&gt;&amp;amp;&amp;amp;&lt;/span&gt; and &lt;span class="code"&gt;||&lt;/span&gt; be overloadable too. &lt;/p&gt; &lt;p&gt;But now we have a big design problem. We typically overload operators by making a &lt;strong&gt;method&lt;/strong&gt;:&lt;/p&gt; &lt;p&gt;&lt;span class="code"&gt;class C&lt;br&gt;{&lt;br&gt;&amp;nbsp; string s;&lt;br&gt;&amp;nbsp; public C(string s) { this.s = s; }&lt;br&gt;&amp;nbsp; public override string ToString() { return s; }&lt;br&gt;&amp;nbsp; public static C operator +(C x, C y) { return new C(x.s + "+" + y.s); }&lt;br&gt;}&lt;br&gt;...&lt;br&gt;Console.WriteLine(new C("123") + new C("456")); // "123+456"&lt;/span&gt;&lt;/p&gt; &lt;p&gt;But method arguments are eagerly evaluated in C#. We can't very well say:&lt;/p&gt; &lt;p&gt;&lt;span class="code"&gt;public static C operator &amp;amp;&amp;amp;(C x, C y)&amp;nbsp; { ... whatever ... } &lt;/span&gt;&lt;/p&gt; &lt;p&gt;because when you cay&lt;/p&gt; &lt;p&gt;&lt;span class="code"&gt;C c = GetFirstC() &amp;amp;&amp;amp; GetSecondC();&lt;/span&gt;&lt;/p&gt; &lt;p&gt;that is going to be rewritten as something like:&lt;/p&gt; &lt;p&gt;&lt;span class="code"&gt;C c = C.op_ShortCircuitAnd(GetFirstC(), GetSecondC());&lt;/span&gt;&lt;/p&gt; &lt;p&gt;which obviously evaluates both operands regardless of whether the left hand is "true" or "false". &lt;/p&gt; &lt;p&gt;Of course in modern-day C# we have a type which represents "perform this calculation that produces a result in the future, on demand"; that type is &lt;span class="code"&gt;Func&amp;lt;T&amp;gt;&lt;/span&gt;. We could implement this as:&lt;/p&gt; &lt;p&gt;&lt;span class="code"&gt;public static C operator &amp;amp;&amp;amp;(C x, Func&amp;lt;C&amp;gt; fy)&amp;nbsp; { ... whatever ... }&lt;/span&gt;&lt;/p&gt; &lt;p&gt;and now the rewrite becomes&lt;/p&gt; &lt;p&gt;&lt;span class="code"&gt;C c = C.op_ShortCircuitAnd(GetFirstC(), ()=&amp;gt;GetSecondC());&lt;/span&gt;&lt;/p&gt; &lt;p&gt;The method can then invoke the delegate only if it decides that it needs to evaluate the right hand side. &lt;/p&gt; &lt;p&gt;That would totally work, though it has a couple of problems. First, it is potentially expensive; even if we never use it, we go to all the trouble of allocating a delegate. Second, C# 1.0 did not have either lambdas or generic delegate types, so the whole thing would have been a non-starter back then.&lt;/p&gt; &lt;p&gt;What we settled on instead was to say that really there are two things going on here. First we must decide whether to evaluate the right hand side or not. If we do not evaluate the right hand side then the result can be the left hand side. If we do evaluate the right hand side then we combine the two evaluated sides using the non-short-circuiting operator.&lt;/p&gt; &lt;p&gt;It is that first operation -- decide whether or not to evaluate the right hand side -- that requires operator true and operator false. These are the "is this operand one that requires the right hand side to be implemented or not?" operators.&lt;/p&gt; &lt;p&gt;But now we face another problem. I said last time that we have a problem with the short-circuiting &lt;span class="code"&gt;&amp;amp;&amp;amp;&lt;/span&gt; and &lt;span class="code"&gt;||&lt;/span&gt; operators when considering values other than true or false: namely, does &lt;span class="code"&gt;x &amp;amp;&amp;amp; y&lt;/span&gt; mean "evaluate y if and only if x is true", or "evaluate y if and only if x is not false"? Obviously for straight-up Boolean x and y, those are the same, but for nullable Booleans they are not. And for our user-defined operator they are not the same either. After all, if you already had an unambiguous way to convert your type to true or false, you would simply implement an implicit conversion to bool and use the built-in &lt;span class="code"&gt;&amp;amp;&amp;amp;&lt;/span&gt; and &lt;span class="code"&gt;||&lt;/span&gt; operators. &lt;/p&gt; &lt;p&gt;The C# 1.0 design team decided that the rule is "evaluate y if and only if x is not false". We implement that rule by having an "operator false" that returns true if x is to be treated as false, and false if it is not to be treated as false. That's a bit confusing, I know. Maybe an example will help:&lt;/p&gt; &lt;p&gt;&lt;span class="code"&gt;class C&lt;br&gt;{&lt;br&gt;&amp;nbsp; string s;&lt;br&gt;&amp;nbsp; public C(string s) { this.s = s; }&lt;br&gt;&amp;nbsp; public override string ToString() { return s; }&lt;br&gt;&amp;nbsp; public static C operator &amp;amp;(C x, C y) { return new C(x.s + "&amp;amp;" + y.s); }&lt;br&gt;&amp;nbsp; public static C operator |(C x, C y) { return new C(x.s + "|" + y.s); }&lt;br&gt;&amp;nbsp; public static bool operator true(C x) { return x.s == "true"; }&lt;br&gt;&amp;nbsp; public static bool operator false(C x) { return x.s == "false"; }&lt;br&gt;}&lt;br&gt;...&lt;br&gt;C ctrue = new C("true");&lt;br&gt;C cfalse = new C("false");&lt;br&gt;C cfrob = new C("frob");&lt;/p&gt; &lt;p&gt;Console.WriteLine(ctrue &amp;amp;&amp;amp; cfrob); // true&amp;amp;frob&lt;br&gt;Console.WriteLine(cfalse &amp;amp;&amp;amp; cfrob); // false&lt;br&gt;Console.WriteLine(cfrob &amp;amp;&amp;amp; cfrob); // frob&amp;amp;frob&lt;/span&gt;&lt;/p&gt; &lt;p&gt;That is to say, &lt;span class="code"&gt;x &amp;amp;&amp;amp; y&lt;/span&gt; here is implemented as:&lt;/p&gt; &lt;p&gt;&lt;span class="code"&gt;C temp = x;&lt;br&gt;C result = C.op_false(temp) ? temp : temp &amp;amp; y;&lt;/span&gt;&lt;/p&gt; &lt;p&gt;And similarly, &lt;span class="code"&gt;x || y&lt;/span&gt; uses the "operator true" in the analogous manner.&lt;/p&gt; &lt;p&gt;A little known fact is that if you can use &lt;span class="code"&gt;&amp;amp;&amp;amp;&lt;/span&gt; and &lt;span class="code"&gt;||&lt;/span&gt; with a user-defined type, then you can also use them in control flows that take a bool, like &lt;span class="code"&gt;if&lt;/span&gt; and &lt;span class="code"&gt;while&lt;/span&gt;. This means that you can in fact say:&lt;/p&gt; &lt;p&gt;&lt;span class="code"&gt;if (ctrue &amp;amp;&amp;amp; cfrob) ...&lt;/span&gt;&lt;/p&gt; &lt;p&gt;because that becomes the moral equivalent of:&lt;/p&gt; &lt;p&gt;&lt;span class="code"&gt;C temp = ctrue;&lt;br&gt;C result = C.op_false(temp) ? temp : temp &amp;amp; cfrb;&lt;br&gt;bool b = C.op_true(result);&lt;br&gt;if (b) ...&lt;/span&gt;&lt;/p&gt; &lt;p&gt;Pretty neat, eh?&lt;/p&gt;&lt;/div&gt;&lt;div style="clear:both;"&gt;&lt;/div&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=10293701" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/b/ericlippert/archive/tags/C_2300_/">C#</category><category domain="http://blogs.msdn.com/b/ericlippert/archive/tags/Language+Design/">Language Design</category></item><item><title>A brief digression</title><link>http://blogs.msdn.com/b/ericlippert/archive/2012/04/17/a-brief-digression.aspx</link><pubDate>Tue, 17 Apr 2012 14:04:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:10293794</guid><dc:creator>Eric Lippert</dc:creator><slash:comments>14</slash:comments><wfw:commentRss xmlns:wfw="http://wellformedweb.org/CommentAPI/">http://blogs.msdn.com/b/ericlippert/rsscomments.aspx?WeblogPostID=10293794</wfw:commentRss><comments>http://blogs.msdn.com/b/ericlippert/archive/2012/04/17/a-brief-digression.aspx#comments</comments><description>&lt;div class="mine"&gt;
&lt;p&gt;Before we continue our exploration of truthiness in C#, a brief digression. I mentioned last time the "knights and knaves" puzzles of logician Raymond Smullyan. Though I do enjoy those puzzles, my favourite of his puzzles are his chess puzzles, and my second favourite are his combinatory logic puzzles. Here's an example of the latter, adapted from the puzzle on page 134 of &lt;span style="text-decoration: underline;"&gt;Satan, Cantor and Infinity&lt;/span&gt;.&lt;/p&gt;
&lt;p&gt;We have an alphabet consisting of the letters S, E, R, M, K and V. You can make strings out of these letters;&amp;nbsp;a "string" is defined as a ordered sequence of letters drawn from the given alphabet; the sequence can be&amp;nbsp;of any finite length.&amp;nbsp;Some strings are said to "name" other strings.&lt;/p&gt;
&lt;p&gt;I'll use "x" and "y" as variables that stand in for strings, and furthermore we can assume that x and y never stand for empty strings. (And that moreover, in the fourth rule, y is a string with at least two characters.)&amp;nbsp;The rules of naming are:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;SxE names x&lt;/li&gt;
&lt;li&gt;if x names y then Rx names yy&lt;/li&gt;
&lt;li&gt;if x names y then Mx names SyE&lt;/li&gt;
&lt;li&gt;if x names y then Kx names the string formed by removing the leftmost character from y&lt;/li&gt;
&lt;li&gt;if x names y then Vx names the string formed by reversing the letters of y&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;So, for example, SRME names RM, by the first rule. Because SRME names RM, we know that RSRME names RMRM by the second rule. And thus KRSRME names MRM.&lt;/p&gt;
&lt;p&gt;There has been a small amount of confusion in the comments; note that I am not saying that some strings are "legal" and some are "illegal" in any way; rather, I am saying that some strings just happen to "name" other strings. A string that does not name any other string is still a perfectly "good" string; no string is more or less valid than any other. "RM" for example does not name any string.&lt;/p&gt;
&lt;p&gt;The challenge --&amp;nbsp;which probably will not be too hard for computer programmers, but will be a bit tricky for non-programmers who haven't read the fifteen pages of easier puzzles leading up&amp;nbsp;to this one&amp;nbsp;-- is to &lt;strong&gt;find a string that names itself&lt;/strong&gt;. Smullyan's solution is 18 letters long and he challenges the reader to find a shorter one. I have found a 12 letter solution. I conjecture that 12 letters is the smallest such string, but I have not proven it. I'll put both Smullyan's solution and my solution in the comments. Good luck!&lt;/p&gt;
&lt;p&gt;UPDATE: There is a &lt;strong&gt;nine&lt;/strong&gt;-letter solution, and it is minimal. Nice work!&lt;/p&gt;
&lt;/div&gt;&lt;div style="clear:both;"&gt;&lt;/div&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=10293794" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/b/ericlippert/archive/tags/Puzzles/">Puzzles</category></item><item><title>null is not false, part two</title><link>http://blogs.msdn.com/b/ericlippert/archive/2012/04/12/null-is-not-false-part-two.aspx</link><pubDate>Thu, 12 Apr 2012 16:29:11 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:10293219</guid><dc:creator>Eric Lippert</dc:creator><slash:comments>16</slash:comments><wfw:commentRss xmlns:wfw="http://wellformedweb.org/CommentAPI/">http://blogs.msdn.com/b/ericlippert/rsscomments.aspx?WeblogPostID=10293219</wfw:commentRss><comments>http://blogs.msdn.com/b/ericlippert/archive/2012/04/12/null-is-not-false-part-two.aspx#comments</comments><description>&lt;div class="mine"&gt; &lt;p&gt;In &lt;a href="http://en.wikipedia.org/wiki/Raymond_Smullyan"&gt;Raymond Smullyan&lt;/a&gt;'s delightful books about the &lt;a href="http://en.wikipedia.org/wiki/Knights_and_Knaves"&gt;Island of Knights and Knaves&lt;/a&gt; -- where, you'll recall, knights make only true statements and knaves make only false statements -- the knights and knaves are of course clever literary devices to explore problems in deductive (*) logic. Smullyan, to my recollection, never explores what happens when knights and knaves make statements which are disingenuous half-truths, &lt;a href="http://www.thisamericanlife.org/radio-archives/episode/460/retraction"&gt;authorial license in pursuit of a larger truth&lt;/a&gt;, or other forms of &lt;a href="http://en.wikipedia.org/wiki/Truthiness"&gt;truthiness&lt;/a&gt;. A nullable Boolean in C# gives us, if not quite the notion of &lt;em&gt;truthiness&lt;/em&gt;, at least the notion that true and false are not the only possible values of a predicate: there is also "null", whatever that means. &lt;/p&gt; &lt;p&gt;What does that mean? A null Boolean can mean "there is a truth state, but I just don't know what it is": for example, if you queried a database on December 1st to ask "were the sales figures for November higher than they were in October?" the answer is either true or false, but the database might not know the answer because not all the figures are in yet. The right answer in that case would be to say "null", meaning "there is an answer but I do not know what it is." &lt;/p&gt; &lt;p&gt;Or, a null Boolean can mean "the question has no answer at all, not even true or false". True or false: the present king of France is bald. The number of currently existing kings of France -- zero -- is equal to the number of currently existing bald kings of France, but it seems off-putting to say that a statement is "vacuously true" in this manner when we could more sensibly &lt;em&gt;deny the validity of the question&lt;/em&gt;. There are certainly analogous situations in computer programming where we want to express the notion that &lt;em&gt;the query is so malformed as to not have a truth value at all&lt;/em&gt;, and "null" seems like a sensible value in those cases.&lt;/p&gt; &lt;p&gt;Because null can mean "I don't know", almost every "lifted to nullable" operator in C# results in null if any operand is null. The sum of 123 and null is null because of course the answer to the question "what is the sum of 123 and something I don't know" is "I don't know!" The notable exceptions to this rule are equality, which says that two null values are equal, and the logical "and" and "or" operators, which have some very interesting behaviour. When you say &lt;span class="code"&gt;x &amp;amp; y&lt;/span&gt; for nullable Booleans, the rule is not "if either is null then the result is null". Rather, the rule is "if either is false then the result is false, otherwise, if either is null then the result is null, otherwise, the result is true". And similarly for &lt;span class="code"&gt;x | y&lt;/span&gt; -- the rule is "if either is true then the result is true, otherwise if either is null then the result is null, otherwise the result is false". These rules obey our intuition about what "and" and "or" mean logically provided that "null" means "I don't know". That is the truth value of "(something true) or (something I don't know)" is clearly true regardless of whether the thing you don't know is true or false. But if "null" means "the question has no answer at all" then the truth value of "(something true) or (something that makes no sense)" probably should be "something that makes no sense".&lt;/p&gt; &lt;p&gt;Things get weirder though when you start to consider the "short circuiting" operators, &lt;span class="code"&gt;&amp;amp;&amp;amp;&lt;/span&gt; and &lt;span class="code"&gt;||&lt;/span&gt;. As you probably know, the &lt;span class="code"&gt;&amp;amp;&amp;amp;&lt;/span&gt; and &lt;span class="code"&gt;||&lt;/span&gt; operators on Booleans are just like the &lt;span class="code"&gt;&amp;amp;&lt;/span&gt; and &lt;span class="code"&gt;|&lt;/span&gt; operators, except that the &lt;span class="code"&gt;&amp;amp;&amp;amp;&lt;/span&gt; operator does not even bother to evaluate the right hand side if the left hand side is false, and the &lt;span class="code"&gt;||&lt;/span&gt; operator does not evaluate the right hand side if the left hand side is true. After we've evaluated the left hand side of either operator, we *might* have enough information to know the final answer. We can therefore (1) save the expense of computing the other side, and (2) allow the evaluation of the right hand side to depend on a precondition established by the truth or falsity of the left hand side. The most common example of (2) is of course &lt;span class="code"&gt;if (s == null || s.Length == 0)&lt;/span&gt; because the right hand side would have crashed and burned if evaluated when the left hand side is true. &lt;/p&gt; &lt;p&gt;The &lt;span class="code"&gt;&amp;amp;&amp;amp;&lt;/span&gt; and &lt;span class="code"&gt;|| &lt;/span&gt;operators are not "lifted to nullable" because doing so is problematic. The whole point of the short-circuiting operator is to avoid evaluating the right hand side, but we cannot do so and still match the behaviour of the unlifted version! Suppose we have &lt;span class="code"&gt;x &amp;amp;&amp;amp; y&lt;/span&gt; for two nullable Boolean expressions. Let's break down all the cases:&lt;/p&gt; &lt;ul&gt; &lt;li&gt;x is false: We do not evaluate y, and the result is false.  &lt;li&gt;x is true: We do evaluate y, and the result is the value of y  &lt;li&gt;x is null: Now what do we do? We have two choices:  &lt;ul&gt; &lt;li&gt;We evaluate y, violating the nice property that y is only evaluated if x is true. The result is false if y is false, null otherwise.  &lt;li&gt;We do not evaluate y. The result must be either false or null.  &lt;ul&gt; &lt;li&gt;If the result is false even though y would have evaluated to null, then we have resulted in false incorrectly.  &lt;li&gt;If the result is null even though y would have evaluated to false, then we have resulted in null incorrectly.&lt;/li&gt;&lt;/ul&gt;&lt;/li&gt;&lt;/ul&gt;&lt;/li&gt;&lt;/ul&gt; &lt;p&gt;In short, either we sometimes evaluate y when we shouldn't, or we sometimes return a value that does not match the value that x &amp;amp; y would have produced. The way out of this dilemma is to cut the feature entirely.&lt;/p&gt; &lt;p&gt;I said last time that I'd talk about the role of &lt;span class="code"&gt;operator true&lt;/span&gt; and &lt;span class="code"&gt;operator false&lt;/span&gt; in C#, but I think I shall leave that to the next episode.&lt;/p&gt; &lt;hr&gt;  &lt;p&gt;(*) Smullyan's book of &lt;em&gt;combinatory&lt;/em&gt; &lt;em&gt;logic&lt;/em&gt; puzzles, &lt;u&gt;To Mock A Mockingbird&lt;/u&gt;, is equally delightful and I recommend it for anyone who wants a playful introduction to the subject.&lt;/p&gt;&lt;/div&gt;&lt;div style="clear:both;"&gt;&lt;/div&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=10293219" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/b/ericlippert/archive/tags/C_2300_/">C#</category><category domain="http://blogs.msdn.com/b/ericlippert/archive/tags/Language+Design/">Language Design</category></item><item><title>null is not false</title><link>http://blogs.msdn.com/b/ericlippert/archive/2012/03/26/null-is-not-false.aspx</link><pubDate>Mon, 26 Mar 2012 16:24:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:10287623</guid><dc:creator>Eric Lippert</dc:creator><slash:comments>38</slash:comments><wfw:commentRss xmlns:wfw="http://wellformedweb.org/CommentAPI/">http://blogs.msdn.com/b/ericlippert/rsscomments.aspx?WeblogPostID=10287623</wfw:commentRss><comments>http://blogs.msdn.com/b/ericlippert/archive/2012/03/26/null-is-not-false.aspx#comments</comments><description>&lt;div class="mine"&gt; &lt;p&gt;The way you typically represent a "missing" or "invalid" value in C# is to use the "null" value of the type. Every reference type has a "null" value; that is, the reference that does not actually refer to anything. And every "normal" value type has a corresponding "nullable" value type which has a null value.&lt;/p&gt; &lt;p&gt;The way these concepts are implemented is completely different. A reference is typically implemented behind the scenes as a 32 or 64 bit number. As we've discussed &lt;a href="http://blogs.msdn.com/b/ericlippert/archive/2009/02/17/references-are-not-addresses.aspx"&gt;previously in this space&lt;/a&gt;, that number should logically be treated as an "opaque" handle that only the garbage collector knows about, but in practice that number is the offset into the virtual memory space of the process that the referred-to object lives at, inside the managed heap. The number zero is reserved as the representation of null because the operating system reserves the first few pages of virtual memory as invalid, always. There is no chance that by some accident, the zero address is going to be a valid address in the heap.&lt;/p&gt; &lt;p&gt;By contrast, a nullable value type is simply an instance of the value type plus a Boolean that indicates whether the value is to be treated as a value, or as null. It's just a syntactic sugar for passing around a flag. This is because value types need not have any "special" value that has no other meaning; a byte has 256 possible values and every one of them is valid, so a nullable byte has to have some additional storage.&lt;/p&gt; &lt;p&gt;Some languages allow null values of value types or reference types, or both, to be implicitly treated as Booleans. In C, you can say:&lt;/p&gt; &lt;p&gt;&lt;span class="code"&gt;int* x = whatever();&lt;br&gt;if (x) ...&lt;/span&gt;&lt;/p&gt; &lt;p&gt;and that is treated as if you'd said "&lt;span class="code"&gt;if (x != null)&lt;/span&gt;". And similarly for nullable value types; in some languages a null value type is implicitly treated as "false".&lt;/p&gt; &lt;p&gt;The designers of C# considered those features and rejected them. First, because treating references or nullable value types as Booleans is a confusing idiom and a potential rich source of bugs. And second, because semantically it seems presumptuous to automatically translate null -- which should mean "this value is missing" or "this value is unknown" -- to "this value is logically false".&lt;/p&gt; &lt;p&gt;In particular, we want to treat nullable bools as having three states: true, false and null, and not as having three states: true, false and different-kind-of-false. Treating null nullable Booleans as false leads to a number of oddities. Suppose we did, and suppose x is a nullable bool that is equal to null:&lt;/p&gt; &lt;p&gt;&lt;span class="code"&gt;if (x) &lt;br&gt;&amp;nbsp; Foo();&lt;br&gt;if (!x) &lt;br&gt;&amp;nbsp; Bar();&lt;/span&gt;&lt;/p&gt; &lt;p&gt;Neither Foo nor Bar is executed because "not null" is of course also null. (The answer to "what is the opposite of this unknown value?" is "an unknown value".) Does it not seem strange that &lt;span class="code"&gt;x&lt;/span&gt; and &lt;span class="code"&gt;!x&lt;/span&gt; are both treated as false? Similarly, &lt;span class="code"&gt;if (x | !x)&lt;/span&gt; would also be treated as false, which also seems bizarre.&lt;/p&gt; &lt;p&gt;The solution to the problem of these oddities is to avoid the problem in the first place, and not make nulls behave as though they were false.&lt;/p&gt; &lt;p&gt;Next time we'll look at a different aspect of truth-determining: just what is up with those "true" and "false" user-defined operators?&lt;/p&gt;&lt;/div&gt;&lt;div style="clear:both;"&gt;&lt;/div&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=10287623" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/b/ericlippert/archive/tags/C_2300_/">C#</category><category domain="http://blogs.msdn.com/b/ericlippert/archive/tags/Language+Design/">Language Design</category><category domain="http://blogs.msdn.com/b/ericlippert/archive/tags/reference+equality/">reference equality</category><category domain="http://blogs.msdn.com/b/ericlippert/archive/tags/references/">references</category></item><item><title>Why not automatically infer constraints?</title><link>http://blogs.msdn.com/b/ericlippert/archive/2012/03/09/why-not-automatically-infer-constraints.aspx</link><pubDate>Fri, 09 Mar 2012 19:09:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:10251901</guid><dc:creator>Eric Lippert</dc:creator><slash:comments>27</slash:comments><wfw:commentRss xmlns:wfw="http://wellformedweb.org/CommentAPI/">http://blogs.msdn.com/b/ericlippert/rsscomments.aspx?WeblogPostID=10251901</wfw:commentRss><comments>http://blogs.msdn.com/b/ericlippert/archive/2012/03/09/why-not-automatically-infer-constraints.aspx#comments</comments><description>&lt;div class="mine"&gt; &lt;p&gt;UPDATE: Whoops! I accidentally set a draft of this article to automatically publish on a day that I was away on vacation. The fact that it was (1) not purple and (2) introduced the topic and then stopped in mid-sentence were both clues that this was an unfinished edit. Sorry about that; I thought I had set it to *not* publish on Monday. I am not so good with these computer thingies apparently.&lt;/p&gt; &lt;p&gt;I spent the weekend not thinking about computers by lying beside a pool in Palm Springs, which I can definitively report is an awesome way to spend a rainy weekend in Seattle. I can also definitely report that Peter Frampton has lost almost all his hair and absolutely none of his talent; the man is amazing. If you like 1970's hard rock guitar solos, you've got &lt;a href="http://frampton.com/live/"&gt;just a couple more weeks&lt;/a&gt; to hear him perform all of Frampton Comes Alive. &lt;/p&gt; &lt;p&gt;Right, let's actually finish off that article then:&lt;/p&gt; &lt;p&gt;--------------&lt;/p&gt; &lt;p&gt;Suppose you have a generic base type with a constraint:&lt;/p&gt; &lt;p&gt;&lt;span class="code"&gt;class Bravo&amp;lt;T&amp;gt; where T : IComparable&amp;lt;T&amp;gt; { ... }&lt;/span&gt;&lt;/p&gt; &lt;p&gt;If you make a generic derived class in the obvious way:&lt;/p&gt; &lt;p&gt;&lt;span class="code"&gt;class Delta&amp;lt;U&amp;gt; : Bravo&amp;lt;U&amp;gt; { ... }&lt;/span&gt;&lt;/p&gt; &lt;p&gt;then the C# compiler gives you an error:&lt;/p&gt; &lt;p&gt;&lt;span class="code"&gt;error CS0314: The type 'U' cannot be used as type parameter 'T' in the generic type or method 'Bravo&amp;lt;T&amp;gt;'. There is no boxing conversion or type parameter conversion from 'U' to 'System.IComparable&amp;lt;U&amp;gt;'.&lt;/span&gt;&lt;/p&gt; &lt;p&gt;Which seems reasonable; every construction of Bravo&amp;lt;T&amp;gt; is required to meet the constraints on T and we have no evidence whatsoever that the type supplied for U will meet those constraints.&lt;/p&gt; &lt;p&gt;But that's only one way of looking at the problem; another way of looking at it is that we &lt;em&gt;do&lt;/em&gt; have evidence that the person who declared U &lt;em&gt;expected that U meets the constraints of T&lt;/em&gt;, since they used it as T. Given that evidence one might reasonably then expect the compiler to simply silently place the same constraint upon U. U would then meet the constraint on T; any error then would not be on the declaration of Delta's base class, but rather, upon any code which constructs Delta&amp;lt;U&amp;gt; such that Bravo&amp;lt;T&amp;gt;'s constraints are violated.&lt;/p&gt; &lt;p&gt;I'm often asked why the compiler does not implement this feature or that feature, and of course the answer is always the same: &lt;strong&gt;because no one implemented it.&lt;/strong&gt; Features start off as unimplemented and only become implemented when people spend effort implementing them: no effort, no feature. This is an unsatisfying answer of course, because usually the person asking the question has made the assumption that the feature is so &lt;em&gt;obviously good&lt;/em&gt; that we need to have had a reason to &lt;strong&gt;not&lt;/strong&gt; implement it. I assure you that no, we actually don't need a reason to not implement any feature no matter how obviously good. But that said, it might be interesting to consider what sort of pros and cons we'd consider if asked to implement the "silently put inferred constraints on class type parameters" feature.&lt;/p&gt; &lt;p&gt;As C# has evolved, clearly more and more features involve the compiler silently making inferences on your behalf: method group conversions, method type inference, implicitly typed locals, implicitly typed arrays, implicitly typed lambdas, and so on, all involve a great deal of inference work by the compiler. You would think we could do the same thing for generic type constraints. However, the problem is not as easy as it looks. The easy case mentioned above is, well, easy. Things quickly get complicated:&lt;/p&gt; &lt;p&gt;&lt;span class="code"&gt;class Echo&amp;lt;W, X&amp;gt; where W : IComparable&amp;lt;W&amp;gt; where X : IComparable&amp;lt;X&amp;gt; { }&lt;br&gt;class Foxtrot&amp;lt;Y&amp;gt; : Echo&amp;lt;Y, Y&amp;gt; { }&lt;/span&gt;&lt;/p&gt; &lt;p&gt;The supposition here is now that Y must be implicitly constrained to be both IComparable&amp;lt;Y&amp;gt; and... IComparable&amp;lt;Y&amp;gt;.&amp;nbsp; OK, that seems reasonable. But what if we then change that to:&lt;/p&gt; &lt;p&gt;&lt;span class="code"&gt;class Echo&amp;lt;W, X&amp;gt; where W : Foo where X : Bar { }&lt;br&gt;class Foxtrot&amp;lt;Y&amp;gt; : Echo&amp;lt;Y, Y&amp;gt; { }&lt;/span&gt;&lt;/p&gt; &lt;p&gt;Suppose Foo and Bar are class types with no relationship between them. Now there is no possible type argument for Y that meets the inferred constraint. Is the compiler now required to tell you that fact? How smart does it have to be about that?&lt;/p&gt; &lt;p&gt;Let's make this a bit more complicated. I'll just make something up off the top of my head:&lt;/p&gt; &lt;p&gt;&lt;span class="code"&gt;class Golf&amp;lt;T&amp;gt; where T : Hotel&amp;lt;T&amp;gt; {}&lt;br&gt;class Hotel&amp;lt;U&amp;gt; : Golf&amp;lt;Indigo&amp;lt;U&amp;gt;&amp;gt; where U : IJuliet&amp;lt;Hotel&amp;lt;U&amp;gt;&amp;gt; {}&lt;br&gt;class Indigo&amp;lt;V&amp;gt; : Hotel&amp;lt;Golf&amp;lt;V&amp;gt;&amp;gt; where V : IKilo&amp;lt;V&amp;gt; {}&lt;br&gt;interface IJuliet&amp;lt;W&amp;gt; {}&lt;br&gt;interface IKilo&amp;lt;X&amp;gt;&lt;/span&gt;&lt;/p&gt; &lt;p&gt;What are the constraints that we have to infer? Well for T to be a Hotel&amp;lt;T&amp;gt;, T has got to be an IJuliet&amp;lt;Hotel&amp;lt;T&amp;gt;&amp;gt;, so we should add that constraint. But in order to be an IJuliet&amp;lt;Hotel&amp;lt;T&amp;gt;&amp;gt;, T has got to meet the constraints on Hotel&amp;lt;T&amp;gt;... oh, wait, we already added that constraint. Looks like our constraint adder is going to have to be resistant to cycles. But wait, do we have *all* the constraints on Hotel&amp;lt;T&amp;gt;? So far we have only evaluated the *stated* constraints. U in Hotel&amp;lt;U&amp;gt; not only has to be an IJuliet&amp;lt;Hotel&amp;lt;U&amp;gt;&amp;gt;, it also has to be a legal Indigo&amp;lt;U&amp;gt;, which means it needs to be an IKilo&amp;lt;U&amp;gt;, which means we should add a constraint to T that T be IKilo&amp;lt;T&amp;gt;. &lt;/p&gt; &lt;p&gt;And so on. I'm not going to go through a full analysis of this thing. The point is, the analysis is complicated, the analysis may go into infinite loops very easily, and it is not at all clear when you know that you've worked out the full set of type constraints when types refer to each other in arbitrary ways. (And we haven't even considered what happens if the interfaces are covariant or contravariant!) I don't like adding inference algorithms to the compiler that require cycle detection and do potentially unbounded amounts of work. The possibility of getting the algorithm wrong is very real. Even if we get the algorithm right, the odds that the implementation will be buggy is very high. If we add this feature then the compiler needs to be able to get the right answer all the time, and to give a sensible error message that helps the user if there is no right answer. Both problems seem very difficult.&lt;/p&gt; &lt;p&gt;Moreover, why are we stopping with base types? It seems like if we're going to do the feature of inferring constraints, then we ought to consume information about the entire class, not just the base types:&lt;/p&gt; &lt;p&gt;&lt;span class="code"&gt;class Mike&amp;lt;T&amp;gt; where T : November {}&lt;br&gt;class Oscar&amp;lt;V&amp;gt; { Mike&amp;lt;V&amp;gt; bv; }&lt;/span&gt;&lt;/p&gt; &lt;p&gt;Should we deduce that V must be November because Oscar&amp;lt;V&amp;gt; has a field of type Mike&amp;lt;V&amp;gt;? I don't see why the base types are any more special than the field types.  &lt;p&gt;What if Oscar&amp;lt;V&amp;gt; had used Mike&amp;lt;V&amp;gt; as a return type of a method? Or a parameter type? Or as the type of a local variable? Where does it stop?  &lt;p&gt;Basically, the feature is too much pain for too little gain. When you construct a generic, you are the one required to supply a proof to the compiler that you have satisfied the constraints. C# is not a "figure out what the developer meant to say and say it for them" language; it's a "tell the developer when they have supplied too little information" language.  &lt;p&gt;&lt;/code&gt;&lt;/p&gt;&lt;/div&gt;&lt;div style="clear:both;"&gt;&lt;/div&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=10251901" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/b/ericlippert/archive/tags/C_2300_/">C#</category><category domain="http://blogs.msdn.com/b/ericlippert/archive/tags/Type+Inference/">Type Inference</category></item><item><title>Why are local variables definitely assigned in unreachable statements?</title><link>http://blogs.msdn.com/b/ericlippert/archive/2012/03/05/why-are-local-variables-definitely-assigned-in-unreachable-statements.aspx</link><pubDate>Mon, 05 Mar 2012 22:05:18 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:10277981</guid><dc:creator>Eric Lippert</dc:creator><slash:comments>23</slash:comments><wfw:commentRss xmlns:wfw="http://wellformedweb.org/CommentAPI/">http://blogs.msdn.com/b/ericlippert/rsscomments.aspx?WeblogPostID=10277981</wfw:commentRss><comments>http://blogs.msdn.com/b/ericlippert/archive/2012/03/05/why-are-local-variables-definitely-assigned-in-unreachable-statements.aspx#comments</comments><description>&lt;div class="mine"&gt; &lt;p&gt;You're probably all familiar with the feature of C# which disallows reading from a local variable before it has been "definitely assigned":&lt;/p&gt; &lt;p&gt;&lt;span class="code"&gt;void M()&lt;br&gt;{&lt;br&gt;&amp;nbsp; int x;&lt;br&gt;&amp;nbsp; if (Q())&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; x = 123;&lt;br&gt;&amp;nbsp; if (R())&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; Console.WriteLine(x); // illegal!&lt;br&gt;}&lt;/span&gt;&lt;/p&gt; &lt;p&gt;This is illegal because there is a path through the code which, if taken, results in the local variable being read from before it has been assigned; in this case, if Q() returns false and R() returns true.&lt;/p&gt; &lt;p&gt;The reason why we want to make this illegal is not, as many people believe, because the local variable is going to be initialized to garbage and we want to protect you from garbage. We do in fact automatically initialize locals to their default values. (Though the C and C++ programming languages do not, and will cheerfully allow you to read garbage from an uninitialized local.) Rather, it is because the existence of such a code path is probably a bug, and we want to throw you in the pit of quality; you should have to work hard to write that bug.&lt;/p&gt; &lt;p&gt;The way in which the compiler determines if there is any path through the code which causes x to be read before it is written is quite interesting, but that's a subject for another day. The question I want to consider today is: &lt;strong&gt;why are local variables considered to be definitely assigned inside unreachable statements?&lt;/strong&gt;&lt;/p&gt; &lt;p&gt;&lt;span class="code"&gt;void M()&lt;br&gt;{&lt;br&gt;&amp;nbsp; int x;&lt;br&gt;&amp;nbsp; if (Q())&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; x = 123;&lt;br&gt;&amp;nbsp; if (false)&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; Console.WriteLine(x); // legal!&lt;br&gt;}&lt;/span&gt;&lt;/p&gt; &lt;p&gt;First off, obviously the way I've described the feature immediately gives the intuition that this ought to be legal. Clearly there is no path through the code which results in the local variable being read before it is assigned. In fact, there is no path through the code that results in the local variable being read, period!&lt;/p&gt; &lt;p&gt;On the other hand: &lt;strong&gt;that code looks wrong&lt;/strong&gt;. We do not allow syntax errors, or overload resolution errors, or convertibility errors, or any other kind of error, in an unreachable statement, so why should we allow definite assignment errors?&lt;/p&gt; &lt;p&gt;It's a subtle point, I admit. Here's the thing. You have to ask yourself "why is there unreachable code in the method in the first place?" Either that unreachable code is deliberate, or it is an error.&lt;/p&gt; &lt;p&gt;If it is an error, then something is deeply messed up here. The programmer did not intend the written control flow in the first place. It seems premature to guess at what the definite assignment errors are in the unreachable code, since the control flow that would be used to determine definite assignment state is wrong. We are going to give a warning about the unreachable code; the user can then notice the warning and fix the control flow. Once it is fixed, then we can consider whether there are definite assignment problems with the fixed control flow.&lt;/p&gt; &lt;p&gt;Now, why on earth would someone deliberately make unreachable code? It does in fact happen; actually it happens quite frequently when dealing with libraries made by another team that are not quite done yet:&lt;/p&gt; &lt;p&gt;&lt;span class="code"&gt;// If we can resrov the frob into a glob, do that and then blorg the result.&lt;br&gt;// Even if the frob is not a glob, we know it is definitely a resrovable blob,&lt;br&gt;// so resrov it as a blob and then blorg the result. Finally, fribble&lt;br&gt;// the blorgable result, regardless of whether it was a glob or a blob.&lt;br&gt;void BlorgFrob(Frob frob)&lt;br&gt;{&lt;br&gt;&amp;nbsp; IBlorgable blorgable;&lt;br&gt;&amp;nbsp; // TODO: Glob.TryResrov has not been ported to C# yet.&lt;br&gt;&amp;nbsp; if (false /* Glob.TryResrov(out blorgable, frob) */)&lt;br&gt;&amp;nbsp; {&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; BlorgGlob(blorgable);&lt;br&gt;&amp;nbsp; }&lt;br&gt;&amp;nbsp; else &lt;br&gt;&amp;nbsp; {&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; blorgable = Blob.Resrov(frob)&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; BlorgBlob(blorgable);&lt;br&gt;&amp;nbsp; }&lt;br&gt;&amp;nbsp; blorgable.Fribble(frob);&lt;br&gt;}&lt;/span&gt;&lt;/p&gt; &lt;p&gt;Should &lt;span class="code"&gt;BlorgGlob(blorgable)&lt;/span&gt; be an error? It seems plausible that it should not be an error; after all, it's never going to read the local. But it is still nice that we get overload resolution errors reported inside the unreachable code, just in case there is something wrong there.&lt;/p&gt;&lt;/div&gt;&lt;div style="clear:both;"&gt;&lt;/div&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=10277981" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/b/ericlippert/archive/tags/C_2300_/">C#</category><category domain="http://blogs.msdn.com/b/ericlippert/archive/tags/reachability/">reachability</category><category domain="http://blogs.msdn.com/b/ericlippert/archive/tags/definite+assignment/">definite assignment</category></item><item><title>The C# 5.0 beta release is now available</title><link>http://blogs.msdn.com/b/ericlippert/archive/2012/02/29/the-c-5-0-beta-release-is-now-available.aspx</link><pubDate>Wed, 29 Feb 2012 16:07:08 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:10274522</guid><dc:creator>Eric Lippert</dc:creator><slash:comments>19</slash:comments><wfw:commentRss xmlns:wfw="http://wellformedweb.org/CommentAPI/">http://blogs.msdn.com/b/ericlippert/rsscomments.aspx?WeblogPostID=10274522</wfw:commentRss><comments>http://blogs.msdn.com/b/ericlippert/archive/2012/02/29/the-c-5-0-beta-release-is-now-available.aspx#comments</comments><description>&lt;div class="mine"&gt; &lt;p&gt;I am super excited to announce that the beta release of Visual Studio version 11 (which includes the .NET CLR version 4.5, Visual Basic version 11 and C# version 5) is available for download right now. As you know if you've been following our CTP releases, in C# and VB we've greatly improved the ease of programming in an asynchronous "continuation-passing" style such that the compilers do the hard work of rearranging the code into a form amenable to asynchrony, so that you don't have to. VB also now has iterator blocks, and has surpassed C# in that &lt;strong&gt;VB iterator blocks can even be inside lambdas&lt;/strong&gt;. Pretty neat.&lt;/p&gt; &lt;p&gt;And that's just the big stuff for C# and VB; there is an enormous amount of new stuff here in the languages, the tools and the runtime. I can't wait to see what developers do with all of these amazing features. Please, &lt;a href="http://www.microsoft.com/visualstudio/11/en-us/downloads"&gt;download the beta&lt;/a&gt; and &lt;a href="http://social.msdn.microsoft.com/Forums/en-US/category/vsvnext"&gt;give us feedback&lt;/a&gt; on what you like and what you don't like.&lt;/p&gt; &lt;p&gt;See the following blog posts for more details:&lt;/p&gt; &lt;p&gt;* &lt;a href="http://blogs.msdn.com/b/jasonz/archive/2012/02/29/welcome-to-the-beta-of-visual-studio-11-and-net-framework-4-5.aspx"&gt;Jason Zander&lt;/a&gt;&lt;br&gt;* &lt;a href="http://blogs.msdn.com/b/csharpfaq/archive/2012/02/29/visual-studio-11-beta-is-here.aspx"&gt;C# Team&lt;/a&gt;&lt;br&gt;* &lt;a href="http://blogs.msdn.com/b/vbteam/archive/2012/02/28/visual-basic-11-beta-available-for-download.aspx"&gt;VB Team&lt;/a&gt;&lt;/p&gt;&lt;/div&gt;&lt;div style="clear:both;"&gt;&lt;/div&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=10274522" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/b/ericlippert/archive/tags/C_2300_/">C#</category><category domain="http://blogs.msdn.com/b/ericlippert/archive/tags/C_2300_+5-0/">C# 5.0</category></item><item><title>The Solution To The Simple Puzzle</title><link>http://blogs.msdn.com/b/ericlippert/archive/2012/02/27/the-solution-to-the-simple-puzzle.aspx</link><pubDate>Mon, 27 Feb 2012 14:04:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:10270649</guid><dc:creator>Eric Lippert</dc:creator><slash:comments>8</slash:comments><wfw:commentRss xmlns:wfw="http://wellformedweb.org/CommentAPI/">http://blogs.msdn.com/b/ericlippert/rsscomments.aspx?WeblogPostID=10270649</wfw:commentRss><comments>http://blogs.msdn.com/b/ericlippert/archive/2012/02/27/the-solution-to-the-simple-puzzle.aspx#comments</comments><description>&lt;div class="mine"&gt; &lt;p&gt;The first time I ran my histogram visualizer I asked for a Cauchy distribution with a minimum of -10 and a maximum of 10, and of course I got a graph that looks much like the one from my article of last week:&lt;/p&gt; &lt;p&gt;&amp;nbsp;&lt;/p&gt; &lt;p&gt;&lt;a href="http://blogs.msdn.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-29-89-metablogapi/5141.cauchywrong0_5F00_4.png"&gt;&lt;img style="background-image: none; border-right-width: 0px; padding-left: 0px; padding-right: 0px; display: block; float: none; border-top-width: 0px; border-bottom-width: 0px; margin-left: auto; border-left-width: 0px; margin-right: auto; padding-top: 0px" title="cauchywrong0" border="0" alt="cauchywrong0" src="http://blogs.msdn.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-29-89-metablogapi/8787.cauchywrong0_5F00_thumb_5F00_1.png" width="470" height="371"&gt;&lt;/a&gt;&lt;/p&gt; &lt;p&gt;Looks perfectly reasonable; I guess my program is correct right out of the gate, because &lt;strong&gt;I am that awesome&lt;/strong&gt;!&lt;/p&gt; &lt;p&gt;Then I went to make a graph of a uniform distribution with a minimum of zero and a maximum of one, but I forgot to update the actual query; it still gave me a Cauchy distribution. Here's that same Cauchy distribution this time graphed only from 0 to 1. Oh, the pain:&lt;/p&gt; &lt;p align="center"&gt;&lt;a href="http://blogs.msdn.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-29-89-metablogapi/8032.cauchyWrong_5F00_4.png"&gt;&lt;img style="background-image: none; border-bottom: 0px; border-left: 0px; padding-left: 0px; padding-right: 0px; display: inline; border-top: 0px; border-right: 0px; padding-top: 0px" title="cauchyWrong" border="0" alt="cauchyWrong" src="http://blogs.msdn.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-29-89-metablogapi/4745.cauchyWrong_5F00_thumb_5F00_1.png" width="415" height="366"&gt;&lt;/a&gt;&lt;/p&gt; &lt;p&gt;Which is obviously neither uniform nor Cauchy. Equally obvious: &lt;strong&gt;I am not sufficiently awesome to write a twenty-line program without a trivial floating point bug the first time.&lt;/strong&gt;&lt;/p&gt; &lt;p&gt;The bug, which is very subtle in the first graph, was now obvious: &lt;strong&gt;the calculation to determine what the count is for the leftmost bucket is wrong&lt;/strong&gt;. Why? Because converting a double to an integer simply discards the fractional part, effectively truncating towards zero, and "towards zero" is not &lt;em&gt;downwards&lt;/em&gt; if any datum is negative. That means that the leftmost bucket got everything that was supposed to be in it, and everything that was supposed to be in the bucket to its left as well! The solution is either to take the floor of the number before turning it into an int, or to check to see if the double is in the right range &lt;em&gt;before&lt;/em&gt; truncating it, not after. &lt;/p&gt;&lt;/div&gt;&lt;div style="clear:both;"&gt;&lt;/div&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=10270649" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/b/ericlippert/archive/tags/C_2300_/">C#</category><category domain="http://blogs.msdn.com/b/ericlippert/archive/tags/Floating+Point+Arithmetic/">Floating Point Arithmetic</category><category domain="http://blogs.msdn.com/b/ericlippert/archive/tags/Puzzles/">Puzzles</category><category domain="http://blogs.msdn.com/b/ericlippert/archive/tags/Mathematics/">Mathematics</category><category domain="http://blogs.msdn.com/b/ericlippert/archive/tags/Mistakes/">Mistakes</category><category domain="http://blogs.msdn.com/b/ericlippert/archive/tags/Charts/">Charts</category></item><item><title>A Simple Puzzle</title><link>http://blogs.msdn.com/b/ericlippert/archive/2012/02/24/a-simple-puzzle.aspx</link><pubDate>Fri, 24 Feb 2012 14:45:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:10270125</guid><dc:creator>Eric Lippert</dc:creator><slash:comments>24</slash:comments><wfw:commentRss xmlns:wfw="http://wellformedweb.org/CommentAPI/">http://blogs.msdn.com/b/ericlippert/rsscomments.aspx?WeblogPostID=10270125</wfw:commentRss><comments>http://blogs.msdn.com/b/ericlippert/archive/2012/02/24/a-simple-puzzle.aspx#comments</comments><description>&lt;div class="mine"&gt; &lt;p&gt;My original version of the histogram-generating code that I whipped up for the previous episode of FAIC contained a subtle bug. Can you spot it &lt;strong&gt;without going back and reading the corrected code?&lt;/strong&gt;&lt;/p&gt; &lt;p&gt;&lt;span class="code"&gt;private static int[] CreateHistogram(IEnumerable&amp;lt;double&amp;gt; data, int buckets, double min, double max)&lt;br&gt;{&lt;br&gt;&amp;nbsp; int[] results = new int[buckets];&lt;br&gt;&amp;nbsp; double multiplier = buckets / (max - min);&lt;br&gt;&amp;nbsp; foreach (double datum in data)&lt;br&gt;&amp;nbsp; {&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; int index = (int) ((datum - min) * multiplier);&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; if (0 &amp;lt;= index &amp;amp;&amp;amp; index &amp;lt; buckets)&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; results[index] += 1;&lt;br&gt;&amp;nbsp; }&lt;br&gt;&amp;nbsp; return results;&lt;br&gt;}&lt;/span&gt;&lt;/p&gt; &lt;p&gt;Note that of course if this were production code, instead of demo code I whipped up in five minutes, it would be a lot more robust in its error detection; the bug that I am looking for is a bona fide error in the logic of the method, rather than things like "the method does not verify that min is smaller than max", and so on. &lt;/p&gt; &lt;p&gt;A hint: the first time I ran this code and displayed the results, the generated histogram looked fine. Then I made a small change to the arguments and the resulting histogram image was obviously wrong. Can you spot the defect?&lt;/p&gt;&lt;/div&gt;&lt;div style="clear:both;"&gt;&lt;/div&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=10270125" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/b/ericlippert/archive/tags/C_2300_/">C#</category><category domain="http://blogs.msdn.com/b/ericlippert/archive/tags/Puzzles/">Puzzles</category><category domain="http://blogs.msdn.com/b/ericlippert/archive/tags/Mathematics/">Mathematics</category><category domain="http://blogs.msdn.com/b/ericlippert/archive/tags/Mistakes/">Mistakes</category></item><item><title>Generating Random Non-Uniform Data In C#</title><link>http://blogs.msdn.com/b/ericlippert/archive/2012/02/21/generating-random-non-uniform-data-in-c.aspx</link><pubDate>Tue, 21 Feb 2012 14:20:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:10270119</guid><dc:creator>Eric Lippert</dc:creator><slash:comments>14</slash:comments><wfw:commentRss xmlns:wfw="http://wellformedweb.org/CommentAPI/">http://blogs.msdn.com/b/ericlippert/rsscomments.aspx?WeblogPostID=10270119</wfw:commentRss><comments>http://blogs.msdn.com/b/ericlippert/archive/2012/02/21/generating-random-non-uniform-data-in-c.aspx#comments</comments><description>&lt;div class="mine"&gt; &lt;p&gt;When building simulations of real-world phenomena, or when generating test data for algorithms that will be consuming information from the real world, it is often highly desirable to produce pseudo-random data that conform to some nonuniform probability distribution. &lt;/p&gt; &lt;p&gt;But perhaps I have already lost some readers who do not remember STATS 101 all those years ago. I sure don't. Let's take a step back. &lt;/p&gt; &lt;p&gt;The .NET Framework conveniently provides you with a pseudo-random number generator that produces an approximately &lt;strong&gt;uniform distribution&lt;/strong&gt; (*) of doubles between zero and one. By a "uniform distribution" we mean that if you took a whole lot of those random numbers and put them in two buckets, based on whether they were greater than one half or smaller than one half, then you would expect to see roughly half the numbers in each bucket; there is no bias towards either side. But moreover: if you took those same numbers and put them in ten buckets, based on whether they were in the range 0.0-0.1, 0.1-0.2, 0.2-0.3, and so on, you would also expect to see no particular bias towards any of those buckets either. In fact, &lt;strong&gt;no matter how many buckets of uniform size you make&lt;/strong&gt;, if you have a large enough sample of random numbers then &lt;strong&gt;each bucket will end up with approximately the same number of items in it&lt;/strong&gt;.&lt;/p&gt; &lt;p&gt;That's what we mean by a "uniform probability distribution": the number of items you find in the bucket is proportional to the &lt;em&gt;size&lt;/em&gt; of the bucket, and has nothing to do with the &lt;em&gt;position&lt;/em&gt; of the bucket. Here I've generated one hundred thousand pseudo-random numbers between zero and one, put them into fifty buckets, and graphed the number of items found in each bucket: (**)&lt;/p&gt; &lt;p&gt;&lt;a href="http://blogs.msdn.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-29-89-metablogapi/2867.uniform1_5F00_2.png"&gt;&lt;img style="background-image: none; border-right-width: 0px; padding-left: 0px; padding-right: 0px; display: block; float: none; border-top-width: 0px; border-bottom-width: 0px; margin-left: auto; border-left-width: 0px; margin-right: auto; padding-top: 0px" title="uniform1" border="0" alt="uniform1" src="http://blogs.msdn.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-29-89-metablogapi/5670.uniform1_5F00_thumb.png" width="530" height="425"&gt;&lt;/a&gt;&lt;/p&gt; &lt;p&gt;&amp;nbsp;&lt;/p&gt; &lt;p&gt;&amp;nbsp;&lt;/p&gt; &lt;p&gt;Such a graph is called a &lt;strong&gt;histogram&lt;/strong&gt;. As you can see, each of the 50 buckets contains approximately 2% of the total, and there seems to be no particular pattern to which buckets were lucky enough to get a few more than average. This histogram is entirely characteristic of a uniform distribution.&lt;/p&gt; &lt;p&gt;Now let's consider for a moment the relationship between the geometry and probability. Suppose we chose one of those data at random, from the hundred thousand points. What is the probability that it was in bucket 25? Obviously, the "height" of that bucket on the graph divided by the total number of items. But geometrically, we could think of this probability as the &lt;strong&gt;area under the curve&lt;/strong&gt; limited to the region of bucket 25 divided by the &lt;strong&gt;total blue area&lt;/strong&gt;.&lt;/p&gt; &lt;p&gt;Do you notice something vexing about this graph though? The axes are completely determined by the arbitrary factors that I chose in running my simulation. Let's normalize this. We'll re-label the horizontal axis so that it represents the highest value that would have been put in that bucket. How shall we re-label the vertical axis? Well, we said that the probability of a particular datum being found in a particular bucket was proportional to the area of that bucket. Therefore, &lt;strong&gt;let's recalibrate the axis such that the total area of the graph is 100%.&lt;/strong&gt; That way, the answer to "what is the probability of a given item being found in such and such a region?" is precisely the area of that region:&lt;/p&gt; &lt;p&gt;&lt;a href="http://blogs.msdn.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-29-89-metablogapi/4812.uniform2_5F00_2.png"&gt;&lt;img style="background-image: none; border-right-width: 0px; padding-left: 0px; padding-right: 0px; display: block; float: none; border-top-width: 0px; border-bottom-width: 0px; margin-left: auto; border-left-width: 0px; margin-right: auto; padding-top: 0px" title="uniform2" border="0" alt="uniform2" src="http://blogs.msdn.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-29-89-metablogapi/5001.uniform2_5F00_thumb.png" width="530" height="425"&gt;&lt;/a&gt;&lt;/p&gt; &lt;p&gt;&amp;nbsp;&lt;/p&gt; &lt;p&gt;This normalization has the beautiful property &lt;strong&gt;that no matter how many random numbers we generate&lt;/strong&gt;, and &lt;strong&gt;no matter how many buckets we put them in&lt;/strong&gt;, every histogram will look &lt;strong&gt;more or less the same&lt;/strong&gt;, and will have the same labels on the axes; the distribution will go from zero to one on both axes and the total area will always be 100%.&lt;/p&gt; &lt;p&gt;Now imagine what would happen in the limiting case. Suppose we kept generating more and more numbers; trillions of them. And we made the buckets smaller and smaller. If you squint a bit, you might as well just say that you're going to do a graph as though the whole thing were continuous, and just draw the extraordinarily simple line graph:&lt;/p&gt; &lt;p&gt;&lt;a href="http://blogs.msdn.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-29-89-metablogapi/0285.uniform3_5F00_2.png"&gt;&lt;img style="background-image: none; border-right-width: 0px; padding-left: 0px; padding-right: 0px; display: block; float: none; border-top-width: 0px; border-bottom-width: 0px; margin-left: auto; border-left-width: 0px; margin-right: auto; padding-top: 0px" title="uniform3" border="0" alt="uniform3" src="http://blogs.msdn.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-29-89-metablogapi/3823.uniform3_5F00_thumb.png" width="540" height="436"&gt;&lt;/a&gt;&lt;/p&gt; &lt;p&gt;&amp;nbsp;&lt;/p&gt; &lt;p&gt;And there you have it; the continuous uniform distribution. But that is certainly not the only distribution! There is the famous "bell curve" or "normal" distribution. The range of the normal distribution is infinite on both ends; a normally-distributed random value can have any magnitude positive or negative. Here's &lt;a href="http://en.wikipedia.org/wiki/Normal_distribution"&gt;Wikipedia's image of four different bell curves&lt;/a&gt;:&lt;/p&gt; &lt;p&gt;&amp;nbsp;&lt;/p&gt; &lt;p&gt;&lt;a href="http://blogs.msdn.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-29-89-metablogapi/6177.500px_2D00_Normal_5F00_Distribution_5F00_PDF_5F00_svg_5F00_2.png"&gt;&lt;img style="background-image: none; border-right-width: 0px; padding-left: 0px; padding-right: 0px; display: block; float: none; border-top-width: 0px; border-bottom-width: 0px; margin-left: auto; border-left-width: 0px; margin-right: auto; padding-top: 0px" title="500px-Normal_Distribution_PDF_svg" border="0" alt="500px-Normal_Distribution_PDF_svg" src="http://blogs.msdn.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-29-89-metablogapi/1362.500px_2D00_Normal_5F00_Distribution_5F00_PDF_5F00_svg_5F00_thumb.png" width="500" height="319"&gt;&lt;/a&gt;&lt;/p&gt; &lt;p&gt;Remember, these are just the limiting case of a histogram with a huge number of buckets and a huge number of samples. With these distributions, again, the probability of a value being found in a particular range is proportional to the area under the curve in that range. And of course, the area under each one is the same -- 100% -- which is why the skinny bell is so much taller than the wide one. And notice also how &lt;strong&gt;the vast majority of the area is very, very close to the peak in every case.&lt;/strong&gt; The normal distribution is a "thin, long tail" distribution. &lt;/p&gt; &lt;p&gt;There are other distributions as well. For example, the Cauchy distribution also has the characteristic bell shape, but its tail is a bit "fatter" than the normal distribution. Here, &lt;a href="http://en.wikipedia.org/wiki/Cauchy_distribution"&gt;again from Wikipedia we have four Cauchy distributions&lt;/a&gt;:&lt;/p&gt; &lt;p&gt;&amp;nbsp;&lt;/p&gt; &lt;p&gt;&lt;a href="http://blogs.msdn.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-29-89-metablogapi/2451.500px_2D00_Cauchy_5F00_pdf_5F00_svg_5F00_2.png"&gt;&lt;img style="background-image: none; border-right-width: 0px; padding-left: 0px; padding-right: 0px; display: block; float: none; border-top-width: 0px; border-bottom-width: 0px; margin-left: auto; border-left-width: 0px; margin-right: auto; padding-top: 0px" title="500px-Cauchy_pdf_svg" border="0" alt="500px-Cauchy_pdf_svg" src="http://blogs.msdn.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-29-89-metablogapi/5187.500px_2D00_Cauchy_5F00_pdf_5F00_svg_5F00_thumb.png" width="500" height="400"&gt;&lt;/a&gt;&lt;/p&gt; &lt;p&gt;Notice how much fatter those tails are, and how the "bell" correspondingly has to be narrower to ensure that the area still adds up to 100%. &lt;/p&gt; &lt;p&gt;You know something that is still a bit odd about these graphs though? Because the probability is proportional to the area under the curve, the actual value of the curve at any given point isn't really &lt;em&gt;that&lt;/em&gt; meaningful. Another way to characterize a probability distribution is to work out the probability of a given number &lt;strong&gt;or any number less than it&lt;/strong&gt; being chosen; that is &lt;strong&gt;the area under the curve to the left of that number&lt;/strong&gt;. Of course, the area under the curve is simply the integral, so let's graph the integral. We know that such a curve is going to increase monitonically from a limit on the left of zero to a limit on the right of one, since the probability of getting any very, very small number is close to zero, and the probability of getting any number smaller than a very large number is close to 100%. If we graph the integral of these four Cauchy distributions we get:&lt;/p&gt; &lt;p&gt;&lt;a href="http://blogs.msdn.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-29-89-metablogapi/4431.500px_2D00_Cauchy_5F00_cdf_5F00_svg_5F00_2.png"&gt;&lt;img style="background-image: none; border-right-width: 0px; padding-left: 0px; padding-right: 0px; display: block; float: none; border-top-width: 0px; border-bottom-width: 0px; margin-left: auto; border-left-width: 0px; margin-right: auto; padding-top: 0px" title="500px-Cauchy_cdf_svg" border="0" alt="500px-Cauchy_cdf_svg" src="http://blogs.msdn.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-29-89-metablogapi/1781.500px_2D00_Cauchy_5F00_cdf_5F00_svg_5F00_thumb.png" width="500" height="400"&gt;&lt;/a&gt;&lt;/p&gt; &lt;p&gt;That graph is called the cumulative distribution and notice how it is extremely easy to read! Consider the purple line; clearly the probability of getting a value of two or less is about 85%; we can just read it right off. This graph has exactly the same information content as the "histogram limit" probability distribution; its just a bit easier to read. It is quite difficult to tell from reading the original graph that about 85% of the area under the curve is to the left of the 2.&lt;/p&gt; &lt;p&gt;Apropos of nothing in particular, I note that we have a function here that is &lt;strong&gt;monotone increasing&lt;/strong&gt;; that means that we can &lt;strong&gt;invert&lt;/strong&gt; it. Let's swap the x and y axes:&lt;/p&gt; &lt;p&gt;&lt;a href="http://blogs.msdn.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-29-89-metablogapi/7142.Cauchy2_5F00_2.png"&gt;&lt;img style="background-image: none; border-right-width: 0px; padding-left: 0px; padding-right: 0px; display: block; float: none; border-top-width: 0px; border-bottom-width: 0px; margin-left: auto; border-left-width: 0px; margin-right: auto; padding-top: 0px" title="Cauchy2" border="0" alt="Cauchy2" src="http://blogs.msdn.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-29-89-metablogapi/1307.Cauchy2_5F00_thumb.png" width="400" height="500"&gt;&lt;/a&gt;&lt;/p&gt; &lt;p&gt;This function -- the inverse of the integral of the probability distribution -- is called the &lt;strong&gt;quantile function&lt;/strong&gt;. It tell you "given a probability p from zero to one, what is the number x such that the probability of getting a number less than x is equal to p?" That graph should look familiar to anyone who has taken high school trig; the quantile function of the Cauchy distribution is simply the tangent function rejiggered to go from zero to one instead of -π / 2 to π / 2.&lt;/p&gt; &lt;p&gt;So that's pretty neat; the Cauchy bell-shaped curve is in fact just the derivative of the arctangent! But &lt;strong&gt;so what? How is this relevant?&lt;/strong&gt;&lt;/p&gt; &lt;p&gt;We started this fabulous adventure by saying that sometimes you want to generate numbers that are random but not uniform. &lt;strong&gt;The graph above transforms uniformly-distributed random numbers into Cauchy-distributed random numbers&lt;/strong&gt;. &lt;a href="http://en.wikipedia.org/wiki/Inverse_transform_sampling"&gt;It is amazing, but true!&lt;/a&gt; Check it out:&lt;/p&gt; &lt;p&gt;&lt;span class="code"&gt;private static Random random = new Random();&lt;br&gt;private static IEnumerable&amp;lt;double&amp;gt; UniformDistribution()&lt;br&gt;{&lt;br&gt;&amp;nbsp; while (true) yield return random.NextDouble();&lt;br&gt;}&lt;br&gt;&lt;br&gt;private static double CauchyQuantile(double p)&lt;br&gt;{&lt;br&gt;&amp;nbsp; return Math.Tan(Math.PI * (p - 0.5));&lt;br&gt;}&lt;br&gt;&lt;br&gt;private static int[] CreateHistogram(IEnumerable&amp;lt;double&amp;gt; data, int buckets, double min, double max)&lt;br&gt;{&lt;br&gt;&amp;nbsp; int[] results = new int[buckets];&lt;br&gt;&amp;nbsp; double multiplier = buckets / (max - min);&lt;br&gt;&amp;nbsp; foreach (double datum in data)&lt;br&gt;&amp;nbsp; {&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; double index = (datum - min) * multiplier;&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; if (0.0 &amp;lt;= index &amp;amp;&amp;amp; index &amp;lt; buckets)&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; results[(int)index] += 1;&lt;br&gt;&amp;nbsp; }&lt;br&gt;&amp;nbsp; return results;&lt;br&gt;}&lt;/span&gt;&lt;/p&gt; &lt;p&gt;and now you can tie the whole thing together with:&lt;/p&gt; &lt;p&gt;&lt;span class="code"&gt;var cauchy = from x in UniformDistribution() select CauchyQuantile(x);&lt;br&gt;int[] histogram = CreateHistogram(cauchy.Take(100000), 50, -5.0, 5.0);&lt;/span&gt;&lt;/p&gt; &lt;p&gt;And if you graph the histogram, sure enough, you get a Cauchy distribution:&lt;/p&gt; &lt;p&gt;&lt;a href="http://blogs.msdn.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-29-89-metablogapi/8132.cauchy_5F00_2.png"&gt;&lt;img style="background-image: none; border-right-width: 0px; padding-left: 0px; padding-right: 0px; display: block; float: none; border-top-width: 0px; border-bottom-width: 0px; margin-left: auto; border-left-width: 0px; margin-right: auto; padding-top: 0px" title="cauchy" border="0" alt="cauchy" src="http://blogs.msdn.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-29-89-metablogapi/3835.cauchy_5F00_thumb.png" width="531" height="428"&gt;&lt;/a&gt;&lt;br&gt;&lt;/p&gt; &lt;p&gt;&amp;nbsp;&lt;/p&gt; &lt;p&gt;So, summing up: if you need random numbers that are distributed according to some distribution, all you have to do is &lt;/p&gt; &lt;p&gt;(1) work out the desired &lt;strong&gt;probability distribution function&lt;/strong&gt; such that the area under a portion of the curve is equal to the probability of&amp;nbsp; a value being randomly generated in that range, then&lt;br&gt;(2) integrate the probability distribution to determine the &lt;strong&gt;cumulative distribution&lt;/strong&gt;, then&lt;br&gt;(3) invert the cumulative distribution to get the &lt;strong&gt;quantile function&lt;/strong&gt;, then&lt;br&gt;(4) transform your uniformly distributed random data by running it through the quantile function, &lt;/p&gt; &lt;p&gt;and you're done. Piece of cake!&lt;/p&gt; &lt;p&gt;&lt;strong&gt;Next time&lt;/strong&gt;: a simple puzzle based on an error I made while writing the code above.&lt;/p&gt; &lt;hr&gt;  &lt;p&gt;(*) A commenter correctly points out that the set of real values representable by doubles is not uniformly distributed across the range of 0.0 to 1.0, and that moreover, the Random class is not documented as guaranteeing a uniform distribution. However, for most practical applications it is a reasonable approximation of a uniform distribution. &lt;/p&gt; &lt;p&gt;(**) Using the awesome &lt;strong&gt;Microsoft Chart Control&lt;/strong&gt;, which now ships with the CLR. It was previously only available as a separate download.&lt;/p&gt;&lt;/div&gt;&lt;div style="clear:both;"&gt;&lt;/div&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=10270119" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/b/ericlippert/archive/tags/Rarefied+Heights/">Rarefied Heights</category><category domain="http://blogs.msdn.com/b/ericlippert/archive/tags/C_2300_/">C#</category><category domain="http://blogs.msdn.com/b/ericlippert/archive/tags/Mathematics/">Mathematics</category><category domain="http://blogs.msdn.com/b/ericlippert/archive/tags/Charts/">Charts</category></item><item><title>Bad Metaphors</title><link>http://blogs.msdn.com/b/ericlippert/archive/2012/02/13/bad-metaphors.aspx</link><pubDate>Mon, 13 Feb 2012 17:15:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:10265991</guid><dc:creator>Eric Lippert</dc:creator><slash:comments>36</slash:comments><wfw:commentRss xmlns:wfw="http://wellformedweb.org/CommentAPI/">http://blogs.msdn.com/b/ericlippert/rsscomments.aspx?WeblogPostID=10265991</wfw:commentRss><comments>http://blogs.msdn.com/b/ericlippert/archive/2012/02/13/bad-metaphors.aspx#comments</comments><description>&lt;div class="mine"&gt; &lt;p&gt;The standard way to teach beginner OO programmers about classes is to make a metaphor to the real world. And indeed, I do this all the time in this blog, usually to the animal kingdom. A "class" in real life codifies a commonality amongst a certain set of objects: mammals, for example, have many things in common; they have backbones, can grow hair, can make their own heat, and so on. A class in a programming language does the same thing: codifies a commonality amongst a certain set of objects via the mechanism of inheritance. Inheritance ensures commonalities because, as we've already discussed, "inheritance" by definition means "all (*) the members of the base type are also members of the derived type".&lt;/p&gt; &lt;p&gt;Inheritance relationships amongst classes (**) are usually designed to model "&lt;strong&gt;is a special kind of&lt;/strong&gt;" relationships. A giraffe is a special kind of mammal, so the class &lt;span class="code"&gt;Giraffe&lt;/span&gt; inherits from the class &lt;span class="code"&gt;Mammal&lt;/span&gt;, which in turn inherits from &lt;span class="code"&gt;Animal&lt;/span&gt;, which inherits from &lt;span class="code"&gt;Object&lt;/span&gt;. And that's great; this clearly represents "is a special kind of" relationships. I have always, however, had a problem with the fundamental metaphor of "inheritance". Why "inheritance"? You inherit genetic information, property, and if you're a &lt;a href="http://en.wikipedia.org/wiki/Christopher_Guest"&gt;titular lord&lt;/a&gt;, your peerage, from your parents. And if you make a diagram of a class hierarchy, it looks a bit like a "family tree" in which the derived class is the "child" of the base "parent" class. And indeed, people often speak of the base class as the "parent" class of a "child" derived class, particularly when speaking to beginners.&lt;/p&gt; &lt;p&gt;But &lt;strong&gt;the "parent-to-child inheritance" metaphor is awful&lt;/strong&gt;. A giraffe is not "a child of mammal"; a giraffe is a child of Mr. and Mrs. Giraffe. A "child" is not "a special kind of parent". In reality, you only inherit half your genetic makeup from each parent, and you can inherit real property from any relation, or for that matter, from any non-relation. In programming languages you only "inherit" from related types, and you inherit all their members (*). In reality, everyone has two parents (***), but in programming languages some languages allow inheritance from arbitrarily many "parents", some allow exactly one. In reality, a single, specific person inherits specific property from a single, specific parent, and two different children can have entirely different inheritances from their parent; in programming languages, the "inheritance" relationship does not apply to individual objects, and every child inherits exactly the same thing from the parents. And in reality you only inherit real property when the decedent is &lt;em&gt;dead&lt;/em&gt;! &lt;/p&gt; &lt;p&gt;But wait, it gets worse. The parent-child metaphor is &lt;em&gt;ambiguous&lt;/em&gt; in any language that supports both &lt;em&gt;lexical nesting&lt;/em&gt; and &lt;em&gt;nominal subtyping&lt;/em&gt; of classes:&lt;/p&gt; &lt;p&gt;&lt;span class="code"&gt;class B&amp;lt;T&amp;gt;&lt;br&gt;{&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; class D&amp;lt;U&amp;gt; : B&amp;lt;U&amp;gt; { }&lt;br&gt;}&lt;/span&gt;&lt;/p&gt; &lt;p&gt;Quick, what type is the "parent" of type &lt;span class="code"&gt;B&amp;lt;string&amp;gt;.D&amp;lt;int&amp;gt;&lt;/span&gt;? Is it &lt;span class="code"&gt;B&amp;lt;T&amp;gt;&lt;/span&gt; or &lt;span class="code"&gt;B&amp;lt;string&amp;gt;&lt;/span&gt; or &lt;span class="code"&gt;B&amp;lt;int&amp;gt;&lt;/span&gt;? That type is &lt;em&gt;lexically&lt;/em&gt; inside &lt;span class="code"&gt;B&amp;lt;T&amp;gt;&lt;/span&gt;, &lt;em&gt;logically&lt;/em&gt; inside of &lt;span class="code"&gt;B&amp;lt;string&amp;gt;&lt;/span&gt;, and &lt;em&gt;derived from&lt;/em&gt; &lt;span class="code"&gt;B&amp;lt;int&amp;gt;&lt;/span&gt;; which of those three is its &lt;em&gt;parent&lt;/em&gt;? If you drew a graph showing either lexical or logical containment relationships, it would form a graph that looks every bit as much like a "family tree" as the graph showing inheritance relationships. And lexical containment allows access to all the properties of the container from the contained type, even including not-inherited and normally inaccesible members like private constructors! It is not at all clear that one kind of "parentage" is actually more "parent-like" than any other.&lt;/p&gt; &lt;p&gt;As we've seen before, having multiple different "parent" relationships for a given type can make for &lt;a href="http://blogs.msdn.com/b/ericlippert/archive/2007/07/27/an-inheritance-puzzle-part-one.aspx"&gt;some extremely confusing code&lt;/a&gt;. We have to be extraordinarily careful when writing the specification and the compiler to ensure that we unambiguously describe precisely the relationship we wish to describe. I therefore try hard to avoid "parent-child" metaphors entirely; it is much more clear when writing an example to describe the type relationship as "base type and derived type", rather than "parent type and child type".&lt;/p&gt; &lt;hr&gt;  &lt;p&gt;(*) Excepting constructors and destructors.&lt;/p&gt; &lt;p&gt;(**) I'm going to stick to talking about class-based inheritance here; my criticisms apply equally well to interface-based inheritance but I don't want to open the can of worms that is all the subtle differences between class and interface inheritance. And I've never much liked the inheritance metaphor on interfaces anyways; a "contractual obligation" metaphor is better.&lt;/p&gt; &lt;p&gt;(***) Assuming that we're talking about members of a sexually reproducing species.&lt;/p&gt;&lt;/div&gt;&lt;div style="clear:both;"&gt;&lt;/div&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=10265991" width="1" height="1"&gt;</description></item><item><title>What is "binding" and what makes it late?</title><link>http://blogs.msdn.com/b/ericlippert/archive/2012/02/06/what-is-quot-binding-quot-and-what-makes-it-late.aspx</link><pubDate>Mon, 06 Feb 2012 18:39:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:10253567</guid><dc:creator>Eric Lippert</dc:creator><slash:comments>13</slash:comments><wfw:commentRss xmlns:wfw="http://wellformedweb.org/CommentAPI/">http://blogs.msdn.com/b/ericlippert/rsscomments.aspx?WeblogPostID=10253567</wfw:commentRss><comments>http://blogs.msdn.com/b/ericlippert/archive/2012/02/06/what-is-quot-binding-quot-and-what-makes-it-late.aspx#comments</comments><description>&lt;div class="mine"&gt; &lt;p&gt;"Late binding" is one of those computer-sciency terms that, like "strong typing", means different things to different people. I thought I might describe what the term means to me.&lt;/p&gt; &lt;p&gt;First off, what is "binding"? We can't understand what it means to bind late if we don't know what it is to bind at all.&lt;/p&gt; &lt;p&gt;A compiler is by definition a device which takes in a text written in one language and outputs code that "means the same thing" in another language. The compiler I work on, for example, takes in C# and outputs CIL(*). All the principle tasks performed by a compiler falls into one of three broad buckets:&lt;/p&gt; &lt;ul&gt; &lt;li&gt;Syntactic analysis of the input text&lt;/li&gt; &lt;li&gt;Semantic analysis of the syntax&lt;/li&gt; &lt;li&gt;Generation of the output text -- we will be unconcerned with this step in this article&lt;/li&gt;&lt;/ul&gt; &lt;p&gt;Syntactic analysis is analysis of the input text that does not consider anything about the &lt;em&gt;meaning&lt;/em&gt; of that text; the syntactic analyzer is concerned with first determining the &lt;em&gt;lexical&lt;/em&gt; structure of the program (that is, where all the boundaries are between comments, identifiers, operators, and so on) and then from that lexical structure determining the &lt;em&gt;grammatical&lt;/em&gt; structure of the program: where are the boundaries between classes, methods, statements, expressions, and so on.&lt;/p&gt; &lt;p&gt;Semantic analysis then takes the results of the syntactic analysis and associates meanings with various syntactic elements. For example, when you say:&lt;/p&gt; &lt;p&gt;&lt;span class="code"&gt;class X {}&lt;br&gt;class B {}&lt;br&gt;class D : B&lt;br&gt;{&lt;br&gt;&amp;nbsp; public static void X() { }&lt;br&gt;&amp;nbsp; public static void Y() { X(); }&lt;br&gt;}&lt;/span&gt;&lt;/p&gt; &lt;p&gt;syntactic analysis determines that there are three classes, that one of them contains two methods, the second method contains a statement which is a call expression. Semantic analysis determines that the &lt;span class="code"&gt;X&lt;/span&gt; in &lt;span class="code"&gt;X();&lt;/span&gt; refers to the method &lt;span class="code"&gt;D.X()&lt;/span&gt;, rather than, say, the type &lt;span class="code"&gt;X&lt;/span&gt; declared above. That's an example of "binding" in its most widely-agreed-upon sense: &lt;strong&gt;binding is the association of a syntactic element that names a method with a logical element of the program&lt;/strong&gt;.&lt;/p&gt; &lt;p&gt;"Binding" in the sense of "early" or "late" binding is almost always used to refer to the evaluation of a name used as a method call. That, however, is far too strict a definition of "binding" for me. I would also use "binding" to describe how the compiler's semantic analyzer determines that class &lt;span class="code"&gt;D&lt;/span&gt; inherits from class &lt;span class="code"&gt;B&lt;/span&gt;; the name "B" is bound to the class. &lt;/p&gt; &lt;p&gt;Moreover, I would use "binding" to describe other analyses as well. If you had an expression &lt;span class="code"&gt;1 * 2 + 1.0&lt;/span&gt; in your program then I would say that the "+" operator is "bound" to the built-in operator that takes two doubles, adds them, and returns a third. People do not normally think of that analysis as associating the name "+" with a method, but nevertheless I consider it to be "binding". &lt;/p&gt; &lt;p&gt;Speaking even less carefully, I might also use "binding" to describe the association of types with expressions that do not directly name the type. In the example above if speaking casually I might say that the syntax &lt;span class="code"&gt;1 * 2&lt;/span&gt; is "bound" to the type int, even though obviously it does not name that type. The syntactic expression is strongly associated with that semantic element, even though it does not name it.&lt;/p&gt; &lt;p&gt;So, speaking generally I would say that "&lt;strong&gt;binding" is any association of some fragment of syntax with some logical program element&lt;/strong&gt;. (**)&lt;/p&gt; &lt;p&gt;What then is "late" vs "early" binding? People talk about these two kinds of bindings as though it was a binary choice, either early or late. As we'll see, actually there is more of a spectrum than you might imagine; some bindings are entirely early, some are partially early and partially late, and some are very late indeed. But before we get into that, earlier or later &lt;em&gt;than what&lt;/em&gt;?&lt;/p&gt; &lt;p&gt;Basically by "early binding" we mean "the binding analysis is performed by the compiler and baked in to the generated program"; if the binding fails then the program does not run because the compiler did not get to the code generation phase. By "late binding" we mean "some aspect of the binding will be performed by the runtime" and therefore a binding failure will manifest as a runtime failure. Early and late binding might better be called "static binding" and "dynamic binding"; static binding is binding performed using "static" facts known to the compiler, and dynamic binding is performed using facts "dynamically" known to the runtime. &lt;/p&gt; &lt;p&gt;So which is better? Clearly neither is unambiguously better than the other; if one was clearly the winner then we wouldn't be having this discussion in the first place. The benefit of early binding is that you gain confidence that your program will not fail at runtime; the down side is that you lose the flexibility that comes with late binding. Early binding requires that all information required to make the right binding decision be known before the program runs; but that information might not be available until runtime.&lt;/p&gt; &lt;p&gt;I said that early and late binding falls on a spectrum. Let's take a look at some examples in C# of how we can "turn the dial" on early-to-late binding.&lt;/p&gt; &lt;p&gt;We begin with our example above of calling the static method &lt;span class="code"&gt;X&lt;/span&gt;. This analysis is entirely, unambiguously early. There is no doubt at all that when &lt;span class="code"&gt;Y&lt;/span&gt; is called, it is going to call the method &lt;span class="code"&gt;D.X&lt;/span&gt;. No part of that analysis is deferred until runtime and the call will unambiguously succeed.&lt;/p&gt; &lt;p&gt;Next, consider:&lt;/p&gt; &lt;p&gt;&lt;span class="code"&gt;class B&lt;br&gt;{&lt;br&gt;&amp;nbsp; public void M(double x) {}&lt;br&gt;&amp;nbsp; public void M(int x) {}&lt;br&gt;}&lt;br&gt;class C&lt;br&gt;{&lt;br&gt;&amp;nbsp; public static void X(B b, int d) { b.M(d); }&lt;br&gt;}&lt;/span&gt;&lt;/p&gt; &lt;p&gt;Now we know less. We do a lot of early binding here; we know that &lt;span class="code"&gt;b&lt;/span&gt; is of type &lt;span class="code"&gt;B&lt;/span&gt; and that the call is to &lt;span class="code"&gt;B.M(int)&lt;/span&gt;. But unlike the previous example, we have no guarantee from the compiler that this invocation will succeed. &lt;span class="code"&gt;b&lt;/span&gt; could be null. We are essentially deferring the analysis of whether or not the receiver of the call is valid to the runtime to determine. One normally does not think of that as a "binding" decision though, because we are not &lt;em&gt;associating syntax with a program element&lt;/em&gt;. Let's make the call in &lt;span class="code"&gt;C&lt;/span&gt;; a little more late bound by changing &lt;span class="code"&gt;B&lt;/span&gt;:&lt;/p&gt; &lt;p&gt;&lt;span class="code"&gt;class B&lt;br&gt;{&lt;br&gt;&amp;nbsp; public &lt;font style="background-color: #ffff00"&gt;virtual&lt;/font&gt; void M(double x) {}&lt;br&gt;&amp;nbsp; public &lt;font style="background-color: #ffff00"&gt;virtual&lt;/font&gt; void M(int x) {}&lt;br&gt;}&lt;/span&gt;&lt;/p&gt; &lt;p&gt;We are now doing some binding analysis at compile time; we know that the invocation will be on the virtual method declared by &lt;span class="code"&gt;B.M(int)&lt;/span&gt;. We know that the call will succeed in the sense that there will be such a method to call. However, we do not know which method will actually be called at runtime! There could be a derived type that overrides that method; some completely other code could be called that appears nowhere in this program. &lt;strong&gt;Virtual dispatch is a form of late binding;&lt;/strong&gt; the decision about which method to associate with the syntax &lt;span class="code"&gt;b.M(d)&lt;/span&gt; is partially made by the compiler and partially made by the runtime. &lt;/p&gt; &lt;p&gt;How about this?&lt;/p&gt; &lt;p&gt;&lt;span class="code"&gt;class C&lt;br&gt;{&lt;br&gt;&amp;nbsp; public static void X(B b, &lt;font style="background-color: #ffff00"&gt;dynamic&lt;/font&gt; d) { b.M(d); }&lt;br&gt;}&lt;/span&gt;&lt;/p&gt; &lt;p&gt;Now the binding is almost entirely deferred until runtime. The compiler generates code that tells the Dynamic Language Runtime that static analysis has determined that the static type of &lt;span class="code"&gt;b&lt;/span&gt; is &lt;span class="code"&gt;B&lt;/span&gt; and that the method being invoked is called &lt;span class="code"&gt;M&lt;/span&gt;, but the actual overload resolution to determine whether it is &lt;span class="code"&gt;B.M(int)&lt;/span&gt; or &lt;span class="code"&gt;B.M(double)&lt;/span&gt; (or neither, if d is, say, a string) is done by the runtime based on that information. (***)&lt;/p&gt; &lt;p&gt;&lt;span class="code"&gt;class C&lt;br&gt;{&lt;br&gt;&amp;nbsp; public static void X(&lt;font style="background-color: #ffff00"&gt;dynamic&lt;/font&gt; b, dynamic d) { b.M(d); }&lt;br&gt;}&lt;/span&gt;&lt;/p&gt; &lt;p&gt;Now the only portion of the binding performed early is the fact that this is a method call on a method named &lt;span class="code"&gt;M&lt;/span&gt;. This is almost as late-bound as it gets, but we can in fact go one step further:&lt;/p&gt; &lt;p&gt;&lt;span class="code"&gt;class C&lt;br&gt;{&lt;br&gt;&amp;nbsp; public static void X(object b, object[] d, string m, BindingFlags f) &lt;br&gt;&amp;nbsp; { &lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; b.GetType().GetMethod(m, f).Invoke(b, d); &lt;br&gt;&amp;nbsp; }&lt;br&gt;}&lt;/span&gt;&lt;/p&gt; &lt;p&gt;Now everything is late-bound; we don't even know &lt;em&gt;what name&lt;/em&gt; we are going to be associating with a method. All we can possibly know is that the author of &lt;span class="code"&gt;X&lt;/span&gt; expects that the object passed as &lt;span class="code"&gt;b&lt;/span&gt; has a single method named by the string in &lt;span class="code"&gt;m&lt;/span&gt; that match the flags in &lt;span class="code"&gt;f&lt;/span&gt;, and that it takes the arguments given in &lt;span class="code"&gt;d&lt;/span&gt;. There is nothing whatsoever we can do to analyze this call site at compile time. (****)&lt;/p&gt; &lt;hr&gt;  &lt;p&gt;(*) Of course the output is encoded into a machine-readable binary format, rather than the human-readable CIL format. &lt;/p&gt; &lt;p&gt;(**) You might then ask whether "binding" and "semantic analysis" are not synonyms; surely semantic analysis is nothing more than the association of syntactic elements with their meanings! Binding is much of the semantic analysis phase of the compiler, but there are many analyses that we must perform after a method body is entirely "bound". Definite assignment analysis, for example, cannot by any reasonable stretch be called "binding"; it is not associating syntactic fragments with specific program elements. Rather, it is associating lexical &lt;em&gt;locations&lt;/em&gt; with facts about program elements, facts like "the local variable blah is not definitely assigned at the beginning of this block". Similarly, optimizing arithmetic expressions is a form of semantic analysis but is clearly not "binding".&lt;/p&gt; &lt;p&gt;(***) The compiler could still do lots of static analysis here. For example, suppose &lt;span class="code"&gt;B&lt;/span&gt; were a sealed class with no methods named &lt;span class="code"&gt;M&lt;/span&gt;. Even with a dynamic argument to the call we could know statically that the runtime binding of &lt;span class="code"&gt;M&lt;/span&gt; will always fail, and tell you that at compile time. And in fact the compiler does some such analyses; precisely how it does so is a good topic for another day.&lt;/p&gt; &lt;p&gt;(****) In some sense this example gives a counterexample to my definition of binding; we are no longer even associating a &lt;em&gt;syntactic element&lt;/em&gt; with a method; we're associating the contents of a string with a method.&lt;/p&gt;&lt;/div&gt;&lt;div style="clear:both;"&gt;&lt;/div&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=10253567" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/b/ericlippert/archive/tags/C_2300_/">C#</category><category domain="http://blogs.msdn.com/b/ericlippert/archive/tags/virtual+dispatch/">virtual dispatch</category></item><item><title>What's the difference? Trenchcoat vs Duster</title><link>http://blogs.msdn.com/b/ericlippert/archive/2012/02/03/what-s-the-difference-trenchcoat-vs-duster.aspx</link><pubDate>Fri, 03 Feb 2012 14:52:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:10260670</guid><dc:creator>Eric Lippert</dc:creator><slash:comments>17</slash:comments><wfw:commentRss xmlns:wfw="http://wellformedweb.org/CommentAPI/">http://blogs.msdn.com/b/ericlippert/rsscomments.aspx?WeblogPostID=10260670</wfw:commentRss><comments>http://blogs.msdn.com/b/ericlippert/archive/2012/02/03/what-s-the-difference-trenchcoat-vs-duster.aspx#comments</comments><description>&lt;div class="mine"&gt; &lt;p&gt;Today, yet another episode in my ongoing series "&lt;a href="http://blogs.msdn.com/b/ericlippert/archive/tags/what_2700_s+the+difference_3f00_/"&gt;What's the difference?&lt;/a&gt;" This time, a &lt;a href="http://blogs.msdn.com/b/ericlippert/archive/tags/non_2d00_computer/"&gt;non-computer-related topic&lt;/a&gt;.&lt;/p&gt; &lt;p&gt;I am often complimented on my choice of outerwear in the Seattle rainy season, and I hate to respond to a well-meant compliment with a correction. So I usually let all those "Nice trenchcoat!" comments slide and just say "Thanks!" But as a public service, let me lay it out for you so that you don't make the same mistake. Here we see David Tennant as the Tenth Doctor wearing a classic example of a trenchcoat: (Click for a larger version.)&lt;/p&gt; &lt;p&gt;&lt;a href="http://blogs.msdn.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-29-89-metablogapi/0827.TenthDoctor_5F00_2.jpg"&gt;&lt;img style="background-image: none; border-right-width: 0px; padding-left: 0px; padding-right: 0px; display: block; float: none; border-top-width: 0px; border-bottom-width: 0px; margin-left: auto; border-left-width: 0px; margin-right: auto; padding-top: 0px" title="TenthDoctor" border="0" alt="TenthDoctor" src="http://blogs.msdn.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-29-89-metablogapi/5621.TenthDoctor_5F00_thumb.jpg" width="176" height="244"&gt;&lt;/a&gt;&lt;/p&gt; &lt;p&gt;The trenchcoat is a long waterproof coat, traditionally made of &lt;strong&gt;gabardine&lt;/strong&gt;. The term originated in the trenches of the First World War, due to the popularity of this style of coat amongst officers in the British armed forces. The trench coat is not merely a functional warm raincoat but also stylish, with long wide lapels and decorative buttons. The trenchcoat is often belted and might be tailored in at the waist, particularly for women's trenchcoats.&lt;/p&gt; &lt;p&gt;A duster is also a long waterproof coat that is often referred to as a "trenchcoat" -- but as you'll see, it is quite different in its details. Here's the duster I wear, an Australian-made Driza-Bone:&lt;/p&gt; &lt;p&gt;&lt;a href="http://blogs.msdn.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-29-89-metablogapi/2072.DrizaBone_5F00_2.jpg"&gt;&lt;img style="background-image: none; border-right-width: 0px; padding-left: 0px; padding-right: 0px; display: block; float: none; border-top-width: 0px; border-bottom-width: 0px; margin-left: auto; border-left-width: 0px; margin-right: auto; padding-top: 0px" title="DrizaBone" border="0" alt="DrizaBone" src="http://blogs.msdn.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-29-89-metablogapi/1004.DrizaBone_5F00_thumb.jpg" width="154" height="240"&gt;&lt;/a&gt;&lt;/p&gt; &lt;p&gt;&amp;nbsp;&lt;/p&gt; &lt;p&gt;Note the lack of decorative elements, the flap over the closure, the no-lapel collar (which clasps shut, completely enclosing the neck if necessary) and the built-in extra rain protection on the shoulders. (*) Dusters are typically made of &lt;strong&gt;oilcloth &lt;/strong&gt;and are built for handling the practicalities of herding sheep in the rain, not for style (**). &lt;/p&gt; &lt;p&gt;Not shown in this view: the interior includes straps that let you attach the bottom of the coat to your legs, so that it does not blow around when you are on horseback. Also, the back is cut in such a way that you can cover both your legs and the rear portion of the saddle with the coat. I usually take the bus and not a horse to work, but still it's nice to know that options are available should I need them. These practical elements are usually not present in trenchcoats.&lt;/p&gt; &lt;p&gt;Right, glad we got that sorted out. &lt;strong&gt;Next time&lt;/strong&gt;: What is &lt;em&gt;binding&lt;/em&gt;, and why is it always either &lt;em&gt;early&lt;/em&gt; or &lt;em&gt;late&lt;/em&gt;? Can't it ever be on time?&lt;/p&gt; &lt;hr&gt;  &lt;p&gt;(*) Duster manufacturers always hasten to point out that the shoulders are already &lt;em&gt;waterproof&lt;/em&gt;; the extra layer keeps your shoulders &lt;em&gt;warmer&lt;/em&gt; by shedding rain more effectively.&lt;/p&gt; &lt;p&gt;(**) There are, of course, some dusters built for style; if you watch the "Matrix" series of movies you'll see the heroes wear an assortment of extremely stylish dusters and trenchcoats both.&lt;/p&gt;&lt;/div&gt;&lt;div style="clear:both;"&gt;&lt;/div&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=10260670" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/b/ericlippert/archive/tags/Non_2D00_computer/">Non-computer</category><category domain="http://blogs.msdn.com/b/ericlippert/archive/tags/What_2700_s+The+Difference_3F00_/">What's The Difference?</category></item><item><title>Anonymous Types Unify Within An Assembly, Part Two</title><link>http://blogs.msdn.com/b/ericlippert/archive/2012/01/30/anonymous-types-unify-within-an-assembly-part-two.aspx</link><pubDate>Mon, 30 Jan 2012 16:20:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:10253507</guid><dc:creator>Eric Lippert</dc:creator><slash:comments>14</slash:comments><wfw:commentRss xmlns:wfw="http://wellformedweb.org/CommentAPI/">http://blogs.msdn.com/b/ericlippert/rsscomments.aspx?WeblogPostID=10253507</wfw:commentRss><comments>http://blogs.msdn.com/b/ericlippert/archive/2012/01/30/anonymous-types-unify-within-an-assembly-part-two.aspx#comments</comments><description>&lt;div class="mine"&gt; &lt;p&gt;Last time I noted that any two usages of "the same" anonymous type within an assembly actually unify to be the same type. By "the same" we mean that the two anonymous types have the same property names and types, and that they appear in the same order. &lt;span class="code"&gt;new {X = 1, Y = 2 }&lt;/span&gt; and &lt;span class="code"&gt;new { Y = 2, X = 1 }&lt;/span&gt; do not unify to a single type. &lt;/p&gt; &lt;p&gt;Why is that? You'd think that we could make these unify.&lt;/p&gt; &lt;p&gt;The trouble is that doing so causes more problems than it mitigates. An anonymous type gives you a convenient place to store a small immutable set of name/value pairs, but it gives you more than that. It also gives you an implementation of Equals, GetHashCode and, most germane to this discussion, ToString. (*)&lt;/p&gt; &lt;p&gt;Imagine, for instance, that you have written a bunch of LINQ queries in your code that extract data from a table using an anonymous type. As part of your unit testing, you dump the results of the query as a string out to a file and compare it against a known baseline. Maybe you have hundreds of such tests. And then one day, someone in a completely different part of the code happens to write a LINQ query that has "the same" anonymous type, but with the properties in a different order. We have to pick some order for the properties to be written out by "ToString", and there is no telling which one we'd pick if forced to choose. It seems very strange that using an anonymous type in one part of the program would cause tests to fail in a completely different part of the program.&lt;/p&gt; &lt;p&gt;Well then, you might say, you could solve this problem by canonicalizing the implementation of ToString. Always write out the properties in alphabetical order, say. But that is hardly an attractive solution. First off, whose alphabetical order? There are dozens of different alphabetical orders, depending on what location in the world you are in. Should we choose the alphabetical order of the developer? The current user? The "culture neutral" order? Assuming we could solve that problem satisfactorily, we'd still be disappointing most users. Developers have a reasonable expectation that ToString will give them the properties in the order they appear in the source code.&lt;/p&gt; &lt;p&gt;Another option would be to not implement ToString for you at all. That is to say, remove a useful and relatively commonly used feature (dumping data to a string for testing or debugging) in order to effectively implement a less useful, rarely used feature (unification of types). &lt;/p&gt; &lt;hr&gt;  &lt;p&gt;(*) We give you Equals and GetHashCode so that you can use instances of anonymous types in LINQ queries as keys upon which to perform joins. LINQ to Objects implements joins using a hash table for performance reasons, and therefore we need correct implementations of Equals and GetHashCode.&lt;/p&gt;&lt;/div&gt;&lt;div style="clear:both;"&gt;&lt;/div&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=10253507" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/b/ericlippert/archive/tags/C_2300_/">C#</category><category domain="http://blogs.msdn.com/b/ericlippert/archive/tags/anonymous+types/">anonymous types</category></item><item><title>Anonymous types unify within an assembly, Part One</title><link>http://blogs.msdn.com/b/ericlippert/archive/2012/01/23/anonymous-types-unify-within-an-assembly.aspx</link><pubDate>Mon, 23 Jan 2012 15:55:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:10253273</guid><dc:creator>Eric Lippert</dc:creator><slash:comments>14</slash:comments><wfw:commentRss xmlns:wfw="http://wellformedweb.org/CommentAPI/">http://blogs.msdn.com/b/ericlippert/rsscomments.aspx?WeblogPostID=10253273</wfw:commentRss><comments>http://blogs.msdn.com/b/ericlippert/archive/2012/01/23/anonymous-types-unify-within-an-assembly.aspx#comments</comments><description>&lt;div class="mine"&gt; &lt;p&gt;Back &lt;a href="http://blogs.msdn.com/b/ericlippert/archive/2010/12/20/why-are-anonymous-types-generic.aspx"&gt;in my last post of 2010 I said that I would do an example of anonymous types unifying within an assembly&lt;/a&gt; "in the new year". I meant 2011, but here we are "in the new year" again, so, no time like the present.&lt;/p&gt; &lt;p&gt;The C# specification guarantees you that when you use "the same" anonymous type in two places in one assembly (*) the types unify into one type. In order to be "the same", the two anonymous types have to have the exact same member names and the exact same member types, in the exact same order. (**)&lt;/p&gt; &lt;p&gt;It is easy to see how this works in a method body; you could simply say:&lt;/p&gt; &lt;p&gt;&lt;span class="code"&gt;var anon1 = new { X = 1 };&lt;br&gt;var anon2 = new { X = 2 };&lt;br&gt;anon2 = anon1; &lt;br&gt;Console.WriteLine(anon2.X); // 1&lt;/span&gt;&lt;/p&gt; &lt;p&gt;And it is easy to see how you could verify this fact with reflection across method bodies within the same assembly:&lt;/p&gt; &lt;p&gt;&lt;span class="code"&gt;class C&lt;br&gt;{&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; public static object Anon1() { return new { X = 1 }; }&lt;br&gt;}&lt;br&gt;class D&lt;br&gt;{&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; static void M()&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; { &lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; // True:&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; Console.WriteLine(C.Anon1().GetType() == (new { X = 2 }).GetType()); &lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; }&lt;br&gt;}&lt;/span&gt;&lt;/p&gt; &lt;p&gt;But... so what? Using reflection in this way seems supremely uninteresting. What would be more interesting is to merge the two examples together:&lt;/p&gt; &lt;p&gt;&lt;span class="code"&gt;class C&lt;br&gt;{&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; public static object Anon1() { return new { X = 1 }; }&lt;br&gt;}&lt;br&gt;class D&lt;br&gt;{&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; static T CastByExample&amp;lt;T&amp;gt;(T example, object obj)&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; {&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; return (T)obj;&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; }&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; static void M()&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; {&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; var anon1 = CastByExample(new { X = 0 }, C.Anon1());&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; var anon2 = new { X = 2 };&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; anon2 = anon1;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; Console.WriteLine(anon1.X); // 1&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; }&lt;br&gt;}&lt;/span&gt;&lt;/p&gt; &lt;p&gt;As you can see, you can share instances of an anonymous type around an assembly if you want to. We do not expect that a lot of people will do so; typically if this is the sort of thing you like doing, you'd use either a tuple or a custom-built nominal type. Still, it is not expensive for us to make this guarantee, and it is kind of a nice property.  &lt;hr&gt;  &lt;p&gt;(*) Technically not exactly true. You could have an assembly made up of many different netmodules compiled at different times that do not know about each other; anonymous types defined in one netmodule will not necessarily unify with those in others.&lt;/p&gt; &lt;p&gt;(**) Next time I'll discuss why we have this latter requirement.&lt;/p&gt;&lt;/div&gt;&lt;div style="clear:both;"&gt;&lt;/div&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=10253273" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/b/ericlippert/archive/tags/C_2300_/">C#</category><category domain="http://blogs.msdn.com/b/ericlippert/archive/tags/anonymous+types/">anonymous types</category></item><item><title>What is the defining characteristic of a local variable?</title><link>http://blogs.msdn.com/b/ericlippert/archive/2012/01/16/what-is-the-defining-characteristic-of-a-local-variable.aspx</link><pubDate>Mon, 16 Jan 2012 14:48:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:10253230</guid><dc:creator>Eric Lippert</dc:creator><slash:comments>11</slash:comments><wfw:commentRss xmlns:wfw="http://wellformedweb.org/CommentAPI/">http://blogs.msdn.com/b/ericlippert/rsscomments.aspx?WeblogPostID=10253230</wfw:commentRss><comments>http://blogs.msdn.com/b/ericlippert/archive/2012/01/16/what-is-the-defining-characteristic-of-a-local-variable.aspx#comments</comments><description>&lt;div class="mine"&gt; &lt;p&gt;If you ask a dozen C# developers what a "local variable" is, you might get a dozen different answers. A common answer is of course that a local is "a storage location on the stack". But that is describing a local in terms of its implementation details; there is nothing in the C# language that requires that locals be stored on a data structure called "the stack", or that there be one stack per thread. (And of course, locals are often stored in registers, and registers are not the stack.) &lt;/p&gt; &lt;p&gt;A less implementation-detail-laden answer might be that a local variable is a variable whose storage location is "allocated from the temporary store". That is, a local variable is a variable whose lifetime is known to be short; the local's lifetime ends when control leaves the code associated with the local's declaration space. &lt;/p&gt; &lt;p&gt;That too, however, is a lie. The C# specification is surprisingly vague about the lifetime of an "ordinary" local variable, noting that its lifetime is only kinda-sorta that length. The jitter's optimizer is permitted broad latitude in its determination of local lifetime; a local can be cleaned up early or late. The specification also notes that the lifetimes of some local variables are necessarily extended beyond the point where control leaves the method body containing the local declaration. Locals declared in an iterator block, for instance, live on even after control has left the iterator block; they might die only when the iterator is itself collected. Locals that are closed-over outer variables of a lambda are the same way; they live at least as long as the delegate that closes over them. And in the upcoming version of C#, locals declared in async blocks will also have extended lifetimes; when the async method returns to its caller upon encountering an "await", the locals live on and are still there when the method resumes. (And since it might not resume on the same thread, in some bizarre situations, the locals had better not be stored on the stack!)&lt;/p&gt; &lt;p&gt;So if locals are not "variables on the stack" and locals are not "short lifetime variables" then what are locals?&lt;/p&gt; &lt;p&gt;The answer is of course staring you in the face. The &lt;strong&gt;defining&lt;/strong&gt; characteristic of a local is that &lt;strong&gt;it can only be accessed by name in the block which declares it&lt;/strong&gt;; it is &lt;strong&gt;local&lt;/strong&gt; to a block. What makes a local truly unique is that it can &lt;em&gt;only&lt;/em&gt; be a private implementation detail of a method body. The name of that local is never of any use to code lexically outside of the method body.&lt;/p&gt;&lt;/div&gt;&lt;div style="clear:both;"&gt;&lt;/div&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=10253230" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/b/ericlippert/archive/tags/C_2300_/">C#</category><category domain="http://blogs.msdn.com/b/ericlippert/archive/tags/Memory+Management/">Memory Management</category><category domain="http://blogs.msdn.com/b/ericlippert/archive/tags/declaration+spaces/">declaration spaces</category><category domain="http://blogs.msdn.com/b/ericlippert/archive/tags/scope/">scope</category><category domain="http://blogs.msdn.com/b/ericlippert/archive/tags/local+variables/">local variables</category></item><item><title>Every public change is a breaking change</title><link>http://blogs.msdn.com/b/ericlippert/archive/2012/01/09/every-public-change-is-a-breaking-change.aspx</link><pubDate>Mon, 09 Jan 2012 17:08:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:10250547</guid><dc:creator>Eric Lippert</dc:creator><slash:comments>17</slash:comments><wfw:commentRss xmlns:wfw="http://wellformedweb.org/CommentAPI/">http://blogs.msdn.com/b/ericlippert/rsscomments.aspx?WeblogPostID=10250547</wfw:commentRss><comments>http://blogs.msdn.com/b/ericlippert/archive/2012/01/09/every-public-change-is-a-breaking-change.aspx#comments</comments><description>&lt;div class="mine"&gt; &lt;p&gt;Here's an inconvenient truth: just about every "public surface area" change you make to your code is a potential breaking change. &lt;/p&gt; &lt;p&gt;First off, I should clarify what I mean by a "breaking change" for the purposes of this article. If you provide a component to a third party, then a "breaking change" is a change such that the third party's code compiled correctly with the previous version, but the change causes a recompilation to fail. (A more strict definition would be that a breaking change is one where the code recompiles successfully but has a different meaning; for today let's just consider actual "build breaks".) A "potential" breaking change is a change which might cause a break, if the third party happens to have consumed your component in a particular way. By a "public surface area" change, I mean a change to the "public metadata" surface of a component, like adding a new method, rather than changing the behaviour of an existing method by editing its body. (Such a change would typically cause a difference in runtime behaviour, rather than a build break.)&lt;/p&gt; &lt;p&gt;Some public surface area breaking changes are obvious: making a public method into a private method, sealing an unsealed class, and so on. Third-party code that called the method, or extended the class, will break. But a lot of changes seem a lot safer; adding a new public method, for example, or making a read-only property into a read-write property. As it turns out, almost any change you make to the public surface area of a component is a potential breaking change. Let's look at some examples. Suppose you add a new overload:&lt;/p&gt; &lt;p&gt;&lt;span class="code"&gt;// old component code:&lt;br&gt;public interface IFoo {...}&lt;br&gt;public interface IBar { ... }&lt;br&gt;public class Component&lt;br&gt;{&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; public void M(IFoo x) {...}&lt;br&gt;}&lt;/span&gt;&lt;/p&gt; &lt;p&gt;Suppose you then later add&lt;/p&gt; &lt;p&gt;&lt;span class="code"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; public void M(IBar x) {...}&lt;/span&gt;&lt;/p&gt; &lt;p&gt;to Component. Suppose the consumer code is:&lt;/p&gt; &lt;p&gt;&lt;span class="code"&gt;// consumer code:&lt;br&gt;class Consumer : IFoo, IBar&lt;br&gt;{&lt;br&gt;&amp;nbsp;&amp;nbsp; ...&lt;br&gt;&amp;nbsp;&amp;nbsp; component.M(this);&lt;br&gt;&amp;nbsp;&amp;nbsp; ...&lt;br&gt;}&lt;/span&gt;&lt;/p&gt; &lt;p&gt;The consumer code compiles successfully against the original component, but recompiling it with the new component suddenly the build breaks with an overload resolution ambiguity error. Oops.&lt;/p&gt; &lt;p&gt;What about adding an entirely new method? &lt;/p&gt; &lt;p&gt;&lt;span class="code"&gt;// old component code:&lt;br&gt;...&lt;br&gt;public class Component&lt;br&gt;{&lt;br&gt;&amp;nbsp; public void MFoo(IFoo x) {...}&lt;br&gt;}&lt;/span&gt;&lt;/p&gt; &lt;p&gt;and now you add &lt;/p&gt; &lt;p&gt;&lt;span class="code"&gt;public void MBar(IBar x) {...}&lt;/span&gt;&lt;/p&gt; &lt;p&gt;No problem now, right? The consumer could not possibly have been consuming MBar. Surely adding it could not be a build break on the consumer, right?&lt;/p&gt; &lt;p&gt;&lt;span class="code"&gt;class Consumer&lt;br&gt;{&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; class Blah&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; {&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; public void MBar(IBar x) {}&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; }&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; static void N(Action&amp;lt;Blah&amp;gt; a) {}&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; static void N(Action&amp;lt;Component&amp;gt; a) {}&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; static void D(IBar bar)&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; {&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; N(x=&amp;gt;{ x.MBar(bar); });&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; }&lt;br&gt;}&lt;/span&gt;&lt;/p&gt; &lt;p&gt;Oh, the pain. In the original version, overload resolution has two overloads of N to choose from. The lambda is not convertible to &lt;span class="code"&gt;Action&amp;lt;Component&amp;gt;&lt;/span&gt; because typing formal parameter x as &lt;span class="code"&gt;Component&lt;/span&gt; causes the body of the lambda to have an error. That overload is therefore discarded. The remaining overload is the sole applicable candidate; its body binds without error with x typed as &lt;span class="code"&gt;Blah&lt;/span&gt;.&lt;/p&gt; &lt;p&gt;In the new version of &lt;span class="code"&gt;Component&lt;/span&gt; the body of the lambda does not have an error; therefore overload resolution has two candidates to choose from and neither is better than the other; this produces an ambiguity error.&lt;/p&gt; &lt;p&gt;This particular "flavour" of breaking change is an odd one in that it makes almost every possible change to the surface area of a type into a potential breaking change, while at the same time being such an obviously contrived and unlikely scenario that no "real world" developers are likely to run into it. When we are evaluating the impact of potential breaking changes on our customers, we now explicitly discount this flavour of breaking change as so unlikely as to be unimportant. Still, I think its important to make that decision with eyes open, rather than being unaware of the problem.&lt;/p&gt;&lt;/div&gt;&lt;div style="clear:both;"&gt;&lt;/div&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=10250547" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/b/ericlippert/archive/tags/C_2300_/">C#</category><category domain="http://blogs.msdn.com/b/ericlippert/archive/tags/Breaking+Changes/">Breaking Changes</category></item><item><title>He's So Dreamy</title><link>http://blogs.msdn.com/b/ericlippert/archive/2012/01/02/he-s-so-dreamy.aspx</link><pubDate>Mon, 02 Jan 2012 17:23:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:10252460</guid><dc:creator>Eric Lippert</dc:creator><slash:comments>5</slash:comments><wfw:commentRss xmlns:wfw="http://wellformedweb.org/CommentAPI/">http://blogs.msdn.com/b/ericlippert/rsscomments.aspx?WeblogPostID=10252460</wfw:commentRss><comments>http://blogs.msdn.com/b/ericlippert/archive/2012/01/02/he-s-so-dreamy.aspx#comments</comments><description>&lt;div class="mine"&gt; &lt;p&gt;Happy New Year all!&lt;/p&gt; &lt;p&gt;It has just been brought to my attention that this blog and the &lt;a href="http://programmerryangosling.tumblr.com"&gt;Programmer Ryan Gosling photo blog&lt;/a&gt; share at least one reader:&lt;/p&gt; &lt;p&gt;&lt;a href="http://programmerryangosling.tumblr.com/post/14709857038"&gt;&lt;img style="background-image: none; border-right-width: 0px; padding-left: 0px; padding-right: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px; padding-top: 0px" title="RyanGosling1" border="0" alt="RyanGosling1" src="http://blogs.msdn.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-29-89-metablogapi/8468.RyanGosling1_5F00_3.jpg" width="504" height="504"&gt;&lt;/a&gt;&lt;/p&gt; &lt;p&gt;I admit it, I LOL'd.&lt;/p&gt; &lt;p&gt;In the interests of total accuracy I'd like to point out that &lt;a href="http://programmerryangosling.tumblr.com/post/14706660907"&gt;the first entry on the blog contains a subtle error&lt;/a&gt;:&lt;/p&gt; &lt;p&gt;&lt;a href="http://programmerryangosling.tumblr.com/post/14706660907"&gt;&lt;img style="background-image: none; border-bottom: 0px; border-left: 0px; padding-left: 0px; padding-right: 0px; display: inline; border-top: 0px; border-right: 0px; padding-top: 0px" title="RyanGosling3" border="0" alt="RyanGosling3" src="http://blogs.msdn.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-29-89-metablogapi/7418.RyanGosling3_5F00_3.jpg" width="644" height="484"&gt;&lt;/a&gt;&lt;/p&gt; &lt;p&gt;.NET actually supported generic covariance and contravariance on interface and delegate types from the beginning; generics were introduced in version 2, and &lt;strong&gt;they always allowed variance&lt;/strong&gt;. You could write programs in MSIL and compile them with ILDASM and have generic variance, no problem. No "mainstream" language supported the feature until v4, and none of the interfaces or delegates in the class libraries were marked as variant, so effectively variance did not become a well-used feature until v4, but the capability was always there.&lt;/p&gt; &lt;p&gt;As a response to the unknown reader to submitted the first photo above, I give you the following:&lt;/p&gt; &lt;p&gt;&lt;a href="http://blogs.msdn.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-29-89-metablogapi/5621.RyanGosling2_5F00_4.jpg"&gt;&lt;img style="background-image: none; border-bottom: 0px; border-left: 0px; padding-left: 0px; padding-right: 0px; display: inline; border-top: 0px; border-right: 0px; padding-top: 0px" title="kinopoisk.ru" border="0" alt="kinopoisk.ru" src="http://blogs.msdn.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-29-89-metablogapi/6675.RyanGosling2_5F00_thumb_5F00_1.jpg" width="644" height="484"&gt;&lt;/a&gt;&lt;/p&gt; &lt;p&gt;&lt;strong&gt;Next time:&lt;/strong&gt; some depressing news about breaking changes: they're everywhere!&lt;/p&gt;&lt;/div&gt;&lt;div style="clear:both;"&gt;&lt;/div&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=10252460" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/b/ericlippert/archive/tags/Covariance+and+Contravariance/">Covariance and Contravariance</category><category domain="http://blogs.msdn.com/b/ericlippert/archive/tags/bad+jokes/">bad jokes</category></item><item><title>Shadowcasting in C#, Part Six</title><link>http://blogs.msdn.com/b/ericlippert/archive/2011/12/29/shadowcasting-in-c-part-six.aspx</link><pubDate>Thu, 29 Dec 2011 15:05:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:10251276</guid><dc:creator>Eric Lippert</dc:creator><slash:comments>6</slash:comments><wfw:commentRss xmlns:wfw="http://wellformedweb.org/CommentAPI/">http://blogs.msdn.com/b/ericlippert/rsscomments.aspx?WeblogPostID=10251276</wfw:commentRss><comments>http://blogs.msdn.com/b/ericlippert/archive/2011/12/29/shadowcasting-in-c-part-six.aspx#comments</comments><description>&lt;div class="mine"&gt; &lt;p&gt;OK, let's finish up this year and this series. We have an algorithm that can compute what cells in the zero octant are in view to a viewer at the origin when given a function that determines whether a given cell is opaque or transparent. It marks the visible points by calling an action with the visible cells. We would like that to work in any octant, and for the viewer at any point, not just the origin.&lt;/p&gt; &lt;p&gt;We can solve the "viewer at any point" problem by imposing a &lt;strong&gt;coordinate transformation&lt;/strong&gt; on the "what is opaque?" function and the "cell is visible" action. Suppose the algorithm wishes to know whether the cell (3, 1) is opaque. And suppose the viewer is not at the origin, but rather at (5, 6). The algorithm is actually asking whether the cell at (3 + 5, 6 + 1) is opaque. Similarly, if that cell is determined to be visible then the transformed cell is the one that is visible from (5, 6). We can easily transform one delegate into another:&lt;/p&gt; &lt;p&gt;&lt;span class="code"&gt;private static Func&amp;lt;int, int, T&amp;gt; TranslateOrigin&amp;lt;T&amp;gt;(Func&amp;lt;int, int, T&amp;gt; f, int x, int y)&lt;br&gt;{&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; return (a, b) =&amp;gt; f(a + x, b + y);&lt;br&gt;}&lt;/span&gt;&lt;/p&gt; &lt;p&gt;&lt;span class="code"&gt;private static Action&amp;lt;int, int&amp;gt; TranslateOrigin(Action&amp;lt;int, int&amp;gt; f, int x, int y)&lt;br&gt;{&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; return (a, b) =&amp;gt; f(a + x, b + y);&lt;br&gt;}&lt;/span&gt;&lt;/p&gt; &lt;p&gt;Similarly we can perform an octant transformation; if you want to map a point in octant one into a point in octant zero, just swap its (x,y) coordinates! Every octant has a simple transformation that reflects it into octant zero:&lt;/p&gt; &lt;p&gt;&lt;span class="code"&gt;private static Func&amp;lt;int, int, T&amp;gt; TranslateOctant&amp;lt;T&amp;gt;(Func&amp;lt;int, int, T&amp;gt; f, int octant)&lt;br&gt;{&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; switch (octant)&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; {&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; default: return f;&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; case 1: return (x, y) =&amp;gt; f(y, x);&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; case 2: return (x, y) =&amp;gt; f(-y, x);&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; case 3: return (x, y) =&amp;gt; f(-x, y);&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; case 4: return (x, y) =&amp;gt; f(-x, -y);&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; case 5: return (x, y) =&amp;gt; f(-y, -x);&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; case 6: return (x, y) =&amp;gt; f(y, -x);&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; case 7: return (x, y) =&amp;gt; f(x, -y);&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; }&lt;br&gt;}&lt;/span&gt;&lt;/p&gt; &lt;p&gt;(And similarly for actions.)&lt;/p&gt; &lt;p&gt;Now that we have these simple transformation functions we can finally implement the code that calls our octant-zero-field-of-view algorithm:&lt;/p&gt; &lt;p&gt;&lt;span class="code"&gt;public static void ComputeFieldOfViewWithShadowCasting(&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; int x, int y, int radius,&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; Func&amp;lt;int, int, bool&amp;gt; isOpaque,&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; Action&amp;lt;int, int&amp;gt; setFoV)&lt;br&gt;{&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; Func&amp;lt;int, int, bool&amp;gt; opaque = TranslateOrigin(isOpaque, x, y);&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; Action&amp;lt;int, int&amp;gt; fov = TranslateOrigin(setFoV, x, y);&lt;/span&gt;&lt;/p&gt; &lt;p&gt;&lt;span class="code"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; for (int octant = 0; octant &amp;lt; 8; ++octant)&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; {&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; ComputeFieldOfViewInOctantZero(&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; TranslateOctant(opaque, octant),&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; TranslateOctant(fov, octant),&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; radius);&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; }&lt;br&gt;}&lt;br&gt;&lt;/span&gt;&lt;/p&gt; &lt;p&gt;Pretty slick, eh?&lt;/p&gt; &lt;p&gt;One minor downside of this algorithm is that it computes the visibility of the points along the axes and major diagonals twice; however, the number of such cells grows worst case linearly with radius. The algorithm as a whole is worst case quadradic in the radius, so the extra linear cost is probably irrelevant.&lt;/p&gt; &lt;p&gt;I've posted &lt;a href="http://blogs.msdn.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-29-89/1777.SilverlightShadowCasting.zip"&gt;the project that builds the Silverlight control from the first episode here&lt;/a&gt;, if you want to get a complete working example.&lt;/p&gt; &lt;p&gt;Happy New Year everyone and we'll see you in 2012 for more Fabulous Adventures in Coding!&lt;/p&gt;&lt;/div&gt;&lt;div style="clear:both;"&gt;&lt;/div&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=10251276" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/b/ericlippert/archive/tags/shadowcasting/">shadowcasting</category></item><item><title>Shadowcasting in C#, Part Five</title><link>http://blogs.msdn.com/b/ericlippert/archive/2011/12/27/shadowcasting-in-c-part-five.aspx</link><pubDate>Tue, 27 Dec 2011 18:05:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:10245642</guid><dc:creator>Eric Lippert</dc:creator><slash:comments>3</slash:comments><wfw:commentRss xmlns:wfw="http://wellformedweb.org/CommentAPI/">http://blogs.msdn.com/b/ericlippert/rsscomments.aspx?WeblogPostID=10245642</wfw:commentRss><comments>http://blogs.msdn.com/b/ericlippert/archive/2011/12/27/shadowcasting-in-c-part-five.aspx#comments</comments><description>&lt;div class="mine"&gt; &lt;p&gt;I hope you all had a pleasant Christmas and Boxing Day; we chose to not travel to see family this year and had a delightful time visiting friends. We'll finish up 2011 here with a bit more on shadowcasting, and then pick up with more C# language design facts and opinions in January.&lt;/p&gt; &lt;hr&gt;  &lt;p&gt;OK, so we've found the top and bottom cells in a particular column portion, bounded by a top and bottom vector. Now we have two tasks. First, all cells in that portion that are in the radius need to be marked as visible. Second, we must figure out what portions of the next column are visible through this portion, and enqueue that work.&lt;/p&gt; &lt;p&gt;The first bit is easily done. Thanks to the great comments to my earlier entry I've learned that of course it is better to consider the question "is the bottom left corner of the cell within the visible radius?" That makes for a nicer looking circle. I've made a helper method "IsInRadius" that does that. (Note that of course this introduces yet another kind of artifact; suppose that of the four corners of a cell, only the bottom right corner of a cell is below the top vector, but only the bottom left corner is within the radius. There might be no point in the cell that is both within the radius and not blocked. We'll ignore this detail; the radius is inherently approximate.)&lt;/p&gt; &lt;p&gt;The second bit is harder. What we'll do is keep track as we move from top to bottom of this portion whether we have seen any transitions from transparent to opaque or opaque to transparent. If we find a transition from transparent to opaque then we have found a boundary for the next column portion; we'll enqueue that new portion. If we find a transition from opaque to transparent then we'll enqueue new work for the new region when either we find an opaque cell or the bottom of the portion.&lt;/p&gt; &lt;p&gt;&lt;span class="code"&gt;bool? wasLastCellOpaque = null;&lt;br&gt;for (int y = topY; y &amp;gt;= bottomY; --y)&lt;br&gt;{&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; bool inRadius = IsInRadius(x, y, radius);&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; if (inRadius)&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; {&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; // The current cell is in the field of view.&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; setFieldOfView(x, y);&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; }&lt;/span&gt;&lt;/p&gt; &lt;p&gt;A cell that was too far away to be seen is effectively an opaque cell; nothing "above" it is going to be visible in the next column, so we might as well treat it as an opaque cell and not scan the cells that are also too far away in the next column.&lt;/p&gt; &lt;p&gt;&lt;span class="code"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; bool currentIsOpaque = !inRadius || isOpaque(x, y);&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; if (wasLastCellOpaque != null)&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; {&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; if (currentIsOpaque)&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; {&lt;/span&gt;&lt;/p&gt; &lt;p&gt;We've found a boundary from transparent to opaque. Enqueue more work for later.&lt;/p&gt; &lt;p&gt;&lt;span class="code"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; if (!wasLastCellOpaque.Value)&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; {&lt;/span&gt;&lt;/p&gt; &lt;p&gt;The new bottom vector touches the upper left corner of opaque cell that is below the transparent cell. &lt;/p&gt; &lt;p&gt;&lt;span class="code"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; queue.Enqueue(new ColumnPortion(&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; x + 1,&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; new DirectionVector(x * 2 - 1, y * 2 + 1),&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; topVector));&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; }&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; }&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; else if (wasLastCellOpaque.Value)&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; {&lt;/span&gt;&lt;/p&gt; &lt;p&gt;We've found a boundary from opaque to transparent. Adjust the top vector so that when we find the next boundary or do the bottom cell, we have the right top vector. The new top vector touches the lower right corner of the opaque cell that is above the transparent cell, which is the upper right corner of the current transparent cell.&lt;/p&gt; &lt;p&gt;I normally don't modify a formal parameter like this, but it seems reasonably unconfusing in this context.&lt;/p&gt; &lt;p&gt;&lt;span class="code"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; topVector = new DirectionVector(x * 2 + 1, y * 2 + 1);&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; }&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; }&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; wasLastCellOpaque = currentIsOpaque;&lt;br&gt;}&lt;/span&gt;&lt;/p&gt; &lt;p&gt;And finally, we enqueue work for the lowest opaque-to-transparent transition, if there is one. &lt;/p&gt; &lt;p&gt;&lt;span class="code"&gt;if (wasLastCellOpaque != null &amp;amp;&amp;amp; !wasLastCellOpaque.Value)&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; queue.Enqueue(new ColumnPortion(x + 1, bottomVector, topVector));&lt;br&gt;&lt;/span&gt;&lt;/p&gt; &lt;p&gt;And there you go; that's shadowcasting from the origin in the first octant. Next time we'll deal with the other seven octants, and deal with points other than the origin.&lt;/p&gt;&lt;/div&gt;&lt;div style="clear:both;"&gt;&lt;/div&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=10245642" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/b/ericlippert/archive/tags/shadowcasting/">shadowcasting</category></item><item><title>Shadowcasting in C#, Part Four</title><link>http://blogs.msdn.com/b/ericlippert/archive/2011/12/22/shadowcasting-in-c-part-four.aspx</link><pubDate>Thu, 22 Dec 2011 17:18:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:10247276</guid><dc:creator>Eric Lippert</dc:creator><slash:comments>6</slash:comments><wfw:commentRss xmlns:wfw="http://wellformedweb.org/CommentAPI/">http://blogs.msdn.com/b/ericlippert/rsscomments.aspx?WeblogPostID=10247276</wfw:commentRss><comments>http://blogs.msdn.com/b/ericlippert/archive/2011/12/22/shadowcasting-in-c-part-four.aspx#comments</comments><description>&lt;div class="mine"&gt; &lt;p&gt;Last time we saw how many different ways there were to get the calculation of the top cell based on the top vector wrong. Today we'll take a briefer look at determining the bottom cell.&lt;/p&gt; &lt;p&gt;We know from our discussion of last time that the right way to determine what is the top-most visible cell in a column portion is to consider where the top vector &lt;strong&gt;leaves&lt;/strong&gt; the column. By similar logic, the right way to determine where the bottom-most cell is in a column portion is to look at where the bottom vector &lt;strong&gt;enters&lt;/strong&gt; the column. Clearly where the vector enters the current column is precisely the same place that it left the previous column, and we already know how to calculate that.&lt;/p&gt; &lt;p&gt;What sorts of things go wrong if you, say, round down instead of rounding to the cell where the vector enters the column? In my implementation of the other day I made a deliberate mistake: for the bottom direction vector I do the division and round down. This introduces some artifacts. For example, it means that you can see immediately behind nearby pillars but not behind far-away pillars:&lt;/p&gt; &lt;p&gt;&lt;a href="http://blogs.msdn.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-29-89-metablogapi/6765.ShadowCast18_5F00_2.png"&gt;&lt;img style="background-image: none; border-right-width: 0px; padding-left: 0px; padding-right: 0px; display: block; float: none; border-top-width: 0px; border-bottom-width: 0px; margin-left: auto; border-left-width: 0px; margin-right: auto; padding-top: 0px" title="ShadowCast18" border="0" alt="ShadowCast18" src="http://blogs.msdn.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-29-89-metablogapi/7853.ShadowCast18_5F00_thumb.png" width="176" height="261"&gt;&lt;/a&gt;&lt;/p&gt; &lt;p&gt;&amp;nbsp;&lt;/p&gt; &lt;p&gt;Notice how the pillar to the left casts a full 90 degree shadow. However, the pillar to the right permits me to see the square immediately behind it, and &lt;em&gt;then&lt;/em&gt; casts a shadow. This is not very realistic; if I can see immediately behind the pillar then why can I not also see beyond it? &lt;/p&gt; &lt;p&gt;Note that this also presents a gameplay problem. Suppose there was a monster "hiding" behind the pillar east of me there. I could see the monster, but it could not see me! If the targeting system for arrow attacks requires line-of-sight, and line-of-sight is determined by simply computing the whole field of view and checking whether the cell in question is in the field, then I can shoot the monster but it cannot shoot me. By similar logic, a monster that is two cells to the west of the pillar west of me can shoot me, but I cannot see it or shoot it.&lt;/p&gt; &lt;p&gt;Does this problem go away if we rewrite the rounding algorithm to be less permissive by rounding more appropriately?&amp;nbsp; No! We still have a symmetry problem in a room full of pillars. Let's put both the "round down" and the "round to lowest column entry" algorithms next to each other for a comparison. Click on the control to play the game "in stereo":&lt;/p&gt; &lt;hr&gt; &lt;object data="data:application/x-silverlight-2," type="application/x-silverlight-2" width="500px" height="666px"&gt; 		  &lt;param name="source" value="http://blogs.msdn.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-29-89/2605.SilverlightShadowCasting.xap" /&gt; 		  &lt;param name="background" value="white" /&gt; 		  &lt;param name="minRuntimeVersion" value="4.0.50826.0" /&gt; 		  &lt;param name="autoUpgrade" value="true" /&gt; 		  &lt;a href="http://go.microsoft.com/fwlink/?LinkID=149156&amp;amp;v=4.0.50826.0" style="text-decoration:none"&gt;  			  &lt;img src="http://go.microsoft.com/fwlink/?LinkId=161376" alt="Get Microsoft Silverlight" style="border-style:none" /&gt; 		  &lt;/a&gt; 	    &lt;/object&gt; &lt;hr&gt;  &lt;p&gt;Notice how in both implementations it is possible to get into a position where you can see behind a pillar, but the monster standing there cannot see you:&lt;/p&gt; &lt;p&gt;&lt;a href="http://blogs.msdn.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-29-89-metablogapi/2437.ShadowCast19_5F00_4.png"&gt;&lt;img style="background-image: none; border-right-width: 0px; padding-left: 0px; padding-right: 0px; display: block; float: none; border-top-width: 0px; border-bottom-width: 0px; margin-left: auto; border-left-width: 0px; margin-right: auto; padding-top: 0px" title="ShadowCast19" border="0" alt="ShadowCast19" src="http://blogs.msdn.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-29-89-metablogapi/5582.ShadowCast19_5F00_thumb_5F00_1.png" width="175" height="261"&gt;&lt;/a&gt;&lt;/p&gt; &lt;p&gt;&amp;nbsp;&lt;/p&gt; &lt;p&gt;Monsters standing at the blue arrow cells are visible from the player's location, but the monsters cannot see the player. Monsters standing at the orange arrow locations can see the player, but the player cannot see them.&lt;/p&gt; &lt;p&gt;The problem here is that the algorithm computes the set of cells where &lt;strong&gt;any&lt;/strong&gt; &lt;strong&gt;point&lt;/strong&gt; in the target cell is visible from the &lt;strong&gt;center&lt;/strong&gt; of the player's cell. In order for the algorithm to be symmetric we need to allow &lt;strong&gt;any&lt;/strong&gt; &lt;strong&gt;point&lt;/strong&gt; in the target cell to be visible from &lt;strong&gt;any&lt;/strong&gt; &lt;strong&gt;point&lt;/strong&gt; in the player's cell. That is the "permissive field of view" algorithm. It uses the same concepts that we've described here; it progressively scans a region (a quadrant, rather than an octant) and keeps a set of line segments (not vectors, since they need not go through the origin) that track what regions are in view as you get farther and farther away from the origin. It's a rather tricky algorithm to get right, and so I'm not going to present it here.&lt;/p&gt; &lt;p&gt;&lt;strong&gt;Next time&lt;/strong&gt;: Now that we know how to compute the top and bottom cells, actually running the algorithm is pretty straightforward. We'll go through the details of the code. &lt;/p&gt;&lt;/div&gt;&lt;div style="clear:both;"&gt;&lt;/div&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=10247276" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/b/ericlippert/archive/tags/shadowcasting/">shadowcasting</category></item><item><title>Shadowcasting in C#, Part Three</title><link>http://blogs.msdn.com/b/ericlippert/archive/2011/12/19/shadowcasting-in-c-part-three.aspx</link><pubDate>Mon, 19 Dec 2011 16:06:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:10245340</guid><dc:creator>Eric Lippert</dc:creator><slash:comments>8</slash:comments><wfw:commentRss xmlns:wfw="http://wellformedweb.org/CommentAPI/">http://blogs.msdn.com/b/ericlippert/rsscomments.aspx?WeblogPostID=10245340</wfw:commentRss><comments>http://blogs.msdn.com/b/ericlippert/archive/2011/12/19/shadowcasting-in-c-part-three.aspx#comments</comments><description>&lt;div class="mine"&gt; &lt;p&gt;Before we get started, thanks for all the great comments to the previous couple of posts. I'll be updating the algorithm to try to make even better-looking circles of light based on the comments. Like I said, there's a lot of subtleties to these algorithms and I am just learning about them myself.&lt;/p&gt; &lt;p&gt;To that end, in today's episode I am going to spend the entire prolix article analyzing a single division operation. You have been warned.&lt;/p&gt; &lt;p&gt;Before we begin though, some jargon. A cell which is invisible that by our physical interpretation ought to be visible, or a cell which is visible that ought to be invisible we will call an "artifact". An "artifact" is the product of our algorithm (or some detail of its implementation) not being a sufficiently accurate model of real-world physics. Today's article will be all about artifacts.&lt;/p&gt; &lt;p&gt;The actual workhorse that implements the field-of-view algorithm is this method that we mentioned last time:&lt;/p&gt; &lt;p&gt;&lt;span class="code"&gt;private static void ComputeFoVForColumnPortion(&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; int x,&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; DirectionVector topVector,&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; DirectionVector bottomVector,&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; Func&amp;lt;int, int, bool&amp;gt; isOpaque,&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; Action&amp;lt;int, int&amp;gt; setFieldOfView,&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; int radius,&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; Queue&amp;lt;ColumnPortion&amp;gt; queue)&lt;br&gt;{&lt;br&gt;&lt;/span&gt;&lt;/p&gt; &lt;p&gt;This method has two main purposes. First, it assumes that all points in the portion of column x bounded by the top and bottom vectors are in the field of view, and marks them accordingly; some of them might be outside the radius, but the rest of them are by assumption visible from the origin. Second, it determines which portions of column x+1 are in the field of view and adds them to the work queue for later processing.&lt;/p&gt; &lt;p&gt;We described the algorithm as working from top to bottom of the portion of the column under consideration. Therefore the very first question we must answer is "which exactly is the top cell in the column portion, given the column number and the top direction vector?"&lt;/p&gt; &lt;p&gt;If the center point of a cell in column x happens to fall &lt;em&gt;exactly&lt;/em&gt; on the top direction vector then it is pretty clear which is the top cell. Suppose for the sake of argument that's the case. The top cell is then computed by x * topVector.Y / topVector.X. This division is exact. Proving that is an easy bit of algebra and is left as an exercise. &lt;/p&gt; &lt;p&gt;So perhaps we should say that &lt;em&gt;even if the division is inexact&lt;/em&gt;, we compute the top cell by:&lt;/p&gt; &lt;p&gt;&lt;span class="code"&gt;int topY = x * topVector.Y / topVector.X;&lt;/span&gt;&lt;/p&gt; &lt;p&gt;(Note that we'll assume throughout that the numbers we are multiplying and dividing are small compared to the range of int, and therefore do not overflow.)&lt;/p&gt; &lt;p&gt;What happens if the division is inexact? We know that in C# an inexact integer division always rounds towards zero; it &lt;em&gt;rounds down&lt;/em&gt; if necessary.&lt;/p&gt; &lt;p&gt;Rounding down is a bad idea because it doesn't model the physical world well and makes for extraordinarily bad-looking gameplay. Consider this extremely common scenario:&lt;/p&gt; &lt;p align="center"&gt;Scenario One&lt;a href="http://blogs.msdn.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-29-89-metablogapi/1425.ShadowCast10_5F00_2.png"&gt;&lt;img style="background-image: none; border-right-width: 0px; padding-left: 0px; padding-right: 0px; display: block; float: none; border-top-width: 0px; border-bottom-width: 0px; margin-left: auto; border-left-width: 0px; margin-right: auto; padding-top: 0px" title="ShadowCast10" border="0" alt="ShadowCast10" src="http://blogs.msdn.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-29-89-metablogapi/5710.ShadowCast10_5F00_thumb.png" width="231" height="240"&gt;&lt;/a&gt;&lt;/p&gt; &lt;p&gt;After processing column one we lower the slope of the top direction vector to one-third. Is point (2,1) in the field of view? It sure seems light it ought to be since its entire bottom surface is within the field of view. But if we do 2 * 1 / 3 we get zero, so no, the top cell that is visible in this column is (2, 0). We mark that as visible and continue on to column three without changing the slope. The top direction vector now intersects the center of cell (3, 1), so it and (3, 0) are visible. We lower the slope of the top vector to one-seventh, and now cell (4, 1) is not visible, again because we are rounding down. After processing all the columns shown here the state of affairs would be:&lt;/p&gt; &lt;p&gt;&lt;a href="http://blogs.msdn.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-29-89-metablogapi/1007.ShadowCast15_5F00_4.png"&gt;&lt;img style="background-image: none; border-right-width: 0px; padding-left: 0px; padding-right: 0px; display: block; float: none; border-top-width: 0px; border-bottom-width: 0px; margin-left: auto; border-left-width: 0px; margin-right: auto; padding-top: 0px" title="ShadowCast15" border="0" alt="ShadowCast15" src="http://blogs.msdn.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-29-89-metablogapi/6215.ShadowCast15_5F00_thumb_5F00_1.png" width="231" height="240"&gt;&lt;/a&gt;&lt;/p&gt; &lt;p&gt;This matches neither the desired physics nor the desired gameplay; a straight corridor should not have weird "gap" artifacts. Notice also that the resulting top vector is a little bit too steep; we never considered the opaque cell in column four to be possibly shadowing anything beyond it; the rest of the world is only in the shadow of cell (3,1), not (4,1). &lt;strong&gt;Rounding down is clearly unacceptable.&lt;/strong&gt;&lt;/p&gt; &lt;p&gt;Well. What to do, if rounding down doesn't work? Maybe we should round up!&lt;/p&gt; &lt;p&gt;That solves the problem for long corridors; now what happens is cell (2,1) is determined to be visible by rounding up, so the top slope is lowered to one-fifth. Then cell (3, 1) is determined to be visible, so the slope is lowered to one-seventh. Then cell (4,1) is determined to be visible, and the slope is lowered to one-ninth.&amp;nbsp; That seems to be much better.&lt;/p&gt; &lt;p&gt;Moreover, we now also have the nice property that &lt;strong&gt;the corners of a room are visible&lt;/strong&gt;. Consider this common situation:&lt;/p&gt; &lt;p align="center"&gt;Scenario Two&lt;a href="http://blogs.msdn.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-29-89-metablogapi/8446.ShadowCast13_5F00_2.png"&gt;&lt;img style="background-image: none; border-right-width: 0px; padding-left: 0px; padding-right: 0px; display: block; float: none; border-top-width: 0px; border-bottom-width: 0px; margin-left: auto; border-left-width: 0px; margin-right: auto; padding-top: 0px" title="ShadowCast13" border="0" alt="ShadowCast13" src="http://blogs.msdn.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-29-89-metablogapi/4645.ShadowCast13_5F00_thumb.png" width="231" height="240"&gt;&lt;/a&gt;&lt;/p&gt; &lt;p align="left"&gt;Cell (5,2) will be visible, which will render nicely, particularly if "box drawing" characters are used to represent interior corners as is the case in many roguelike games. This is a &lt;strong&gt;desirable artifact&lt;/strong&gt;.&lt;/p&gt; &lt;p&gt;But surely now we have the opposite problem; if we round up then we are potentially making cells visible that ought to be in the shadow of some opaque cell. Let's take a look at an example of that.&lt;/p&gt; &lt;p align="center"&gt;Scenario Three&lt;a href="http://blogs.msdn.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-29-89-metablogapi/7382.ShadowCast11_5F00_2.png"&gt;&lt;img style="background-image: none; border-right-width: 0px; padding-left: 0px; padding-right: 0px; display: block; float: none; border-top-width: 0px; border-bottom-width: 0px; margin-left: auto; border-left-width: 0px; margin-right: auto; padding-top: 0px" title="ShadowCast11" border="0" alt="ShadowCast11" src="http://blogs.msdn.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-29-89-metablogapi/8053.ShadowCast11_5F00_thumb.png" width="231" height="240"&gt;&lt;/a&gt;&lt;/p&gt; &lt;p&gt;Cells (3,2) and (4,2) are unambiguously in the shadow of cell (2,1). But look carefully at column five. Even though the top vector does not pass through any part of (5,2) it does pass slightly above point (5,1) and therefore the division will round up such that (5,2) is considered visible! With this rounding algorithm you can "peek around a corner" a little bit. Visible point (5,2) is an artifact.&lt;/p&gt; &lt;p&gt;Even worse, consider what happens when the algorithm discovers that there is an opaque-to-transparent transition between (5,2) and (5,1). The top vector will be moved up!&lt;/p&gt; &lt;p&gt;&amp;nbsp; &lt;a href="http://blogs.msdn.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-29-89-metablogapi/7875.ShadowCast12_5F00_2.png"&gt;&lt;img style="background-image: none; border-right-width: 0px; padding-left: 0px; padding-right: 0px; display: block; float: none; border-top-width: 0px; border-bottom-width: 0px; margin-left: auto; border-left-width: 0px; margin-right: auto; padding-top: 0px" title="ShadowCast12" border="0" alt="ShadowCast12" src="http://blogs.msdn.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-29-89-metablogapi/2260.ShadowCast12_5F00_thumb.png" width="231" height="240"&gt;&lt;/a&gt;&lt;/p&gt; &lt;p&gt;That top vector is now steeper than it used to be. (Also note that if this had been the top vector when processing column four then the point (4,2) ought to have been visible.)&lt;/p&gt; &lt;p&gt;Obviously this situation can continue; rounding errors in later columns can continue to make the top vector steeper and steeper. (Exercise for the reader: is there a maximum slope that the top vector can attain via repeated applications of rounding error?)&lt;/p&gt; &lt;p&gt;We could easily put a check into the algorithm implementation to say that the top direction vector must never go from a shallower slope to a steeper slope. If we decided to use always-round-up rounding then we might do that. But it gets worse. Consider this scenario that I mentioned last time:&lt;/p&gt; &lt;p align="center"&gt;Scenario Four&lt;a href="http://blogs.msdn.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-29-89-metablogapi/5807.ShadowCast6_5F00_2.png"&gt;&lt;img style="background-image: none; border-right-width: 0px; padding-left: 0px; padding-right: 0px; display: block; float: none; border-top-width: 0px; border-bottom-width: 0px; margin-left: auto; border-left-width: 0px; margin-right: auto; padding-top: 0px" title="ShadowCast6" border="0" alt="ShadowCast6" src="http://blogs.msdn.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-29-89-metablogapi/2063.ShadowCast6_5F00_thumb.png" width="231" height="240"&gt;&lt;/a&gt;&lt;/p&gt; &lt;p&gt;Last time I made the simplifying assumption that cell (5,4) was out of range, and therefore not visible. But suppose the radius is larger; let's analyze this one in more detail. We'll round up to determine the highest visible cell in the column portion bounded by these vectors, so cell (5,4) is visible. We then find transitions from visible (5,4) to opaque (5,3) and opaque (5,2) to visible (5,1) (assuming that (5,1) is the bottom cell of the range; we'll discuss that assumption next time.) Therefore we have to divide this up into two sub-portions for column six. To compute the upper portion we keep the top vector the same and move the bottom vector up; to compute the lower portion we keep the bottom vector the same and move the top vector down. The result is this godawful mess:&lt;/p&gt; &lt;p&gt;&lt;a href="http://blogs.msdn.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-29-89-metablogapi/5710.ShadowCast14_5F00_2.png"&gt;&lt;img style="background-image: none; border-right-width: 0px; padding-left: 0px; padding-right: 0px; display: block; float: none; border-top-width: 0px; border-bottom-width: 0px; margin-left: auto; border-left-width: 0px; margin-right: auto; padding-top: 0px" title="ShadowCast14" border="0" alt="ShadowCast14" src="http://blogs.msdn.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-29-89-metablogapi/1018.ShadowCast14_5F00_thumb.png" width="231" height="240"&gt;&lt;/a&gt;&lt;/p&gt; &lt;p&gt;&amp;nbsp;&lt;/p&gt; &lt;p&gt;The top direction vector of the upper column portion is now &lt;em&gt;below&lt;/em&gt; the bottom vector. Yikes!&lt;/p&gt; &lt;p&gt;This error again allows the player to "look around corners" in a weird way, but it really is not so bad. What will happen here is that as long as the mis-ordered vectors identify the same cell as the top and bottom cell of the visible portion for a particular column, that single cell will be visible. As soon as the portion is large enough that the top and bottom cells are different, the loop that goes from top to bottom will immediately exit.&lt;/p&gt; &lt;p&gt;Again, we could prevent this situation by doing a check that verifies that the bottom vector is never moved to be above the top vector. However, perhaps we'd decide that this situation is sufficiently rare, and the artifact is sufficiently benign, that we'd just allow it.&lt;/p&gt; &lt;p&gt;Rounding up seems better than rounding down, but this still isn't great. Hmm. What if we rounded to the &lt;strong&gt;nearest lattice point&lt;/strong&gt;? &lt;/p&gt; &lt;p&gt;Scenario One is unchanged. We still correctly compute visibility of the entire long straight wall.&lt;/p&gt; &lt;p&gt;Scenario Two is unchanged. We still make the desirable "corner artifact".&lt;/p&gt; &lt;p&gt;Scenario Three is improved. Cell (5,2) is not treated as visible, which is good because it is entirely in shadow. The top vector is not made more steep.&lt;/p&gt; &lt;p&gt;Scenario Four is unchanged. We can still end up in a situation where the top and bottom direction vectors are mis-ordered.&lt;/p&gt; &lt;p&gt;That's no change in three scenarios and a great improvement in one, so this is an unambiguous win, right?&lt;/p&gt; &lt;p&gt;Not quite. Consider this scenario:&lt;/p&gt; &lt;p align="center"&gt;Scenario Five&lt;a href="http://blogs.msdn.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-29-89-metablogapi/5001.ShadowCast16_5F00_2.png"&gt;&lt;img style="background-image: none; border-right-width: 0px; padding-left: 0px; padding-right: 0px; display: block; float: none; border-top-width: 0px; border-bottom-width: 0px; margin-left: auto; border-left-width: 0px; margin-right: auto; padding-top: 0px" title="ShadowCast16" border="0" alt="ShadowCast16" src="http://blogs.msdn.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-29-89-metablogapi/4846.ShadowCast16_5F00_thumb.png" width="231" height="240"&gt;&lt;/a&gt;&lt;/p&gt; &lt;p&gt;If we round to nearest then (4,3) is not visible, even though a full 30% of its lower surface has line of sight from the origin. Furthermore, by not treating this cell as visible, we fail to lower the slope of the top vector from 3/5 to 5/9, possibly allowing more cells to be visible in higher columns that ought to be shadowed by (4,3). Round-up would have treated (4,3) as the topmost cell in the region, so round-to-nearest is not an unambiguous win over round-up. &lt;/p&gt; &lt;p&gt;At this point it would be wise to &lt;strong&gt;take a step back&lt;/strong&gt; and ask ourselves if continuing to tweak how the division rounds is the right thing to do. &lt;strong&gt;When you propose three different plausible calculations and they all turn out to be wrong in different ways, there might be an invalid assumption somewhere in the mix.&lt;/strong&gt;&lt;/p&gt; &lt;p&gt;The invalid assumption is that y = SomeKindOfRoundingOf(x, top.Y, top.X) is correct in the first place.&lt;strong&gt; It is not.&lt;/strong&gt; This calculation, no matter how you round it, is fundamentally calculating where the vector intersects the &lt;em&gt;center&lt;/em&gt; of the column. Why is the center of the column at all relevant? It is the &lt;strong&gt;edges&lt;/strong&gt; of the cell that cast shadows!&lt;/p&gt; &lt;p&gt;What we want to compute is "what is the highest &lt;strong&gt;cell&lt;/strong&gt; in the given column that is &lt;strong&gt;anywhere&lt;/strong&gt; intersected by the top direction vector?" The slope of the top direction vector is always positive; the line is always "sloping up", so the &lt;strong&gt;top&lt;/strong&gt; cell can be identified by figuring out where the vector &lt;strong&gt;leaves&lt;/strong&gt; the column. What we should be doing is working out the intercept of the vector with x + 0.5, not with x.&lt;/p&gt; &lt;p&gt;How are we going to do that? The first thing to observe is that (x + 0.5 ) * top.Y / top.X is the same thing as (2 * x + 1) * top.Y / (2 * top.X). Now everything is in integers. Let's work out the quotient and the remainder in integers:&lt;/p&gt; &lt;p&gt;&lt;span class="code"&gt;int quotient = ((2 * x + 1) * top.Y) / (2 * top.X);&lt;br&gt;int remainder = ((2 * x + 1) * top.Y) % (2 * top.X);&lt;/span&gt;&lt;/p&gt; &lt;p&gt;Let's look at a bunch of possible different possibilities. Suppose the direction vector is (5, 3), so we go five "right" for every three "up". The interesting points are the points where the vector exits the column on the right hand side. The quotient is the black horizontal line &lt;strong&gt;below&lt;/strong&gt; the interesting point. In this example the remainder is the number of tenths the interesting point is &lt;strong&gt;above&lt;/strong&gt; the quotient line. (Tenths because the denominator is 5 x 2.) The numbers at the bottom of each column are the remainders:&lt;/p&gt; &lt;p&gt;&lt;a href="http://blogs.msdn.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-29-89-metablogapi/5543.ShadowCast17_5F00_2.png"&gt;&lt;img style="background-image: none; border-right-width: 0px; padding-left: 0px; padding-right: 0px; display: block; float: none; border-top-width: 0px; border-bottom-width: 0px; margin-left: auto; border-left-width: 0px; margin-right: auto; padding-top: 0px" title="ShadowCast17" border="0" alt="ShadowCast17" src="http://blogs.msdn.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-29-89-metablogapi/6114.ShadowCast17_5F00_thumb.png" width="231" height="240"&gt;&lt;/a&gt;&lt;/p&gt; &lt;p&gt;(Note that the dividend will always be an odd number and the divisor will always be an even number, and therefore the remainder will always be an odd number. Proving those assertions is left as an exercise. Hint: what possible y values can the top direction vector take on if restricted only to integers?)&lt;/p&gt; &lt;p&gt;So, how should we round? Consider the columns labeled 1 and 3. In those the rounded-down quotient correctly identifies the cell that the direction vector intersects. The columns labeled 7 and 9, however, have a problem; the quotient is one below the correct result; we have rounded down incorrectly. What about the column labeled 5? If the remainder is exactly the "run" value of the top direction vector then the vector passes exactly through the boundary where two cells meet; which one should be visible? Since this is the "top" bounding vector, we should round down; no area of the upper cell is visible.&lt;/p&gt; &lt;p&gt;So, in summary: &lt;strong&gt;use the rounded-down quotient as the top cell in the column if the remainder is top.X or less; otherwise, round it up by adding one to the quotient.&lt;/strong&gt;&lt;/p&gt; &lt;p&gt;Does that solve our problems?&lt;/p&gt; &lt;p&gt;Scenario One: &lt;strong&gt;Good&lt;/strong&gt;. We correctly put the whole corridor wall into the field of view.&lt;/p&gt; &lt;p&gt;Scenario Two: &lt;strong&gt;Good&lt;/strong&gt;. We "correctly" put the invisible corner cell into the field of view.&lt;/p&gt; &lt;p&gt;Scenario Three: &lt;strong&gt;Good&lt;/strong&gt;. Cell (5,2) is not identified as being visible, and therefore the top vector's slope is not increased.&lt;/p&gt; &lt;p&gt;Scenario Four: &lt;strong&gt;Bad&lt;/strong&gt;. We incorrectly identify cell (5,4) as being visible through cell (5,3), and thereby produce not only an artifact, but an "inverted" set of top and bottom vectors for the next column.&lt;/p&gt; &lt;p&gt;Scenario Five: &lt;strong&gt;Good&lt;/strong&gt;. We make cell (4,3) visible and lower the slope of the top vector accordingly.&lt;/p&gt; &lt;p&gt;&lt;strong&gt;This algorithm is not perfect; we still make some artifacts.&lt;/strong&gt; How might we solve the issue of scenario four?&lt;/p&gt; &lt;p&gt;A couple ways come to mind. One is that we could check to see if the direction vector enters the column at (5,3) and exits at (5,4). If it does then (5,4) is only the top cell if (5,3) is transparent. &lt;/p&gt; &lt;p&gt;Another way would be to allow cell (5,4) to be visible regardless -- this might have nice properties for showing corners, even if the cell is technically an artifact -- but to detect whether the new bottom vector is steeper than the old top vector and not allow the recursion.&lt;/p&gt; &lt;p&gt;In my actual implementation of last week I decided that solving the problem of scenario four is not worth it to me; I allow the artifacts and the inverted range. The results seem pretty decent.&lt;/p&gt; &lt;p&gt;As I warned you, it took me an extremely long article with nine complicated diagrams to figure out how to divide two numbers to determine the top cell. I said this was going to be excessively detailed! &lt;/p&gt; &lt;p&gt;&lt;strong&gt;Next time&lt;/strong&gt; we'll do the same analysis for determining the bottom cell in the column portion. Hopefully things will go a bit quicker now that we have the basic idea of how rounding produces artifacts down pat.&lt;/p&gt; &lt;p&gt;&amp;nbsp;&lt;/p&gt;&lt;/div&gt;&lt;div style="clear:both;"&gt;&lt;/div&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=10245340" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/b/ericlippert/archive/tags/shadowcasting/">shadowcasting</category></item></channel></rss>
