I know the answer (it's 42)

A blog on coding, .NET, .NET Compact Framework and life in general....

December, 2006

Posts
  • I know the answer (it's 42)

    Is object oriented programming good?

    • 5 Comments

    When we use a tool for a long time we get used to it and think in its terms rather than using it to help materialize our thoughts. Mostly this is not a good idea.

    We were taught OOP in college and kind of told that even though functional programming sounds cool its only for funky stuff you never use in real life. We were told that dynamic languages were good for scripting and its too slow to use for real life projects. Soon we started believing in these stories. 

    Everything was going on fine, until I attended a training from Alexandar Stepanov couple of years ago. He is widely known as the father of generic programming and one of the key person behind C++ STL. Funny part is, in the training he started ridiculing the whole premise of OOP. His point was that computing consists of applying algorithms to data. So algorithms comes first and that is the key part in programming. He suggestes thinking in algorithms and not in terms of objects. If you ever used C++ Standard Template Library (STL) you must've noticed the perculiarity. The whole of STL is not OO. It talks about applying algorithms to containers. STL code looks like sort(v.begin(), v.end()); Not like v.Sort(); as one would expect in a OO system.

    I highly recommend reading his interview here where he talks about his thoughts on this.

    Object oriented system is frequently taught with examples like Employee, Employer, Car, Animals. I think as these examples suggest they are more suitable in business applications. If you are writing an kernel or maybe writing a Test suite (like I am) they make less sense. You want to be liberated of the burden of always thinking in objects and rather concentrate of the algorithms you need to implement.  

    With time I have started coding seriously on functional and dynamic languages. The more I use them, the more they seem natural.

  • I know the answer (it's 42)

    Why can we only use constants in a switch-case statement?

    • 15 Comments

    Why can we only use constants in a switch-case statement? The following code fails to compile with the error “A constant value is needed” for someStr as it is not a constant string.

    static void func(string str)
    {
        switch(str)
        {
            case "Zaphod": Console.WriteLine("The king"); break;
            case someStr: Console.WriteLine("The coder"); break;
            default: Console.WriteLine("None"); break;
        }
    }
    string someStr = "Noo";
    

    Here goes a long answer to this short question.

    The reason is simple and yet involved. Let’s take the following valid code which only has constants and see how it works.

    static void func(string str)
    {
        switch(str)
        {
            case "Zaphod": Console.WriteLine("The king"); break;
            case "Abhinab": Console.WriteLine("The coder"); break;
            default: Console.WriteLine("None"); break;
        }
    }
    

    If we open see the code in IL it looks something like this

          L_0007: ldstr "Zaphod"
          L_000c: call bool string::op_Equality(string, string)
          L_0011: brtrue.s L_0022
          L_0013: ldloc.0 
          L_0014: ldstr "Abhinab"
          L_0019: call bool string::op_Equality(string, string)
          L_001e: brtrue.s L_002f
          L_0020: br.s L_003c
          L_0022: ldstr "The king"
          L_0027: call void [mscorlib]System.Console::WriteLine(string)
          L_002c: nop 
          L_002d: br.s L_0049
          L_002f: ldstr "The coder"
          L_0034: call void [mscorlib]System.Console::WriteLine(string)
          L_0039: nop 
          L_003a: br.s L_0049
          L_003c: ldstr "None"
    

    See the usage of op_Equality in L_000C and L_0019. This indicates that even though we are using switch-case, ultimately the code is converted to multiple if-then-else by the compiler. So the switch-case is converted by the compiler to something like

    if (str == "Zaphod")
        Console.WriteLine("The king"); 
    else if (str == "Abhinab")
        Console.WriteLine("The coder");
    else
        Console.WriteLine("None");
    

    If this is the case then what is stopping the case statements from having non-constants? In case of a non-constant the code generated could be something like if (str == someNonConstantStr) which is valid code.

    The answer is simple. When the numbers of cases are larger, the generated code is very different and is not constituted of if-then-else. Isn’t that obvious? Otherwise why would anyone ever use switch-case and why would we call switch case to be faster??

    Lets see when we have a large number of case’s as follows what happens.

    switch(str)
    {
        case "Zaphod": Console.WriteLine("The king"); break;
        case "Abhinab": Console.WriteLine("The coder"); break;
        case "Ford": Console.WriteLine("The Hitchhiker"); break;
        case "Trilian": Console.WriteLine("The traveler"); break;
        case "Marvin": Console.WriteLine("The Robot"); break;
        case "Agrajag": Console.WriteLine("The Dead"); break; 
        default: Console.WriteLine("None"); break;
    }
    

    For this first a class is generated by the compiler which looks like

    Internal class CompilerGeneratedClass
    {
          internal static Dictionary CompilerGenDict;
    }
    

    And then for the switch case the following code is generated.

    if (CompilerGeneratedClass.CompilerGenDict == null)
    {
        Dictionary dictionary1 = new Dictionary(6);
        dictionary1.Add("Zaphod", 0);
        dictionary1.Add("Abhinab", 1);
        ...
        CompilerGeneratedClass.CompilerGenDict = dictionary1;
    }
    if (CompilerGeneratedClass.CompilerGenDict.TryGetValue(
                                               str, out num1))
    {
        switch (num1)
        {
        case 0: Console.WriteLine("The king"); return
        case 1: Console.WriteLine("The coder"); return;
        case 2: Console.WriteLine("The Hitchhiker"); return;
        ...
        }
    }
    Console.WriteLine("None");
    

    What this means is that first time the function is called a Dictionary of strings (key) and int (value) is created and all the cases are stored in this dictionary as the key and an integer as a value is stored against it. Then for the switch statement the string is taken and is queried in the dictionary and if it is present the number value for the string is returned. Using this number the compiler creates an efficient jump table and it jumps to the target Console.Writeline string.

    Now the answer :). The strings are pre-stored in the dictionary. If the strings in the case statement were not constants, changes in them won’t reflect in the dictionary and hence you’d land up comparing against stale value. To avoid this inconsistency the non-constant values are not supported at all.

    Obviously for dynamic values the dictionary cannot be used and hence there is no optimization possible for switch-case so one should anyway use if-then-else.

  • I know the answer (it's 42)

    Names of Indian Cities

    • 4 Comments

    The biggest issue with the names of Indian cities is that they change :). I know of some other countries where it changes as well.

    Of the 4 largest cities in India, 3 have changed names in the recent past. Calcutta became Kolkata (2001), Madras became Chennai (1996), Bombay became Mumbai (1995). The most recent being the change of Bangalore to Bengaluru (2006). So next time someone uses the term Bangalored they need to be corrected.

    The change of names is typically done to replace imperial names with a local names or to correct the spelling to match the local pronunciation. Frequently the name change is controversial.

    Additional problem is that the names of streets also change. I have a friend whose house shifted from Theatre Road, Calcutta to Shakespeare Sarani, Kolkata without them having to call a packer and mover.

    I guess that explains why I attended University of Calcutta in Kolkata or why Kolkata airport code is CCU.

  • I know the answer (it's 42)

    switches and jump tables

    • 2 Comments

    In my last post I had discussed about how only constants can be used with C# switches. From the post's comments and later discussing with other folks I learnt something that came to me as a surprise. A lot of people working on managed code consider switch-case to be a stylistic variant of if-else, and that is all.

    However, in the C/C++ world switch is not just a variant of if-else (neither is it in .NET), it's a fast (O(1)) variant of if-else (O(n)). Stating that switch is just a better way to express multiple comparison against the same variable is stating Dictionary<T> is just another form of List<T>. They are not (Dictionary can give you O(1) lookup results). 

    For example C restricts the case to have constant-expression. This is done so that the compiler can generate optimized jump-table for its execution. Let's consider the following code.

    switch(i) {
        case 4:... 
        case 5:... 
        case 6:... 
        case 7:... 
    // many more cases... default:... }

    For this code the machine code generated is similar to the steps below.

    1. Compile time jump table creation: For each case statement a fixed block of memory is reserved. Say 8 bytes. These 8 bytes contain a jump (jmp) instruction to the location where the actual code for the case resides. The base address of this table is labeled as say JMPTABLEBASE.
    2. Normalizes the value of i as i = i - 4 (the first value of the case) 
    3. Boundary check: For the i it sees if the value is larger than the largest case (7-4 = 3), in case it is the execution flows to default.
    4. Jump to address JMPTABLEBASE + (i * 8)

    As you can see from above the whole thing happens at constant time.

    Some embedded C compilers (like the TI C-compiler) generates separate code section named .switch for the jumptable. Later this section can be targetted to the high-speed internal DARAM for faster execution in case the switch needs such special treatment.

    Obviously compilers vary a lot in this manner

  • I know the answer (it's 42)

    Creating Immutable types

    • 3 Comments

    Someone asked on a C# DL "How do I create an immutable type"

    The answer was obviously by creating a type in which you don't allow writeable fields or properties or have methods that change the state of the type. However, the questions started revolving around how System.String have methods to update the string but in reality it creates a new instance. All of this is really simple to achieve.

    Lets think of an Employee immutable class, where the class has a Name and Salary property and allows an increment to be made to the Salary. However, since Employee is immutable the Increment results in a new instance of the class to be created with an incremented Salary.

    class Employee
    {
        public Employee(string name, uint salary)
        {
            this.name = name;
            this.salary = salary;
        }
    
        public Employee Increment(uint delta)
        {
            // A new instance is created
            return new Employee (this.Name, this.Salary + delta);
        }
    
        readonly string name;
        public string Name { get { return name; }}
    
        readonly uint salary;
        public uint Salary { get { return salary; }}
    }

    In the code above there is no field and the properties are get only. So there is no possibility of making state changes with these. The Increment method creates a new instance of the class with the increment without touching the class on which it is called. All of this makes Employee an immutable type.

  • I know the answer (it's 42)

    Conditional methods and their Parameter evaluation

    • 4 Comments

    In .NET the System.Diagnostics.ConditionalAttribute can be used for conditional compilation. A method can be marked with this attribute as follows

    using System.Diagnostics;
    static class Logger
    {
        [Conditional("TRACE")]
        static public void Info(string message)
        {
            Console.WriteLine(message);
        }
    }
    

    This means that the Logger.Trace is compiled into IL only when the TRACE sumbol is defined either in the command line build as /define:TRACE or in the VS IDE in Project->Properties->Build->General->Conditional Compilation symbols. This makes ConditionalAttribute an excellent choice for logger and debug utilities. For example the logger can have the Info and verbose methods marked with [Conditional("Debug")] and hence they do not add to the burden of in release bits. Conditional attributes is a much better choice than #ifdefs as they are always syntax verified.

    However, it brought up a very interesting discussion recently. Consider the following code

    MyClass mc = new MyClass();
    Logger.Info(mc.ToString());
    

    The question is does the mc.ToString evaluate when TRACE is not defined?? Or in other words are parameters of conditionally compiled methods evaluated? If it does then even for release bits overhead starts creeping in. The good news is that it doesn't. An easy way to verify is to override MyClass::ToString and put in a Console.WriteLine into it. This method will not be called if TRACE is not defined. Or you can just use ILDASM or .NET Reflector to see that the method call doesn't even go into the IL and hence there can be no overhead.

  • I know the answer (it's 42)

    Funny messages are not always funny

    • 8 Comments

    I used to work in a Company where Easter Eggs used to be considered a feature and there used to be a official maintainer for it (I used to maintain both the Easters listed here and more). I used to feel that Microsoft shouldn't have moved away from inserting Easter eggs and should've stuck with funny messages in its Software. Microsoft Max (now canned) use to give really funny messages like suggesting that I get coffee as the installation may take some time.

    However, with time I have realized that funny messages are not always funny (even though I still believe Easter eggs are). If you look up the demographic usage data of Orkut it is mostly used in Brazil (60%) and the 3rd highest usage is in India (12%) and growing astronomically. Server failures in Orkut is given out as a "Bad, Bad Server. No donut for you" message. Now the problem is in India we do not eat Donut and I have a ton of non-geek friends who've never been to the US and have no idea what a donut is. One of them got majorly irritated with the message.

    I guess for free services where there is no paying customers it's OK to have these kind of funny light-hearted messages, but still you need to target your jokes well. Lets hope in the Future we have India servers throwing up "Bad Bad server, no Vada for you" messages :)

  • I know the answer (it's 42)

    Binary banner

    • 2 Comments

    BoingBoing has a story of a store that sells t-shirts with offensive messages spelled out in binary. I think its a nice way to generate email signature or web-page banner where you can really say offensive stuff and get away with it. I wrote the following code in Ruby which takes any word and spits out the binary ASCII representation.

    ARGV.each do |str|
      def PrintBin(ch)
        mask = 0x80
        while mask != 0
          print (if (ch & mask) == 0 then "0" else "1" end)
          mask >>= 1
        end
      end
    
      str.each_byte do |ch|
        PrintBin ch
      end
      print "\n"
    end
    

    My Ruby is completely rusted so it took me some time to get this done. Later I tried writing the same code in C#. It took me over 3 times the LOC it took me to do it up in Ruby. I'm sure a experinced Ruby programmer can make this even more simpler and concise.

    Since this is my official blog, I won't post anything offensive here. But on my personal web-site no one stops me :)

    C:\MyStuff\Code\Ruby>BinaryBan.rb Krikkit
    01001011011100100110100101101011011010110110100101110100

    Update

    As I thought, I'm far more rusty in Ruby that I'd like to imagine :) Daniele Alessandri got it in way less number of lines via the printf (%b) (see comment below). His code is as follows.

    ARGV.each do |str|
       str.each_byte { |b| printf("%08b", b) }
       puts
    end
    
    
  • I know the answer (it's 42)

    Conditional Text

    • 0 Comments

    We have moved to a new Satellite TV provider some time back. It is time to pay the quarterly bill, so I dug up their manual to look for online payment options. Sure enough there was a section on "Internet payments". I opened the section and it had one line. "For Internet payment options click here"!! How the hell am I supposed to click on a paper?

    The reason for the line is simple enough, they are just distributing printed copies of their online documentation.

    I shared this with couple of friends and the discussion soon turned to the issues with maintaining multiple versions of the same document. I figured out soon enough that they have not heard about conditional text supported in most DTP software. I was in the Dev team for Adobe FrameMaker and it was one of the features in the product. It works very much like the following C* kind of code 

    #if Web
         Console.WriteLine("Click <a href=\"http://www.abhinaba.com\">Here</a>");
    #elif Doc
         Console.WriteLine("Visit http://www.abhinaba.com");
    #endif
    

    If the symbol WEB is defined then the fancy Click Here is printed else the URL is printed out. Conditional Content works very much like this. You can define document wide variables and associate text, images (any supported content) with these variables. Later you switch on/off one or more of these variables to print out various versions of the same doc. So all your common content remains common and you have the ability to pick/select from the rest.

    No idea if Office supports this. But with the powerful collaboration features supported in Word, I highly suspect that this is indeed supported.

  • I know the answer (it's 42)

    Test-driven development

    • 2 Comments

    A lot have already been said about Test Driven Development (TDD) by a lot of people, but I'd still like to add my 0.02paisa.

    We have an internal requirement of checking in UnitTests along with the product code and the code coverage for the unit tests needs to be high. Most of our developers have over 85% code coverage.

    In my sources I decided to try out TDD. I used the following steps

    1. Write the method's prototype so that it matches the design doc and throw a NotImplementedException in it.
    2. Write unit-tests in Visual Studio Team System unit-test framework. I try to cover all the requirements, even the ones required for negative testing like pass an invalid handle and catch an ArgumentNullException and verify that the correct argument is mentioned in ArgumentNullException.Param.
    3. After that  I run the tests. All the tests obviously fail with lots of X marks.
    4. Then I go about fixing each of the test failures by adding the functionality in the code.
    5. After each fix (or couple of them) I run the tests and the red marks keep changing into test passed green ticks.
    6. Once I'm done, I run with code coverage and then add more tests if required to cover the blocks which were not touched by the tests.

    Even though the system looks simple it has helped me enormously by catching multiple bugs at the beginning. Even trivial tests like tests for GetHashCode and operator overloads found issues :) The fun in seeing all those X marks disappear one after the other brings in a childish zeal to get them done even faster.

    The other goodness is that after every code change later I can easily fire these tests for sanity check.

  • I know the answer (it's 42)

    Change the world or go home

    • 1 Comments

    Saw this via Steve Clayton's blog. This is going to be my new wallpaper...

Page 1 of 1 (11 items)